I have spent the last hour looking for a solution to rate limit my API.
I want to limit a path, /users for example, but most rate limiters apply one limit to everyone. I want to use API keys that can be generated by a user. People can generate a free API key with, say, 1000 requests per day; then if they pay some money they can get 5000 requests per day.
I would like to store these API keys in a MySQL database.
Does anyone have any solution for this?
One way to structure your project would be:
A user_keys table that includes the API key, the user, the time of creation, and the number of uses so far (a table sketch follows this list).
When a user tries to generate a key, check that one doesn't exist yet, then add it to the DB.
When a request arrives, check if the key exists; if it does, do the following:
1: if it has been 24 hours since the creation (or last reset) date, set the number of uses back to 0
2: if the use count is already at the limit (1000 for a free key), the user has reached their quota and the request is rejected
3: otherwise, increment the use count and let the request through
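A minimal sketch of that table, assuming the mysql2 package and hypothetical column names (the daily_limit column covers the free 1000/day vs paid 5000/day tiers):

// hypothetical schema sketch; column names are placeholders
const mysql = require("mysql2/promise");

async function createUserKeysTable() {
    const connection = await mysql.createConnection({ host: "localhost", user: "app", database: "api_db" });
    await connection.query(`
        CREATE TABLE IF NOT EXISTS user_keys (
            api_key     VARCHAR(64)  PRIMARY KEY,
            user_id     INT          NOT NULL UNIQUE,
            created_at  DATETIME     NOT NULL,
            checked     DATETIME     NOT NULL,  -- start of the current 24h window
            uses        INT          NOT NULL DEFAULT 0,
            daily_limit INT          NOT NULL DEFAULT 1000
        )
    `);
    await connection.end();
}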
This is a basic implementation and isn't very efficient; you'll want to cache the API keys in memory, either in a plain hashmap in Node.js or using memcached/redis. But get it working first before trying to optimize it.
EDIT: some code examples
// overly simple in-memory cache
const apiKeys = {}

// one day's worth of milliseconds, used later on
const oneDayTime = 1000 * 60 * 60 * 24

// function to generate new API keys
function generateKey(user) {
    if (apiKeys[user]) {
        throw Error("user already has a key")
    }
    let key = makeRandomHash() // just some function that creates a random string like "H#4/&DA23#$X/"
    // share the object so it can be reached by either key or user
    // terrible idea, but when you save this in MySQL you can just do a normal search query
    apiKeys[user] = {
        key: key,
        user: user,
        checked: Date.now(),
        uses: 0
    }
    apiKeys[key] = apiKeys[user]
    // return it so it can be shown to the user
    return key
}

// a function that does all the key verification for us
function isValid(key) {
    // check if the key even exists first
    if (!apiKeys[key]) throw Error("invalid key")
    // if it's been a whole day since it was last checked, reset its uses
    if (Date.now() - apiKeys[key].checked >= oneDayTime) {
        apiKeys[key].uses = 0
        apiKeys[key].checked = Date.now()
    }
    // check if the user's limit cap is reached
    if (apiKeys[key].uses >= 1000) throw Error("user daily quota reached")
    // increment the user's count and exit the function without errors
    apiKeys[key].uses++
}

// express middleware function
function limiter(req, res, next) {
    try {
        // get the API key; it can be part of the JSON body, a header, or even a GET query
        let key = req.body["api_key"]
        // if the key is not valid, isValid will throw
        isValid(key)
        // pass on to the next function if there were no errors
        next()
    } catch (e) {
        // respond on res (not req); 401 for a bad key or 429 for a hit quota would be more precise
        res.status(400).send(e.message)
    }
}
This is an overly simplified implementation, but I hope it gets the idea across.
The main thing you want to change here is how the API keys are saved and retrieved; with MySQL that could look roughly like the sketch below.
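A sketch only, assuming the mysql2 package and the hypothetical user_keys columns from earlier (caching and error handling omitted):

const mysql = require("mysql2/promise");
const pool = mysql.createPool({ host: "localhost", user: "app", database: "api_db" });
const oneDayTime = 1000 * 60 * 60 * 24;

async function isValidDb(key) {
    // look the key up in MySQL instead of the in-memory map
    const [rows] = await pool.query("SELECT * FROM user_keys WHERE api_key = ?", [key]);
    if (rows.length === 0) throw Error("invalid key");
    const record = rows[0];
    // reset the counter if the 24h window has passed
    if (Date.now() - new Date(record.checked).getTime() >= oneDayTime) {
        await pool.query("UPDATE user_keys SET uses = 0, checked = NOW() WHERE api_key = ?", [key]);
        record.uses = 0;
    }
    // daily_limit is stored per key, so free (1000) and paid (5000) tiers share the same check
    if (record.uses >= record.daily_limit) throw Error("user daily quota reached");
    await pool.query("UPDATE user_keys SET uses = uses + 1 WHERE api_key = ?", [key]);
}

The limiter middleware would then be declared async and call await isValidDb(key) instead of isValid(key).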
Related
I'm trying to implement an HTTP event streaming server using MySQL, where users are able to append an event to a stream (a MySQL table) and also define the expected sequence number of the event.
The logic is somewhat simple:
Open a transaction
Get the next sequence number in the table to insert
Verify that the next sequence number matches the expected one (if supplied)
Insert into the database
Here's my code:
public async append(
  data: any = {},
  expectedSeq?: number
): Promise<void> {
  let published_at = $date.create();
  try {
    await $mysql.transaction(async trx => {
      let max = await trx(this.table)
        .max({
          seq: "seq",
        })
        .first();
      if (!max) {
        throw $error.InternalError(`unexpected mysql response`);
      }
      let next = (max.seq || 0) + 1;
      if (expectedSeq && expectedSeq !== next) {
        throw $error.ExpectationFailed(
          `expected seq does not match current seq`
        );
      }
      await trx(this.table).insert({
        published_at,
        seq: next,
        data: $json.stringify(data),
      });
    });
  } catch (err) {
    if (err.code === "ER_DUP_ENTRY") {
      return this.append(data, expectedSeq);
    }
    throw err;
  }
}
My problem is that this is extremely slow, since there are race conditions between parallel requests appending to the same stream: on my laptop, inserts/second on one stream went from ~1k to ~75.
Any pointers/suggestions on how to optimize this logic?
CONCLUSION
After considering the comments, I decided to go with AUTO_INCREMENT and reset the auto_increment value only if there's an error. It yields around the same writes/sec when expectedSeq is supplied, but a much higher rate if ordering is not required.
Here's the solution:
public async append(data: any = {}, expectedSeq?: number): Promise<Event> {
  if (!$validator.validate(data, this.schema)) {
    throw $error.ValidationFailed("validation failed for event data");
  }
  let published_at = $date.create();
  try {
    let seq = await $mysql.transaction(async _trx => {
      let result = (await _trx(this.table).insert({
        published_at,
        data: $json.stringify(data),
      })).shift();
      if (!result) {
        throw $error.InternalError(`unexpected mysql response`);
      }
      if (expectedSeq && expectedSeq !== result) {
        throw $error.ExpectationFailed(
          `expected seq ${expectedSeq} but got ${result}`
        );
      }
      return result;
    });
    return eventFactory(this.topic, seq, published_at, data);
  } catch (err) {
    await $mysql.raw(`ALTER TABLE ${this.table} auto_increment = ${this.seqStart}`);
    throw err;
  }
}
Why does the web page need to provide the sequence number? That is just a recipe for messy code, perhaps even messier than what you sketched out. Simply let the auto_increment value be returned to the User.
INSERT ...;
SELECT LAST_INSERT_ID(); -- session-specific, so no need for transaction.
Return that value to the user.
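In the Node code from the question, the same idea can be used without a separate SELECT, since the MySQL driver returns the generated id with the insert result. A sketch assuming mysql2/promise and a hypothetical table name:

async function appendAndReturnSeq(connection, data) {
    const [result] = await connection.query(
        "INSERT INTO event_stream (data) VALUES (?)",
        [JSON.stringify(data)]
    );
    // insertId is the AUTO_INCREMENT value for this insert, the same thing
    // SELECT LAST_INSERT_ID() would return for this session
    return result.insertId;
}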
Why not use Apache Kafka? It does all of this natively.
With the easy answer out of the way: optimization is always tricky with partial information, but I think you've given us one hint that enables a suggestion. You said that without the ordering requirement it performs much faster, which means that getting the max value is what is taking so long. That tells me a few things: first, this value is not the clustered index (which is good news); second, you probably do not have sufficient index support (also good news, since it's fixable by creating an index on this column, sorted descending). This sounds like a table with millions or billions of rows, and this particular column has no guaranteed order, so without the right indexing you could be doing a table scan between inserts just to get the max value.
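The index itself is a one-liner; a sketch with hypothetical table/column names (note that descending index order is only honoured from MySQL 8, older versions parse but ignore the DESC keyword):

async function addSeqIndex(connection) {
    // lets MAX(seq) be answered from the index instead of a table scan
    await connection.query("CREATE INDEX idx_event_stream_seq ON event_stream (seq DESC)");
}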
Why not use a GUID for your primary key instead of an auto-incremented integer? Then your client could generate the key and would also be able to insert it every time for sure.
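A minimal sketch of that suggestion, using Node's built-in crypto.randomUUID() and a hypothetical table name, so the client knows the key before the insert even runs:

const { randomUUID } = require("crypto"); // available since Node 14.17

async function appendWithGuid(connection, data) {
    const id = randomUUID();
    await connection.query(
        "INSERT INTO event_stream (id, data) VALUES (?, ?)",
        [id, JSON.stringify(data)]
    );
    return id; // the caller already knows the id, no LAST_INSERT_ID needed
}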
Batch inserts versus singleton inserts
Your latency/performance problem is due to a batch size of 1, as each send to the database requires multiple round trips to the RDBMS. Rather than inserting one row at a time, with a commit and verification after each row, you should rewrite your code to issue batches of 100 or 1000 rows at a time, inserting n rows and verifying per batch rather than per row. If a batch insert fails, you can retry one row at a time. A rough sketch follows.
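Assuming the mysql2 driver and hypothetical table/column names, the nested-array form of "VALUES ?" expands into one multi-row INSERT:

async function appendBatch(connection, events) {
    // events: array of { published_at, data } objects
    const rows = events.map(e => [e.published_at, JSON.stringify(e.data)]);
    // one round trip for the whole batch instead of one per row
    await connection.query("INSERT INTO event_stream (published_at, data) VALUES ?", [rows]);
}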
I am working on an app that requires a sync with the server after logging in, to get all the activities the user has created and saved to the server. Currently, when the user logs in, a getActivity() function is called that makes an API request and returns a response, which is then handled.
Say the user has 4 activities saved on the server in this order (the order is determined by the time the activity was created / saved):
Test
Bob
cvb
Testing
Looking at JSONHandler.getActivityResponse, it appears as though the results are in the correct order. If the request was successful, on the home page where these activities are to be displayed, I currently loop through them like so:
WebAPIHandler.shared.getActivityRequest(completion:
{
    success, results in DispatchQueue.main.async {
        if(success)
        {
            for _ in (results)!
            {
                guard let managedObjectContext = self.managedObjectContext else { return }
                let activity = Activity(context: managedObjectContext)
                activity.name = results![WebAPIHandler.shared.idCount].name
                print("activity name is - \(activity.name)")
                WebAPIHandler.shared.idCount += 1
            }
        }
    }
})
And the print within the for loop is also outputting in the expected order:
activity name is - Optional("Test")
activity name is - Optional("Bob")
activity name is - Optional("cvb")
activity name is - Optional("Testing")
The CollectionView does then insert new cells, but seemingly in the wrong order. I'm using a carousel layout on the home page, and the 'cvb' object, for example, is appearing first in the list while 'Bob' is third. I am using the following:
func controller(_ controller: NSFetchedResultsController<NSFetchRequestResult>, didChange anObject: Any, at indexPath: IndexPath?, for type: NSFetchedResultsChangeType, newIndexPath: IndexPath?)
{
    switch (type)
    {
    case .insert:
        if var indexPath = newIndexPath
        {
            // var itemCount = 0
            // var arrayWithIndexPaths: [IndexPath] = []
            //
            // for _ in 0..<(WebAPIHandler.shared.idCount)
            // {
            //     itemCount += 1
            //
            //     arrayWithIndexPaths.append(IndexPath(item: itemCount - 1, section: 0))
            //     print("itemCount = \(itemCount)")
            // }
            print("Insert object")
            // walkThroughCollectionView.insertItems(at: arrayWithIndexPaths)
            walkThroughCollectionView.reloadData()
        }
You can see why I've tried to use collectionView.insertItems() but that would cause an error stating:
Invalid update: invalid number of items in section 0. The number of items contained in an existing section after the update (4) must be equal to the number of items contained in that section before the update (4), plus or minus the number of items inserted or deleted from that section (4 inserted, 0 deleted)
I saw a lot of other answers mentioning how reloadData() would fix the issue, but I'm really stuck at this point. I've been using Swift for several months now, and this is the first time I'm truly at a loss. What I also realised is that the order displayed in the carousel is also different from a separate viewController that is passed the same data. I just have no idea why the results return in the correct order but are then displayed in an incorrect order. Is there a way to sort data in the collectionView after calling reloadData(), or am I looking at this from the wrong angle?
Any help would be much appreciated, cheers!
The order of the collection view is specified by the sort descriptor(s) of the fetched results controller.
Usually the workflow for inserting a new NSManagedObject is:
1. Insert the new object into the managed object context.
2. Save the context. This calls the delegate methods controllerWillChangeContent(_:), controller(_:didChange:at:for:newIndexPath:), etc.
3. In controller(_:didChange:at:for:newIndexPath:), insert the cell into the collection view with insertItems(at:), nothing else. Do not call reloadData() in this method.
I am trying to automate API requests using Postman. First, in the POST request I wrote a test to store all created IDs in the environment, which is passing correctly:
var jsondata = JSON.parse(responseBody);
tests["Status code is 201"] = responseCode.code === 201;
postman.setEnvironmentVariable("BrandID", jsondata.brand_id);
Then in the Delete request I use the environment variable in my URL, like /{{BrandID}}, but it is deleting only the last record. So my guess is that the environment is keeping only the last ID? What must I do to keep all the IDs?
Each time you call your POST request, you overwrite your environment variable,
so you can only delete the last one.
In order to process multiple IDs, you have to build an array by appending the new ID on each call.
You may proceed as follows in your POST request's tests:
var my_array = postman.getEnvironmentVariable("BrandID");
if (my_array === undefined) // first time
{
    postman.setEnvironmentVariable("BrandID", jsondata.brand_id); // creates your env var with the first brand id
}
else
{
    postman.setEnvironmentVariable("BrandID", my_array + "," + jsondata.brand_id); // appends the next brand id to your env var
}
You should end up with an environment variable like BrandID = "brand_id1,brand_id2,...".
Then when you delete, you delete the complete array (but that depends on your delete API).
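If your delete endpoint only takes one id at a time, one option (a sketch, using a hypothetical CurrentBrandID variable) is a pre-request script on the DELETE call that pops the next id off the stored list:

// pre-request script of the DELETE request, with /{{CurrentBrandID}} in the URL
var ids = postman.getEnvironmentVariable("BrandID").split(",");
var current = ids.shift();
postman.setEnvironmentVariable("CurrentBrandID", current);
postman.setEnvironmentVariable("BrandID", ids.join(","));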
I guess there may be cleaner ways to do so, but I'm not an expert in Postman nor Javascript, though that should work (at least for the environment variable creation).
Alexandre
I am writing a simple API which I have been testing using Postman. I have an auto-incrementing primary key (CustomerTypeID) for my "customer_type" table stored in MySQL. For practical reasons, I need to be able to create records in this table without sending a CustomerTypeID. When I send the following POST request using Postman:
{
    "CustomerType": "testing"
}
The updated table shows a new row with CustomerTypeID of 2 and a CustomerType of NULL.
Below is a snippet of code in my Express API which shows this specific query and how the routing for this POST request works.
var db = require('../dbconnection.js');

var CustomerType = {
    addCustomerType: function(CustomerType, callback) {
        return db.query("INSERT INTO customer_type (CustomerType) VALUES (?)", [CustomerType.CustomerType], callback);
    }
};

module.exports = CustomerType;
I know that I could change the query to say
INSERT INTO customer_type (CustomerTypeID, CustomerType) VALUES (?,?);
and that would fill both columns. But I do not know how to leave out the CustomerTypeID column, as it will be a number that the end user has no way of knowing.
It turns out that the syntax for the last query I gave is correct. I used the same POST request as before:
{
    "CustomerType": "testing"
}
And by using the SQL query that includes CustomerTypeID, MySQL knew to just auto-increment the value of CustomerTypeID since it was not given a value in the POST request. When I ran the same POST again with this query, I received a new row with both a CustomerTypeID and a CustomerType. The adjusted model function is sketched below.
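For reference, a sketch of the adjusted model function (same dbconnection.js helper as above), passing null so MySQL fills in the AUTO_INCREMENT value:

var db = require('../dbconnection.js');

var CustomerType = {
    addCustomerType: function(CustomerType, callback) {
        // null for CustomerTypeID lets MySQL assign the next AUTO_INCREMENT value
        return db.query(
            "INSERT INTO customer_type (CustomerTypeID, CustomerType) VALUES (?, ?)",
            [null, CustomerType.CustomerType],
            callback
        );
    }
};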
Here is some code from the Couchbase documentation and I don't understand it:
function(key, values, rereduce) {
    var result = {total: 0, count: 0};
    for(i=0; i < values.length; i++) {
        if(rereduce) {
            result.total = result.total + values[i].total;
            result.count = result.count + values[i].count;
        } else {
            result.total = sum(values);
            result.count = values.length;
        }
    }
    return(result);
}
1. rereduce means the current function call has already done the reduce or not, right?
2. When will the first argument of the reduce function, key, be used? I saw a number of examples where key seems to be unused.
3. When is rereduce true and the array size more than 1?
4. Again, when is rereduce false and the array size more than 1?
Rereduce means that the reduce function was called before, and now it is called again with the params that were returned as the result of that first reduce call. So if we divide it into two functions it would look like:
function reduce(k, v){
    // ... doing something with map results
    // instead of returning the result we must call the rereduce function
    rereduce(null, result)
}

function rereduce(k, v){
    // do something with the first reduce result
}
In most cases rereduce happens when you have 2 or more servers in the cluster, or when you have a lot of items in your database and the calculation is done on multiple nodes of the B*Tree. An example with 2 servers will be easier to understand:
Let's imagine that your map function returned these pairs: [key1-1, key2-2, key6-6] from the 1st server and [key5-5, key7-7] from the 2nd. You'll get 2 reduce function calls:
reduce([key1,key2,key6],[1,2,6],false) and reduce([key5,key7],[5,7],false). Then if we just return the values (do nothing in reduce, just return them), the reduce function will be called again with params like reduce(null, [[1,2,6],[5,7]], true). Here values will be an array of the results that came from the first reduce calls.
On rereduce the key will be null. Values will be an array of values as returned by previous reduce() calls.
The array size depends only on your data; it does not depend on the rereduce variable. The same answer applies to the 4th question.
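As an aside, a minimal rereduce-safe reduce (just a sketch, not from the docs) returns the same shape from both passes, so the output of a first-pass reduce can be fed straight back in on the rereduce pass:

function (key, values, rereduce) {
    // values are plain numbers on the first pass and partial sums on the rereduce
    // pass, so summing works the same either way (sum() is built into Couchbase views)
    return sum(values);
}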
You can just try to run the examples from Views basics and Views with reduce, i.e. you can modify the reduce function to see what it returns at each step:
function reduce(k, v, r){
    if (!r){
        // let the reduce function return only one value:
        return 1;
    } else {
        // and let's see what values come in on "rereduce"
        return v;
    }
}
I am also confused by the example from the official Couchbase website, and below is what I think.
Confusion: the reduce method signature.
1) Sometimes it is written as function (keys, values, rereduce)
2) Sometimes it is written as function (key, values, rereduce)
What exactly is the first param, key or keys?
From my understanding, based on my previous experience with map/reduce, the key is the key emitted from the map function, and there is a hidden shuffle step that aggregates the values into a value list for the same key.
So the key param can be an array in the case where you emit an array as the key (which you can combine with group_level to control the level of aggregation).
So I do not agree with the example given by @m03geek: it should not be a list of different keys. Correct me if I am wrong.
My assumption:
Both reduce and rereduce work on the SAME key only.
eg:
reduce is like:
1)reduce(keyA, [1,2,3]) this is precalculated, and stored in Btree structure
2) rereduce(keyA, [6, reduce(keyA, [4,5,6])]), 6 is the sum of [1,2,3] from the first reduce method, then we add a new doc into couchbase, which will trigger the reduce method again, instead of calculating the whole thing again as the original map/reduce will do, couchbase get the precalculated data out from the btree which is 6, and run reduce from the key-value pairs from the map method (which is triggered by adding a new doc), then run re-reduce on the precalculated value + new value.