Trying to create a local copy of our Google Drive with rclone, bringing down all the files, but constantly hitting rate limits - google-drive-api

As the title states, I'm trying to create a local copy of our entire Google Drive. We are currently using it as a file storage service, which is obviously not the best use case, but to migrate elsewhere I of course need to get all the files first; the entire drive is around 800 GB.
I am using rclone, specifically the copy command, to copy the files FROM Google Drive TO the local server; however, I am constantly running into user rate limit errors.
I am also authenticating with a Google service account, which I believe should provide higher usage limits.
2021/11/22 07:39:50 DEBUG : pacer: low level retry 1/10 (error googleapi: Error 403: User
Rate Limit Exceeded. Rate of requests for user exceed configured project quota. You may
consider re-evaluating expected per-user traffic to the API and adjust project quota
limits accordingly. You may monitor aggregate quota usage and adjust limits in the API
Console: https://console.developers.google.com/apis/api/drive.googleapis.com/quotas?
project=, userRateLimitExceeded)
But I don't really understand this, since according to my usage I am not even coming close to the quota. What exactly can I do to increase my rate limit (even if that means paying), or is there some other solution to this issue? Thanks

Error 403: User Rate Limit Exceeded. Rate of requests for user exceed configured project quota.
The user rate limit is the number of requests your user is making per second; it's basically flood protection. You are flooding the server. It is unclear exactly how Google calculates this, beyond the 100 requests per user per second. If you are getting the error, there is really nothing you can do besides slow down your code. It's also unclear from your question how you are running these requests.
If you could include the code, we could see how the requests are being performed. However, as you state, you are using something called rclone, so there is no way of knowing from here exactly how it issues its requests.
Your only option would be to slow your requests down, if you have any control over that through this third-party application. If not, you may want to contact the owner of the product for direction on how to fix it.
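For what it's worth, rclone itself does expose pacing options, so you can slow the requests down without writing any code. A rough sketch of the kind of invocation that tends to help (the remote name, local path, and service account path are placeholders; check rclone help flags and the Drive backend docs for the exact flag names in your version):

    # Throttle the Drive API calls rather than letting rclone hit the quota:
    #   --tpslimit                cap overall API transactions per second
    #   --drive-pacer-min-sleep   minimum wait between Drive API calls
    #   --transfers / --checkers  fewer parallel workers, fewer concurrent requests
    #   --fast-list               fewer directory-listing calls on large trees
    rclone copy gdrive: /mnt/backup/gdrive \
        --drive-service-account-file /path/to/service-account.json \
        --tpslimit 5 \
        --drive-pacer-min-sleep 200ms \
        --transfers 4 \
        --checkers 4 \
        --fast-list \
        -v

Lower --tpslimit further if the 403s keep appearing; the copy will take longer, but it should stop tripping the per-user quota.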

Related

Is it better to use cache or a spreadsheet in Google Apps Script?

I'm writing Google Apps Script web apps that share data between end users who are using the app at the same time.
I can write the data to a spreadsheet and allow others to read it, or put the data into the script cache.
Either way I need a server call. The data is not large... I was just wondering if the cache is more server-efficient / faster / better practice?
Thanks
If you use the Cache Service, the maximum time to live for data under a key is 6 hours if you set the expiration explicitly; otherwise it lives in the cache for 10 minutes. Also, the maximum length of a key is 250 characters.
So it really depends on the architecture of your app, but using a spreadsheet purely as a database perhaps isn't the best solution either, although it can be convenient in many cases.
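For reference, a minimal sketch of the cache approach in Apps Script, assuming the shared data fits under the cache's per-value size limit (roughly 100 KB) and a 6-hour lifetime is acceptable:

    function saveShared(data) {
      var cache = CacheService.getScriptCache();   // shared across all users of the script
      // Third argument is the expiration in seconds; 21600 (6 hours) is the maximum.
      cache.put('sharedData', JSON.stringify(data), 21600);
    }

    function loadShared() {
      var raw = CacheService.getScriptCache().get('sharedData');   // null if missing or expired
      return raw ? JSON.parse(raw) : null;
    }

If the data must survive longer than the cache allows, fall back to the spreadsheet and treat the cache as a faster layer in front of it.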

Google Drive API - Service account storage limitation - Alternatives

I'm developing a web application that's going to start with 200 GB of data to be stored. Over the years, the same application could possibly reach 1 TB, perhaps 2 TB in 5 years.
What I want from this application is for clients to upload files to the server, and for the server to then upload those files to Google Drive, persisting the webViewLink in the database. It's working this way on localhost.
I know two options for authentication for Google Drive API: client account and service account.
The Service Account option fits better for me because I want the server to have control of the files, not the client.
But a Service Account can store very little data, and its storage limit can't be increased. The limit is something around 15 GB I guess, not sure.
If the Service Account will not help me, what options would I have to store 2 TB of data or more? Should I find another way to store the files?
I'd like to keep using Google. If there isn't any option using the Google Drive API, please suggest anything else for this scenario.
You have a couple of options.
Use a regular account instead of a Service Account. You will still need to pay for the storage, but it will work and you'll have everything in a single account. From your question "I want the server to have control of the files, not the client have control" I suspect you have looked at the OAuth quickstart examples and concluded that only end users can grant access. That's not the case. It's perfectly valid, and really quite simple, for your server app to use an account it controls. See How do I authorise an app (web or installed) without user intervention? for how to do this, and the sketch after this answer for roughly what it looks like in code.
Use multiple Service Accounts and shard your data across them. The various accounts could all share their folders with a kind of master account, which would then have a coherent view of the entire corpus.
Personally I'd go with option 1 because it's the easiest to set up and manage.
Either way, make sure you understand how Google will want to charge you for the storage. For example, although each Service Account has a free quota, it is ultimately owned by the regular user that created it and the standard user quota limits and charges probably apply to that user.
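As a rough illustration of option 1, here is a minimal Node.js sketch using the googleapis client; the client ID, secret, and refresh token are placeholders that you obtain once for an account your server controls, as described in the linked answer:

    const { google } = require('googleapis');

    const CLIENT_ID = 'your-oauth-client-id';
    const CLIENT_SECRET = 'your-oauth-client-secret';
    const REFRESH_TOKEN = 'refresh-token-obtained-once-and-stored-securely';

    const oauth2Client = new google.auth.OAuth2(CLIENT_ID, CLIENT_SECRET);
    oauth2Client.setCredentials({ refresh_token: REFRESH_TOKEN });

    const drive = google.drive({ version: 'v3', auth: oauth2Client });

    // Upload a stream to Drive and return the webViewLink to persist in the database.
    async function uploadAndGetLink(name, mimeType, stream) {
      const res = await drive.files.create({
        requestBody: { name },
        media: { mimeType, body: stream },
        fields: 'id, webViewLink',
      });
      return res.data.webViewLink;
    }

Because the refresh token belongs to an account you own, no end user is ever asked to grant anything, which matches the "server controls the files" requirement.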

Google Drive API Historical quota data

I had an interesting experience with my Google account quota, which could be solved with access to the following information (if it exists!). Basically, can I get my Google account quota data WITH a time stamp? In other words, can I determine how much memory my account was allocated on a certain date?
Context: I had 122 GB allocated because I was paying for 100 GB of extra storage. At some point I cleared things out to get back down to under 22 GB used, so I cancelled the 100 GB subscription. Upon doing so, my total allocated storage dropped to 17 GB. When I asked why, support simply said 15 GB + 2 GB for the security update (from 2016). They also said there is no historical record, so they couldn't verify the 122 GB number, and I do not know how I originally ended up with 20 GB of storage + 2 for the security update. I also didn't think to take screenshots of these numbers before cancelling the subscription, because this is Google we're talking about and I figured they had all this kind of data in case something went wrong. I've had a Google account since 2005, so I'm thinking maybe at some point I picked up a 5 GB bonus somehow.
Anyway, I'm a data scientist, so even if there's a complicated way to get this kind of info I'd be interested in hearing what kind of data is available that might verify my account of the storage allocation.
Maybe this post from the Google Drive Help Forum will help you:
The first thing to note is that when you delete or remove files from your Google Drive storage quota, you will want to empty your trash in order to free up that space. If you have orphaned files, you can find them. In the Drive search field, type: is:unorganized owner:me
Your Google Drive storage quota consists of Google Drive, Google Photos, and Gmail files. Sometimes changes made to your Google Drive account will take the Google servers 24-48 hours to properly sync and reflect back to your drive. Usually this takes far less time, but it can happen. Here are some helpful links I use when managing my Google Drive storage quota:
There is no 'history' option, but if you want to check your storage, here is the path that will lead you there. From there you will see how much capacity your account can use.
To see how much of the quota each of your files is taking up, you may visit this link.
If you want to manage your quota, you may check this link for guidance.
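There is no endpoint that returns past quota figures, but the current ones are available from the Drive API, so you could at least record them yourself from now on. A minimal Node.js sketch with the googleapis client (auth is any client authorized for the account in question):

    const { google } = require('googleapis');

    // Fetch the current storage figures and print them with a timestamp.
    async function logQuota(auth) {
      const drive = google.drive({ version: 'v3', auth });
      const res = await drive.about.get({ fields: 'storageQuota' });
      // storageQuota holds limit, usage, usageInDrive and usageInDriveTrash, in bytes.
      console.log(new Date().toISOString(), res.data.storageQuota);
    }

Run something like this on a schedule and you will have your own timestamped record going forward.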

Memory limit exceeded

I am using Google Apps Script to create documents based on a template and some data stored in a spreadsheet. It was previously working well but recently (without having changed any code) I started getting the below error during a "google.script.run" call from a very simple HTML sidebar:
Execution failed: Memory limit exceeded
The error occurs during the copying process - it seems to occur at slightly different places each time the script is run.
I don't see any references to memory limits in the apps script quotas and a general Google search doesn't seem to find anything.
Can anyone shed some light on this: how to determine the limit, what is holding most of the memory, and whether it is possible to increase the limit?
Check the methods that you're using to get the data ranges.
For example, if you use sheet.getRange('A:X'), it will get all the rows in the sheet, even the empty ones (probably on the order of tens of thousands, or even hundreds of thousands). It's a better option to use the getDataRange method, which returns the range bounded by the last column and last row that contain data.
Also be aware of the Quotas for Google Services. Apps Script services impose daily quotas and hard limitations on some features; if you exceed a quota or limitation, your script will throw an exception and terminate execution.
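A minimal illustration of the difference (the sheet name is a placeholder):

    var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Data');

    // Reads every row of columns A to X, including all the empty ones below your data.
    var everything = sheet.getRange('A:X').getValues();

    // Reads only the rectangle that actually contains data.
    var data = sheet.getDataRange().getValues();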

Amazon API submitting requests too quickly

I am creating a games comparison website and would like to get Amazon prices included within it. The problem I am facing is using their API to get the prices for the 25,000 products I already have.
I am currently using ItemLookup from Amazon's API and have it working to retrieve the price; however, after about 10 results I get an error saying 'You are submitting requests too quickly. Please retry your requests at a slower rate'.
What is the best way to slow down the request rate?
Thanks,
If your application is trying to submit requests that exceed the maximum request limit for your account, you may receive error messages from Product Advertising API. The request limit for each account is calculated based on revenue performance. Each account used to access the Product Advertising API is allowed an initial usage limit of 1 request per second. Each account will receive an additional 1 request per second (up to a maximum of 10) for every $4,600 of shipped item revenue driven in a trailing 30-day period (about $0.11 per minute).
From Amazon API Docs
If you're just planning on running this once, then simply sleep for a second in between requests.
If this is something you're planning on running more frequently, it'd probably be worth optimising further by making sure that the time the query takes to return is subtracted from that sleep (so if my API query takes 200 ms to come back, we only sleep for 800 ms).
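A sketch of that idea; lookupItem stands in for however you call ItemLookup:

    const MIN_INTERVAL_MS = 1000;   // initial Product Advertising API limit: 1 request per second

    async function throttledLookups(items, lookupItem) {
      const results = [];
      for (const item of items) {
        const started = Date.now();
        results.push(await lookupItem(item));   // your ItemLookup call
        const elapsed = Date.now() - started;
        if (elapsed < MIN_INTERVAL_MS) {
          // Sleep only for the remainder of the one-second window.
          await new Promise(resolve => setTimeout(resolve, MIN_INTERVAL_MS - elapsed));
        }
      }
      return results;
    }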
Since it only happens after about 10 results, you should check how many requests you can make before the error appears. If it always appears after 10 fast requests, you could use something like wait(500) (or a few more milliseconds) between calls. If it really is only after every 10 requests, you could build a loop and only pause on every 9th request.
If your requests involve a lot of repetition, you could create a cache (cleared each day) so that repeated lookups don't hit the API, or contact AWS about purchasing a higher request allowance.
I ran into the same problem even when I added a delay of 1 second or more.
I believe that when you start making too many requests with only a one-second delay, Amazon doesn't like it and decides you're a spammer.
You'll have to generate another key pair (and use it when making further requests) and use a delay of 1.1 seconds to be able to make fast requests again.
This worked for me.