GAS catch "Exceeded memory limit" exception - google-apps-script

I'm working on a GAS project with the Speech to Text API. It converts a FLAC file; when the file is larger than 2 MB, execution is interrupted and I get an "Exceeded memory limit" error in the GAS code editor. Is there any way I can catch such an error in my code? And is there any way to avoid it?
I have checked "Quotas for Google Services"; my project should not hit any of those limits.
My project is https://github.com/mushuser/audiolib ; stt.gs contains the Speech to Text part.

Yes, exceeding the memory limit is possible because you keep the content of the file(s) in a variable.
Google Apps Script is intended for simple, lightweight automation tasks; essentially, a script should only wire together a few services.
Some limits, like the memory limit, are not precisely defined. They are dynamic (for example, you can store more data in memory across multiple objects than you can save in a single variable) and can be changed to prevent abuse (the service is free, so it could otherwise be abused to consume a lot of compute power or memory).
Check whether the Speech to Text API accepts input data as a URL to the content (getDownloadUrl()) instead of sending the data directly in the payload - that way the large file content is exchanged between the services outside of your script.
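A minimal Apps Script sketch of that idea - sending a reference to the audio rather than its bytes. One caveat: Cloud Speech-to-Text's audio.uri field expects a Cloud Storage gs:// URI, so the Drive file may first need to be copied to a bucket you control; the bucket and file names below are placeholders.

function transcribeByReference() {
  var payload = {
    config: { encoding: 'FLAC', languageCode: 'en-US' },
    audio: {
      // A reference, not the file content, keeps the large FLAC out of script memory.
      uri: 'gs://your-bucket/your-audio.flac' // placeholder
    }
  };
  var response = UrlFetchApp.fetch('https://speech.googleapis.com/v1/speech:recognize', {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() },
    payload: JSON.stringify(payload)
  });
  Logger.log(response.getContentText());
}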

Related

Google Drive API /files slow response

I want to ask for help/ideas on the issue I will describe below.
Our iOS app allows users to access their Google Drive files.
We use the Changes API (https://developers.google.com/drive/api/v3/reference/changes). The main pre-condition to using this API is to build a local DB that holds a snapshot of the user's Drive file tree and the token. To initially fill the DB, we must request the list of all files from the user's Drive. Getting the list of all files (with metadata) takes too long for many of our users. This is the issue I want to address.
We request files with the series of Files requests (https://developers.google.com/drive/api/v3/reference/files/list). Most requests are plain files?q=trashed%20%3D%20false.
For example, at my own private Google Drive:
69K files
initial request of all files takes 5+ minutes with my current network speed (Download 527 Mbps, Upload 417 Mbps; ping www.googleapis.com – 40–45 ms)
~150 requests
each request brings information about ~460 files
each request takes around 2-2.5 seconds
Sometimes I observed requests taking up to 6 seconds, which means that getting the full file list took 15 minutes on my account.
If I look at the Developer Console, the latency is below 0.1s
Many of our users have Drives far bigger than mine. A standard iOS app user's session is not long enough to complete the initial request. We save every intermediate page token, so data received during a single app session is not lost if the user leaves the app; in the next session we keep downloading data from the last saved token. But there are still cases when our app needs the DB to be filled with data before starting certain operations; in those cases our users see a "Pending..." progress indicator and complain that our app is slow.
So, questions:
is it possible to improve the described request speed/latency?
maybe there's some quota that we are missing and it can be changed?
maybe someone can advise a more effective way of getting the full file list?
P.S. We could potentially reduce the number of requests. We have to perform some double checks for Shared with Me folders, as we observed that sometimes a request for all files doesn't list all files from shared folders. That's a bit of a side story, and I don't think it will dramatically improve the situation for us. I can provide more details on the actual set of requests we perform if necessary.
Are you returning all the fields? I would assume so, since the only query parameter provided is q=trashed = false. Do you need all the fields? Can you try reducing the query to return only the fields you really care about (using a field mask) and see if that improves your performance?
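For illustration, a sketch of the field-mask idea with the Node.js googleapis client: request only the metadata the local DB actually needs, plus the maximum page size of 1000. The specific field names chosen here are just examples, not a recommendation of which ones your app needs.

const { google } = require('googleapis');

async function listAllFiles(auth) {
  const drive = google.drive({ version: 'v3', auth });
  const files = [];
  let pageToken;
  do {
    const res = await drive.files.list({
      q: 'trashed = false',
      pageSize: 1000, // maximum allowed per page
      fields: 'nextPageToken, files(id, name, mimeType, parents, modifiedTime)',
      pageToken,
    });
    files.push(...res.data.files);
    pageToken = res.data.nextPageToken;
  } while (pageToken);
  return files;
}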

Google Drive Rest API - Create File - Quota

I have a program where I have to copy about 500,000 files onto Google Drive, into different folders. I use the Google Drive v3 Node.js API. I issue about 2 uploads per second (one every 450 ms). After a while, I get ECONNRESET or "socket hang up" from the API.
When I look at the quota on console.cloud.google.com, I am nowhere near my quota. Why is it failing?
For kicks, I have tried Google Drive File Stream and it has no problems pushing into Drive under my user account. It's about 5 times faster.
Did anyone run into this problem?
I think your quota per se is not the problem here. This happens when you write too much data within a short time frame. Try slowing it down, and try sharding the requests across different user accounts; this should help with the heavy lifting of the many requests you are performing. Also, don't forget to implement exponential backoff for retrying 4xx errors. My two cents.
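A rough sketch of the pacing idea: wait for each create to finish before starting the next, instead of firing on a fixed 450 ms timer. The delay value and the job shape are assumptions, and drive is assumed to be an already-authorized googleapis Drive client.

async function uploadAllPaced(drive, jobs, delayMs = 1000) {
  // jobs: [{ name, folderId, mimeType, body }, ...] - hypothetical shape
  for (const job of jobs) {
    await drive.files.create({
      requestBody: { name: job.name, parents: [job.folderId] },
      media: { mimeType: job.mimeType, body: job.body },
    });
    await new Promise((resolve) => setTimeout(resolve, delayMs)); // throttle between uploads
  }
}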
This does happen when I pass a stream. There is no warning on developers.google.com, but there is a warning in their GitHub repository:
You can also upload media by specifying media.body as a Readable stream. This can allow you to upload very large files that cannot fit into memory.
Note: Your readable stream may be unstable. Use at your own risk.
Once I changed my code not to use streams, I started getting proper error messages, such as status code 403 for going over the rate limit.
I simply changed my code to use a straight buffer. The buffer is read via fs.readFileSync before the call:
media: {
  mimeType: 'text/plain',
  // buf holds the entire file, read with fs.readFileSync before the call,
  // instead of a Readable stream
  body: buf
}
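For context, a sketch of how the surrounding files.create call might look with a buffer body, using the Node.js googleapis client as in the answer above; the file name, MIME type, and folder ID are placeholders.

const fs = require('fs');
const { google } = require('googleapis');

async function uploadAsBuffer(auth, filePath, folderId) {
  const drive = google.drive({ version: 'v3', auth });
  const buf = fs.readFileSync(filePath); // whole file in memory, no stream
  const res = await drive.files.create({
    requestBody: { name: 'example.txt', parents: [folderId] }, // placeholders
    media: { mimeType: 'text/plain', body: buf },
    fields: 'id',
  });
  return res.data.id;
}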

How is memory allocated to a Google Cloud Function?

Today I got this error in cloud function:
Function killed. Error: memory limit exceeded
My function is based on the authenticated-json-api example of the Firebase sample functions. Because it worked like a charm, I extended it with multiple routes and multiple tasks, like connecting to multiple external APIs, turning base64 strings into PDFs in storage, validation, logging, and so on...
I removed some routes and it looks more stable now. My question now is: can there be a limit to the amount of code/processing within a single function? And would it be a better approach to split it up into multiple Express APIs?
I also found some questions about allocating memory to specific functions. However, I can't find the option in the Google Cloud Platform console to change it, nor an option in the Firebase package.json to set it.
I found the solution:
Go to the Google Cloud Platform Console (not the Firebase console)
Select Cloud Functions in the menu
You should see your Firebase function here if everything is set up correctly; otherwise, check that you have selected the right project.
Ignore all checkboxes, buttons and menu items, just click on the name of the function.
Click on Edit (top menu), change only the allocated memory, and click Save.
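As an aside, newer versions of the firebase-functions SDK also let you declare the memory allocation in code rather than in the console; a minimal sketch, assuming the runWith option is available in your SDK version and using a placeholder Express app:

const functions = require('firebase-functions');
const express = require('express');

const app = express();
app.get('/', (req, res) => res.send('ok')); // placeholder route

// Allocate more memory (and a longer timeout) to this one function.
exports.api = functions
  .runWith({ memory: '1GB', timeoutSeconds: 300 })
  .https.onRequest(app);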
Regards, Peter

GAS - What's the max size of the payload I can send through the Execution API

I'm building a relatively big Google Docs file using Google Apps Script, and I basically need to inject a lot of data in order to build it programmatically.
I'm thinking of executing a function init() and passing the JSON string as its value through the Execution API. I'm worried about the maximum size of the string that I can pass. What's the max size?
I checked the documentation on the Execution API, but there is no mention of such a limit. If it follows the standard protocol RFC 2616, as mentioned in this thread, then your big payload may push through. The only thing you need to do now is actually try.
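For reference, a sketch of what such a call might look like from a Node.js client using the googleapis Apps Script client; scriptId and the payload are placeholders, and this says nothing about the actual size limit.

const { google } = require('googleapis');

async function callInit(auth, scriptId, bigJsonString) {
  const script = google.script({ version: 'v1', auth });
  const res = await script.scripts.run({
    scriptId,
    requestBody: {
      function: 'init',            // the Apps Script function to execute
      parameters: [bigJsonString], // the large JSON string as its argument
    },
  });
  return res.data;
}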

Google Drive multiple files download

We have a client-server architecture that uses Google Drive for sharing files between the client and the server, without having to actually send them.
The client uses the Google Drive API to get a list of file IDs of all files it wants to share with the server.
The server then downloads the files with the appropriate authorization token.
Server response time is crucial for user experience.
We tried a few approaches:
First, we used the webContentLink. This worked until we started receiving large files from the client. Instead of getting the file content, we got an HTML page with the warning "exceeds the maximum size that Google can scan". We could not find a header we could use to skip this check.
Second, we switched to the Google API resource URL with the alt=media query param. This works, but we then hit API quota errors (User Rate Limit Exceeded). Since this is server code, it was identified as a single user for all requests.
Then we added the quotaUser param to indicate on whose behalf each request is made. We still got many 403 responses.
In addition, we implemented exponential backoff for the failed requests.
We also added a cache for the successful requests.
Our current solution is a combination of the two: we use the webContentLink whenever possible (which appears not to count against the Google API quota). If the response is not as expected (i.e. HTML, wrong size, etc.), we fall back to the Google API resource URL (with exponential backoff).
(Most of the files are small enough to not exceed the scan size limit)
Both client and server use the same OAuth 2.0 client ID.
Here are my questions:
1. Is it possible to skip the virus scan, so that all files can be downloaded using the webContentLink?
2. Is the size threshold for the virus scan documented? Assuming we know the file size we can then save the round-trip of the first request (using the webContentLink)
3. Is there anything else we can do other than applying for a higher quota?
Is it possible to skip the virus scan, so that all files can be downloaded using the webContentLink?
If the file is greater than 25 MB, it is not possible with webContentLink; but since you are making authorized requests, use files.get with alt=media. Apply appropriate error handling (which you have done using exponential backoff). The next step is to check whether your code is optimized; if you have applied the recommended optimizations and still receive Error 403 Limit Exceeded, it is time to apply for a higher quota.
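A sketch of files.get with alt=media plus simple exponential backoff, using the Node.js googleapis client; fileId and the retry limit are placeholders.

const { google } = require('googleapis');

async function downloadWithBackoff(auth, fileId, maxRetries = 5) {
  const drive = google.drive({ version: 'v3', auth });
  for (let attempt = 0; ; attempt++) {
    try {
      const res = await drive.files.get(
        { fileId, alt: 'media' },
        { responseType: 'arraybuffer' } // raw file bytes
      );
      return Buffer.from(res.data);
    } catch (err) {
      const status = err.response ? err.response.status : 0;
      const retryable = status === 403 || status === 429 || status >= 500;
      if (!retryable || attempt >= maxRetries) throw err;
      // Exponential backoff with jitter: ~1s, 2s, 4s, ...
      const delay = Math.pow(2, attempt) * 1000 + Math.random() * 1000;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}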
Is the size threshold for the virus scan documented? Assuming we know the file size we can then save the round-trip of the first request (using the webContentLink)
To answer this, you can refer to the Google Drive Help Forum thread "How can I successfully download large files from google drive without network errors at the most end of the download":
Only files smaller than 25 MB can be scanned for viruses.
Is there anything else we can do other than applying for a higher quota?
You can do the following before applying for a higher quota:
Performance Tips
Drive Platform Best Practices
Handling API Errors
After all optimization is done, the only remaining option is to apply for a higher quota limit.
Hope this helps!