High Traffic & Excessive Script Execution Time - google-apps-script

I have a container-bound Apps Script Project bound to a Form Response Google Sheet, triggered on form submit. The script runs as me. I'm dealing with execution runtimes 6-8x the nominal run time during peak hours of the day, which seems largely correlated with increased form-submission traffic. The program follows this series of steps:
Form response collects a string of information and triggers the Apps Script Project to execute
Project creates a copy of a complex Google Sheet with a few tabs and a lot of complex formulas
Project pastes the string collected from the form response into this Google Sheet copy, flushes the sheet, and then returns a string of results calculated within the sheet
The Google Sheet file is named, and the Project creates a unique Drive folder to which the sheet is eventually moved
Run complete
The Project performs a wide variety of getValue and setValue calls throughout the run as it updates cell values or reads calculated results. Over the last year, I've optimized it in many ways (e.g., batching calls into getValues or setValues, batching API calls, etc.). Its normal run time is 25-45 seconds, but that increases to 200+ seconds during my company's peak business hours. Per the logs, there is no one particular step that gets hung up; rather, the script lags in all aspects (creating the file copy, SpreadsheetApp.flush(), appending or deleting rows in other Google Sheets it references by sheet ID, etc.). I'll also say a common reason for a failed execution is the error message "Service Spreadsheets timed out while accessing document with id...", but most executions complete successfully, just after a lengthy run.
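To illustrate, the batching I mean looks like this (a minimal sketch; the file ID, sheet names, and ranges are hypothetical placeholders):

```
// Sketch of the batched read/write pattern: one setValues for the whole
// input block and one getValues for the whole results block, instead of
// a setValue/getValue call per cell.
function processResponse(responseValues) {
  var ss = SpreadsheetApp.openById('COPY_FILE_ID');   // hypothetical copied template
  var input = ss.getSheetByName('Input');             // hypothetical input tab
  input.getRange(1, 1, 1, responseValues.length).setValues([responseValues]);
  SpreadsheetApp.flush();                             // force recalculation before reading
  return ss.getRange('Results!A1:F1').getValues()[0]; // hypothetical results range
}
```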
It seems there is a clear correlation between execution time and traffic. So my questions are:
Can anyone confirm that theory?
Is there a way I can increase the service's bandwidth for my project?
Is there a better alternative to having the script run as me? (Mind you, I'm performing the rest of my job in Chrome throughout the day while this program continues to run automatically in the background.)
This is an Apps Script managed project; it is not tied to a personal GCP Project. Is that a potential factor at play here?

Related

How can I optimize my Google Sheet's importXML() calls with an Apps Script to avoid loading errors?

I've been trying to use importXML in Google Sheets to import specific data (in this case, only player name) from several players via the Steam Web API.
I encountered what seems to be a limit on the number of importXML calls I can make in my sheet, because I get loading errors:
Loading data may take a while because of the large number of requests. Try to reduce the amount of IMPORTHTML, IMPORTDATA, IMPORTFEED or IMPORTXML functions across spreadsheets you've created.
This list will likely grow (currently at about 170), and I need a way for it to handle the calls. I don't need the data to update very frequently (even 2-3 times a day is sufficient).
I've tried the code I found in another SO post, but that seems to refresh all the importXML calls at once, so I still got loading errors.
From what I've researched so far, it seems like I'll need to use an Apps Script to optimize my sheet by creating intervals for the calls. Is there a way I could have a script do the following:
Call 25 rows (or whichever limit is optimal)
Wait some amount of time
Call next 25 rows
Continue till the end of the sheet, then restart loop
I'm not too savvy with writing functions so don't know how to edit the code to achieve that. Any help would be appreciated.
If you'd like to take a look at the spreadsheet I'm working with, here it is. For now, only Column B has the importXML calls, and the URLs are concatenated using cells in Column H. So there's one importXML call per row.
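Something along those lines could look like this (a rough sketch, assuming a container-bound script run from a time-driven trigger; per the description, the formulas live in column B with one call per row, row 1 is assumed to be a header, and 'nextRow' is a hypothetical property key):

```
// Refresh the IMPORTXML formulas in column B, BATCH_SIZE rows at a time.
// Run from a time-driven trigger (e.g. every 15 minutes); it remembers where
// it stopped in Script Properties and wraps around at the end of the sheet.
function refreshNextBatch() {
  var BATCH_SIZE = 25;                                    // rows refreshed per run
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheets()[0]; // assumes the first tab
  var props = PropertiesService.getScriptProperties();
  var lastRow = sheet.getLastRow();
  var start = Number(props.getProperty('nextRow')) || 2;  // row 1 assumed to be a header
  if (start > lastRow) start = 2;                         // restart the loop at the top
  var rows = Math.min(BATCH_SIZE, lastRow - start + 1);
  var range = sheet.getRange(start, 2, rows, 1);          // column B holds the formulas
  var formulas = range.getFormulas();
  range.clearContent();                                   // clearing and re-setting forces a re-fetch
  SpreadsheetApp.flush();
  range.setFormulas(formulas);
  props.setProperty('nextRow', String(start + rows));
}
```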

Apps Script Activity Reporting/Visualization

I've been developing an apps script project for my company that tracks our time/expenses. I've structured the project like so:
The company has a paid G Suite account that owns all the spreadsheets hosted on the company's Google Drive.
Each employee has their own "user" spreadsheet which is shared from the company Gsuite account with the employee's personal gmail account.
Each of the user spreadsheets has a container-bound script that accesses a central library script.
The library script allows us to update the script centrally and the effects are immediate for each user. It also prevents users from seeing the central script and meddling with it.
Each of the user container-bound scripts has installable triggers that are authorized by the company account, so that the code being run has full authority to do what it needs to do to the spreadsheets.
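The container-bound stub in this pattern is essentially a one-line delegation (a sketch; CentralLib and handleEdit are hypothetical names for the library identifier and its entry point):

```
// Container-bound stub in each user spreadsheet; all real logic lives in the
// shared library, so a library update takes effect for every user at once.
// Fired by the installable trigger that the company account authorized.
function onUserEdit(e) {
  CentralLib.handleEdit(e); // hypothetical library call
}
```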
This setup has been working quite well for us with about 40 users. The drawback to this setup is that since all the script activity is run by the company account via the triggers, the activity of all our users is logged under the single company account and therefore capped by the apps script server quotas for a single user. This hasn't been much of an issue for us yet as long as our script is efficient in how it runs. I have looked into deploying this project as a web-app for our company, but there doesn't seem to be a good way to control/limit user access to the central files. In other words, if this project was running as a web app installed by each user, each user would need to have access to all the central spreadsheets that the project uses behind the scenes. And we don't want that.
So with that background, here is my question: how do I efficiently track Apps Script activity to see how close we are to hitting our server quota, and identify which of my functions need to be optimized?
I started doing this by writing an entry into an "activity log" spreadsheet every time the script was called. It tracked which function was called and who the user was, and it had a start-time entry and an end-time entry, so I could see how long unique executions took and which ones failed. This was great because I had a live view into the project activity and could graph it using the spreadsheet's graphing tools. Where this began to break down was the fact that every execution of the script required two write-actions: one for initialization and another for completion. Since the script is executed every time a user makes an edit to their spreadsheet, during times of high traffic the activity log spreadsheet became inaccessible and errors were thrown all over the place.
So I have since transitioned to tracking activity by connecting each script file to a single Google Cloud Platform (GCP) project and using the Logger API. Writing logs is a lot more efficient than writing an entry to a spreadsheet, so the high traffic errors are all but gone. The problem now is that the GCP log browser isn't as easy to use as a spreadsheet and I can't graph the logs or sum up the activity to see where we stand with our server quota.
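For reference, the structured-logging call itself can be this small (a sketch; the payload field names are arbitrary). Objects passed to console.log land in Cloud Logging as jsonPayload entries, which is what makes them filterable and exportable by field rather than parsed from text:

```
// Wrap each run's bookkeeping in one structured log call. The fields
// (fn, user, durationMs, ok) are hypothetical names for illustration.
function logActivity(fnName, user, startMs, succeeded) {
  console.log({fn: fnName, user: user, durationMs: Date.now() - startMs, ok: succeeded});
}
```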
I've spent some time now trying to figure out how to automatically export the logs from GCP so I can process them in real time. I see how to download the logs as CSV files, which I can then import into a Google spreadsheet and do the calcs and graphing I need, but this is a manual process and doesn't show live data.
I have also figured out how to stream the logs from GCP by setting up a "sink" that transfers the logs to a "bucket", which can theoretically be read by other services. This got me excited to try out Google Data Studio, which I saw is able to use Google Cloud Storage "buckets" as a data source. Unfortunately, though, Google Data Studio can only read CSV files in Cloud Storage, not the JSON files that my "sink" is generating in my "bucket" for the logs.
So I've hit a wall. Am I missing something here? I'm just trying to get live data showing current activity on our apps script project so I can identify failed executions, see total processing time, and sort the logs by user or function so I can quickly identify where I need to optimize my script.
You've already referenced using the GCP side of your Apps Script.
Have a look at the Metrics Explorer; it lets you see quota usage per resource and auto-generates graphs for you.
But long term, I think rebuilding your solution may be a better idea. At a minimum, switching to submitting data via Google Forms will save you an operation.

Google Apps Scripts self-starting at night

Context: I have a Google Spreadsheet with some data imported from an external API plus some calculations done on it. API access and calculations are done using Google Apps Script. All those functions are within two files that belong to one project. This morning I noticed multiple "Exception: Service invoked too many times for one day: urlfetch." errors. Strange enough, as neither I nor the second collaborator was working overnight. When I checked the executions, it turned out there were multiple executions over the night. It looked as if the document was refreshed every 20-30 minutes.
Questions
How can I check what triggered those functions?
Any ideas how to prevent those executions?
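For the first question, a quick check from the script editor is to enumerate the project's installable triggers (a minimal sketch below). Time-driven triggers are the usual cause of overnight runs; also note that if the urlfetch calls live in custom functions, spreadsheet recalculation can fire them with nobody present.

```
// Lists every installable trigger attached to this project -- time-driven
// triggers are the usual suspects for executions that run overnight.
function listProjectTriggers() {
  ScriptApp.getProjectTriggers().forEach(function (t) {
    Logger.log('%s -> %s (source: %s)',
               t.getEventType(), t.getHandlerFunction(), t.getTriggerSource());
  });
}
```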

calling functions asynchronously from Google Sheets using script

Is there any way to make Google Script call functions asynchronously? My scenario is that I have a main spreadsheet that information is entered into, and a script then passes the relevant information to other spreadsheets.
There are then other functions that manipulate the data in those other spreadsheets. Unfortunately, because of the high volume of data, calling all the functions on one action causes the script to hit the 6-minute timeout.
I tried using the onEdit trigger in the other spreadsheets, but it doesn't seem to work unless the sheets are opened by a user.
The way it is just now, the user would have to hit 4 different buttons to trigger the various functions and not get a timeout.
Thanks for any help
Blair
Depending on how realtime the updates need to be, you could consider creating a queue that contains all of the updates to be made (perhaps stored in the PropertiesService as a stringified JSON object).
Then your update code could be triggered regularly, say every 5 minutes, and read the next element of the queue and execute the update, before removing that entry from the queue. This would mean each individual update fitted within the 6 minute window, but it would also mean that if there were 4 additional updates for every update to the main sheet it might be up to 24 minutes before all of them had been made.
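A minimal sketch of that queue, assuming Script Properties for storage and a hypothetical applyUpdate() standing in for the actual write logic (a lock guards concurrent enqueues; note that Script Properties values are capped at roughly 9 KB each, so a very large queue would need another store):

```
var QUEUE_KEY = 'updateQueue'; // hypothetical property key

function enqueueUpdate(update) {
  var lock = LockService.getScriptLock();
  lock.waitLock(10000); // avoid two executions rewriting the queue at once
  try {
    var props = PropertiesService.getScriptProperties();
    var queue = JSON.parse(props.getProperty(QUEUE_KEY) || '[]');
    queue.push(update);
    props.setProperty(QUEUE_KEY, JSON.stringify(queue));
  } finally {
    lock.releaseLock();
  }
}

// Attach to a time-driven trigger, e.g. every 5 minutes.
function processNextUpdate() {
  var props = PropertiesService.getScriptProperties();
  var queue = JSON.parse(props.getProperty(QUEUE_KEY) || '[]');
  if (!queue.length) return;
  applyUpdate(queue.shift());                          // hypothetical worker function
  props.setProperty(QUEUE_KEY, JSON.stringify(queue)); // persist only after success
}
```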

How do I obtain the last mod time for a SHEET in a Google workBOOK?

I have a bunch of workbooks that have dozens... up to 100... sheets in them, and we do an import process, loading one sheet after the other, on these workbooks that tends to get throttled when we collate our entire body of data. In other words, we read every sheet in every workbook and partway through the process, we get bogged down by the fact that Google is putting off our calls.
What I'd like to do is look at the last modification time for each sheet before I request it. If it is more recent than the data I've cached for that sheet, I'll download it. Otherwise, I'll skip it and use my cached version.
It's absolutely possible to get the last mod time for the workbook, but if I used that, and only one sheet in the book changed, I'd be stuck downloading all 100 sheets in that book when I may only need one.
I've looked at the idea of having Google notify me any time a cell changes, but there are limits on how many notifications are sent in a day. If I were to activate the notification for all the sheets I'm watching, and turn them off as I mark cached data as being dirty, I would still have the problem that I wouldn't know if I'd suddenly stopped getting notifications because I was being throttled. There are some sneaky workarounds to this, but none of them would notice if I'd been throttled for a while, but wasn't anymore.
So, how do I find the last modification time for a sheet - preferably via one of the Python libraries?
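As far as I know, neither the Sheets API nor Drive exposes a per-sheet modification time, so one workaround is to maintain the timestamps yourself. A sketch in Apps Script rather than Python (it assumes you can bind a script to each workbook and attach an installable onEdit trigger): stamp each edit into a hidden "_meta" tab, which the Python importer can then read in a single cheap call before deciding which sheets to download.

```
// Installable onEdit trigger: record the edited sheet's name and a timestamp
// in a hidden "_meta" tab (column A = sheet name, column B = last edit time).
function recordSheetEdit(e) {
  var ss = e.source;
  var edited = e.range.getSheet().getName();
  if (edited === '_meta') return; // don't recurse on our own writes
  var meta = ss.getSheetByName('_meta');
  if (!meta) meta = ss.insertSheet('_meta').hideSheet();
  var rows = meta.getLastRow();
  if (rows > 0) {
    var names = meta.getRange(1, 1, rows, 1).getValues();
    for (var i = 0; i < names.length; i++) {
      if (names[i][0] === edited) {
        meta.getRange(i + 1, 2).setValue(new Date()); // update existing entry
        return;
      }
    }
  }
  meta.appendRow([edited, new Date()]); // first edit seen for this sheet
}
```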