Timeout errors with Google Slides API, and terrible write performance - google-apps-script

I'm appending dozens of slides to presentations with a simple Apps Script:
var presentation_to = SlidesApp.openById(presentation_to_id);
var presentation_from = SlidesApp.openById(presentation_from_id);
var slide = presentation_from.getSlideById(slide_id);
var newSlide = presentation_to.appendSlide(slide);
I have an Apps Script API endpoint for that.
I started with one Apps Script that would loop through all the slide IDs, but this had terrible performance and would time out after 5 minutes. I've since split my calls to the Apps Script API endpoint to ask for one slide at a time, with parallelization (so I run several requests to add slides to the same presentation at once).
When the slides have big pictures in them, I still end up getting this:
Google::Apis::TransmissionError: execution expired
Is appendSlide() performance so bad that what I want to do is not possible, or is there a way to make it work without having to wait an hour to generate one 50-slide presentation?
PS: You'll find attached the logs of the script. Each line is meant to append ONE slide to a presentation (always the same destination). The execution times and error rates are through the roof. Is performance simply limited by Google, or is there a way to work around this issue?

Considerations
Using the Apps Script built-in SlidesApp class, you are essentially making an API call every time you run the .appendSlide() method. This causes significant network overhead when your script inserts a lot of slides.
Generally this is solved by using batch requests via Advanced Google services, which bundle many operations into a single API call.
Unfortunately, there is no request type for copying a Page from another presentation that you could include in a batch operation. If this is really important for your workflow, you should consider filing a feature request.
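For illustration, here is a minimal sketch of what a batched call looks like with the advanced Slides service (the Slides API must be enabled for the project; the function name is only illustrative). It duplicates several slides within the same presentation in a single network round trip, which is exactly the kind of batching that is not currently available for copies across presentations:
function duplicateSlidesInOneBatch(presentationId, slideObjectIds) {
  // One duplicateObject request per slide, all sent in a single batchUpdate call.
  var requests = slideObjectIds.map(function(objectId) {
    return { duplicateObject: { objectId: objectId } };
  });
  // A single HTTP request instead of one request per slide.
  Slides.Presentations.batchUpdate({ requests: requests }, presentationId);
}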
Reference
Slides API

Related

High Traffic & Excessive Script Execution Time

I have a container-bound Apps Script Project, bound to a Form Response Google Sheet and triggered on form submit. The script runs as me. I'm dealing with execution runtimes 6-8x the nominal run time during peak hours of the day, which seems largely correlated with increased traffic of form submissions. The program follows this series of steps:
Form response collects a string of information and triggers the Apps Script Project to execute
Project creates a copy of a complex Google sheet of a few tabs and a lot of complex formulas
Project pastes the string collected from the form response into this Google Sheet copy, flushes the sheet, and then returns a string of results calculated within the sheet
The Google Sheet file is named, and the Project creates a unique Drive folder where this sheet eventually gets moved to
Run complete
The Project performs a wide variety of getValue() and setValue() calls throughout the run as it updates cell values or reads calculated results. Over the last year, I've improved optimization in many ways (i.e. batch calls to getValues()/setValues(), batch API calls, etc.). Its normal run time is 25-45 seconds, but it increases to 200+ seconds during my company's peak business hours. Using logs, there is no one particular step that gets hung up; rather, the script lags in all aspects (when it creates the file copy, in SpreadsheetApp.flush(), when it appends or deletes rows in other Google Sheets it references by sheet ID, etc.). That said, a common reason for a failed execution is the error message "Service Spreadsheets timed out while accessing document with id...", but most of the executions complete successfully, just after a lengthy run.
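For reference, the getValues()/setValues() batching described above looks roughly like this (the sheet name and ranges are placeholders, not taken from the project):
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Results');

// Per-cell access: one round trip to the Sheets service per call (slow).
var a = sheet.getRange('A2').getValue();
var b = sheet.getRange('B2').getValue();

// Batched access: one round trip for the whole block (much faster).
var values = sheet.getRange('A2:B2').getValues();   // [[a, b]]
sheet.getRange('C2:D2').setValues([[values[0][0], values[0][1]]]);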
It seems there is a clear correlation between execution time and traffic. So my questions are:
Can anyone confirm that theory?
Is there a way I can increase the service's bandwidth for my project?
Is there a better alternative to having the script run as me? (mind you I'm performing the rest of my job using Chrome throughout the day while this program continues to automatically run in the background)
This is an Apps Script managed project; it is not tied to a personal GCP Project. Is that a potential factor at play here?

Does this risk me to exceed Google spreadsheet API calls quota?

I am using a Google Spreadsheet to collaborate on some common data, processing it with spreadsheet macros and saving the results back into cells. This approach is error prone, as the macro functions that process the data depend on inputs beyond what is passed as parameters.
Example of how the common data looks:
Content Sheet
Sections Sheet
Pages Sheet
Explanation of this common data
The three sheets are filled by various collaborators
The Content sheet defines the base elements of a page; they are referenced (by their UUIDs) in the Sections sheet (using macros), and finally all sections are combined into a publishable HTML page.
The output HTML format varies depending on the destination, of which there are several: static HTML, Google Docs, Google Slides, WordPress, MediaWiki, etc. The formatting is done using GAS macros.
I tried various solutions, but nothing works properly. Finally, I have decided to keep the Google Spreadsheet as a data source only and move the entire computation to my local machine. This means I have to download the data of the three sheets, currently around 1,000 rows, though it may grow substantially over time.
Earlier, when the entire computation was on Google Spreadsheets, I only had to fetch the final data and use it, which amounted to far fewer API calls. Referring to the example above, it means I would only fetch the output HTML of the "Pages sheet".
Q1) So my question is: given that I plan to move the entire computation to my local machine, if I make only 3 API calls to bulk-fetch the data of the three sheets, does it still count as just 3 API calls, or do large requests have a different API cost? Are there any hidden risks of overrunning the quota with this approach?
Q2) And if I use hooks, both spreadsheet hooks and Drive hooks, to sync with changes in real time, what is my risk of running out of quota? Does each hook call count as an API call? Any general advice on best practices for doing this?
Thanks
As the documentation says, by default the Google Sheets API has a limit of 500 requests per 100 seconds per project, and 100 requests per 100 seconds per user, unless you change it on your project.
A1) A single request counts as 1 request no matter how large the data is. For that reason, I'd make a single request to GET the entire spreadsheet rather than making 3 API calls, and then run the whole process you mentioned.
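For example, the three tabs can be pulled in one round trip via spreadsheets.values.batchGet. A rough sketch using the Apps Script advanced Sheets service (sheet names taken from the question, the function name is illustrative; the same request shape applies when calling the REST endpoint from your local machine):
function fetchAllThreeSheets(spreadsheetId) {
  // Several ranges, but still a single HTTP request.
  var response = Sheets.Spreadsheets.Values.batchGet(spreadsheetId, {
    ranges: ['Content', 'Sections', 'Pages']
  });
  return response.valueRanges;  // one entry per requested range
}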
A2) If you've considered using push notifications, keep in mind that you will need some extra configuration, such as a domain.
As a workaround, I recommend using Drive File Stream.
Since you want to move the computation to your local machine, this option may suit you better, but it will need some changes. For example, you could change the output to CSV to make the data easier to handle (it is still a file type Google Sheets accepts), and then host your app on the same machine or make your local machine accessible.

Guarantee mutual exclusion on access and modification of google sheet data in app script

I have an Apps Script deployed web app that uses a Google Sheet as a crappy database. I was wondering how I can guarantee mutual exclusion when accessing and modifying data from the Apps Script (like a mutex/semaphore).
I'm concerned because instances of the web app cannot share variables (obviously), and I'm not sure that access to Google Sheet data is fast enough to implement something like a semaphore in the sheet itself.
Thanks!
You can use the LockService to achieve that.
That said, I think you should minimize its usage as much as you can, to prevent your app from slowing down even more (Apps Script and Sheets are not very fast to begin with). Set up the data in your spreadsheet so that you can fetch everything you need in one go, and likewise when writing it back.
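A minimal sketch of what that could look like, assuming the data lives in a single sheet (the spreadsheet ID and sheet name are placeholders):
function appendRowSafely(rowValues) {
  // Take a script-wide lock so concurrent web app requests cannot interleave.
  var lock = LockService.getScriptLock();
  lock.waitLock(30000);  // wait up to 30 s for other executions to finish
  try {
    var sheet = SpreadsheetApp.openById('SPREADSHEET_ID').getSheetByName('Data');
    sheet.appendRow(rowValues);   // one batched write
    SpreadsheetApp.flush();       // make sure the write is committed before releasing
  } finally {
    lock.releaseLock();
  }
}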

Google apps script slow parsing

I was trying to parse vmstat output using Google Apps Script so that everyone in the company could use this to create a graph of the data. You can find my code below, but it is really slow. Is there something I can do to make this better, or isn't Google Apps Script suitable for this? The problem is the amount of rows that needs to be processed. Any suggestions are welcome.
function doGet() {
  // 'id' is the Drive file ID of the vmstat output file (not defined in this snippet).
  var file = DriveApp.getFileById(id);
  var docContent = file.getAs('application/octet-stream').getDataAsString();

  var data = Charts.newDataTable()
      .addColumn(Charts.ColumnType.STRING, 'TIME')
      .addColumn(Charts.ColumnType.NUMBER, 'Memory');

  var lines = docContent.split("\n");
  Logger.log(lines.length);

  var i = 1;
  lines.forEach(function(line) {
    // Skip the vmstat header lines (the ones containing "mem" and "free").
    if ((line.indexOf('mem') < 0) && (line.indexOf('free') < 0)) {
      var values = line.match(/\S+/g);
      data.addRow(['5', parseInt(values[3])]);
      Logger.log(i);
    }
    if (i == 20) {
      return; // note: this only skips the current iteration, it does not stop forEach
    }
    i++;
  });

  for (var i = 0; i < lines.length; i++) {
    data.addRow(['5', 10]);
  }
  data.build();

  var chart = Charts.newAreaChart()
      .setDataTable(data)
      .setStacked()
      .setRange(0, 400)
      .setTitle('Memory')
      .build();

  return UiApp.createApplication().add(chart);
}
This isn't so much a problem of code optimization (although the code isn't perfect) as one of division of work.
The accepted approach to web application performance optimization involves separating three concerns: presentation, business logic, and data access. With the exception of the generation of the vmstat output, you've got all of that in one place, making the user wait while you locate a file on Google Drive (using two exhaustive searches, btw), then parse it into a Charts DataTable, and finally generate HTML (via UiApp).
You may find that the accessibility of a Google Apps Script presentation is useful to your organization. (I know in my workplace that our IT folks clamp down on in-house web servers, for example.) If so, consider what you have as a prototype, and refactor it to give better perceived performance.
Presentation: Move from UiApp + Charts to HtmlService + Google Visualization. This moves the generation of the chart into the web client, instead of keeping it on the server. This will give a faster page load, to start.
Business Logic: This will be the rules that map your data into the Visualization. Like the Charts Service that is built over it, GViz uses DataTables with column definitions and rows of data.
One option here is to repeat the column definition & data load you already have, except on the client in JavaScript. Doing that will be significantly faster than via Google Apps Script.
A second option, which is even faster, especially with large datasets, is to load the data from an array.
google.visualization.arrayToDataTable(...)
Either way, you need to get your data to the JavaScript function that will build your chart.
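A rough sketch of that client-side piece, assuming a server function named getVmstatData (a placeholder, not part of the original code) that returns the rows:
<div id="chart_div"></div>
<script src="https://www.gstatic.com/charts/loader.js"></script>
<script>
  google.charts.load('current', { packages: ['corechart'] });
  google.charts.setOnLoadCallback(function() {
    // Ask the Apps Script server for the rows, then draw the chart in the browser.
    google.script.run
        .withSuccessHandler(function(rows) {
          // rows is an array like [['TIME', 'Memory'], ['5', 123], ...]
          var data = google.visualization.arrayToDataTable(rows);
          var chart = new google.visualization.AreaChart(document.getElementById('chart_div'));
          chart.draw(data, { title: 'Memory', isStacked: true });
        })
        .getVmstatData();
  });
</script>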
Data Access: (I assume) you're currently running a shell script in Linux that calls vmstat and pipes the output to a file in your local Google Drive folder. (Alternatively, the script may be using the Drive API to push the file to Google Drive.) This file is plain text.
The change I'd make here would be to produce CSV output from vmstat, and use Google Apps Script to import the CSV into a spreadsheet. Then you can use Sheet.getSheetValues() to read all the data in one shot, in a server-side function called from the client JavaScript.
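That server-side function might look roughly like this (the spreadsheet ID and sheet name are placeholders):
// Server side: read the whole imported vmstat sheet in one call and hand it to the client.
function getVmstatData() {
  var sheet = SpreadsheetApp.openById('SPREADSHEET_ID').getSheetByName('vmstat');
  return sheet.getSheetValues(1, 1, sheet.getLastRow(), sheet.getLastColumn());
}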
This would not be as fast as a local server solution, but it's probably the best way to do this using the Google Apps Script environment.
Edit: See more about this in my blog post, Converting from UiApp + Chart Service to Html Service + Google Visualization API.

How to dynamically update a page served by a Web App? [duplicate]

This question already has answers here: build real time dashboard using google apps script (3 answers)
Closed 2 years ago.
I'm developing a Web App using Google Apps Script and a spreadsheet as storage.
Basically, it's an HTML page showing some tables for the different tabs.
From my app, users can add new tasks, edit tasks, and mark tasks as completed.
Since there are many users using the app, the data shown on each client gets outdated very fast.
I would like to update the app with new records and changes to the existing ones as soon as possible.
I thought of logging the last edited tab+row in a cell and pulling that data from the server every minute, but what if many entries are added/edited during that minute?
I think WebSockets are not possible. Any other ideas?
I'm using jQuery client-side.
To help avoid conflicts, give every task a unique ID. Something like creation time + random string. That way you can look it up in the spreadsheet. Also, I think the Lock Service can prevent concurrent edits temporarily to avoid conflicts:
https://developers.google.com/apps-script/reference/lock/
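For the unique ID, something along these lines would do (a sketch, not from the linked reference):
// Creation time + random string, e.g. "1717000000000-k3x9q1zp"
function newTaskId() {
  return Date.now() + '-' + Math.random().toString(36).substring(2, 10);
}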
To check for updates, try polling the last edit time of the spreadsheet. If it's greater than the previous poll, fetch updates.
https://developers.google.com/apps-script/reference/drive/file#getLastUpdated()
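A minimal polling sketch under those assumptions (the spreadsheet ID is a placeholder, and refreshTables() stands in for whatever client code re-renders your tables):
Server side (Code.gs):
function getLastUpdated() {
  // Last-updated timestamp of the spreadsheet file, as a number for easy comparison.
  return DriveApp.getFileById('SPREADSHEET_ID').getLastUpdated().getTime();
}
Client side, in the served HTML:
<script>
  // Poll every 60 s and refetch data only when the file has changed.
  var lastSeen = 0;
  setInterval(function() {
    google.script.run.withSuccessHandler(function(ts) {
      if (ts > lastSeen) {
        lastSeen = ts;
        refreshTables();  // placeholder: re-render the task tables
      }
    }).getLastUpdated();
  }, 60000);
</script>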
There is no other way besides polling. You can't have sockets or callbacks from the HTML service. You could poll frequently, but that may run you out of quota.
If you really want to poll and avoid quotas, you can log the last edit in a published public spreadsheet and read it with AJAX from the client; however, published spreadsheets only update once a minute.
You could try something like this:
var lock = LockService.getPublicLock();
var success = lock.tryLock(10000);  // wait up to 10 seconds to obtain the lock
if (success) {
  // check your spreadsheet for lastUpdated or PropertiesService.getScriptProperties();
} else {
  // do something else (try it again or error msg)
}
lock.releaseLock();
I have found that it works well on my app and I have around 1000 users.