How to speed up several hundred calls to DriveApp.getFolderById(id) - google-apps-script

I noticed that folder = DriveApp.getFolderById(id); takes a lot of time, and I would like to know if there is a faster way of retrieving hundreds of folders by their IDs.
I stored info about paths and folder IDs in ScriptProperties. Without doing that, building the tree takes about 45 seconds.
If I use that and provide a 'dummy' name (= method C), the whole function, containing the code below plus other code, takes just under 1 second.
If I get the names of the folders the normal way, calling method A (folder = DriveApp.getFolderById(id)) 315 times, it takes 27 seconds.
I can optimize this by (effectively) removing calls to getFolderById for folders that have several parents (= method B), but that still takes 22 seconds.
Using DriveApp.getFolders() and preparing the names before the loop (= method D) reduces the time to about 4 seconds. Workable, but still quite long.
for (var i = 1; i < numPathes; i++) { // Create treeItems using the pathes generated
  var folderPath = pathes[i];
  var arrPathes = folderPath.split(underscore);
  var numArrPathes = arrPathes.length;
  var folderIndex = arrPathes[--numArrPathes];
  var parentIndex = arrPathes[--numArrPathes];
  var folderId = folderIds[folderIndex];
  ///////////// Alternatives
  // Method A) --> takes 27 seconds
  var folder = DriveApp.getFolderById(folderId);
  var folderName = folder.getName(); // The normal way
  // Method B) --> takes 22 seconds
  var folderName = folderNames[folderIndex]; // Prepared before the current loop
  // Method C) --> takes 1 second (just for reference)
  var folderName = folderId; // Obtained by folder.getId() before (= dummy)
  // Method D) --> takes 4 seconds
  var folderName = folderNames[folderId]; // Prepared using DriveApp.getFolders()
  // Method E) --> takes 1 second
  var folderName = folderNames[folderId]; // Using CacheService (mentioned by Zig Mandel)
  ///////////// End alternatives
  var txtFolder = underscore + folderId + underscore + folderPath;
  var chk = appLocal.createCheckBox(folderName).setValue(status)
      .setId(chkPrefix + txtFolder).setName(chkPrefix + txtFolder)
      .addClickHandler(onChangeStatusChkTreeItem);
  var tri = appLocal.createTreeItem(chk).setId(triPrefix + txtFolder);
  treeItems[i] = tri;
  treeItems[parentIndex].addItem(tri);
}
tree.addItem(treeItems[0]); // Add the root to the tree --> all treeItems will be added as well
So my question is: is there a way to make several hundred calls to getFolderById(id) fast? I am thinking about caching, but how?
To me it seems GAS lacks a (fast?) map from IDs to folders.
EDIT-1
I implemented caching, mapping folder IDs to folder names (method E).
Currently I use a trigger to update the cache and ScriptProperties every 5 hours and 50 minutes.
I will implement validation of the data in the background using a trigger while the program is running, updating the cache and rebuilding the tree if required.
This approach makes it possible to show a tree containing hundreds of folders within a few seconds and without the user waiting for the UI to appear.
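A minimal sketch of that refresh pattern (not my exact code; the function name and cache key are illustrative, and for many hundreds of folders the map may need to be split across several cache entries, since a cache value is limited to about 100 KB):

function refreshFolderNameCache() {
  // Remove previously fired one-shot triggers for this handler
  ScriptApp.getProjectTriggers().forEach(function (t) {
    if (t.getHandlerFunction() === 'refreshFolderNameCache') {
      ScriptApp.deleteTrigger(t);
    }
  });
  // Rebuild the id -> name map
  var map = {};
  var folders = DriveApp.getFolders();
  while (folders.hasNext()) {
    var folder = folders.next();
    map[folder.getId()] = folder.getName();
  }
  // 21600 seconds = 6 hours, the maximum cache lifetime
  CacheService.getScriptCache().put('folderNames', JSON.stringify(map), 21600);
  // Schedule the next refresh 5 hours and 50 minutes from now
  ScriptApp.newTrigger('refreshFolderNameCache')
    .timeBased()
    .after((5 * 60 + 50) * 60 * 1000)
    .create();
}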

There is no way to improve the call times to the Google APIs.
However, if the folder names don't change often, you can cache them in a public cache (see CacheService) for up to 6 hours by mapping the folder ID (cache key) to the folder name (cache value).
Don't use getAllFolders, as it has a maximum limit and may not get them all.
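For illustration, a lookup along those lines could look like this (a sketch only: the cache key name is arbitrary, and re-writing the whole map on every miss is something you would refine in practice):

function getFolderNameCached(folderId) {
  var cache = CacheService.getScriptCache();
  var json = cache.get('folderNames'); // assumed key holding a JSON map of id -> name
  var map = json ? JSON.parse(json) : {};
  if (!map[folderId]) {
    map[folderId] = DriveApp.getFolderById(folderId).getName(); // slow path, only on a cache miss
    cache.put('folderNames', JSON.stringify(map), 21600); // keep for up to 6 hours
  }
  return map[folderId];
}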

Related

Google Script timeout after 6 min

I want to list all the folder names and their folder IDs present in a Team Drive (more than 3000 folders). I am using a spreadsheet and running the following code in a script:
function listFilesInFolder(folderName) {
  var sheet = SpreadsheetApp.getActiveSheet();
  sheet.appendRow(['Name', 'File-Id']);
  // change the folder ID below to reflect your folder's ID (look in the URL when you're in your folder)
  var folder = DriveApp.getFolderById('folder ID');
  var contents = folder.getFolders();
  var cnt = 0;
  while (contents.hasNext()) {
    var folder = contents.next();
    cnt++;
    var data = [
      folder.getName(),
      folder.getId(),
    ];
    sheet.appendRow(data);
  }
}
But this is hitting the "Exceeded maximum execution time" error, since the limit is 6 minutes by default.
I tried adding triggers from the Apps Script editor, but after triggering, the script starts again from the beginning and still ends after 6 minutes.
How do I add a trigger that starts from where it left off?
Answer:
The slow part of this script is the repeated call to sheet.appendRow(). You can speed this up by pushing the results to an array and setting the values at the end of the loop, rather than appending a row on each iteration of the while loop.
More Information:
Using built-in services such as SpreadsheetApp can often be slow when making many changes to a sheet in a short space of time. You can combat this by minimising the number of calls to the built-in Apps Script services as much as possible, relying on pure JavaScript to do your processing.
Code Change:
function listFilesInFolder(folderName) {
  const sheet = SpreadsheetApp.getActiveSheet()
  // change the folder ID below to reflect your folder's ID (look in the URL when you're in your folder)
  let folder = DriveApp.getFolderById('')
  let contents = folder.getFolders()
  let cnt = 0
  let data = [['Name', 'File-Id']]
  while (contents.hasNext()) {
    folder = contents.next()
    cnt++
    data.push([
      folder.getName(),
      folder.getId(),
    ])
  }
  sheet.insertRows(sheet.getMaxRows(), data.length)
  sheet.getRange(2, 1, data.length, 2).setValues(data)
}
Code changes:
data is declared as an array initialised with the header row, as opposed to appending it directly to the sheet
On each iteration of the loop, the current folder's name and ID are appended to the data array as a new row of data.
After all folders have been looped through, the number of rows in the sheet is extended by the number of rows in data, so as not to hit an out-of-bounds error.
All rows inside data are added to the sheet using setValues().
In my test environment, I had the following set up:
Drive folder containing 3424 folders
Using the appendRow() method inside the while loop, execution took 1105.256 seconds (or 18 minutes)
Using push() with the .setValues() method outside the loop, execution took 4.478 seconds.
References:
Class Range - setValues() | Apps Script | Google Developers

Script to open a long list of URLs one at a time every 5 seconds and then open the next URL

I have a google spreadsheet with a large number of API Urls.
They look like this => http://oasis.caiso.com/oasisapi/SingleZip?resultformat=6&queryname=PRC_LMP&version=1&startdatetime=20160101T08:00-0000&enddatetime=20160103T08:00-0000&market_run_id=DAM&grp_type=ALL
The database I am drawing from limits requests to one every 5 seconds.
When you follow the link it will download a zip file with csv files.
I would like to write a script that will follow a URL, wait 6 seconds and then move on to the next URL on the list.
I would like it to stop when it gets to the last URL.
I am imagining that I would need to use a "while" loop, but I cannot figure out how to add a wait period, or how to get it to open the URL.
HELP!!!!
I tried a batch URL follow, which failed because of the timing issue.
I began to write the while loop, but I am totally stuck.
I would like to run through the huge list of links fully once. To date I cannot make anything work.
function flink() {
  var app = SpreadsheetApp
  // access the current open sheet
  var activesheet = app.getActiveSpreadsheet().getActiveSheet()
  var activecell = activesheet.getRange(11, 11).openurl
  // I am getting totally stuck here
}
I have tried using an iterator, but I have no idea how to add the time delay, and I cannot seem to get the syntax for the iterator right.
To access a URL from Apps Script, you can use UrlFetchApp.fetch(url).
To force the script to wait for a certain amount of time, you can use Utilities.sleep(milliseconds).
References:
https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app#fetch(String)
https://developers.google.com/apps-script/reference/utilities/utilities#sleepmilliseconds
I came up with this code that accesses each URL in the sheet. I added some comments so you can see what the script is doing step by step:
function accessURLs() {
  // Copy urls
  var ss = SpreadsheetApp.getActive();
  var sheet = ss.getSheetByName("Links up to today");
  // Copy the contents of the url to another column removing the formulas (your script will appreciate this)
  sheet.getRange("C3:C").copyTo(sheet.getRange("D3:D"), {contentsOnly: true});
  var column = 4; // URLs are in column 4
  var row = 3; // URLs start at row 3
  var url = sheet.getRange(row, column).getValue();
  var responses = [];
  // Loop through all urls in the sheet until it finds a cell that is blank (no more urls left)
  do {
    var options = {
      muteHttpExceptions: true
    };
    var response = UrlFetchApp.fetch(url, options); // Script makes a request to the url
    responses.push(response); // The latest response is added to the responses array
    Utilities.sleep(6000); // The script stops for 6 seconds
    row++;
    url = sheet.getRange(row, column).getValue();
  } while (url != ""); // Cell is not blank
}
Take into account that if we are to access ~1400 URLs and the script stops for 6 seconds after each fetch, it will take more than 2 hours to access all URLs.
Also, take into account that this script is just getting the data coming from the URL requests (it's stored in the variable responses), but it is not doing anything else. Depending on what you want to do with this data, you might want to add some extra stuff there.
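For example, since each URL returns a zip file, a small helper like the one below could be called inside the loop right after the fetch if you just want to keep the downloaded files (a sketch only; 'CAISO downloads' is a made-up folder name that you would need to create first):

// Call as saveResponseToDrive(response, row) right after UrlFetchApp.fetch()
function saveResponseToDrive(response, row) {
  var targetFolder = DriveApp.getFoldersByName('CAISO downloads').next(); // hypothetical folder
  targetFolder.createFile(response.getBlob().setName('download_row_' + row + '.zip'));
}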
I hope this helps.

Using DocumentApp's append functions to 'MailMerge' data into single Google Document

Update: I have updated my code with some of the suggestions, as well as a feature that allows for multiple markers, and changed arrayOfData into a 2D array of strings. It has similar runtimes, if not slightly slower - 50 pg avg: 12.499 s, 100 pg avg: 21.688 s, per-page avg: 0.233 s.
I am writing a script that takes some data and a template and performs a 'mail merge' type function into another document. The general idea for this is easy and I can do it no problem.
However, I am currently 'mail merging' many rows (150-300+) of ~5 columns of data into predefined fields of a single-page template (certificates) in a single document. The result is a single Google Document with 150-300 certificate pages. The alternative is to generate many documents and, somehow, combine them.
Is This a Good/Efficient Way of Doing This?
It took me a while to work out how to put together this example from the documentation alone, as I couldn't find anything online. I feel like there should be a simpler way to do this but cannot find anything close to it (i.e. appending a Body to a Body). Is this the best way to get this functionality right now?
Edit: What about using bytes from the Body's Blob? I'm not experienced with this but would it work faster? Though then the issue becomes replacing text without generating many Documents before converting to Blobs?
*Note: I know Code Review exists, but they don't seem to have many users who understand Google Apps Script well enough to offer improvement. There is a much bigger community here! Please excuse it this time.
Here Is My Code (Updated Feb 23, 2018 @ 3:00 PM PST)
Essentially it takes each child element of the Body, replaces some fields, then detects its type and appends it using the appropriate append function.
/* Document Generation Statistics:
 *   50 elements replaced:
 *     [12.482 seconds total runtime]
 *     [13.272 seconds total runtime]
 *     [12.069 seconds total runtime]
 *     [12.719 seconds total runtime]
 *     [11.951 seconds total runtime]
 *
 *   100 elements replaced:
 *     [22.265 seconds total runtime]
 *     [21.111 seconds total runtime]
 */
var TEMPLATE_ID = "Document_ID";

function createCerts() {
  createOneDocumentFromTemplate(
    [
      ['John', 'Doe'], ['Jane', 'Doe'], ['Jack', 'Turner'], ['Jordan', 'Bell'], ['Lacy', 'Kim']
    ],
    ["<<First>>", "<<Last>>"]);
}

function createOneDocumentFromTemplate(arrayOfData, arrayOfMarkers) {
  var file = DriveApp.getFileById(TEMPLATE_ID).makeCopy("Certificates");
  var doc = DocumentApp.openById(file.getId());
  var body = doc.getBody();
  var fixed = body.copy();
  body.clear();
  var copy;
  for (var j = 0; j < arrayOfData.length; j++) {
    var item = arrayOfData[j];
    copy = fixed.copy();
    for (var i = 1; i < copy.getNumChildren() - 1; i++) {
      for (var k = 0; k < arrayOfMarkers.length; k++) {
        copy.replaceText(arrayOfMarkers[k], item[k]);
      }
      switch (copy.getChild(i).getType()) {
        case DocumentApp.ElementType.PARAGRAPH:
          body.appendParagraph(copy.getChild(i).asParagraph().copy());
          break;
        case DocumentApp.ElementType.LIST_ITEM:
          body.appendListItem(copy.getChild(i).asListItem().copy());
          break;
        case DocumentApp.ElementType.TABLE:
          body.appendTable(copy.getChild(i).asTable().copy());
          break;
      }
    }
  }
  doc.saveAndClose();
  return doc;
}
Gist
This is more of a Code Review question, but no, as written, I don't see any way to make it more efficient. I run a similar script for creating documents at work, though mine creates separate PDF files to share with the user rather than creating something we would print. It may save you time and effort to look into an AddOn like docAppender (if you're coming from a form) or autoCrat.
A couple of suggestions:
I'm more of a for loop person because it's easier to log errors on particular rows with the indexing variable. It's also more efficient if you're pulling from a spreadsheet where some rows could be skipped (already merged, let's say). Using forEach gives more readable code and is good if you always want to go over the entire array, but is less flexible with conditionals. Using a for loop will also allow you to set a row as merged with a boolean variable in the last column.
The other thing I can suggest would be to use some kind of time-based test to stop execution before you time the script out, especially if you're merging hundreds of rows of data.
// Limit script execution to 4.5 minutes to avoid execution timeouts
// @param {Date} starttime - Date object captured at the start of the loop
// @return {Boolean}
function isTimeUp_(starttime) {
  var now = new Date();
  return now.getTime() - starttime.getTime() > 270000; // 4.5 minutes
}
Then, inside your function:
var starttime = new Date();
for (var i = 0; i < replace.length; i++) { // a for loop, so break works (it won't inside a forEach callback)
  // include this check somewhere before you begin writing data
  if (isTimeUp_(starttime)) {
    Logger.log("Time up, finished on row " + i);
    break;
  }
  ... // rest of your merge logic for replace[i]
}

How to speed up searching DriveApp files using GAS

I noticed that just looping through the files stored on Google Drive takes a LOT of time.
var startTime = Date.now();
var count = 0;
var files = DriveApp.getFiles();
while (files.hasNext()) { count++; var file = files.next(); }
var endTime = Date.now();
Logger.log('Loop through ' + count + ' files takes ' + ((endTime - startTime) / 1000) + ' seconds');
It takes about 1 second to loop through 100 files.
Storing file info in the cache and looping through it after retrieval makes it possible to handle about 20,000 files per second (on my system).
var startTime = Date.now();
var fileIds = getFileIds(); // retrieve info from the cache (stored before)
var count = 0;
var numFiles = fileIds.length;
for (var i = 0; i < numFiles; i++) { count++; var file = fileIds[i]; }
var endTime = Date.now();
Logger.log('Loop through ' + count + ' files takes ' + ((endTime - startTime) / 1000) + ' seconds');
The results above are nothing special, but they make you wonder whether it is possible to speed up certain actions once you have stored file info in the cache.
In my case I notice that specifying several search criteria and performing a search
var files = DriveApp.searchFiles(criteria);
might take a lot of time (over 20 seconds) to process the results.
So I wonder if there is a way to speed up searching for files.
Does anybody have ideas on how to speed this up, and/or how to avoid looping through all files the way described in the first test?
Not possible to speed it up. The comparison you make is not very relevant, because the 2nd time you are not making any Drive API calls.
That 2nd test is just measuring the time it takes to run a tight loop with no API calls.
The first time, all the time is consumed calling next(), which does a round trip to the Drive API.
If your data doesn't change often, you can use the cache to avoid making the same searches again. Make sure to deal with stale results and such.
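A rough sketch of that idea (not part of the original answer; the key naming and staleness handling are up to you, and keep in mind cache keys and values have size limits):

// Cache the IDs returned by a Drive search for up to 6 hours
function searchFilesCached(criteria) {
  var cache = CacheService.getScriptCache();
  var key = 'search_' + criteria; // keys are limited to 250 characters
  var cached = cache.get(key);
  if (cached) {
    return JSON.parse(cached); // array of file IDs from a previous run
  }
  var ids = [];
  var files = DriveApp.searchFiles(criteria);
  while (files.hasNext()) {
    ids.push(files.next().getId());
  }
  cache.put(key, JSON.stringify(ids), 21600); // 6 hours, the maximum
  return ids;
}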

How do I return all folder names and IDs without reaching execution limit?

I am attempting to retrieve a list of all folders and their respective IDs using a Google Apps Script. Currently, I am writing the result to an array which is then posted to a spreadsheet every 5000 records. Unfortunately, the script reaches the execution limit (5 minutes) before completion. How can I work around this? And, would I have more success doing RESTful API calls over HTTP than using Apps Script?
I've noted the following:
Code already follows Google's bulk-writes best practice.
Slow execution is a result of Apps Script indexing Drive slowly.
Results appear to follow a consistent indexing pattern.
Multiple runs produce results in the same order.
It is unknown how items are re-indexed upon addition, which prevents meaningful caching between runs.
Deltas are not reliable unless the indexing method is identified.
Looked into Drive caching.
Still required to loop through the FolderIterator object.
Theoretical performance would be even worse, IMO (correct?).
Code is below:
function LogAllFolders() {
  var ss_index = 1;
  var idx = 0;
  var folder;
  var data = new Array(5000);
  for (var i = 0; i < 5000; i++) {
    data[i] = new Array(2);
  }
  var ss = SpreadsheetApp.create("FolderInv2", 1, 2).getSheets()[0];
  var root = DriveApp.getFolders();
  while (root.hasNext()) {
    folder = root.next();
    data[idx][0] = folder.getName();
    data[idx][1] = folder.getId();
    idx++;
    if ((ss_index % 5000) == 0) {
      ss.insertRowsAfter(ss.getLastRow() + 1, 5000);
      ss.getRange(ss.getLastRow() + 1, 1, 5000, 2).setValues(data);
      SpreadsheetApp.flush();
      idx = 0;
    }
    ss_index++;
  }
}
I would first collect all the folder IDs you want to process, then save the folder ID (or maybe the array index) that you've processed so far to your project properties, run the job as a cron-style time-driven trigger every five minutes, and just resume from that folder ID or index that you saved previously.
I guess when it's done, remove the trigger programmatically.
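A minimal sketch of that resume pattern (an adaptation: it saves the folder iterator's continuation token in script properties rather than a raw index; the property keys and the stored spreadsheet ID are hypothetical):

// Run this from a time-driven trigger every 5 minutes; it resumes where it left off.
function logFoldersBatch() {
  var props = PropertiesService.getScriptProperties();
  var token = props.getProperty('FOLDER_CONTINUATION_TOKEN');
  var folders = token ? DriveApp.continueFolderIterator(token) : DriveApp.getFolders();
  var sheet = SpreadsheetApp.openById(props.getProperty('FOLDER_INV_SHEET_ID')).getSheets()[0];
  var data = [];
  var start = Date.now();
  // Collect rows for up to 4 minutes, staying safely under the execution limit
  while (folders.hasNext() && Date.now() - start < 4 * 60 * 1000) {
    var folder = folders.next();
    data.push([folder.getName(), folder.getId()]);
  }
  if (data.length) {
    var firstRow = sheet.getLastRow() + 1;
    var needed = firstRow + data.length - 1 - sheet.getMaxRows();
    if (needed > 0) sheet.insertRowsAfter(sheet.getMaxRows(), needed); // grow the sheet first
    sheet.getRange(firstRow, 1, data.length, 2).setValues(data);
  }
  if (folders.hasNext()) {
    props.setProperty('FOLDER_CONTINUATION_TOKEN', folders.getContinuationToken());
  } else {
    // Finished: clear the token and remove this function's triggers
    props.deleteProperty('FOLDER_CONTINUATION_TOKEN');
    ScriptApp.getProjectTriggers().forEach(function (t) {
      if (t.getHandlerFunction() === 'logFoldersBatch') ScriptApp.deleteTrigger(t);
    });
  }
}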