How to speed up searching DriveApp files using GAS - google-apps-script

I noticed that just looping through the files stored on Google Drive takes a LOT of time.
var startTime = Date.now();
var count = 0;
var files = DriveApp.getFiles();
while (files.hasNext()) { count++; var file = files.next(); }
var endTime = Date.now();
Logger.log('Loop through ' + count + ' files takes ' + ((endTime-startTime)/1000) + ' seconds');
It takes about 1 second to loop through 100 files.
Storing file info in a cache and looping through it after retrieval makes it possible to handle about 20,000 files per second (on my system).
var startTime = Date.now();
var fileIds = getFileIds(); // retrieve file IDs from the cache (stored earlier)
var count = 0;
var numFiles = fileIds.length;
for (var i = 0; i < numFiles; i++) { count++; var file = fileIds[i]; }
var endTime = Date.now();
Logger.log('Loop through ' + count + ' files takes ' + ((endTime-startTime)/1000) + ' seconds');
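(The getFileIds() helper above is not defined in the question; here is a minimal sketch of what it might look like, assuming CacheService is used and the serialized ID list fits within the 100 KB per-key limit:)
function storeFileIds() {
  var ids = [];
  var files = DriveApp.getFiles();
  while (files.hasNext()) {
    ids.push(files.next().getId());
  }
  // Cache values max out at 100 KB per key and 6 hours (21600 seconds)
  CacheService.getScriptCache().put('fileIds', JSON.stringify(ids), 21600);
  return ids;
}
function getFileIds() {
  var cached = CacheService.getScriptCache().get('fileIds');
  return cached ? JSON.parse(cached) : storeFileIds(); // rebuild on a cache miss
}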
The results above are nothing special, but they make you wonder whether certain actions can be sped up once file info has been stored in a cache.
In my case I noticed that specifying several search criteria and performing a search
var files = DriveApp.searchFiles(criteria);
might take a lot of time (over 20 seconds) to process the results.
So I wonder if there is a way to speed up searching for files.
Does anybody have ideas on how to speed this up and/or avoid looping through all the files in the way described in the first test?

Not possible to speed it up. The comparison you make is not very relevant, because the second time you are not making any Drive API calls.
That second test is just measuring the time it takes to run a tight loop with no API calls.
The first time, all the time is consumed calling next(), which does a round trip to the Drive API.
If your data doesn't change often, you can use the cache to avoid making the same searches again. Make sure to deal with stale results and such.
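A sketch of that caching idea (the key scheme and the 6-hour TTL are illustrative, and it assumes the criteria string is short enough to serve as a cache key):
function searchFilesCached(criteria) {
  var cache = CacheService.getScriptCache();
  var key = 'search_' + criteria;
  var cached = cache.get(key);
  if (cached) return JSON.parse(cached); // cache hit: zero Drive API calls
  var ids = [];
  var files = DriveApp.searchFiles(criteria);
  while (files.hasNext()) {
    ids.push(files.next().getId());
  }
  cache.put(key, JSON.stringify(ids), 21600); // 6 hours is the maximum lifetime
  return ids; // may be stale for up to 6 hours if Drive changes in the meantime
}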

Related

Speed up adding Items to Forms with Apps Script

I'm wondering if there is a more efficient way to add items to Google Forms with Apps Script.
My script is taking a long time, and I'm wondering if there's any way to add items more efficiently, using an array or something, rather than one by one. Here's the code I'm running:
function addFormItems() {
  var startTime = new Date();
  var form = FormApp.create("Test");
  form.setIsQuiz(true);
  for (var i = 1; i < 100; i++) {
    form.addTextItem().setTitle("Question number " + i).setHelpText("Hint: answer should not be more than a couple of words");
  }
  var endTime = new Date();
  // Get runtime in seconds
  var runTime = (endTime - startTime) / 1000;
  Logger.log("runtime is: " + runTime);
}
Currently it takes quite a long time: a minute to a minute and a half. (The odd thing is that every time I execute it I get a very different runtime; not sure why that happens.) Any thoughts on how to speed this up are much appreciated.
I searched Documentation and couldn't find anything about adding multiple items with one call.
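FormApp itself has no batch call, but for what it's worth, the separate Google Forms REST API does have a batchUpdate method that can create many items in one HTTP request. A speculative sketch, not from this thread (it assumes the script has been granted the forms.body OAuth scope and that calling the REST API from Apps Script is acceptable):
function addFormItemsBatch() {
  var form = FormApp.create("Test");
  form.setIsQuiz(true);
  var requests = [];
  for (var i = 1; i < 100; i++) {
    requests.push({
      createItem: {
        item: {
          title: "Question number " + i,
          description: "Hint: answer should not be more than a couple of words",
          questionItem: { question: { textQuestion: {} } }
        },
        location: { index: i - 1 }
      }
    });
  }
  // One HTTP round trip instead of 99 separate FormApp calls
  UrlFetchApp.fetch("https://forms.googleapis.com/v1/forms/" + form.getId() + ":batchUpdate", {
    method: "post",
    contentType: "application/json",
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
    payload: JSON.stringify({ requests: requests })
  });
}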

Google Sheets error - Service invoked too many times for one day: urlfetch

I have a problem with my function in Google Sheets. Every day I get this error: "Exception: Service invoked too many times for one day: urlfetch." I have about 1000 URLs in the document. I looked for a solution on Google and found some topics where it is recommended to add a cache to the function, but I don't know how to do it. Does somebody have any idea? My function:
function ImportCeny(url, HTMLClass) {
  var output = '';
  var fetchedUrl = UrlFetchApp.fetch(url, {muteHttpExceptions: true});
  if (fetchedUrl) {
    var html = fetchedUrl.getContentText();
  }
  // Grace period to avoid call limit
  Utilities.sleep(1000);
  var priceposition = html.search(HTMLClass);
  return html.slice(priceposition, priceposition + 70).match(/(-\d+|\d+)(,\d+)*(\.\d+)*/g);
}
You may try to add a randomly generated number, for example 6 digits, and append this number to the URL as a parameter each time before calling UrlFetchApp, e.g.:
url = url + "?t=458796";
You can certainly use Utilities.sleep() to force the program to stop for some time before making the next call. However, using the built-in Cache class (you can see the docs here) is much more suitable, as it is specially designed for these scenarios.
So, if you wanted to keep the fetched content for one second, you could replace:
Utilities.sleep(1000); //In milliseconds
with
var cache = CacheService.getScriptCache(); //Create a cache instance
cache.put("my_content", html, 1); // In seconds
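Putting the two pieces together, a minimal cache-first version of the function might look like this (it caches per URL, assuming each URL fits the 250-character key limit and each page fits the 100 KB value limit):
function ImportCeny(url, HTMLClass) {
  var cache = CacheService.getScriptCache();
  var html = cache.get(url); // reuse a recently fetched copy if present
  if (!html) {
    html = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
    cache.put(url, html, 21600); // keep for up to 6 hours (the maximum)
  }
  var priceposition = html.search(HTMLClass);
  return html.slice(priceposition, priceposition + 70).match(/(-\d+|\d+)(,\d+)*(\.\d+)*/g);
}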

Daily URLFetch quotas and "Service invoked too many times for one day: urlfetch" error

I have a script/sheet that is running up against the dreaded "Service invoked too many times for one day: urlfetch" error. I've checked the "Quotas for Google Services" here, but for the life of me, I can't see why I'm going above the stated URL fetch calls limit of 20,000/day. I'm running a script that plugs 6 instances of an ImportJSON function with different URLs into cells in my spreadsheet, and this script is triggered to run every minute. So my calculations are 6 calls per minute = 360 calls/hr = 8640 calls/day... so what gives?
I've read that there may also be a daily data limit of 30MB, but again, my calculations put me well under: 7.4KB/minute (total data received from the 6 calls) = 444 KB/hr = 10656 KB/day or 10.6 MB/day.
The objective of the script I'm running is to automatically refresh a trading spreadsheet with the latest ticker and orderbook data for three Bitcoin markets (XBTUSD, XBTEUR, and XBTCAD) from the Kraken exchange. Below is the function that inserts the importJSON calls into the spreadsheet cells:
function importJSONupdate() {
  var d = new Date();
  var timeStamp = d.toLocaleTimeString();
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var XBT_Orderbooks = ss.getSheetByName("XBT Orderbooks");
  var XBTUSD_LAST = '=ImportJSON("https://api.kraken.com/0/public/Ticker?pair=XBTUSD", "/result/XXBTZUSD", "noHeaders", "' + timeStamp + '")';
  XBT_Orderbooks.getRange('B129').setValue(XBTUSD_LAST);
  var XBTUSD_ORDERBOOK = '=ImportJSON("https://api.kraken.com/0/public/Depth?pair=XBTUSD&count=30", "/result/XXBTZUSD", "noHeaders,noTruncate","' + timeStamp + '")';
  XBT_Orderbooks.getRange('B134').setValue(XBTUSD_ORDERBOOK);
  var XBTEUR_LAST = '=ImportJSON("https://api.kraken.com/0/public/Ticker?pair=XBTEUR", "/result/XXBTZEUR", "noHeaders", "' + timeStamp + '")';
  XBT_Orderbooks.getRange('B130').setValue(XBTEUR_LAST);
  var XBTEUR_ORDERBOOK = '=ImportJSON("https://api.kraken.com/0/public/Depth?pair=XBTEUR&count=30", "/result/XXBTZEUR", "noHeaders,noTruncate","' + timeStamp + '")';
  XBT_Orderbooks.getRange('B135').setValue(XBTEUR_ORDERBOOK);
  var XBTCAD_LAST = '=ImportJSON("https://api.kraken.com/0/public/Ticker?pair=XBTCAD", "/result/XXBTZCAD", "noHeaders", "' + timeStamp + '")';
  XBT_Orderbooks.getRange('B131').setValue(XBTCAD_LAST);
  var XBTCAD_ORDERBOOK = '=ImportJSON("https://api.kraken.com/0/public/Depth?pair=XBTCAD&count=30", "/result/XXBTZCAD", "noHeaders,noTruncate","' + timeStamp + '")';
  XBT_Orderbooks.getRange('B136').setValue(XBTCAD_ORDERBOOK);
  XBT_Orderbooks.getRange('B3').setValue(timeStamp);
}
Is there anything I'm miscalculating or otherwise not understanding here? The only other thing I can think of is that the spreadsheet also has two GoogleFinance() calls that fetch the CADUSD and EURUSD exchange rates. Might this need to be factored into my usage calculations? The GoogleFinance() function seems to update on its own schedule, so I'm not exactly sure how I would account for it...
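Not from the thread, but one way to make the fetch count deterministic is to skip the in-cell ImportJSON formulas entirely and fetch in the script itself: custom functions can re-run whenever the sheet recalculates, so the real number of urlfetch calls can be far higher than 6 per minute. A sketch using UrlFetchApp.fetchAll (the cell mapping and output format are illustrative):
function importJSONupdateDirect() {
  var urls = [
    'https://api.kraken.com/0/public/Ticker?pair=XBTUSD',
    'https://api.kraken.com/0/public/Ticker?pair=XBTEUR',
    'https://api.kraken.com/0/public/Ticker?pair=XBTCAD'
  ];
  var responses = UrlFetchApp.fetchAll(urls); // exactly urls.length fetches, no more
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('XBT Orderbooks');
  for (var i = 0; i < responses.length; i++) {
    var result = JSON.parse(responses[i].getContentText()).result;
    // Write plain values, so nothing re-fetches when the sheet recalculates
    sheet.getRange(129 + i, 2).setValue(JSON.stringify(result));
  }
  sheet.getRange('B3').setValue(new Date().toLocaleTimeString());
}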

How do I return all folder names and IDs without reaching execution limit?

I am attempting to retrieve a list of all folders and their respective IDs using a Google Apps Script. Currently, I am writing the result to an array which is then posted to a spreadsheet every 5000 records. Unfortunately, the script reaches the execution limit (5 minutes) before completion. How can I work around this? And, would I have more success doing RESTful API calls over HTTP than using Apps Script?
I've noted the following:
- Code already follows Google's bulk-writes best practice.
- Slow execution is a result of Apps Script indexing Drive slowly:
  - Results appear to follow a consistent indexing pattern.
  - Multiple runs produce results in the same order.
  - It is unknown how items are re-indexed upon addition, which prevents meaningful caching between runs.
  - A delta approach is not reliable unless the indexing method is identified.
- Looked into Drive caching:
  - Still required to loop through the FolderIterator object.
  - Theoretical performance would be even worse, in my opinion (correct?).
Code is below:
function LogAllFolders() {
  var ss_index = 1;
  var idx = 0;
  var folder;
  var data = new Array(5000);
  for (var i = 0; i < 5000; i++) {
    data[i] = new Array(2);
  }
  var ss = SpreadsheetApp.create("FolderInv2", 1, 2).getSheets()[0];
  var root = DriveApp.getFolders();
  while (root.hasNext()) {
    folder = root.next();
    data[idx][0] = folder.getName();
    data[idx][1] = folder.getId();
    idx++;
    if ((ss_index % 5000) == 0) {
      ss.insertRowsAfter(ss.getLastRow() + 1, 5000);
      ss.getRange(ss.getLastRow() + 1, 1, 5000, 2).setValues(data);
      SpreadsheetApp.flush();
      idx = 0;
    }
    ss_index++;
  }
}
I would first collect all the folder IDs you want to process, then save the folder ID (or maybe the array index) that you've processed so far to your project properties, run the job on a time-driven (CRON-like) trigger every five minutes, and resume from the folder ID or index you saved previously.
When it's done, remove the trigger programmatically.
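A sketch of that resume pattern using the iterator's continuation token (the property name and spreadsheet ID are placeholders; wire it to a time-driven trigger every five minutes):
function LogAllFoldersResumable() {
  var props = PropertiesService.getScriptProperties();
  var token = props.getProperty('folderToken');
  var folders = token ? DriveApp.continueFolderIterator(token) : DriveApp.getFolders();
  var start = Date.now();
  var rows = [];
  // Stop well before the 5-minute execution limit, then save our place
  while (folders.hasNext() && Date.now() - start < 270000) {
    var folder = folders.next();
    rows.push([folder.getName(), folder.getId()]);
  }
  if (rows.length > 0) {
    var sheet = SpreadsheetApp.openById('SPREADSHEET_ID').getSheets()[0]; // placeholder ID
    sheet.getRange(sheet.getLastRow() + 1, 1, rows.length, 2).setValues(rows);
  }
  if (folders.hasNext()) {
    props.setProperty('folderToken', folders.getContinuationToken());
  } else {
    props.deleteProperty('folderToken'); // finished: remove the trigger here if desired
  }
}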

How to speed up several hundred calls to DriveApp.getFolderById(id)

I noticed that folder = DriveApp.getFolderById(id); takes a lot of time, and I would like to know whether there is a faster way of retrieving hundreds of folders by their IDs.
I stored info about paths and folder IDs in scriptProperties. Without doing that, building the tree takes about 45 seconds.
If I use that and provide a 'dummy' name (method C), the whole function, containing the code below plus other code, takes just below 1 second.
If I get the names of the folders the normal way, calling method A (folder = DriveApp.getFolderById(id)) 315 times, it takes 27 seconds.
I can optimize this by (effectively) removing calls to getFolderById for folders that have several parents (method B), but that still takes 22 seconds.
Using DriveApp.getFolders() and preparing the names before the loop (method D) reduces the time to about 4 seconds. Workable, but still quite long.
for (var i = 1; i < numPathes; i++) { // Create treeItems using the pathes generated
  var folderPath = pathes[i];
  var arrPathes = folderPath.split(underscore);
  var numArrPathes = arrPathes.length;
  var folderIndex = arrPathes[--numArrPathes];
  var parentIndex = arrPathes[--numArrPathes];
  var folderId = folderIds[folderIndex];
  ///////////// Alternatives
  // Method A) --> takes 27 seconds
  var folder = DriveApp.getFolderById(folderId);
  var folderName = folder.getName(); // The normal way
  // Method B) --> takes 22 seconds
  var folderName = folderNames[folderIndex]; // Prepared before the current loop
  // Method C) --> takes 1 second (just for reference)
  var folderName = folderId; // Dummy: just the ID, obtained by folder.getId() before
  // Method D) --> takes 4 seconds
  var folderName = folderNames[folderId]; // Prepared using DriveApp.getFolders()
  // Method E) --> takes 1 second
  var folderName = folderNames[folderId]; // Using CacheService (mentioned by Zig Mandel)
  ///////////// End alternatives
  var txtFolder = underscore + folderId + underscore + folderPath;
  var chk = appLocal.createCheckBox(folderName).setValue(status)
      .setId(chkPrefix + txtFolder).setName(chkPrefix + txtFolder)
      .addClickHandler(onChangeStatusChkTreeItem);
  var tri = appLocal.createTreeItem(chk).setId(triPrefix + txtFolder);
  treeItems[i] = tri;
  treeItems[parentIndex].addItem(tri);
}
tree.addItem(treeItems[0]); // Add the root to the tree --> all treeItems will be added as well
So my question is: is there a fast way to make several hundred calls to getFolderById(id)? I am thinking about caching, but how?
To me it seems GAS lacks a (fast?) map from IDs to folders.
EDIT-1
I implemented caching, mapping folder IDs to folder names (method E).
Currently I use a trigger to start updating the cache and scriptProperties every 5 hours and 50 minutes.
I will implement validation of the data in the background, using a trigger while the program is running, updating the cache and rebuilding the tree if required.
This approach makes it possible to show a tree containing hundreds of folders within a few seconds and without the user waiting for the UI to appear.
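For reference, a 5-hours-50-minutes cycle cannot be expressed as a fixed everyHours() interval, so a chained one-shot trigger is one way to approximate it (refreshFolderCache is a hypothetical name for the refresh function):
function scheduleNextRefresh() {
  // Re-create this trigger at the end of refreshFolderCache to keep the cycle going
  ScriptApp.newTrigger('refreshFolderCache')
      .timeBased()
      .after(350 * 60 * 1000) // 5 hours 50 minutes, in milliseconds
      .create();
}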
There is no way to improve the call times to the Google APIs.
However, if the folder names don't change often, you can cache them in a public cache (see CacheService) for up to 6 hours by mapping the folder ID (key) to the folder name (cache value).
Don't use getAllFolders, as it has a maximum limit and may not get them all.
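A sketch of that suggestion: warm the cache once, then look names up by ID (the 6-hour expiry matches the documented maximum):
function warmFolderNameCache() {
  var map = {};
  var folders = DriveApp.getFolders();
  while (folders.hasNext()) {
    var folder = folders.next();
    map[folder.getId()] = folder.getName();
  }
  // putAll writes the whole id -> name map in one call
  CacheService.getScriptCache().putAll(map, 21600);
}
function getFolderNameCached(folderId) {
  var cache = CacheService.getScriptCache();
  var name = cache.get(folderId);
  if (name === null) {
    name = DriveApp.getFolderById(folderId).getName(); // single round trip on a miss
    cache.put(folderId, name, 21600);
  }
  return name;
}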