Google Apps Script Class GmailApp Batch Operations? - google-apps-script

I've been fooling around with GAS for a month or so now, and I've become fairly familiar with using batch operations to read from and write to spreadsheets (e.g. getValues(), setValues()). However, I'm currently writing a script that pulls a sizable amount of data out of Gmail using class GmailApp. My code is running very slowly (and even timing out), and I can't seem to figure out how to use batch operations for what I'm trying to do. Here's my code thus far (with the email address and name changed):
function fetchEmails(){
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var threads = GmailApp.search('in: label:searchedLabel');
var messages = new Array();
function Email(message){
this.date = new Date(message.getDate());
this.body = message.getBody();
}
for(var i=0;i<threads.length;i++){
for(var j=0;j<threads[i].getMessageCount();j++){
if(threads[i].getMessages()[j].getFrom()=="firstName lastName <email#domain.com>"){
var message = new Email(threads[i].getMessages()[j]);
messages.push(message);
}
}
}
}
As you can see, I'm querying my email for all threads with the given label and making an object constructor for a custom Email object (which will have the body and date of an email as properties). Then I'm looping through each thread, and when a given email matches the sender I'm looking for, I create an instance of the Email object for that email and place that Email object into an array. The goal is that in the end I'll have an array of Email objects that are all from my desired sender. However, as you've probably noticed, the code calls Google's APIs way too often, and I can't seem to figure out batch operations for interfacing with Gmail. Any ideas? Thanks so much.

I think you are looking for GmailApp.getMessagesForThreads().
function fetchEmails(){
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var threads = GmailApp.search('label:searchedLabel');
var messages = new Array();
function Email(message){
this.date = new Date(message.getDate());
this.body = message.getBody();
}
var gmailMessages = GmailApp.getMessagesForThreads(threads);
for(var i=0;i<threads.length;i++){
var messagesForThread = gmailMessages[i];
for(var j=0;j<messagesForThread.length;j++){
if(messagesForThread[j].getFrom()=="firstName lastName <email#domain.com>"){
var message = new Email(messagesForThread[j]);
messages.push(message);
}
}
}
}
Of course you can also write this a little more concisely (sorry, I can't pass up an opportunity to educate about the wonders of JavaScript):
function fetchEmails(){
var messages = Array.prototype.concat.apply([], GmailApp.getMessagesForThreads(
GmailApp.search('label:searchedLabel')).map(function(messagesForThread) {
return messagesForThread.filter(function(message) {
return message.getFrom() == "firstName lastName <email#domain.com>";
}).map(function(message) {
return { date: new Date(message.getDate()), body: message.getBody() };
});}));
}
This makes a grand total of 2 calls to Gmail, so it's going to be fast.
In fact, if you integrate the 'from' part into the search as was suggested above, all you need is:
function fetchEmails(){
var messages = Array.prototype.concat.apply([], GmailApp.getMessagesForThreads(
GmailApp.search('label:searchedLabel from:email#domain.com')).map(
function(messagesForThread) {
return messagesForThread.map(function(message) {
return { date: new Date(message.getDate()), body: message.getBody() };
});}));
}
Finally, since you don't actually care about the thread structure, you can just concat the arrays before the map, which leads to this:
function fetchEmails(){
var messages = GmailApp.getMessagesForThreads(
GmailApp.search('label:searchedLabel from:email#domain.com'))
// Start from [] so an empty search result does not make reduce throw.
.reduce(function(a, b){ return a.concat(b); }, [])
.map(function(message) {
return { date: new Date(message.getDate()), body: message.getBody() };
});
}
(I left in the earlier samples in case you do care about the thread structure and you were just giving a minimal example).
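To see the flatten-then-filter step on its own, here is a minimal sketch that works on the array of arrays returned by GmailApp.getMessagesForThreads(). The helper name collectMessages is hypothetical, and the sender string is whatever getFrom() returns for your correspondent; the sketch only relies on each message exposing getFrom(), getDate() and getBody().

```javascript
// Flatten the per-thread arrays, keep messages from one sender, and
// reduce each message to a plain { date, body } object.
function collectMessages(messagesByThread, sender) {
  return messagesByThread
    // Starting from [] keeps this safe when there are no threads at all.
    .reduce(function (all, threadMessages) {
      return all.concat(threadMessages);
    }, [])
    .filter(function (message) {
      return message.getFrom() === sender;
    })
    .map(function (message) {
      return { date: new Date(message.getDate()), body: message.getBody() };
    });
}

// Possible use inside Apps Script (not executed here):
// var threads = GmailApp.search('label:searchedLabel');
// var emails = collectMessages(GmailApp.getMessagesForThreads(threads),
//                              'firstName lastName <...>');
```

Keeping the Gmail calls outside the helper means the search and the batch fetch still happen once each, and the helper itself can be tried with stand-in objects.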

Related

Code shortening for data entry form Google App script

I'm new to coding, and I know I am going a long way about this and making my script run slowly, but I can't figure out how to shorten and optimise it (I have now tried to map the second bit of code using Mario's comment).
I have made a data entry form on Google Sheets for athletes I coach to use as a training diary. After recording training data in a session, they hit the save button and this script transfers it to a different spreadsheet with all of their training data ever in.
Below is a section of code I have attempted to shorten with Mario's comment:
function submitSession1() {
workloadSubmit();
myValue();
}
function workloadSubmit(){
var inputSS = SpreadsheetApp.getActiveSpreadsheet();
var inputS = inputSS.getSheetByName("Session 1");
var outputSS = SpreadsheetApp.openByUrl()
var workloadS = outputSS.getSheetByName();
var dtCurrentTime = new Date();
//Input Values for Workload data
var workloads = [[inputS.getRange("M1").getValue(),
inputS.getRange("N1").getValue(),
inputS.getRange("O1").getValue(),
inputS.getRange("P1").getValue(),
inputS.getRange("AK3").getValue(),
inputS.getRange("AK5").getValue(),
inputS.getRange("AL3").getValue(),
inputS.getRange("AL5").getValue(),
inputS.getRange("BC3").getValue(),
inputS.getRange("BC5").getValue(),
inputS.getRange("BD3").getValue(),
inputS.getRange("BD5").getValue(),
inputS.getRange("AM3").getValue(),
inputS.getRange("AM5").getValue(),
inputS.getRange("AN3").getValue(),
inputS.getRange("AN5").getValue(),
inputS.getRange("AO3").getValue(),
inputS.getRange("AO5").getValue(),
inputS.getRange("AP3").getValue(),
inputS.getRange("AP5").getValue(),
inputS.getRange("AQ3").getValue(),
inputS.getRange("AQ5").getValue(),
inputS.getRange("AR3").getValue(),
inputS.getRange("AR5").getValue(),
inputS.getRange("AS3").getValue(),
inputS.getRange("AS5").getValue(),
inputS.getRange("AT3").getValue(),
inputS.getRange("AT5").getValue(),
inputS.getRange("AU3").getValue(),
inputS.getRange("AU5").getValue(),
inputS.getRange("AV3").getValue(),
inputS.getRange("AV5").getValue(),
inputS.getRange("AW3").getValue(),
inputS.getRange("AW5").getValue(),
inputS.getRange("AX3").getValue(),
inputS.getRange("AX5").getValue(),
inputS.getRange("AY3").getValue(),
inputS.getRange("AY5").getValue(),
inputS.getRange("AZ3").getValue(),
inputS.getRange("AZ5").getValue(),
inputS.getRange("BA3").getValue(),
inputS.getRange("BA5").getValue(),
inputS.getRange("BB3").getValue(),
inputS.getRange("BB5").getValue(),
dtCurrentTime]];
workloadS.getRange(workloadS.getLastRow()+1, 1, 1,
45).setValues(workloads);
}
// Drills Data Submit
function myValue(col) {
var inputSS = SpreadsheetApp.getActiveSpreadsheet();
var inputS = inputSS.getSheetByName("Session 1");
var outputSS = SpreadsheetApp.openByUrl()
var drillsS = outputSS.getSheetByName("Drills Data");
var dtCurrentTime = new Date();
return inputS.getRange(col).getValue();
}
var colns = ["M1", "N1", "O1", "P1", "A14","B14","D14","F14","G14","H14","J14","K14","L14","M14","N14","O14","P14","Q14","R14","S14","T14","U14"];
var drillsData = colns.map(myValue)
drillsData.push(dtCurrentTime)
I am now getting the error code:
Exception: Argument cannot be null: a1Notation (line 71, file "Code")
Any help is much appreciated
You can calculate drillsData using maps:
function myValue(col) {
return inputS.getRange(col).getValue();
}
var colns= ["M1", "N1", "O1", "P1", "A14","B14","D14","F14","G14","H14","J14",
"K14","L14","M14","N14","O14","P14","Q14","R14","S14","T14","U14"];
var drillsData = colns.map(myValue)
drillsData.push(dtCurrentTime)
*Don't forget to wrap drillsData in an array, i.e. [drillsData], when passing it to setValues(), which expects a 2D array.
Unfortunately, the columns you want to retrieve are not sequential, therefore selecting the full range is not an option.
Or you can create custom functions to make your code look cleaner:
function importSheets(sheetN) {
return outputSS.getSheetByName(sheetN);
}
var workloadS = importSheets("W.L + Full Routine Data")
For the sheets you can again use map, with the same logic as for the cells.
As a result, you get an array of sheet objects and can access each one by its index.
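The cells being scattered does not necessarily rule out a single read. As a sketch of an alternative (not the approach described in the answer above), you can read the sheet's data range once with getValues() and pick the cells you need out of the resulting 2D array. The helper names a1ToIndexes and pickCells are hypothetical, and the sketch assumes the grid starts at cell A1, which is what getDataRange() returns.

```javascript
// Convert a single-cell A1 reference such as "AK3" into zero-based
// row/column indexes for the 2D array returned by getValues().
function a1ToIndexes(a1) {
  var match = /^([A-Z]+)(\d+)$/.exec(String(a1).toUpperCase());
  if (!match) throw new Error('Not a single-cell A1 reference: ' + a1);
  var col = 0;
  for (var i = 0; i < match[1].length; i++) {
    // 'A' has char code 65, so subtracting 64 maps A..Z to 1..26.
    col = col * 26 + (match[1].charCodeAt(i) - 64);
  }
  return { row: Number(match[2]) - 1, col: col - 1 };
}

// Pick the listed cells out of a grid whose top-left element is A1.
// Cells outside the grid come back as empty strings.
function pickCells(grid, cells) {
  return cells.map(function (a1) {
    var pos = a1ToIndexes(a1);
    var row = grid[pos.row];
    return row && pos.col < row.length ? row[pos.col] : '';
  });
}
```

In the script, something like pickCells(inputS.getDataRange().getValues(), colns) could then stand in for colns.map(myValue), at the cost of one spreadsheet read instead of one per cell.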

XmlService.parse() not able to handle HTML tables

I am looking for help from this community regarding the below issue.
// I am searching my Gmail inbox for a specific email
function getWeeklyEmail() {
var emailFilter = 'newer_than:7d AND label:inbox AND "Report: Launchpad filter"';
var threads = GmailApp.search(emailFilter, 0, 5);
var messages=[];
threads.forEach(function(threads)
{
messages.push(threads.getMessages()[0]);
});
return messages;
}
// Trying to parse the HTML table contained within the email
function getParsedMsg() {
var messages = getWeeklyEmail();
var msgbody = messages[0].getBody();
var doc = XmlService.parse(msgbody);
var html = doc.getRootElement();
var tables = doc.getDescendants();
var templ = HtmlService.createTemplateFromFile('Messages1');
templ.tables = [];
return templ.evaluate();
}
The debugger crashes when I try to step over the XmlService.parse function. The msgbody of the email contains both text and HTML formatted table. I am getting the following error: TypeError: Cannot read property 'getBody' of undefined (line 19, file "Code")
If I remove the getParsedMsg function and instead just display the content of the email, I get the email body along with the element tags etc in html format.
Workaround
Hi! The issue you are experiencing is due to (as you previously mentioned) XmlService only recognising canonical XML rather than HTML. One possible workaround is to search the string returned by getBody() for the tags you need.
In your case the main issue is var doc = XmlService.parse(msgbody);. To work around it, you could search the whole string for the table tags you need using JavaScript's search method. Here is an example that retrieves an email containing a single table:
function getWeeklyEmail() {
var emailFilter = 'newer_than:7d AND label:inbox AND "Report: Launchpad filter"';
var threads = GmailApp.search(emailFilter, 0, 5);
var messages=[];
threads.forEach(function(threads)
{
messages.push(threads.getMessages()[0]);
});
return messages;
}
// Trying to parse the HTML table contained within the email
function getParsedMsg() {
var messages = getWeeklyEmail();
var msgbody = messages[0].getBody();
var indexOrigin = msgbody.search('<table');
var indexEnd = msgbody.search('</table');
// Get what is in between those indexes of the string.
// indexEnd points at the "<" of </table>, so add 8 (the length of
// "</table>") to include the closing tag.
var Table = msgbody.substring(indexOrigin,indexEnd+8);
Logger.log(Table);
}
If you are looking for more than one table in your message, you can change getParsedMsg to the following:
function getParsedMsg() {
// Upper bound on how many tables to extract; the loop stops early if there are fewer.
var totalTables = 2;
var messages = getWeeklyEmail();
var msgbody = messages[0].getBody();
var Table = [];
// Position in the body from which the next search starts.
var searchFrom = 0;
for (var i = 0; i < totalTables; i++) {
// Find the next table that starts after the end of the previous one.
var start = msgbody.indexOf('<table', searchFrom);
var end = msgbody.indexOf('</table', start);
if (start === -1 || end === -1) break;
// Add 8 (the length of "</table>") so the closing tag is included.
searchFrom = end + 8;
Table.push(msgbody.substring(start, searchFrom));
}
Logger.log(Table);
}
This will let you store each table in an element of an array. If you want to use these, you would just need to retrieve the elements of this array and use them accordingly (for example, to use them as HTML tables).
I hope this has helped you. Let me know if you need anything else or if you did not understand something. :)
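If the goal is the cell contents rather than the raw markup, one of the extracted table strings can be broken into rows and cells with the same string-matching idea. This is only a sketch under strong assumptions: the helper name tableToRows is hypothetical, and it expects a simple, well-formed table with explicit closing tags and no nested tables. Regular expressions are not a general HTML parser.

```javascript
// Turn the HTML of one simple table into an array of rows, each row
// being an array of cell texts with inner markup stripped.
function tableToRows(tableHtml) {
  var rows = [];
  var rowPattern = /<tr(?:\s[^>]*)?>([\s\S]*?)<\/tr>/gi;
  var rowMatch;
  while ((rowMatch = rowPattern.exec(tableHtml)) !== null) {
    var cells = [];
    // Matches both <td> and <th> cells inside the current row.
    var cellPattern = /<t[dh](?:\s[^>]*)?>([\s\S]*?)<\/t[dh]>/gi;
    var cellMatch;
    while ((cellMatch = cellPattern.exec(rowMatch[1])) !== null) {
      // Drop any inline tags inside the cell and trim the whitespace.
      cells.push(cellMatch[1].replace(/<[^>]+>/g, '').trim());
    }
    rows.push(cells);
  }
  return rows;
}
```

The resulting array of arrays can be handed to an HTML template or, if every row has the same number of cells, written to a sheet with setValues().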

Ignore same-thread emails that have different labels

I am writing the Date and Subject from specific new emails to a new row of a Google Sheet.
I apply a label to the new mail items with a filter.
the script processes those labeled emails
the label is removed
A new label is applied, so that these emails won't be processed next time.
Problem: When there is a myLabel email, the script processes all emails in the same thread (eg same subject and sender) regardless of their label (even Inbox and Trash).
Question: How to only process new emails i.e. ones with the label myLabel - even when the thread of those messages extends outside the myLabel folder?
My current script:
function fetchmaildata() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheetByName('mySheetName');
var label = GmailApp.getUserLabelByName('myLabel');
var threads = label.getThreads();
for (var i = 0; i < threads.length; i++)
{
var messages = threads[i].getMessages();
for (var j = 0; j < messages.length; j++)
{
var sub = messages[j].getSubject();
var dat = messages[j].getDate();
ss.appendRow([dat, sub])
}
threads[i].removeLabel(label);
threads[i].addLabel(newlabel);
}
}
I hacked a solution for my purposes by changing my for loop to this:
for (var j = messages.length-1; j > messages.length-2; j--)
This says to process only the latest email in the thread, even when there is more than one email of a thread in the myLabel folder. Oddly, the script still changes the Labels of all the myLabel emails, but only the latest one of a thread gets written to the spreadsheet, so it works for me.
I had to make another change to the code because the above code does not run as a time-triggered scheduled task. I changed the code in this way and it now runs on a time schedule !!
//var ss = SpreadsheetApp.getActiveSpreadsheet();
var ss = SpreadsheetApp.openById("myGoogleSheetID");
A label can be on a thread due to being on a single message in said thread. Your code simply goes label -> all label threads -> all thread messages, rather than accessing only the messages in a thread with a given label. That's not really your fault - it's a limitation of the Gmail Service. There are two approaches that you can use to remedy this behavior:
The (enable-before-use "advanced service") Gmail REST API
The REST API supports detailed querying of messages, including per-message label status, with Gmail.Users.Messages.list and the labelIds optional argument. For example:
// Get all messages (not threads) with this label:
function getMessageIdsWithLabel_(labelClass) {
const labelId = labelClass.getId();
const options = {
labelIds: [ labelId ],
// Only retrieve the id metadata from each message.
fields: "nextPageToken,messages/id"
};
const messages = [];
// Could be multiple pages of results.
do {
var search = Gmail.Users.Messages.list("me", options);
if (search.messages && search.messages.length)
Array.prototype.push.apply(messages, search.messages);
options.pageToken = search.nextPageToken;
} while (options.pageToken);
// Return an array of the messages' ids.
return messages.map(function (m) { return m.id; });
}
Once using the REST API, there are other methods you might utilize, such as batch message label adjustment:
function removeLabelFromMessages_(messageIds, labelClass) {
const labelId = labelClass.getId();
const resource = {
ids: messageIds,
// addLabelIds: [ ... ],
removeLabelIds: [ labelId ]
};
// https://developers.google.com/gmail/api/v1/reference/users/messages/batchModify
Gmail.Users.Messages.batchModify(resource, "me");
}
Result:
function foo() {
const myLabel = /* get the Label somehow */;
const ids = getMessageIdsWithLabel_(myLabel);
ids.forEach(function (messageId) {
var msg = GmailApp.getMessageById(messageId);
/* do stuff with the message */
});
removeLabelFromMessages_(ids, myLabel);
}
Recommended Reading:
Advanced Services
Gmail Service
Messages#list
Message#batchModify
Partial responses aka the 'fields' parameter
Tracked Processing
You could also store each message ID somewhere, and use the stored IDs to check if you've already processed a given message. The message Ids are unique.
This example uses a native JavaScript object for extremely fast lookups (vs. simply storing the ids in an array and needing to use Array#indexOf). To maintain the processed ids between script execution, it uses a sheet on either the active workbook, or a workbook of your choosing:
var MSG_HIST_NAME = "___processedMessages";
function getProcessedMessages(wb) {
// Read from a sheet on the given spreadsheet.
if (!wb) wb = SpreadsheetApp.getActive();
const sheet = wb.getSheetByName(MSG_HIST_NAME)
if (!sheet) {
try { wb.insertSheet(MSG_HIST_NAME).hideSheet(); }
catch (e) { }
// By definition, no processed messages.
return {};
}
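const lastRow = sheet.getLastRow();
// An existing but empty sheet means nothing has been processed yet;
// requesting a 0-row range would throw.
if (!lastRow) return {};
const vals = sheet.getSheetValues(1, 1, lastRow, 1);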
return vals.reduce(function (acc, row) {
// acc is our "accumulator", and row is an array with a single message id.
acc[ row[0] ] = true;
return acc;
}, {});
}
function setProcessedMessages(msgObject, wb) {
if (!wb) wb = SpreadsheetApp.getActive();
if (!msgObject) return;
var sheet = wb.getSheetByName(MSG_HIST_NAME);
if (!sheet) {
sheet = wb.insertSheet(MSG_HIST_NAME);
if (!sheet)
throw new Error("Unable to make sheet for storing data");
try { sheet.hideSheet(); }
catch (e) { }
}
// Convert the object into a serializable 2D array. Assumes we only care
// about the keys of the object, and not the values.
const data = Object.keys(msgObject).map(function (msgId) { return [msgId]; });
if (data.length) {
sheet.getDataRange().clearContent();
SpreadsheetApp.flush();
sheet.getRange(1, 1, data.length, data[0].length).setValues(data);
}
}
Usage would be something like:
function foo() {
const myLabel = /* get label somehow */;
const processed = getProcessedMessages();
myLabel.getThreads().forEach(function (thread) {
thread.getMessages().forEach(function (msg) {
var msgId = msg.getId();
if (processed[msgId])
return; // nothing to do for this message.
processed[msgId] = true;
// do stuff with this message
});
// do more stuff with the thread
});
setProcessedMessages(processed);
// do other stuff
}
Recommended Reading:
Is checking an object for a key more efficient than searching an array for a string?
Array#reduce
Array#map
Array#forEach

getMessageById() slows down

I am working on a script that works with e-mails and it needs to fetch the timestamp, sender, receiver and subject for an e-mail. The Google script project has several functions in separate script files so I won't be listing everything here, but essentially the main function performs a query and passes it on to a function that fetches data:
queriedMessages = Gmail.Users.Messages.list(authUsr.mail, {'q':query, 'pageToken':pageToken});
dataOutput_double(sSheet, queriedMessages.messages, queriedMessages.messages.length);
So this will send an object to the function dataOutput_double and the size of the array (if I try to get the size of the array inside the function that outputs data I get an error so that is why this is passed here). The function that outputs the data looks like this:
function dataOutput_double(sSheet, messageInfo, aLenght) {
var sheet = sSheet.getSheets()[0],
message,
dataArray = new Array(),
row = 2;
var i, dateCheck = new Date;
dateCheck.setDate(dateCheck.getDate()-1);
for (i=aLenght-1; i>=0; i--) {
message = GmailApp.getMessageById(messageInfo[i].id);
if (message.getDate().getDate() == dateCheck.getDate()) {
sheet.insertRowBefore(2);
sheet.getRange(row, 1).setValue(message.getDate());
sheet.getRange(row, 2).setValue(message.getFrom());
sheet.getRange(row, 3).setValue(message.getTo());
sheet.getRange(row, 4).setValue(message.getSubject());
}
}
return;
};
Some of this code will get removed as there are leftovers from other types of handling this.
The problem, as I noticed, is that some messages take a long time to fetch with the getMessageById() method (~4 seconds, to be exact). Since the script is intended to work with ~1500 mails every day, this makes it drag on for quite a while, forcing Google to stop the script because it takes too long.
Any ideas on how to get around this issue, or is this just something that I have to live with?
Here is something I whipped up:
function processEmails() {
var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var messages = Gmail.Users.Messages.list('me', {maxResults:200, q:"newer_than:1d AND label:INBOX NOT label:PROCESSED"}).messages,
headers,
headersFields = ["Date","From","To","Subject"],
outputValue=[],thisRowValue = [],
message
// messages is undefined when the query matches nothing, so check it first.
if(messages && messages.length > 0){
for(var i in messages){
message = Gmail.Users.Messages.get('me', messages[i].id);
Gmail.Users.Messages.modify( {addLabelIds:["Label_4"]},'me',messages[i].id);
headers = message.payload.headers
for(var ii in headers){
if(headersFields.indexOf(headers[ii].name) != -1){
thisRowValue.push(headers[ii].value);
}
}
outputValue.push(thisRowValue)
thisRowValue = [];
}
var range = ss.getRange(ss.getLastRow()+1, ss.getLastColumn()+1, outputValue.length, outputValue[0].length);
range.setValues(outputValue);
}
}
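One thing to watch in the loop above: header values are pushed in the order they appear in each message, so the columns can differ from row to row, and a message with no To header yields a shorter row, while setValues needs every row to have the same length. A minimal sketch of one way to keep the columns aligned follows. The helper name headersToRow is hypothetical, and matching header names case-insensitively is an assumption on my part rather than something the answer relies on.

```javascript
// Build one spreadsheet row from a message's headers, with the values
// in the same order as `fields` and '' for any header that is missing.
function headersToRow(headers, fields) {
  var byName = {};
  headers.forEach(function (header) {
    var key = header.name.toLowerCase();
    // Keep the first occurrence if a header name is repeated.
    if (!Object.prototype.hasOwnProperty.call(byName, key)) {
      byName[key] = header.value;
    }
  });
  return fields.map(function (field) {
    var key = field.toLowerCase();
    return Object.prototype.hasOwnProperty.call(byName, key) ? byName[key] : '';
  });
}

// Possible use inside the loop (not executed here):
// outputValue.push(headersToRow(message.payload.headers, headersFields));
```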
NOTE: This is intended to run from a time-driven trigger. Each run handles a batch of up to 200 messages. You will need to create the label PROCESSED in Gmail. Also, on the line:
Gmail.Users.Messages.modify( {addLabelIds:["Label_4"]},'me',messages[i].id);
it shows Label_4. That is the ID Gmail assigned to the "PROCESSED" label in my account, where it is my 4th custom label; the ID in your account will probably differ.
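Rather than hard-coding an ID such as Label_4, the ID can be looked up by label name. The sketch below assumes the Gmail advanced service is enabled and that Gmail.Users.Labels.list('me').labels is an array of objects with name and id fields; the helper name findLabelId is hypothetical.

```javascript
// Return the ID of the label with the given name, or null if the
// account has no such label.
function findLabelId(labels, name) {
  for (var i = 0; i < labels.length; i++) {
    if (labels[i].name === name) return labels[i].id;
  }
  return null;
}

// Possible use inside Apps Script (not executed here):
// var labels = Gmail.Users.Labels.list('me').labels;
// var processedId = findLabelId(labels, 'PROCESSED');
// Gmail.Users.Messages.modify({addLabelIds: [processedId]}, 'me', messages[i].id);
```

Doing the lookup once, before the loop, keeps it from adding an extra API call per message.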

What alternative to ScriptDB I could use to store a big array of arrays? (without using external DB)

I was a user of the deprecated ScriptDB. The use I made of ScriptDB was fairly simple: to store a certain amount of information contained on a panel options, this way:
var db = ScriptDb.getMyDb();
function showList(folderID) {
var folder = DocsList.getFolderById(folderID);
var files = folder.getFiles();
var arrayList = [];
for (var file in files) {
file = files[file];
var thesesName = file.getName();
var thesesId = file.getId();
var thesesDoc = DocumentApp.openById(thesesId);
for (var child = 0; child < thesesDoc.getNumChildren(); child++){
var thesesFirstParagraph = thesesDoc.getChild(child);
var thesesType = thesesFirstParagraph.getText();
if (thesesType != ''){
var newArray = [thesesName, thesesType, thesesId];
arrayList.push(newArray);
break;
}
}
}
arrayList.sort();
var result = db.query({arrayName: 'savedArray'});
if (result.hasNext()) {
var savedArray = result.next();
savedArray.arrayValue = arrayList;
db.save(savedArray);
}
else {
var record = db.save({arrayName: "savedArray", arrayValue:arrayList});
}
var mydoc = SpreadsheetApp.getActiveSpreadsheet();
var app = UiApp.createApplication().setWidth(550).setHeight(450);
var panel = app.createVerticalPanel()
.setId('panel');
var label = app.createLabel("Choose the options").setStyleAttribute("fontSize", 18);
app.add(label);
panel.add(app.createHidden('checkbox_total', arrayList.length));
for(var i = 0; i < arrayList.length; i++){
var checkbox = app.createCheckBox().setName('checkbox_isChecked_'+i).setText(arrayList[i][0]);
panel.add(checkbox);
}
var handler = app.createServerHandler('submit').addCallbackElement(panel);
panel.add(app.createButton('Submit', handler));
var scroll = app.createScrollPanel().setPixelSize(500, 400);
scroll.add(panel);
app.add(scroll);
mydoc.show(app);
}
function include(arr, obj) {
for(var i=0; i<arr.length; i++) {
if (arr[i] == obj) // if we find a match, return true
return true; }
return false; // if we got here, there was no match, so return false
}
function submit(e){
var scriptDbObject = db.query({arrayName: "savedArray"});
var result = scriptDbObject.next();
var arrayList = result.arrayValue;
db.remove(result);
// continues...
}
I thought I could simply replace the ScriptDB by userProperties (using JSON to turn the array into string). However, an error warns me that my piece of information is too large to be stored in userProperties.
I did not want to use external databases (parse or MongoDB), because I think it isn't necessary for my (simple) purpose.
So, what solution could I use as a replacement for ScriptDB?
You could store a string using the HtmlOutput Class.
var output = HtmlService.createHtmlOutput('<b>Hello, world!</b>');
output.append('<p>Hello again, world.</p>');
Logger.log(output.getContent());
Google Documentation - HtmlOutput
There are methods to append, clear and get the content out of the HtmlOutput object.
OR
Maybe create a Blob:
Google Documentation - Utilities Class - newBlob Method
Then you can get the data out of the blob as a string.
getDataAsString
Then if you need to you can convert the string to an object if it's in the right JSON format.
Firstly, if you're hitting the limits on the Properties service, I would recommend you look at an alternative external store, as you're manipulating a large amount of data, and any workaround given here is possibly going to be slower and less efficient than simply using a dedicated service.
Alternatively of course, you could look at making your data come under the limits for the properties service by splitting it up and using multiple properties etc.
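A minimal sketch of that splitting idea follows. The helper names saveChunked and loadChunked are hypothetical, and store stands for any object with setProperty(key, value) and getProperty(key), such as the one returned by PropertiesService.getScriptProperties(). The chunk size is left as a parameter because the exact per-value limit is an assumption you should check against the current quotas; note that it counts characters, so text with many non-ASCII characters may need a smaller value.

```javascript
// Serialize `value` to JSON and spread it over several properties:
// key_0, key_1, ... plus key_count holding the number of chunks.
function saveChunked(store, key, value, chunkSize) {
  var json = JSON.stringify(value);
  var count = Math.ceil(json.length / chunkSize);
  for (var i = 0; i < count; i++) {
    store.setProperty(key + '_' + i, json.slice(i * chunkSize, (i + 1) * chunkSize));
  }
  store.setProperty(key + '_count', String(count));
}

// Reassemble and parse what saveChunked stored; returns null when
// nothing has been saved under `key`.
function loadChunked(store, key) {
  var count = Number(store.getProperty(key + '_count'));
  if (!count) return null;
  var json = '';
  for (var i = 0; i < count; i++) {
    json += store.getProperty(key + '_' + i);
  }
  return JSON.parse(json);
}

// Possible use inside Apps Script (not executed here):
// var store = PropertiesService.getScriptProperties();
// saveChunked(store, 'savedArray', arrayList, 8000);
// var arrayList = loadChunked(store, 'savedArray');
```

Chunks left over from an earlier, larger save are not deleted here; they are simply ignored because only key_count chunks are read back. The total size of the property store is also limited, so this only helps when the whole array fits within that overall quota.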
One other alternative would be to use a Google Doc or Sheet to store the string. When you're required to pull the data again, you can simply access the sheet and get the string, but this might be slow depending on the size of the string. At a glance it looks like you're just pulling Data on the folders in your drive, so you could consider writing it to a sheet, which would allow you to even display the information in a user friendly way. Given your use of arrays already, you can write them to a sheet easily using .setValues() if you convert them to a 2D array.
Bruce McPherson has done a lot of work on abstracting databases. Take a look at his cDbAbstraction library then you could easily chop and change which DB you use and compare performance. Maybe even create a cDbAbstraction library to use HTMLOutput (I like that idea Sandy, Bruce does some funky stuff with parallel processes via HTMLService)