How to use a for loop with .createChoice in Google Apps Script to create a quiz from a sheet? - google-apps-script

I am using Google Apps Script to generate Google Forms from a Sheet. Questions are in rows and question choices are in columns.
Here is a link to the Google sheet if needed.
It is a straightforward task when using .setChoiceValues(values)
if (questionType == 'CHOICE') {
var choicesForQuestion = [];
for (var j = 4; j < numberColumns; j++)
if (data[i][j] != "")
choicesForQuestion.push(data[i][j]);
form.addMultipleChoiceItem()
.setChoiceValues(choicesForQuestion);
}
However, when I try to use .createChoice(value, isCorrect), the parameters call for value to be a string and isCorrect to be Boolean.
An example without a loop looks like this:
var item = FormApp.getActiveForm().addCheckboxItem();
item.setTitle(data[3][1]);
// Set options and correct answers
item.setChoices([
item.createChoice("chocolate", true),
item.createChoice("vanilla", true),
item.createChoice("rum raisin", false),
item.createChoice("strawberry", true),
item.createChoice("mint", false)
]);
I cannot figure out how to add the loop. After reading other posts, I tried the following:
if (questionType == 'CHOICE') {
var questionInfo = [];
for (var j = optionsCol; j < maxOptions + 1; j++)
if (data[i][j] != "")
questionInfo.push( form.createChoice(data[i][j], data[i][j + maxOptions]) );
form.addMultipleChoiceItem()
.setChoices(questionInfo);
}
optionsCol is the first column of questions options
maxOptions is how many options are allowed by the sheet (currently 5). The isCorrect information is 5 columns to the right.
However, this is not working because the array questionInfo is empty.
What is the best way to do this?

Probably your issue is related to the method you reference--Form#createChoice--not existing. You need to call MultipleChoiceItem#createChoice, by first creating the item:
/**
* @param {Form} formObj the Google Form Quiz being created
* @param {any[]} data a 1-D array of data for configuring a multiple-choice quiz question
* @param {number} index The index into `data` that specifies the first choice
* @param {number} numChoices The maximum possible number of choices for the new item
*/
function addMCItemToForm_(formObj, data, index, numChoices) {
if (!formObj || !data || !Array.isArray(data)
|| Array.isArray(data[0]) || data.length < (index + 2 * numChoices))
{
console.error({message: "Bad args given", hasForm: !!formObj, info: data,
optionIndex: index, numChoices: numChoices});
throw new Error("Bad arguments given to `addMCItemToForm_` (view on StackDriver)");
}
const title = data[1];
// Shallow-copy the desired half-open interval [index, index + numChoices).
const choices = data.slice(index, index + numChoices);
// Shallow-copy the associated true/false data.
const correctness = data.slice(index + numChoices, index + 2 * numChoices);
const hasAtLeastOneChoice = choices.some(function (c, i) {
return (c && typeof correctness[i] === 'boolean');
});
if (hasAtLeastOneChoice) {
const mc = formObj.addMultipleChoiceItem().setTitle(title);
// Remove empty/unspecified choices.
while (choices[choices.length - 1] === "") {
choices.pop();
}
// Convert to choices for this specific MultipleChoiceItem.
mc.setChoices(choices.map(function (choice, i) {
return mc.createChoice(choice, correctness[i]);
}));
} else {
console.warn({message: "Skipped bad mc-item inputs", config: data,
choices: choices, correctness: correctness});
}
}
You would use the above function as described by its JSDoc - pass it a Google Form object instance to create the quiz item in, an array of the details for the question, and the description of the location of choice information within the details array. For example:
function foo() {
const form = FormApp.openById("some id");
const data = SpreadsheetApp.getActive().getSheetByName("Form Initializer")
.getSheetValues(/*row*/, /*col*/, /*numRows*/, /*numCols*/);
data.forEach(function (row) {
var qType = row[0];
...
if (qType === "CHOICE") {
addMCItemToForm_(form, row, optionColumn, numOptions);
} else if (qType === ...
...
}
References
Array#slice
Array#forEach
Array#map
Array#some
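The pairing of each choice with its isCorrect flag can also be tried on its own, outside a form. The following is a minimal sketch, assuming the row layout described in the question (choices start at optionsCol and the matching flags sit maxOptions columns to the right). pairChoices is a hypothetical helper name, not part of the Forms API. The loop bound here is optionsCol + maxOptions; a bound of maxOptions + 1 would stop early whenever optionsCol is greater than 1.

```javascript
// Hypothetical helper (not part of the Forms API): pairs every non-empty
// choice in a sheet row with the flag found maxOptions columns to its right.
function pairChoices(row, optionsCol, maxOptions) {
  var pairs = [];
  for (var j = optionsCol; j < optionsCol + maxOptions; j++) {
    var value = row[j];
    if (value !== "" && value != null) {
      pairs.push({
        value: String(value),
        // Assumption: the sheet stores real booleans (e.g. checkboxes).
        isCorrect: row[j + maxOptions] === true
      });
    }
  }
  return pairs;
}
```

Inside Apps Script, each pair could then be passed to item.createChoice(pair.value, pair.isCorrect) on the item returned by form.addMultipleChoiceItem().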

I am sure the above answer is very good and works but I am just a beginner and needed a more obvious (plodding) method. I am generating a form from a spreadsheet. Question types can include: short answer (text item), long answer (paragraph), drop down (list item), multiple choice, grid item, and checkbox questions, as well as sections.
I had to be able to randomize the input from the spreadsheet for multiple choice and sort the input for drop downs. I am only allowing one correct answer at this time.
The columns in the question building area of the spreadsheet are: question type, question, is it required, does it have points, hint, correct answer, and unlimited choice columns.
qShtArr: getDataRange of the entire sheet
corrAnsCol: index within the above of the column with the correct answer
begChoiceCol: index within the above of first column with choices
I hope this helps other less skilled coders.
/**
* Build array of choices. One may be identified as correct.
* I have not tried to handle multiple correct answers.
*/
function createChoices(make, qShtArr, r, action) {
// console.log('Begin createChoices - r: ', r);
let retObj = {}, choiceArr = [], corrArr = [], aChoice, numCol, hasCorr;
numCol = qShtArr[r].length - 1; // arrays start at zero
if ((qShtArr[r][corrAnsCol] != '') && (qShtArr[r][corrAnsCol] != null)) {
hasCorr = true;
choiceArr.push([qShtArr[r][corrAnsCol], true]);
for (let c = begChoiceCol ; c < numCol ; c++) {
aChoice = qShtArr[r][c];
if ((aChoice != '') && (aChoice != null)) { /* skip all blank elements */
choiceArr.push([aChoice, false]);
}
} //end for loop for multiple choice options
} else {
hasCorr = false;
for (let c = begChoiceCol ; c < numCol ; c++) {
aChoice = qShtArr[r][c];
if ((aChoice != '') && (aChoice != null)) { /* skip all blank elements */
choiceArr.push(aChoice);
}
} //end for loop for multiple choice options
}
if (action == 'random')
choiceArr = shuffleArrayOrder(choiceArr);
if (action == 'sort')
choiceArr.sort();
console.log('choiceArr: ', JSON.stringify(choiceArr) );
let choices = [], correctArr = [] ;
if (hasCorr) {
for ( let i = 0 ; i < choiceArr.length ; i++ ) {
choices.push(choiceArr[i][0]);
// console.log('choices: ', JSON.stringify(choices) );
correctArr.push(choiceArr[i][1]);
// console.log('correctArr: ', JSON.stringify(correctArr) );
}
make.setChoices(choices.map(function (choice, i) {
return make.createChoice(choice, correctArr[i]);
}));
} else { // no correct answer
if (action == 'columns' ) {
make.setColumns(choiceArr);
} else {
make.setChoices(choiceArr.map(function (choice, i) {
return make.createChoice(choice);
}));
}
}
}
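The function above calls shuffleArrayOrder, which is not shown in the answer. A minimal sketch of what it might look like follows, assuming it returns a shuffled copy and leaves its input untouched. The Fisher-Yates algorithm used here is a common choice for this, not necessarily what the author used.

```javascript
// Assumed implementation of the helper referenced above: a Fisher-Yates
// shuffle that returns a new array and does not modify the input.
function shuffleArrayOrder(arr) {
  var copy = arr.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    // Pick a random index in [0, i] and swap it into position i.
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}
```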

Related

xpath in apps script?

I made a formula to extract some Wikipedia data in Google Sheets, which works fine. Here is the formula:
=regexreplace(join("",flatten(IMPORTXML(D2,".//p[preceding-sibling::h2[1][contains(., 'Geography')]]"))),"\[[^\]]+\]","")&char(10)&char(10)&iferror(regexreplace(join("",flatten(IMPORTXML(D2,".//p[preceding-sibling::h2[1][contains(., 'Education')]]"))),"\[[^\]]+\]",""))
Where D2 is a URL like https://en.wikipedia.org/wiki/Abbeville,_Alabama
This extracts some Geography and Education data from the Wikipedia page. Trouble is that importxml only runs a few times before it dies due to quota.
So I thought maybe better to use Apps Script where there are much higher limits on fetching and parsing. I could not see a good way however of using Xpath in Apps Script. Older posts on the web discuss using a deprecated service called Xml but it seems to no longer work. There is a Service called XmlService which looks like it may do the job but you can't just plug in an Xpath. It looks like a lot of sweating to get to the result. Any solutions out there where you can just plug in Xpath?
Here is an alternative solution I actually do in a case like this.
I have used XmlService, but only for parsing the content, not for XPath. The approach relies on the element tags and has so far been fairly consistent in my tests. It might need tweaks when certain tags appear in the result, in which case you may have to add them to the exclusion condition.
Tested the code below in both links:
https://en.wikipedia.org/wiki/Abbeville,_Alabama#Geography
https://en.wikipedia.org/wiki/Montgomery,_Alabama#Education
My test shows that the formula above did not return the proper output for the 2nd link, while the code does (maybe because the output was too long).
Code:
function getGeoAndEdu(path) {
var data = UrlFetchApp.fetch(path).getContentText();
// wikipedia is divided into sections, if output is cut, increase the number
var regex = /.{1,100000}/g;
var results = [];
// flag to determine if matches should be added
var foundFlag = false;
do {
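var m = regex.exec(data);
// stop once the input is exhausted; m is null here and m[0] would throw
if (m === null) break;
if (foundFlag) {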
// if another header is found during generation of data, stop appending the matches
if (matchTag(m[0], "<h2>"))
foundFlag = false;
// exclude tables, sub-headers and divs containing image description
else if(matchTag(m[0], "<div") || matchTag(m[0], "<h3") ||
matchTag(m[0], "<td") || matchTag(m[0], "<th"))
continue;
else
results.push(m[0]);
}
// start capturing if either IDs are found
if (m != null && (matchTag(m[0], "id=\"Geography\"") ||
matchTag(m[0], "id=\"Education\""))) {
foundFlag = true;
}
} while (m);
var output = results.map(function (str) {
// clean tags for XmlService
str = str.replace(/<[^>]*>/g, '').trim();
var decode = XmlService.parse('<d>' + str + '</d>');
// convert html entity codes (e.g. &#160;) to text
return decode.getRootElement().getText();
// filter blank results due to cleaning and empty sections
// separate data and remove citations before returning output
}).filter(result => result.trim().length > 1).join("\n").replace(/\[\d+\]/g, '');
return output;
}
// check if tag is found in string
function matchTag(string, tag) {
var regex = RegExp(tag);
return string.match(regex) && string.match(regex)[0] == tag;
}
Output and difference: [screenshots not preserved: formula ending output, script ending output, end of the Education section on Wikipedia]
Note:
You still have quota when using UrlFetchApp but should be better than IMPORTXML's limit depending on the type of your account.
Reference:
Apps Script Quotas
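The two clean-up steps used in the script above (stripping tags, then removing numeric citation markers such as [1]) can be checked in isolation. This is a minimal sketch using the same regular expressions as the answer. cleanWikiText is a hypothetical name, and the sketch does not decode HTML entities, which the answer delegates to XmlService.

```javascript
// Hypothetical helper: removes HTML tags and numeric citation markers
// from a fragment, using the same patterns as the script above.
function cleanWikiText(fragment) {
  return fragment
    .replace(/<[^>]*>/g, "")   // drop every tag
    .replace(/\[\d+\]/g, "")   // drop citations like [12]
    .trim();
}
```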
Sorry, I got very busy this week so I didn't reply. I took a look at your answer, which seems to work fine, but it was quite code heavy. I wanted something I would understand, so I coded my own solution. Not that mine is any simpler; it's just my own code, so it's easier for me to follow:
function getTextBetweenTags(html, paramatersInFirstTag, paramatersInLastTag) { //finds text values between 2 tags and removes internal tags to leave plain text.
//eg getTextBetweenTags(html,[['class="mw-headline"'],['id="Geography"']],[['class="wikitable mw-collapsible mw-made-collapsible"']])
// **Note: you may want to replace &#number; with ascII number
var openingTagPos = null;
var closingTagPos = null;
var previousChar = '';
var readingTag = false;
var newTag = '';
var tagEnd = false;
var regexFirstTagParams = [];
var regexLastTagParams = [];
//prepare regexes to test for parameters in opening and closing tags. put regexes in arrays so each condition can be tested separately
for (var i in paramatersInFirstTag) {
regexFirstTagParams.push(new RegExp(escapeRegex(paramatersInFirstTag[i][0])))
}
for (var i in paramatersInLastTag) {
regexLastTagParams.push(new RegExp(escapeRegex(paramatersInLastTag[i][0])))
}
var startTagIndex = null;
var endTagIndex = null;
var matches = 0;
for (var i = 0; i < html.length - 1; i++) {
var nextChar = html.substr(i, 1);
if (nextChar == '<' && previousChar != '\\') {
readingTag = true;
}
if (nextChar == '>' && previousChar != '\\') { //if end of tag found, check tag matches start or end tag
readingTag = false;
newTag += nextChar;
//test for firstTag
if (startTagIndex == null) {
var alltestsPass = true;
for (var j in regexFirstTagParams) {
if (!regexFirstTagParams[j].test(newTag)) alltestsPass = false;
}
if (alltestsPass) {
startTagIndex = i + 1;
//console.log('Start Tag',startTagIndex)
matches++;
}
}
//test for lastTag
else if (startTagIndex != null) {
var alltestsPass = true;
for (var j in regexLastTagParams) {
if (!regexLastTagParams[j].test(newTag)) alltestsPass = false;
}
if (alltestsPass) {
endTagIndex = i + 1;
matches++;
}
}
if(startTagIndex && endTagIndex) break;
newTag = '';
}
if (readingTag) newTag += nextChar;
previousChar = nextChar;
}
if (matches < 2) return 'No matches';
else return html.substring(startTagIndex, endTagIndex).replace(/<[^>]+>/g, '');
}
function escapeRegex(string) {
if (string == null) return string;
return string.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}
My function requires an array of attributes for the start tag and an array of attributes for the end tag. It gets any text in between and removes any tags found in between. One issue I also noticed was that there were often special characters (e.g. &#160;), so they need to be replaced. I did that outside the scope of the function above.
The function could be easily improved to check the tag type (eg h2), but it wasn't necessary for the wikipedia case.
Here is a function where I called the above function. The html variable is just the result of UrlFetchApp.fetch('some wikipedia city url').getContentText();
function getWikiTexts(html) {
var geography = getTextBetweenTags(html, [['class="mw-headline"'], ['id="Geography']], [['class="mw-headline"']]);
var economy = getTextBetweenTags(html, [['class="mw-headline"'], ['id="Economy']], [['class="mw-headline"']]);
var education = getTextBetweenTags(html, [['class="mw-headline"'], ['id="Education']], [['class="mw-headline"']]);
var returnString = '';
if (geography != 'No matches' && !/Wikipedia/.test(geography)) returnString += geography + '\n';
if (economy != 'No matches' && !/Wikipedia/.test(economy)) returnString += economy + '\n';
if (education != 'No matches' && !/Wikipedia/.test(education)) returnString += education + '\n';
return returnString
}
Thanks for posting your answer.
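The answer above mentions replacing numeric character references outside the function but does not show that step. A minimal sketch of one way to do it follows, assuming only numeric references (decimal or hexadecimal) need decoding. decodeNumericEntities is a hypothetical name, and named entities are left untouched.

```javascript
// Hypothetical helper: decodes numeric character references such as
// &#160; or &#xA0; and leaves everything else as-is.
function decodeNumericEntities(text) {
  return text.replace(/&#([xX][0-9a-fA-F]+|\d+);/g, function (match, code) {
    var isHex = code.charAt(0).toLowerCase() === "x";
    var codePoint = isHex ? parseInt(code.slice(1), 16) : parseInt(code, 10);
    // Leave out-of-range references unchanged instead of throwing.
    if (codePoint > 0x10FFFF) return match;
    return String.fromCodePoint(codePoint);
  });
}
```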

Google app script - getting all hyperlinks from document [duplicate]

Given a "normal document" in Google Docs/Drive (e.g. paragraphs, lists, tables) which contains external links scattered throughout the content, how do you compile a list of links present using Google Apps Script?
Specifically, I want to update all broken links in the document by searching for oldText in each url and replace it with newText in each url, but not the text.
I don't think the replacing text section of the Dev Documentation is what I need -- do I need to scan every element of the doc? Can I just editAsText and use an html regex? Examples would be appreciated.
This is only mostly painful! Code is available as part of a gist.
Yeah, I can't spell.
getAllLinks
Here's a utility function that scans the document for all LinkUrls, returning them in an array.
/**
* Get an array of all LinkUrls in the document. The function is
* recursive, and if no element is provided, it will default to
* the active document's Body element.
*
* @param {Element} element The document element to operate on.
*
* @returns {Array} Array of objects, viz.
* {element,
* startOffset,
* endOffsetInclusive,
* url}
*/
function getAllLinks(element) {
var links = [];
element = element || DocumentApp.getActiveDocument().getBody();
if (element.getType() === DocumentApp.ElementType.TEXT) {
var textObj = element.editAsText();
var text = element.getText();
var inUrl = false;
for (var ch=0; ch < text.length; ch++) {
var url = textObj.getLinkUrl(ch);
if (url != null) {
if (!inUrl) {
// We are now!
inUrl = true;
var curUrl = {};
curUrl.element = element;
curUrl.url = String( url ); // grab a copy
curUrl.startOffset = ch;
}
else {
curUrl.endOffsetInclusive = ch;
}
}
else {
if (inUrl) {
// Not any more, we're not.
inUrl = false;
links.push(curUrl); // add to links
curUrl = {};
}
}
}
if (inUrl) {
// in case the link ends on the same char that the element does
links.push(curUrl);
}
}
else {
var numChildren = element.getNumChildren();
for (var i=0; i<numChildren; i++) {
links = links.concat(getAllLinks(element.getChild(i)));
}
}
return links;
}
findAndReplaceLinks
This utility builds on getAllLinks to do a find & replace function.
/**
* Replace all or part of UrlLinks in the document.
*
* @param {String} searchPattern the regex pattern to search for
* @param {String} replacement the text to use as replacement
*
* @returns {Number} number of Urls changed
*/
function findAndReplaceLinks(searchPattern,replacement) {
var links = getAllLinks();
var numChanged = 0;
for (var l=0; l<links.length; l++) {
var link = links[l];
if (link.url.match(searchPattern)) {
// This link needs to be changed
var newUrl = link.url.replace(searchPattern,replacement);
link.element.setLinkUrl(link.startOffset, link.endOffsetInclusive, newUrl);
numChanged++
}
}
return numChanged;
}
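The URL rewriting itself is plain string work and can be tested without a document. The sketch below is an assumption-laden variant rather than the answer's exact behavior: it compiles searchPattern once as a global regular expression (so every occurrence in a URL is replaced) and counts only URLs that actually changed. rewriteUrls is a hypothetical name.

```javascript
// Hypothetical helper: applies a regex search/replace to a list of URLs
// and reports how many of them were actually modified.
function rewriteUrls(urls, searchPattern, replacement) {
  var regex = new RegExp(searchPattern, "g");
  var numChanged = 0;
  var rewritten = urls.map(function (url) {
    var newUrl = url.replace(regex, replacement);
    if (newUrl !== url) numChanged++;
    return newUrl;
  });
  return { urls: rewritten, numChanged: numChanged };
}
```

Because searchPattern is treated as a regular expression, literal dots and other special characters in it need escaping, as in "old\\.example".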
Demo UI
To demonstrate the use of these utilities, here are a couple of UI extensions:
function onOpen() {
// Add a menu with some items, some separators, and a sub-menu.
DocumentApp.getUi().createMenu('Utils')
.addItem('List Links', 'sidebarLinks')
.addItem('Replace Link Text', 'searchReplaceLinks')
.addToUi();
}
function searchReplaceLinks() {
var ui = DocumentApp.getUi();
var app = UiApp.createApplication()
.setWidth(250)
.setHeight(100)
.setTitle('Change Url text');
var form = app.createFormPanel();
var flow = app.createFlowPanel();
flow.add(app.createLabel("Find: "));
flow.add(app.createTextBox().setName("searchPattern"));
flow.add(app.createLabel("Replace: "));
flow.add(app.createTextBox().setName("replacement"));
var handler = app.createServerHandler('myClickHandler');
flow.add(app.createSubmitButton("Submit").addClickHandler(handler));
form.add(flow);
app.add(form);
ui.showDialog(app);
}
// ClickHandler to close dialog
function myClickHandler(e) {
var app = UiApp.getActiveApplication();
app.close();
return app;
}
function doPost(e) {
var numChanged = findAndReplaceLinks(e.parameter.searchPattern,e.parameter.replacement);
var ui = DocumentApp.getUi();
var app = UiApp.createApplication();
sidebarLinks(); // Update list
var result = DocumentApp.getUi().alert(
'Results',
"Changed "+numChanged+" urls.",
DocumentApp.getUi().ButtonSet.OK);
}
/**
* Shows a custom HTML user interface in a sidebar in the Google Docs editor.
*/
function sidebarLinks() {
var links = getAllLinks();
var sidebar = HtmlService
.createHtmlOutput()
.setTitle('URL Links')
.setWidth(350 /* pixels */);
// Display list of links, url only.
for (var l=0; l<links.length; l++) {
var link = links[l];
sidebar.append('<p>'+link.url);
}
DocumentApp.getUi().showSidebar(sidebar);
}
I offer another, shorter answer for your first question, concerning iterating through all links in a document's body. This instructive code returns a flat array of links in the current document's body, where each link is represented by an object with entries pointing to the text element (text), the paragraph element or list item element in which it's contained (paragraph), the offset index in the text where the link appears (startOffset) and the URL itself (url). Hopefully, you'll find it easy to suit it for your own needs.
It uses the getTextAttributeIndices() method rather than iterating over every character of the text, and is thus expected to perform much more quickly than previously written answers.
EDIT: Since originally posting this answer, I modified the function a couple of times. It now also (1) includes the endOffsetInclusive property for each link (note that it can be null for links that extend to the end of the text element - in this case one can use link.text.length-1 instead); (2) finds links in all sections of the document, not only the body, and (3) includes the section and isFirstPageSection properties to indicate where the link is located; (4) accepts the argument mergeAdjacent, which when set to true, will return only a single link entry for a continuous stretch of text linked to the same URL (which would be considered separate if, for instance, part of the text is styled differently than another part).
For the purpose of including links under all sections, a new utility function, iterateSections(), was introduced.
/**
* Returns a flat array of links which appear in the active document's body.
* Each link is represented by a simple Javascript object with the following
* keys:
* - "section": {ContainerElement} the document section in which the link is
* found.
* - "isFirstPageSection": {Boolean} whether the given section is a first-page
* header/footer section.
* - "paragraph": {ContainerElement} contains a reference to the Paragraph
* or ListItem element in which the link is found.
* - "text": the Text element in which the link is found.
* - "startOffset": {Number} the position (offset) at which the link text begins.
* - "endOffsetInclusive": the position of the last character of the link
* text, or null if the link extends to the end of the text element.
* - "url": the URL of the link.
*
* @param {boolean} mergeAdjacent Whether consecutive links which carry
* different attributes (for any reason) should be returned as a single
* entry.
*
* @returns {Array} the aforementioned flat array of links.
*/
function getAllLinks(mergeAdjacent) {
var links = [];
var doc = DocumentApp.getActiveDocument();
iterateSections(doc, function(section, sectionIndex, isFirstPageSection) {
if (!("getParagraphs" in section)) {
// as we're using some undocumented API, adding this to avoid cryptic
// messages upon possible API changes.
throw new Error("An API change has caused this script to stop " +
"working.\n" +
"Section #" + sectionIndex + " of type " +
section.getType() + " has no .getParagraphs() method. " +
"Stopping script.");
}
section.getParagraphs().forEach(function(par) {
// skip empty paragraphs
if (par.getNumChildren() == 0) {
return;
}
// go over all text elements in paragraph / list-item
for (var el=par.getChild(0); el!=null; el=el.getNextSibling()) {
if (el.getType() != DocumentApp.ElementType.TEXT) {
continue;
}
// go over all styling segments in text element
var attributeIndices = el.getTextAttributeIndices();
var lastLink = null;
attributeIndices.forEach(function(startOffset, i, attributeIndices) {
var url = el.getLinkUrl(startOffset);
if (url != null) {
// we hit a link
var endOffsetInclusive = (i+1 < attributeIndices.length?
attributeIndices[i+1]-1 : null);
// check if this and the last found link are continuous
if (mergeAdjacent && lastLink != null && lastLink.url == url &&
lastLink.endOffsetInclusive == startOffset - 1) {
// this and the previous style segment are continuous
lastLink.endOffsetInclusive = endOffsetInclusive;
return;
}
lastLink = {
"section": section,
"isFirstPageSection": isFirstPageSection,
"paragraph": par,
"text": el,
"startOffset": startOffset,
"endOffsetInclusive": endOffsetInclusive,
"url": url
};
links.push(lastLink);
}
});
}
});
});
return links;
}
/**
* Calls the given function for each section of the document (body, header,
* etc.). Sections are children of the DocumentElement object.
*
* @param {Document} doc The Document object (such as the one obtained via
* a call to DocumentApp.getActiveDocument()) with the sections to iterate
* over.
* @param {Function} func A callback function which will be called, for each
* section, with the following arguments (in order):
* - {ContainerElement} section - the section element
* - {Number} sectionIndex - the child index of the section, such that
* doc.getBody().getParent().getChild(sectionIndex) == section.
* - {Boolean} isFirstPageSection - whether the section is a first-page
* header/footer section.
*/
function iterateSections(doc, func) {
// get the DocumentElement interface to iterate over all sections
// this bit is undocumented API
var docEl = doc.getBody().getParent();
var regularHeaderSectionIndex = (doc.getHeader() == null? -1 :
docEl.getChildIndex(doc.getHeader()));
var regularFooterSectionIndex = (doc.getFooter() == null? -1 :
docEl.getChildIndex(doc.getFooter()));
for (var i=0; i<docEl.getNumChildren(); ++i) {
var section = docEl.getChild(i);
var sectionType = section.getType();
var uniqueSectionName;
var isFirstPageSection = (
i != regularHeaderSectionIndex &&
i != regularFooterSectionIndex &&
(sectionType == DocumentApp.ElementType.HEADER_SECTION ||
sectionType == DocumentApp.ElementType.FOOTER_SECTION));
func(section, i, isFirstPageSection);
}
}
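The segment logic in getAllLinks (deriving each link's end offset from the next attribute index, and optionally merging adjacent segments with the same URL) can be tried without a document. In this minimal sketch, linksFromSegments is a hypothetical name, urlAt is a hypothetical callback standing in for el.getLinkUrl(offset), and textLength is passed in so the last segment gets a concrete end offset instead of null.

```javascript
// Hypothetical helper mirroring the per-segment loop above.
// attributeIndices: start offsets of styling segments, ascending.
// urlAt(offset): returns the link URL at that offset, or null.
function linksFromSegments(attributeIndices, urlAt, textLength, mergeAdjacent) {
  var links = [];
  var lastLink = null;
  attributeIndices.forEach(function (startOffset, i) {
    var url = urlAt(startOffset);
    if (url == null) return; // not a link segment
    var end = (i + 1 < attributeIndices.length)
      ? attributeIndices[i + 1] - 1
      : textLength - 1;
    // Extend the previous link when it is contiguous and has the same URL.
    if (mergeAdjacent && lastLink !== null && lastLink.url === url &&
        lastLink.endOffsetInclusive === startOffset - 1) {
      lastLink.endOffsetInclusive = end;
      return;
    }
    lastLink = { startOffset: startOffset, endOffsetInclusive: end, url: url };
    links.push(lastLink);
  });
  return links;
}
```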
I was playing around and incorporated @Mogsdad's answer -- here's the really complicated version:
var _ = Underscorejs.load(); // loaded via http://googleappsdeveloper.blogspot.com/2012/11/using-open-source-libraries-in-apps.html, rolled my own
var ui = DocumentApp.getUi();
// #region --------------------- Utilities -----------------------------
var gDocsHelper = (function(P, un) {
// heavily based on answer https://stackoverflow.com/a/18731628/1037948
var updatedLinkText = function(link, offset) {
return function() { return 'Text: ' + link.getText().substring(offset,100) + ((link.getText().length-offset) > 100 ? '...' : ''); }
}
P.updateLink = function updateLink(link, oldText, newText, start, end) {
var oldLink = link.getLinkUrl(start);
if(0 > oldLink.indexOf(oldText)) return false;
var newLink = oldLink.replace(new RegExp(oldText, 'g'), newText);
link.setLinkUrl(start || 0, (end || oldLink.length), newLink);
log(true, "Updating Link: ", oldLink, newLink, start, end, updatedLinkText(link, start) );
return { old: oldLink, "new": newLink, getText: updatedLinkText(link, start) };
};
// moving this reused block out to 'private' fn
var updateLinkResult = function(text, oldText, newText, link, urls, sidebar, updateResult) {
// and may as well update the link while we're here
if(false !== (updateResult = P.updateLink(text, oldText, newText, link.start, link.end))) {
sidebar.append('<li>' + updateResult['old'] + ' → ' + updateResult['new'] + ' at ' + updateResult['getText']() + '</li>');
}
urls.push(link.url); // so multiple links get added to list
};
P.updateLinksMenu = function() {
// https://developers.google.com/apps-script/reference/base/prompt-response
var oldText = ui.prompt('Old link text to replace').getResponseText();
var newText = ui.prompt('New link text to replace with').getResponseText();
log('Replacing: ' + oldText + ', ' + newText);
var sidebar = gDocUiHelper.createSidebar('Update All Links', '<h3>Replacing</h3><p><code>' + oldText + '</code> → <code>' + newText + '</code></p><hr /><ol>');
// current doc available to script
var doc = DocumentApp.getActiveDocument().getBody();//.getActiveSection();
// Search until a link is found
var links = P.findAllElementsFor(doc, function(text) {
var i = -1, n = text.getText().length, link = false, url, urls = [], updateResult;
// note: the following only gets the FIRST link in the text -- while(i < n && !(url = text.getLinkUrl(i++)));
// scan the text element for links
while(++i < n) {
// getLinkUrl will continue to get a link while INSIDE the stupid link, so only do this once
if(url = text.getLinkUrl(i)) {
if(false === link) {
link = { start: i, end: -1, url: url };
// log(true, 'Type: ' + text.getType(), 'Link: ' + url, function() { return 'Text: ' + text.getText().substring(i,100) + ((n-i) > 100 ? '...' : '')});
}
else {
link.end = i; // keep updating the end position until we leave
}
}
// just left the link -- reset link tracking
else if(false !== link) {
// and may as well update the link while we're here
updateLinkResult(text, oldText, newText, link, urls, sidebar);
link = false; // reset "counter"
}
}
// once we've reached the end of the text, must also check to see if the last thing we found was a link
if(false !== link) updateLinkResult(text, oldText, newText, link, urls, sidebar);
return urls;
});
sidebar.append('</ol><p><strong>' + links.length + ' links reviewed</strong></p>');
gDocUiHelper.attachSidebar(sidebar);
log(links);
};
P.findAllElementsFor = function(el, test) {
// generic utility function to recursively find all elements; heavily based on https://stackoverflow.com/a/18731628/1037948
var results = [], searchResult = null, i, result;
// https://developers.google.com/apps-script/reference/document/body#findElement(ElementType)
while (searchResult = el.findElement(DocumentApp.ElementType.TEXT, searchResult)) {
var t = searchResult.getElement().editAsText(); // .asParagraph()
// check to add to list
if(test && (result = test(t))) {
if( _.isArray(result) ) results = results.concat(result); // could be big? http://jsperf.com/self-concatenation/
else results.push(result);
}
}
// recurse children if not plain text item
if(el.getType() !== DocumentApp.ElementType.TEXT) {
i = el.getNumChildren();
var result;
while(--i > 0) {
result = P.findAllElementsFor(el.getChild(i));
if(result && result.length > 0) results = results.concat(result);
}
}
return results;
};
return P;
})({});
// really? it can't handle object properties?
function gDocsUpdateLinksMenu() {
gDocsHelper.updateLinksMenu();
}
gDocUiHelper.addMenu('Zaus', [ ['Update links', 'gDocsUpdateLinksMenu'] ]);
// #endregion --------------------- Utilities -----------------------------
And I'm including the "extra" utility classes for creating menus, sidebars, etc below for completeness:
var log = function() {
// return false;
var args = Array.prototype.slice.call(arguments);
// allowing functions delegates execution so we can save some non-debug cycles if code left in?
if(args[0] === true) Logger.log(_.map(args, function(v) { return _.isFunction(v) ? v() : v; }).join('; '));
else
_.each(args, function(v) {
Logger.log(_.isFunction(v) ? v() : v);
});
}
// #region --------------------- Menu -----------------------------
var gDocUiHelper = (function(P, un) {
P.addMenuToSheet = function addMenu(spreadsheet, title, items) {
var menu = ui.createMenu(title);
// make sure menu items are correct format
_.each(items, function(v,k) {
var err = [];
// provided in format [ [name, fn],... ] instead
if( _.isArray(v) ) {
if ( v.length === 2 ) {
menu.addItem(v[0], v[1]);
}
else {
err.push('Menu item ' + k + ' missing name or function: ' + v.join(';'))
}
}
else {
if( !v.name ) err.push('Menu item ' + k + ' lacks name');
if( !v.functionName ) err.push('Menu item ' + k + ' lacks function');
if(!err.length) menu.addItem(v.name, v.functionName);
}
if(err.length) {
log(err);
ui.alert(err.join('; '));
}
});
menu.addToUi();
};
// list of things to hook into
var initializers = {};
P.addMenu = function(menuTitle, menuItems) {
if(initializers[menuTitle] === un) {
initializers[menuTitle] = [];
}
initializers[menuTitle] = initializers[menuTitle].concat(menuItems);
};
P.createSidebar = function(title, content, options) {
var sidebar = HtmlService
.createHtmlOutput()
.setTitle(title)
.setWidth( (options && options.width) ? options.width : 350 /* pixels */);
sidebar.append(content);
if(options && options.on) DocumentApp.getUi().showSidebar(sidebar);
// else { sidebar.attach = function() { DocumentApp.getUi().showSidebar(this); }; } // should really attach to prototype...
return sidebar;
};
P.attachSidebar = function(sidebar) {
DocumentApp.getUi().showSidebar(sidebar);
};
P.onOpen = function() {
var spreadsheet = SpreadsheetApp.getActive();
log(initializers);
_.each(initializers, function(v,k) {
P.addMenuToSheet(spreadsheet, k, v);
});
};
return P;
})({});
// #endregion --------------------- Menu -----------------------------
/**
* A special function that runs when the spreadsheet is open, used to add a
* custom menu to the spreadsheet.
*/
function onOpen() {
gDocUiHelper.onOpen();
}
I had some trouble getting Mogsdad's solution to work. Specifically, it misses links which end their parent element, so there isn't a trailing non-link character to terminate them. I've implemented something which addresses this and returns a standard range element. Sharing it here in case someone finds it useful.
function getAllLinks(element) {
var rangeBuilder = DocumentApp.getActiveDocument().newRange();
// Parse the text iteratively to find the start and end indices for each link
if (element.getType() === DocumentApp.ElementType.TEXT) {
var links = [];
var string = element.getText();
var previousUrl = null; // The URL of the previous character
var currentLink = null; // The latest link being built
for (var charIndex = 0; charIndex < string.length; charIndex++) {
var currentUrl = element.getLinkUrl(charIndex);
// New URL means create a new link
if (currentUrl !== null && previousUrl !== currentUrl) {
if (currentLink !== null) links.push(currentLink);
currentLink = {};
currentLink.url = String(currentUrl);
currentLink.startOffset = charIndex;
}
// In a URL means extend the end of the current link
if (currentUrl !== null) {
currentLink.endOffsetInclusive = charIndex;
}
// Not in a URL means close and push the link if ready
if (currentUrl === null) {
if (currentLink !== null) links.push(currentLink);
currentLink = null;
}
// End the loop and go again
previousUrl = currentUrl;
}
// Handle the end case when final character is a link
if (currentLink !== null) links.push(currentLink);
// Convert the links into a range before returning
links.forEach(function(link) {
rangeBuilder.addElement(element, link.startOffset, link.endOffsetInclusive);
});
}
// If not a text element then recursively get links from child elements
else if (element.getNumChildren) {
for (var i = 0; i < element.getNumChildren(); i++) {
rangeBuilder.addRange(getAllLinks(element.getChild(i)));
}
}
return rangeBuilder.build();
}
You are right: search and replace is not applicable here.
Use setLinkUrl(): https://developers.google.com/apps-script/reference/document/container-element#setLinkUrl(String)
Basically, you have to iterate through the elements recursively (elements can contain elements), and for each one:
use getLinkUrl() to get the old URL;
if it is not null, call setLinkUrl(newText), which leaves the displayed text unchanged.
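As a rough sketch of that recursive walk (not a definitive implementation): the function name replaceLinkUrls is made up for illustration, and it assumes elements can be told apart by duck typing, with containers exposing getNumChildren()/getChild() and Text elements exposing getText(), getLinkUrl(offset) and setLinkUrl(start, end, url).

```javascript
/**
 * Recursively rewrites link URLs under `element`, leaving the visible text alone.
 * Assumption: containers expose getNumChildren/getChild, and Text elements
 * expose getText, getLinkUrl(offset) and setLinkUrl(start, endInclusive, url).
 * Returns the number of linked runs whose URL was changed.
 */
function replaceLinkUrls(element, oldText, newText) {
  var changed = 0;
  if (typeof element.getNumChildren === 'function') {
    // Container (body, paragraph, table, ...): descend into each child.
    for (var i = 0; i < element.getNumChildren(); i++) {
      changed += replaceLinkUrls(element.getChild(i), oldText, newText);
    }
  } else if (typeof element.getText === 'function' &&
             typeof element.getLinkUrl === 'function') {
    // Text element: walk the characters and handle each run of equal URLs once.
    var length = element.getText().length;
    var start = 0;
    while (start < length) {
      var url = element.getLinkUrl(start);
      var end = start;
      while (end + 1 < length && element.getLinkUrl(end + 1) === url) end++;
      if (url !== null && url.indexOf(oldText) !== -1) {
        // Plain substring replacement; only the URL changes, not the text.
        element.setLinkUrl(start, end, url.split(oldText).join(newText));
        changed++;
      }
      start = end + 1;
    }
  }
  return changed;
}
```

In a document-bound script this would presumably be called as replaceLinkUrls(DocumentApp.getActiveDocument().getBody(), oldText, newText).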
This Excel macro lists the links from a Word doc. You'd need to copy your data into a Word doc first.
Sub getLinks()
Dim wApp As Word.Application, wDoc As Word.Document
Dim i As Integer, r As Range
Const filePath = "C:\test\test.docx"
Set wApp = CreateObject("Word.Application")
'wApp.Visible = True
Set wDoc = wApp.Documents.Open(filePath)
Set r = Range("A1")
For i = 1 To wDoc.Hyperlinks.Count
r = wDoc.Hyperlinks(i).Address
Set r = r.Offset(1, 0)
Next i
wApp.Quit
Set wDoc = Nothing
Set wApp = Nothing
End Sub
Here's a quick and dirty way to accomplish the same goal with no scripting:
From Google Docs, save the document in RTF format.
In your editor of choice, edit the links in the RTF file (in my case, I wanted to modify all the hyperlinks, so I used Emacs and regexp-replace). Save the file when you're done.
Create a fresh, new Google Doc, and from the menu, select File>Open and open the RTF file. Docs will convert your edited RTF file back into a proper Google Doc, restoring all formatting.
Google Docs' RTF format is pretty complete--I haven't noticed any loss of fidelity in making the round trip, and it has the advantage of fully exposing all the hyperlinks, formatting, and everything else about the document in a form that's easy to edit and to apply regex tools to.
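If you would rather script the edit step than do it in an editor, something like the following could work on the saved RTF text. This is only a sketch: replaceRtfLinkTargets is a made-up name, and it assumes the links appear as HYPERLINK "url" field instructions, which is worth checking against your own exported file first.

```javascript
// Rewrites the target of every HYPERLINK field in an RTF string.
// Assumption: each link is stored as  HYPERLINK "url"  inside a field
// instruction; the visible link text lives elsewhere and is not touched.
function replaceRtfLinkTargets(rtf, oldText, newText) {
  return rtf.replace(/(HYPERLINK\s+")([^"]*)(")/g, function (match, open, url, close) {
    // Plain substring replacement inside the quoted URL only.
    return open + url.split(oldText).join(newText) + close;
  });
}
```

Only the quoted URL after HYPERLINK is rewritten, so link text that happens to contain the same string stays as it was.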

Can Google apps script be used to randomize page order on Google forms?

Update #2: Okay, I'm pretty sure my error in update #1 was because of indexing out of bounds over the array (I'm still not used to JS indexing at 0). But here is the new problem... if I write out the different combinations of the loop manually, setting the page index to 1 in moveItem() like so:
newForm.moveItem(itemsArray[0][0], 1);
newForm.moveItem(itemsArray[0][1], 1);
newForm.moveItem(itemsArray[0][2], 1);
newForm.moveItem(itemsArray[1][0], 1);
newForm.moveItem(itemsArray[1][1], 1);
newForm.moveItem(itemsArray[1][2], 1);
newForm.moveItem(itemsArray[2][0], 1);
...
...I don't get any errors but the items end up on different pages! What is going on?
Update #1: Using Sandy Good's answer as well as a script I found at this WordPress blog, I have managed to get closer to what I needed. I believe Sandy Good misinterpreted what I wanted to do because I wasn't specific enough in my question.
I would like to:
Get all items from a page (section header, images, question etc)
Put them into an array
Do this for all pages, adding these arrays to an array (i.e: [[all items from page 1][all items from page 2][all items from page 3]...])
Shuffle the elements of this array
Repopulate a new form with each element of this array. In this way, page order will be randomized.
My JavaScript skills are poor (this is the first time I've used it). There is a step that produces null entries and I don't know why... I had to remove them manually. I am not able to complete step 5 as I get the following error:
Cannot convert Item,Item,Item to (class).
"Item,Item,Item" is the array element containing all the items from a particular page. So it seems that I can't add three items to a page at a time? Or is something else going on here?
Here is my code:
function shuffleForms() {
var itemsArray,shuffleQuestionsInNewForm,fncGetQuestionID,
newFormFile,newForm,newID,shuffle, sections;
// Copy template form by ID, set a new name
newFormFile = DriveApp.getFileById('1prfcl-RhaD4gn0b2oP4sbcKaRcZT5XoCAQCbLm1PR7I')
.makeCopy();
newFormFile.setName('AAAAA_Shuffled_Form');
// Get ID of new form and open it
newID = newFormFile.getId();
newForm = FormApp.openById(newID);
// Initialize array to put IDs in
itemsArray = [];
function getPageItems(thisPageNum) {
Logger.log("Getting items for page number: " + thisPageNum );
var thisPageItems = []; // Used for result
var thisPageBreakIndex = getPageItem(thisPageNum).getIndex();
Logger.log( "This is index num : " + thisPageBreakIndex );
// Get all items from page
var allItems = newForm.getItems();
thisPageItems.push(allItems[thisPageBreakIndex]);
Logger.log( "Added pagebreak item: " + allItems[thisPageBreakIndex].getIndex() );
for( var i = thisPageBreakIndex+1; ( i < allItems.length ) && ( allItems[i].getType() != FormApp.ItemType.PAGE_BREAK ); ++i ) {
thisPageItems.push(allItems[i]);
Logger.log( "Added non-pagebreak item: " + allItems[i].getIndex() );
}
return thisPageItems;
}
function shuffle(array) {
var currentIndex = array.length, temporaryValue, randomIndex;
Logger.log('shuffle ran')
// While there remain elements to shuffle...
while (0 !== currentIndex) {
// Pick a remaining element...
randomIndex = Math.floor(Math.random() * currentIndex);
currentIndex -= 1;
// And swap it with the current element.
temporaryValue = array[currentIndex];
array[currentIndex] = array[randomIndex];
array[randomIndex] = temporaryValue;
}
return array;
}
function shuffleAndMove() {
// Get page items for all pages into an array
for(i = 2; i <= 5; i++) {
itemsArray[i] = getPageItems(i);
}
// Removes null values from array
itemsArray = itemsArray.filter(function(x){return x});
// Shuffle page items
itemsArray = shuffle(itemsArray);
// Move page items to the new form
for(i = 2; i <= 5; ++i) {
newForm.moveItem(itemsArray[i], i);
}
}
shuffleAndMove();
}
Original post: I have used Google forms to create a questionnaire. For my purposes, each question needs to be on a separate page but I need the pages to be randomized. A quick Google search shows this feature has not been added yet.
I see that the Form class in Google Apps Script has a number of methods that alter or give access to various properties of Google Forms. Since I do not know JavaScript and am not too familiar with the Google Apps APIs, I would like to know if what I am trying to do is even possible before diving in and figuring it all out.
If it is possible, I would appreciate any insight on what methods would be relevant for this task just to give me some direction to get started.
Based on comments from Sandy Good and two SE questions found here and here, this is the code I have so far:
// Script to shuffle question in a Google Form when the questions are in separate sections
function shuffleFormSections() {
getQuestionID();
createNewShuffledForm();
}
// Get question IDs
function getQuestionID() {
var form = FormApp.getActiveForm();
var items = form.getItems();
arrayID = [];
for (var i in items) {
arrayID[i] = items[i].getId();
}
// Logger.log(arrayID);
return(arrayID);
}
// Shuffle function
function shuffle(a) {
var j, x, i;
for (i = a.length; i; i--) {
j = Math.floor(Math.random() * i);
x = a[i - 1];
a[i - 1] = a[j];
a[j] = x;
}
}
// Shuffle IDs and create new form with new question order
function createNewShuffledForm() {
shuffle(arrayID);
// Logger.log(arrayID);
var newForm = FormApp.create('Shuffled Form');
for (var i in arrayID) {
arrayID[i].getItemsbyId();
}
}
Try this. There are a few "constants" to set at the top of the function; check the comments. Form file copying and opening borrowed from Sandy Good's answer, thanks!
// This is the function to run, all the others here are helper functions
// You'll need to set your source file id and your destination file name in the
// constants at the top of this function here.
// It appears that the "Title" page does not count as a page, so you don't need
// to include it in the PAGES_AT_BEGINNING_TO_NOT_SHUFFLE count.
function shuffleFormPages() {
// UPDATE THESE CONSTANTS AS NEEDED
var PAGES_AT_BEGINNING_TO_NOT_SHUFFLE = 2; // preserve X intro pages; shuffle everything after page X
var SOURCE_FILE_ID = 'YOUR_SOURCE_FILE_ID_HERE';
var DESTINATION_FILE_NAME = 'YOUR_DESTINATION_FILE_NAME_HERE';
// Copy template form by ID, set a new name
var newFormFile = DriveApp.getFileById(SOURCE_FILE_ID).makeCopy();
newFormFile.setName(DESTINATION_FILE_NAME);
// Open the duplicated form file as a form
var newForm = FormApp.openById(newFormFile.getId());
var pages = extractPages(newForm);
shuffleEndOfPages(pages, PAGES_AT_BEGINNING_TO_NOT_SHUFFLE);
var shuffledFormItems = flatten(pages);
setFormItems(newForm, shuffledFormItems);
}
// Builds an array of "page" arrays. Each page array starts with a page break
// and continues until the next page break.
function extractPages(form) {
var formItems = form.getItems();
var currentPage = [];
var allPages = [];
formItems.forEach(function(item) {
if (item.getType() == FormApp.ItemType.PAGE_BREAK && currentPage.length > 0) {
// found a page break (and it isn't the first one)
allPages.push(currentPage); // push what we've built for this page onto the output array
currentPage = [item]; // reset the current page to just this most recent item
} else {
currentPage.push(item);
}
});
// We've got the last page dangling, so add it
allPages.push(currentPage);
return allPages;
};
// startIndex is the array index to start shuffling from. E.g. to start
// shuffling on page 5, startIndex should be 4. startIndex could also be thought
// of as the number of pages to keep unshuffled.
// This function has no return value, it just mutates pages
function shuffleEndOfPages(pages, startIndex) {
var currentIndex = pages.length;
// While there remain elements to shuffle...
while (currentIndex > startIndex) {
// Pick an element between startIndex and currentIndex (inclusive)
var randomIndex = Math.floor(Math.random() * (currentIndex - startIndex)) + startIndex;
currentIndex -= 1;
// And swap it with the current element.
var temporaryValue = pages[currentIndex];
pages[currentIndex] = pages[randomIndex];
pages[randomIndex] = temporaryValue;
}
};
// Sourced from elsewhere on SO:
// https://stackoverflow.com/a/15030117/4280232
function flatten(array) {
return array.reduce(
function (flattenedArray, toFlatten) {
return flattenedArray.concat(Array.isArray(toFlatten) ? flatten(toFlatten) : toFlatten);
},
[]
);
};
// No safety checks around items being the same as the form length or whatever.
// This mutates form.
function setFormItems(form, items) {
items.forEach(function(item, index) {
form.moveItem(item, index);
});
};
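To see why moving each item to its target index, in order, reproduces the shuffled sequence, here is a plain-array simulation. The moveItem helper below is a stand-in written for illustration (remove the item, then re-insert it at the index); it is assumed, not confirmed, that Form.moveItem behaves the same way.

```javascript
// Stand-in for Form.moveItem on a plain array: take the item out of its
// current position and re-insert it at toIndex. (Hypothetical helper.)
function moveItem(list, item, toIndex) {
  list.splice(list.indexOf(item), 1);
  list.splice(toIndex, 0, item);
}

// Same loop as setFormItems, applied to the plain-array stand-in.
// After step k, positions 0..k already match the target order.
function applyOrder(list, target) {
  target.forEach(function (item, index) {
    moveItem(list, item, index);
  });
  return list;
}

var current = ['intro', 'page1', 'page2', 'page3'];
var target = ['intro', 'page3', 'page1', 'page2'];
applyOrder(current, target);
console.log(current.join(', '));
```

Because earlier positions are never disturbed by later moves, no bookkeeping of shifted indices is needed.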
I tested this code. It created a new Form, and then shuffled the questions in the new Form. It excludes page breaks, images and section headers. You need to provide a source file ID for the original template Form. This function has 3 inner sub-functions. The inner functions are at the top, and they are called at the bottom of the outer function. The arrayOfIDs variable does not need to be returned or passed to another function because it is available in the outer scope.
function shuffleFormSections() {
var arrayOfIDs,shuffleQuestionsInNewForm,fncGetQuestionID,
newFormFile,newForm,newID,items,shuffle;
newFormFile = DriveApp.getFileById('Put the source file ID here')
.makeCopy();
newFormFile.setName('AAAAA_Shuffled_Form');
newID = newFormFile.getId();
newForm = FormApp.openById(newID);
arrayOfIDs = [];
fncGetQuestionID = function() {
var i,L,thisID,thisItem,thisType;
items = newForm.getItems();
L = items.length;
for (i=0;i<L;i++) {
thisItem = items[i];
thisType = thisItem.getType();
if (thisType === FormApp.ItemType.PAGE_BREAK ||
thisType === FormApp.ItemType.SECTION_HEADER ||
thisType === FormApp.ItemType.IMAGE) {
continue;
}
thisID = thisItem.getId();
arrayOfIDs.push(thisID);
}
Logger.log('arrayOfIDs: ' + arrayOfIDs);
//the array arrayOfIDs does not need to be returned since it is available
//in the outermost scope
}// End of fncGetQuestionID function
shuffle = function() {// Shuffle function
var j, x, i;
Logger.log('shuffle ran')
for (i = arrayOfIDs.length; i; i--) {
j = Math.floor(Math.random() * i);
Logger.log('j: ' + j)
x = arrayOfIDs[i - 1];
Logger.log('x: ' + x)
arrayOfIDs[i - 1] = arrayOfIDs[j];
arrayOfIDs[j] = x;
}
Logger.log('arrayOfIDs: ' + arrayOfIDs)
}
shuffleQuestionsInNewForm = function() {
var i,L,thisID,thisItem,thisQuestion,questionType;
L = arrayOfIDs.length;
for (i=0;i<L;i++) {
thisID = arrayOfIDs[i];
Logger.log('thisID: ' + thisID)
thisItem = newForm.getItemById(thisID);
newForm.moveItem(thisItem, i)
}
}
fncGetQuestionID();//Get all the question ID's and put them into an array
shuffle();
shuffleQuestionsInNewForm();
}

Get All Links in a Document

Given a "normal document" in Google Docs/Drive (e.g. paragraphs, lists, tables) which contains external links scattered throughout the content, how do you compile a list of links present using Google Apps Script?
Specifically, I want to update all broken links in the document by searching for oldText in each URL and replacing it with newText, changing only the URL and not the displayed text.
I don't think the replacing text section of the Dev Documentation is what I need -- do I need to scan every element of the doc? Can I just editAsText and use an html regex? Examples would be appreciated.
This is only mostly painful! Code is available as part of a gist.
Yeah, I can't spell.
getAllLinks
Here's a utility function that scans the document for all LinkUrls, returning them in an array.
/**
* Get an array of all LinkUrls in the document. The function is
* recursive, and if no element is provided, it will default to
* the active document's Body element.
*
* @param {Element} element The document element to operate on.
*
* @returns {Array} Array of objects, viz.
* {element,
* startOffset,
* endOffsetInclusive,
* url}
*/
function getAllLinks(element) {
var links = [];
element = element || DocumentApp.getActiveDocument().getBody();
if (element.getType() === DocumentApp.ElementType.TEXT) {
var textObj = element.editAsText();
var text = element.getText();
var inUrl = false;
for (var ch=0; ch < text.length; ch++) {
var url = textObj.getLinkUrl(ch);
if (url != null) {
if (!inUrl) {
// We are now!
inUrl = true;
var curUrl = {};
curUrl.element = element;
curUrl.url = String( url ); // grab a copy
curUrl.startOffset = ch;
curUrl.endOffsetInclusive = ch; // a one-character link starts and ends here
}
else {
curUrl.endOffsetInclusive = ch;
}
}
else {
if (inUrl) {
// Not any more, we're not.
inUrl = false;
links.push(curUrl); // add to links
curUrl = {};
}
}
}
if (inUrl) {
// in case the link ends on the same char that the element does
links.push(curUrl);
}
}
else {
var numChildren = element.getNumChildren();
for (var i=0; i<numChildren; i++) {
links = links.concat(getAllLinks(element.getChild(i)));
}
}
return links;
}
findAndReplaceLinks
This utility builds on getAllLinks to do a find & replace function.
/**
* Replace all or part of UrlLinks in the document.
*
* @param {String} searchPattern the regex pattern to search for
* @param {String} replacement the text to use as replacement
*
* @returns {Number} number of Urls changed
*/
function findAndReplaceLinks(searchPattern,replacement) {
var links = getAllLinks();
var numChanged = 0;
for (var l=0; l<links.length; l++) {
var link = links[l];
if (link.url.match(searchPattern)) {
// This link needs to be changed
var newUrl = link.url.replace(searchPattern,replacement);
link.element.setLinkUrl(link.startOffset, link.endOffsetInclusive, newUrl);
numChanged++;
}
}
return numChanged;
}
Demo UI
To demonstrate the use of these utilities, here are a couple of UI extensions:
function onOpen() {
// Add a menu with some items, some separators, and a sub-menu.
DocumentApp.getUi().createMenu('Utils')
.addItem('List Links', 'sidebarLinks')
.addItem('Replace Link Text', 'searchReplaceLinks')
.addToUi();
}
function searchReplaceLinks() {
var ui = DocumentApp.getUi();
var app = UiApp.createApplication()
.setWidth(250)
.setHeight(100)
.setTitle('Change Url text');
var form = app.createFormPanel();
var flow = app.createFlowPanel();
flow.add(app.createLabel("Find: "));
flow.add(app.createTextBox().setName("searchPattern"));
flow.add(app.createLabel("Replace: "));
flow.add(app.createTextBox().setName("replacement"));
var handler = app.createServerHandler('myClickHandler');
flow.add(app.createSubmitButton("Submit").addClickHandler(handler));
form.add(flow);
app.add(form);
ui.showDialog(app);
}
// ClickHandler to close dialog
function myClickHandler(e) {
var app = UiApp.getActiveApplication();
app.close();
return app;
}
function doPost(e) {
var numChanged = findAndReplaceLinks(e.parameter.searchPattern,e.parameter.replacement);
var ui = DocumentApp.getUi();
var app = UiApp.createApplication();
sidebarLinks(); // Update list
var result = DocumentApp.getUi().alert(
'Results',
"Changed "+numChanged+" urls.",
DocumentApp.getUi().ButtonSet.OK);
}
/**
* Shows a custom HTML user interface in a sidebar in the Google Docs editor.
*/
function sidebarLinks() {
var links = getAllLinks();
var sidebar = HtmlService
.createHtmlOutput()
.setTitle('URL Links')
.setWidth(350 /* pixels */);
// Display list of links, url only.
for (var l=0; l<links.length; l++) {
var link = links[l];
sidebar.append('<p>'+link.url+'</p>');
}
DocumentApp.getUi().showSidebar(sidebar);
}
I offer another, shorter answer to your first question, about iterating through all links in a document's body. This code returns a flat array of links in the current document's body, where each link is represented by an object with entries pointing to the text element (text), the paragraph or list item element that contains it (paragraph), the offset in the text where the link starts (startOffset) and the URL itself (url). Hopefully, you'll find it easy to adapt it to your own needs.
It uses the getTextAttributeIndices() method rather than iterating over every character of the text, and is thus expected to perform much more quickly than previously written answers.
EDIT: Since originally posting this answer, I modified the function a couple of times. It now also (1) includes the endOffsetInclusive property for each link (note that it can be null for links that extend to the end of the text element - in this case one can use link.text.length-1 instead); (2) finds links in all sections of the document, not only the body, and (3) includes the section and isFirstPageSection properties to indicate where the link is located; (4) accepts the argument mergeAdjacent, which when set to true, will return only a single link entry for a continuous stretch of text linked to the same URL (which would be considered separate if, for instance, part of the text is styled differently than another part).
For the purpose of including links under all sections, a new utility function, iterateSections(), was introduced.
/**
* Returns a flat array of links which appear in the active document's body.
* Each link is represented by a simple Javascript object with the following
* keys:
* - "section": {ContainerElement} the document section in which the link is
* found.
* - "isFirstPageSection": {Boolean} whether the given section is a first-page
* header/footer section.
* - "paragraph": {ContainerElement} contains a reference to the Paragraph
* or ListItem element in which the link is found.
* - "text": the Text element in which the link is found.
* - "startOffset": {Number} the position (offset) at which the link text begins.
* - "endOffsetInclusive": the position of the last character of the link
* text, or null if the link extends to the end of the text element.
* - "url": the URL of the link.
*
* @param {boolean} mergeAdjacent Whether consecutive links which carry
* different attributes (for any reason) should be returned as a single
* entry.
*
* @returns {Array} the aforementioned flat array of links.
*/
function getAllLinks(mergeAdjacent) {
var links = [];
var doc = DocumentApp.getActiveDocument();
iterateSections(doc, function(section, sectionIndex, isFirstPageSection) {
if (!("getParagraphs" in section)) {
// as we're using some undocumented API, adding this to avoid cryptic
// messages upon possible API changes.
throw new Error("An API change has caused this script to stop " +
"working.\n" +
"Section #" + sectionIndex + " of type " +
section.getType() + " has no .getParagraphs() method. " +
"Stopping script.");
}
section.getParagraphs().forEach(function(par) {
// skip empty paragraphs
if (par.getNumChildren() == 0) {
return;
}
// go over all text elements in paragraph / list-item
for (var el=par.getChild(0); el!=null; el=el.getNextSibling()) {
if (el.getType() != DocumentApp.ElementType.TEXT) {
continue;
}
// go over all styling segments in text element
var attributeIndices = el.getTextAttributeIndices();
var lastLink = null;
attributeIndices.forEach(function(startOffset, i, attributeIndices) {
var url = el.getLinkUrl(startOffset);
if (url != null) {
// we hit a link
var endOffsetInclusive = (i+1 < attributeIndices.length?
attributeIndices[i+1]-1 : null);
// check if this and the last found link are continuous
if (mergeAdjacent && lastLink != null && lastLink.url == url &&
lastLink.endOffsetInclusive == startOffset - 1) {
// this and the previous style segment are continuous
lastLink.endOffsetInclusive = endOffsetInclusive;
return;
}
lastLink = {
"section": section,
"isFirstPageSection": isFirstPageSection,
"paragraph": par,
"text": el,
"startOffset": startOffset,
"endOffsetInclusive": endOffsetInclusive,
"url": url
};
links.push(lastLink);
}
});
}
});
});
return links;
}
/**
* Calls the given function for each section of the document (body, header,
* etc.). Sections are children of the DocumentElement object.
*
* @param {Document} doc The Document object (such as the one obtained via
* a call to DocumentApp.getActiveDocument()) with the sections to iterate
* over.
* @param {Function} func A callback function which will be called, for each
* section, with the following arguments (in order):
* - {ContainerElement} section - the section element
* - {Number} sectionIndex - the child index of the section, such that
* doc.getBody().getParent().getChild(sectionIndex) == section.
* - {Boolean} isFirstPageSection - whether the section is a first-page
* header/footer section.
*/
function iterateSections(doc, func) {
// get the DocumentElement interface to iterate over all sections
// this bit is undocumented API
var docEl = doc.getBody().getParent();
var regularHeaderSectionIndex = (doc.getHeader() == null? -1 :
docEl.getChildIndex(doc.getHeader()));
var regularFooterSectionIndex = (doc.getFooter() == null? -1 :
docEl.getChildIndex(doc.getFooter()));
for (var i=0; i<docEl.getNumChildren(); ++i) {
var section = docEl.getChild(i);
var sectionType = section.getType();
var uniqueSectionName;
var isFirstPageSection = (
i != regularHeaderSectionIndex &&
i != regularFooterSectionIndex &&
(sectionType == DocumentApp.ElementType.HEADER_SECTION ||
sectionType == DocumentApp.ElementType.FOOTER_SECTION));
func(section, i, isFirstPageSection);
}
}
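The run-based idea can be tried without a document. In this sketch, linksFromRuns is a made-up name, indices stands in for what getTextAttributeIndices() returns, and getLinkUrl is any per-offset lookup that returns null outside links. Unlike the code above, it always merges adjacent runs and uses the text length to close a link that reaches the end of the element, instead of leaving the end as null.

```javascript
// Builds link entries from style-run start offsets instead of single characters.
// `indices` plays the role of getTextAttributeIndices(); `getLinkUrl` is a
// per-offset lookup like Text.getLinkUrl(offset). Both are stand-ins here.
function linksFromRuns(indices, textLength, getLinkUrl) {
  var links = [];
  indices.forEach(function (startOffset, i) {
    var url = getLinkUrl(startOffset);
    if (url === null) return; // this run is not linked
    // A run ends one character before the next run starts, or at the end of the text.
    var endOffsetInclusive = (i + 1 < indices.length ? indices[i + 1] : textLength) - 1;
    var last = links[links.length - 1];
    if (last && last.url === url && last.endOffsetInclusive === startOffset - 1) {
      // Same URL, directly adjacent: extend the previous entry.
      last.endOffsetInclusive = endOffsetInclusive;
    } else {
      links.push({ startOffset: startOffset, endOffsetInclusive: endOffsetInclusive, url: url });
    }
  });
  return links;
}
```

The loop visits one offset per style run, so a long paragraph with two or three links costs a handful of getLinkUrl calls instead of one per character.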
I was playing around and incorporated @Mogsdad's answer -- here's the really complicated version:
var _ = Underscorejs.load(); // loaded via http://googleappsdeveloper.blogspot.com/2012/11/using-open-source-libraries-in-apps.html, rolled my own
var ui = DocumentApp.getUi();
// #region --------------------- Utilities -----------------------------
var gDocsHelper = (function(P, un) {
// heavily based on answer https://stackoverflow.com/a/18731628/1037948
var updatedLinkText = function(link, offset) {
return function() { return 'Text: ' + link.getText().substring(offset,100) + ((link.getText().length-offset) > 100 ? '...' : ''); }
}
P.updateLink = function updateLink(link, oldText, newText, start, end) {
var oldLink = link.getLinkUrl(start);
if(0 > oldLink.indexOf(oldText)) return false;
var newLink = oldLink.replace(new RegExp(oldText, 'g'), newText);
link.setLinkUrl(start || 0, (end || oldLink.length), newLink);
log(true, "Updating Link: ", oldLink, newLink, start, end, updatedLinkText(link, start) );
return { old: oldLink, "new": newLink, getText: updatedLinkText(link, start) };
};
// moving this reused block out to 'private' fn
var updateLinkResult = function(text, oldText, newText, link, urls, sidebar, updateResult) {
// and may as well update the link while we're here
if(false !== (updateResult = P.updateLink(text, oldText, newText, link.start, link.end))) {
sidebar.append('<li>' + updateResult['old'] + ' → ' + updateResult['new'] + ' at ' + updateResult['getText']() + '</li>');
}
urls.push(link.url); // so multiple links get added to list
};
P.updateLinksMenu = function() {
// https://developers.google.com/apps-script/reference/base/prompt-response
var oldText = ui.prompt('Old link text to replace').getResponseText();
var newText = ui.prompt('New link text to replace with').getResponseText();
log('Replacing: ' + oldText + ', ' + newText);
var sidebar = gDocUiHelper.createSidebar('Update All Links', '<h3>Replacing</h3><p><code>' + oldText + '</code> → <code>' + newText + '</code></p><hr /><ol>');
// current doc available to script
var doc = DocumentApp.getActiveDocument().getBody();//.getActiveSection();
// Search until a link is found
var links = P.findAllElementsFor(doc, function(text) {
var i = -1, n = text.getText().length, link = false, url, urls = [], updateResult;
// note: the following only gets the FIRST link in the text -- while(i < n && !(url = text.getLinkUrl(i++)));
// scan the text element for links
while(++i < n) {
// getLinkUrl will continue to get a link while INSIDE the stupid link, so only do this once
if(url = text.getLinkUrl(i)) {
if(false === link) {
link = { start: i, end: -1, url: url };
// log(true, 'Type: ' + text.getType(), 'Link: ' + url, function() { return 'Text: ' + text.getText().substring(i,100) + ((n-i) > 100 ? '...' : '')});
}
else {
link.end = i; // keep updating the end position until we leave
}
}
// just left the link -- reset link tracking
else if(false !== link) {
// and may as well update the link while we're here
updateLinkResult(text, oldText, newText, link, urls, sidebar);
link = false; // reset "counter"
}
}
// once we've reached the end of the text, must also check to see if the last thing we found was a link
if(false !== link) updateLinkResult(text, oldText, newText, link, urls, sidebar);
return urls;
});
sidebar.append('</ol><p><strong>' + links.length + ' links reviewed</strong></p>');
gDocUiHelper.attachSidebar(sidebar);
log(links);
};
P.findAllElementsFor = function(el, test) {
// generic utility function to recursively find all elements; heavily based on https://stackoverflow.com/a/18731628/1037948
var results = [], searchResult = null, i, result;
// https://developers.google.com/apps-script/reference/document/body#findElement(ElementType)
while (searchResult = el.findElement(DocumentApp.ElementType.TEXT, searchResult)) {
var t = searchResult.getElement().editAsText(); // .asParagraph()
// check to add to list
if(test && (result = test(t))) {
if( _.isArray(result) ) results = results.concat(result); // could be big? http://jsperf.com/self-concatenation/
else results.push(result);
}
}
// recurse children if not plain text item
if(el.getType() !== DocumentApp.ElementType.TEXT) {
i = el.getNumChildren();
var result;
while(--i > 0) {
result = P.findAllElementsFor(el.getChild(i));
if(result && result.length > 0) results = results.concat(result);
}
}
return results;
};
return P;
})({});
// really? it can't handle object properties?
function gDocsUpdateLinksMenu() {
gDocsHelper.updateLinksMenu();
}
gDocUiHelper.addMenu('Zaus', [ ['Update links', 'gDocsUpdateLinksMenu'] ]);
// #endregion --------------------- Utilities -----------------------------
And I'm including the "extra" utility classes for creating menus, sidebars, etc below for completeness:
var log = function() {
// return false;
var args = Array.prototype.slice.call(arguments);
// allowing functions delegates execution so we can save some non-debug cycles if code left in?
if(args[0] === true) Logger.log(_.map(args, function(v) { return _.isFunction(v) ? v() : v; }).join('; '));
else
_.each(args, function(v) {
Logger.log(_.isFunction(v) ? v() : v);
});
}
// #region --------------------- Menu -----------------------------
var gDocUiHelper = (function(P, un) {
P.addMenuToSheet = function addMenu(spreadsheet, title, items) {
var menu = ui.createMenu(title);
// make sure menu items are correct format
_.each(items, function(v,k) {
var err = [];
// provided in format [ [name, fn],... ] instead
if( _.isArray(v) ) {
if ( v.length === 2 ) {
menu.addItem(v[0], v[1]);
}
else {
err.push('Menu item ' + k + ' missing name or function: ' + v.join(';'))
}
}
else {
if( !v.name ) err.push('Menu item ' + k + ' lacks name');
if( !v.functionName ) err.push('Menu item ' + k + ' lacks function');
if(!err.length) menu.addItem(v.name, v.functionName);
}
if(err.length) {
log(err);
ui.alert(err.join('; '));
}
});
menu.addToUi();
};
// list of things to hook into
var initializers = {};
P.addMenu = function(menuTitle, menuItems) {
if(initializers[menuTitle] === un) {
initializers[menuTitle] = [];
}
initializers[menuTitle] = initializers[menuTitle].concat(menuItems);
};
P.createSidebar = function(title, content, options) {
var sidebar = HtmlService
.createHtmlOutput()
.setTitle(title)
.setWidth( (options && options.width) ? options.width : 350 /* pixels */);
sidebar.append(content);
if(options && options.on) DocumentApp.getUi().showSidebar(sidebar);
// else { sidebar.attach = function() { DocumentApp.getUi().showSidebar(this); }; } // should really attach to prototype...
return sidebar;
};
P.attachSidebar = function(sidebar) {
DocumentApp.getUi().showSidebar(sidebar);
};
P.onOpen = function() {
var spreadsheet = SpreadsheetApp.getActive();
log(initializers);
_.each(initializers, function(v,k) {
P.addMenuToSheet(spreadsheet, k, v);
});
};
return P;
})({});
// #endregion --------------------- Menu -----------------------------
/**
* A special function that runs when the spreadsheet is open, used to add a
* custom menu to the spreadsheet.
*/
function onOpen() {
gDocUiHelper.onOpen();
}
I had some trouble getting Mogsdad's solution to work. Specifically, it misses links that run to the end of their parent element, because no trailing non-link character follows to terminate them. I've implemented a version that addresses this and returns a standard Range. Sharing it here in case someone finds it useful.
function getAllLinks(element) {
var rangeBuilder = DocumentApp.getActiveDocument().newRange();
// Parse the text iteratively to find the start and end indices for each link
if (element.getType() === DocumentApp.ElementType.TEXT) {
var links = [];
var string = element.getText();
var previousUrl = null; // The URL of the previous character
var currentLink = null; // The latest link being built
for (var charIndex = 0; charIndex < string.length; charIndex++) {
var currentUrl = element.getLinkUrl(charIndex);
// New URL means create a new link
if (currentUrl !== null && previousUrl !== currentUrl) {
if (currentLink !== null) links.push(currentLink);
currentLink = {};
currentLink.url = String(currentUrl);
currentLink.startOffset = charIndex;
}
// In a URL means extend the end of the current link
if (currentUrl !== null) {
currentLink.endOffsetInclusive = charIndex;
}
// Not in a URL means close and push the link if ready
if (currentUrl === null) {
if (currentLink !== null) links.push(currentLink);
currentLink = null;
}
// End the loop and go again
previousUrl = currentUrl;
}
// Handle the end case when final character is a link
if (currentLink !== null) links.push(currentLink);
// Convert the links into a range before returning
links.forEach(function(link) {
rangeBuilder.addElement(element, link.startOffset, link.endOffsetInclusive);
});
}
// If not a text element then recursively get links from child elements
else if (element.getNumChildren) {
for (var i = 0; i < element.getNumChildren(); i++) {
rangeBuilder.addRange(getAllLinks(element.getChild(i)));
}
}
return rangeBuilder.build();
}
You are right: search and replace is not applicable here.
Use setLinkUrl(): https://developers.google.com/apps-script/reference/document/container-element#setLinkUrl(String)
Basically, you have to iterate through the elements recursively (elements can contain elements), and for each one:
use getLinkUrl() to get the old URL;
if it is not null, call setLinkUrl(newUrl), which leaves the displayed text unchanged.
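The recursive walk described above could be sketched roughly as follows. This is only a sketch under assumptions: the function name replaceLinkUrls and the rewriteUrl callback are hypothetical, and Text elements are scanned one character at a time because a single Text element can contain several links with different URLs.

```javascript
/**
 * Recursively rewrites link URLs under `element`, leaving the displayed text unchanged.
 * `rewriteUrl` is a hypothetical callback (oldUrl -> newUrl), not part of the Apps Script API.
 * Returns the number of link runs that were changed.
 */
function replaceLinkUrls(element, rewriteUrl) {
  var changed = 0;
  if (element.getType() === DocumentApp.ElementType.TEXT) {
    var length = element.getText().length;
    var runStart = 0;
    // Treat each stretch of characters sharing one URL (or no URL) as a run.
    for (var i = 1; i <= length; i++) {
      var runUrl = element.getLinkUrl(runStart);
      var atEnd = i === length;
      if (atEnd || element.getLinkUrl(i) !== runUrl) {
        if (runUrl !== null) {
          var newUrl = rewriteUrl(runUrl);
          if (newUrl !== runUrl) {
            // Offsets are inclusive, so the run ends at i - 1.
            element.setLinkUrl(runStart, i - 1, newUrl);
            changed++;
          }
        }
        runStart = i;
      }
    }
  } else if (typeof element.getNumChildren === 'function') {
    // Container elements: descend into each child.
    for (var c = 0; c < element.getNumChildren(); c++) {
      changed += replaceLinkUrls(element.getChild(c), rewriteUrl);
    }
  }
  return changed;
}
```

It would typically be called on the document body, for example replaceLinkUrls(DocumentApp.getActiveDocument().getBody(), function (url) { return url.replace('http://', 'https://'); }).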
This Excel macro lists the links from a Word doc. You'd need to copy your data into a Word doc first.
Sub getLinks()
Dim wApp As Word.Application, wDoc As Word.Document
Dim i As Integer, r As Range
Const filePath = "C:\test\test.docx"
Set wApp = CreateObject("Word.Application")
'wApp.Visible = True
Set wDoc = wApp.Documents.Open(filePath)
Set r = Range("A1")
For i = 1 To wDoc.Hyperlinks.Count
r = wDoc.Hyperlinks(i).Address
Set r = r.Offset(1, 0)
Next i
wApp.Quit
Set wDoc = Nothing
Set wApp = Nothing
End Sub
Here's a quick and dirty way to accomplish the same goal with no scripting:
From Google Docs, save the document in RTF format.
In your editor of choice, edit the links in the RTF file (in my case, I wanted to modify all the hyperlinks, so I used Emacs and regexp-replace). Save the file when you're done.
Create a fresh, new Google Doc, and from the menu, select File>Open and open the RTF file. Docs will convert your edited RTF file back into a proper Google Doc, restoring all formatting.
Google Docs' RTF format is pretty complete--I haven't noticed any loss of fidelity in making the round trip, and it has the advantage of fully exposing all the hyperlinks, formatting, and everything else about the document in a form that's easy to edit and to apply regex tools to.

How can I use a custom function with FILTER?

I have a custom function defined that extracts part of an address from a string:
/*
* Return the number preceding 'N' in an address
* '445 N 400 E' => '445'
* '1083 E 500 N' => '500'
*/
function NorthAddress(address) {
if (!address) return null;
else {
var North = new RegExp('([0-9]+)[\\s]+N');
var match = address.match(North);
if (match && match.length >= 2) {
return match[1];
}
return null;
}
}
I want to use this function as one of the conditions in a call to FILTER(...) in the spreadsheet where I have these addresses stored:
=FILTER('Sheet 1'!A:A, NorthAddress('Sheet 1'!B:B) >= 450)
But when I call NorthAddress like this, it gets an array of all the values in column B and I can't for the life of me find any documentation as to how I need to handle that. The most obvious way (to me) doesn't seem to work: iterate over the array calling NorthAddress on each value, and return an array of the results.
What does my function need to return for FILTER to work as expected?
When a custom function is called with a multi-cell range, it receives a matrix of values (a 2D array). It doesn't matter whether the range is a single column or a single row; it's always a matrix. And you should return a matrix as well.
Anyway, I would not use a custom function for this, as there are already native spreadsheet formulas: RegexMatch, RegexExtract and RegexReplace. To get the "if match" behavior, just wrap them in an IfError formula.
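As a minimal sketch of the matrix-in, matrix-out rule described above (the names mapCells and NORTH_NUMBER are hypothetical, and returning null for cells without a match is an assumption, not something the spreadsheet requires):

```javascript
/**
 * Applies `fn` to a single value or to every cell of a 2-D array,
 * so the result always has the same shape as the input.
 * `mapCells` is a hypothetical helper, not part of Apps Script.
 */
function mapCells(input, fn) {
  if (!Array.isArray(input)) return fn(input);
  return input.map(function (row) { return row.map(fn); });
}

/**
 * Hypothetical custom function: returns the number preceding 'N' in an address,
 * or null when there is no match.
 */
function NORTH_NUMBER(input) {
  return mapCells(input, function (cell) {
    // String(...) guards against cells that hold numbers instead of text.
    var match = String(cell).match(/([0-9]+)\s+N/);
    return match ? Number(match[1]) : null;
  });
}
```

With a single cell the function returns a plain number; with a range it returns one row per input row, which is the shape a range-based condition needs to line up with the filtered column.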
It doesn't work because address is a string only if you pass a single cell as the argument; if you pass a range, it is a matrix of strings.
Also, your function returns a string, whereas FILTER uses a boolean array to filter data, so the condition of your filter ends up comparing a string with a number.
You just have to convert the string to a number when you return the value:
/*
* Return the number preceding 'N' in an address
* '445 N 400 E' => '445'
* '1083 E 500 N' => '500'
*/
function NorthAddress(address) {
if(typeof address == "string"){
if (!address) return "#N/A";
else {
var North = new RegExp('([0-9]+)[\\s]+N');
var match = address.match(North);
if (match && match.length >= 2) {
return parseInt(match[1]);
}
return "#N/A";
}
} else {
var matrix = new Array();
for(var i = 0; i<address.length; i++){
matrix[i] = new Array();
for(var j = 0; j<address[i].length; j++){
var North = new RegExp('([0-9]+)[\\s]+N');
var match = address[i][j].match(North);
if (match && match.length >= 2) {
matrix[i].push(parseInt(match[1]));
}
}
}
return matrix;
}
}
Hope this will help.
I will add this as an answer because I found that the custom function returns an error if numerical values are passed in the referenced cell or range and toString() is not invoked:
function NorthAddress(address) {
if (!address) return null;
else {
if (address.constructor == Array) {
var result = address;
}
else {
var result = [[address]];
}
var north = new RegExp('([0-9]+)[\\s]+N');
var match;
for (var i = 0; i < result.length; i++) {
for (var j = 0; j < result[0].length; j++) {
match = result[i][j].toString().match(north);
if (match && match.length >= 2) {
result[i][j] = parseInt(match[1]);
}
else {
result[i][j] = null;
}
}
}
return result;
}
}