Google Docs API - complete documentation (hyperlink issue)

I hope everyone is in good health. This post is a continuation of my previous post.
My main goal
My main goal was to find each hyperlink and change the text linked to it. I initially used the code from this post and modified it to change the text of the first hyperlink. Here is my modified code:
function onOpen() {
const ui = DocumentApp.getUi();
ui.createMenu('What to do?')
.addItem('HyperLink Modifier', 'findAndReplacetext')
.addToUi();
}
/**
* Get an array of all LinkUrls in the document. The function is
* recursive, and if no element is provided, it will default to
* the active document's Body element.
*
* @param element The document element to operate on.
* .
* @returns {Array} Array of objects, viz.
* {element,
* startOffset,
* endOffsetInclusive,
* url}
*/
function getAllLinks(element) {
var links = [];
element = element || DocumentApp.getActiveDocument().getBody();
if (element.getType() === DocumentApp.ElementType.TEXT) {
var textObj = element.editAsText();
var text = element.getText();
var inUrl = false;
for (var ch=0; ch < text.length; ch++) {
var url = textObj.getLinkUrl(ch);
if (url != null) {
if (!inUrl) {
// We are now!
inUrl = true;
var curUrl = {};
curUrl.element = element;
curUrl.url = String( url ); // grab a copy
curUrl.startOffset = ch;
}
else {
curUrl.endOffsetInclusive = ch;
}
}
else {
if (inUrl) {
// Not any more, we're not.
inUrl = false;
links.push(curUrl); // add to links
curUrl = {};
}
}
}
if (inUrl) {
// in case the link ends on the same char that the element does
links.push(curUrl);
}
}
else {
var numChildren = element.getNumChildren();
for (var i=0; i<numChildren; i++) {
links = links.concat(getAllLinks(element.getChild(i)));
}
}
return links;
}
/**
* Rewrite the text of every link in the document as "(text)[url]".
* The links are collected again after each change, because replacing
* the text can shift the offsets of the links that follow it.
*/
function findAndReplacetext() {
var links = getAllLinks();
while(links.length > 0){
var link = links[0];
var paragraph = link.element.getText();
var linkText = paragraph.substring(link.startOffset, link.endOffsetInclusive+1);
var newlinkText = `(${linkText})[${link.url}]`
link.element.deleteText(link.startOffset, link.endOffsetInclusive);
link.element.insertText(link.startOffset, newlinkText);
links = getAllLinks();
}
}
String.prototype.betterReplace = function(search, replace, position) {
if (this.length > position) {
return this.slice(0, position) + this.slice(position).replace(search, replace);
}
return this;
}
Note: I used the insertText and deleteText functions to update the text of the hyperlink.
My problem with the above code
The problem was that this code ran too slowly. I thought it might be because the script re-scanned the document every time it needed the next hyperlink, so maybe I could break out of the loop and fetch only the first hyperlink each time. In my previous post, someone gave me a solution that does exactly that, but when I tried the new code it was unfortunately still slow. In that post he also proposed a new method using the Google Docs API; I tried it and it was super fast. Here is the code using the Google Docs API:
function myFunction() {
const doc = DocumentApp.getActiveDocument();
const res = Docs.Documents.get(doc.getId()).body.content.reduce((ar, {paragraph}) => {
if (paragraph && paragraph.elements) {
paragraph.elements.forEach(({textRun}) => {
if (textRun && textRun.textStyle && textRun.textStyle.link) {
ar.push({text: textRun.content, url: textRun.textStyle.link.url});
}
});
}
return ar;
}, []);
console.log(res) // You can retrieve 1st link and test by console.log(res[0]).
}
My new problem
I liked the new code, but I am stuck again: I cannot find out how to change the text associated with the hyperlink. I tried the functions setContent and setUrl, but they don't seem to work. I also cannot find documentation for these functions in the main documentation of this API. I did find a reference for the previously mentioned functions here, but they are not available for Apps Script. Here is the sample document I am working on:
https://docs.google.com/document/d/1eRvnR2NCdsO94C5nqly4nRXCttNziGhwgR99jElcJ_I/edit?usp=sharing
End note:
I hope I was able to completely convey my message and all the details associated with it. If not, kindly don't be mad at me; I am still learning and my English skills are pretty weak. If you need any other data, let me know in the comments. Thanks for giving your time, I really appreciate it.

To remove all the hyperlinks from your document, you can do the following:
First, retrieve the start and end indexes of these hyperlinks. This can be done by calling documents.get, iterating through all elements in the body content, checking which ones are paragraphs, iterating through the corresponding TextRuns, and checking which TextRuns contain a TextStyle with a link property. All this is already done in the code you provided in your question.
Next, for all TextRuns that include a link, retrieve their startIndex and endIndex.
Using these retrieved indexes, call batchUpdate to make an UpdateTextStyleRequest. You want to remove the link property between each pair of indexes. To do that, set fields to link (to specify which property you want to update) and leave the link property unset in the textStyle you provide in the request since, as the docs for TextStyle say:
link: If unset, there is no link.
Code sample:
function removeHyperlinks() {
const doc = DocumentApp.getActiveDocument();
const hyperlinkIndexes = Docs.Documents.get(doc.getId()).body.content.reduce((ar, {paragraph}) => {
if (paragraph && paragraph.elements) {
paragraph.elements.forEach(element => {
const textRun = element.textRun;
if (textRun && textRun.textStyle && textRun.textStyle.link) {
ar.push({startIndex: element.startIndex, endIndex: element.endIndex });
}
});
}
return ar;
}, []);
hyperlinkIndexes.forEach(hyperlinkIndex => {
const resourceUpdateStyle = {
requests: [
{
updateTextStyle: {
textStyle: {},
fields: "link",
range: {
startIndex: hyperlinkIndex.startIndex,
endIndex: hyperlinkIndex.endIndex
}
}
}
]
}
Docs.Documents.batchUpdate(resourceUpdateStyle, doc.getId());
});
}
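The same indexes can also be used for the change asked about in the question: replacing the text of each link. The following is a minimal sketch, not a solution tested against every document. It assumes the Docs advanced service is enabled; buildLinkTextRequests and rewriteLinkTexts are hypothetical names, and the "(text)[url]" format is taken from the question's code. Requests are ordered from the end of the document backwards so that the indexes of earlier links stay valid.

```javascript
// Pure helper: turns a documents.get response into batchUpdate requests that
// replace each linked text run with "(text)[url]". It touches no Apps Script
// service, so it can be exercised with a plain object.
function buildLinkTextRequests(document) {
  const runs = [];
  (document.body.content || []).forEach(({ paragraph }) => {
    if (!paragraph || !paragraph.elements) return;
    paragraph.elements.forEach((element) => {
      const textRun = element.textRun;
      const link = textRun && textRun.textStyle && textRun.textStyle.link;
      if (!link || !link.url) return; // skip plain text and bookmark/heading links
      // Leave a trailing newline in place: it terminates the paragraph.
      const text = textRun.content.replace(/\n$/, '');
      if (text.length === 0) return;
      runs.push({
        startIndex: element.startIndex,
        endIndex: element.startIndex + text.length,
        text: text,
        url: link.url
      });
    });
  });
  // Last link first, so earlier indexes are unaffected by later edits.
  runs.sort((a, b) => b.startIndex - a.startIndex);
  const requests = [];
  runs.forEach((run) => {
    requests.push({
      deleteContentRange: { range: { startIndex: run.startIndex, endIndex: run.endIndex } }
    });
    requests.push({
      insertText: { location: { index: run.startIndex }, text: `(${run.text})[${run.url}]` }
    });
  });
  return requests;
}

// Apps Script entry point: one documents.get and one batchUpdate call.
function rewriteLinkTexts() {
  const docId = DocumentApp.getActiveDocument().getId();
  const requests = buildLinkTextRequests(Docs.Documents.get(docId));
  if (requests.length > 0) {
    Docs.Documents.batchUpdate({ requests: requests }, docId);
  }
}
```

Inserted text takes its style from the neighbouring text, so whether the new text is still a link depends on what surrounds it. If the link must be cleared explicitly, an updateTextStyle request like the one in the code sample above can be added for the new range.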

Related

Any time a section is mentioned in the document, I want that mention to become a link to the corresponding bookmark

Goal: I have a very long document with many unique sections that each have bookmarks. Any time a section is mentioned in the document, I want that mention to become a link to the corresponding bookmark. It doesn't have to be event-driven, I intend to do it from a menu.
I have the below code written to get a list of the names of each bookmarked line so I can match it to the words in the doc. I'm trying to figure out what line of code to use to link specific text to that bookmark. I've tried to use the setLinkUrl("beginningofurl" + id[i]) code, but the ID of the bookmarks doesn't tell me if it's a header or regular text, and sometimes it is just regular text. I'm wondering if there's a better way of doing this?
var DOC = DocumentApp.getActiveDocument();
function Setlink() {
var bookmarks = DOC.getBookmarks();
var names = [];
for (var i = 0; i < bookmarks.length; i++){
names.push(bookmarks[i].getPosition().getSurroundingText().getText());
}
Logger.log(names);
}

Headings are a property of Paragraph elements. To check a Bookmark to see if it is in a paragraph of a certain Paragraph Heading, we need to get the Position, then the Element, and then check if the Element is indeed a Paragraph before we can check the Paragraph Heading.
We can put our test for if an Element is a heading in a predicate function named isElementInHeading that will return true or false when given an Element.
function isElementInHeading(element) {
if (element.getType() !== DocumentApp.ElementType.PARAGRAPH) {
return false;
}
const {ParagraphHeading} = DocumentApp;
switch (element.getHeading()) {
case ParagraphHeading.HEADING1:
case ParagraphHeading.HEADING2:
case ParagraphHeading.HEADING3:
case ParagraphHeading.HEADING4:
case ParagraphHeading.HEADING5:
case ParagraphHeading.HEADING6:
return true;
}
return false;
}
This can be used to both filter the bookmarks to include only those that mark headings, and to skip over the same headings when using setLinkUrl.
The strategy in this example is to collect both the bookmark's ID and the desired text in one go using a reducer function, then search through the document for each bit of text, check that we didn't just find the header again, and then apply the link.
I am not quite sure how you are getting the URL, but I found just copying and pasting the URL into the script as const url = "https://docs.google.com/.../edit#bookmark="; worked for me.
// for Array.prototype.reduce
function getHeadingBookmarksInfo(bookmarks, bookmark) {
const element = bookmark.getPosition().getElement();
if (isElementInHeading(element)) {
return [
...bookmarks,
{ id: bookmark.getId(), text: element.getText() }
];
}
return bookmarks;
}
function updateLinks() {
const doc = DocumentApp.getActiveDocument();
const bookmarks = doc.getBookmarks();
const headingBookmarksInfo = bookmarks.reduce(getHeadingBookmarksInfo, []);
const body = doc.getBody();
headingBookmarksInfo.forEach(function(info) {
const {id, text} = info;
let foundRef = body.findText(text);
while (foundRef !== null) {
const element = foundRef.getElement();
if (!isElementInHeading(element.getParent())) {
element.asText()
.setLinkUrl(
foundRef.getStartOffset(),
foundRef.getEndOffsetInclusive(),
url + id // assumes url is hardcoded in global scope
);
}
foundRef = body.findText(text, foundRef);
}
});
}
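Two details of the strategy above can be sketched as small helpers. findText treats its argument as a regular expression, so heading text containing characters such as "." or "(" should be escaped before searching. The bookmark URL can also be assembled by a function rather than by string concatenation in the loop. Both escapeForFindText and buildBookmarkUrl are hypothetical names, and the "#bookmark=" suffix is taken from the URL pasted in the answer; treat this as an illustration rather than the answer's own method.

```javascript
// Escape characters that have a special meaning in a regular expression,
// because Body.findText() interprets its first argument as a pattern.
function escapeForFindText(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a bookmark link from the document's edit URL (copied from the
// browser, as in the answer). Any existing "#..." fragment is dropped.
function buildBookmarkUrl(editUrl, bookmarkId) {
  const base = editUrl.split('#')[0];
  return base + '#bookmark=' + bookmarkId;
}
```

In updateLinks, this would mean calling body.findText(escapeForFindText(text)) and body.findText(escapeForFindText(text), foundRef), and passing buildBookmarkUrl(url, id) to setLinkUrl.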

How to check for URL redirects in Google Sheets with Google Apps Script

I have been trying to run some URL redirect tests using Google Apps Script in Google Sheets. I have managed to get a response code and the final redirect URL for some of the links, but most of them are not working.
Examples of the links I would like to check:
https://www.airbnb.com/rooms/4606613
https://www.airbnb.com/rooms/4661522
https://www.airbnb.com/rooms/6014647
https://www.airbnb.com/rooms/14452305
https://www.airbnb.com/rooms/15910617
Pretty much I need to check if those links will redirect to https://www.airbnb.com/s/homes
Using the script below, I get the following list, which is not correct since all of them will redirect to https://www.airbnb.com/s/homes:
https://www.airbnb.com/rooms/4606613
https://www.airbnb.com/s/homes
https://www.airbnb.com/s/homes
https://www.airbnb.com/rooms/14452305
https://www.airbnb.com/rooms/15910617
It seems that the website takes about 1 second to perform the redirect, which could be the issue.
The code is below:
function urlProtocol(url){
return URI(url).protocol()
}
function urlHostname(url){
return URI(url).hostname()
}
function getRedirects(url) {
eval(UrlFetchApp.fetch('https://rawgit.com/medialize/URI.js/gh-pages/src/URI.js').getContentText());
var params = {
'followRedirects': false,
'muteHttpExceptions': true
};
var baseUrl = urlProtocol(url) + "://" + urlHostname(url),
response = UrlFetchApp.fetch(url, params),
responseCode = response.getResponseCode();
if(response.getHeaders()['Location']){
var redirectedUrl = getRedirects(baseUrl + response.getHeaders()['Location']);
return redirectedUrl;
} else {
return url;
}
}

It seems that the final redirect on some of the URLs happens after the page is loaded. Most likely there is a client-side script that changes window.location. Therefore your logic, although correct, fails to catch such pages.
To make matters worse, the after-load redirect seems to be inconsistent, as sometimes the pages you provided are not redirected to https://www.airbnb.com/s/homes. I was able to stop this redirect from happening, so the theory is confirmed; I will update with what exactly causes it.
Apart from that, there are several optimizations you can apply to your script:
Get rid of eval and, actually, of the whole library unless you really need it (see how to do the same in just two lines). Improved security is the main benefit: no eval() of external scripts means fewer possibilities for a breach.
Check for a status code in the 3xx range before looking at the Location header (as a precaution).
/**
*
* @param {string} target
*/
const getRedirects = (target) =>
/**
* @param {string} url
* @returns {boolean}
*/
(url) => {
if(url === target) {
return false;
}
const response = UrlFetchApp.fetch(url, {
'followRedirects': false,
'muteHttpExceptions': true
});
const code = response.getResponseCode();
let { Location } = response.getHeaders();
if (code < 300 || code >= 400) {
return true;
}
if (!Location) {
return false;
}
if (/^\/\w+/.test(Location)) {
const [protocol, , base] = url.split("/");
Location = `${protocol}//${base}${Location}`;
}
console.log(Location);
return getRedirects(target)(Location);
};
const testRedirects = () => {
const redirectsToHome = getRedirects("https://www.airbnb.com/s/homes");
const accessible = [
"https://www.airbnb.com/rooms/23861670",
"https://www.airbnb.com/rooms/4606613",
"https://www.airbnb.com/rooms/4661522",
"https://www.airbnb.com/rooms/6014647",
"https://www.airbnb.com/rooms/14452305",
"https://www.airbnb.com/rooms/15910617"
].filter(redirectsToHome);
console.log(accessible);
};
Since you clarified that the function is used as a custom function, you can add a wrapper function that serves as the public API, which you can reference in a cell and which calls the utility, something like this:
const checkIfRedirects = (source, target = "https://www.airbnb.com/s/homes") => getRedirects(target)(source);
You can then use it as you would a formula:
=checkIfRedirects(A20)
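The code above rebuilds the full URL only when the Location header starts with a slash followed by a word character. A slightly more general resolver could look like the sketch below. resolveLocation is a hypothetical helper, written with plain string handling on the assumption that the URL class is not available in the Apps Script runtime. It is a simplification: it does not normalise "." or ".." segments or handle query-only values.

```javascript
// Resolve the value of a Location header against the URL that was requested.
// Handles absolute URLs, protocol-relative URLs ("//host/path") and
// root-relative paths ("/path"); anything else is resolved against the
// directory of the requested URL.
function resolveLocation(requestUrl, location) {
  if (/^[a-z][a-z0-9+.-]*:/i.test(location)) {
    return location; // already absolute
  }
  const match = /^([a-z][a-z0-9+.-]*:)\/\/([^\/?#]+)([^?#]*)/i.exec(requestUrl);
  if (!match) {
    throw new Error('Not an absolute URL: ' + requestUrl);
  }
  const [, protocol, host, path] = match;
  if (location.startsWith('//')) {
    return protocol + location;
  }
  if (location.startsWith('/')) {
    return protocol + '//' + host + location;
  }
  // Relative path: keep everything up to the last slash of the request path.
  const dir = path.slice(0, path.lastIndexOf('/') + 1) || '/';
  return protocol + '//' + host + dir + location;
}
```

Inside getRedirects, the regex test and the url.split("/") lines could then be replaced with Location = resolveLocation(url, Location).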

How to make a closed search in Google Docs?

I have a document in which I need to find a word or piece of text. Each time I run a function, the selection has to move to the next match. At the end of the document it should wrap around to the top, just like the Find option in Notepad.
Is there a way to do it?
I know about findText(searchPattern, from) but I do not understand how to use it.

There are several wrappers and classes in the DocumentApp. They help to work with the contents of the file.
Class Range
Class RangeElement
Class RangeBuilder
It is necessary to understand carefully what each of them is responsible for. In your case the code below should work fine:
function myFunctionDoc() {
// sets the search pattern
var searchPattern = '29';
// works with current document
var document = DocumentApp.getActiveDocument();
// detects selection
var selection = document.getSelection();
if (!selection) {
if (!document.getCursor()) return;
selection = document.setSelection(document.newRange().addElement(document.getCursor().getElement()).build()).getSelection();
}
selection = selection.getRangeElements()[0];
// searches
var currentDocument = findNext(document, searchPattern, selection, function(rangeElement) {
// This is the callback body
var doc = this;
var rangeBuilder = doc.newRange();
if (rangeElement) {
rangeBuilder.addElement(rangeElement.getElement());
} else {
rangeBuilder.addElement(doc.getBody().asText(), 0, 0);
}
return doc.setSelection(rangeBuilder.build());
}.bind(document));
}
// the search engine is implemented on body.findText
function findNext(document, searchPattern, from, callback) {
var body = document.getBody();
var rangeElement = body.findText(searchPattern, from);
return callback(rangeElement);
}
It looks for the pattern. If body.findText returns null (no further match), the selection is set to the top of the document.
I have a gist about the subject https://gist.github.com/oshliaer/d468759b3587cfb424348fa722765187
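If the selection should jump straight to the first match when the end is reached, rather than to the top of the document, the wrap-around can be sketched as below. findNextCircular and selectNextMatch are hypothetical names, and the search pattern '29' is the one used in the code above. The helper only relies on findText(searchPattern) and findText(searchPattern, from), so it works with any object that offers that method.

```javascript
// Return the next match after `from`, wrapping to the first match in the
// document when the end is reached. Returns null if nothing matches at all.
function findNextCircular(body, searchPattern, from) {
  const next = from ? body.findText(searchPattern, from) : body.findText(searchPattern);
  if (next) return next;
  // No match after `from`: start again from the top.
  return from ? body.findText(searchPattern) : null;
}

// Apps Script entry point: select the next occurrence of the pattern.
function selectNextMatch() {
  const doc = DocumentApp.getActiveDocument();
  const selection = doc.getSelection();
  const from = selection ? selection.getRangeElements()[0] : null;
  const found = findNextCircular(doc.getBody(), '29', from);
  if (found) {
    const range = doc.newRange()
      .addElement(found.getElement().asText(), found.getStartOffset(), found.getEndOffsetInclusive())
      .build();
    doc.setSelection(range);
  }
}
```

Because each run selects the match it found, the next run starts from that selection, which gives the Notepad-like cycling described in the question.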

Is there a way to undo changes made by a google apps script?

So I wonder what it takes to make changes made by google apps script to a document reversible.
In particular I am working on a script that applies custom styles to selected elements from a document in Google Docs. It's not a hard thing to do. The problem is that the changes made by the script are not reflected in the history of the document and thus cannot be undone. There is no notion of a reversible editing session either as far as I can tell.
So is there a way to undo the changes made by a script?
function onOpen() {
DocumentApp.getUi()
.createMenu('Extras')
.addItem('Apply code style', 'applyCodeStyle')
.addToUi();
}
function applyCodeStyle() {
var selection = DocumentApp.getActiveDocument().getSelection();
if (selection) {
var elements = selection.getSelectedElements();
for (var i = 0; i < elements.length; i++) {
var element = elements[i];
// Only modify elements that can be edited as text; skip images and other non-text elements.
if (element.getElement().editAsText) {
var text = element.getElement().editAsText();
// Bold the selected part of the element, or the full element if it's completely selected.
if (element.isPartial()) {
text.setBold(element.getStartOffset(), element.getEndOffsetInclusive(), true);
} else {
text.setBold(true);
}
}
}
}
}

The closest I can imagine is to create a backup copy of your file in a specific folder every 5 minutes or so while you are modifying it, so you have at least a copy of this doc version. Not ideal, but it works...
Here is a piece of code that does it. Starting from your code, I just added the timer/copy stuff; you can try it by changing the folder ID.
EDIT: added a try/catch so the first execution runs without error.
function applyCodeStyle() {
var selection = DocumentApp.getActiveDocument().getSelection();
try{
// Minutes elapsed since the last backup (stored as a JSON date string).
// A missing property parses as null, i.e. the epoch, so the first run makes a backup.
var x = new Date().getTime()/60000-new Date(JSON.parse(PropertiesService.getScriptProperties().getProperty('lastBKP'))).getTime()/60000 ;
}catch(e){
PropertiesService.getScriptProperties().setProperty('lastBKP', JSON.stringify(new Date()));
var x = 0
}
Logger.log(x+' minutes')
if (selection) {
if(x > 5){
var docId = DocumentApp.getActiveDocument().getId();
DriveApp.getFileById(docId).makeCopy(DriveApp.getFolderById('0B3qSFd3iikE3NWd5TmRZdjdmMEk')).setName('backup_of_'+DocumentApp.getActiveDocument().getName()+'_on_'+Utilities.formatDate(new Date(),'GMT','yyyy-MMM-dd-HH-mm'));
Logger.log("file copied because the last backup was "+x+" minutes ago");
PropertiesService.getScriptProperties().setProperty('lastBKP', JSON.stringify(new Date()));
}
}
var elements = selection.getSelectedElements();
for (var i = 0; i < elements.length; i++) {
var element = elements[i];
if (element.getElement().editAsText) {
var text = element.getElement().editAsText();
if (element.isPartial()) {
text.setBold(element.getStartOffset(), element.getEndOffsetInclusive(), true);
} else {
text.setBold(true);
}
}
}
}
}
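The elapsed-time check can be separated from the styling code, which makes it easier to reason about. The sketch below is one possible arrangement, not the answer's own code: isBackupDue and backupIfDue are hypothetical names, the property key 'lastBKP_ms' is made up for this example, and the timestamp is stored as plain milliseconds rather than a JSON date. Unlike the code above, a missing timestamp is treated as "backup due".

```javascript
// Decide whether a new backup is due. Timestamps are milliseconds since the
// epoch; a missing or unparsable stored value counts as "never backed up".
function isBackupDue(lastBackupMs, nowMs, intervalMinutes) {
  if (lastBackupMs === null || lastBackupMs === undefined || lastBackupMs === '') {
    return true;
  }
  const last = Number(lastBackupMs);
  if (!Number.isFinite(last)) {
    return true;
  }
  return (nowMs - last) / 60000 > intervalMinutes;
}

// Apps Script side: copy the active document into the given folder when due.
// Returns true if a copy was made.
function backupIfDue(folderId) {
  const props = PropertiesService.getScriptProperties();
  const now = Date.now();
  if (!isBackupDue(props.getProperty('lastBKP_ms'), now, 5)) {
    return false;
  }
  const doc = DocumentApp.getActiveDocument();
  const stamp = Utilities.formatDate(new Date(now), 'GMT', 'yyyy-MMM-dd-HH-mm');
  DriveApp.getFileById(doc.getId())
    .makeCopy('backup_of_' + doc.getName() + '_on_' + stamp, DriveApp.getFolderById(folderId));
  props.setProperty('lastBKP_ms', String(now));
  return true;
}
```

applyCodeStyle could then call backupIfDue with the folder ID before it changes any formatting.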

How to use .findElement(DocumentApp.ElementType.TABLE_OF_CONTENTS) to get and parse a Document's Table of Contents Element

My goal is to parse a TableOfContents element in a Google Document and write it to another one. I want to do this for every document in a folder.
Having gone to the bother of converting each document to the type generated by DocsList just so I can use this method [which a document generated by DocumentApp does not have. Why, I don't understand, because otherwise the two 'documents' are similar when it comes to finding parts.], I find that what I get back is a SearchResult. How is this elusive construction used? I've tried converting it into a TableOfContents element [ele = searchResult.asTableOfContents()], which does not error out, but nothing I do allows me to parse through its child elements to recover their text. Interestingly enough, if you get a TableOfContents element by parsing through the document's paragraphs, THAT lets you parse the TOC.
Would someone speak to this question? I sure would appreciate a code snippet because I'm getting nowhere, and I have put some hours into this.

The asTableOfContents() method is only there to help the editor's autocomplete function. It has no run-time impact, and cannot be used to cast to a different type. (See ContainerElement documentation.)
To parse the table of contents, start by retrieving the element from the SearchResult. Below is an example that goes through the items in a document's table of contents to produce an array of item information.
Example Document
Parsing results
On a simple document with a few headings and a table of contents, here's what it produced:
[13-08-20 16:31:56:415 EDT]
[
{text=Heading 1.0, linkUrl=#heading=h.50tkhklducwk, indentFirstLine=18.0, indentStart=18.0},
{text=Heading 1.1, linkUrl=#heading=h.ugj69zpoikat, indentFirstLine=36.0, indentStart=36.0},
{text=Heading 1.2, linkUrl=#heading=h.xb0y0mu59rag, indentFirstLine=36.0, indentStart=36.0},
{text=Heading 2.0, linkUrl=#heading=h.gebx44eft4kq, indentFirstLine=18.0, indentStart=18.0}
]
Code
function test_parseTOC() {
var fileId = '--Doc-ID--';
Logger.log( parseTOC( fileId ) );
}
function parseTOC( docId ) {
var contents = [];
var doc = DocumentApp.openById(docId);
// Define the search parameters.
var searchElement = doc.getBody();
var searchType = DocumentApp.ElementType.TABLE_OF_CONTENTS;
// Search for TOC. Assume there's only one.
var searchResult = searchElement.findElement(searchType);
if (searchResult) {
// TOC was found
var toc = searchResult.getElement().asTableOfContents();
// Parse all entries in TOC. The TOC contains child Paragraph elements,
// and each of those has a child Text element. The attributes of both
// the Paragraph and Text combine to make the TOC item functional.
var numChildren = toc.getNumChildren();
for (var i=0; i < numChildren; i++) {
var itemInfo = {}
var tocItem = toc.getChild(i).asParagraph();
var tocItemAttrs = tocItem.getAttributes();
var tocItemText = tocItem.getChild(0).asText();
// Set itemInfo attributes for this TOC item, first from Paragraph
itemInfo.text = tocItem.getText(); // Displayed text
itemInfo.indentStart = tocItem.getIndentStart(); // TOC Indentation
itemInfo.indentFirstLine = tocItem.getIndentFirstLine();
// ... then from child Text
itemInfo.linkUrl = tocItemText.getLinkUrl(); // URL Link in document
contents.push(itemInfo);
}
}
// Return array of objects containing TOC info
return contents;
}
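The array returned by parseTOC carries no explicit heading level, but the indentation can stand in for it. The sketch below derives a level from indentStart and filters by depth. addTocLevels and filterTocByLevel are hypothetical helpers, and the 18-point step per level is an assumption read off the sample output above; other documents may use different indents.

```javascript
// Derive a heading level for each TOC item from its indentStart value.
// Assumption: every level is indented by a fixed step (18 points in the
// sample output) and the smallest indent in the list is level 1.
function addTocLevels(items, step) {
  const indentStep = step || 18;
  const indents = items.map((item) => Number(item.indentStart) || 0);
  const base = indents.length > 0 ? Math.min(...indents) : 0;
  return items.map((item, i) =>
    Object.assign({}, item, { level: Math.round((indents[i] - base) / indentStep) + 1 }));
}

// Keep only the items down to the requested depth (e.g. 2 keeps levels 1 and 2).
function filterTocByLevel(items, maxLevel, step) {
  return addTocLevels(items, step).filter((item) => item.level <= maxLevel);
}
```

For example, filterTocByLevel(parseTOC(fileId), 1) would keep only the top-level entries of the table of contents.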
Bad news
The bad news is that you are limited in what you can do to a table of contents from a script. You cannot insert a TOC or add new items to an existing one.
See Issue 2502 in the issue tracker, and star it for updates.
If you can post code or explain your issue with DocsList vs DocumentApp, it could be looked at. The elements of a Google Document can only be manipulated via DocumentApp.

I modified the above code to re-create the TOC in a table with only the desired levels (i.e. h1, h2). The only caveat is that the TOC must be present and updated before running this.
function findToc(body, level = 2) {
const indent = 18;
let contents = [];
const tocType = DocumentApp.ElementType.TABLE_OF_CONTENTS;
const tocContainer = body.findElement(tocType);
if (tocContainer) {
// TOC was found
const toc = tocContainer.getElement().asTableOfContents();
const totalLines = toc.getNumChildren();
for (let lineIndex = 0; lineIndex < totalLines; lineIndex++) {
const tocItem = toc.getChild(lineIndex).asParagraph();
const { INDENT_START } = tocItem.getAttributes();
const isDesiredLevel = Number(INDENT_START) <= indent * (level - 1);
if (isDesiredLevel) {
contents.push(tocItem.copy());
}
}
}
return contents;
}
// Append a one-cell table to the body and fill it with the copied TOC paragraphs.
function addToTable(body, cellText) {
const table = body.appendTable();
const tr = table.insertTableRow(0);
const td = tr.insertTableCell(0);
cellText.forEach(text => {
td.appendParagraph(text);
})
}
function parseTOC(docId) {
// Open the document once and pass its body to both helpers.
const body = DocumentApp.openById(docId).getBody();
const contents = findToc(body);
addToTable(body, contents);
}
