Google App Script replaceText to replace only first occurrence of matched string

I would like to use Google Apps Script to replace text in my Google Doc before converting it to PDF. The problem is that replaceText(textToReplace, newText) replaces every occurrence of the matched text. I want to replace only the first occurrence. How can I do that?

The replaceText method can be limited in scope to an element, by calling it on that element. But that does not help if the first paragraph where the text is found contains multiple instances of it: they are all going to be replaced.
Instead, use findText to find the first match, and then call deleteText and insertText to perform the replacement.
// replaces the first occurrence of old
function replaceFirst(old, replacement) {
  var body = DocumentApp.getActiveDocument().getBody();
  var found = body.findText(old);
  if (found) {
    var start = found.getStartOffset();
    var end = found.getEndOffsetInclusive();
    var text = found.getElement().asText();
    text.deleteText(start, end);
    text.insertText(start, replacement);
  }
}
If you think this ought to be easier, you are not alone.
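The same idea can be made reusable by passing in the element to search instead of always using the active document's body. This is only a minimal sketch: the name replaceFirstIn and the boolean return value are my own assumptions, not part of the Apps Script API. Keep in mind that findText interprets its pattern as a regular expression.

```javascript
// Hypothetical helper: replaces only the first match of `pattern` inside
// `container` (any element that supports findText(), e.g. a Body or Paragraph).
// Returns true if a replacement was made, false if nothing matched.
function replaceFirstIn(container, pattern, replacement) {
  var found = container.findText(pattern); // pattern is treated as a regular expression
  if (!found) {
    return false;
  }
  var start = found.getStartOffset();
  var end = found.getEndOffsetInclusive();
  var text = found.getElement().asText();
  text.deleteText(start, end);         // remove the matched characters
  text.insertText(start, replacement); // write the new text at the same offset
  return true;
}
```

It would be called as replaceFirstIn(DocumentApp.getActiveDocument().getBody(), 'old', 'new'). Taking the container as a parameter also makes it possible to limit the search to a single paragraph or table cell.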

Related

Find and format all words that start with a set string in Google Docs

I made a function in Google Apps Script that searches for all words in a Google Docs and changes their colors to a desired color. It takes as inputs: the doc ID, the color desired and the word to look for.
However, what I really need is a function that finds all the words that start with a particular string. For example, "change all words that start with # to blue". I tried messing with findText() but had no luck. Any ideas on how to fix the function below to do what I need? Thanks!
Currently, my function looks like this:
function colorTheWords(findMe,color,documentID) {
  //color input must be formatted in CSS notation like '#ffffff'
  //documentID must be formatted as text in between ''
  //findMe word must be formatted as ''
  //open doc
  var document = DocumentApp.openById(documentID);
  var body = document.getBody();
  var foundElement = body.findText(findMe);
  while (foundElement != null) {
    // Get the text object from the element
    var foundText = foundElement.getElement().asText();
    // Where in the Element is the found text?
    var start = foundElement.getStartOffset();
    var end = foundElement.getEndOffsetInclusive();
    // Change the current color to the desired color
    foundText.setForegroundColor(start, end, color);
    // Find the next match
    foundElement = body.findText(findMe, foundElement);
  }
}
You can pass a regular expression to findText, which will allow you to do this easily. There is an answer to a similar question here:
Regex to check whether string starts with, ignoring case differences
I always use this site to help me to test my regular expressions before adding them to the code. Paste the contents of your document in and then fiddle with your regex until you just select what you need.
https://regexr.com/
The main issue you are encountering is that findText does not use normal JavaScript regular expressions but a flavour called RE2, which has some slight variations and restrictions. If you want to find all words that start with a specific string or character, this is the expression you should be using:
#([^\s]+)
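To preview what such a pattern selects before running it on a document, it can be tried on a plain string. The sketch below uses JavaScript's own RegExp rather than RE2; for a pattern this simple I would expect both to behave the same, but that is an assumption worth checking. The helper names escapeForRegex, prefixPattern and wordsStartingWith are made up for this example.

```javascript
// Hypothetical helper: escapes characters that have a special meaning in a regex,
// so that a prefix such as '$' or '(' is matched literally.
function escapeForRegex(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds a pattern string that matches the prefix followed
// by one or more non-whitespace characters, e.g. '#[^\s]+' for the prefix '#'.
// The same string could be passed to findText().
function prefixPattern(prefix) {
  return escapeForRegex(prefix) + '[^\\s]+';
}

// Lists every match in a plain string, as a preview of what would be selected.
function wordsStartingWith(text, prefix) {
  return text.match(new RegExp(prefixPattern(prefix), 'g')) || [];
}
```

Note that the pattern has no word-boundary check, so it also matches a prefix that appears in the middle of a word, and any punctuation directly after the word is included in the match.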

Formatting in replaceText()

I've got doc.getBody().replaceText(oldregex,newstring) working fine in a Google Document script at the minute, and was hoping to set some bold/italic on newstring. This looks harder than I thought it would be. Has anyone found a tidy way to do this?
I'm currently thinking I'll need to...
Build newtext as a range with rangeBuilder
Find oldtext and select it as a range (somehow...)
Clear the oldtext range and insert the newtext range at the find location
This seems like a lot of work for something that would be trivial with HTML-like tags. I'm definitely missing something. Would really appreciate any suggestions.
Since replaceText only changes the plain text content, leaving formatting in place, the goal can be achieved by applying formatting before the replacement. First, findText goes through the text and sets bold to every match; then replaceText performs the replacement.
There are two cases to consider: only part of the text in an element is matched (the typical case), or the entire element is matched. The isPartial() method of the RangeElement class distinguishes between them.
function replaceWithBold(pattern, newString) {
  var body = DocumentApp.getActiveDocument().getBody();
  var found = body.findText(pattern);
  while (found) {
    var elem = found.getElement().asText();
    if (found.isPartial()) {
      // only part of the element matched: format just that range
      var start = found.getStartOffset();
      var end = found.getEndOffsetInclusive();
      elem.setBold(start, end, true);
    }
    else {
      // the whole element matched
      elem.setBold(true);
    }
    // continue the search after the previous match
    found = body.findText(pattern, found);
  }
  body.replaceText(pattern, newString);
}
"This seems like a lot of work for something that would be trivial"
This is both correct and typical for working with Google Documents using Apps Script.

Selecting text with google app script in Docs

Is it possible for an app script to highlight (as in select) text? I want to run the script from the menu and then have all matching instances of some text selected so they can be formatted in one go.
Specifically, I want to write a script to highlight all footnotes in a Google Doc so that they can be formatted simultaneously. I am the creator of the Footnote Stylist add on for Docs, which allows users to style footnotes. But I want to include the option of using any formatting, without having to include every available formatting choice in the add on itself.
How about skipping the selection step and just formatting them directly? The code below searches for the word "Testing", bolds it, and highlights it in yellow. Hope this helps.
function bold() {
  var body = DocumentApp.getActiveDocument().getBody();
  var foundElement = body.findText("Testing");
  while (foundElement != null) {
    // Get the text object from the element
    var foundText = foundElement.getElement().asText();
    // Where in the element is the found text?
    var start = foundElement.getStartOffset();
    var end = foundElement.getEndOffsetInclusive();
    // Set Bold
    foundText.setBold(start, end, true);
    // Change the background color to yellow
    foundText.setBackgroundColor(start, end, "#FCFC00");
    // Find the next match
    foundElement = body.findText("Testing", foundElement);
  }
}
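The loop in this answer (find a match, format it, then search again starting from the previous match) can be pulled out into a small helper so that the formatting itself is supplied by the caller. This is just a sketch under my own assumptions: forEachMatch is a made-up name, and the callback is expected to change formatting only, since inserting or deleting text would invalidate the offsets of the match used to resume the search.

```javascript
// Hypothetical helper: calls callback(text, start, end) for every match of
// `pattern` inside `container` (a Body, Paragraph, or similar element).
// Returns the number of matches visited.
function forEachMatch(container, pattern, callback) {
  var count = 0;
  var found = container.findText(pattern);
  while (found) {
    callback(found.getElement().asText(),
             found.getStartOffset(),
             found.getEndOffsetInclusive());
    count++;
    // Passing the previous result makes the search resume after that match.
    found = container.findText(pattern, found);
  }
  return count;
}

// Possible use inside a Docs script (not executed here):
//   var body = DocumentApp.getActiveDocument().getBody();
//   forEachMatch(body, 'Testing', function (text, start, end) {
//     text.setBold(start, end, true);
//     text.setBackgroundColor(start, end, '#FCFC00');
//   });
```

With the formatting passed in as a function, an add-on could offer several styles without repeating the search loop for each one.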

How do you change formatting within a google doc for multiple occurrences using findText()?

I am trying to find text within a Google Doc and replace it with subscript notation: replace "a3" with a3, but with the 3 now formatted as a subscript.
Based on the answer here, I wrote some code that works but only replaces the first instance of any occurrence (some are repeated).
I wrote the following:
for (var k=0; k<subscriptsReplace.length; k++) {
  subscript = ' a'+subscriptsReplace[k];
  find = ' a'+subscriptsReplace[k]+' ';
  Logger.log(find)
  var element = body.findText(find);
  if(element){ // if found a match
    var start = element.getStartOffset();
    var text = element.getElement().asText();
    text.replaceText(find, subscript);
    text.setTextAlignment(start+2, start+2, DocumentApp.TextAlignment.SUBSCRIPT);
    Logger.log("found one");
  } // else do nothing
}
Note that subscriptsReplace is an array that contains all the numbers of the subscripts throughout the document.
I cannot figure out why it's not getting the repeats. Looking at the logs, I know that it's not running the conditional on the repeats, so it's not re-replacing the same subscript it already replaced.
Can someone see what's going on?
Thank you!
Ultimately the issue was that replaceText() was replacing all the occurrences of the text throughout the document, and therefore the text was no longer there to find and reformat after the first iteration.
Here's the code that replaced all occurrences:
for (var k=0; k<subscriptsReplace.length; k++) {
  find = 'a'+subscriptsReplace[k]+'_';
  var element = body.findText(find);
  if(element){ // if found a match
    var start = element.getStartOffset();
    var text = element.getElement().asText();
    text.setTextAlignment(start+1, start+1, DocumentApp.TextAlignment.SUBSCRIPT);
    text.deleteText(start+2, start+2);
  } // else do nothing
}
You'll see that rather than replacing the text, I added a special character "_" as a marker to search for, and then used deleteText() to remove the markers one at a time as I reformatted the digits into subscripts.
You can replace everything in the entire body with this:
function testReplace(searchPattern, replacement) {
  var docBody = DocumentApp.getActiveDocument().getBody();
  docBody.replaceText(searchPattern, replacement);
}
Google Documentation - Replace Text

prevent regex errors with unpredictable values

In a mail merge application I use the replaceText() method to replace field identifiers with custom values, and also in a reverse process to get the identifiers back.
The forward direction works every time, since the first argument is a plain string that I chose on purpose. But when I reverse the process, the search string sometimes contains characters that have a special meaning in regular expressions.
This happens mainly with phone numbers in the form +32 2 345 345, or even with some accented characters.
Given that I can't prevent this from happening, and that I have little hope that my end users will avoid this phone number format, I was wondering if someone could suggest a workaround to escape the illegal characters when they come up. Note: they can be at any place in the string.
Below is the code for both functions.
... (partial code)
    var newField = ChampSpecial(curData,realIdx,fctSpe);// returns the value from the database
    if(newField!=''){replacements.push(newField+'∏'+'#ch'+(n+1)+'#')};
    //Logger.log('value in '+n+'='+realIdx+' >> '+Headers[realIdx]+' = '+ChampSpecial(curData,realIdx,fctSpe))
    app.getElementById('textField'+(n+1)).setHTML(ChampSpecial(curData,realIdx,fctSpe));
    if(e.parameter.source=='insertInText'){
      body.replaceText('#ch'+(n+1)+'#',newField);
    }
  }
  UserProperties.setProperty('replacements',replacements.join('|'));
  cloakOn();
  colorize('#ffff44');
  return app;
}
function fieldsInDoc(e){
  cloakOff();// first restore the empty fields
  var replacements = UserProperties.getProperty('replacements').split('|');
  var doc = DocumentApp.getActiveDocument();
  var body = doc.getBody();
  for(var n=0;n<replacements.length;++n){
    var field = replacements[n].split('∏')[1];
    var testVal = replacements[n].split('∏')[0];
    body.replaceText(testVal,field);
  }
  colorize('#ffff44');
}
In the reverse process you are searching for the stored field values, which can include regex special characters. You have to escape them before replacing:
body.replaceText(testVal.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&'), field);
That said, "replacing the markers back" is a bad idea. What happens if two fields of the mail merge have the same value, or if the replacement text is already present in the document template?
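To make the escaping step easier to check on its own, it can be written as a small function and tried on the phone number from the question. This is only a sketch: toLiteralPattern is a made-up name, and the character set is the common JavaScript one, which I am assuming is also sufficient for the patterns replaceText accepts.

```javascript
// Hypothetical helper: returns `value` with every regex metacharacter
// backslash-escaped, so it can be used as a literal search pattern.
function toLiteralPattern(value) {
  return String(value).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// The raw phone number is not a valid pattern in JavaScript, because the
// leading '+' has nothing to repeat; the escaped version matches literally.
var phone = '+32 2 345 345';
var matchesLiterally = new RegExp(toLiteralPattern(phone)).test('Call ' + phone + ' today');
```

The g flag matters here: without it, only the first special character in the value would be escaped.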
One possible solution was to prevent the sample values in the doc from containing regex special characters, so the replacement had to happen in the forward process, not in the reverse one (as suggested in the other answer).
Escaping these characters in the field values didn't work (*), so I ended up simply replacing them with a hyphen (which makes sense in most cases as a substitute for a slash or a '+').
(*) The reverse process uses the value kept in memory, so the escape character interfered with the replacement in that function and prevented it from working properly.
The final working code is simply this:
//(in the first function)
var newField = ChampSpecial(curData,realIdx,fctSpe).replace(/([*+?^=!:${}()|\[\]\/\\])/g, "-");// replace every occurrence of *+?^... by '-' (global search)
About the comment stating that this approach is a bad idea, I can only say that I'm afraid there is not really another way to get that behavior, and that the probability of errors is ultimately quite low, since the main usage of mail merge is to insert proper names, addresses, emails and phone numbers that are rarely in the template itself.
As for the field indicators, they will never have the same name since they are numerically indexed (#chXX#).
EDIT: following Taras's comment I'll try another solution, and will update later if it works as expected.
EDIT June 19: Yesssss... found it.
I finally found a far better solution that does not rely on replaceText(): each match is located with findText() and rewritten with deleteText() and insertText(), so the inserted text is always written literally. Note that findText() itself still interprets its argument as a regular expression, so a search value containing special characters may still need escaping.
The code is a bit more complex, but the result is worth the pain :-))
Here is the full code in 2 functions, in case someone is looking for something similar.
function valuesInDoc(e){
  var lock = LockService.getPrivateLock(); // just in case one clicks the second button before this one ends
  var success = lock.tryLock(5000);
  if (!success) {
    Logger.log('tryLock failed to get the lock');
    return
  }
  colorize('#ffffff');// this function removes the color tags on the field markers
  var app = UiApp.getActiveApplication();
  var listVal = UserProperties.getProperty('listSel').split(',');
  var replacements = [];
  var doc = DocumentApp.getActiveDocument();
  var body = doc.getBody();
  var find = body.findText('#ch');
  if(find == null){return app };
  var curData = UserProperties.getProperty('selItem').split('|');
  var Headers = [];
  var OriHeaders = UserProperties.getProperty('Headers').split('|');
  for(n=0;n<OriHeaders.length;++n){
    Headers.push('#'+OriHeaders[n]+'#');
  }
  var fctSpe = 0 ;
  for(var i in Headers){if(Headers[i].indexOf('SS')>-1){fctSpe = i}}
  for(var n=0;n<listVal.length;++n){
    var realIdx = Number(listVal[n]);
    Logger.log(n);
    var newField = ChampSpecial(curData,realIdx,fctSpe);
    //Logger.log(newField);
    app.getElementById('textField'+(n+1)).setHTML(ChampSpecial(curData,realIdx,fctSpe));
    if(e.parameter.source=='insertInText'){
      var found = body.findText('#ch'+(n+1)+'#');// look for every field marker in the whole doc
      while(found!=null){
        var elemTxt = found.getElement().asText();
        var startOffset = found.getStartOffset();
        var len = ('#ch'+(n+1)+'#').length;
        elemTxt.deleteText(startOffset, found.getEndOffsetInclusive())
        elemTxt.insertText(startOffset,newField);// remove the marker and write the sample value in place
        Logger.log('n='+n+' newField = '+newField+' for '+'#ch'+(n+1)+'#'+' at position '+startOffset)
        replacements.push(newField+'∏'+'#ch'+(n+1)+'#'+'∏'+startOffset);// memorize the change that just occurred
        found = body.findText('#ch'+(n+1)+'#',found); //loop until all markers are replaced
      }
    }
  }
  UserProperties.setProperty('replacements',replacements.join('|'));
  cloakOn();
  colorize('#ffff44');// colorize the markers if ever one is left but it shouldn't happen
  lock.releaseLock();
  return app;
}
function fieldsInDoc(e){
  var lock = LockService.getPrivateLock();
  var success = lock.tryLock(5000);
  if (!success) {
    Logger.log('tryLock failed to get the lock');
    return
  }
  cloakOff();// first restore the empty fields: shows the hidden fields (markers that had no sample value in the first function)
  var replacements = UserProperties.getProperty('replacements').split('|');// recover replacement data as an array
  Logger.log(replacements)
  var doc = DocumentApp.getActiveDocument();
  var body = doc.getBody();
  for(var n=replacements.length-1;n>=0;n--){ // for each replacement find the data in doc and write a field marker in place
    var testVal = replacements[n].split('∏')[0]; // [0] is the sample value
    if(body.findText(testVal)==null){break};// this only handles the case where someone clicks the wrong button, trying to place markers again when they are already there ;-)
    var field = replacements[n].split('∏')[1];
    var testValLength = testVal.length;
    var found = body.findText(testVal);
    var startOffset = found.getStartOffset();
    Logger.log(testVal+' = '+field+' / start: '+startOffset+' / Length: '+ testValLength)
    var elemTxt = found.getElement().asText();
    elemTxt.deleteText(startOffset, startOffset+testValLength-1);// remove the text
    // elemTxt.deleteText(startOffset, found.getEndOffsetInclusive() )
    elemTxt.insertText(startOffset,field);// and write the marker
  }
  colorize('#ffff44'); // colorize the marker
  lock.releaseLock();// and release the lock
}