Google Apps Script findText searchPattern format?

I am trying to pass in simple regex strings like
findText("/a/");
or
findText(/a/);
but it does not find anything. If I pass in only the text, it works, like this:
findText("a");
How do I pass regex strings in there?

It's not super clear in the documentation of findText, but the documentation for replaceText is more clear:
The search pattern is passed as a string, not a JavaScript regular expression object.
The example shown in the documentation of replaceText shows that your 3rd example is the correct one (where the search for a is shown as just the string, "a"):
body.replaceText("^.*Apps ?Script.*$", "Apps Script");
Obviously, String.search() will work here as well, but if you're looking to manipulate the attributes of the text, rather than just the string contents, using the built-in JavaScript function might leave you hanging.
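As a rough sketch of what passing the pattern as a plain string can look like, the helper below collects every match of a pattern string. The name findAllMatches and the shape of the returned objects are hypothetical and only for illustration; findText(searchPattern, from), getStartOffset() and getEndOffsetInclusive() are the Document service methods as I understand them. Backslashes have to be doubled inside a JavaScript string literal, so a digit run is written "\\d+".

```javascript
// Hypothetical helper: collects the offsets of every match of a pattern
// string. `body` is assumed to be a Document Body, or any object with a
// compatible findText method.
function findAllMatches(body, pattern) {
  var hits = [];
  var range = body.findText(pattern); // pattern is a plain string, no slashes
  while (range !== null) {
    hits.push({
      start: range.getStartOffset(),      // offset inside the matched element
      end: range.getEndOffsetInclusive()
    });
    range = body.findText(pattern, range); // continue after the previous hit
  }
  return hits;
}

// In Apps Script this might be called as:
//   findAllMatches(DocumentApp.getActiveDocument().getBody(), "\\d+");
```

Because each hit comes back as a range rather than a bare index, the same loop could be extended to change formatting on the matched text, which String.search() cannot do.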

Use the String.search() method.
function test(){
var testString = "1212a1212";
var results = testString.search(/a/);
Logger.log(results); //results = 4;
}

It is possible to use regular expressions with the findText(searchPattern) function; however, the expression needs to be in RE2 syntax.
For example, if you wanted to do a case-insensitive search for the word 'antevasin', you could specify
let searchResult = DocumentApp.getActiveDocument().getBody().editAsText().findText( '(?i)antevasin' );
where (?i) turns on case-insensitive matching and would find 'Antevasin' in the document.
This page and this one have some examples and more detail.

Related

How do I save text from a document into a variable?

I want to take the text of a document and save it as a variable. I looked in the documentation and found "getText", which I think should work. https://developers.google.com/apps-script/reference/document/footnote-section#gettext
I run into a problem when I try to use it: because it's not a pre-built function, it gives the error message "TypeError: Cannot read property 'getText' of null". So I looked into it some more and noticed I needed authorization:
"Scripts that use this method require authorization with one or more of the following scopes:
https://www.googleapis.com/auth/documents.currentonly
https://www.googleapis.com/auth/documents"
So how do I get the required authorization? Do I need to do something different, or is there another way I could do it?
It's just going to run on some of my docs for fun, to see what funny things I am able to do with the program.
(New to programming, know the basics, but just trying to see if programming is something for me)
Thanks in advance
Given this sample document, you can start with the sample code below.
Code:
function myFunction() {
var body = DocumentApp.getActiveDocument().getBody();
var text = body.editAsText();
Logger.log(text.getText()); // returns all text in document ("Hey, search for me!!! I am <here>!!")
// 1. Regular expression (exec)
var regExp = new RegExp("<(.*)>", "gi"); // "i" is for case insensitive; the parentheses form the capturing group
var search = regExp.exec(text.getText())[1];
Logger.log(search); // returns "here"
// 2. (search)
Logger.log(text.getText().search(/here/)); // returns the index where the string was found, or -1 if not found
}
Note:
getText will return the text of the Class Text.
If you want to get a specific pattern from the document, you need to use a regular expression. I prefer the exec method. See usage for more details.
For the authorization, you only need to click Allow. I assume you are using Apps Script because of the tag.
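One thing the sample code does not guard against: exec returns null when nothing matches, so indexing the result with [1] directly would throw a TypeError. A minimal sketch with a null check follows. The helper name extractBracketed is hypothetical, and in Apps Script the input string would presumably come from body.getText().

```javascript
// Hypothetical helper: returns the text between the first "<" and the last
// ">" in the input, or null when there is no such pair.
function extractBracketed(text) {
  var match = /<(.*)>/.exec(text);
  return match === null ? null : match[1];
}

var sample = "Hey, search for me!!! I am <here>!!";
var found = extractBracketed(sample);          // "here"
var missing = extractBracketed("no brackets"); // null
```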

Apps script JSON.parse() returns unexpected result, how can I solve this?

I am currently working on an external app that uses Google Sheets and JSON for data transmission via the Fetch API. For debugging, I decided to mock the scenario: a simple JSON payload comes from my external app and is written to Google Sheets by a prepared Code.gs. The code snippet I run through Apps Script looks like this:
function _doPost(/* e */) {
// const body = e.postData.contents;
const bodyJSON = JSON.parse("{\"coords\" : \"123,456,789,112,113,114,115,116\"}" /* instead of : body */);
const db = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
db.getRange("A1:A10").setValue(bodyJSON.coords).setNumberFormat("#"); // get range, set value, set text format
}
The problem is the result I get: 123,456,789,112,113,000,000,000. As you can see, from 114 onward it outputs 000,... instead. I thought, okay, I will explicitly specify that the value be saved in text format. When I check the selected range in the Google Sheets UI (Format -> Number), it shows Text.
However, something odd happens if I update the JSON body so that the sequence ends with 2-digit groups instead of 3-digit ones (note: these are part of a string, not true numbers, separated by commas): "{\"coords\" : \"123,456,789,112,113,114,115,116,17,18\"}". The result is then not only shown as expected, it also restores the "corrupted" values hidden under the 000,..., like so: "{"coords" : "123,456,789,112,113,114,115,116,17,18 "}".
Even Logger.log() returns the initial JSON input as expected. I really have no clue what is going on, and I would appreciate any help solving this issue. Thank you.
You can try assigning an object literal directly to your bodyJSON variable instead of parsing a string with JSON.parse.
Part of your code should look like this:
const bodyJSON = {
"coords" : "123,456,789,112,113,114,115,116"
}
I found a simple workaround after all: I just added a pair of leading zeros (0,0,123,...) at the very beginning of coords. This prevents the problem I described in my question. If anyone is interested, the external app I am currently building is called Hotspot widget: it plays around with the DOM and appends a marker whose coordinates (coords) are pushed through Apps Script and saved to Google Sheets. I am providing a link with instructions on how to set up your own copy of the app. It's a decent starting point for learning vanilla JavaScript basics, including a simple database approach on the fly. Thank you and good luck!
Hotspot widget on Github
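One possible explanation, which nothing in this thread confirms: Sheets may be reading the string as a single number with thousands separators, and that number has more digits than a double can hold exactly. This would fit both observations, since a trailing 2-digit group or a leading 0,0, stops the string from looking like a grouped number. The plain JavaScript sketch below only illustrates that reading. The helper looksLikeGroupedNumber is hypothetical, and its regex is an assumption about what counts as a grouped number, not the actual parsing rule used by Sheets.

```javascript
// Assumption: a string counts as a "grouped number" when every group after
// the first has exactly three digits, like 1,234,567.
function looksLikeGroupedNumber(s) {
  return /^\d{1,3}(,\d{3})+$/.test(s);
}

var original = "123,456,789,112,113,114,115,116";
var withShortGroups = original + ",17,18";
var withLeadingZeros = "0,0," + original;

// Reading the original as one number loses the low-order digits, because a
// double cannot represent this 24-digit integer exactly.
var digits = original.replace(/,/g, "");
var survivesAsDouble = BigInt(Number(digits)) === BigInt(digits);

console.log(looksLikeGroupedNumber(original));         // true
console.log(looksLikeGroupedNumber(withShortGroups));  // false
console.log(looksLikeGroupedNumber(withLeadingZeros)); // false
console.log(survivesAsDouble);                         // false
```

If this reading is right, it may also be worth trying a plain-text number format on the range before writing the value, since "#" is a number pattern rather than a text format.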

How to get validation pattern from Google Form item

In Google Forms, a question/item of class TextItem can have a validation pattern set (i.e., the response must match the pattern to be accepted by the form) by doing something like:
var textValidation = FormApp.createTextValidation()
.requireTextMatchesPattern(my_regex_pattern)
.build();
textItem.setValidation(textValidation);
I would like to read that pattern from an existing textItem (in order to have a script go through several questions with unique patterns and save them), but I cannot find any references to relevant methods or properties for validation rules, e.g. (made-up examples):
var existing_pattern = textItem.getTextMatchesPattern();
or
var existing_pattern = textItem.getValidation.getTextMatchesPattern();
From what I can see, it looks like this is not actually possible, but that seems quite strange. Am I missing something?
Note: Using the Web interface, it's trivial to get the pattern for a given question/item. I'm specifically asking how to get it through a script.

Smarter way to isolate a value in an unformatted string?

I'm using xpdf in an AIR app to convert PDFs to PNGs on the fly. Before conversion I want to get a page count, so I am using xpdf's pdfinfo utility to print to stdout and then parsing that string to get the page count.
My first pass solution: split the string by line breaks, test the resulting array for the "Pages:" string, etc.
My solution works but it feels clunky and fragile. I thought about replacing all the double spaces, doing a split on ":" and building a hash table – but there are timestamps with colons in the string which would screw that up.
Is there a better or smarter way to do this?
protected function processPDFinfo(data:String):void
{
var pageCount:Number = 0;
var tmp:Array = data.split("\n");
for (var i:int = 0; i < tmp.length; i++){
var tmpStr:String = tmp[i];
if (tmpStr.indexOf("Pages:") != -1){
var tmpSub:Array = tmpStr.split(":");
if (tmpSub.length){
pageCount = Number(tmpSub[tmpSub.length - 1]);
}
break;
}
}
trace("pageCount", pageCount);
}
Title: Developing Native Extensions
Subject: Adobe Flash Platform
Author: Adobe Systems Incorporated
Creator: FrameMaker 8.0
Producer: Acrobat Distiller Server 8.1.0
CreationDate: Mon Dec 7 05:45:39 2015
ModDate: Mon Dec 7 05:45:39 2015
Tagged: yes
Form: none
Pages: 140
Encrypted: no
Page size: 612 x 783 pts (rotated 0 degrees)
File size: 2505564 bytes
Optimized: yes
PDF version: 1.4
Use regular expressions like this one for example:
/Pages:\s*(\d+)/g
The first (and only) capturing group is the string of digits you are looking for.
var pattern:RegExp = /Pages:\s*(\d+)/g;
var pageCount:int = parseInt(pattern.exec(data)[1]);
I understand about 2% of that (/Pages: /g). It is looking for the string literal Pages: and then something with spaces, a wildcard, and an escaped d+??
I know, regex can be hard. What really helps when creating them is an IDE that supports them. There are also online tools like regexr (my first time using version 2 here, and it's even better than version 1, very nice!). In general, you want a tool that gives you immediate visual feedback on what's being matched.
Below is a screenshot with your text and my pattern in regexr.
You can hover over things and get all kinds of information.
The sidebar to the left is full-fledged documentation on regex.
The optional explain tab goes through the given pattern step by step.
\s* is any amount of whitespace characters and \d+ is at least one numeric digit character.
and returning an array??
This is the AS3 part of the story. Once you create a RegExp object with the pattern, you can use exec() to execute it on some String. (Not sure why they picked that odd abbreviation for the method name.)
The return value is a little funky:
Returns
Object — If there is no match, null; otherwise, an object with the following properties:
An array, in which element 0 contains the complete matching substring, and other elements of the array (1 through n) contain substrings that match parenthetical groups in the regular expression
index — The character position of the matched substring within the string
input — The string (str)
You have to check the documentation of exec() to really understand this. It's kind of JS style, returning a bunch of variables held together in a generic object that also acts as an array.
This is where the [1] in my example code comes from.
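To make the shape of the exec() result concrete, here is a small sketch with a guard for the no-match case. It is written in JavaScript rather than ActionScript; as far as I know the regex literal and exec() behave the same way in both. The helper name pageCountFromPdfInfo and the -1 fallback are hypothetical choices, and the column spacing in the sample text is assumed.

```javascript
// Hypothetical helper: extracts the page count from pdfinfo-style output.
// Returns -1 when no "Pages:" line is present.
function pageCountFromPdfInfo(data) {
  var match = /Pages:\s*(\d+)/.exec(data);
  if (match === null) {
    return -1;
  }
  // match[0] is the whole matched substring,
  // match[1] is the first capturing group (the digits).
  return parseInt(match[1], 10);
}

var sampleOutput = [
  "Tagged:         yes",
  "Pages:          140",
  "Encrypted:      no",
  "Page size:      612 x 783 pts (rotated 0 degrees)"
].join("\n");

var pages = pageCountFromPdfInfo(sampleOutput);   // 140
var none = pageCountFromPdfInfo("Encrypted: no"); // -1
```

The "Page size:" line is not picked up because the pattern requires the literal "Pages:" with the colon directly after the s.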

Creating new attribute

In RapidMiner I have a data set with an attribute called address, which contains a property address. I need to create a new attribute that holds only the last 3 words of each property address. For example, given 231 new road County Dublin Ireland, I want County Dublin Ireland in the new attribute. Could anybody help me with this process, as I am very new to RapidMiner? I have tried to do it with the generate attribute process using the function expression options, but with no success.
There might be an easier way to achieve that, but you can use the Execute Script operator and some regular expressions. This example script will replace the values of attribute "att1" with only the last three words:
import java.util.regex.*
exampleSet = operator.getInput(ExampleSet.class)
Pattern p = Pattern.compile("^.*?(\\S+\\s\\S+\\s\\S+)\$")
for(Example example : exampleSet){
value = example["att1"]
print(value)
Matcher m = p.matcher(value)
if(m.matches()){
example["att1"] = m.group(1)
}
}
return exampleSet
Edit:
There really is a much easier way: use the Generate extract operator with the regular expression (\S+\s\S+\s\S+)$. You may need to adapt the regular expression to your data.
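To check what a pattern like this captures before putting it into an operator, it can be tried out in isolation. Below is a minimal sketch in plain JavaScript, outside RapidMiner. The helper name lastThreeWords is hypothetical, and it differs slightly from the pattern above: it trims the input and allows runs of whitespace (\s+), which is one way the expression might be adapted to messier data.

```javascript
// Hypothetical helper: returns the last three whitespace-separated words of
// the input, or the trimmed input unchanged when it has fewer than three.
function lastThreeWords(address) {
  var trimmed = address.trim();
  var match = /(\S+\s+\S+\s+\S+)$/.exec(trimmed);
  return match === null ? trimmed : match[1];
}

var result = lastThreeWords("231 new road County Dublin Ireland");
// result is "County Dublin Ireland"
```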