How do I remove text between two "tags" in Google Apps Script - google-apps-script

I am using Google Apps Script to make documents from a "template". This template in Google Docs has the following format:
(Text above the area of interest...) Culpa cillum excepteur cupidatat cillum ex.
{mySpecialText}
Est consectetur irure non quis sint exercitation velit qui consequat incididunt officia laborum ea veniam.
{/mySpecialText}
(More ignored text here...) Tempor voluptate irure laboris occaecat enim ipsum consequat velit.
The text between the two tags {mySpecialText} and {/mySpecialText} should be deleted, in addition to the tags. How do I go about doing this? My current code shown below removes the tags but not the paragraph in between.
/**
* @param doc - a `Document` instance.
* @param sectionId - the name of the tag
*/
function removeSection (doc, sectionId) {
var startTag = '{' + sectionId + '}'
var endTag = '{/' + sectionId + '}'
var body = doc.getBody()
var startElem = body.findText(startTag)
var endElem = body.findText(endTag)
if (startElem == null) {
return Logger.log('Couldn\'t find startElem with tag "' + startTag + '"')
}
if (endElem == null) {
return Logger.log('Couldn\'t find endElem with tag "' + endTag + '"')
}
startElem = startElem.getElement()
endElem = endElem.getElement()
var toRemove = []
var currentElement = startElem
while (currentElement !== null && !isSameElement(currentElement, endElem)) {
toRemove.push(currentElement)
currentElement = currentElement.getNextSibling()
}
toRemove.push(endElem)
for (var i = 0; i < toRemove.length; i++) {
toRemove[i].removeFromParent()
}
}

You need to delete the entire paragraph between the tags. Assuming this is only one paragraph, you could retrieve the paragraph element and delete it as follows:
var removeme = startElem.getParent().getNextSibling().asParagraph();
removeme.removeFromParent();
If there could be multiple paragraphs, use a loop that checks whether the next sibling contains the end tag: if it does not, delete that paragraph; otherwise stop the loop.
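The multi-paragraph case could be sketched roughly as below. This is only a sketch under assumptions: `removeTaggedSection` is a hypothetical name, `body` is assumed to be the object returned by `doc.getBody()`, and both tags are assumed to sit in top-level paragraphs of the body (not inside a table cell). Instead of comparing element objects, it compares child indexes and removes from the end tag back to the start tag, so the remaining indexes do not shift.

```javascript
// Sketch: remove every body child from the start-tag paragraph through the
// end-tag paragraph. Assumes both tags are in top-level paragraphs of `body`.
function removeTaggedSection(body, sectionId) {
  // findText() takes a regular expression pattern, so the braces are escaped.
  var start = body.findText('\\{' + sectionId + '\\}');
  var end = body.findText('\\{/' + sectionId + '\\}');
  if (start === null || end === null) {
    return 0; // one of the tags is missing, nothing removed
  }
  // findText() returns a RangeElement wrapping a Text element;
  // the parent of that Text element is the paragraph holding the tag.
  var first = body.getChildIndex(start.getElement().getParent());
  var last = body.getChildIndex(end.getElement().getParent());
  var removed = 0;
  // Walk backwards so earlier indexes stay valid while children are removed.
  for (var i = last; i >= first; i--) {
    body.getChild(i).removeFromParent();
    removed++;
  }
  return removed;
}
```

It would be called as `removeTaggedSection(doc.getBody(), 'mySpecialText')`; the return value is the number of removed children.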

Related

Count instances of text string and replace with number

I am trying to find ways to speed up adding footnotes to a Google slides document. What I want is a script that looks for every instance of a text string throughout the document (say ‘*’) and then replaces each instance of that string with the number corresponding to that instance e.g. the first * gets replaced with 1, second * gets replaced with 2, and so on. I realise this method can only be used once but this would still save me a lot of time. Is there an easy way to do this? I can’t work out how to replace with a variable but it seems like it should be possible.
Assume a presentation in which the target string appears in several shapes across the slides.
To replace all occurrences of a string (e.g. "replace"), traverse all the shapes on each slide and replace each occurrence with an incrementing counter.
Code:
function myFunction() {
var presentation = SlidesApp.getActivePresentation();
var slides = presentation.getSlides();
var counter = 0;
// traverse each slide
slides.forEach(function (slide) {
var shapes = slide.getShapes();
// traverse each shape
shapes.forEach(function (shape) {
// get its text content
var text = shape.getText()
var string = text.asString();
// replace all occurrences of string (e.g. "replace")
// by an incrementing number
string = string.replace(/replace/g, function() {
return ++counter;
});
// set the shape's text
text.setText(string);
});
});
}
This is not a ready-made solution, but rather a way to approach the task.
You can download all the text of your presentation as a TXT file.
Then you can process this text with a JS script, something like this:
// your text with markers (#)
var txt = `
doleste # atus etur, consequi odi quos alit audipsunt as is est# ant.
consequi # odi quos alit audipsunt es vere ipsam aut am
doluptae et que nonse # um volupta aped ulloreictat as is est ant.
`;
// get every marker + several characters before and after
var find_for = txt.match(/...#.../g);
console.log(find_for); // Output: [ 'te # at', 'est# an', 'ui # od', 'se # um' ]
// replace marker with numbers 1, 2, 3...
var replace_with = find_for.map((m,i) => m.replace(/#/, i+1));
console.log(replace_with); // Output: [ 'te 1 at', 'est2 an', 'ui 3 od', 'se 4 um' ]
This way you will get two arrays: find_for and replace_with.
Then you will need a script to perform the text replaces.
'te # at' --> 'te 1 at'
'est# an' --> 'est2 an'
'ui # od' --> 'ui 3 od'
'se # um' --> 'se 4 um'
That, I believe, is a fairly trivial task.
But errors can occur if some markers have the same neighboring characters. You probably need to capture four or five neighboring characters with each marker: ....#.... or asymmetric ......#... It's up to you.
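The replacement step itself might look like the sketch below, shown on a plain string rather than on a Slides presentation. `applyReplacements` is a hypothetical helper name, and the short sample text is made up for illustration. It applies each find/replace pair once, in order; markers that sit closer than three characters to each other are not handled by the `...#...` pattern.

```javascript
// Apply each find/replace pair once, in the order the markers were found.
function applyReplacements(text, findFor, replaceWith) {
  return findFor.reduce(function (result, needle, i) {
    // With a string pattern, replace() changes only the first occurrence.
    // Returning the replacement from a callback keeps "$" sequences in it
    // from being interpreted as special replacement patterns.
    return result.replace(needle, function () {
      return replaceWith[i];
    });
  }, text);
}

// Illustrative sample text (not taken from a real presentation).
var txt = 'aaa # bbb ccc# ddd';
var findFor = txt.match(/...#.../g) || [];
var replaceWith = findFor.map(function (m, i) {
  return m.replace('#', String(i + 1));
});
console.log(applyReplacements(txt, findFor, replaceWith)); // aaa 1 bbb ccc2 ddd
```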

Automatic multipage with multiline text

I have dynamic text with an unknown number of lines. The number of lines can be anywhere between 1 and, for example, 1000.
Now I want to create a PDF document that automatically starts a new page once a specific number of lines is reached.
I found that this should be possible with MigraDoc, but when I tested it, it didn't work the way I expected.
// You always need a MigraDoc document for rendering.
Document doc = new Document();
MigraDoc.DocumentObjectModel.Section sec = doc.AddSection();
// Add a single paragraph with some text and format information.
MigraDoc.DocumentObjectModel.Paragraph para = sec.AddParagraph();
para.Format.Alignment = ParagraphAlignment.Justify;
para.Format.Font.Name = "Times New Roman";
para.Format.Font.Size = 12;
para.Format.Font.Color = MigraDoc.DocumentObjectModel.Colors.DarkGray;
para.AddText("Duisism odigna acipsum delesenisl ");
para.AddFormattedText("ullum in velenit", TextFormat.Bold);
para.AddText(" ipit iurero dolum zzriliquisis nit wis dolore vel et nonsequipit, velendigna " +
"auguercilit lor se dipisl duismod tatem zzrit at laore magna feummod oloborting ea con vel " +
"essit augiati onsequat luptat nos diatum vel ullum illummy nonsent \nA \n B\nV \nD \nE\nF\nG\nA \n B\nV \nD \nE\nF\nG\nA \n B\nV \nD \nE\nF\nG\nA \n B\nV \nD \nE\nF\nG\nA \n B\nV \nD \nE\nF\nGnit ipis et nonsequis " +
"niation utpat. Odolobor augait et non etueril landre min ut ulla feugiam commodo lortie ex " +
"essent augait el ing eumsan hendre feugait prat augiatem amconul laoreet. ≤≥≈≠");
para.Format.Borders.Distance = "5pt";
para.Format.Borders.Color = MigraDoc.DocumentObjectModel.Colors.Gold;
// Create a renderer and prepare (=layout) the document
MigraDoc.Rendering.DocumentRenderer docRenderer = new DocumentRenderer(doc);
docRenderer.PrepareDocument();
This is the code I took from a MigraDoc example, but it doesn't work the way I want. Instead of starting a new page after the appropriate number of lines, it just keeps writing past the border of the first page.
Can you give me an example where multiline text creates a new page once the appropriate number of lines is reached?
To create a PDF document from MigraDoc, use the PdfDocumentRenderer class and you will get as many pages as needed.
A sample can be found here:
http://www.pdfsharp.net/wiki/HelloMigraDoc-sample.ashx
The class DocumentRenderer you are using is for special cases. By design, it cannot handle page breaks automatically.

In Razor View how to access the Id of dynamically created number of textbox

I am trying to create a number of text boxes dynamically in my Razor view. How can I ensure that a different id is assigned to each text box? My objective is to read the numeric value entered in each dynamic textbox (QuestionCount) and calculate the sum of the values entered.
Below is the part of code used in my view.
@foreach (var QP_Count in ViewBag.NonUniformTempCount)
{
    var str = @ViewBag.NonUniformTempNames[tempindex];
    <b>@str</b>
    for (int QCount = 1; QCount <= QP_Count; QCount++)
    {
        <br /> <b>@QCount</b> <b>@Html.TextBox("QuestionCount")</b>
    }
    tempindex++;
}
To give each dynamically generated textbox a different id, pass the id through the htmlAttributes argument. Combining the group index (tempindex) with QCount keeps the ids unique even when two groups have the same count:
@foreach (var QP_Count in ViewBag.NonUniformTempCount)
{
    var str = @ViewBag.NonUniformTempNames[tempindex];
    <b>@str</b>
    for (int QCount = 1; QCount <= QP_Count; QCount++)
    {
        <br /> <b>@QCount</b> <b>@Html.TextBox("QuestionCount", null, new { id = "Question-" + tempindex + "-" + QCount })</b>
    }
    tempindex++;
}
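For the summing part of the question, a client-side sketch could look like the following. `sumQuestionCounts` is a hypothetical helper, and it assumes the textboxes are read in the browser, for example via `document.getElementsByName("QuestionCount")`, since every generated textbox shares that name. Blank or non-numeric boxes are simply skipped.

```javascript
// inputs: array-like of objects with a string `value` property, e.g. the
// NodeList returned by document.getElementsByName("QuestionCount").
function sumQuestionCounts(inputs) {
  var total = 0;
  for (var i = 0; i < inputs.length; i++) {
    var n = parseFloat(inputs[i].value);
    if (!isNaN(n)) {
      total += n; // blank or non-numeric boxes contribute nothing
    }
  }
  return total;
}
```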

Regex for index of string match spanning across several XML tags

I'm trying to insert a link in TLF. Normally you would simply use
var linkElement:LinkElement = textArea.textFlow.interactionManager.applyLink( ... );
The problem is that, if I create a link which spans across differently formatted text (bold, italic, etc), or heaven forbid across paragraphs and list items, it completely and utterly crashes and burns. Link formatting is completely lost, and list structures collapse.
Simply adding a LinkElement via addChild() doesn't work either, if we're going to keep both the formatting and the structure within the selected text.
Ripping out the textFlow for the selection with interactionManager.cutTextScrap(...), wrapping it in a LinkElement with interactionManager.applyLink( ... ), and then "pasting" back in... also creates a mess.
So I have to create my own link insertion routine.
What I've resolved to do is to:
1) convert the textflow tags to a string
2) find the start and end indexes of the selection within the textflow string
3) insert the following string at the start index:
</span><a href="[hrefVar]" target="[targetVar]"><span>
4) insert the following string at the end index:
</span></a><span>
5) reconvert the textflow string into a textflow object for the TextArea
And voila! Instant RTF link!
The only problem is... I have no idea how to write a regex parsing equation which can find the start and ending indexes for a string match inside XML markup where the result may be spread across several tags.
For instance, if the TextFlow is (abbreviated):
<TextFlow><p><span>Lorem Ip</span><span fontWeight="bold">sum do</span><span>
lor sit am</span><span fontStyle="italic">et, consectetur adipiscing elit.
</span></p></TextFlow>
Say, for instance, the user has selected "Ipsum dolor sit amet" to be converted into a link. I need to find the first and last indexes of "Ipsum dolor sit amet" within that RTF markup, and then insert the strings indicated in 3) & 4) above, so that the end result looks like this:
<TextFlow><p><span>Lorem </span><a href="http://www.google.ca" target="_blank">
<span>Ip</span><span fontWeight="bold">sum do</span><span>lor sit am</span>
<span fontStyle="italic">et</span></a><span>, consectetur adipiscing elit.
</span></p></TextFlow>
You might lose some style formatting, but I can fix that later parsing through the textflow formatting.
What I need is the regex to do step 2).
I know the regex to ignore tags and strip out the text between tags, and how to find a string match of the selected text in the stripped textflow text... but not how to find the match indexes within the original (unstripped) textflow string.
Anyone?
IMHO a better way is to walk through the string instead of trying to use a regex.
Here is an idea for a quick and dirty approach. This code needs to be improved, but it might point in the right direction.
The main goal is simply to skip the tags and match the text, while counting how many characters have been passed in the process.
// This code might need revision so that < and > symbols inside the text are not treated as tag boundaries, and so that the search resets when a partial match fails.
var sourceStr:String = '<TextFlow><p><span>Lorem Ip</span><span fontWeight="bold">sum do</span><span>lor sit am</span><span fontStyle="italic">et, consectetur adipiscing elit.</span></p></TextFlow>';
var searchStr:String = "Lorem Ipsum d";
var indexes:Object = firstLast(sourceStr, searchStr);
trace(indexes.startIndex,indexes.finishIndex);
function firstLast(sourceStr:String, searchStr:String):Object
{
var indexCounter:int = -1;
var searchFlag:Boolean = true;
var searchPos:int = 0;
var searchChar:String;
var sourceChar:String;
var startIndex:int;
var finishIndex:int;
for (var i:int = 0; i < sourceStr.length; i++ )
{
indexCounter++;
sourceChar = sourceStr.substr(i, 1);
if (sourceChar == "<")
{
searchFlag = false;
}
else if (sourceChar == ">")
{
searchFlag = true;
}
if (!searchFlag)
{
continue;
}
searchChar = searchStr.substr(searchPos, 1);
if (sourceChar == searchChar)
{
if (searchPos == 0)
{
startIndex = indexCounter;
}
if (searchPos == searchStr.length - 1)
{
finishIndex = indexCounter;
}
searchPos++;
}
}
return { startIndex:startIndex, finishIndex:finishIndex };
}
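One way to get the reset behaviour mentioned in the comment above is to build the stripped text together with a table that maps each stripped character back to its position in the markup, and then search the stripped text with a plain `indexOf`. The sketch below is written in JavaScript rather than ActionScript, and `findInMarkup` is a hypothetical name. It assumes that `<` and `>` appear only as tag delimiters (no `>` inside attribute values) and that the text contains no entities such as `&amp;`.

```javascript
// Returns the markup indexes of the first and last character of `search`,
// matching against the text with all tags skipped, or null if not found.
function findInMarkup(source, search) {
  if (search.length === 0) {
    return null;
  }
  var plain = '';     // text with the tags stripped
  var positions = []; // positions[k] = index in `source` of plain[k]
  var insideTag = false;
  for (var i = 0; i < source.length; i++) {
    var ch = source.charAt(i);
    if (ch === '<') {
      insideTag = true;
    } else if (ch === '>') {
      insideTag = false;
    } else if (!insideTag) {
      plain += ch;
      positions.push(i);
    }
  }
  var at = plain.indexOf(search);
  if (at === -1) {
    return null;
  }
  return {
    startIndex: positions[at],
    finishIndex: positions[at + search.length - 1]
  };
}
```

Because the match is done on the stripped text, a partial match that fails (for example "su" followed by "sum") does not need any special handling.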

Which javascript library or framework supports "Table Of Content" generation?

I am looking for a javascript on the fly "Table Of Contents" generation from HTML (with anchors).
Example:
<h1>First level1 heading</h1>
lorem ipsum
<h2>1a heading</h2>
lorem ipsum
<h2>1b heading</h2>
lorem ipsum
<h1>Second level1 heading</h1>
lorem ipsum
Should return something like
First level1 heading
1a heading
1b heading
Second level1 heading
with the lines linked to the headings, and the original HTML should also be returned with anchors inserted.
Is there something included in one of the big javascript libraries or frameworks?
If none of them has, has someone seen a good JS module for this purpose?
jQuery is your friend, with this plugin: table of contents. Home page is http://code.google.com/p/samaxesjs/
You can make it yourself. I wrote the script below, hope it helps.
Add a div element as the first child of the body element and give it the id "tableOfContents",
then add the script below as the last child of the body element.
<script>
// Collect every element; headings are picked out by tag name below.
var el = document.getElementsByTagName("*") || [];
var toc = "<ul>";
var lvl = 1;
for (var i = 0; i < el.length; i++)
{
    var ce = el[i];
    var m = String(ce.tagName).match(/^h([1-5])$/i);
    if (m)
    {
        var n = Number(m[1]);
        // Close or open nested lists until the list depth equals the heading level.
        while (lvl > n) { toc += "</ul></li>"; lvl--; }
        while (lvl < n) { toc += "<li style='list-style:none'><ul>"; lvl++; }
        toc += '<li><a href="#toc_' + i + '">' +
            (ce.innerText || ce.textContent) +
            '</a></li>';
        // Insert a named anchor at the start of the heading as the link target.
        var ta = document.createElement("a");
        ta.name = "toc_" + i;
        ce.insertBefore(ta, ce.firstChild);
    }
}
// Close any lists that are still open.
while (lvl > 1) { toc += "</ul></li>"; lvl--; }
toc += "</ul>";
document.getElementById("tableOfContents").innerHTML = toc;
</script>
This script detects each heading (H1 to H5) and generates your table of contents.
This is a very simple problem that could be solved with a 10-20 line function. No framework required. Either walk the DOM with getElementsByTagName('h1'), getElementsByTagName('h2') or use regular expressions. Loading frameworks comes with performance implications and risks so I suggest not installing one for simple problems.
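A framework-free function of roughly that size might look like this sketch. It assumes the headings have already been collected in document order into an array of `{ level, text, id }` objects (for instance from `document.querySelectorAll('h1, h2')`, with an id assigned to each heading); `buildToc` and `escapeHtml` are hypothetical names.

```javascript
// Escape text so heading content cannot inject markup into the list.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// headings: array of { level: number, text: string, id: string },
// in document order. Returns a nested <ul> as an HTML string.
function buildToc(headings) {
  var html = '<ul>';
  var depth = 1;
  headings.forEach(function (h) {
    // Open or close nested lists until the depth matches the heading level.
    while (depth < h.level) { html += '<li><ul>'; depth++; }
    while (depth > h.level) { html += '</ul></li>'; depth--; }
    html += '<li><a href="#' + escapeHtml(h.id) + '">' +
      escapeHtml(h.text) + '</a></li>';
  });
  // Close any lists that are still open.
  while (depth > 1) { html += '</ul></li>'; depth--; }
  return html + '</ul>';
}
```

The returned string can be assigned to the `innerHTML` of whatever element should hold the table of contents.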