Google Docs Apps script - how to remove really empty paragraphs - google-apps-script

How can we remove really empty paragraphs from a Google Document?
The code below also removes paragraphs that contain images, horizontal rules, etc. I can't work out how to check whether a paragraph is really empty.
getText() and editAsText() don't help here.
var doc = DocumentApp.getActiveDocument();
var body = doc.getBody();
var docParagraphs = doc.getBody().getParagraphs();
for (var i = 0; i < docParagraphs.length; i++) {
if (docParagraphs[i].getText() === '') {
docParagraphs[i].removeFromParent();
}
}
I know there are topics about a similar problem, but not the same one: How to find and remove blank paragraphs in a Google Document with Google Apps Script?
UPD: correct answer:
var doc = DocumentApp.getActiveDocument();
var body = doc.getBody();
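for (var i = 0; i < body.getParagraphs().length; i++) {
  // Re-read the list on every pass because removals change it.
  var docParagraphs = body.getParagraphs();
  if (docParagraphs[i].getText() === '') {
    if (docParagraphs[i].getNumChildren() == 0 && i < (docParagraphs.length - 1)) {
      docParagraphs[i].removeFromParent();
      i--; // the next paragraph has shifted into position i
    }
  }
}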
I also changed the for condition so that i is bounded by the current paragraph count rather than a count stored in a variable. Because paragraphs are deleted (or, in other implementations, added) during the loop, a stored count becomes stale and the loop would try to read paragraphs that no longer exist.
The extra condition && i < (...) is there because the last paragraph can't be deleted; trying to remove it throws an error.

You will want to check whether the paragraph has any children:
paragraph.getNumChildren() //returns the number of children
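A minimal sketch of that check, assuming a paragraph counts as really empty only when it has no text and no child elements. The helper names isReallyEmpty and removeReallyEmptyParagraphs are hypothetical; only getText(), getNumChildren(), and removeFromParent() come from the Document service. The sketch walks a snapshot of the paragraph list backwards and skips the final entry, since the last paragraph cannot be removed.

```javascript
// Hypothetical helper: true only when the paragraph has no text AND no
// child elements (inline images, horizontal rules, and so on).
function isReallyEmpty(paragraph) {
  return paragraph.getText() === '' && paragraph.getNumChildren() === 0;
}

// Hypothetical helper: removes really empty paragraphs from a list such as
// the one returned by body.getParagraphs(). The last entry is skipped
// because the final paragraph of a document cannot be removed.
function removeReallyEmptyParagraphs(paragraphs) {
  var removed = 0;
  for (var i = paragraphs.length - 2; i >= 0; i--) {
    if (isReallyEmpty(paragraphs[i])) {
      paragraphs[i].removeFromParent();
      removed++;
    }
  }
  return removed; // number of paragraphs removed
}
```

In a script bound to a document, this would presumably be called as removeReallyEmptyParagraphs(DocumentApp.getActiveDocument().getBody().getParagraphs()).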

Related

How to delete selected text in a Google doc using Google Apps Script

In a Google document, is there a way to delete selected text with Google Apps Script? The criterion for finding the text to delete is not a string; it is a position relative to a bookmark. This question is related to a workaround for my open question: https://webapps.stackexchange.com/questions/166391/how-to-move-cursor-to-a-named-bookmark-using-google-apps-script
Here is code I wish worked.
function UpdateBookmarkedText() {
var doc = DocumentApp.getActiveDocument();
var bookmarks = doc.getBookmarks();
for (var i = 0; i < bookmarks.length; i++){
// Delete the old text, one step in a longer process.
var text = bookmarks[i].getPosition().getElement().asText().editAsText();
var range = doc.newRange().addElementsBetween(text, 5, text, 7).build(); // arbitrary offsets for testing
doc.setSelection(range); // The selected range is successfully highlighted in the document.
doc.deleteSelection(); // This command does not exist.
} }
This documentation seems relevant but is over my head: https://developers.google.com/docs/api/how-tos/move-text
Use deleteText()
You may use the following script as the basis for your script:
function deleteSelectedText() {
var selection = DocumentApp.getActiveDocument().getSelection();
if (selection) {
var elements = selection.getRangeElements();
if (elements[0].getElement().editAsText) {
var text = elements[0].getElement().editAsText();
if (elements[0].isPartial()) {
text.deleteText(elements[0].getStartOffset(), elements[0].getEndOffsetInclusive());
}
}
}
}
This is a modified version of the script featured in the Class Range guide. The modification works for a selection within a single paragraph, such as selected sentences. The for loop from the sample script is therefore no longer necessary, since the script operates on a single element/paragraph.
Optimized Script:
function test() {
  var selection = DocumentApp.getActiveDocument().getSelection();
  if (!selection) return; // nothing is selected, so there is nothing to delete
  var first = selection.getRangeElements()[0];
  var element = first.getElement();
  // Only delete when the element is editable as text and partially selected.
  if (element.editAsText && first.isPartial()) {
    element.editAsText().deleteText(first.getStartOffset(), first.getEndOffsetInclusive());
  }
}
References:
Class Range
deleteText(startOffset, endOffsetInclusive)
I'm not sure exactly what you're trying to do, so here is a guess. If you are able to select something, you can remove the selected text roughly this way:
function UpdateBookmarkedText() {
var doc = DocumentApp.getActiveDocument();
var bookmarks = doc.getBookmarks();
for (var i = 0; i < bookmarks.length; i++){
// Delete the old text, one step in a longer process.
var text = bookmarks[i].getPosition().getElement().asText().editAsText();
var range = doc.newRange().addElementsBetween(text, 5, text, 7).build(); // arbitrary offsets for testing
doc.setSelection(range); // The selected range is successfully highlighted in the document.
// the way to handle a selection
// from the official documentation
// https://developers.google.com/apps-script/reference/document/range
var selection = DocumentApp.getActiveDocument().getSelection();
if (selection) {
var elements = selection.getRangeElements();
for (let element of elements) {
if (element.getElement().editAsText) {
var text = element.getElement().editAsText();
if (element.isPartial()) {
text.deleteText(element.getStartOffset(), element.getEndOffsetInclusive());
} else {
text.setText(''); // not sure about this line
}
}
}
}
}
}

Remove Horizontal Line in Google Doc with Google Apps Script

I have a Google Doc with text followed by a Horizontal Line below. If the user selects "NO" from a ui.alert, I need to remove all this text (simple using regex) and the horizontal line. I have no clue how to remove this Horizontal Line via Google Apps Script. Can't find anything about it in the documentation. Anyone have any ideas? Thanks!
var regExpFirstBriefing = "[A-Z \(\)]{42}\\v+[A-Za-z\.\", ]*[\\v+]{1}"; // This accounts for all the text I need removed along with an extra new line. The horizontal line is the next line.
// Ask user if this is the first briefing
var responseFirstBriefing = ui.alert('Question here...' , ui.ButtonSet.YES_NO);
if (responseFirstBriefing == ui.Button.YES) {
document.replaceText(regExpFirstBriefing, '');
}
You want to remove the searched text in Google Document.
You want to delete "HORIZONTAL_RULE" below the text.
You want to run above when the user selects "NO" from a ui.alert.
You want to achieve this using Google Apps Script.
If my understanding is correct, how about this sample script? I'm not sure what your actual Document looks like, so I imagined it from your explanation and prepared a sample script. Please think of this as just one of several possible answers. The flow of this sample script is as follows.
Flow:
Search text is searched using findText().
Put the element of searched text in an array.
This array is used for deleting element.
Search "HORIZONTAL_RULE" below the searched text.
In this case, when "HORIZONTAL_RULE" is not adjacent to the searched text, it is searched for within offsetValue paragraphs. In this sample, up to 3 paragraphs ahead are searched.
When "HORIZONTAL_RULE" is found, the element is put to the array.
Delete elements in the array.
As in your script, the searched text is cleared, but its paragraph is not deleted.
For "HORIZONTAL_RULE", the paragraph containing it is deleted, as your question asks.
Reflecting the above flow in a script gives the following.
Sample script:
When you run the script, the texts searched with regExpFirstBriefing are cleared and "HORIZONTAL_RULE" below the text is also removed.
function myFunction() {
var document = DocumentApp.getActiveDocument(); // Added
var ui = DocumentApp.getUi(); // Added
var regExpFirstBriefing = "[A-Z \(\)]{42}\\v+[A-Za-z\.\", ]*[\\v+]{1}";
var responseFirstBriefing = ui.alert('Question here...' , ui.ButtonSet.YES_NO);
if (responseFirstBriefing == ui.Button.YES) {
document.replaceText(regExpFirstBriefing, '');
// I added below script.
} else if (responseFirstBriefing == ui.Button.NO) {
var offsetValue = 3; // When "HORIZONTAL_RULE" is not adjacent to the searched text, it is searched for within "offsetValue" paragraphs. In this sample, up to 3 paragraphs ahead are searched.
var body = document.getBody();
var r = body.findText(regExpFirstBriefing);
var remove = [];
while (r) {
remove.push(r.getElement().asText())
var parentParagraph = body.getChildIndex(r.getElement().getParent());
var totalChildren = body.getNumChildren();
for (var offset = 1; offset <= offsetValue; offset++) {
if (parentParagraph + offset < totalChildren) { // getChild() accepts indices 0 to totalChildren - 1
var nextParagraph = body.getChild(parentParagraph + offset);
if (nextParagraph.getType() === DocumentApp.ElementType.PARAGRAPH) {
var c = nextParagraph.asParagraph().getNumChildren();
for (var i = 0; i < c; i++) {
var childOfNextParagraph = nextParagraph.asParagraph().getChild(i);
if (childOfNextParagraph.getType() === DocumentApp.ElementType.HORIZONTAL_RULE) {
remove.push(childOfNextParagraph.asHorizontalRule());
break;
}
}
if (remove[remove.length - 1].getType() === DocumentApp.ElementType.HORIZONTAL_RULE) {
break;
}
}
}
}
r = body.findText(regExpFirstBriefing, r);
}
for (var i = remove.length - 1; i >=0; i--) {
/////
// If you want to delete the paragraph of searched text, please delete this if statement.
if (remove[i].getType() === DocumentApp.ElementType.TEXT) {
remove[i].removeFromParent();
continue;
}
/////
remove[i].getParent().asParagraph().removeFromParent();
}
}
}
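The look-ahead step in the script above can be isolated as plain logic. The following is only a sketch under assumptions: the function name findRuleAhead is hypothetical, and each body child is represented by an array of type-name strings rather than real Document service elements.

```javascript
// Hypothetical pure-logic sketch: `types` stands in for the element types
// found in each body child, e.g. [['TEXT'], [], ['HORIZONTAL_RULE']].
// Returns the index of the first child within `maxOffset` positions after
// `startIndex` that contains a horizontal rule, or -1 when none is found.
function findRuleAhead(types, startIndex, maxOffset) {
  for (var offset = 1; offset <= maxOffset; offset++) {
    var index = startIndex + offset;
    if (index >= types.length) {
      break; // ran past the last child
    }
    if (types[index].indexOf('HORIZONTAL_RULE') !== -1) {
      return index;
    }
  }
  return -1;
}
```

Returning -1 instead of throwing keeps the caller simple: when no rule is found within the window, only the matched text is cleared.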
Note:
This script supposes that the regex of [A-Z \(\)]{42}\\v+[A-Za-z\.\", ]*[\\v+]{1} works for your Document.
If you want to delete the paragraph of the searched text as well, please delete the following if statement from the above script.
if (remove[i].getType() === DocumentApp.ElementType.TEXT) {
remove[i].removeFromParent();
continue;
}
References:
findText(searchPattern, from)
removeFromParent()
Class HorizontalRule
If I misunderstood your question and this was not the result you wanted, I apologize. In that case, could you provide a sample Document so that I can understand your situation correctly? Of course, please remove any personal information. I would like to confirm the issue from it.

Read a Special Character from bulleted List Item

I am writing a docs script to print something when certain characters appear in the document.
The problem is that I can't read bulleted-list special characters (check marks, for example).
I can read check marks copied from a web page or added by script (using append), but the script will not read a check mark that comes from the bulleted list.
Is there any way to read bulleted-list items?
function myFunction() {
var body = DocumentApp.getActiveDocument().getBody();
var text = [] ;
text = DocumentApp.getActiveDocument().getBody().getParagraphs();
for (var i = 0; i < text.length; i++)
{
if ( text[i].findText(String.fromCharCode(0x2713)) != null)
{
body.appendParagraph("Para");
}
}}

Deleting all content down from the second horizontal line in a document

I'm trying to create a script to delete all text/content downwards from a page. Below you can see the current document.
Currently, the script is set up so that it deletes everything from the text "STARTHERE" downwards, including that text. Instead, I want it to delete everything below the second horizontal line in the image, not including the line itself.
Any ideas on how to delete down from the second horizontal line?
What does deleteText startOffset and endOffsetInclusive actually mean? Is it like a line number or?
Previous Script:
function removeText() {
var body = DocumentApp.getActiveDocument().getBody();
var rangeElement = body.editAsText();
var start = "STARTHERE";
var end = "ENDHERE";
var rangeElement1 = DocumentApp.getActiveDocument().getBody().findText(start);
var rangeElement2 = DocumentApp.getActiveDocument().getBody().findText(end);
if (rangeElement1.isPartial()) {
var startOffset = rangeElement1.getStartOffset();
var endOffset = rangeElement2.getEndOffsetInclusive();
rangeElement1.getElement().asText().deleteText(startOffset,endOffset);
}
}
You'll need to change your approach completely, because findText only finds text, and a horizontal line is not text; it is a special type of document element, HorizontalRule.
(Since you asked: startOffset and endOffsetInclusive are 0-based character positions within an element; e.g., if the text "red" is found in a paragraph that consists of "A big red dog", then startOffset is 6 and endOffsetInclusive is 8. None of this helps here.)
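Plain JavaScript strings use the same 0-based character counting, so the offsets can be checked without a document. This is a sketch only: findOffsets and deleteRange are hypothetical helpers that imitate the inclusive-offset convention on a string; they are not part of the Document service.

```javascript
// Hypothetical helper: returns the 0-based start offset and the inclusive
// end offset of the first occurrence of `needle`, or null if it is absent.
function findOffsets(haystack, needle) {
  var start = haystack.indexOf(needle);
  if (start === -1 || needle.length === 0) {
    return null;
  }
  return { startOffset: start, endOffsetInclusive: start + needle.length - 1 };
}

// Hypothetical helper: removes the characters from startOffset through
// endOffsetInclusive, treating both ends as inclusive.
function deleteRange(text, startOffset, endOffsetInclusive) {
  return text.slice(0, startOffset) + text.slice(endOffsetInclusive + 1);
}

var offsets = findOffsets('A big red dog', 'red');
// offsets.startOffset is 6 and offsets.endOffsetInclusive is 8
var result = deleteRange('A big red dog', offsets.startOffset, offsets.endOffsetInclusive);
// result is 'A big  dog' (the spaces on both sides of the word remain)
```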
Here is my approach: loop over the Paragraph elements, looking for those that contain a HorizontalRule element (with findElement method). Once we found two such paragraphs, delete all subsequent ones.
There is a catch: Apps Script can't delete the last paragraph of a document. For this reason I append an empty paragraph ahead of time and do not delete it.
function removeAfterSecondLine() {
var body = DocumentApp.getActiveDocument().getBody();
body.appendParagraph('');
var para = body.getParagraphs();
var ruleCount = 0;
for (var i = 0; i < para.length - 1; i++) {
if (ruleCount >= 2) {
body.removeChild(para[i]);
}
else if (para[i].findElement(DocumentApp.ElementType.HORIZONTAL_RULE)) {
ruleCount++;
}
}
}
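The counting logic of the function above can be tried out without a document. In this sketch the name indicesAfterSecondRule is hypothetical, and an array of booleans stands in for "does paragraph i contain a HorizontalRule".

```javascript
// Hypothetical pure-logic sketch: `hasRule[i]` says whether paragraph i
// contains a horizontal rule. Returns the indices that come after the
// second rule, leaving out the final paragraph, which cannot be removed.
function indicesAfterSecondRule(hasRule) {
  var ruleCount = 0;
  var toRemove = [];
  for (var i = 0; i < hasRule.length - 1; i++) {
    if (ruleCount >= 2) {
      toRemove.push(i);
    } else if (hasRule[i]) {
      ruleCount++;
    }
  }
  return toRemove;
}
```

Note that, as in the original function, any further horizontal rule after the second one is itself marked for removal.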

Remove line breaks using apps scripts in a Google Document

Trying to work out how to remove multiple line breaks from Google Documents (not spreadsheets).
I've tried this and many variations thereof:
function searchAndReplace() {
var bodyElement = DocumentApp.getActiveDocument().getBody();
bodyElement.replaceText("\r\r", '\r');
}
Any idea please?
Noob to all of this...
Purpose is to replicate the search and replace in MS Word for ^p
Here is a rather "radical" method if your document has only paragraphs with text (images or other elements will be lost). See doc about element types here
(comments in code)
function removeNewLines(){
var doc = DocumentApp.getActiveDocument();
var text = doc.getBody().getText();// get a string
var textMod=text.replace(/\n/g,'');// replace all \n with ''
Logger.log(textMod);//optional check in logger
doc.getBody().clear().appendParagraph(textMod);// empty the doc and append the new text
doc.saveAndClose();// save the result
}
I wanted to do the same thing (replace two new lines with a single new line). Ended up with the following as replaceText() doesn't accept \n for some reason.
function myFunction() {
var body = DocumentApp.getActiveDocument().getBody();
var text = body.editAsText();
var text_content = text.getText();
for(var i = 0, offset_i = 0; i < (text_content.length); i++){
if((text_content.charCodeAt(i)==10) && (text_content.charCodeAt(i-1)==10)){
text.deleteText(i-1-offset_i, i-1-offset_i)
offset_i++;
}
}
}
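On a plain string, the same collapsing can be expressed with a regular expression. This is a sketch under assumptions: collapseBlankLines is a hypothetical helper that works on the string returned by getText(), not on the document itself, so writing the result back as plain text would lose formatting and non-text elements.

```javascript
// Hypothetical helper: collapses every run of two or more newline
// characters into a single newline.
function collapseBlankLines(text) {
  return text.replace(/\n{2,}/g, '\n');
}
```

Unlike the loop above, this does not edit the document in place, which is why it is only useful for checking the expected text or for documents where formatting does not matter.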
This code helped me to remove doubled new lines in document:
function removeDoubleNewLines(){
var doc = DocumentApp.getActiveDocument();
var paragraphs = doc.getBody().getParagraphs();
var paragraph;
for (var i = 0; i < paragraphs.length-1; i++) {
paragraph = paragraphs[i];
if(paragraph.getText() === '' &&
paragraph.getNumChildren() === 0) {
paragraph.removeFromParent();
}
}
}