Split long string to obtain multiple values/strings - google spreadsheet - google-apps-script

I have a complex string from which I need to pull single words and/or multiple words.
Here is the string:
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="5" yahoo:created="2013-07-28T18:37:23Z" yahoo:lang="en-US"><diagnostics><publiclyCallable>true</publiclyCallable><user-time>145</user-time><service-time>141</service-time><build-version>38483</build-version></diagnostics><results><Result xmlns="urn:yahoo:cate">**RED**</Result><Result xmlns="urn:yahoo:cate">**GREEN**</Result><Result xmlns="urn:yahoo:cate">**BLUE**</Result><Result xmlns="urn:yahoo:cate">**A, E, I, O, U **</Result><Result xmlns="urn:yahoo:cate">**SOMETIMES Y**</Result></results></query><!-- total: 145 -->
(I really wish that wouldn't scroll, since it makes it difficult to see the entire picture)
Anyway, I need to be able to pull out the:
RED
GREEN
BLUE
A, E, I, O, U
SOMETIMES Y
++++ By the way, I tried to make those values bold in the big string, but they show up with asterisks instead. Disregard the asterisks; they are not part of the string. I'm leaving them in, since they make the values easier to find when you look at the entire string.
++++
My goal is to turn that complex string into this:
RED|GREEN|BLUE|A, E, I, O, U|SOMETIMES Y
My preference is to do this on the sheet level using a single nested function (or a combination of multiple functions if necessary).
Failing that, a script version would be preferable to nothing.
I've been at this for hours using SPLIT, FIND, SUBSTITUTE, and a few other things that I tried on a whim - just to try everything. But I've now reached the saturation point of thinking clearly on this, and I'm hoping that someone can put me on a path for how to attack this logically.
I'm truly stumped (and frustrated).
==========================================
I said that I'd post the solution if I figured out the sheet-level solution, so this is it:
=mid(substitute(substitute(regexreplace(mid(A1,find("<Result",A1),find("</query",A1)-find("<Result",A1)),"<.*?>+","-"),"--","|"),"-","|"),2,len(substitute(substitute(regexreplace(mid(A1,find("<Result",A1),find("</query",A1)-find("<Result",A1)),"<.*?>+","-"),"--","|"),"-","|"))-2)

Have you considered using XmlService Services? https://developers.google.com/apps-script/reference/xml-service
Simple example:
/* CODE FOR DEMONSTRATION PURPOSES */
function testXML() {
  var result = [];
  var document = XmlService.parse('<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="5" yahoo:created="2013-07-28T18:37:23Z" yahoo:lang="en-US"><diagnostics><publiclyCallable>true</publiclyCallable><user-time>145</user-time><service-time>141</service-time><build-version>38483</build-version></diagnostics><results><Result xmlns="urn:yahoo:cate">RED</Result><Result xmlns="urn:yahoo:cate">GREEN</Result><Result xmlns="urn:yahoo:cate">BLUE</Result><Result xmlns="urn:yahoo:cate">A, E, I, O, U</Result><Result xmlns="urn:yahoo:cate">SOMETIMES Y</Result></results></query><!-- total: 145 -->');
  // All <Result> children of the first <results> element.
  var entries = document.getRootElement().getChildren('results')[0].getChildren();
  for (var i = 0, len = entries.length; i < len; ++i) {
    result.push(entries[i].getText());
  }
  Logger.log(result.join('|'));
}
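If XmlService is not an option, the same extraction can be roughly sketched with a regular expression in plain JavaScript. This is only a sketch: the function name `extractResults` is made up for illustration, and it assumes each `<Result>` element holds plain text with no nested tags or escaped entities.

```javascript
// Hypothetical helper (not part of any Google API): collects the text of
// every <Result> element and joins the values with "|".
// Assumes <Result> bodies are plain text without nested tags.
function extractResults(xml) {
  var pattern = /<Result\b[^>]*>([\s\S]*?)<\/Result>/g;
  var values = [];
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    values.push(match[1].trim());
  }
  return values.join('|');
}

var sample = '<results>' +
  '<Result xmlns="urn:yahoo:cate">RED</Result>' +
  '<Result xmlns="urn:yahoo:cate">A, E, I, O, U </Result>' +
  '</results>';
console.log(extractResults(sample)); // RED|A, E, I, O, U
```

In an Apps Script project a function like this could probably also be called from a cell as a custom function, for example =extractResults(A1), which would cover the sheet-level preference.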

Vertical html table without repeating th tags

I'm generating a table using xslt, but for this question I'll keep that side out of it, as it relates more to the actual generated structure of a html table.
What I do is make a vertical table as follows, which suits the layout needed for the data concerned that originated in a spreadsheet. Example is contrived for brevity, actual data fields contain lengthy strings and many more fields.
Title: something or rather bla bla
Description: very long desription
Field1: asdfasdfasdfsdfsd
Field2: asdfasfasdfasdfsdfjasdlfksdjaflk
Title: another title
Description: another description
Field1:
Field2: my previous field was blank but this one is not, anyways
etc.
The only way so far I found to generate such a html table is using repeating tags for every field and every record e.g.:
<tr><th>Title</th><td>something or rather bla bla</td></tr>
<tr><th>Description</th><td>very long desription</td></tr>
...
<tr><th>Title</th><td>another title</td></tr>
<tr><th>Description</th><td>another description</td></tr>
...
Of course this is semantically incorrect but produces correct visual layout. I need it to be semantically correct html, as that's the only sane way of later attaching a filtering javascript facility.
The following is semantically correct, but it produces an extremely wide table with a single set of field headers on the left:
<tr><th>Title</th><td>something or rather bla bla</td><td>another title</td></tr>
<tr><th>Description</th><td>very long desription</td><td>another description</td></tr>
...
So to summarise, I need an HTML table (or other HTML structure) where the records sit one under another visually with repeating field headers, but the field headers must not be repeated in the actual code, because that would wreck any record-based filtering added later on.
Yo. Thanks for updating your question, and including some code. Typically you'd also post what you've tried to correct this issue - but I'm satisfied enough with this post.
Since you want the repeating headers in vertical layout (not something I've seen often, but I can understand the desire), you don't have to modify the HTML formatting, just use a bit more JavaScript to figure it out. I haven't gone through and checked to see if I'm doing things efficiently (I'm probably not, since there are so many loops), but in my testing the following can attach to a vertical table and filter using a couple variables to indicate how many rows there are in each entry.
Firstly, here's the HTML I'm testing this one with. Notice I have a div with the id of filters, and each of my filter inputs has a custom attribute named filter that matches the header of the rows they are supposed to filter:
<div id='filters'>
Title: <input filter='Title'><br>
Desc: <input filter='Description'>
</div>
<table>
<tr><th>Title</th><td>abcd</td></tr>
<tr><th>Description</th><td>efgh</td></tr>
<tr><th>Title</th><td>ijkl</td></tr>
<tr><th>Description</th><td>mnop</td></tr>
<tr><th>Title</th><td>ijkl</td></tr>
<tr><th>Description</th><td>mdep</td></tr>
<tr><th>Title</th><td>ijkl</td></tr>
<tr><th>Description</th><td>mnop</td></tr>
<tr><th>Title</th><td>ijkl</td></tr>
<tr><th>Description</th><td>mnop</td></tr>
</table>
Here are the variables I use at the start:
var filterTable = $('table');
var rowsPerEntry = 2;
var totalEntries = filterTable.find('tbody tr').length / rowsPerEntry; // .length replaces the deprecated .size()
var currentEntryNumber = 1;
var currentRowInEntry = 0;
And this little loop will add a class for each entry (based on the rowsPerEntry as seen above) to group the rows together (this way all rows for an entry can be selected together with a class selector in jQuery):
filterTable.find('tbody tr').each(function(){
    $(this).addClass('entry' + currentEntryNumber);
    currentRowInEntry += 1;
    if(currentRowInEntry == rowsPerEntry){
        currentRowInEntry = 0;
        currentEntryNumber += 1;
    }
});
And the magic; on keyup for the filters run a loop through the total number of entries, then a nested loop through the filters to determine if that entry does not match either filter's input. If either field for the entry does not match the corresponding filter value, then we add the entry number to our hide array and move along. Once we've determined which entries should be hidden, we can show all of the entries, and hide the specific ones that should be hidden:
$('#filters input').keyup(function(){
    var hide = [];
    for(var i = 0; i < totalEntries; i++){
        var entryNumber = i + 1;
        if($.inArray(entryNumber, hide) == -1){
            $('#filters input').each(function(){
                var val = $(this).val().toLowerCase();
                var fHeader = $(this).attr('filter');
                var fRow = $('.entry' + entryNumber + ' th:contains(' + fHeader + ')').closest('tr');
                if(fRow.find('td').text().toLowerCase().indexOf(val) == -1){
                    hide.push(entryNumber);
                    return false;
                }
            });
        }
    }
    filterTable.find('tbody tr').show();
    $.each(hide, function(k, v){
        filterTable.find('.entry' + v).hide();
    });
});
It's no masterpiece, but I hope it'll get you started down the right path.
Here's a fiddle too: https://jsfiddle.net/bzjyfejc/
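To make the matching rule easier to check outside the browser, here is a rough DOM-free sketch of the same idea. The function name `entriesToHide` and the `{header, text}` row shape are assumptions made up for this illustration, not part of the jQuery code above: rows are grouped into entries of `rowsPerEntry`, and an entry is hidden as soon as one filter fails to match its row.

```javascript
// Hypothetical DOM-free model of the filter logic: rows is a flat list of
// {header, text} pairs in table order, filters maps header -> search text.
function entriesToHide(rows, rowsPerEntry, filters) {
  var hide = [];
  var totalEntries = Math.floor(rows.length / rowsPerEntry);
  for (var i = 0; i < totalEntries; i++) {
    // Rows belonging to entry i (0-based here; the jQuery code is 1-based).
    var entryRows = rows.slice(i * rowsPerEntry, (i + 1) * rowsPerEntry);
    var matchesAll = Object.keys(filters).every(function (header) {
      var needle = filters[header].toLowerCase();
      return entryRows.some(function (row) {
        return row.header === header &&
          row.text.toLowerCase().indexOf(needle) !== -1;
      });
    });
    if (!matchesAll) hide.push(i + 1); // report 1-based entry numbers
  }
  return hide;
}

var rows = [
  { header: 'Title', text: 'abcd' }, { header: 'Description', text: 'efgh' },
  { header: 'Title', text: 'ijkl' }, { header: 'Description', text: 'mnop' },
  { header: 'Title', text: 'ijkl' }, { header: 'Description', text: 'mdep' }
];
// Entries 1 and 3 fail at least one filter, so they are the ones to hide.
console.log(entriesToHide(rows, 2, { Title: 'ij', Description: 'mn' }));
```

Keeping the decision separate from the show/hide calls also makes it simple to change `rowsPerEntry` when the real table has more fields per record.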

insertListItem(index, text) Wrong Index

I am trying to use Google Apps Script to replace text into a Docs template and save it as a .pdf. I am mostly successful, but I am having one problem. I'd like for the script to search for a text in the template, replace the text with provided text, using bullets. It will ignore any extra \n that may have been placed into the text. Here is an example text:
Today was a good day.
Tomorrow will be a good day.
Yesterday was a decent day.
In my document, I would like the text to replace _text_ in a line: Comments: _text_. Ultimately, what should print out is the following:
Comments:
- Today was a good day.
- Tomorrow will be a good day.
- Yesterday was a decent day.
This is the code that I have so far, but it is not working too well. If anyone could offer any help, it would be greatly appreciated.
var listr = "";
var trunc = text.split("\n"); // where text is to be placed into the template
var index = b.findText("_text_").getStartOffset(); // var b is getBody()
for (var j = (trunc.length - 1); j >= 0; j--)
if(!trunc[j].equals("")) b.insertListItem(index, trunc[j]);
b.replaceText("_text_", "");
Any help would be much appreciated. I am having the hardest time understanding the concept of the indexes in Google Docs. Thank you.
Hello. Just wanted to let you know how I have ended up implementing this:
var trunc = text.split("\n"); // where text is to be placed into the template
var index = b.getChildIndex(b.findText("foo").getElement().getParent()) + 1;
for (var j = (trunc.length - 1); j >= 0; j--)
if (trunc[j] != "") b.insertListItem(index, trunc[j]);
Hope that helps. It pushes the elements back on to each other backwards.
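The "backwards" part can be illustrated without the Docs API. The sketch below uses a plain array as a stand-in for the body's children (an assumption for illustration only; `insertLinesAt` is a made-up name and `splice` plays the role of `insertListItem`). Because every insertion happens at the same index, each new item pushes the previous ones down, so walking the lines in reverse leaves them in their original order.

```javascript
// Plain-array stand-in for the document body (not the real DocumentApp API):
// inserting at a fixed index pushes earlier insertions down, so walking the
// lines backwards leaves them in their original order.
function insertLinesAt(children, index, text) {
  var lines = text.split('\n');
  var result = children.slice(); // leave the input array untouched
  for (var j = lines.length - 1; j >= 0; j--) {
    // Skip blank lines, mirroring the trunc[j] != "" check.
    if (lines[j] !== '') result.splice(index, 0, '- ' + lines[j]);
  }
  return result;
}

var body = ['Comments:', 'Footer'];
var text = 'Today was a good day.\n\nTomorrow will be a good day.';
console.log(insertLinesAt(body, 1, text));
```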
Ok well this crude code seems to insert at the beginning of the PARAGRAPH containing the TEXT, which appears to be a separate child element. Probably this particular code will only work if the text is not inside a sub-table, sub-list, etc... but maybe it will help.
var element = DocumentApp.create('newDoc').getBody()
.appendListItem('testing').copy();
var index = b.getChildIndex(
b.findText('_text_').getElement().getParent().asParagraph() );
b.insertListItem(index, element);
I've had a long day, so maybe I can improve it later. What I think was mostly missing was the getChildIndex() function. Also, when using the newDoc, insertListItem() gave a weird "Element must be detached." message until I used .copy(), so mental note there.

Possible multiple enumeration of IEnumerable when counting and skipping

I'm preparing data for a datatable in Linq2Sql
This code highlights as a 'Possible multiple enumeration of IEnumerable' (in Resharper)
// filtered is an IEnumerable or an IQueryable
var total = filtered.Count();
var displayed = filtered
.Skip(param.iDisplayStart)
.Take(param.iDisplayLength).ToList();
And I am 100% sure Resharper is right.
How do I rewrite this to avoid the warning?
To clarify, I get that I can put a ToList on the end of filtered to only do one query to the database, e.g.:
var filteredAndRun = filtered.ToList();
var total = filteredAndRun.Count();
var displayed = filteredAndRun
.Skip(param.iDisplayStart)
.Take(param.iDisplayLength).ToList();
but this brings back a ton more data than I want to transport over the network.
I'm expecting that I can't have my cake and eat it too. :(
It sounds like you're more concerned with multiple enumeration of IQueryable<T> rather than IEnumerable<T>.
However, in your case, it doesn't matter.
The Count call should translate to a simple and very fast SQL count query. It's only the second query that actually brings back any records.
If it is an IEnumerable<T> then the data is in memory and it'll be super fast in any case.
I'd keep your code exactly the same as it is and only worry about performance tuning when you discover you have a significant performance issue. :-)
You could also do something like
var count = 0;
var displayed = new List<Item>(); // Item is a placeholder for the element type
var iDisplayStop = param.iDisplayStart + param.iDisplayLength;
foreach (var element in filteredAndRun)
{
    ++count;
    // count is 1-based here, so the page covers (iDisplayStart, iDisplayStop]
    if (count <= param.iDisplayStart || count > iDisplayStop)
        continue;
    displayed.Add(element);
}
That's a sketch rather than tested code, but the algorithm gets you the count with only a single iteration, and you have the list of displayed items at the end.

How to sort var length ids (composite string + numeric)?

I have a MySQL database whose keys are of this type:
A_10
A_10A
A_10B
A_101
QAb801
QAc5
QAc25
QAd2993
I would like them to sort first by the alpha portion, then by the numeric portion, just like above. I would like this to be the default sorting of this column.
1) how can I sort as specified above, i.e. write a MySQL function?
2) how can I set this column to use the sorting routine by default?
some constraints that might be helpful: the numeric portion of my ID's never exceeds 100,000. I use this fact in some javascript code to convert my ID's to strings concatenating the non-numeric portion with the (number + 1,000,000). (At the time I had not noticed the variations/subparts as above such as A_10A, A_10B, so I'll have to revamp that part of my code.)
The best way to achieve what you want is to store each part in its own column, and I would strongly recommend to change table structure. If it's impossible, you can try the following:
Create 3 UDFs which return the prefix, numeric part, and postfix of your string. For better performance they should be native (MySQL, like any other RDBMS, is not really good at complex string parsing). Then you can call these functions in an ORDER BY clause or in a trigger body which validates your column. In any case, it will work slower than if you create 3 columns.
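The three-part split is easier to see outside SQL. The JavaScript sketch below is only an illustration of the idea, not a MySQL UDF: `splitId` and `compareIds` are hypothetical names, and it assumes every ID has the shape "non-digits, then digits, then an optional suffix".

```javascript
// Hypothetical helper: splits an ID into prefix, numeric part and suffix.
// Assumes the shape <non-digits><digits><anything>, e.g. "A_10B".
function splitId(id) {
  var m = /^(\D*)(\d*)([\s\S]*)$/.exec(id);
  return {
    prefix: m[1],
    number: m[2] === '' ? 0 : parseInt(m[2], 10),
    suffix: m[3]
  };
}

// Orders by prefix, then by numeric value, then by suffix.
function compareIds(a, b) {
  var x = splitId(a), y = splitId(b);
  if (x.prefix !== y.prefix) return x.prefix < y.prefix ? -1 : 1;
  if (x.number !== y.number) return x.number - y.number;
  if (x.suffix !== y.suffix) return x.suffix < y.suffix ? -1 : 1;
  return 0;
}

var ids = ['QAc25', 'A_101', 'QAd2993', 'A_10B', 'QAc5', 'A_10', 'QAb801', 'A_10A'];
console.log(ids.slice().sort(compareIds).join(' '));
// A_10 A_10A A_10B A_101 QAb801 QAc5 QAc25 QAd2993
```

The same three values are what the separate columns would hold, so an ORDER BY over those columns would give this ordering without any parsing at query time.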
No simple answer that I know of. I had something similar a while back but had to use jQuery to sort it. So what I did was first get the output into a JavaScript array. Then you may want to zero-pad your numbers. Separate the alpha parts from the numerics using a regex, then reassemble the array:
var zarr = [];
for (var i = 0; i < val.length; i++) {
    // Split the ID into alternating runs of digits and non-digits.
    var chunks = val[i].match(/(\d+|[^\d]+)/g);
    var padded = '';
    for (var s = 0; s < chunks.length; s++) {
        // Pad numeric runs to 5 digits; leave the other runs untouched.
        if (isNaN(chunks[s]))
            padded += chunks[s];
        else
            padded += zeroPad(chunks[s], 5);
    }
    // Push the reassembled ID so each entry stays in one piece.
    zarr.push(padded);
}
function zeroPad(num, count) {
    var numZeropad = num + '';
    while (numZeropad.length < count) {
        numZeropad = "0" + numZeropad;
    }
    return numZeropad;
}
You'll end up with an array like this:
A_00100
QAb00801
QAc00005
QAc00025
QAd02993
Then you can do a natural sort. I know you may want to do it through straight MySQL, but I am not too sure whether it does natural sorting.
Good luck!

Are there functions in ColdFusion to get just 2 lines of text from a string?

I know this works in other languages, but wanted to see if there is existing code/functions.
This string can be populated from numerous different queries, but they need to be all displayed the same way, same length etc.
I have a function, to control string length by word count, but I would prefer to make sure that I have at least 2 sentences or 2 lines of text at most.
Thanks
I had a similar task at my job. You have to pick an arbitrary number, and it looks like you've chosen 190. That being said, you can't just hope that the characters/words returned are relevant. You have to ensure that they are if it's something you care about, which it seems like you do, looking at your comments.
Try to find the keyword in the string and use the mid() function to get a certain number of characters on either side of the keyword:
<cfscript>
    max_chars = 190;
    full_article = #the full article#;
    keyword_position = find(keyword, full_article);
    if( keyword_position != 0 ) {
        // start so that the keyword sits roughly in the middle of the excerpt
        excerpt = mid(full_article,
            int(keyword_position + len(keyword) / 2 - max_chars / 2),
            max_chars);
    }
</cfscript>
...or something like that. I'll leave it to you to make sure that you're not trying to get characters before the start of the full_article, or after the end of it, and adding ellipses and stuff.
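The bounds checking left as an exercise could look roughly like the following, written in JavaScript rather than CFML purely for illustration. The name `excerptAround`, the fallback to the start of the text when the keyword is missing, and the three-dot ellipsis are all assumptions of this sketch.

```javascript
// Hypothetical helper: returns up to maxChars characters roughly centered on
// the first occurrence of keyword, clamped to the bounds of the text.
function excerptAround(text, keyword, maxChars) {
  var pos = text.indexOf(keyword);
  // Falling back to the start of the text is an assumption of this sketch.
  if (pos === -1) return text.slice(0, maxChars);
  var center = pos + Math.floor(keyword.length / 2);
  var start = center - Math.floor(maxChars / 2);
  // Clamp so the window never starts before 0 or runs past the end.
  start = Math.max(0, Math.min(start, text.length - maxChars));
  var end = Math.min(text.length, start + maxChars);
  return (start > 0 ? '...' : '') + text.slice(start, end) +
    (end < text.length ? '...' : '');
}

console.log(excerptAround('abcdefghijklmnopqrstuvwxyz', 'm', 6)); // ...jklmno...
```

Note that JavaScript strings are 0-based while ColdFusion's mid() is 1-based, so the clamping limits would need shifting by one when porting this back.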
Try something like fullLeft, or dig through the other string manipulation UDFs at CFLib. If you're looking for something more specific, could you show us a comparable function in another language? We'd be better able to point you to something similar.
_TestString = "I know this works in other languages, but wanted to see if there is existing code/functions. This string can be populated from numerous different queries, but they need to be";
if ( len(_TestString) GT 190)
{
_TestString = Left(_TestString,190) & "...";
}
That will output:
I know this works in other languages, but wanted to see if there is existing code/functions. This string can be populated from numerous different queries, but they need to be all displayed t...
You probably don't want to do anything more than that. String manipulation can get expensive for no reason, and you shouldn't waste processing on the display layer unless you have to.
CFLIB has plenty of string manipulation functions on offer. You may find abbreviate() is useful, especially for search results: http://cflib.org/udf/abbreviate
<cfscript>
/**
 * Abbreviates a given string to roughly the given length, stripping any tags, making sure the ending doesn't chop a word in two, and adding an ellipsis character at the end.
 * Fix by Patrick McElhaney
 * v3 by Ken Fricklas kenf@accessnet.net, takes care of too many spaces in text.
 *
 * @param string String to use. (Required)
 * @param len Length to use. (Required)
 * @return Returns a string.
 * @author Gyrus (gyrus@norlonto.net)
 * @version 3, September 6, 2005
 */
function abbreviate(string,len) {
    // Replace tags with spaces, then collapse runs of whitespace.
    var newString = REReplace(string, "<[^>]*>", " ", "ALL");
    var lastSpace = 0;
    newString = REReplace(newString, " \s*", " ", "ALL");
    if (len(newString) gt len) {
        newString = left(newString, len-2);
        // Cut back to the last space so a word is not chopped in two.
        lastSpace = find(" ", reverse(newString));
        lastSpace = len(newString) - lastSpace;
        newString = left(newString, lastSpace) & " &##8230;";
    }
    return newString;
}
</cfscript>