I've got a quick one for someone who can help. I've downloaded some data from Yahoo and want to split it into an N x 7 array (is that the correct term?).
I want it to look like this:
[[2013-01-29,64.25,65.03,64.00,64.24,4883100,64.24],[2013-01-28,64.51,64.87,63.27,64.59,7591300,64.59],...]
But as you can see below, it's not in that format yet. I'm a novice at JavaScript. Please help.
function fetchData() {
  var ticker='YUM';
  var startMonth=0; var startDate=1; var startYear=2013;
  var endMonth=0; var endDate=25; var endYear=2013;
  // Every query parameter, including e, has to be separated by "&".
  var fetchString="http://ichart.finance.yahoo.com/table.csv?s="+ticker+"&a="+startMonth+"&b="+startDate+"&c="+startYear+"&d="+endMonth+"&e="+endDate+"&f="+endYear+"&g=d";
  var response = UrlFetchApp.fetch(fetchString);
  var a=response.getContentText();
  var allData = a.slice(a.indexOf("2013"));
}
Assuming you don't want the column headers, this is a one line change:
var allData = a.match(/(.*?)\n/g) // convert each line to a row
.splice(1) // remove headers row
.map(function(row){
return row.replace(/\n/,'').split(',');
}); // convert row string to array
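The regex above only captures lines that end in \n, so a final row without a trailing newline would be dropped. Here is a minimal sketch of an alternative, assuming the response is plain comma-separated text with one header row and no quoted fields. csvToRows is a hypothetical helper name, not part of Apps Script, and the sample text is made up from the rows in the question:

```javascript
// Hypothetical helper: turns CSV text into an array of rows (N x 7 for this data).
// Assumes no quoted fields that contain commas or line breaks.
function csvToRows(csvText) {
  return csvText
    .split(/\r?\n/)               // one entry per line, Windows or Unix line endings
    .filter(function (line) {     // drop blank lines, e.g. after a trailing newline
      return line.trim() !== '';
    })
    .slice(1)                     // skip the header row
    .map(function (line) {        // split each remaining line into its columns
      return line.split(',');
    });
}

var sample = 'Date,Open,High,Low,Close,Volume,Adj Close\n' +
  '2013-01-29,64.25,65.03,64.00,64.24,4883100,64.24\n' +
  '2013-01-28,64.51,64.87,63.27,64.59,7591300,64.59';
var rows = csvToRows(sample);
```

Every cell is still a string at this point; Number(cell) would convert the numeric columns if needed.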
I have the following data example in a google sheet:
url
https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-vsfd
https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-sdawr|
What I need is to extract the data after the substring "images/" and have something like this:
url | extract
https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-vsfd | testimage-vsfd
https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-sdawr | testimage-sdawr
I have created the following function to do this, but it only extracts everything after the last "-":
function strip() {
const ss = SpreadsheetApp.getActive();
const sh = ss.getSheetByName("Sheet6");
const vs = sh.getRange(2,1,sh.getLastRow() - 1).getDisplayValues().flat();
let vo = vs.map(s => [s.match(/\b[0-9A-Zaz/]+$/gi)[0]]);
sh.getRange(2,2,vo.length,1).setValues(vo);
}
What is the proper way to extract the data as described above?
You could use this in Apps Script:
function strip() {
const ss = SpreadsheetApp.getActive();
const sh = ss.getSheetByName("Sheet6");
const vs = sh.getRange(2,1,sh.getLastRow() - 1).getDisplayValues().flat();
const string= "/images/";
for (let i = 0; i < vs.length; i++){
//Using substrings:
const extract = vs[i].substring(vs[i].indexOf(string) + string.length);
sh.getRange(i+2,2).setValue(extract);
//Using .split():
// const extract = vs[i].split(string); //This splits the string in 2.
// sh.getRange(i+2,2).setValue(extract[1]); //Adding the second part of the array;
}
}
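Calling setValue once per row makes one spreadsheet call per URL. Here is a rough sketch of building the whole output column in memory first, so that it can be written with a single setValues call. buildExtractColumn is a hypothetical name, and returning an empty string when the marker is missing is an assumption:

```javascript
// Hypothetical helper: maps a flat list of URLs to a 2D column ([[v1], [v2], ...]),
// which is the shape Range.setValues expects for a one-column range.
function buildExtractColumn(urls, marker) {
  return urls.map(function (url) {
    var pos = url.indexOf(marker);
    // Assumption: rows without the marker get an empty cell.
    return [pos === -1 ? '' : url.substring(pos + marker.length)];
  });
}

var column = buildExtractColumn([
  'https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-vsfd',
  'https://www.testwebsite.com/no-marker-here'
], '/images/');
// In Apps Script the result could then be written in one call:
// sh.getRange(2, 2, column.length, 1).setValues(column);
```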
If you want to do it as a custom function, you can try the following code:
function strip(url) {
var text = url;
var splittedValue = text.split("/images/");
return splittedValue[1];
}
It would work something like this: entering =strip(A2) in a cell returns the part of the URL in A2 that follows "/images/".
The script can also be changed to get a specific range of data automatically so that every time you add a new URL you get the result in the next column automatically, but this is just for you to get the idea.
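When a custom function is called with a range, for example =stripAfterImages(A2:A10), Sheets passes it a two-dimensional array rather than a single string. Here is a sketch that handles both cases, under the assumption that cells without "/images/" should come back empty. stripAfterImages is a hypothetical name:

```javascript
// Hypothetical custom function: accepts either a single cell value or a range.
function stripAfterImages(input) {
  var marker = '/images/';

  // Returns the text after the first occurrence of the marker, or '' if it is absent.
  function extract(value) {
    var text = String(value);
    var pos = text.indexOf(marker);
    return pos === -1 ? '' : text.substring(pos + marker.length);
  }

  // A range arrives as an array of rows; map over every cell and keep the shape.
  if (Array.isArray(input)) {
    return input.map(function (row) {
      return row.map(extract);
    });
  }
  return extract(input);
}

var single = stripAfterImages('https://www.testwebsite.com/glob/images/testimage-vsfd');
var ranged = stripAfterImages([
  ['https://www.testwebsite.com/glob/images/testimage-sdawr'],
  ['']
]);
```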
References:
Custom Functions in Google Sheets
Say your URLs are in A2:A. You can use
=arrayformula(if(isblank(A2:A),,substitute(REGEXEXTRACT(A2:A,"/images/[A-Za-z0-9-_|/\.]+"),"/images/","")))
Use native formulas where possible; they are more efficient.
If you have already dealt with the run-time delay and need a custom function for other reasons, you can match the text including the "/images/" part and then remove it, or alternatively specify a capturing group. Also don't forget other valid characters such as _ and |.
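The capturing-group variant could look roughly like this in a script. The character class is meant to cover the same characters as the formula; extractImageName is a hypothetical name, and returning an empty string when nothing matches is an assumption:

```javascript
// Hypothetical helper: the parentheses capture only the part after "/images/".
function extractImageName(url) {
  var match = /\/images\/([A-Za-z0-9_|\/.-]+)/.exec(url);
  return match ? match[1] : '';
}

var nameA = extractImageName(
  'https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-vsfd');
var nameB = extractImageName(
  'https://www.testwebsite.com/compute/v1/test/images-prd-5d4d/glob/images/testimage-sdawr|');
var nameC = extractImageName('https://www.testwebsite.com/other/path');
```

Because | is part of the character class, a trailing | in the cell is kept in the result, as in nameB.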
I have a long list of websites that I want to search for a specific string, in this case "cast area". I have come up with this script that looks through the list (I have confined it to the rows that I know have a valid output). The log says "Logging output too large. Truncating output." I have read online that this shouldn't matter: the content is just too big for the log, and it doesn't mean the script has given up on looking through the rest.
function getData() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheetByName('URLs');
var firstURL = 24673
var lastURL = 24678
var i
for (i=firstURL; i<lastURL; i++) {
var data = sheet.getRange(i,1).getValue();
var response = UrlFetchApp.fetch(data).getContentText();
Logger.log(data)
//Logger.log(response)
if (response.toLowerCase().indexOf("cast area")>-1) {
Logger.log(1)
}
}
}
To test it out, here is the code with the url included that has the words "CAST AREA" in the Notes section of the page. I'm hoping the log should return the number 1 to show that it works.
function getData() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheetByName('URLs');
var data = "https://elanthipedia.play.net/Faenella%27s_Grace";
var response = UrlFetchApp.fetch(data).getContentText();
Logger.log(data)
//Logger.log(response)
if (response.indexOf("CAST AREA")>-1) {
Logger.log(1)
}
}
If you want to inspect a content text that is too large for the logging output, write the data somewhere else.
The simplest solution would be to write it into a spreadsheet.
Sample:
function getData() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheetByName('URLs');
var data = "https://elanthipedia.play.net/Faenella%27s_Grace";
var response = UrlFetchApp.fetch(data).getContentText();
sheet.getRange(sheet.getLastRow()+1,1).setValue(response);
}
Obviously, to present the data in a nicer way, it makes sense to parse the response and write the different parts into different columns of the spreadsheet.
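A single cell holds at most 50,000 characters, so a full page of HTML may not fit into one cell. Here is a minimal sketch of splitting the text into pieces first. chunkText is a hypothetical helper, and the chunk size of 45,000 is an assumption chosen to stay under that limit:

```javascript
// Hypothetical helper: splits a long string into a one-column 2D array
// whose cells each hold at most chunkSize characters.
function chunkText(text, chunkSize) {
  var rows = [];
  for (var i = 0; i < text.length; i += chunkSize) {
    rows.push([text.substring(i, i + chunkSize)]);
  }
  return rows;
}

// Stand-in for a large response body (120,000 characters).
var fakeResponse = 'x'.repeat(120000);
var chunks = chunkText(fakeResponse, 45000);
// In Apps Script the chunks could be written below the existing data in one call:
// sheet.getRange(sheet.getLastRow() + 1, 1, chunks.length, 1).setValues(chunks);
```

An empty response produces no rows, so the setValues call should be skipped in that case.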
I am looking for help from this community regarding the below issue.
// I am searching my Gmail inbox for a specific email
function getWeeklyEmail() {
var emailFilter = 'newer_than:7d AND label:inbox AND "Report: Launchpad filter"';
var threads = GmailApp.search(emailFilter, 0, 5);
var messages=[];
threads.forEach(function(threads)
{
messages.push(threads.getMessages()[0]);
});
return messages;
}
// Trying to parse the HTML table contained within the email
function getParsedMsg() {
var messages = getWeeklyEmail();
var msgbody = messages[0].getBody();
var doc = XmlService.parse(msgbody);
var html = doc.getRootElement();
var tables = doc.getDescendants();
var templ = HtmlService.createTemplateFromFile('Messages1');
templ.tables = [];
return templ.evaluate();
}
The debugger crashes when I try to step over the XmlService.parse function. The msgbody of the email contains both text and HTML formatted table. I am getting the following error: TypeError: Cannot read property 'getBody' of undefined (line 19, file "Code")
If I remove the getParsedMsg function and instead just display the content of the email, I get the email body along with the element tags etc in html format.
Workaround
Hi! The issue you are experiencing is due to (as you previously mentioned) XmlService only accepting well-formed XML rather than arbitrary HTML. One possible workaround is to search the string you obtain from getBody() for the tags you need.
In your case the main issue is var doc = XmlService.parse(msgbody);. To solve it, you could look through the string for the table tags you need using JavaScript's search method. Here is an example piece of code retrieving an email with a single table:
function getWeeklyEmail() {
var emailFilter = 'newer_than:7d AND label:inbox AND "Report: Launchpad filter"';
var threads = GmailApp.search(emailFilter, 0, 5);
var messages=[];
threads.forEach(function(thread)
{
messages.push(thread.getMessages()[0]);
});
return messages;
}
// Trying to parse the HTML table contained within the email
function getParsedMsg() {
var messages = getWeeklyEmail();
var msgbody = messages[0].getBody();
var indexOrigin = msgbody.search('<table');
var indexEnd = msgbody.search('</table');
// Get what is in between those indexes of the string.
// Add 8 (the length of '</table>') because indexEnd points at the '<' of the closing tag.
var Table = msgbody.substring(indexOrigin,indexEnd+8);
Logger.log(Table);
}
If you are looking for more than one table in your message, you can change getParsedMsg to the following:
function getParsedMsg() {
  var messages = getWeeklyEmail();
  // Stop early if the search returned no messages; messages[0] would be undefined.
  if (messages.length === 0) return;
  var msgbody = messages[0].getBody();
  var Table = [];
  var start = msgbody.indexOf('<table');
  // Go over each table in turn and store its string in an element of the array.
  // Nested tables are not handled.
  while (start !== -1) {
    var end = msgbody.indexOf('</table', start);
    if (end === -1) break;
    // Add 8 (the length of '</table>') to include the closing tag.
    Table.push(msgbody.substring(start,end+8));
    // Continue searching after the table that was just stored.
    start = msgbody.indexOf('<table', end+8);
  }
  Logger.log(Table);
}
This will let you store each table in an element of an array. If you want to use them, you just need to retrieve the elements of this array and use them accordingly (for example, as HTML tables).
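Once a table is available as a string, its cells can be pulled out in a similar way. Here is a minimal sketch, assuming simple, non-nested tables. tableHtmlToRows is a hypothetical helper and the sample markup is made up. A regex is not a general HTML parser, and HTML entities are left as they are:

```javascript
// Hypothetical helper: converts one '<table>...</table>' string into rows of cell text.
function tableHtmlToRows(tableHtml) {
  var rows = [];
  var rowPattern = /<tr[^>]*>([\s\S]*?)<\/tr>/gi;
  var rowMatch;
  while ((rowMatch = rowPattern.exec(tableHtml)) !== null) {
    var cells = [];
    var cellPattern = /<t[dh][^>]*>([\s\S]*?)<\/t[dh]>/gi;
    var cellMatch;
    while ((cellMatch = cellPattern.exec(rowMatch[1])) !== null) {
      // Strip any remaining tags inside the cell and trim whitespace.
      cells.push(cellMatch[1].replace(/<[^>]+>/g, '').trim());
    }
    rows.push(cells);
  }
  return rows;
}

var sampleTable = '<table><tr><th>Bug</th><th>Status</th></tr>' +
  '<tr><td><a href="#">1234</a></td><td> Open </td></tr></table>';
var parsed = tableHtmlToRows(sampleTable);
```

The result is a 2D array, so it has the shape that Range.setValues expects, provided every row has the same number of cells.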
I hope this has helped you. Let me know if you need anything else or if you did not understand something. :)
I am trying to scrape a table of price data from this website using the following code:
function scrapeData() {
// Retrieve table as a string using Parser.
var url = "https://stooq.com/q/d/?s=barc.uk&i=d";
var fromText = '<td align="center" id="t03">';
var toText = '</td>';
var content = UrlFetchApp.fetch(url).getContentText();
var scraped = Parser.data(content).from(fromText).to(toText).build();
//Parse table using XmlService.
var root = XmlService.parse(scraped).getRootElement();
}
I took this method from an approach I used in a similar question here; however, it's failing on this particular URL and giving me the error:
Error on line 1: Content is not allowed in prolog. (line 12, file "Stooq")
In related questions here and here, they talk about unaccepted textual content being submitted to the parser; however, I am unable to apply the solutions from those questions to my own problem. Any help would be much appreciated.
How about this modification?
Modification points:
In this case, the retrieved HTML has to be modified before it can be parsed. For example, in the content returned by var content = UrlFetchApp.fetch(url).getContentText(), the attribute values are not enclosed in quotes. These need to be fixed.
There is a merged column in the header.
When the above points are reflected in the script, it becomes as follows.
Modified script:
function scrapeData() {
// Retrieve table as a string using Parser.
var url = "https://stooq.com/q/d/?s=barc.uk&i=d";
var fromText = '#d9d9d9}</style>';
var toText = '<table';
var content = UrlFetchApp.fetch(url).getContentText();
var scraped = Parser.data(content).from(fromText).to(toText).build();
// Modify values
scraped = scraped.replace(/=([a-zA-Z0-9\%-:]+)/g, "=\"$1\"").replace(/nowrap/g, "");
// Parse table using XmlService.
var root = XmlService.parse(scraped).getRootElement();
// Retrieve header and modify it.
var headerTr = root.getChild("thead").getChildren();
var res = headerTr.map(function(e) {return e.getChildren().map(function(f) {return f.getValue()})});
res[0].splice(7, 0, "Change");
// Retrieve values.
var valuesTr = root.getChild("tbody").getChildren();
var values = valuesTr.map(function(e) {return e.getChildren().map(function(f) {return f.getValue()})});
Array.prototype.push.apply(res, values);
// Put the result to the active spreadsheet.
var ss = SpreadsheetApp.getActiveSheet();
ss.getRange(1, 1, res.length, res[0].length).setValues(res);
}
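To see what the "Modify values" step does in isolation, here is a small sketch that applies the same two replacements to a made-up fragment. XmlService needs attribute values in quotes and cannot handle valueless attributes such as nowrap. quoteAttributeValues is a hypothetical name and the sample markup is an assumption, not the real page content:

```javascript
// Hypothetical helper wrapping the two replacements from the script above.
function quoteAttributeValues(html) {
  return html
    .replace(/=([a-zA-Z0-9\%-:]+)/g, '="$1"')  // wrap unquoted values in double quotes
    .replace(/nowrap/g, '');                    // drop the valueless nowrap attribute
}

var fixed = quoteAttributeValues('<td align=center id=t03 nowrap>1</td>');
// fixed is '<td align="center" id="t03" >1</td>'
```

The first pattern also matches "=" followed by such characters outside of tags, for example inside a URL in the text, so the output is worth checking when the page layout changes.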
Note:
Before you run this modified script, please install the Parser library for GAS.
This modified script does not work for arbitrary URLs; it is written for the URL in your question. If you want to retrieve values from another URL, please modify the script accordingly.
Reference:
Parser
XmlService
If this was not what you wanted, I'm sorry.
I have a FlexTable with checkboxes in the first cell of each row; when a checkbox is checked, the data from that FlexTable row is collected in a variable. Now I need to create a document containing a table with the data from that variable. I tried to store the string value in a Hidden widget, but it doesn't work and I can't figure out how to do it.
All my code (though not really mine; almost half of it is #Sergeinsas's) is available here: http://pastebin.com/aYmyA7N2. Thank you in advance.
There are a few errors in your code... widgets like hidden can only have string values and they can only return string values when you retrieve their values.
One possible and easy way to convert arrays to string (and back) is to use a combination of join() and split() , here is the modified code (relevant part only) that works.
// Storing checked rows
function check(e) {
var checkedArray = [];
var data = sh.getRange(1,1,lastrow,lastcol).getValues();
for(var n=0; n < data.length;++n){
if(e.parameter['check'+n]=='true'){
checkedArray.push(data[n].join(','));// convert data row array to string with comma separator
}
}
var hidden = app.getElementById('hidden');
hidden.setValue(checkedArray.join('|'));// convert array to string with | separator
return app;
}
function click(e) {
var hiddenVal = e.parameter.hidden.split('|');// e.parameter.hidden is a string; split it back into an array of strings, each of which must be split again to get the original array of arrays
var d = new Date();
var time = d.toLocaleTimeString();
var table = []
for(var n in hiddenVal){
table.push(hiddenVal[n].split(','));// reconstruction of a 2D array
}
DocumentApp.create('doc '+time).getBody().appendTable(table);// the table is in the document
}
Full code available here
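The join() and split() approach breaks if a cell itself contains a comma or a | character. As an alternative technique (not the one used above), the array could be stored as JSON. Here is a minimal sketch; packRows and unpackRows are hypothetical names, and converting every cell back to a string is an assumption made because the document table is built from strings:

```javascript
// Hypothetical helpers: serialize a 2D array to one string and back.
function packRows(rows) {
  return JSON.stringify(rows);
}

function unpackRows(text) {
  // Convert every cell to a string so the result is a 2D array of strings.
  return JSON.parse(text).map(function (row) {
    return row.map(String);
  });
}

var original = [['Smith, John', 'a|b'], ['Doe', 42]];
var packed = packRows(original);      // a single string, suitable for a hidden widget
var restored = unpackRows(packed);
```

In the handlers above, this would mean calling hidden.setValue(packRows(...)) in check and unpackRows(e.parameter.hidden) in click.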
EDIT : suggestion : if you put your headers in your spreadsheet you could retrieve them in your final table quite easily like this :
function check(e) {
var checkedArray = [];
var data = sh.getRange(1,1,lastrow,lastcol).getValues();
checkedArray.push(data[0].join(','));// if you have headers in your spreadsheet, you could add headers by default
for(var n=0; n < data.length;++n){
if(e.parameter['check'+n]=='true'){
checkedArray.push(data[n].join(','));
}
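}
var hidden = app.getElementById('hidden');
hidden.setValue(checkedArray.join('|'));// same as before: convert array to string with | separator
return app;
}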
You could also use data[0] in the doGet function to build the header of your UI. I think this would make your code easier to maintain, without hardcoding data... but this is only a suggestion ;-)