How do I get data from a webpage using Google Apps Script? - google-apps-script

I looked at the existing questions (such as this one: Get data from webpage using Google Apps Script and Yahoo Query Language), which are similar to my query, but had no luck.
How do I get data from bseindia.com using UrlFetchApp? Here is the page: http://www.bseindia.com/stock-share-price/stockreach_financials.aspx?scripcode=532343&expandable=0
For example, how do I get the Revenue for Dec 14 from the above page? In this case, the code should return 2,652.91.
I tried this:
function getit(){
var response = UrlFetchApp.fetch("http://www.bseindia.com/stock-share-price/stockreach_financials.aspx?scripcode=532343&expandable=0");
var cut = response.substring(str.indexOf("<td class=TTRow_right>"),response.length);
var value = cut.substring(0, cut.indexOf("</td>"));
Logger.log(number);
}
The error I get is:
TypeError: Cannot find function substring in object
What I am doing is definitely not correct anyway, because every revenue number on the page starts with the same "td class=TTRow_right".

The response you are getting is not a string but an HTTPResponse object, so you cannot call substring(..) on it.
Call response.getContentText() first to get the body as a string, and run the string operations on that.
You can find more details in the documentation: http-response
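As a minimal sketch of what that could look like: the body is converted to a string and then searched for the first table cell that follows a row label. The helper name firstCellAfter and the label "Revenue" are assumptions for illustration, and the real page markup may differ. Note that the snippet in the question also references str and number, which are never defined.

```javascript
// Hypothetical helper: returns the text of the first <td> cell that comes
// after the first occurrence of `label`, or null when nothing is found.
function firstCellAfter(html, label) {
  var start = html.indexOf(label);
  if (start === -1) return null;
  // Search only the part of the page that follows the label.
  var rest = html.substring(start);
  var match = rest.match(/<td[^>]*>([^<]*)<\/td>/);
  return match ? match[1].trim() : null;
}

// Apps Script usage (defined but not called here; UrlFetchApp and Logger
// only exist inside Apps Script).
function getit() {
  var url = "http://www.bseindia.com/stock-share-price/stockreach_financials.aspx?scripcode=532343&expandable=0";
  var html = UrlFetchApp.fetch(url).getContentText(); // a string, not an HTTPResponse
  Logger.log(firstCellAfter(html, "Revenue")); // "Revenue" is an assumed row label
}
```

Anchoring the search on the row label avoids the problem that every number cell shares the same class.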

Related

How to fix error in google sheet custom script [duplicate]

I have a .tsv file from a tool and I have to import it into Google Sheets in (nearly) real time for reports. This is my code for importing:
function importBigTSV(url) {return Utilities.parseCsv(UrlFetchApp.fetch(url).getContentText(),'\t');}
It worked until a few days ago; now the error message keeps saying "Exceeded maximum execution time (line 0)".
Could anyone help? Thank you a lot!
Issue:
As @TheMaster said, custom functions have a hard limit of 30 seconds, which your function is most probably reaching. Regular Apps Script executions have a much more generous time limit (6 or 30 minutes, depending on your account), so you should modify your function accordingly.
Differences between functions:
In order to transform your function, you have to take into account these basic differences:
You cannot pass parameters to a function called by a Menu or a button. Because of this, you have to find another way to specify the URL to fetch.
Values returned by a regular function don't get automatically written to the sheet. You have to use a writing method (like setValues, or appendRow) to do that.
A non-custom function is not called from any particular cell, so you have to specify where you want to write the values.
Since, from what I understand, you are always fetching the same URL, you can specify that URL just by hardcoding it into your function.
Solution:
The function below, for example, will write the parsed output to the range that is currently selected (at the moment of triggering the function). You could as well provide a default range to write the output to, using getRange:
function importBigTSV() {
  var url = "{url-to-fetch}";
  // Write to the range that is selected when the function is triggered
  var range = SpreadsheetApp.getActiveRange();
  try {
    var output = Utilities.parseCsv(UrlFetchApp.fetch(url).getContentText(), '\t');
    // Resize the target range to match the parsed data
    var outputRange = range.offset(0, 0, output.length, output[0].length);
    outputRange.setValues(output);
  } catch(err) {
    console.log(err);
  }
}
If the URL can change, I'd suggest keeping a list of URLs in the sheet, selecting the desired one before triggering the function, and using getActiveRange to read that URL.
Attaching function to Menu:
In any case, once you have written your function, you have to attach it somehow so that it can be triggered from the sheet itself. You can either create a custom menu, or insert an image or drawing and attach the script to it. The referenced links provide clear and concise steps to achieve this.
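For the custom-menu route, a minimal sketch could look like this. The menu title "Import" and the item caption "Import TSV" are arbitrary names chosen for illustration; importBigTSV is the function defined above.

```javascript
// Simple trigger: runs automatically when the spreadsheet is opened
// and adds a menu whose single item calls importBigTSV.
function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu('Import')                    // assumed menu title
    .addItem('Import TSV', 'importBigTSV')   // caption, name of the function to run
    .addToUi();
}
```

After reloading the spreadsheet, select the top-left cell of the target range and choose the menu item.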
Reference:
Custom Functions > Return values
Custom Menus in G Suite

Creating a UrlFetchApp script to replace the Google Sheet importHTML function

I have used the following formula for about a year, and suddenly it stopped importing the table.
=IMPORTHTML("https://tradingeconomics.com/matrix";"table";1)
It gives me a "Could not fetch url: https://tradingeconomics.com/matrix" error.
I tried various things and one of the interesting findings was that the importHTML works for the cached version, but only in a new sheet under a different Google account. Furthermore, the cached version breaks randomly too.
Thus, it seems I won't get around using a script for this purpose.
Ideally, this script would be flexible enough to have a dedicated function, e.g. importHTMLtable, where the user can input the URL and the table number and it works. It would then work for the formulas I currently use, e.g.
=importHTMLtable("https://tradingeconomics.com/matrix";"table";1)
OR
=importHTMLtable("https://tradingeconomics.com/country-list/business-confidence?continent=world";"table";1)
OR
=importHTMLtable("https://tradingeconomics.com/country-list/ease-of-doing-business";"table";1)
etc...
I'm not sure whether this Github code solves the problem. It seems to only parse text?
I would assume this is a fairly common problem for Google Sheets users, so there might already be an Apps Script out there that does exactly this, and it might import faster too.
I can't program, so I tried copying and pasting code to see if I could get something to work. No luck :(
Can anyone provide code, or maybe an existing Apps Script I'm not aware of, that does exactly this?
Try this way
=importTableHTML(A1,1)
with
function importTableHTML(url, n) {
  // Fetch the page, then strip line breaks, tabs, and double spaces
  var html = UrlFetchApp.fetch(url, {followRedirects: true, muteHttpExceptions: true}).getContentText().replace(/(\r\n|\n|\r|\t|  )/gm, "");
  const tables = [...html.matchAll(/<table[\s\S\w]+?<\/table>/g)];
  // n is 1-based: n = 1 selects the first table on the page
  var trs = [...tables[n-1][0].matchAll(/<tr[\s\S\w]+?<\/tr>/g)];
  var data = [];
  for (var i = 0; i < trs.length; i++) {
    console.log(trs[i][0]);
    var tds = [...trs[i][0].matchAll(/<(td|th)[\s\S\w]+?<\/(td|th)>/g)];
    var prov = [];
    for (var j = 0; j < tds.length; j++) {
      // Keep what lies between the opening tag and the last closing tag
      var donnee = tds[j][0].match(/(?<=\>).*(?=\<\/)/g)[0];
      prov.push(stripTags(donnee));
    }
    data.push(prov);
  }
  return data;
}
function stripTags(body) {
  // Remove any remaining tags and turn non-breaking spaces into plain spaces
  var regex = /(<([^>]+)>)/ig;
  return body.replace(regex, "").replace(/&nbsp;/g, ' ').trim();
}
url-fetch-app#advanced-parameters
matchAll

IMPORTXML- Could not fetch URL

I am trying to scrape data from wine-searcher.com and am having an issue with IMPORTXML in Google Sheets. I keep getting the "could not fetch url" error when trying either of the following:
=IMPORTXML("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa","//h1")
=IMPORTXML("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa","//*[@id='tab-info']/div/div[1]/div[2]/div/div[1]/span[2]/span[2]") (XPath to scrape the current average price)
I've tried suggestions from other Stack posts, such as with/without http/s and www, and both XPath and full XPath, to no avail. I have also tried other URLs and they work without problems, so maybe the problem is the URL length or format? Any help would be appreciated. If it cannot be done with IMPORTXML, are there any free alternatives?
As the page is built with JavaScript on the client side rather than on the server side, you will not be able to retrieve the data with the IMPORTXML / IMPORTHTML functions. However, the page contains JSON that you can retrieve and parse to get the information you need.
function myFunction() {
  var url = 'https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa';
  var source = UrlFetchApp.fetch(url).getContentText();
  var jsonString = source.split('<script type="application/ld+json">')[1].split('</script>')[0];
  var data = JSON.parse(jsonString);
  Logger.log(data);
}
All of the following fields are available, for x = 0 to x = 23:
data.offers[x]['@type']
data.offers[x].priceCurrency
data.offers[x].availability
data.offers[x].priceValidUntil
data.offers[x].url
data.offers[x].name
data.offers[x].seller['@type']
data.offers[x].seller.name
data.offers[x].seller.description
data.offers[x].seller.availableDeliveryMethod
data.offers[x].seller.address['@type']
data.offers[x].seller.address.addressRegion
data.offers[x].seller.address.addressCountry['@type']
data.offers[x].seller.address.addressCountry.name
data.offers[x].priceSpecification['@type']
data.offers[x].priceSpecification.description
data.offers[x].priceSpecification.price
data.offers[x].priceSpecification.priceCurrency
https://docs.google.com/spreadsheets/d/17f6lhaHA_xpSWClzxkYZcNs4FeM4VHA480QrmwyJvT4/copy
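To get those fields into a sheet, the parsed object has to be flattened into rows. The following is only a sketch: offersToRows is a hypothetical helper, the choice of four columns is arbitrary, and it assumes the JSON keeps the structure listed above.

```javascript
// Hypothetical helper: turns the parsed JSON-LD object into a 2D array
// (header row plus one row per offer) that could be passed to setValues().
function offersToRows(data) {
  var rows = [["seller", "region", "price", "currency"]];
  (data.offers || []).forEach(function (offer) {
    // Fall back to empty objects so a missing branch does not throw.
    var seller = offer.seller || {};
    var address = seller.address || {};
    var spec = offer.priceSpecification || {};
    rows.push([
      seller.name || "",
      address.addressRegion || "",
      spec.price !== undefined ? spec.price : "",
      spec.priceCurrency || offer.priceCurrency || ""
    ]);
  });
  return rows;
}
```

In Apps Script, the result could then be written with something like sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows).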
As mentioned, both of these basic formulas return nothing:
=IMPORTXML("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa"; "//*")
=IMPORTDATA("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa")
Please note that importing data into a spreadsheet is URL-specific, so if something works well for www.aaa.org, it most likely won't work for www.bbb.org.

Google Apps Script: count occurrences of string in HTML query

I am trying to get the count of occurrences of a string in the text fetched from a website:
var html = UrlFetchApp.fetch('https://www.larvalabs.com/cryptopunks/details/0000').getContentText();
var offers = html.match('Offered');
Logger.log(offers);
However, I get the following data returned: [Offered]
I tried several methods, but I cannot find much documentation on which ones I can use for this task, which sounds simple.
I should add that I tried to parse the page with XmlService, but some errors in the HTML make it fail.
For example, as one method, how about using matchAll()?
Modified script:
var html = UrlFetchApp.fetch('https://www.larvalabs.com/cryptopunks/details/0000').getContentText();
var offers = [...html.matchAll('Offered')]; // or [...html.matchAll(/Offered/g)]
Logger.log(offers.length);
When I tested the above, 3 was returned.
Note:
In this case, the match is case-sensitive. Please be careful about this.
Reference:
matchAll()
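The reason the original attempt logs [Offered] is that match() with a plain string builds a regular expression without the g flag, so only the first match is returned. If a case-insensitive count is wanted, a small helper along these lines could be used. countOccurrences is a hypothetical name, and this is one possible approach rather than the only one.

```javascript
// Hypothetical helper: counts how often `word` appears in `text`.
// Pass ignoreCase = true to treat upper- and lowercase letters as equal.
function countOccurrences(text, word, ignoreCase) {
  // Escape regex metacharacters so `word` is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var flags = ignoreCase ? 'gi' : 'g';
  var matches = text.match(new RegExp(escaped, flags));
  return matches ? matches.length : 0;
}
```

With the fetched page, this would be called as countOccurrences(html, 'Offered', true).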

Parsing JSON in Google Sheets

I'm working with JSON for the first time, so please excuse my lack of knowledge.
I'm trying to use a JSON file to populate data in a Google Sheet. I just don't know the right syntax. How can I format a JSON function to properly access the data and stop returning an error?
I'm trying to pull data from here:
https://eddb.io/archive/v6/bodies_recently.jsonl
into Google Sheets.
I've got the ImportJSON script loaded and I've tested it with a really small JSON file (http://date.jsontest.com/) and it works as advertised, using this function:
=ImportJSON("http://date.jsontest.com", "/date")
However, when I try to use the same function with the JSON from eddb.io above, I can't get it to work.
What I would like to do is pull the "name" into A1 and then a few of the attributes into columns, like so:
name id type_name rotational_period, etc.
Here's a link to my tests:
https://docs.google.com/spreadsheets/d/1gCKpLcf-ytbPNcuQIIzxp1RMy7N5K8pD02hCLnL27qQ/edit?usp=sharing
How about this workaround?
Reason of issue:
When I saw the URL https://eddb.io/archive/v6/bodies_recently.jsonl, I noticed that the file extension is jsonl. When I checked the values retrieved from that URL, I found that they are JSON Lines. This has already been mentioned in Dimu Designs's comment. I could also confirm that the official documentation says bodies_recently.jsonl is line-delimited JSON.
Workaround:
Unfortunately, ImportJSON cannot directly parse JSON Lines values, so the script has to be modified as a workaround. In your shared Spreadsheet, the ImportJSON script is the container-bound script. Please modify it as follows.
From:
The following function can be seen at lines 130 - 135 in your script editor.
function ImportJSONAdvanced(url, query, options, includeFunc, transformFunc) {
  var jsondata = UrlFetchApp.fetch(url);
  var object = JSON.parse(jsondata.getContentText());
  return parseJSONObject_(object, query, options, includeFunc, transformFunc);
}
To:
Please replace the above function with the following script and save it. Then put =ImportJSON("https://eddb.io/archive/v6/bodies_recently.jsonl", "/id") into a cell again.
function ImportJSONAdvanced(url, query, options, includeFunc, transformFunc) {
  var jsondata = UrlFetchApp.fetch(url);
  var object = jsondata.getContentText().match(/{[\w\s\S].+}/g).map(function(e) {return JSON.parse(e)}); // Modified
  return parseJSONObject_(object, query, options, includeFunc, transformFunc);
}
Note:
Although this modified script works for the values from https://eddb.io/archive/v6/bodies_recently.jsonl, I'm not sure whether this modified script works for all JSON lines values. I apologize for this.
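As a possible alternative to the regular expression, JSON Lines can be parsed by splitting the text on line breaks and parsing each non-empty line separately. This is only a sketch under the assumption that each line holds exactly one JSON value; parseJsonLines is a hypothetical helper and not part of ImportJSON.

```javascript
// Hypothetical helper: parses JSON Lines text (one JSON value per line)
// into an array of parsed values, skipping blank lines.
function parseJsonLines(text) {
  return text
    .split(/\r?\n/)
    .filter(function (line) { return line.trim() !== ""; })
    .map(function (line) { return JSON.parse(line); });
}
```

In the modified ImportJSONAdvanced, the line that builds object could then read var object = parseJsonLines(jsondata.getContentText());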
References:
eddb.io/api
JSON Lines
If I misunderstood your question and this was not the result you want, I apologize.
I'm not at my laptop, but I see you are getting the error SyntaxError: Expected end of stream at char 2028 (line 132).
I think the data you received from the URL is too long.
You can use =IMPORTDATA(E1) to get the whole chunk into the sheet and then REGEXEXTRACT the parts you need.