Parsing JSON in Google Sheets - json

I'm working with JSON for the first time, so please excuse my lack of knowledge.
I'm trying to use a JSON file to populate data in a Google Sheet. I just don't know the right syntax. How can I format a JSON function to properly access the data and stop returning an error?
I'm trying to pull data from here:
https://eddb.io/archive/v6/bodies_recently.jsonl
into a Google Sheet.
I've got the ImportJSON script loaded and I've tested it with a really small JSON file (http://date.jsontest.com/) and it works as advertised, using this function:
=ImportJSON("http://date.jsontest.com", "/date")
However, when I try to use the same function with the JSON from eddb.io above, I can't get it to work.
What I would like to do is pull the "name" into A1 and then a few of the attributes into columns, like so:
name id type_name rotational_period, etc.
Here's a link to my tests:
https://docs.google.com/spreadsheets/d/1gCKpLcf-ytbPNcuQIIzxp1RMy7N5K8pD02hCLnL27qQ/edit?usp=sharing

How about this workaround?
Reason of issue:
Looking at the URL https://eddb.io/archive/v6/bodies_recently.jsonl, I noticed that the file extension is jsonl. When I checked the values retrieved from that URL, I found that they are in JSON Lines format. Dimu Designs's comment already mentions this. I could also confirm that the official document says bodies_recently.jsonl is line-delimited JSON.
Workaround:
Unfortunately, ImportJSON cannot directly parse JSON Lines values, so the script has to be modified as a workaround. In your shared Spreadsheet, ImportJSON is installed as the container-bound script, and that is the script I modified. Please modify it as follows.
From:
The following function can be found at lines 130 - 135 in your script editor.
function ImportJSONAdvanced(url, query, options, includeFunc, transformFunc) {
  var jsondata = UrlFetchApp.fetch(url);
  var object = JSON.parse(jsondata.getContentText());
  return parseJSONObject_(object, query, options, includeFunc, transformFunc);
}
To:
Please replace the above function with the following script and save it. Then put =ImportJSON("https://eddb.io/archive/v6/bodies_recently.jsonl", "/id") into a cell again.
function ImportJSONAdvanced(url, query, options, includeFunc, transformFunc) {
  var jsondata = UrlFetchApp.fetch(url);
  var object = jsondata.getContentText().match(/{[\w\s\S].+}/g).map(function(e) {return JSON.parse(e)}); // Modified
  return parseJSONObject_(object, query, options, includeFunc, transformFunc);
}
Note:
Although this modified script works for the values from https://eddb.io/archive/v6/bodies_recently.jsonl, I'm not sure whether it works for all JSON Lines data. I apologize for this.
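As a possible alternative to the regular expression, here is a minimal sketch that relies only on the JSON Lines rule of one JSON value per line. The helper name parseJsonLines and the sample records are assumptions made for illustration; they are not part of ImportJSON or of the eddb.io data. It is plain JavaScript, so it should also run in Apps Script.

```javascript
// Hypothetical helper: parse JSON Lines text, one JSON value per non-empty line.
function parseJsonLines(text) {
  return text
    .split(/\r?\n/)                                        // split on LF or CRLF
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })   // skip blank lines
    .map(function (line) { return JSON.parse(line); });    // parse each line on its own
}

// Made-up sample in the general shape described in the question.
var sample = '{"id": 1, "name": "Alpha", "type_name": "Star"}\n' +
             '{"id": 2, "name": "Beta", "type_name": "Planet"}\n';
var records = parseJsonLines(sample);
```

If this approach is used inside ImportJSONAdvanced, the array returned by parseJsonLines(jsondata.getContentText()) would take the place of the object variable. I have not tested that against the full file.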
References:
eddb.io/api
JSON Lines
If I misunderstood your question and this was not the result you want, I apologize.

I'm not with my laptop, but I see you getting the error SyntaxError: Expected end of stream at char 2028 (line 132).
I think the data you received from the URL is too long.

you can use =IMPORTDATA(E1) and get the whole chunk into sheets and then REGEXEXTRACT all parts you need

Related

How to fix error in google sheet custom script [duplicate]

I have a .tsv file from a tool and I have to import it into a Google Sheet in (nearly) real time for reports. This is my code for importing:
function importBigTSV(url) {return Utilities.parseCsv(UrlFetchApp.fetch(url).getContentText(),'\t');}
It worked until a few days ago, when error messages started saying "Exceeded maximum execution time (line 0)."
Could anyone help? Thank you a lot!
Issue:
As @TheMaster said, custom functions have a hard limit of 30 seconds, which your function is most probably reaching. Regular Apps Script executions have a much more generous time limit (6 or 30 minutes, depending on your account), so you should modify your function accordingly.
Differences between functions:
In order to transform your function, you have to take into account these basic differences:
You cannot pass parameters to a function called by a Menu or a button. Because of this, you have to find another way to specify the URL to fetch.
Values returned by a regular function don't get automatically written to the sheet. You have to use a writing method (like setValues, or appendRow) to do that.
A non-custom function is not called in any particular cell, so you have to specify where you want to write the values.
Since, from what I understand, you are always fetching the same URL, you can specify that URL just by hardcoding it into your function.
Solution:
The function below, for example, will write the parsed output to the range that is currently selected (at the moment of triggering the function). You could as well provide a default range to write the output to, using getRange:
function importBigTSV() {
  var url = "{url-to-fetch}";
  // Top-left cell of the current selection is where the output starts
  var range = SpreadsheetApp.getActiveRange();
  try {
    var output = Utilities.parseCsv(UrlFetchApp.fetch(url).getContentText(),'\t');
    // Resize the target range to match the parsed data
    var outputRange = range.offset(0, 0, output.length, output[0].length);
    outputRange.setValues(output);
  } catch(err) {
    console.log(err);
  }
}
If the URL can change, I'd suggest keeping a list of URLs in the sheet. Before triggering the function, select the cell with the desired URL, and use getActiveRange to read that URL.
Attaching function to Menu:
In any case, once you have written your function, you have to attach it to something so that it can be triggered from the sheet itself. You can either create a custom menu, or insert an image or drawing and attach the script to it. The referenced links provide clear and concise steps to achieve this.
Reference:
Custom Functions > Return values
Custom Menus in G Suite

IMPORTXML- Could not fetch URL

I am trying to scrape data from wine-searcher.com and am having an issue with IMPORTXML in Google Sheets. I keep getting the "could not fetch url" error when trying either of the following:
=IMPORTXML("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa","//h1")
=IMPORTXML("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa","//*[@id='tab-info']/div/div[1]/div[2]/div/div[1]/span[2]/span[2]") (XPath to scrape the current average price)
I've tried suggestions from other posts, such as with and without http/https and www, and both XPath and full XPath, to no avail. I have also tried other URLs and they work with no problem, so maybe the problem is the URL length or format? Any help would be appreciated. If it cannot be done with IMPORTXML, are there any free alternatives?
As the page is built with JavaScript on the client side and not on the server side, you will not be able to retrieve the data with the IMPORTXML / IMPORTHTML functions. However, the page contains a JSON block which you can retrieve and parse to get the information you need.
function myFunction() {
  var url = 'https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa';
  var source = UrlFetchApp.fetch(url).getContentText();
  var jsonString = source.split('<script type="application/ld+json">')[1].split('</script>')[0];
  var data = JSON.parse(jsonString);
  Logger.log(data);
}
All of the following information is available, for x = 0 to x = 23. Keys that start with @ need bracket notation in JavaScript:
data.offers[x]["@type"]
data.offers[x].priceCurrency
data.offers[x].availability
data.offers[x].priceValidUntil
data.offers[x].url
data.offers[x].name
data.offers[x].seller["@type"]
data.offers[x].seller.name
data.offers[x].seller.description
data.offers[x].seller.availableDeliveryMethod
data.offers[x].seller.address["@type"]
data.offers[x].seller.address.addressRegion
data.offers[x].seller.address.addressCountry["@type"]
data.offers[x].seller.address.addressCountry.name
data.offers[x].priceSpecification["@type"]
data.offers[x].priceSpecification.description
data.offers[x].priceSpecification.price
data.offers[x].priceSpecification.priceCurrency
https://docs.google.com/spreadsheets/d/17f6lhaHA_xpSWClzxkYZcNs4FeM4VHA480QrmwyJvT4/copy
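To get those fields into a sheet, the parsed object has to be turned into a two-dimensional array first. The sketch below shows one way this might be done. The helper name offersToRows, the choice of columns, and the sample object are assumptions for illustration; the real page data may have missing or differently shaped fields.

```javascript
// Return the value, or an empty string when the field is missing.
function valueOrBlank(v) {
  return (v === undefined || v === null) ? '' : v;
}

// Hypothetical helper: one header row plus one row per offer.
function offersToRows(data) {
  var rows = [['seller', 'price', 'currency', 'url']];
  (data.offers || []).forEach(function (offer) {
    var seller = offer.seller || {};
    var spec = offer.priceSpecification || {};
    rows.push([
      valueOrBlank(seller.name),
      valueOrBlank(spec.price),
      valueOrBlank(spec.priceCurrency),
      valueOrBlank(offer.url)
    ]);
  });
  return rows;
}

// Made-up sample with the same field names as the list above.
var sampleData = {
  offers: [
    {
      '@type': 'Offer',
      url: 'https://example.com/offer-a',
      seller: { name: 'Shop A' },
      priceSpecification: { price: 120, priceCurrency: 'USD' }
    },
    { '@type': 'Offer', url: 'https://example.com/offer-b' } // seller and price missing
  ]
};
var rows = offersToRows(sampleData);
```

In Apps Script, an array like rows could then be written with sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows).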
As mentioned, both of these basic formulas return nothing:
=IMPORTXML("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa"; "//*")
=IMPORTDATA("https://www.wine-searcher.com/find/robert+mondavi+rsrv+cab+sauv+napa+valley+county+north+coast+california+usa")
Please note that importing data into a spreadsheet is URL specific, so if something works well for www.aaa.org, it most likely won't work for www.bbb.org.

another IMPORTXML returning empty content

When I input
=IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","//h2")
in my google sheet, I get: #N/A Imported content is empty.
However, when I input:
=IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","*")
I get some content, so I can presume that access to the page is not blocked.
And the page contains several h2 tags without any doubt.
So what's the issue?
You want to know the reason for the following behavior.
=IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","//h2") returns #N/A Imported content is empty.
=IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","*") returns the content.
If my understanding is correct, how about this answer?
Issue:
When I looked at the HTML data of http://www.ilgiornale.it/autore/franco-battaglia.html, I noticed a problem in it. It is as follows.
window.jQuery || document.write("<script src='/sites/all/modules/jquery_update/replace/jquery/jquery.min.js'>\x3C/script>")
In this case, the script tag is closed with \x3C/script> instead of </script>. It seems that when IMPORTXML retrieves this line, it treats the script tag as not closed. I could confirm that when \x3C is converted to <, =IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","//h2") correctly returns the values of the h2 tags.
This seems to be why =IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","//h2") returns #N/A Imported content is empty.
As for why =IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","*") returns content: when I used this formula, I couldn't find the values of the script tag in the result. From this, I thought that the script tag might have an issue, which is how I found the problem above. I could confirm that when \x3C is converted to <, =IMPORTXML("http://www.ilgiornale.it/autore/franco-battaglia.html","*") returns the values including those of the script tag.
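To make the escaped sequence concrete, here is a small sketch of the replacement on a single line of text. The sample line is shortened from the one quoted above, and the variable names are only for illustration. The point is that the page source contains the four literal characters \x3C, so the regular expression has to match a literal backslash.

```javascript
// The page source contains the literal characters \x3C (backslash, x, 3, C).
// In a JavaScript string literal, "\\" stands for one backslash.
var line = "document.write(\"<script src='jquery.min.js'>\\x3C/script>\")";

// In the regular expression, \\ matches one literal backslash.
var fixed = line.replace(/\\x3C/g, '<');
```

After the replacement, fixed contains a normal closing script tag instead of the escaped one.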
Workarounds:
To avoid the above issue, \x3C has to be replaced with <. So how about the following workarounds? In these workarounds, I used Google Apps Script. Please think of them as just two of several possible workarounds.
Pattern 1:
In this pattern, first download the HTML data from the URL and fix the problem. Then save the modified HTML as a file, share the file, and retrieve its URL. The values are retrieved using this URL.
Sample script:
function myFunction() {
  var url = "http://www.ilgiornale.it/autore/franco-battaglia.html";
  // Replace the literal \x3C with < so the script tag is closed normally
  var data = UrlFetchApp.fetch(url).getContentText().replace(/\\x3C/g, "<");
  var file = DriveApp.createFile("htmlData.html", data, MimeType.HTML);
  file.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
  var endpoint = "https://drive.google.com/uc?id=" + file.getId() + "&export=download";
  Logger.log(endpoint);
}
When you use this script, first run the function myFunction() and retrieve the endpoint. As a test case, put the endpoint in cell "A1" and put =IMPORTXML(A1,"//h2") in cell "A2". The values can then be retrieved.
Pattern 2:
In this pattern, the values of the h2 tags are retrieved directly by parsing the HTML data and are put into the active Spreadsheet.
Sample script:
function myFunction() {
  var url = "http://www.ilgiornale.it/autore/franco-battaglia.html";
  // Collect every <h2>...</h2> block from the raw HTML
  var data = UrlFetchApp.fetch(url).getContentText().match(/<h2[\s\S]+?<\/h2>/g);
  var xml = XmlService.parse("<temp>" + data.join("") + "</temp>");
  var h2Values = xml.getRootElement().getChildren("h2").map(function(e) {return [e.getValue()]});
  var sheet = SpreadsheetApp.getActiveSheet();
  // Append the values below the last used row, in column A
  sheet.getRange(sheet.getLastRow() + 1, 1, h2Values.length, 1).setValues(h2Values);
  Logger.log(h2Values);
}
When you run the script, the values of the h2 tags are put directly into the active Spreadsheet.
References:
Class UrlFetchApp
Class XmlService
If I misunderstood your question and this was not the direction you want, I apologize.

Keep getting 'Null' when pulling data from google sheets

Whenever I try to pull data from 2 of my 3 sheets using the 'google.script.run' function call from JavaScript, I keep getting an error saying the array I am returning is null. But when I change the exact same function call to work on another sheet, it returns the data perfectly.
I have tried deleting the sheets and recreating them with the same names, using 'openWithURL' instead of 'getActive' to access the spreadsheet, rewriting the code, running the same code in a different project, and checking the documentation to make sure I am not missing any detail. I have tried changing the references to the sheets; some work and some don't.
var SS = SpreadsheetApp.getActive();
var DB_BOOKINGS = SS.getSheetByName("BookingDatabase");
var DB_VEHICLES = SS.getSheetByName("VehicleDatabase");
var DB_REQUESTS = SS.getSheetByName("RequestDatabase");
function getRequestData(){
  return DB_REQUESTS.getDataRange().getValues();
}
<script>
function getRequestData(callingFunction) {
  google.script.run
    .withSuccessHandler(callingFunction)
    .withFailureHandler(CustomAlert)
    .getRequestData();
}
</script>
I want to retrieve the sheet data but keep getting a null value
Since this is an issue with formatting, as you said, try using getDisplayValues() rather than getValues(). This will pull the data as you see it in the sheet (as strings), rather than the unformatted data itself.
Reference:
getDisplayValues
I was having a similar problem. This was exactly the solution I needed. For my situation, I was able to use getValues() successfully on the initial page load, but when I tried to run it again as a sort of 'refresh' to update the values without reloading the entire page, it would return null.
My data did indeed contain dates, so after changing it to getDisplayValues(), it worked perfectly.
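The comments about dates suggest the likely cause: as far as I know, google.script.run cannot pass Date objects between server and client, and getValues() returns Date objects for date cells. If the raw values are needed rather than the display strings, another option might be to convert the dates to strings before returning. The helper name toSerializableRows and the sample data below are assumptions for illustration.

```javascript
// Hypothetical helper: replace Date cells with ISO strings so the
// 2D array contains only primitives.
function toSerializableRows(values) {
  return values.map(function (row) {
    return row.map(function (cell) {
      return cell instanceof Date ? cell.toISOString() : cell;
    });
  });
}

// Made-up sample shaped like the result of getDataRange().getValues().
var sampleValues = [
  ['Request', 'Date'],
  ['R-1', new Date(Date.UTC(2019, 0, 15))]
];
var safeValues = toSerializableRows(sampleValues);
```

On the server side this would be used as return toSerializableRows(DB_REQUESTS.getDataRange().getValues());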

How do i get data from a webpage using google apps script?

I looked at the existing questions (such as this one: Get data from webpage using Google Apps Script and Yahoo Query Language) which are similar to my query, but had no luck.
How do I get data from bseindia.com using UrlFetchApp? Here is the page: http://www.bseindia.com/stock-share-price/stockreach_financials.aspx?scripcode=532343&expandable=0
For example, how do I get the Revenue of Dec 14 from the above page? In this case, the code should return 2,652.91.
I tried this:
function getit(){
var response = UrlFetchApp.fetch("http://www.bseindia.com/stock-share-price/stockreach_financials.aspx?scripcode=532343&expandable=0");
var cut = response.substring(str.indexOf("<td class=TTRow_right>"),response.length);
var value = cut.substring(0, cut.indexOf("</td>"));
Logger.log(number);
}
The error I get is:
TypeError: Cannot find function substring in object
What I am doing is definitely not correct, also because every revenue number in there starts with the same "td class=TTRow_right".
The response you are getting is not a string but an HTTPResponse object, so you cannot call substring(..) on it.
Try response.getContentText()
You can find more details here: http-response
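Once the content is a string, the extraction itself can be sketched as below. The helper name extractFirstCell and the sample HTML are assumptions for illustration; the html argument stands for what response.getContentText() would return. Note that it only returns the first matching cell, so it does not by itself solve the problem that every revenue number uses the same tag.

```javascript
// Hypothetical helper: return the text of the first cell that starts
// with openTag, or null if no such cell is found.
function extractFirstCell(html, openTag) {
  var start = html.indexOf(openTag);
  if (start === -1) return null;
  var contentStart = start + openTag.length;   // skip past the opening tag
  var end = html.indexOf('</td>', contentStart);
  if (end === -1) return null;
  return html.substring(contentStart, end).trim();
}

// Made-up sample row using the class name from the question.
var sampleHtml = '<tr><td class=TTRow_left>Revenue</td>' +
                 '<td class=TTRow_right>2,652.91</td></tr>';
var value = extractFirstCell(sampleHtml, '<td class=TTRow_right>');
```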