Google Sheet use Importxml error could not fetch url [duplicate] - google-apps-script

This question already has answers here:
Scraping data to Google Sheets from a website that uses JavaScript
(2 answers)
Closed last month.
I want to get the price data from this website (https://tarkov-market.com/item/Pack_of_sugar), but the following formula doesn't work:
=IMPORTXML("https://tarkov-market.com/item/Pack_of_sugar","//*[@id='__layout']/div/div[1]/div/div[4]/div[1]/div[2]/div[1]/div[2]")

You want to retrieve the price (for example, 55,500₽) from https://tarkov-market.com/item/Pack_of_sugar and put it into a cell in a Google Spreadsheet.
If my understanding is correct, how about this answer?
Issue and workaround:
Unfortunately, IMPORTXML cannot be used in this situation. Even a formula as general as =IMPORTXML("https://tarkov-market.com/item/Pack_of_sugar","//*") returns an error saying that the value cannot be retrieved from the URL. As a workaround, I would like to propose using Google Apps Script as a custom function. When Google Apps Script is used, the value can be retrieved.
Sample script:
Please copy and paste the following script into the container-bound script of the Spreadsheet, and put =sampleFormula() in a cell. The value is then shown in that cell.
function sampleFormula() {
  const url = "https://tarkov-market.com/item/Pack_of_sugar";
  // Fetch the raw HTML; the price is read from the text of the <title> element.
  const html = UrlFetchApp.fetch(url).getContentText();
  return html.match(/price:(.+?)<\/title>/)[1].trim();
}
Note:
This script is specific to your question. When it is used with other URLs or in other situations, an error might occur, so please be careful about this.
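As a minimal sketch of how the error mentioned in the note could be avoided, the matching can be moved into a small helper that returns null when the pattern is missing. The function names here (extractPriceFromTitle, priceToNumber, PRICE_FROM_TITLE) are hypothetical and not part of the answer above, and the helper still assumes the page title contains the text "price:" followed by the value.

```javascript
// Pull the price text out of the page's <title> element.
// Returns null instead of throwing when the pattern is absent.
function extractPriceFromTitle(html) {
  const m = html.match(/price:(.+?)<\/title>/);
  return m ? m[1].trim() : null;
}

// Convert a price string such as "55,500₽" into a number by
// dropping everything except digits and the decimal point.
function priceToNumber(priceText) {
  const digits = priceText.replace(/[^\d.]/g, "");
  return digits === "" ? NaN : Number(digits);
}

// Apps Script custom function (hypothetical name) built on the helpers above.
function PRICE_FROM_TITLE(url) {
  const html = UrlFetchApp.fetch(url).getContentText();
  const price = extractPriceFromTitle(html);
  return price === null ? "price not found" : priceToNumber(price);
}
```

With this variant, =PRICE_FROM_TITLE("https://tarkov-market.com/item/Pack_of_sugar") would put a number in the cell, so it can be used in further calculations.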
References:
Custom Functions in Google Sheets
Class UrlFetchApp

Related

List of all tabs' "publish to web" links on a large googlesheet document (200 tabs)

Is there a way to get a list of each of the hyperlinks created by the "publish to web" function on Google Sheets without selecting each tab individually and copying and pasting to a spreadsheet/Word document? Ideally, the output would be all my tab names (circa 200 of them) and the link for each.
Any help or advice would be greatly appreciated.
If all you need is the tab names, this logs a list of them:
function getTabNames() {
  const ss = SpreadsheetApp.getActive();
  Logger.log(ss.getSheets().map(sh => sh.getName()).join(','));
}
You could use openById() if you wish.
I believe your goal is as follows.
You want to receive the Web Published URL for all sheets in a Google Spreadsheet using Google Apps Script.
You want to put the URLs to the Spreadsheet.
Issue and workaround:
When a Google Spreadsheet is published to the web, a URL like https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml?gid=###&single=true is obtained. Unfortunately, at the current stage, this URL cannot be retrieved using a script or an API, so it has to be created manually.
In this answer, I would like to propose 2 patterns for achieving your goal.
Pattern 1:
In this pattern, a URL like https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml?gid=###&single=true is used. 2PACX-### is not the Spreadsheet ID. Please be careful about this.
First, please publish your Spreadsheet to the web and retrieve the URL https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml?gid=###&single=true. In this pattern, the part https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml of that URL is used.
Please copy and paste the following script into the script editor of the Google Spreadsheet, and set your https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml as baseUrl. To use the script, enter the custom function =SAMPLE() in a cell. The URLs are then returned.
function SAMPLE() {
  const baseUrl = "https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml"; // Please modify this for your URL.
  return SpreadsheetApp.getActiveSpreadsheet().getSheets().map(s => `${baseUrl}?single=true&gid=${s.getSheetId()}`);
}
Pattern 2:
In this pattern, a URL like https://docs.google.com/spreadsheets/d/### fileId ###/pubhtml is used. In this case, the Spreadsheet ID is used, so you do not need to hard-code the URL.
Please copy and paste the following script into the script editor of the Google Spreadsheet. To use the script, enter the custom function =SAMPLE() in a cell. The URLs are then returned.
function SAMPLE() {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const baseUrl = `https://docs.google.com/spreadsheets/d/${ss.getId()}/pubhtml`;
  return ss.getSheets().map(s => `${baseUrl}?single=true&gid=${s.getSheetId()}`);
}
Note:
In this case, when the sheet is not published, you cannot access the URL. Please be careful about this.
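Since the question asks for the tab names together with the links, a possible variation of either pattern is to return one row per sheet containing both. This is only a sketch: buildPublishedUrlRows and PUBLISHED_LINKS are hypothetical names, and the base URL is still assumed to be the one obtained in Pattern 1 or built in Pattern 2.

```javascript
// Build one [tabName, url] row per sheet from plain { name, gid } objects.
function buildPublishedUrlRows(baseUrl, sheets) {
  return sheets.map(s => [s.name, `${baseUrl}?single=true&gid=${s.gid}`]);
}

// Apps Script custom function (hypothetical name): reads the sheets of the
// active Spreadsheet and returns a two-column array of names and links.
function PUBLISHED_LINKS(baseUrl) {
  const sheets = SpreadsheetApp.getActiveSpreadsheet()
    .getSheets()
    .map(s => ({ name: s.getName(), gid: s.getSheetId() }));
  return buildPublishedUrlRows(baseUrl, sheets);
}
```

Entering =PUBLISHED_LINKS("https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml") in a cell would then fill two columns, the tab name and its link.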
References:
map()
getSheetId()

Dynamic hyperlink & Text in the same cell with a formula in google sheets

Note: consider checking the edits first if you have a similar problem
I have Link, Label, Text, and Formula as inputs,
and the formula uses Link, Label, and Text as input, like this:
Make a copy of my example sheet.
=Function(HYPERLINK(A3,B3)," ",C3)
I want to create a custom function to get a result like the one in E3 (Hyperlink & Text: Google Text).
After reading the suggested answers, I concluded there is no way to achieve this result by creating a custom formula that can format the output.
The next best thing is to have a script that extracts the formula parameters A3, B3, and the trailing text, and uses them to output the result in the next cell, either automatically with onEdit or with a menu button.
I tested this script, but the problem is that the formula is replaced by plain text only. See the GitHub project google-apps-script-projects, or make a copy of my example sheet, where the script is included.
Building on @Tanaike's answer: store the formatting parameters in the custom formula like this and feed them to the script to output the result in the next cell.
=CustomFunction([A3,B3],C3...)
Explanation
=CustomFunction([Hyperlink,Label],text...)
I believe your goal is as follows.
You want to put text into a cell, with a hyperlink on part of the text.
You want to achieve this using a custom function like =CustomConcatenationFunction(Hyperlink(Link,Label),"Text1","Text2"...).
At the current stage, reflecting a hyperlink in part of a cell's text requires setRichTextValue of Google Apps Script, and this method cannot be used from a custom function. This is the current specification.
Also, with a custom function like =CustomConcatenationFunction(Hyperlink(Link,Label),"Text1","Text2"...), the arguments received on the custom function side are the label, "Text1", and "Text2". The URL cannot be retrieved inside the custom function, so I think this point also needs to be modified.
So, in order to achieve your goal, it is required to use a workaround. In this post, I would like to introduce the workaround. This workaround uses Web Apps. When Web Apps is used, the methods which cannot be used with a custom function can be used with a custom function. This can be seen at this report and Error when running Youtube Data Service in App Scripts (js) – Daily Limit for Unauthenticated Use Exceeded.
When Web Apps is used for achieving your goal, it becomes as follows.
Usage:
1. Prepare Google Spreadsheet.
Please create a Google Spreadsheet.
2. Prepare sample script.
Please open the script editor of Spreadsheet and copy and paste the following sample script.
function doGet(e) {
  const { range, sheetName, link, text, allText } = e.parameter;
  // Character position of the link label within the whole text.
  const idx = allText.indexOf(text);
  const r = SpreadsheetApp.newRichTextValue()
    .setText(allText)
    .setLinkUrl(idx, idx + text.length, link)
    .build();
  SpreadsheetApp.getActiveSpreadsheet()
    .getSheetByName(sheetName)
    .getRange(range)
    .setRichTextValue(r);
  return ContentService.createTextOutput();
}
// This is used as the custom function.
function SAMPLE(link, text, allText) {
  const webAppsUrl = "https://script.google.com/macros/s/###/exec"; // Please set the URL of Web Apps after you set the Web Apps.
  const range = SpreadsheetApp.getActiveRange();
  // Encode each value so that characters such as "&", "#", and spaces survive the query string.
  const params = {
    range: range.getA1Notation(),
    sheetName: range.getSheet().getSheetName(),
    link,
    text,
    allText,
  };
  const query = Object.keys(params)
    .map(k => `${k}=${encodeURIComponent(params[k])}`)
    .join("&");
  UrlFetchApp.fetch(`${webAppsUrl}?${query}`);
}
Here, webAppsUrl is required to be replaced with your Web Apps URL. Web Apps is deployed in the following flow.
3. Deploy Web Apps.
The detailed information can be seen at the official document.
Please set this using the new IDE of the script editor.
At the top right of the script editor, please click "Deploy" -> "New deployment".
Please click "Select type" -> "Web App".
Please input the information about the Web App in the fields under "Deployment configuration".
Please select "Me" for "Execute as".
Please select "Anyone" for "Who has access".
Please click "Deploy" button.
Copy the URL of the Web App, which looks like https://script.google.com/macros/s/###/exec, and use it to replace webAppsUrl in the above sample script.
Then reflect the latest script to the Web Apps, because the script of the Web Apps has changed. This is an important point.
Whenever you modify the Google Apps Script, please update the deployment as a new version. This reflects the modified script in the Web Apps. Please be careful about this.
You can see the detail of this in the report "Redeploying Web Apps without Changing URL of Web Apps for new IDE".
4. Testing.
In order to test the above sample, please put a custom function like =SAMPLE("###URL###","sampleLink","sampleText sampleLink sampleText"). By this, sampleLink of sampleText sampleLink sampleText has the hyperlink as follows.
Note:
In this case, the entered custom function is overwritten by the RichTextValue, because at the current stage a RichTextValue cannot be used in a custom function.
This is a simple sample script. So, please modify this for your actual situation.
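One modification that may be needed in practice: when the link label does not occur in the whole text, indexOf returns -1 and the offsets passed to setLinkUrl are no longer valid. The following is a minimal sketch of a guard for that case. findLinkRange and buildRichText are hypothetical helper names, not part of the sample above.

```javascript
// Return the { start, end } character offsets of `text` inside `allText`,
// or null when `text` is empty or does not occur.
function findLinkRange(allText, text) {
  if (!text) return null;
  const start = allText.indexOf(text);
  return start === -1 ? null : { start, end: start + text.length };
}

// Apps Script helper (hypothetical): builds the RichTextValue and only
// adds the link when a valid range was found.
function buildRichText(allText, text, link) {
  const builder = SpreadsheetApp.newRichTextValue().setText(allText);
  const r = findLinkRange(allText, text);
  if (r) builder.setLinkUrl(r.start, r.end, link);
  return builder.build();
}
```

In doGet, the value passed to setRichTextValue could then be buildRichText(allText, text, link), so a missing label leaves the cell as plain text instead of raising an error.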
References:
Enhanced Custom Function for Google Spreadsheet using Web Apps as Wrapper.
This sample is for this thread in Stackoverflow
Added:
From your comments, "we need a workaround to keep the formula in place, either in a separate cell or in the formatted cell" and "I understood you perfectly. We have half of the question answered; the last bit is to keep the formula, extract the URL, label, and plain text from it, and output the formatted result to a cell on the right as a workaround", how about the following sample script?
In this sample script, the simple trigger of OnEdit is used.
Sample script:
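// Placeholder custom function: the formula cell itself shows "Done".
const SAMPLE = _ => "Done";
function onEdit(e) {
  const customFunction = "=SAMPLE";
  const { range } = e;
  const formula = range.getFormula();
  if (!formula.includes(customFunction)) return;
  // Capture everything between the outer parentheses of the formula.
  const args = formula.match(/\((.+)\)/);
  if (!args) return;
  // Note: split(",") assumes that the arguments themselves contain no commas.
  const [link, text, allText] = args[1].replace(/"/g, "").split(",");
  const idx = allText.indexOf(text);
  const r = SpreadsheetApp.newRichTextValue().setText(allText).setLinkUrl(idx, idx + text.length, link).build();
  // Write the formatted result to the cell on the right, keeping the formula in place.
  range.offset(0, 1).setRichTextValue(r);
}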
To use this script, enter a custom function like =SAMPLE("###URL###","sampleLink","sampleText sampleLink sampleText") in a cell. The onEdit script is then run automatically by the trigger.
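The onEdit sample splits the arguments on every comma, so a text such as "Hello, world" would be cut in two. If that matters for your data, a small parser that respects double quotes could be used instead. This is a sketch under the assumption that arguments are either double-quoted strings or simple unquoted tokens such as cell references. parseFormulaArgs is a hypothetical name.

```javascript
// Split the arguments of a formula like =SAMPLE("a","b, c",D1) into an array.
// Commas inside double-quoted strings are kept; "" inside quotes is an escaped quote.
function parseFormulaArgs(formula) {
  const inner = formula.match(/\(([\s\S]*)\)\s*$/);
  if (!inner) return [];
  const args = [];
  // Either a quoted string (group 1) or a run of non-comma characters (group 2).
  const re = /\s*(?:"((?:[^"]|"")*)"|([^,]+))/g;
  let m;
  while ((m = re.exec(inner[1])) !== null) {
    args.push(m[1] !== undefined ? m[1].replace(/""/g, '"') : m[2].trim());
  }
  return args;
}
```

In onEdit, the destructuring line could then read const [link, text, allText] = parseFormulaArgs(formula); in place of the replace and split calls.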
Testing:
When this script is used, the following result is obtained.

How to import historical data (CSV) from Yahoo Finance in SpreadSheets

I try to import historical data (CSV) for APPLE. I use ImportData function in Google Sheet with
https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1577982443&period2=1609604843&interval=1d&events=history&includeAdjustedClose=true
but the result is "#N/A".
I want the CSV because it has 3 decimal places, whereas the website shows only 2.
=IMPORTXML("https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1577982443&period2=1609604843&interval=1d&events=history&includeAdjustedClose=true")
There is a script afterwards to obtain the file AAPL.csv.
Can you help me?
Unfortunately, for this URL, it seems that IMPORTDATA and IMPORTXML cannot be used. Fortunately, I confirmed that UrlFetchApp of Google Apps Script can retrieve the CSV data. So, in this answer, I would like to propose using Google Apps Script to achieve your goal.
Sample script:
Please copy and paste the following script to the script editor of Google Spreadsheet, and please put =SAMPLE("https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1577982443&period2=1609604843&interval=1d&events=history&includeAdjustedClose=true") to a cell. This script is used as the custom function. By this, the CSV data can be retrieved in the cells.
const SAMPLE = url => Utilities.parseCsv(UrlFetchApp.fetch(url).getContentText());
Result:
When above script is used, the following result is obtained.
References:
Custom Functions in Google Sheets
Class UrlFetchApp
Utilities.parseCsv()
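If you want to change the date range without editing the URL by hand, the URL could be built from dates. This sketch assumes that period1 and period2 are Unix timestamps in seconds, which is what the values in the question's URL look like, and that the other parameters stay as in the question. buildYahooCsvUrl is a hypothetical helper name.

```javascript
// Build the download URL from a ticker symbol and two Date objects.
// Assumption: period1/period2 are Unix timestamps in seconds.
function buildYahooCsvUrl(symbol, startDate, endDate) {
  const toUnixSeconds = d => Math.floor(d.getTime() / 1000);
  const base = "https://query1.finance.yahoo.com/v7/finance/download/";
  return (
    base +
    encodeURIComponent(symbol) +
    `?period1=${toUnixSeconds(startDate)}&period2=${toUnixSeconds(endDate)}` +
    "&interval=1d&events=history&includeAdjustedClose=true"
  );
}
```

The result can be passed to the custom function above, for example from another script function as SAMPLE(buildYahooCsvUrl("AAPL", start, end)).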

Is it possible to use IMPORTXML function and modify it with a query?

I am creating a spreadsheet portfolio. I came across a limitation: I cannot automate the process of importing the data from a website for different stocks, because the index of a given piece of stock information on the website often differs from one stock to another. However, there is a pattern: the value is always at the next index after a defined string, e.g. "Branche". This made me wonder if I can automate the process with Google Apps Script.
I wrote down the steps at first in Google Sheets. Then I formulated the steps in the Google Apps Script. Now I am stuck.
Step 1
=IMPORTXML("https://www.comdirect.de/inf/aktien/detail/uebersicht.html?ID_NOTATION=9386126";"//tr/td[@class='simple-table__cell']")
Step 2
=IMPORTXML(CONCATENATE("https://www.comdirect.de/inf/aktien/detail/uebersicht.html?ID_NOTATION=";"9386126");"//tr/td[@class='simple-table__cell']")
Step 3
=INDEX(IMPORTXML(CONCATENATE("https://www.comdirect.de/inf/aktien/detail/uebersicht.html?ID_NOTATION=";"9386126");"//tr/td[@class='simple-table__cell']");62;1)
Step 4 final product - just an idea not working yet
function import_branche() {
  var url1 = "https://www.comdirect.de/inf/aktien/detail/uebersicht.html?ID_NOTATION="
  var url2
  var ticker = "//tr/td[@class='simple-table__cell']"
  Index = find the INDEX with the String == "Branche"
  return Index(IMPORTXML(CONCATENATE(url1;url2); ticker);(Index+1);1)
}
Ideally, I would like to have a function where I only need to insert the link of the website and get the result, with the index for the information found automatically.
Google Apps Script can't execute Google Sheets spreadsheet functions like IMPORTXML, so you have two basic alternatives:
Use Google Apps Script to get the result of an IMPORTXML formula from the spreadsheet, then use JavaScript to do the rest of the job
Do the job completely using Google Apps Script and JavaScript
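The second alternative could be sketched roughly as follows. This assumes that the table cells are present in the HTML returned by the server and that a simple regular expression is enough to pick them out; it matches any td whose class attribute contains simple-table__cell, which is slightly looser than the XPath in the question. The function names extractCells, valueAfterLabel, and IMPORT_BRANCHE are hypothetical.

```javascript
// Collect the text of every <td> whose class contains "simple-table__cell".
function extractCells(html) {
  const re = /<td[^>]*class="[^"]*simple-table__cell[^"]*"[^>]*>([\s\S]*?)<\/td>/g;
  const cells = [];
  let m;
  while ((m = re.exec(html)) !== null) {
    // Strip nested tags and surrounding whitespace from the cell content.
    cells.push(m[1].replace(/<[^>]+>/g, "").trim());
  }
  return cells;
}

// Return the cell that directly follows the cell equal to `label`, or null.
function valueAfterLabel(cells, label) {
  const i = cells.indexOf(label);
  return i === -1 || i + 1 >= cells.length ? null : cells[i + 1];
}

// Apps Script custom function (hypothetical name): =IMPORT_BRANCHE(url)
function IMPORT_BRANCHE(url) {
  const html = UrlFetchApp.fetch(url).getContentText();
  const value = valueAfterLabel(extractCells(html), "Branche");
  return value === null ? "not found" : value;
}
```

Because the label is looked up by its text, the position of "Branche" in the table no longer needs to be known in advance.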
Related
Google docs ImportXML called from script
(javascript / google scripts) How to get the title of a page encoded with iso-8859-1 so that the title will display correctly in my utf-8 website?

How to use ImportXML in a Google Script?

Scenario:
I am trying to use Google spreadsheet to do something like this:
I have a set of blog URLs (more than 50) for which I want to fetch the title.
Currently I am using the formula: ImportXML(A2,"//h1[@class='entry-title']")
The problem with this approach is that ImportXML calls are limited to 50 per spreadsheet, and I have more than 50.
I browsed and searched, and found out that we can use ImportXML calls in Google Apps Script, but I did not find any example.
I found this: https://developers.google.com/apps-script/articles/XML_tutorial but I was hoping to use the ImportXML function.
Can anyone describe or point to a resource where ImportXML has been used in a Google Apps Script? Thanks!
I would use a few spreadsheets and then pull the results together into a master spreadsheet, unless you are doing more than 1000 or so. If you have that many, the XML_tutorial link you have is the place to start.
Last time I checked, though, Google Apps Script did not support Google spreadsheet functions; if I remember correctly, the feature request to support all spreadsheet functions in GAS was rejected.
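Since the spreadsheet function itself is not available in a script, one possible approach is to fetch the pages with UrlFetchApp and extract the title in JavaScript. This is only a sketch: it assumes the title is in an h1 element whose class contains entry-title, as in the XPath from the question, and that a regular expression is good enough for these pages. extractEntryTitle and fetchTitles are hypothetical names.

```javascript
// Return the text of the first <h1> whose class contains "entry-title",
// or an empty string when no such element is found.
function extractEntryTitle(html) {
  const m = html.match(
    /<h1[^>]*class=["'][^"']*\bentry-title\b[^"']*["'][^>]*>([\s\S]*?)<\/h1>/i
  );
  return m ? m[1].replace(/<[^>]+>/g, "").trim() : "";
}

// Apps Script helper (hypothetical): fetch all URLs in one batch and
// return one [title] row per URL, ready for Range.setValues().
function fetchTitles(urls) {
  const requests = urls.map(url => ({ url, muteHttpExceptions: true }));
  const responses = UrlFetchApp.fetchAll(requests);
  return responses.map(r => [extractEntryTitle(r.getContentText())]);
}
```

A script function could read the URLs from column A, call fetchTitles, and write the returned rows to column B with setValues, which avoids the per-spreadsheet limit on ImportXML formulas described in the question.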