I am trying to use the IMPORTXML function to get data from the following website: https://fantasy.espn.com/basketball/league/standings?leagueId=1878319. I want to get the table titled "Final Standings" into a Google Sheet using IMPORTXML. The formula I am using is listed below:
=IMPORTXML("https://fantasy.espn.com/basketball/league/standings?leagueId=1878319","//*[@id='espn-analytics']/div/div[5]/div[2]/div[1]/div/div/div[4]/section/div/div/div[2]/table/tbody")
The function returns an #N/A error and says the imported content is empty. How do I fix it to get the data set I need?
Unfortunately, as more sites move to dynamically loaded content, the IMPORTXML function is losing some of its usefulness, because it only sees the initial HTML and cannot read content that scripts load afterwards. Depending on how the site loads the content, you might be able to analyze the script and find the underlying data source, but that can be a real pain, and you may have to parse the format yourself to make it work. No fun.
Since the page you referenced is a "Final Standings" - I assume you don't need this to be auto-updating since it won't change, in which case, rather than a messy copy-paste, you might want to try a Chrome extension like "Instant Web Scraper" which will analyze the tables even within dynamic content and let you export it as a CSV which you can then quickly bring into Google Sheets.
Sorry that doesn't fix the IMPORTXML issue in this case, but I hope it helps.
Edit: Here is that top table in a CSV format (copy and save to a text file and name the text file a .csv and you can then upload it to Google Sheets):
jsx-2810852873,Image src,teamName,jsx-2302882246,Table__TD,jsx-2810852873 2,jsx-2810852873 3,jsx-2810852873 4,jsx-2810852873 5,jsx-2810852873 6,dn src
1,https://g.espncdn.com/lm-static/logo-packs/core/CatsAndDogs/cats_dogs-3.svg,Kevin Manning Show,(Kevin Manning),16-3-1,20328.5,17509.5,1016.4,875.5,+140.9,
2,,los angeles lebrons,(Zack Woodard),15-4-1,20909.5,17702.5,1045.5,885.1,+160.3,https://larrybrownsports.com/wp-content/uploads/2013/11/lebron-james-face.jpg
3,,BasketBall Chimps,(Jacob Woodard),13-6-1,19189.0,17317.5,959.5,865.9,+93.6,https://www.kimballstock.com/pix/CHI/03/CHI_03_RK0299_01_P.JPG
4,https://g.espncdn.com/lm-static/logo-packs/core/DIS_Avengers_EndGame/DIS_Avengers_EndGame_Capt_America.svg,Mr.Clean ICE,(Kenil Prajapati),12-7-1,21134.0,17640.5,1056.7,882.0,+174.7,
5,https://g.espncdn.com/lm-static/logo-packs/core/OldTimeMickeyAndFriends/Hockey_Donald.svg,Yonkers Yoinkers,(Einar H),11-8-1,17317.5,16704.5,865.9,835.2,+30.6,
6,,Yogurt Slingers,(Allan Perez),8-11-1,15821.5,16717.5,791.1,835.9,-44.8,https://g.espncdn.com/lm-app/lm/img/shell/shield-FBA.svg
7,https://g.espncdn.com/lm-static/logo-packs/core/TeamMascots-RobbHarskamp/Team_Mascots-04.svg,TAMU Shauced Shnacks,(Enrique Baqueiro),10-9-1,19733.5,17396.0,986.7,869.8,+116.9,
8,https://g.espncdn.com/lm-static/fba/images/default_logos/1.svg,Htown 🍆💦 Dal,(sheshu chandrasekar),3-16-1,13393.5,18560.5,669.7,928.0,-258.4,
9,https://g.espncdn.com/lm-static/logo-packs/fba/DreamTeam-ESPN/dreamTeam-4.svg,Original Gayngster,(Lee Nguyen),7-12-1,14462.0,17812.0,723.1,890.6,-167.5,
10,https://g.espncdn.com/lm-static/logo-packs/fba/Jerseys-ESPN/fba-jerseys-10.svg,Musty Burger FC Juan Prado,(Juan Prado),0-19-1,13300.5,18229.0,665.0,911.5,-246.4,
I need to be able to automatically update a google sheet file every time an order is placed through WooCommerce.
I've found the solution below, but with it each individual item ordered is listed as a new row. I'd like each order to be grouped under its order number, with the item quantities separated into the appropriate columns instead.
https://www.tychesoftwares.com/export-woocommerce-orders-to-google-sheets-in-realtime/
Below is a Google Sheet we are manually updating at present to show you what I mean.
Example
Is there a way to send the WooCommerce orders directly through to Google Sheets in this format?
Thanks so much in advance for any advice!
Yes it looks possible.
I know nothing about WooCommerce, but I believe you can sort out the received data in any way you want.
Look, the last line in their script appends the received data as a new row:
sheet.appendRow([timestamp,order_number,order_created,order_status]);
As far as I can see, the data contains the four elements:
timestamp
order_number
order_created
order_status
Instead, you can put these elements into any cell on your table. Something like this, for example:
var ss = SpreadsheetApp.getActiveSheet(); // active sheet of the bound spreadsheet
ss.getRange('A10').setValue(timestamp); // timestamp goes to A10
ss.getRange('B20').setValue(order_number); // order_number goes to B20
ss.getRange('C30').setValue(order_created + order_status); // created + status go to C30
The same way you can add any of these elements to some existing value in some cell, etc. For example:
var old_value = ss.getRange('A2').getValue(); // get value from the cell A2
var new_value = old_value + order_number; // add with order_number
ss.getRange('A2').setValue(new_value); // put the sum back into the cell A2
The main problem is up to you. You have to figure out:
exactly which elements you're receiving (numbers, names)
exactly how you want to sort them out (what to add to what, what to put where, etc.)
I can't tell that from the example picture.
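As a rough sketch of the "sort them out" step: assuming each incoming order carries an order number and a list of line items, one row per order could be built before calling appendRow. The field names order_number, items, name and quantity, and the function buildOrderRow, are hypothetical and are not taken from the plugin's actual payload.

```javascript
// Hypothetical payload shape (an assumption, not the plugin's real format):
//   { order_number: 1001, items: [{ name: "Mug", quantity: 2 }, ...] }
// productColumns lists the product names in the order their columns appear in the sheet.
function buildOrderRow(order, productColumns) {
  // Sum quantities per product name, in case the same product appears twice.
  var totals = Object.create(null);
  order.items.forEach(function (item) {
    totals[item.name] = (totals[item.name] || 0) + item.quantity;
  });
  // First cell is the order number, then one quantity cell per product column.
  var row = [order.order_number];
  productColumns.forEach(function (name) {
    row.push(totals[name] || 0);
  });
  return row;
}
```

In the webhook script, something like sheet.appendRow(buildOrderRow(order, ["Mug", "Shirt"])) would then write a single row per order instead of one row per item.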
Here is some reference documentation on Apps Script:
Main Page - Introducing Apps Script.
Sheets Guide - Introduction to Sheets with Apps Script.
Sheets Reference - Where you will find all the details of everything you can do with Sheets in Apps Script.
Remove Duplicate Rows - A good small tutorial that will teach you the basics of Sheets and Ranges and how to manipulate them.
To export all my WooCommerce orders on a scheduled basis, I used a ready-made solution.
I used a WooCommerce API and JSON client. It worked smoothly: I got the WooCommerce API, and the JSON client was implemented in the tool already.
You just need to choose an endpoint in the JSON client to get the required data. I exported all orders once a month, so I used the base URL http:// mydomain /wp-json/wc/v3/orders and my endpoint was orders.
You can check this article to understand better how it works for your purpose.
And here is WooCommerce API documentation.
I assume that setting up an export through the Apps Script is more flexible (and based on the answer above, it's working indeed), but I'm not a code guy. So I searched for an easier solution, and the API + JSON client helped.
Hope you'll find it helpful.
I would like to suggest using WooCommerce Google Sheet Plugin
I'll start by saying that my knowledge on using APIs is extremely limited. I'm impressed I've gotten as far as I have on this.
I've created a workbook in Google Sheets with imported data from the iexcloud API, which I'm using for data on stocks.
The requests have a cell reference in them so they update whenever a different symbol is selected.
So far, everything I've needed to request from it has the option to format as csv, so I can get cells with just the values.
However, this last thing I want doesn't have that option, so the whole response is wrapped in ["" ].
That really messes up what I need it for.
Here's an example
["PSA" CCI SHO ACC]
with each symbol being in its own cell.
I'm using the Peer Groups request.
A sample request:
> https://sandbox.iexapis.com/stable/stock/aapl/peers?token=Tsk_2b4c7c6fd98542f6a99f904cb7a3e721
Using Find and Replace doesn't work. I'm assuming because it's imported.
I need to use the cells with those symbols: PSA, CCI, SHO, ACC to reference in another request.
I recreated this in another Google Sheet that you can edit. The section in question is highlighted in blue.
https://docs.google.com/spreadsheets/d/1BQ6FBD0S2YkDtDGZGIkDmQoKrQT4VmVDjuNsgV4mrXM/edit?usp=sharing
So I'm wondering if there's a way to have [ " ] automatically removed from any cells in that row, or if I copy and paste the values only, to have the values updated when the original cells are updated with new symbols (since I can have those characters removed in that row)
Or if there's a way I can format the response in sheets.
Any ideas?
I believe your goal is as follows.
You want to convert ["CCI" SBAC CTL TDS RCI RCI-A-CT DTEGY] to CCI SBAC CTL TDS RCI RCI-A-CT DTEGY using the built-in functions of Google Spreadsheet.
Modified formula:
=ARRAYFORMULA(REGEXREPLACE(IMPORTDATA("https://cloud.iexapis.com/stable/stock/"&B3&"/peers?format=psv&token=###"),"[\[\]""]",""))
In this modified formula, [, ] and " are removed using REGEXREPLACE.
Please replace ### with your token at the above formula.
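The same character class can be tried outside the sheet. Here is a minimal sketch in plain JavaScript that applies the equivalent replacement to a row of values. The function name stripWrapperChars is made up for illustration.

```javascript
// Removes the wrapper characters [, ] and " from every value in a row,
// mirroring REGEXREPLACE(..., "[\[\]""]", "") in the formula above.
function stripWrapperChars(cells) {
  return cells.map(function (cell) {
    return String(cell).replace(/[\[\]"]/g, "");
  });
}
```

For example, passing the cells '["PSA"', 'CCI', 'SHO' and 'ACC]' should give back the four bare symbols.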
Result:
In this result, the values retrieved with =IMPORTDATA("https://cloud.iexapis.com/stable/stock/"&B3&"/peers?format=psv&token=###") are used. So the formula of cell "C9" is =ARRAYFORMULA(REGEXREPLACE(C6:I6,"[\[\]""]","")). In your case, though, the modified formula above can be used directly.
Note:
In this answer, I removed your token because I thought that it is your personal information.
Reference:
REGEXREPLACE
I'm trying to retrieve a table that is updated twice per day. On other websites I was able to find the element, but the approach I used doesn't work on every website I tried.
In this case the issue is:
In Google Sheets, using IMPORTXML, I can't find the correct path to the table from the link or identify the element.
The website for this example is: http://lotopolonia.com/tabel/arhiva/index.php
1. I need to retrieve the dates and numbers.
2. They are updated twice per day, and I want my sheet to update by adding just the latest line above the others. But I'll tackle this one after I solve the first.
I looked at the XPath tutorial from w3c and understood the syntax a bit.
The problem is how to correctly identify the elements and nodes in the inspector to retrieve the data I need.
Also, I've installed a Chrome extension (XPath Helper) which shows the XPath better than what I got from Chrome.
I tried the following:
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//table[@class='table_01']/tbody/tr[@class='second_row']/td[@class='colon2']")
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//table[@class='table_01']/tbody/tr[@class='second_row']/td[*]")
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//table[@class='table_01']/tbody/tr[@class='first_row'][1]/td[*]")
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//*[@class='table_01']/table/tbody/tr[@class='first_row'][1]/td[*]")
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//table[@class='table_01']/tbody/tr[3]/td[*]")
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//table[@class='table_01']/tbody/tr[*]/td[*]")
=IMPORTXML("http://lotopolonia.com/tabel/arhiva/index.php","//table[@class='table_01']/tbody/tr[@class='second_row'][1]/child::td[*]")
The formula looks OK, without errors, but for all of the requests above I get the same result: imported content is empty
Unfortunately I ran out of ideas on how to interpret those elements...
Any idea how to go on?
Cheers
How about this answer? I used //table[@class='table_01']/tr[position()>2] as the XPath. Cell "A1" contains http://lotopolonia.com/tabel/arhiva/index.php.
=IMPORTXML(A1,"//table[@class='table_01']/tr[position()>2]")
table[@class='table_01'] selects the table.
tr[position()>2] selects every row after the first two, which gives the dates and numbers.
Result :
Note :
If you want to retrieve the whole table, please use =IMPORTXML(A1,"//table[@class='table_01']/tr").
If this was not what you want, I'm sorry.
I have a txt file available on the web which contains tab separated values (TSV/CSV) like this:
Product_Id<tab>Color<tab>Price<tab>Quantity
Item1<tab>Red<tab>$5.2<tab>5
Item2<tab>Blue<tab>$7.5<tab>10
I imported the txt file into a Google Spreadsheet using the IMPORTDATA(url) formula. The problem is that now I need to split the text to columns. I tried the following formulas without success:
Split(A1,"\t")
Split(A1," ")
Split(A1,"<tab>")
Another thing I tried is the SUBSTITUTE function, but I just can't figure out how to match the tab character in Google Spreadsheets.
Pages strips tabs by default when you paste text using a standard paste. Tab delimited data can be pasted and automatically parsed using:
Right Click -> Paste special -> Paste values only
IMPORTDATA(url) seems to handle tabs automatically, as others have mentioned before, if the URL ends in ".tsv".
I had trouble trying to import a file from Dropbox even though the file was named "something.tsv", because the url was
"https://www.dropbox.com/s/xxxxxxx/something.tsv?dl=1"
I managed to solve the problem by adding a dummy query parameter to the url:
"https://www.dropbox.com/s/xxxxxxx/something.tsv?dl=1&x=.tsv"
NOTE: I know this question was asked back in 2014 and I am answering this question some 5 years later. I am posting the answer here in hopes that someone else who googles their way here will be saved the headache and can be helped by how I devised a solution.
SUMMARY OF THE ISSUE: By default the IMPORTDATA() function will properly process a tab-delimited file only if the file name ends with the extension .TSV
UPDATE Nov 14, 2019:
In a comment below, Poul shared that he has found an undocumented parameter for the IMPORTDATA() function by which you can specify the delimiter to split the data. As of writing this, the official documentation makes no reference to this delimiter.
In effect the documentation should look something like the following:
IMPORTDATA("url","delimiter")
So, if you wanted to force a file to be split on the TAB character, it would look something like
IMPORTDATA("url","\t")
PRIOR ANSWER:
UPDATE: I am leaving my original answer just in case it might be helpful if the answer above, which includes undocumented functionality, does not continue to work.
ORIGINAL ANSWER: After seemingly countless attempts, I figured out how to coax Google Sheets into importing a tab-delimited file regardless of the extension.
For those looking for the quick and dirty answer, copy the following into a cell of a Google Sheet to give it a try:
=ARRAYFORMULA(IFERROR(SPLIT(IMPORTDATA("https://iso639-3.sil.org/sites/iso639-3/files/downloads/iso-639-3_Latin1.tab"),CHAR(9),FALSE,FALSE)))
For those who want to know a bit more, I will try to explain how each of the nested functions helps to create the final solution:
=ARRAYFORMULA( IFERROR( SPLIT( IMPORTDATA(URL-HERE) ,CHAR(9),FALSE,FALSE) ) )
IMPORTDATA() - the primary function that pulls in the data file from the web
SPLIT - splits each row on the tab character; note the use of CHAR(9) to generate the tab character; also note the FALSE for the last parameter, which was required in my case to ensure empty cells were not collapsed together
IFERROR - used to catch situations where an import might fail, the error will be trapped and not returned to the spreadsheet
ARRAYFORMULA - this function ensures that every line in the file is parsed; without this, only the first line of the file would be returned to the spreadsheet
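For comparison, here is a minimal JavaScript sketch of the same idea: split the text into lines, then split each line on the tab character while keeping empty cells in place. The function name parseTsv is made up for illustration, and the sketch assumes the file contents are already available as a string.

```javascript
// Splits tab-delimited text into a 2D array, keeping empty cells in place
// (similar in spirit to SPLIT(..., CHAR(9), FALSE, FALSE) applied to every line).
function parseTsv(text) {
  return text
    .split(/\r?\n/) // one entry per line, for both \n and \r\n endings
    .filter(function (line) {
      return line.length > 0; // drop blank lines such as a trailing newline
    })
    .map(function (line) {
      return line.split("\t"); // adjacent tabs yield empty strings, not collapsed cells
    });
}
```

In Apps Script the resulting array could be written to a sheet with a range's setValues, provided every row has the same number of columns.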
It turns out that IMPORTDATA(url) can import a tab separated file, but it expects the file name to have the .tsv extension. This is inconsistent with Excel, where a tab-separated export results in *.txt.
If you can ensure that you use a .tsv extension, then your problem is solved.
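When the file itself cannot be renamed, the dummy-query-parameter trick mentioned in another answer here can be sketched as a small helper. The function name withTsvSuffix is hypothetical, and the sketch assumes the server ignores query parameters it does not recognize.

```javascript
// Appends a dummy query parameter so that the URL text ends in ".tsv".
// The parameter name "x" is arbitrary.
function withTsvSuffix(url) {
  if (/\.tsv$/i.test(url)) {
    return url; // already ends in .tsv, nothing to do
  }
  // Use "?" for the first query parameter, "&" for any later one.
  var separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + "x=.tsv";
}
```

The returned string is what would be passed to IMPORTDATA in place of the original URL.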
You can also use the Sheets UI to import the file (into a new Spreadsheet). Select File > Import..., then Upload > Select a file from your computer. When the file selection dialog opens, paste the URL into the file name field, and click Open. The file will be downloaded to your PC then uploaded to Drive, through the Import dialog that will let you choose the delimiter.
(Validated on Windows 8.1 with Chrome; I don't know how this will behave on other OSes or browsers.)
Edit: See this gist.
importFromCSV(string fileName, string sheetName)
Populates a sheet with contents read from a CSV file located in the user's GDrive. If either parameter is not provided, the function will open inputBoxes to obtain them interactively.
Automatically detects tab or comma delimited input.
I had luck using split() and indicating only a single space as the delimiter, even though the data i pasted in had tabs separating each "column": =SPLIT(A1, " ", True) where A1 had data separated by 1 or more spaces. It seems that pasting in TSV data results in conversion from tabs to spaces.
This could be done in two steps, relying on the pasted tabs being treated as whitespace.
Steps are as follows:
Select the columns which have tab-separated data. Then collapse each tab to a single space using Data -> Data cleanup -> Trim whitespace.
Now the usual Data -> Split text to columns should work out of the box, or after selecting space as the separator.
I am trying to retrieve the JSON of a Google Spreadsheet worksheet. It worked until a few days ago. For the default worksheet it still works, but not for any of the other worksheets.
This is the working URL for the default worksheet: https://spreadsheets.google.com/feeds/list/1caRqAA1TyBoZ0eVZvvKheEBh9SGRmQII4qih9urY70k/od6/public/full?alt=json
And this is the URL for the worksheet that stopped working: https://spreadsheets.google.com/feeds/list/1caRqAA1TyBoZ0eVZvvKheEBh9SGRmQII4qih9urY70k/1416241220/public/full?alt=json
The error message is Invalid query parameter value for grid_id.
The only difference is the worksheet parameter (od6 vs 1416241220).
Any ideas on why that error suddenly occurs?
ChrisPeterson's note:
You can use worksheet position number (1 for the first/default worksheet, 2 for the second worksheet).
Original answer
I came across the same issue and I managed to find my way out.
It seems that they recently changed the id for each worksheet.
You can find the new ID at the following
https://spreadsheets.google.com/feeds/worksheets/YOUR_SPREADSHEET_ID/private/full
I got something like o3laxt8 between <id> tags
PS: the od6 and default values will always work and redirect to the first worksheet of your document.
Joe Germuska's note:
od6 doesn't work anymore
Seems to work again.
I'd like to share a concrete example because I find there are enough confusing instructions out there including the accepted answer and worksheet IDs and where to put them not being obvious.
Here's a document I published and anyone with the link can view:
https://docs.google.com/spreadsheets/d/1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c/edit?usp=sharing
The document has to be published correctly. There are two Publish buttons and the first one doesn't work for this task. Use the second.
The document KEY is important. Obtain the KEY from between the /d/ and the /edit in the url. In my example, the key is 1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c.
Second, use the following URL style, replacing KEY with your own:
https://spreadsheets.google.com/feeds/list/KEY/od6/public/values?alt=json
My example url links directly to published json:
https://spreadsheets.google.com/feeds/list/1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c/od6/public/values?alt=json
Finally, if the worksheet has multiple sheets (or tabs), replace od6 in the url with a number. My example has two tabs, so there are two urls corresponding to either tab. I simply replace od6 with 1 and 2 depending on the order of the sheets:
Tab 1:
https://spreadsheets.google.com/feeds/list/1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c/1/public/values?alt=json
Tab 2:
https://spreadsheets.google.com/feeds/list/1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c/2/public/values?alt=json
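The URL pattern above can be sketched as two small JavaScript helpers. The function names extractKey and buildListFeedUrl are made up for illustration, and the sketch assumes the feed endpoint behaves as described in this answer.

```javascript
// Extracts the KEY from a normal sharing URL (the part between /d/ and the next slash),
// or returns null if the URL does not contain one.
function extractKey(sheetUrl) {
  var match = /\/d\/([^\/]+)/.exec(sheetUrl);
  return match ? match[1] : null;
}

// Builds the public JSON list-feed URL.
// worksheet: "od6", a 1-based tab position, or a worksheet id.
function buildListFeedUrl(key, worksheet) {
  return "https://spreadsheets.google.com/feeds/list/" +
    encodeURIComponent(key) + "/" +
    encodeURIComponent(String(worksheet)) +
    "/public/values?alt=json";
}
```

Calling buildListFeedUrl(extractKey(shareUrl), 2) with the sharing link of the example document should reproduce the Tab 2 URL shown above.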
In the event of a worksheet where the tabs are reordered frequently, it is possible to get the ID of a given sheet and use that instead of ordered numbers. I first learned of this approach from this post or this post:
In brief, you would form a private URL with your KEY:
https://spreadsheets.google.com/feeds/worksheets/KEY/private/full
This only works on a browser where you are logged into Google Drive on an account with permissions.
Next, you have to sift through XML to find your sheet IDs:
Replace the previous 1 and 2 with the IDs, for example:
Tab 1 (first worksheet id in a new google sheet is always od6 by default, no matter order of tabs):
https://spreadsheets.google.com/feeds/list/1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c/od6/public/values?alt=json
Tab 2:
https://spreadsheets.google.com/feeds/list/1QDWpycJJFA-UAiSPIv-icJ4UZhbEmuN8wxxag83SE1c/ope57yg/public/values?alt=json