Scraping an AJAX page using VBA - json

I've been trying to scrape the entire HTML body and assign it to a string variable, then manipulate that string to populate an Excel file. This will run in a loop to update the data every 5 minutes.
These are AJAX pages, so they run what looks like JavaScript (I'm not familiar with JS at all, though).
I've tried using the XMLHttpRequest object (code below), but it returns the JS calls:
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", "https://www.google.co.uk/finance?ei=bQ_iWLnjOoS_UeWcqsgE", False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.send
Debug.Print XMLHTTP.ResponseText
I've tried creating an IE object with the below code but, again, same issue:
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False
IE.navigate "https://www.google.co.uk/finance?ei=bQ_iWLnjOoS_UeWcqsgE"
While IE.Busy Or IE.ReadyState <> 4: DoEvents: Wend
Set HTMLdoc = IE.Document
Debug.Print HTMLdoc.Body.innerHTML
What I want is exactly the text available to me when I hit F12 and go to the Inspector tab (i.e. the entirety of the text within the yellow section below). If I could get this (fully expanded), I could work from there. Any help would be massively appreciated.
In the above example (Google finance), the index prices update asynchronously - I want to capture these at the time at which I assign the string.

For any dynamically loaded data, inspect the XHRs the webpage makes, find the one containing the relevant data, make the same XHR yourself (whether or not the site provides an official API), and parse the response. In the case of IE automation, add an extra wait loop until the target element becomes accessible, then retrieve it from the DOM.
In this particular case you can get the data via the Google Finance API.
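For the IE automation route, the extra wait loop could look roughly like the sketch below. WaitForElement, its parameters and the 10-second default timeout are hypothetical and not part of the code in this thread; the sketch assumes a late-bound InternetExplorer.Application object and a document mode that supports querySelector.

```vba
' Sketch only: WaitForElement is an illustrative helper name.
' Polls the live DOM until the CSS selector matches or the timeout expires.
Function WaitForElement(oIE As Object, ByVal sSelector As String, _
                        Optional ByVal dTimeoutSec As Double = 10) As Object
    Dim dDeadline As Date
    Dim oElement As Object
    ' Now-based deadline avoids the midnight rollover of Timer
    dDeadline = Now + dTimeoutSec / 86400#
    Do
        DoEvents
        Set oElement = Nothing
        ' The document may be unavailable while the page is still navigating
        On Error Resume Next
        Set oElement = oIE.document.querySelector(sSelector)
        On Error GoTo 0
        If Not oElement Is Nothing Then Exit Do
    Loop While Now < dDeadline
    ' Nothing is returned if the element never appeared
    Set WaitForElement = oElement
End Function
```

It would be called after the usual Busy/ReadyState loop, e.g. Set oTable = WaitForElement(IE, "table"), followed by a check for Nothing before reading innerHTML.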
Method 1.
To make the request you need to know the stock symbols, which can easily be found in the webpage HTML content. For example, if you click on CAC 40, the page that opens has the title CAC 40 (INDEXEURO:PX1).
The World markets table on that page contains the following stock and stock exchange symbols:
Shanghai SHA:000001
S&P 500 INDEXSP:.INX
Nikkei 225 INDEXNIKKEI:NI225
Hang Seng Index INDEXHANGSENG:HSI
TSEC TPE:TAIEX
EURO STOXX 50 INDEXSTOXX:SX5E
CAC 40 INDEXEURO:PX1
S&P TSX INDEXTSI:OSPTX
S&P/ASX 200 INDEXASX:XJO
BSE Sensex INDEXBOM:SENSEX
SMI INDEXSWX:SMI
ATX INDEXVIE:ATX
IBOVESPA INDEXBVMF:IBOV
SET INDEXBKK:SET
BIST100 INDEXIST:XU100
IBEX INDEXBME:IB
WIG WSE:WIG
TASI TADAWUL:TASI
MERVAL BCBA:IAR
IPC INDEXBMV:ME
IDX Composite IDX:COMPOSITE
Put them into the URL:
http://finance.google.com/finance/info?q=SHA:000001,INDEXSP:.INX,INDEXNIKKEI:NI225,INDEXHANGSENG:HSI,TPE:TAIEX,INDEXSTOXX:SX5E,INDEXEURO:PX1,INDEXTSI:OSPTX,INDEXASX:XJO,INDEXBOM:SENSEX,INDEXSWX:SMI,INDEXVIE:ATX,INDEXBVMF:IBOV,INDEXBKK:SET,INDEXIST:XU100,INDEXBME:IB,WSE:WIG,TADAWUL:TASI,BCBA:IAR,INDEXBMV:ME,IDX:COMPOSITE
The response contains JSON data, like this:
[
{
"id": "7521596",
"t": "000001",
"e": "SHA",
"l": "3,222.51",
"l_fix": "3222.51",
"l_cur": "CN¥3,222.51",
"s": "0",
"ltt": "3:01PM GMT+8",
"lt": "Mar 31, 3:01PM GMT+8",
"lt_dts": "2017-03-31T15:01:15Z",
"c": "+12.28",
"c_fix": "12.28",
"cp": "0.38",
"cp_fix": "0.38",
"ccol": "chg",
"pcls_fix": "3210.2368"
},
...
]
You may use the VBA code below to parse the response and output the result. It requires the JSON.bas module to be imported into the VBA project for JSON processing.
Sub GoogleFinanceData()
Dim sJSONString As String
Dim vJSON As Variant
Dim sState As String
Dim aData()
Dim aHeader()
' Retrieve Google Finance data
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "http://finance.google.com/finance/info?q=SHA:000001,INDEXSP:.INX,INDEXNIKKEI:NI225,INDEXHANGSENG:HSI,TPE:TAIEX,INDEXSTOXX:SX5E,INDEXEURO:PX1,INDEXTSI:OSPTX,INDEXASX:XJO,INDEXBOM:SENSEX,INDEXSWX:SMI,INDEXVIE:ATX,INDEXBVMF:IBOV,INDEXBKK:SET,INDEXIST:XU100,INDEXBME:IB,WSE:WIG,TADAWUL:TASI,BCBA:IAR,INDEXBMV:ME,IDX:COMPOSITE", False
.Send
If .Status <> 200 Then Exit Sub
sJSONString = .responseText
End With
' Trim extraneous chars
sJSONString = Mid(sJSONString, InStr(sJSONString, "["))
' Parse JSON string
JSON.Parse sJSONString, vJSON, sState
If sState = "Error" Then Exit Sub
' Convert to table format
JSON.ToArray vJSON, aData, aHeader
' Results output
With Sheets(1)
.Cells.Delete
.Cells.WrapText = False
If UBound(aHeader) >= 0 Then OutputArray .Cells(1, 1), aHeader
Output2DArray .Cells(2, 1), aData
.Columns.AutoFit
End With
End Sub
Sub OutputArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize(1, UBound(aCells) - LBound(aCells) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
As a result, the data you need is in the l_fix, c_fix and cp_fix columns.
Method 2.
You can also make an XHR to a URL like this one for CAC 40:
https://www.google.co.uk/finance/getprices?q=PX1&x=INDEXEURO&i=120&p=20m&f=d,c,v,o,h,l
That particular URL is for the PX1 stock symbol and the INDEXEURO stock exchange symbol, with a 120-second interval and a 20-minute period. The requested fields d,c,v,o,h,l stand for DATE (UNIX timestamp), CLOSE, VOLUME, OPEN, HIGH and LOW.
Response format is as follows:
EXCHANGE%3DINDEXEURO
MARKET_OPEN_MINUTE=540
MARKET_CLOSE_MINUTE=1050
INTERVAL=120
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN,VOLUME
DATA=
TIMEZONE_OFFSET=120
a1491405000,5098.75,5099.92,5098.75,5099.92,0
1,5100.51,5100.51,5098.09,5098.09,0
2,5099.63,5101.2,5099.29,5100.68,0
3,5099.83,5100.04,5099.07,5099.28,0
4,5098.19,5098.9,5097.71,5098.9,0
5,5098.56,5099.24,5097.99,5099.24,0
6,5097.34,5098.2,5096.14,5098.2,0
7,5096.52,5097.38,5095.66,5097.38,0
8,5093.27,5095.39,5093.27,5095.39,0
9,5094.43,5094.43,5092.07,5093.17,0
10,5088.18,5092.72,5087.68,5092.72,0
The XHR should be made for each stock symbol in the list, and the results then consolidated into a table.
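The response format above could be turned into rows along the lines of the sketch below. ParseGetPrices is a hypothetical helper, not part of the original answer. It assumes, as the sample suggests, that a row starting with "a" carries an absolute UNIX timestamp and that the following rows carry offsets in units of INTERVAL seconds from it, and that each data row has at most six fields.

```vba
' Sketch only: ParseGetPrices is an illustrative helper name.
' Returns a 1-based 2D array: UNIX timestamp, then the remaining columns as numbers.
' Returns Empty if the response contains no data rows.
Function ParseGetPrices(ByVal sResponse As String) As Variant
    Dim cRows As New Collection
    Dim aFields() As String
    Dim aResult() As Variant
    Dim vLine As Variant
    Dim vRow As Variant
    Dim lInterval As Long
    Dim lBase As Long
    Dim i As Long
    Dim j As Long
    For Each vLine In Split(Replace(sResponse, vbCr, ""), vbLf)
        If Left$(vLine, 9) = "INTERVAL=" Then
            lInterval = CLng(Mid$(vLine, 10))
        ElseIf InStr(vLine, ",") > 0 And InStr(vLine, "=") = 0 Then
            ' Data row: header lines either lack commas or contain "="
            aFields = Split(vLine, ",")
            If Left$(aFields(0), 1) = "a" Then
                ' Assumed to be an absolute UNIX timestamp
                lBase = CLng(Mid$(aFields(0), 2))
                aFields(0) = CStr(lBase)
            Else
                ' Assumed to be an offset counted in intervals from the last "a" row
                aFields(0) = CStr(lBase + CLng(aFields(0)) * lInterval)
            End If
            cRows.Add aFields
        End If
    Next
    If cRows.Count = 0 Then Exit Function
    ReDim aResult(1 To cRows.Count, 1 To 6)
    For i = 1 To cRows.Count
        vRow = cRows(i)
        For j = 0 To UBound(vRow)
            If j > 5 Then Exit For
            ' Val always treats "." as the decimal separator, whatever the locale
            aResult(i, j + 1) = Val(vRow(j))
        Next
    Next
    ParseGetPrices = aResult
End Function
```

The returned array can be written to a sheet with the Output2DArray routine from Method 1, once per symbol, with the symbol added in an extra column when consolidating.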

Related

is there a way to tell if there is json on a website

I was wondering if there is a way to tell whether a site gets its data from JSON. I was working with the Sydney KWS site and someone was able to tell me what its JSON page was. The page I am looking at now is https://www.bne.com.au/passenger/flights/arrivals-departures. Any help would be great; I need to get the flight information for departures from it.
I have taken a look at the backend and found that Java is being used to get the information, and that there is a redirect when you first load the page, but I couldn't make heads or tails of it.
Open the webpage https://www.bne.com.au/passenger/flights/arrivals-departures in a browser (e.g. Chrome) and press F12 to open Developer Tools. Go to the Network tab, reload the page (F5) and enter json as the filter string. You can then see the logged requests:
Inspect the logged requests; in this case the one with the largest size contains the flights data. Open the request, and you can see the URL on the Headers tab (a UNIX timestamp is sent as the nocache parameter to disable caching):
The response content is on the Preview and Response tabs:
Here is a VBA example showing how that data could be retrieved. Import the JSON.bas module into the VBA project for JSON processing.
Option Explicit
Sub test()
Dim url As String
Dim resp As String
Dim data
Dim state As String
Dim body()
Dim head()
url = "https://www.bne.com.au/sites/default/files/00API-Today.json?nocache=" & CStr(epochTimeStamp(Now))
' Retrieve JSON content
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", url, True
.send
Do Until .readyState = 4: DoEvents: Loop
resp = .responseText
End With
' Parse JSON sample
JSON.Parse resp, data, state
If state = "Error" Then MsgBox "Invalid JSON": End
' Convert JSON to 2D Array
JSON.ToArray data, body, head
' Output to worksheet #1
output head, body, ThisWorkbook.Sheets(1)
MsgBox "Completed"
End Sub
Sub output(head, body, ws As Worksheet)
With ws
.Activate
.Cells.Delete
With .Cells(1, 1)
.Resize(1, UBound(head) - LBound(head) + 1).Value = head
.Offset(1, 0).Resize( _
UBound(body, 1) - LBound(body, 1) + 1, _
UBound(body, 2) - LBound(body, 2) + 1 _
).Value = body
End With
.Columns.AutoFit
End With
End Sub
Function epochTimeStamp(dateTime)
epochTimeStamp = (dateTime - 25569) * 86400
End Function
The output for me is as follows (fragment):
BTW, a similar approach is applied in other answers.

Excel VBA Scraping- HTML tables are not visible

I am trying to get data from "https://in.tradingview.com/symbols/NSE-ABB/technicals/" using Excel VBA web scraping. Even though I am getting a response, body.innerHTML does not show the required table, yet if I inspect the page in Chrome I can see the table with that name.
What is wrong with the code?
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
WriteTxtFile sResponse
With html
.body.innerHTML = sResponse
Set tElementC = .getElementsByClassName("table-1i1M26QY- maTable-27Z4Dq6Y- tableWithAction-2OCRQQ8y-")(0).getElementsByTagName("td")
End With
URL --> https://in.tradingview.com/symbols/NSE-ABB/technicals/
classname to access = "table-1i1M26QY- maTable-27Z4Dq6Y- tableWithAction-2OCRQQ8y-"
The webpage source HTML at the link provided, https://in.tradingview.com/symbols/NSE-ABB/technicals/, doesn't contain the necessary data; the page uses AJAX. The website has a sort of API available, and the response is returned in JSON format. So you first need to do some reverse engineering to find out how the website works. In a browser, e.g. Chrome, press F12 to open DevTools, navigate to the webpage, go to the Network tab and set the filter to XHR. It will look as shown below:
Examine the logged responses. The one with the largest size actually contains all the necessary data:
To make such an XHR you also need to keep the entire payload structure and add the relevant headers:
In the Form Data section there are a lot of quote field titles located within the array, so you may choose which ones you actually need. To find more available titles, click the Initiator link (first screenshot above) and you will see the JS code that initiated that XHR. Click Pretty print {} at the bottom to make the code readable. Type any title you already pulled out of Form Data into the search box, e.g. Recommend.Other, and find others next to it in the code:
Here is a VBA example showing how such scraping could be done. Import the JSON.bas module into the VBA project for JSON processing.
Option Explicit
Sub Test()
Dim aQuoteFieldTitles()
Dim aQuoteFieldData()
Dim sPayload As String
Dim sJSONString As String
Dim vJSON
Dim sState As String
Dim i As Long
' Put the necessary field titles into array
aQuoteFieldTitles = Array( _
"name", "description", "country", "type", "after_tax_margin", "average_volume", "average_volume_30d_calc", "average_volume_60d_calc", "average_volume_90d_calc", "basic_eps_net_income", "beta_1_year", "beta_3_year", "beta_5_year", "current_ratio", "debt_to_assets", "debt_to_equity", "dividends_paid", "dividends_per_share_fq", _
"dividends_yield", "dps_common_stock_prim_issue_fy", "earnings_per_share_basic_ttm", "earnings_per_share_diluted_ttm", "earnings_per_share_forecast_next_fq", "earnings_per_share_fq", "earnings_release_date", "earnings_release_next_date", "ebitda", "enterprise_value_ebitda_ttm", "enterprise_value_fq", "exchange", "expected_annual_dividends", _
"gross_margin", "gross_profit", "gross_profit_fq", "industry", "last_annual_eps", "last_annual_revenue", "long_term_capital", "market_cap_basic", "market_cap_calc", "net_debt", "net_income", "number_of_employees", "number_of_shareholders", "operating_margin", _
"pre_tax_margin", "preferred_dividends", "price_52_week_high", "price_52_week_low", "price_book_ratio", "price_earnings_ttm", "price_revenue_ttm", "price_sales_ratio", "quick_ratio", "return_of_invested_capital_percent_ttm", "return_on_assets", "return_on_equity", "return_on_invested_capital", "revenue_per_employee", "sector", _
"eps_surprise_fq", "eps_surprise_percent_fq", "total_assets", "total_capital", "total_current_assets", "total_debt", "total_revenue", "total_shares_outstanding_fundamental", "volume", "relative_volume", "pre_change", "post_change", "close", "open", "high", "low", "gap", "price_earnings_to_growth_ttm", "price_sales", "price_book_fq", _
"price_free_cash_flow_ttm", "float_shares_outstanding", "total_shares_outstanding", "change_from_open", "change_from_open_abs", "Perf.W", "Perf.1M", "Perf.3M", "Perf.6M", "Perf.Y", "Perf.YTD", "Volatility.W", "Volatility.M", "Volatility.D", "RSI", "RSI7", "ADX", "ADX+DI", "ADX-DI", "ATR", "Mom", "High.All", "Low.All", "High.6M", "Low.6M", _
"High.3M", "Low.3M", "High.1M", "Low.1M", "EMA5", "EMA10", "EMA20", "EMA30", "EMA50", "EMA100", "EMA200", "SMA5", "SMA10", "SMA20", "SMA30", "SMA50", "SMA100", "SMA200", "Stoch.K", "Stoch.D", "MACD.macd", "MACD.signal", "Aroon.Up", "Aroon.Down", "BB.upper", "BB.lower", "goodwill", "debt_to_equity_fq", "CCI20", "DonchCh20.Upper", _
"DonchCh20.Lower", "HullMA9", "AO", "Pivot.M.Classic.S3", "Pivot.M.Classic.S2", "Pivot.M.Classic.S1", "Pivot.M.Classic.Middle", "Pivot.M.Classic.R1", "Pivot.M.Classic.R2", "Pivot.M.Classic.R3", "Pivot.M.Fibonacci.S3", "Pivot.M.Fibonacci.S2", "Pivot.M.Fibonacci.S1", "Pivot.M.Fibonacci.Middle", "Pivot.M.Fibonacci.R1", _
"Pivot.M.Fibonacci.R2", "Pivot.M.Fibonacci.R3", "Pivot.M.Camarilla.S3", "Pivot.M.Camarilla.S2", "Pivot.M.Camarilla.S1", "Pivot.M.Camarilla.Middle", "Pivot.M.Camarilla.R1", "Pivot.M.Camarilla.R2", "Pivot.M.Camarilla.R3", "Pivot.M.Woodie.S3", "Pivot.M.Woodie.S2", "Pivot.M.Woodie.S1", "Pivot.M.Woodie.Middle", "Pivot.M.Woodie.R1", _
"Pivot.M.Woodie.R2", "Pivot.M.Woodie.R3", "Pivot.M.Demark.S1", "Pivot.M.Demark.Middle", "Pivot.M.Demark.R1", "KltChnl.upper", "KltChnl.lower", "P.SAR", "Value.Traded", "MoneyFlow", "ChaikinMoneyFlow", "Recommend.All", "Recommend.MA", "Recommend.Other", "Stoch.RSI.K", "Stoch.RSI.D", "W.R", "ROC", "BBPower", "UO", "Ichimoku.CLine", _
"Ichimoku.BLine", "Ichimoku.Lead1", "Ichimoku.Lead2", "VWMA", "ADR", "RSI[1]", "Stoch.K[1]", "Stoch.D[1]", "CCI20[1]", "ADX-DI[1]", "AO[1]", "Mom[1]", "Rec.Stoch.RSI", "Rec.WR", "Rec.BBPower", "Rec.UO", "Rec.Ichimoku", "Rec.VWMA", "Rec.HullMA9" _
)
' Field titles exactly as in the table MOVING AVERAGES
' aQuoteFieldTitles = Array("EMA5", "SMA5", "EMA10", "SMA10", "EMA20", "SMA20", "EMA30", "SMA30", "EMA50", "SMA50", "EMA100", "SMA100", "EMA200", "SMA200", "Ichimoku.BLine", "VWMA", "HullMA9")
' Compose payload
sPayload = "{""symbols"":{""tickers"":[""NSE:ABB""],""query"":{""types"":[]}},""columns"":" & JSON.Serialize(aQuoteFieldTitles) & "}"
' Retrieve JSON response
With CreateObject("MSXML2.XMLHTTP")
.Open "POST", "https://scanner.tradingview.com/india/scan", True
.setRequestHeader "content-type", "application/x-www-form-urlencoded"
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
.setRequestHeader "content-length", Len(sPayload)
.send (sPayload)
Do Until .readyState = 4: DoEvents: Loop
sJSONString = .responseText
End With
' Parse JSON response
JSON.Parse sJSONString, vJSON, sState
' Check response validity
Select Case True
Case sState <> "Object"
MsgBox "Invalid JSON response"
Case IsNull(vJSON("data"))
MsgBox vJSON("error")
Case Else
' Output data to worksheet #1
aQuoteFieldData = vJSON("data")(0)("d")
With ThisWorkbook.Sheets(1)
.Cells.Delete
.Cells.WrapText = False
For i = 0 To UBound(aQuoteFieldTitles)
.Cells(i + 1, 1).Value = aQuoteFieldTitles(i)
.Cells(i + 1, 2).Value = aQuoteFieldData(i)
Next
.Columns.AutoFit
End With
MsgBox "Completed"
End Select
End Sub
The output for me is as follows:
BTW, a similar approach is applied in other answers.
As mentioned in the comments, JavaScript has to run on the page to update the required content. There doesn't appear to be a freely available API. You can use a browser instead. In the VBE, go to Tools > References and add a reference to Microsoft Internet Controls.
Option Explicit
Public Sub GetInfo()
Dim IE As InternetExplorer, ws As Worksheet, hTable As Object, tRow As Object, td As Object, r As Long, c As Long, headers()
headers = Array("name", "value", "action")
Set ws = ThisWorkbook.Worksheets("Sheet1"): Set IE = New InternetExplorer
With IE
.Visible = True
.Navigate2 "https://in.tradingview.com/symbols/NSE-ABB/technicals/"
While .Busy Or .readyState < 4: DoEvents: Wend
Set hTable = IE.document.querySelector("table + .tableWithAction-2OCRQQ8y-")
ws.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
For Each tRow In hTable.getElementsByTagName("tr")
r = r + 1: c = 1
For Each td In tRow.getElementsByTagName("td")
ws.Cells(r, c).Value = td.innerText
c = c + 1
Next td
Next tRow
.Quit
End With
End Sub

VBA Data Import from Google into Excel: Custom Time Ranges

For a VBA application in Excel, I am trying to include the "custom time range" function Google offers when narrowing down the search. So far, I am using the following code (see below), which imports "resultStats" from Google for a given search term into Excel but lacks the time range option.
In this specific case, I would need to determine the number of results/articles e.g. for "Elon Musk" between 01/01/2015 and 12/31/2015. Is there any practicable addition to the code below? And can this also be applied for the Google News tab instead of the regular Google Search results?
Many thanks in advance!
Sub Gethits()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date
Dim var As String
Dim var1 As Object
lastRow = Range("A" & Rows.Count).End(xlUp).Row
Dim cookie As String
Dim result_cookie As String
start_time = Time
Debug.Print "start_time:" & start_time
For i = 2 To lastRow
url = "https://www.google.com/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.responseText
Set objResultDiv = html.getElementById("rso")
Set var1 = html.getElementById("resultStats")
Cells(i, 2).Value = var1.innerText
DoEvents
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
It seems you need URL encoding, so a string like the one shown below works when you include your cd_min and cd_max parameters. You specify news with the parameter tbm=nws.
As @chillin mentions, you can encode the parameters with Application.EncodeURL().
I also tried the API method, but with limited success. Though a date range filter can be passed in the sort parameter, you need to register for an API key, set up a custom search engine and set your requirements. Results are capped at 10 per query, and there is a call limit for free API calls. You can specify a start number to get blocks of 10. You can also see what is URL encoded by running a query through the Google APIs Explorer for Custom Search. I found it only returned 2 results, which was clearly nowhere near the expected number.
Option Explicit
Public Sub GetResultCount()
' Requires a reference to Microsoft HTML Object Library (VBE > Tools > References)
Dim sResponse As String, html As HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.google.co.uk/search?q=elon+musk&safe=strict&biw=1163&bih=571&source=lnt&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2015%2Ccd_max%3A12%2F31%2F2015&tbm=nws", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set html = New HTMLDocument
With html
.body.innerHTML = sResponse
Debug.Print .querySelector("#resultStats").innerText
End With
End Sub
Thanks for your feedback. I have now amended the URL line as follows (including the Excel ENCODEURL function, which I applied directly for the input cells of the Excel spreadsheet) and it works perfectly:
url = "https://www.google.com/search?q=" & Cells(i, 1) & "&source=lnt&tbs=cdr%3A1%2Ccd_min%3A" & Cells(i, 2) & "%2Ccd_max%3A" & Cells(i, 3) & "&tbm=nws"
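The URL construction above can also be wrapped in a small helper so that the dates are passed as Date values and encoded in one place. This is only a sketch: BuildNewsSearchUrl is a hypothetical name, it relies on WorksheetFunction.EncodeURL (available in Excel 2013 and later), and it assumes Google still accepts the tbs=cdr:1,cd_min:...,cd_max:... form shown in the answer.

```vba
' Sketch only: BuildNewsSearchUrl is an illustrative helper name.
' Builds a Google News search URL restricted to a custom date range.
Function BuildNewsSearchUrl(ByVal sQuery As String, ByVal dFrom As Date, ByVal dTo As Date) As String
    Dim sTbs As String
    ' Backslashes force a literal "/" regardless of the regional date separator
    sTbs = "cdr:1,cd_min:" & Format$(dFrom, "m\/d\/yyyy") & _
           ",cd_max:" & Format$(dTo, "m\/d\/yyyy")
    With Application.WorksheetFunction
        BuildNewsSearchUrl = "https://www.google.com/search?q=" & .EncodeURL(sQuery) & _
            "&source=lnt&tbs=" & .EncodeURL(sTbs) & "&tbm=nws"
    End With
End Function
```

Inside the loop it could be used as url = BuildNewsSearchUrl(Cells(i, 1).Value, DateSerial(2015, 1, 1), DateSerial(2015, 12, 31)), in which case the input cells hold the plain search term rather than a pre-encoded one.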

Extracting marker coordinates from embedded google map

Pretty new to this, so bear with me. I need to extract marker coordinates from an embedded Google map. An example link is http://www.picknpay.co.za/store-search, and I want to extract all marker positions generated in the map on search. I considered using services such as ParseHub, but before going that route I thought I'd give it a shot through SO/myself.
There has to be an easier way of finding the coordinates for markers stored in the map than manually going through them all and searching for their coordinates individually?
The webpage source HTML at the link provided, http://www.picknpay.co.za/store-search, doesn't contain the necessary data; the page uses AJAX. The website http://www.picknpay.co.za has a sort of API available, and the response is returned in JSON format. Navigate to the page, e.g. in Chrome, then open the Developer Tools window (F12), go to the Network tab, reload the page (F5) and examine the logged XHRs. The most relevant data is the JSON string returned by the URL:
http://www.picknpay.co.za/picknpay/json/picknpay/en/modules/store_finder/findStores.json
You may use the VBA code below to retrieve the info as described above. Import the JSON.bas module into the VBA project for JSON processing.
Option Explicit
Sub Scrape_picknpay_co_za()
Dim sResponse As String
Dim sState As String
Dim vJSON As Variant
Dim aRows() As Variant
Dim aHeader() As Variant
' Retrieve JSON data
XmlHttpRequest "POST", "http://www.picknpay.co.za/picknpay/json/picknpay/en/modules/store_finder/findStores.json", "", "", "", sResponse
' Parse JSON response
JSON.Parse sResponse, vJSON, sState
If sState <> "Array" Then
MsgBox "Invalid JSON response"
Exit Sub
End If
' Convert result to arrays for output
JSON.ToArray vJSON, aRows, aHeader
' Output
With ThisWorkbook.Sheets(1)
OutputArray .Cells(1, 1), aHeader
Output2DArray .Cells(2, 1), aRows
.Columns.AutoFit
End With
MsgBox "Completed"
End Sub
Sub XmlHttpRequest(sMethod As String, sUrl As String, arrSetHeaders, sFormData, sRespHeaders As String, sContent As String)
Dim arrHeader
'With CreateObject("Msxml2.ServerXMLHTTP")
' .SetOption 2, 13056 ' SXH_SERVER_CERT_IGNORE_ALL_SERVER_ERRORS
With CreateObject("MSXML2.XMLHTTP")
.Open sMethod, sUrl, False
If IsArray(arrSetHeaders) Then
For Each arrHeader In arrSetHeaders
.SetRequestHeader arrHeader(0), arrHeader(1)
Next
End If
.send sFormData
sRespHeaders = .GetAllResponseHeaders
sContent = .responseText
End With
End Sub
Sub OutputArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize(1, UBound(aCells) - LBound(aCells) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
The output for me is as follows:
BTW, a similar approach is applied in other answers.

Extract data locations from map

I want to extract data from a map, then get and store the locations of all charging stations in a specific state (e.g. https://www.plugshare.com/).
How can this be done? I don't mind using any programming language, but which one is best for this application?
You can retrieve the data directly from https://www.plugshare.com with XHRs. You have to look a little into how the website works in order to scrape the data. For any dynamically loaded data, inspect the XHRs the webpage makes, find the one containing the relevant data, make the same XHR yourself (whether or not the site provides an official API) and parse the response. Navigate to the page, e.g. in Chrome, then open the Developer Tools window (F12), go to the Network tab, reload the page (F5) and examine the XHRs in the list.
One of the requests, to the URL https://www.plugshare.com/api/locations/region?..., returns latitude, longitude and other info for the charging stations within a rectangular viewport area with the specified coordinates. You can find the URL, query parameters and some necessary headers as shown below:
The response is in JSON format:
You need to add a basic authorization header to the request. To retrieve the credentials, go to the Sources tab, add an XHR Breakpoint for URLs containing https://www.plugshare.com/api/locations/region and reload the page (F5). When the page is paused on the XHR, follow the Call Stack frame by frame:
Skip any NREUM and nrWrapper objects, which are part of the New Relic functionality. Click pretty-print {} to format the source. Search the sources for e.g. Basic, Authorization or setRequestHeader; in this particular case the first match is found in https://www.plugshare.com/js/main.js?_=1:
Click a station on the map and one more XHR appears, with a URL like https://www.plugshare.com/api/locations/[id], carrying detailed information for that station, as shown below:
The response is also in JSON format:
You may also get data for stations from a URL like https://www.plugshare.com/api/stations/[id].
You may use the VBA code below to retrieve the info as described above. Import the JSON.bas module into the VBA project for JSON processing.
Option Explicit
Sub Test_www_plugshare_com()
Const Transposed = False ' Output option
Const Detailed = True ' Scrape option
Dim sResponse As String
Dim aQryHds()
Dim oQuery As Object
Dim sQuery As String
Dim vRegionJSON
Dim sState As String
Dim aResult()
Dim i As Long
Dim vLocationJSON
Dim aRows()
Dim aHeader()
' Retrieve auth token
XmlHttpRequest "GET", "https://www.plugshare.com/js/main.js?_=1", "", "", "", sResponse
With RegExMatches(sResponse, "var s\=""(Basic [^""]*)"";") ' var s="Basic *";
If .Count > 0 Then
aQryHds = Array( _
Array("Authorization", .Item(0).SubMatches(0)), _
Array("Accept", "application/json") _
)
Else
MsgBox "Can't retrieve auth token"
Exit Sub
End If
End With
' Set query parameters
Set oQuery = CreateObject("Scripting.Dictionary")
With oQuery
.Add "minimal", "1"
.Add "count", "500"
.Add "latitude", "19.697593650121235"
.Add "longitude", "-155.06529816792295"
.Add "spanLng", "0.274658203125"
.Add "spanLat", "0.11878815323507652"
.Add "access", "1,3"
.Add "outlets", "[{""connector"":1},{""connector"":2},{""connector"":3},{""connector"":4},{""connector"":5},{""connector"":6,""power"":0},{""connector"":6,""power"":1},{""connector"":7},{""connector"":8},{""connector"":9},{""connector"":10},{""connector"":11},{""connector"":12},{""connector"":13},{""connector"":14},{""connector"":15}]"
.Add "fast", "add"
End With
sQuery = EncodeQueryParams(oQuery)
' Retrieve a list of stations for the viewport
XmlHttpRequest "GET", "https://www.plugshare.com/api/locations/region?" & sQuery, aQryHds, "", "", sResponse
' Parse JSON response
JSON.Parse sResponse, vRegionJSON, sState
If sState <> "Array" Then
MsgBox "Invalid JSON response"
Exit Sub
End If
' Populate result array
ReDim aResult(UBound(vRegionJSON))
' Extract selected properties from parsed JSON
For i = 0 To UBound(aResult)
Set aResult(i) = ExtractKeys(vRegionJSON(i), Array("id", "name", "latitude", "longitude"))
DoEvents
Next
If Detailed Then
' Populate result array with detailed info for each location
For i = 0 To UBound(aResult)
' Retrieve detailed info for each location
XmlHttpRequest "GET", "https://www.plugshare.com/api/locations/" & aResult(i)("id"), aQryHds, "", "", sResponse
' Parse JSON response
JSON.Parse sResponse, vLocationJSON, sState
If sState = "Object" Then
' Extract selected properties from parsed JSON
Set aResult(i) = ExtractKeys(vLocationJSON, Array("reverse_geocoded_address", "hours", "phone", "description"), aResult(i))
End If
DoEvents
Next
End If
' Convert resulting array to arrays for output
JSON.ToArray aResult, aRows, aHeader
' Output
With ThisWorkbook.Sheets(1)
.Cells.Delete
If Transposed Then
Output2DArray .Cells(1, 1), WorksheetFunction.Transpose(aHeader)
Output2DArray .Cells(1, 2), WorksheetFunction.Transpose(aRows)
Else
OutputArray .Cells(1, 1), aHeader
Output2DArray .Cells(2, 1), aRows
End If
.Columns.AutoFit
End With
MsgBox "Completed"
End Sub
Sub XmlHttpRequest(sMethod As String, sUrl As String, arrSetHeaders, sFormData, sRespHeaders As String, sContent As String)
Dim arrHeader
'With CreateObject("Msxml2.ServerXMLHTTP")
' .SetOption 2, 13056 ' SXH_SERVER_CERT_IGNORE_ALL_SERVER_ERRORS
With CreateObject("MSXML2.XMLHTTP")
.Open sMethod, sUrl, False
If IsArray(arrSetHeaders) Then
For Each arrHeader In arrSetHeaders
.SetRequestHeader arrHeader(0), arrHeader(1)
Next
End If
.send sFormData
sRespHeaders = .GetAllResponseHeaders
sContent = .responseText
End With
End Sub
Function RegExMatches(sText, sPattern, Optional bGlobal = True, Optional bMultiLine = True, Optional bIgnoreCase = True) As Object
With CreateObject("VBScript.RegExp")
.Global = bGlobal
.MultiLine = bMultiLine
.IgnoreCase = bIgnoreCase
.Pattern = sPattern
Set RegExMatches = .Execute(sText)
End With
End Function
Function EncodeQueryParams(oParams As Object) As String
Dim aParams
Dim i As Long
aParams = oParams.Keys()
For i = 0 To UBound(aParams)
aParams(i) = EncodeUriComponent((aParams(i))) & "=" & EncodeUriComponent((oParams(aParams(i))))
Next
EncodeQueryParams = Join(aParams, "&")
End Function
Function EncodeUriComponent(strText As String) As String
Static objHtmlfile As Object
If objHtmlfile Is Nothing Then
Set objHtmlfile = CreateObject("htmlfile")
objHtmlfile.parentWindow.execScript "function encode(s) {return encodeURIComponent(s)}", "jscript"
End If
EncodeUriComponent = objHtmlfile.parentWindow.encode(strText)
End Function
Function ExtractKeys(oSource, aKeys, Optional oTarget = Nothing) As Object
Dim vKey
If oTarget Is Nothing Then Set oTarget = CreateObject("Scripting.Dictionary")
For Each vKey In aKeys
If oSource.Exists(vKey) Then
If IsObject(oSource(vKey)) Then
Set oTarget(vKey) = oSource(vKey)
Else
oTarget(vKey) = oSource(vKey)
End If
End If
Next
Set ExtractKeys = oTarget
End Function
Sub OutputArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize(1, UBound(aCells) - LBound(aCells) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
If you have a lot of items to output, change to Const Detailed = False to prevent the application from hanging, since the XHRs are made in synchronous mode. The output for me with the specified viewport coordinates is as follows:
BTW, a similar approach is applied in other answers.