VBA web scraping from CME group - html

I am trying to import crude oil data from the API at this web site: Crude Oil 1 Options Quotes
I want to get Options data of All Strike Range.
When I tried to get it by clicking From Web in the Data tab in Excel, the web site only showed At the Money Strike Range.
I am trying to get: Type: American Options, Expiration: Jun 2016, Strike Range: All
How could I import the crude oil data to Excel spreadsheet using VBA?

Fiddler and web traffic:
I inspected the web traffic whilst on this page using fiddler. I saw that there was a GET request made for the data of interest. N.B. I went with the default info on the page. Making changes may result in a POST request which will follow broadly the same approach but would require a parameter based POST request with any required Header information and the like.
Fiddler output:
JSON response:
This GET request returned a JSON string containing all the CALLS/PUTS.
JSON sample:
JSON preview:
So I replicated this GET request using XMLHTTP request and then used JSONConverter to handle the JSON string returned*.
*Note: After downloading JSONConverter.bas and adding to project you must add a reference to Microsoft Scripting Runtime via VBE > Tools > References
The object generated by Set JSON = JsonConverter.ParseJson(sResponse) is a dictionary as denoted by the leading "{".
JSON object handling:
I used JSONConverter to handle accessing information from the JSON Object.
The initial dictionary has keys of "quoteDelayed","quoteDelay","tradeDate","optionContractQuotes","underlyingFutureContractQuotes","empty".
Reviewing the webpage:
The following shows the webpage headers and first data row.
I am looking in the JSON structure for information on Calls, Strike Price and Puts. This info appears to be under key "optionContractQuotes"
Required information:
So, we can re-assign JSON to this part of the original JSON (we could have jumped straight to this but I thought it might help to explain) with:
Set JSON = JSON("optionContractQuotes")
Collection of dictionaries:
A quick inspection of the new JSON structure tells me this will be a collection of dictionaries. The "[" tells me I have a collection and the following "{", as before, tells me I then have dictionaries, as "{" is repeated, inside.
A closer examination shows me that these dictionaries have keys of "strikePrice","strikeRank","put","call","underlyingFutureContract".
The info we really care about:
We care about keys "strikePrice", "put" and "call". "strikePrice" key has primitive string value e.g.
"strikePrice": "140.0",
"put" and "call" keys have associated dictionaries. Remember the "{"?
Example keys from call dictionary object:
What we can see from the above, is that the keys of the dictionaries "put" and "call" correspond to the original source headers, and the values correspond to the rows of information we saw originally in the webpage source table.
Caveat: There are slight differences between the webpage headers and the associated JSON headers (there is also more info in the JSON...), so mapped headers are used for accessing the JSON structure to then write out to the appropriate column in the sheet:
jsonHeaders = Array("updated", "highLowLimits", "volume", "high", "low", "priorSettle", "change", "last")
Why bother?
The above analysis of the JSON structure has led us to a point where we now know:
Where the items are located that we are interested in
How these items would map to our output table (remember the point raised about headers and row data?).
Writing to the sheet:
To make things easier I am going to actually hard code the headers for "calls" and "puts" in the sheet with:
headers = Array("Updated", "Hi / Low Limit", "Volume", "High", "Low", "Prior Settle", "Change", "Last", "Strike Price", "Last", "Change", "Prior Settle", "Low", "High", "Volume", "Hi / Low Limit", "Updated")
I can then write those out in one go as my table headers with:
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
I know that my rows are "call", "Strike Price", "put". And that "put" headers are the same headers as "call" headers, but in reverse order:
This handy observation means, when I am looping each dictionary in the collection, which is effectively a row of information to population the table with; I can pass each dictionary to a helper sub, WriteToSheet. This helper sub uses the keys of interest to access the dictionary values and write them to the sheet. If "strikePrice" the associated value can be written direct to column 9, the middle of our table. Otherwise, if it is either "call" or "put", we can loop the jsonHeaders array to access the required inner dictionary information. Remember that "call" and "put" are dictionaries in their own right, whereas "strikePrice" was a primitive string.
To save looping the headers twice, once for "put" and once for "call", I shorthand the whole thing into a single loop:
For i = LBound(jsonHeaders) To UBound(jsonHeaders) '<==Dictionaries
.Cells(rowCounter, i + 1).Value = inputDict("call")(jsonHeaders(i))
.Cells(rowCounter, 17 - i).Value = inputDict("put")(jsonHeaders(i))
Next i
And bish, bash, bosh we have our table in the sheet.
Sample view of webpage:
View of code output:
VBA:
Option Explicit
Public Sub GetCrudeOilOptionQuotes()
Dim sResponse As String, JSON As Object, headers(), jsonHeaders(), ws As Worksheet
headers = Array("Updated", "Hi / Low Limit", "Volume", "High", "Low", "Prior Settle", "Change", "Last", "Strike Price", "Last", "Change", "Prior Settle", "Low", "High", "Volume", "Hi / Low Limit", "Updated")
jsonHeaders = Array("updated", "highLowLimits", "volume", "high", "low", "priorSettle", "change", "last") '<== JSON headers vary from visible headers on page slightly
'Ignored headers open,close,highLimit,lowLimit
Set ws = ActiveSheet
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.cmegroup.com/CmeWS/mvc/Quotes/Option/769/G/Q8/ATM?optionExpiration=190-M6&strikeRange=ATM&optionProductId=769&pageSize=500&_=1530436274974", False
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set JSON = JsonConverter.ParseJson(sResponse) 'Returns a dictionary
Dim i As Long, rowCounter As Long
Set JSON = JSON("optionContractQuotes") '<==Collection of dictionaries
rowCounter = 1
With ws
.UsedRange.ClearContents
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
End With
For i = 1 To JSON.Count 'Loop all the dictionaries to empty info into rows and columns of sheet
rowCounter = rowCounter + 1
WriteToSheet JSON(i), jsonHeaders, rowCounter, ws
Next i
End Sub
Public Sub WriteToSheet(ByVal inputDict As Object, ByVal jsonHeaders As Variant, ByVal rowCounter As Long, ByVal ws As Worksheet)
Application.ScreenUpdating = False
Dim key As Variant, i As Long
With ws
.Cells(rowCounter, 9).Value = inputDict("strikePrice") '<==literal string
For i = LBound(jsonHeaders) To UBound(jsonHeaders) '<==Dictionaries
.Cells(rowCounter, i + 1).Value = inputDict("call")(jsonHeaders(i))
.Cells(rowCounter, 17 - i).Value = inputDict("put")(jsonHeaders(i))
Next i
End With
Application.ScreenUpdating = True
End Sub

For something like this, where the values change very frequently, I think you should use Power Query. The reason is, PQ will just be cleaner and more robust, and you don't have to rely on complex VBA code to grab your data, especially when VBA is somewhat unstable with these kinds of things, and I believe you are putting some real money behind this so you want to absolutely rely on the data you get.
Simply download and install PQ.
https://www.microsoft.com/en-us/download/confirmation.aspx?id=39379
Point it to your desired URL and run it. Then, put PQ on a timer, so it refreshes every 1 minute or so. Data > Connections > Properties.
I talk about this and a lot of other similar techniques in my book, which you can find here.
https://www.amazon.com/Automating-Business-Processes-Reducing-Increasing-ebook/dp/B01DJJKVZC?ie=UTF8&keywords=ryan%20shuell&qid=1464012902&ref_=sr_1_1&sr=8-1

Related

Parsing Only the First Object of Array JSON

I have the following text structure in JSON:
{
"Results":{
"Burger King":{
"Location":"New York",
"Address":"Avenue Dalmatas, 20"
},
"Mcdonalds":{
"Location":"Los Angeles",
"Address":"Oscar Street, 50"
}
}
}
I managed to get the city and address results, but for that I need to mention the name of the restaurant in the code to grab the token string.
Dim JSON As String = **json here**
Dim token As JToken
token = JObject.Parse(JSON.ToString())
Dim data = token.SelectToken("Results").SelectToken("Burger King").SelectToken("Location")
My question is, how can I list only the restaurants (Burger King, Mcdonalds, etc), for example, in a Listbox? So I can add some feature that checks the address and city with the user's choice, which I know and already understand how to do, but getting the token with only the names of the restaurants is being difficult for me. If i have a new restaurant name for example, I wouldn't want to include manually in the code. I have tried a lot of different ways, but the last one I have used it was the following:
Dim data = token.SelectToken("Results").SelectToken(0) 'i thought it would print 'Burger King'
'or this one
Dim data = token.SelectToken("Results")(0).ToString()
I've tried some 'For Each' loop code but I wasn't successful either. I've researched countless methods on how to do this and nothing works. I believe it's something simple that I'm either ignoring or completely forgetting.. Please give me a light! Thanks.
I can post c# code for you, it would be easier for you to translate it to VB if you need
var jsonObject= JObject.Parse(json);
List<string> restaurantNames= ((JObject)jsonObject["Results"]).Properties()
.Select(x=>x.Name).Distinct().ToList();
result
Burger King
Mcdonalds

Extracting WEB PAGE data from a hidden menu that is revealed after clicking

New to VBA and going in circles trying to extract data from a table on a webpage that only appears after it is clicked.
I'm using a VBA macro within excel to pull data from a weather.com:
I can successfully extract elements from the visible table.
I'm pulling Error 91 when trying to pull the "sunrise" and "sunset" time from the hidden menu.
I would like to extract the time of sunrise and sunset from the 2nd day on the table.
Question:
Is there a way to click within the code to reveal the data I want to extract?
This data is located within a SPAN in the source code - Can I be extract this innerText from the closest ClassName to the SPAN in question?
Would somebody mind taking a look at the code and helping me out? Thanks!
I can successfully extract "day", "weather", and "temp" but cannot pull "srise".
Here is my VBA code:
Sub Get_Lancaster()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim weather As Variant
Dim temp As Variant
Dim day As Variant
Dim srise As Variant
website = "https://weather.com/weather/5day/l/2db548c2f0fb03c25c0d5c5520a32877082d295d907b06df5eff91cd140165b9"
Set request = CreateObject("MSXML2.XMLHTTP")
request.Open "GET", website, False
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
request.send
response = StrConv(request.responseBody, vbUnicode)
html.body.innerHTML = response
weather = html.getElementsByClassName("description").Item(2).innerText
temp = html.getElementsByClassName("temp").Item(2).innerText
day = html.getElementsByClassName("day-detail clearfix").Item(1).innerText
srise = html.getElementsByClassName("sunrise").Item(1).innerText
Range("B3").Value = day
Range("D3").Value = weather
Range("E3").Value = temp
Range("G3").Value = srise
End Sub
Here is the source HTML code for the menu after it is clicked and revealed on the source page:
I am trying to extract the 7:05am located in the bottom SPAN:
<tr classname="clickable open" class="clickable open">
<td class="twc-sticky-col cell-hide">
<div classname="twc-table-shadow sticky" class="twc-table-shadow sticky"></div>
</td>
<td headers="uv-index" title="Partly cloudy skies. High around 40F. Winds light and variable." data-track-string="ls_24_hour_ls_24_hour_toggle" classname="uv" class="uv">
<span class="">3 of 10</span>
</td>
<td headers="sunrise" title="Partly cloudy skies. High around 40F. Winds light and variable." data-track-string="ls_24_hour_ls_24_hour_toggle" classname="sunrise" class="sunrise">
<div>
<span classname="icon icon-font iconset-astro dark icon-sunrise" class="icon icon-font iconset-astro dark icon-sunrise">
</span>
<span>7:05 am</span>
</div>
</td>
Thanks for taking a look and helping a complete newbie who's in over their head!
-J
You've done a good job so far, +1 for the effort you've put in this!
I would like to take it one step further, if you don't mind.
Firstly, you have to keep in mind that what you see when you inspect an element in your browser's developer tools is not necessarily what you'll see in the source HTML of the page.
Indeed, in this case if you actually right click and view the source code of the page, you will not find the HTML snippet that you posted. You will find it however if you inspect the element of interest.
This happens because this part of the HTML code is generated by a script. You will find this script in the Page's source code inside a <script></script> tag (just search for "sunrise" for example). This tag contains a huge string and a second small one, both in JSON format.
The script tag looks like this:
<script charSet="UTF-8">window.__data={"transactionId":"a3520089-63d0-4320-bb6e-c6308b6e820d", ... {"startIndex":0}};window.experience={"connectionSpeed":"4g","deviceClass":"desktop"};</script>
I have replaced most of the string with ... for the sake of readability.
All the data you need is in the first JSON string (window.__data) between the curly brackets {...}. You will not be needing the second string (window.experience).
So basically what you need is to isolate this string from the response and then parse it to get the info you want.
You can inspect the JSON string's structure using a tool like this. Here's how it looks like:
To parse a string like this you will need to add this JSON parser to your project. Follow the installation instructions in the link and you should be set to go.
You will also need to add the following references to your project (VBE>Tools>References):
Microsoft XML version 6.0
Microsoft Scripting Runtime
Having said that, here's how I would do it:
Option Explicit
Sub weather()
Dim req As New MSXML2.XMLHTTP60
Dim url As String, data As String, startOfData As String, endOfData As String
Dim dataJSON As Object, day As Object
startOfData = "window.__data=" 'the string of interest starts after this
endOfData = ";window.experience=" 'the string of interest ends before this
url = "https://weather.com/weather/5day/l/2db548c2f0fb03c25c0d5c5520a32877082d295d907b06df5eff91cd140165b9"
With req
.Open "GET", url, False
.send
data = .responseText
End With
data = Mid(data, InStr(1, data, startOfData) + Len(startOfData)) 'isolate the string of interest: step 1
data = Mid(data, 1, InStr(1, data, endOfData) - 1) 'isolate the string of interest: step 2
Set dataJSON = JsonConverter.ParseJson(data)
Set dataJSON = dataJSON("dal")("DailyForecast")("geocode:44.49,-71.57:language:en-US:units:e")("data")("vt1dailyForecast")
For Each day In dataJSON
Debug.Print day("sunrise")
Debug.Print day("day")("narrative")
Debug.Print day("night")("narrative")
Next day
End Sub
For demonstration purposes, the code above only prints the sunrise and the narrative of each day in the immediate window. Having in mind how the JSON is structured and following the same logic, you can adjust the code to print whichever parameter you need.
Here's a sample of the output:

How to adding nested dictionaries & collection in VBA/Json Ms Access 2016

Our company require to send invoices details for assessment to our Tax consultant in Json format, this means every invoice generated must pass through the Tax consultant for authentication. If an invoice is not correctly generated the Tax consultant send it back as rejected. The problem here we are not able to get the correct Json format after exporting Ms access 2016 invoices data
I have a code below which is almost complete for exporting Ms Access data into the correct Json format, but I’m the following two things below:
Current Json format:
{
"PosSerialNumber":"",
"IssueTime":"",
"TransactionType":0,
"PaymentMode":0,
"SaleType":0,
"Items":{
"ItemId":1,
"Description":"Apple"
"Barcode":"458630036",
"Quantity":8,
"UnitPrice":2,
"Discount":0,
"Taxable":[
"A",
"T"
]
}
}
The following two things below:
(1) As you can see its missing a square bracket on "Items":{ , the correct one is supposed to be "Items":[{ as well as the closing ]
(2) I want to add also (“Total”,120, “IstaxInclusive”: true) so that the final code should look like below:
Below is the current MS Access VBA code we are using to try and achieve the required goal
VBA code used to generate the above Json format:
Option Compare Database
Option Explicit
Private Sub Command0_Click()
Dim foo As New Dictionary
Set foo = New Dictionary
Dim Noor As Dictionary
Set Noor = New Dictionary
Dim hoo As New Collection
Dim goo As New Dictionary
Set goo = New Dictionary
Dim Zoo As New Dictionary
Set Zoo = New Dictionary
Dim Koo As New Collection
Dim json As String
With foo
.Add "PosSerialNumber", Me.txtchris
.Add "IssueTime", Me.txtAddress
.Add "TransactionTyp", 0
.Add "PaymentMode", 0
.Add "SaleType", 0
.Add "Items", Noor
Noor.Add "ItemID", 1
Noor.Add "Description", "Apple"
Noor.Add "BarCode", "4589630036"
Noor.Add "Quantity", 8
Noor.Add "UnitPrice", 2
Noor.Add "Discount", 0
Noor.Add "Taxable", hoo
hoo.Add "A"
hoo.Add "T"
End With
Dim member As Variant
For Each member In foo
Next
MsgBox JsonConverter.ConvertToJson(foo, Whitespace:=3), vbOKOnly, "Audited by C H"
End Sub
Required format:
{
"PosSerialNumber":"",
"IssueTime":"",
"TransactionType":0,
"PaymentMode":0,
"SaleType":0,
"Items":[{
"ItemId":1,
"Description":"Apple"
"Barcode":"458630036",
"Quantity":8,
"UnitPrice":2,
"Discount":0,
"Taxable":[
"A",
"T"
]
"Total":120,
"IsTaxInclusive":true,
"SP":0
}
]
}
Square brackets in JSON indicate an Array. JsonConverter will wrap collection or Arrays in [].
You can either wrap Noor in an Array() or add it to a collection
Array
.Add "Items", Array(Noor)
Collection
Dim NoorCollection As New Collection
NoorCollection.Add Noor
.Add "Items", NoorCollection

How to pick JSON nodes given in an Excel table for each field?

Hi I am quite new to VBA and JSON. I want to parse a JSON script using a VBA macro.
I already have an Excel table with each field and it's corresponding JSON path.
While trying to pick the value from Excel table, it is read as string and, quotes are added in the beginning and end of the string. The quotes " in the beginning and the end of the path of the JSON variable makes it impossible to read the value from the JSON script.
For example, if the path is project->name, the location in Excel table is ("project")("name"). But after reading it in VBA it becomes "("project")("name")". With the extra quotes " in the path, the location is not getting identified in the VBA code.
{
"quiz": {
"sport": {
"name": "Basketball",
"Questions":{
"question1": "Which one is correct team name in NBA?",
"question2":"Who is your favorite player",}
}
}
}
For this JSON script, I have created an Excel table with paths of question1 and question2:
("quiz")("sport")("name")("question1")
("quiz")("sport")("name")("question2")
The following code runs a loop and identify path of question1 first and returns "Which one is correct team name in NBA?" and do the same for question2.
But, item(path) is returning an empty string while writing the path completely in the code returns the correct value.
Set jsonObject = JsonConverter.ParseJson(JsonScript) 'Parse Json from GitHub
For Each item In jsonObject("data")
For i = 1 To nrow ' loops through rows with path for each field
Path = ws2.Range("C" & i).Value 'path of each field
MsgBox item(Path) 'returns Empty
question1 = item("quiz")("sport")("name")("question1") 'returns question1 value:Which one is correct team name in NBA?
Next
Next
You can't apply in that way particularly with the (). You can have the paths, as a comma separated list of path elements (no spaces), in the sheet and do the following.
You also need to remove the additional "," after player.
Note: I am testing the UBound of the array generated by the split on the "," of the string read in from the sheet and held in the array (arr) at the current index i. This ensures I apply the write nested syntax to retrieve the required value.
I am not sure where you are getting "data" from. I can't see it in the string provided. You would amend according to your actual JSON if different.
Data:
Output:
VBA:
Option Explicit
Public Sub GetInfoFromSheet()
Dim json As Object, jsonSource As String, paths(), i As Long, ws As Worksheet, arr() As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
jsonSource = ws.[C1]
Set json = JsonConverter.ParseJson(jsonSource)
paths = Application.Transpose(ws.Range("A1:A3").Value)
For i = LBound(paths) To UBound(paths)
arr = Split(paths(i), ",")
Select Case UBound(arr)
Case 2
Debug.Print json(arr(0))(arr(1))(arr(2))
Case 3
Debug.Print json(arr(0))(arr(1))(arr(2))(arr(3))
End Select
Next i
End Sub
JSON string in C1:
{ "quiz": { "sport": { "name": "Basketball", "Questions":{ "question1": "Which one is correct team name in NBA?", "question2":"Who is your favorite player"} } } }

Json to Excel (multi level)

I know how to parse JSON to Excel with VBA but I have a problem with multi-level JSON.
Example :
{
"Level1": [{
"String1": "Data1",
"Level 2": [{
"String2": "Data2",
"String3": "Data3",
"Level3": [{
"String4": "Data4",
"String5": "Data5"
}]
}]
}]
}
How to get everything?
The { means a dictionary so you access by key, the [ means a collection so you access by index. "" means a string literal so you read as is. Test the data type and handle as required. Below I use a JSON parser to read in your JSON string from a cell A1. After adding the .bas from that link to your project you then add a reference via VBE > Tools > References > Microsoft Scripting Runtime.
I use a sub EmptyDict which I call recursively to test whether the current object is a dictionary or collection and then loop until I empty each dictionary. For each collection I shift one column to the right.
As mentioned in the comments, you would tailor to the output format you want in the sheet.
The tree structure you are descending looks like:
VBA:
Option Explicit
Public r As Long, c As Long
Sub readValues()
Dim json As Object, item As Object
Set json = JsonConverter.ParseJson([A1])("Level1")(1) 'dictionary
r = 1: c = 1
EmptyDict json
End Sub
Public Sub EmptyDict(ByVal dict As Object)
Dim key As Variant, item As Object
Select Case TypeName(dict)
Case "Collection"
For Each item In dict
c = c + 1
r = 1
EmptyDict item
Next
Case "Dictionary"
For Each key In dict
If TypeName(dict(key)) = "Collection" Then
EmptyDict (dict(key))
Else
With ThisWorkbook.Worksheets("Sheet2")
.Cells(r, c) = dict(key)
End With
r = r + 1
End If
Next
End Select
End Sub