Excel VBA macro to get HTML SPAN ID value - html

I appreciate there are similar questions, but as a novice I find it hard to fully adapt the examples.
Problem Statement
I want to create a macro in Excel to pull the "last updated" value found on the website https://www.centralbank.ae/en/fx-rates. Specifically, it is found within their HTML code (example value below):
<span class="dir-ltr">11 Feb 2021 6:00PM</span>
What I wanted to Repurpose
The code here (https://www.encodedna.com/excel/extract-contents-from-html-element-of-a-webpage-in-excel-using-vba.htm) seemed to be a very clean way of launching IE in the background and then clearing down all elements thereafter. It iterates through hyperlinks which I don't need to do.
My code doesn't seem to work:
Option Explicit
Const sSiteName = "https://www.centralbank.ae/en/fx-rates"
Private Sub GetHTMLContents()
' Create Internet Explorer object.
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False ' Keep this hidden.
IE.navigate sSiteName
' Wait till IE is fully loaded.
While IE.readyState <> 4
DoEvents
Wend
Dim oHDoc As HTMLDocument ' Create document object.
Set oHDoc = IE.document
Dim oHEle As HTMLSpanElement ' Create HTML element (<span>) object.
Set oHEle = oHDoc.getElementById("dir-ltr").innerText ' Get the element ref using its ID. [A]
' Clean up.
IE.Quit
Set IE = Nothing
Set oHEle = Nothing
Set oHDoc = Nothing
End Sub
Once reading innerText works, I thought I could replace the line marked [A] with something like this, but again I'm not 100% sure how to adapt it:
Cells(iCnt + 1, 1) = .getElementsByTagName("h1").Item(iCnt).getElementsByTagName("a").Item(0).innerHTML
The goal is to print this SPAN CLASS ID value into a cell in an Excel worksheet (say "Sheet1").

The span tag has no ID; dir-ltr is its class. You can get all elements with a specific class with getElementsByClassName(). The get methods with the plural Elements in their name return a node collection, which is zero-based. In this document, only one element has the class dir-ltr.
You can refer to it via index 0, written either after the name of the node collection (like an array) or directly after the method call. If you put it after the method call, the node collection is discarded immediately, but you still get the indexed element of the list.
If you want to read the innerText, you can do that directly after the index, but then you have a string, not an object. I used that in the following code:
Private Sub GetHTMLContents()
Const sSiteName = "https://www.centralbank.ae/en/fx-rates"
Dim IE As Object
'Create Internet Explorer object.
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False ' Keep this hidden.
IE.navigate sSiteName
' Wait till IE is fully loaded.
While IE.readyState <> 4: DoEvents: Wend
'New sheet with name "New sheet" at the end
ThisWorkbook.Sheets.Add after:=Sheets(Worksheets.Count)
ThisWorkbook.ActiveSheet.Name = "New sheet"
' Get the text of the first element with class "dir-ltr". [A]
ThisWorkbook.Sheets("New sheet").Cells(1, 1) = IE.document.getElementsByClassName("dir-ltr")(0).innerText
' Clean up.
IE.Quit
Set IE = Nothing
End Sub
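The difference between the node collection, the indexed element, and the text can be made explicit by holding each step in its own variable. The following is a minimal sketch, assuming the same page and class as above; the names ReadSpanByClass, nodeSpans and nodeSpan are illustrative only, and writing to "Sheet1" follows the sheet named in the question:

```vba
Sub ReadSpanByClass()
    Const sSiteName As String = "https://www.centralbank.ae/en/fx-rates"
    Dim IE As Object
    Dim nodeSpans As Object   'node collection (zero-based)
    Dim nodeSpan As Object    'single element taken from the collection

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False
    IE.navigate sSiteName
    While IE.readyState <> 4: DoEvents: Wend

    'Step 1: the whole collection of elements with class "dir-ltr"
    Set nodeSpans = IE.document.getElementsByClassName("dir-ltr")

    If nodeSpans.Length > 0 Then
        'Step 2: the first element of the collection (an object)
        Set nodeSpan = nodeSpans(0)
        'Step 3: its text (a string)
        ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).Value = nodeSpan.innerText
    Else
        MsgBox "No element with class dir-ltr found."
    End If

    IE.Quit
    Set IE = Nothing
End Sub
```

Checking Length before indexing avoids a run-time error if the page has not rendered the span yet or its markup has changed.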

Related

How to access the Web using VBA? Please check my code

To cut down on repetitive work, I am trying to access a website used at my company from VBA.
I wrote the code below and confirmed that it can access ordinary sites such as Google and YouTube.
However, I don't know why it cannot access the company site.
VBA stops at this line:
Set HTMLDoc = IE_ctrl.document
Thank you in advance.
I also found one difference between the normal site and the company site in the VBA Locals window (values and types).
Please see the two pictures below.
Sub a()
Dim IE_ctrl As InternetExplorer
Dim HTMLDoc As HTMLDocument
Dim input_Data As IHTMLElement
Dim URL As String
URL = "https://www.google.com"
Set IE_ctrl = New InternetExplorer
IE_ctrl.Silent = True
IE_ctrl.Visible = True
IE_ctrl.navigate URL
Wait_Browser IE_ctrl
Set HTMLDoc = IE_ctrl.document
Wait_Browser IE_ctrl
Set input_Data = HTMLDoc.getElementsByClassName("text").Item
input_Data.Click
End Sub
Sub Wait_Browser(Browser As InternetExplorer, Optional t As Integer = 1)
While Browser.Busy
DoEvents
Wend
Application.Wait DateAdd("s", t, Now)
End Sub
Normal site (works well): [image not included]
Company site (raises the error): [image not included]
You can try the following code. Please read the comments. I can't say any more because I don't know the page or its HTML.
Sub a()
'Use late binding for what you need
Dim ie As Object
Dim nodeInputData As Object
Dim url As String
url = "https://www.google.com"
'Use the Windows GUID to initialize Internet Explorer if you
'want to access a company page. This helps if there are
'security rules that block access when IE is initialized in other ways.
'In most cases this doesn't work for pages on the "real" web.
'Read more here:
'https://blogs.msdn.microsoft.com/ieinternals/2011/08/03/default-integrity-level-and-automation/
Set ie = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
ie.Visible = True
ie.navigate url
'Waiting for the document to load
Do Until ie.readyState = 4: DoEvents: Loop
'If necessary, pause for dynamic content that is still loading
'after IE reports that the page is ready
'(The three values are: hours, minutes, seconds)
Application.Wait (Now + TimeSerial(0, 0, 1))
'I don't know your HTML. If you only want to click a button,
'you don't need a variable:
'ie.document.getElementsByClassName("text")(0).Click
'does the same as
Set nodeInputData = ie.document.getElementsByClassName("text")(0)
nodeInputData.Click
'A short explanation of getElementsByClassName() and getElementsByTagName():
'Both methods create a node collection of all HTML elements that were found
'by the criteria in the brackets. This is because there can be any number of
'HTML elements with the specified class name or tag name. If, for example,
'3 HTML elements with the class name "Text" were found, a node collection
'with three elements is created by getElementsByClassName("Text").
'These have the indices 0 to 2, as in an array. The individual elements are
'addressed via these indices, which are given in round brackets.
End Sub

VBA code to scrape data using html/javascript won't work

I want to write VBA code that searches a website based on input entered in the first column. The range is A1 to A102. The code works fine except for one thing: it copies the value from the Excel cell and pastes it into the website's search box, but it doesn't click the search button automatically. I welcome any suggestions.
I know how to scrape data from websites, but this search button has a specific class. Which class should I use to trigger the click? This question is relevant to both VBA and JavaScript/HTML experts.
When I click Inspect Element, I get the button ID "nav-search-submit-text" and the class "nav-search-submit-text nav-sprite".
Neither works.
Thanks
Private Sub worksheet_change(ByVal target As Range)
If Not Intersect(target, Range("A1:A102")) Is Nothing Then
Call getdata
End If
End Sub
Sub getdata()
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
'Set IE.Visible = True to make IE visible, or False for IE to run in the background
IE.Visible = True
URL = "https://www.amazon.co.uk"
'Navigate to URL
IE.Navigate URL
'making sure the page is done loading
Do
DoEvents
Loop Until IE.ReadyState = 4
'attempting to search date based on date value in cell
IE.Document.getElementById("twotabsearchtextbox").Value = ActiveCell.Value
'Sheets("Sheet1").Range("A1:A102").Text
'Select the date picker box and press Enter to 'activate' the new date
IE.Document.getElementById("twotabsearchtextbox").Select
'clicking the search button
IE.Document.getElementsByClassName("nav-sprite").Click
'Call nextfunction
End Sub
To do web scraping with Excel, you need to know both VBA and HTML, plus CSS and at least some JS. Above all, you should be familiar with the DOM (Document Object Model). With only VBA or only HTML you will not get far.
It's a mystery to me why you want to do it the complicated way when you can do it simply via the URL. For your solution you have to use the class nav-input. This class occurs twice in the HTML document. The search button is the second element with nav-input. Since the indices of a node collection start at 0, you have to click the element with index 1.
Sub getdata()
Dim URL As String
Dim IE As Object
URL = "https://www.amazon.co.uk"
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True ' True to make IE visible, or False for IE to run in the background
IE.Navigate URL 'Navigate to URL
'making sure the page is done loading
Do: DoEvents: Loop Until IE.ReadyState = 4
'put the value of the active cell into the search box
IE.Document.getElementById("twotabsearchtextbox").Value = ActiveCell.Value
'clicking the search button
IE.Document.getElementsByClassName("nav-input")(1).Click
End Sub
Edit: Solution to open offer with known ASIN
You can open an offer on the Amazon website directly if you know the ASIN. The ASIN in the active cell can be passed as a parameter to the Sub getdata() and used in the URL. Note that this does not work reliably: if you press Enter to finish the input, the active cell is the one below the cell you edited.
Private Sub worksheet_change(ByVal target As Range)
If Not Intersect(target, Range("A1:A102")) Is Nothing Then
Call getdata(ActiveCell.Value)
End If
End Sub
In the Sub getdata(), the URL with the passed ASIN is then opened:
Sub getdata(searchTerm As String)
Dim URL As String
Dim IE As Object
'Use the right base url
URL = "https://www.amazon.co.uk/dp/" & searchTerm
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True ' True to make IE visible, or False for IE to run in the background
IE.Navigate URL 'Navigate to URL
'making sure the page is done loading
Do: DoEvents: Loop Until IE.ReadyState = 4
End Sub
It's also possible to do all of that in the worksheet_change event of the worksheet (including getting the price and offer title):
Private Sub worksheet_change(ByVal target As Range)
If Not Intersect(target, Range("A1:A102")) Is Nothing Then
With CreateObject("InternetExplorer.Application")
.Visible = True ' True to make IE visible, or False for IE to run in the background
.Navigate "https://www.amazon.co.uk/dp/" & ActiveCell 'Navigate to URL
'making sure the page is done loading
Do: DoEvents: Loop Until .ReadyState = 4
'Get Price
ActiveCell.Offset(0, 1).Value = .document.getElementByID("priceblock_ourprice").innertext
'Get offer title
ActiveCell.Offset(0, 2).Value = .document.getElementByID("productTitle").innertext
End With
End If
End Sub
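Because ActiveCell may already have moved by the time the event fires, one option is to work from the Target range that the event receives. This is only a sketch, assuming the getdata(searchTerm As String) variant shown above is available in a standard module; the variable names changed and cell are illustrative:

```vba
Private Sub Worksheet_Change(ByVal Target As Range)
    Dim changed As Range
    Dim cell As Range

    'Only the edited cells that lie inside A1:A102
    Set changed = Intersect(Target, Me.Range("A1:A102"))
    If changed Is Nothing Then Exit Sub

    For Each cell In changed.Cells
        'Skip error values and cleared cells
        If Not IsError(cell.Value) Then
            If Len(cell.Value) > 0 Then getdata CStr(cell.Value)
        End If
    Next cell
End Sub
```

Looping over the intersection also covers the case where several cells are changed at once, for example by pasting a block of ASINs.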

VBA Web search button - GetElementsbyClassName

I have a problem with my VBA code.
I would like to open this website: https://www.tnt.com/express/en_us/site/tracking.html and put the values of the active cells from my Excel file into the Shipment numbers search box. To start with, I tried to enter just a fixed text, for example "777777".
I wrote the code below, but unfortunately the search box stays empty and no error is raised. I have tried everything and have no idea what I should change in my code.
Any clues? Thank you in advance.
HTML:
<input class="__c-form-field__text ng-touched ng-dirty ng-invalid" formcontrolname="query" pbconvertnewlinestocommasonpaste="" pbsearchhistorynavigation="" shamselectalltextonfocus="" type="search">
VBA:
Sub TNT2_tracker()
Dim objIE As InternetExplorer
Dim aEle As HTMLLinkElement
Dim y As Integer
Dim result As String
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate "https://www.tnt.com/express/en_us/site/tracking.html"
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Dim webpageelement As Object
For Each webpageelement In objIE.document.getElementsByClassName("input")
If webpageelement.Class = "__c-form-field__text ng-pristine ng-invalid ng-touched" Then
webpageelement.Value = "777"
End If
Next webpageelement
End Sub
You could use querySelector with the class name to find an element.
Something like:
'Find the input box
objIE.document.querySelector("input.__c-form-field__text").value = "test"
'Find the search button and do a click
objIE.document.querySelector("button.__c-btn").Click
No need to loop through elements, unless the site allows you to search multiple tracking numbers at the same time.
It seems automating this page is a little tricky. Simply changing the value of the input field doesn't work: nothing happens when you click the submit button.
A look in the DOM inspector shows several events for the input field. I checked them out, and it seems we need to paste the value from the clipboard by triggering the paste event of the shipment number field.
For this to work without Internet Explorer prompting, its security settings for the Internet zone must be set to allow pasting from the clipboard. I'm using a German version of IE, so I have trouble explaining how to find the setting.
This macro works for me:
Sub TNT2_tracker()
Dim browser As Object
Dim url As String
Dim nodeDivWithInputField As Object
Dim nodeInputShipmentNumber As Object
Dim textToClipboard As Object
'Dataobject by late binding to use the clipboard
Set textToClipboard = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
url = "https://www.tnt.com/express/en_us/site/tracking.html"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set browser = CreateObject("internetexplorer.application")
browser.Visible = True
browser.navigate url
Do Until browser.ReadyState = 4: DoEvents: Loop
'Manual pause so the page can load completely
'Application.Wait (Now + TimeSerial(pause_hours, pause_minutes, pause_seconds))
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get div element with input field for shipment number
Set nodeDivWithInputField = browser.Document.getElementsByClassName("pb-search-form-input-group")(0)
If Not nodeDivWithInputField Is Nothing Then
'If we got the div element ...
'First child element is the input field
Set nodeInputShipmentNumber = nodeDivWithInputField.FirstChild
'Put shipment number to clipboard
textToClipboard.setText "7777777"
textToClipboard.PutInClipboard
'Insert the value by triggering the paste event of the input field
Call TriggerEvent(browser.Document, nodeInputShipmentNumber, "paste")
'Click button
browser.Document.getElementsByClassName("__c-btn")(0).Click
Else
MsgBox "No input field for shipment number found."
End If
End Sub
And this procedure triggers an HTML event:
Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)
Dim theEvent As Object
htmlElementWithEvent.Focus
Set theEvent = htmlDocument.createEvent("HTMLEvents")
theEvent.initEvent eventType, True, False
htmlElementWithEvent.dispatchEvent theEvent
End Sub
As @Stavros Jon alludes to, there is a browserless way using an XHR GET request to the API. It returns JSON, so ideally you need a JSON parser to handle the response.
I use jsonconverter.bas as the JSON parser. Download the raw code from here and add it to a standard module called JsonConverter. You then need to go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.
Example request with dummy tracking number (deliberately passed as string):
Option Explicit
Public Sub TntTracking()
Dim json As Object, ws As Worksheet, trackingNumber As String
trackingNumber = "1234567" 'test input value. Currently this is not a valid input but is for demo.
Set ws = ThisWorkbook.Worksheets("Sheet1") 'for later use if writing something specific out
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.tnt.com/api/v3/shipment?con=" & trackingNumber & "&searchType=CON&locale=en_US&channel=OPENTRACK", False
.send
Set json = JsonConverter.ParseJson(.responseText)
End With
'do something with results
Debug.Print json("tracker.output")("notFound").Count > 0
Debug.Print JsonConverter.ConvertToJson(json("tracker.output")("notFound"))
End Sub

Using VBA to extract data from website, but getting run time error '91'

I'm quite new to VBA and am having a problem with this error:
Run-time error '91': Object variable or With block variable not set
I'm trying to extract data from a website and paste it into an Excel document. My Excel document is Book2 and my module is called Module1. I'll paste the code below.
Sub WebNavigate()
Dim CreatingObject As Object
Dim WebNavigate As Object
Set objIE = CreatingObject("InternetExplorer.Application")
WebSite = "website link"
With objIE
.Visable = True
.navigate WebSite
Do While .Busy Or .readyState <> 4
DoEvents
Loop
Set elements = .document.getElementByClass("timark")
Sheet1.Cells(i, 8) = element.innerText
End With
End Sub
In the absence of HTML/URL to go with:
1) Your spelling of Visible is incorrect
2) The following:
Set elements = .document.getElementByClass("timark")
It is missing an s, as it returns a collection, and should be ClassName:
Set elements = .document.getElementsByClassName("timark")
3) You may need a pause or loop to ensure elements is available on the page.
4) This
Sheet1.Cells(i, 8) = element.innerText
You don't yet have element declared and assigned (you also don't have elements declared). You can use it in a For Each loop,
e.g.
Dim element As Object, elements As Object
Set elements = .document.getElementsByClassName("timark")
For Each element In elements
'work with element.innerText here
Next element
5) CreatingObject should be CreateObject (as already noted) and you need to declare objIE
Dim objIE As Object
Set objIE = CreateObject("InternetExplorer.Application")
6) i is not declared anywhere and, once it is, must be at least 1, as there is no row 0 in the sheet. Also, i suggests a loop, of which there is no sign; in a loop it should be incremented to avoid overwriting the same cell.
7) Dim WebNavigate As Object is unassigned and not needed at present in the code.
8) To avoid many of the above use Option Explicit at the top of your code (As already mentioned).
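Putting the points above together, a corrected version might look like the following. This is a sketch only: "website link" is the placeholder from the question, the class name "timark" and column 8 are taken from the original code, and starting at row 1 and calling .Quit at the end are assumptions, since the question does not say which rows to fill or whether the browser should stay open.

```vba
Option Explicit

Sub WebNavigate()
    Dim objIE As Object
    Dim elements As Object
    Dim element As Object
    Dim webSite As String
    Dim i As Long

    webSite = "website link" 'placeholder from the question
    Set objIE = CreateObject("InternetExplorer.Application")

    With objIE
        .Visible = True
        .navigate webSite
        Do While .Busy Or .readyState <> 4
            DoEvents
        Loop

        'Collection of all elements with class "timark"
        Set elements = .document.getElementsByClassName("timark")

        i = 1 'first row to write to (assumed)
        For Each element In elements
            Sheet1.Cells(i, 8).Value = element.innerText
            i = i + 1 'move down so the same cell is not overwritten
        Next element

        .Quit
    End With
End Sub
```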

HTML object library / pull

I have the following code in an HTML web page, and I am trying to use the html object library via vba engine to pull the value from within this tag:
<input name="txtAdd_Line1" disabled="disabled" size="30" maxLength="50" value="123 N 1ST ST"/>
I figure I have to use .getelementsbytagname or .getelementsbyname, but I am not sure how to grab the value. Does anyone have any ideas?
Here's an example with comments; substitute your actual address:
Sub Example()
'Declare needed variables
Dim ie, elements
Dim x As Long
'Create IE Application
Set ie = CreateObject("InternetExplorer.Application")
'Navigate to the website
ie.navigate "C:\test.html" 'Substitute your actual address
'Wait for website to finish loading
Do While ie.ReadyState <> 4
DoEvents
Loop
'Find the elements
Set elements = ie.document.getelementsbyName("txtAdd_Line1")
'Display the value of each returned element
For x = 0 To elements.Length - 1
MsgBox elements(x).Value
Next
'Quit IE
ie.Quit
End Sub
Based on your comment, most likely the top-level document wasn't the layer of the tree you wanted. Try this:
Set HTMLDoc = ie.document.frames("MainFrame").document
With HTMLDoc
'This returns an object which contains a collection of all matching elements
Set a = .getElementsByName("txtAdd_Line1")
End With
For x = 0 To a.Length - 1
MsgBox a(x).Value
Next
You can use the CSS selector input[name='txtAdd_Line1']. This matches an element with the input tag whose name attribute has the value 'txtAdd_Line1'.
You apply a CSS selector using the .querySelector method of document, e.g.
MsgBox ie.document.querySelector("input[name='txtAdd_Line1']").Value
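querySelector returns Nothing when no element matches, so it can be worth checking the result before reading a property. A minimal sketch, assuming the same placeholder address C:\test.html used in the earlier answer and that the page renders in a document mode where IE supports querySelector; the names ReadInputValue and nodeInput are illustrative:

```vba
Sub ReadInputValue()
    Dim ie As Object
    Dim nodeInput As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.navigate "C:\test.html" 'placeholder address, substitute your own

    'Wait for the page to finish loading
    Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop

    'First input element whose name attribute is txtAdd_Line1
    Set nodeInput = ie.document.querySelector("input[name='txtAdd_Line1']")

    If nodeInput Is Nothing Then
        MsgBox "Input txtAdd_Line1 not found."
    Else
        'For an input element the text is held in Value
        MsgBox nodeInput.Value
    End If

    ie.Quit
End Sub
```

If the input sits inside a frame, as the earlier answer suggests, the same call would have to be made on that frame's document instead.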