VBA: handling data in Document Object Model - html

I am currently trying to scrape data from a website using VBA. I am following this tutorial, and my code is as follows:
Sub Foo()
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
With appIE
.Navigate "https://www.ishares.com/it/investitore-privato/it/prodotti/251843/ishares-euro-high-yield-corporate-bond-ucits-etf"
.Visible = True
End With
Do While appIE.Busy
DoEvents
Loop
Set allRowOfData = appIE.document.getElementsByClassName("visible-data totalNetAssets")
Dim myValue As String
myValue = allRowOfData.Cells(1).innerHTML
MsgBox myValue
End Sub
Unfortunately, there are some differences between the data I want to scrape and the data used in the example. This line
myValue = allRowOfData.Cells(1).innerHTML
is flagged as an error by the VBA debugger.
Could anyone explain why it doesn't work, and how I should choose the right method for scraping HTML pages?

getElementsByClassName returns a collection of elements, not a single element, so you need to index into allRowOfData before accessing the members of an individual element. Try this change:
myValue = allRowOfData(0).Cells(1).innerHTML
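A minimal sketch of the same idea, with a guard for the case where nothing matches. The URL is a placeholder, and the readyState check and the use of innerText are assumptions added for illustration; it also assumes the page renders in a document mode that supports getElementsByClassName.

```vba
Sub ReadFirstMatch()
    Dim appIE As Object
    Dim matches As Object

    Set appIE = CreateObject("InternetExplorer.Application")
    appIE.Visible = True
    appIE.Navigate "https://www.example.com/" ' placeholder URL

    ' Wait until the browser is idle and the document is fully loaded
    Do While appIE.Busy Or appIE.readyState <> 4
        DoEvents
    Loop

    ' getElementsByClassName returns a collection, even for a single match
    Set matches = appIE.document.getElementsByClassName("visible-data totalNetAssets")

    If matches.Length > 0 Then
        ' Index into the collection to get one element, then read its text
        MsgBox matches(0).innerText
    Else
        MsgBox "No element with that class was found."
    End If

    appIE.Quit
    Set appIE = Nothing
End Sub
```

Checking Length first avoids a run-time error when the class is missing or the content has not been rendered yet.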

Related

How to get excel VBA to find a name then click the hyperlink

This is the piece I am stuck on, everything else is working.
Sub Avidie()
Dim i As Long
Dim url As String
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
Set links = .document.getelementbyclass("invoice-number-hyperlink").getelementsbyname("25407")(0).Click
I have the web page's HTML, shown below, and I want my code to find the link that has that class and name and click it. How can I fix this? The class name is used in several places on the page, but the name is what I'm looking up, so I need to find the link by that.
<a class="invoice-number-hyperlink" onclick="avid.navigateHelper['PAQ-Invoice'](this);" href="#/invoices/70c2373a-ac71-43f2-a9e0-8d0332f1a19b?fromQueue=true">25407</a>
Based on the code you provided and the methods other community members have mentioned, you can loop over all the <a> elements that have the class, filter them by text content, and click the one that matches. Here is a simple example that I hope helps:
Dim appIE As InternetExplorerMedium
Set appIE = New InternetExplorerMedium
sURL = "https://www.example.com/"
With appIE
    .navigate sURL
    .Visible = True
End With
Do While appIE.Busy Or appIE.readyState <> 4
    DoEvents
Loop
'appIE.document.getElementsByClassName("invoice-number-hyperlink")(0).Click
For Each link In appIE.document.getElementsByClassName("invoice-number-hyperlink")
    If link.Text = "25407" Then
        link.Click
        Exit For
    End If
Next
And the test page:
<a class="invoice-number-hyperlink" onclick="window.alert('jump to google...')" href="https://www.google.com/">25407</a>

How to look up HTML input name then assign a value in VBA?

I am fairly new to VBA and am trying to log in to a website. My code opens the website and then looks up the input names for the username and password elements. This matters because the names are slightly different every time the page opens, so I can't just inspect the element name and use it as a fixed value.
The username input name always starts with "txt_1_" then some numbers and the password goes "txt_2_" and some numbers. That is the reason I have the if statement that looks for the names similar to those.
The error I am currently receiving is "Run-time error '438': Object doesn't support this property or method"
Below is what I have so far:
Sub login()
Const Url$ = "examplewebsite.com"
Dim HTMLDoc As HTMLDocument
Dim oHTML_Element As IHTMLElement
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Navigate Url
ieBusy ie
.Visible = True
Set HTMLDoc = ie.Document
Dim ID1 As String, ID2 As String
For Each oHTML_Element In HTMLDoc.getElementsByTagName("input")
If oHTML_Element.Name Like "txt_1*" Then ID1 = oHTML_Element.Name
If oHTML_Element.Name Like "txt_2*" Then ID2 = oHTML_Element.Name
Next
Debug.Print ID1
Debug.Print ID2
Dim oLogin As Object, oPassword As Object
Set oLogin = .Document.getElementsByTagName(ID1)
Set oPassword = .Document.getElementsByTagName(ID2)
oLogin.Value = "MyUsername"
oPassword.Value = "MyPassword"
.Document.forms(0).submit
End With
End Sub
Sub ieBusy(ie As Object)
Do While ie.Busy Or ie.ReadyState < 4
DoEvents
Loop
End Sub
Issue
getElementsByTagName returns a collection of elements with the given tag name. That is why your For Each loop works: it iterates over every element whose tag name is input.
However, assigning oLogin and oPassword with getElementsByTagName does not work, because you are passing it the value of a name attribute rather than a tag name. The call returns an empty collection, and a collection has no Value property, which is what raises run-time error 438.
Solution
Using your code you can simply assign oLogin and oPassword in your for loop.
For Each oHTML_Element In HTMLDoc.getElementsByTagName("input")
    If oHTML_Element.Name Like "txt_1*" Then Set oLogin = oHTML_Element
    If oHTML_Element.Name Like "txt_2*" Then Set oPassword = oHTML_Element
Next
Another option is a CSS attribute selector with the starts-with operator (^=). The pattern is [attribute^="value"], so in your case input[name^="txt_1"] and input[name^="txt_2"].
Use these with querySelector instead of getElementsByTagName. Because querySelector returns the first matching element, you no longer need the loop.
Set oLogin = HTMLDoc.querySelector("input[name^=""txt_1""]")
Set oPassword = HTMLDoc.querySelector("input[name^=""txt_2""]")
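A hedged sketch of how the attribute-selector approach might look as a complete routine. The URL and credentials are placeholders, the variable names are illustrative, and the Nothing check is an added assumption to cover pages where the inputs are missing or not yet rendered.

```vba
Sub LoginWithSelectors()
    Dim ie As Object
    Dim loginBox As Object
    Dim passwordBox As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.Navigate "https://www.example.com/login" ' placeholder URL

    ' Wait until the browser is idle and the document is fully loaded
    Do While ie.Busy Or ie.readyState < 4
        DoEvents
    Loop

    ' ^= matches inputs whose name attribute starts with the given prefix
    Set loginBox = ie.document.querySelector("input[name^='txt_1_']")
    Set passwordBox = ie.document.querySelector("input[name^='txt_2_']")

    ' Test each reference separately; VBA's Or does not short-circuit
    If loginBox Is Nothing Then
        MsgBox "Username input not found."
        Exit Sub
    End If
    If passwordBox Is Nothing Then
        MsgBox "Password input not found."
        Exit Sub
    End If

    loginBox.Value = "MyUsername"
    passwordBox.Value = "MyPassword"
    ie.document.forms(0).submit
End Sub
```

Single quotes inside the selector avoid the doubled double quotes that VBA string literals would otherwise require.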
A few extra notes
You are late binding the Microsoft Internet Controls library; you would be better off early binding it by adding a reference to the library.
Try to avoid Systems Hungarian notation in your variable names: the o prefix on oLogin and oPassword makes the code harder for others to read.
Avoid putting several declarations on a single line. It makes mistakes easy and the code harder to read.

How can I pull data from website using vba

I am new to VBA and to pulling data from websites. I normally use the code below to connect to a page and inspect its elements, but with my firm's web app I cannot inspect the data through a watch in VBA: when I add a watch on the class, it shows nothing. What should I do?
(Screenshots: HTML code from my firm's web app, 1 and 2.)
Sub Connect_web()
Dim ie As InternetExplorer
Dim doc As HTMLdocument
Dim ele As IHTMLElement
Dim col As IHTMLElementCollection
Dim ele_tmp As IHTMLElement
Set ie = New InternetExplorer
URL = "" ' Cannot provide
ie.Visible = True
ie.navigate URL
Do While ie.readyState <> READYSTATE_COMPLETE
Application.StatusBar = "Loading Page..."
DoEvents
End If
Loop
Set doc = ie.Document
Set ele = doc.getElementByClassName("GDB3EHGDHLC")
end sub
Let's start with four things:
1) Instead of .Navigate use .Navigate2
2) Use a proper wait
While ie.Busy Or ie.readyState < 4: DoEvents: Wend
3) Correct the syntax of your Set ele line. The method is getElementsByClassName, plural, because it returns a collection; you are missing the s at the end of Element.
As you have declared ele as a single element, assign the collection to a separate variable first and then index into it.
Dim eles As Object, ele As Object
Set eles = doc.getElementsByClassName("GDB3EHGDHLC")
Set ele = eles(0)
4) You should always use id over other attributes, if possible, as id is usually quicker for retrieval. There is an id against that class name in your image (highlighted element). I am not going to try and type it all out. Please share your HTML using the snippet tool, by editing your question, so we can relate to your html in answer easily.
Set ele = doc.getElementById("gwt-debug-restOfIdStringGoesHere")
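Put together, the first three points might look like the sketch below. It is late bound so that it needs no library references; the URL is a placeholder, and the Length check and the Debug.Print of innerText are illustrative assumptions.

```vba
Sub ConnectWeb()
    Dim ie As Object
    Dim eles As Object
    Dim ele As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.Navigate2 "https://www.example.com/" ' placeholder URL

    ' Proper wait: browser idle and document fully loaded
    While ie.Busy Or ie.readyState < 4: DoEvents: Wend

    ' Collection first, then index into it to get a single element
    Set eles = ie.document.getElementsByClassName("GDB3EHGDHLC")
    If eles.Length > 0 Then
        Set ele = eles(0)
        Debug.Print ele.innerText
    Else
        Debug.Print "No element with that class name was found."
    End If

    ie.Quit
    Set ie = Nothing
End Sub
```

If the collection is empty even after the wait, the content may be rendered later by script or sit inside a frame, in which case the watch window will also show nothing for that class.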

How to get span id value into excel VBA?

Ok so here is my entire code:
Private Sub CommandButton1_Click()
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
With appIE
.Navigate "http://finance.yahoo.com/q/ks?s=" & "AAPL"
.Visible = True
End With
Do While appIE.Busy
DoEvents
Loop
Set getPrice = appIE.Document.getElementById("yfs_l84_aapl")
Dim myValue As String: myValue = getPrice.Cells(1).innerHTML
appIE.Quit
Set appIE = Nothing
Range("B1").Value = myValue
End Sub
And here is the HTML that I'm trying to read into Excel (specifically, I need the 113.92):
<span id="yfs_l84_aapl">113.92</span>
What do I have to change in these two lines of code to read a "span id"?
Set getPrice = appIE.Document.getElementById("yfs_l84_aapl")
Dim myValue As String: myValue = getPrice.Cells(1).innerHTML
Or, alternatively, is there a way just to read whatever is directly after "yfs_184"??
I'm brand new to coding and am working very hard to get better, so any help is really appreciated!! Thanks! :)
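getElementById returns a single element, and a span has no Cells collection, so read the element's text directly: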
myValue = getPrice.innerText
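As a small sketch, the lookup can be wrapped in a helper that returns an empty string when the id is not present. The function name GetTextById is hypothetical; doc is expected to be the document object of a loaded page.

```vba
' Hypothetical helper: returns the text of the element with the given id,
' or an empty string when no such element exists in the document.
Function GetTextById(doc As Object, elementId As String) As String
    Dim node As Object

    Set node = doc.getElementById(elementId)
    If Not node Is Nothing Then
        GetTextById = node.innerText
    End If
End Function
```

In the code from the question it could be called as Range("B1").Value = GetTextById(appIE.Document, "yfs_l84_aapl") before appIE.Quit.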

HTML Page Title in Excel VBA

Given a URL, how can I get the title of the HTML page in VBA in Excel?
For example, suppose I have three URLs like:
http://url1.com/somepage.html
http://url2.com/page.html
http://url3.com/page.html
Now I need to get the title of these html pages in another column. How do I do it?
Remou's answer was VERY helpful for me, but it caused a problem: it doesn't close the Internet Explorer process. Since I needed to run this dozens of times, I ended up with too many IE instances open, and my computer couldn't handle it.
So just add
wb.Quit
and everything will be fine.
This is the code that works for me:
Function GetTitleFromURL(sURL As String)
Dim wb As Object
Dim doc As Object
Set wb = CreateObject("InternetExplorer.Application")
wb.Navigate sURL
While wb.Busy
DoEvents
Wend
GetTitleFromURL = wb.Document.Title
wb.Quit
Set wb = Nothing
End Function
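To fill a neighbouring column as the question asks, the function can be called in a loop. This is only a sketch: it assumes the URLs sit in column A of the active sheet starting at row 1, that the titles go into column B, and that GetTitleFromURL above is in the same module. The name FillTitles is hypothetical.

```vba
Sub FillTitles()
    Dim lastRow As Long
    Dim r As Long

    With ActiveSheet
        ' Last used row in column A (assumed to hold the URLs)
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row

        For r = 1 To lastRow
            If Len(.Cells(r, "A").Value) > 0 Then
                ' Write the page title next to its URL
                .Cells(r, "B").Value = GetTitleFromURL(CStr(.Cells(r, "A").Value))
            End If
        Next r
    End With
End Sub
```

Each call opens and closes its own Internet Explorer instance, so long lists will be slow.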
I am not sure what you mean by title, but here is an idea:
Dim wb As Object
Dim doc As Object
Dim sURL As String
Set wb = CreateObject("InternetExplorer.Application")
sURL = "http://lessthandot.com"
wb.Navigate sURL
While wb.Busy
DoEvents
Wend
''HTML Document
Set doc = wb.document
''Title
Debug.Print doc.Title
Set wb = Nothing
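If you use Selenium:
Sub Get_Title()
    ' Assumes a reference to the SeleniumBasic type library; "chrome" and the URL are examples
    Dim driver As New WebDriver
    driver.Start "chrome"
    driver.Get "http://url1.com/somepage.html"
    Debug.Print driver.Title
    driver.Quit
End Sub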