How can I pull data from website using vba - html

I am new at vba coding to pull data from website so generally, I use this code to connect and check item to pull data from website but this code cannot check data via watch in vba with my firm webapp. it show nothing when I add watch to the class so what should I do.HTML Code from my firm webapp 1
HTML Code from my firm webapp 2
Sub Connect_web()
Dim ie As InternetExplorer
Dim doc As HTMLdocument
Dim ele As IHTMLElement
Dim col As IHTMLElementCollection
Dim ele_tmp As IHTMLElement
Set ie = New InternetExplorer
URL = "" ' Cannot provide
ie.Visible = True
ie.navigate URL
Do While ie.readyState <> READYSTATE_COMPLETE
Application.StatusBar = "Loading Page..."
DoEvents
End If
Loop
Set doc = ie.Document
Set ele = doc.getElementByClassName("GDB3EHGDHLC")
end sub

Let's start with four things:
1) Instead of .Navigate use .Navigate2
2) Use a proper wait
While ie.Busy Or ie.readyState < 4: DoEvents: Wend
3) Correct the syntax of your Set ele line. You are using ByClassNamewhich returns a collection and therefore is plural. You are missing the s at the end of element.
As you have declared ele as singular (element), perhaps first set the collection into a separate variable and index into that collection.
Dim eles As Object, ele As Object
Set eles = doc.getElementsByClassName("GDB3EHGDHLC")
Set ele = eles(0)
4) You should always use id over other attributes, if possible, as id is usually quicker for retrieval. There is an id against that class name in your image (highlighted element). I am not going to try and type it all out. Please share your HTML using the snippet tool, by editing your question, so we can relate to your html in answer easily.
Set ele = doc.getElementById("gwt-debug-restOfIdStringGoesHere")

Related

Can we fetch the specific data via using urls in vba

I have 15 different URLs, and I need to fetch price from the particular website in Excel a particular column, can you please help me out. It's my first VBA program and I try but it show my syntax error.
Sub myfile()
Dim IE As New InternetExplorer Dim url As String Dim item As
HTMLHtmlElement Dim Doc As HTMLDocument Dim tagElements As Object
Dim element As Object Dim lastRow Application.ScreenUpdating =
False Application.DisplayAlerts = False Application.EnableEvents =
False Application.Calculation = xlCalculationManual url =
"https://wtb.app.channeliq.com/buyonline/D_nhoFMJcUal_LOXlInI_g/TOA-60?html=true"
IE.navigate url IE.Visible = True Do DoEvents Loop Until
IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
lastRow = Sheet1.UsedRange.Rows.Count + 1 Set tagElements =
Doc.all.tags("tr") For Each element In tagElements
If InStr(element.innerText, "ciq-price")> 0 And
InStr(element.className, "ciq-product-name") > 0 Then
Sheet1.Cells(lastRow, 1).Value = element.innerText
' Exit the for loop once you get the temperature to avoid unnecessary processing
Exit For End If Next
IE.Quit Set IE = Nothing Application.ScreenUpdating = True
Application.DisplayAlerts = True Application.EnableEvents = True
Application.Calculation = xlCalculationAutomatic
End Sub
You can't copy any web scraping macro for your purposes. Every page has it's own HTML code structure. So you must write for every page an own web scraping macro.
I can't explain all about web scraping with VBA here. Please start your recherche for information with "excel vba web scraping" and "document object model". Further you need knowlege about HTML and CSS. In best case also about JavaScript:
The error message user-defined type not defined ocours because you use early binding without a reference to the libraries Microsoft HTML Object Library and Microsoft Internet Controls. You can read here how to set a reference via Tools -> References... and about the differences between early and late binding Early Binding v/s Late Binding and here deeper information from Microsoft Using early binding and late binding in Automation
To get the prices from the shown url you can use the following macro. I use late binding:
Option Explicit
Sub myfile()
Dim IE As Object
Dim url As String
Dim tagElements As Object
Dim element As Object
Dim item As Object
Dim lastRow As Long
lastRow = ActiveSheet.UsedRange.Rows.Count + 1
url = "https://wtb.app.channeliq.com/buyonline/D_nhoFMJcUal_LOXlInI_g/TOA-60?html=true"
Set IE = CreateObject("internetexplorer.application")
IE.navigate url
IE.Visible = True
Do: DoEvents: Loop Until IE.readyState = 4
Set tagElements = IE.document.getElementsByClassName("ciq-online-offer-item ")
For Each element In tagElements
Set item = element.getElementsByTagName("td")(1)
ActiveSheet.Cells(lastRow, 1).Value = Trim(item.innerText)
lastRow = lastRow + 1
Next
IE.Quit
Set IE = Nothing
End Sub
Edit for a second Example:
The new link leads to an offer. I assume the price of the product is to be fetched. No loop is needed for this. You just have to find out in which HTML segment the price is and then you can decide how to get it. In the end there are only two lines of VBA that write the price into the Excel spreadsheet.
I'm in Germany and Excel has automatically set the currency sign from Dollar to Euro. This is of course wrong. Depending on where you are, this may have to be intercepted.
Sub myfile2()
Dim IE As Object
Dim url As String
Dim tagElements As Object
Dim lastRow As Long
lastRow = ActiveSheet.UsedRange.Rows.Count + 1
url = "https://www.wayfair.com/kitchen-tabletop/pdx/cuisinart-air-fryer-toaster-oven-cui3490.html"
Set IE = CreateObject("internetexplorer.application")
IE.navigate url
IE.Visible = True
Do: DoEvents: Loop Until IE.readyState = 4
'Break for 3 seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
Set tagElements = IE.document.getElementsByClassName("BasePriceBlock BasePriceBlock--highlight")(0)
ActiveSheet.Cells(lastRow, 1).Value = Trim(tagElements.innerText)
IE.Quit
Set IE = Nothing
End Sub

Using VBA to navigate a href in IE HTML document

I am trying to use VBA to navigate IE to search some of the information for me. However, after the code is written and I run it. I found everything is fine before the part of clicking href, but if I use "F8" to execute the code one by one and it works. Unfortunately, I haven't found some situation like mine in this forum. Can someone give some hints to me? Thanks
Sub GetData()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLButton As MSHTML.IHTMLElement
Dim HTMLAllhref As MSHTML.IHTMLElementCollection
studentid = 12345678
Set IE = New InternetExplorerMedium
IE.Visible = True
IE.Navigate "https://xxx.xxx"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
Set HTMLInput = HTMLDoc.getElementById("abcd")
HTMLInput.Focus
HTMLInput.FireEvent("onchange")
HTMLInput.Value = studentid
Set HTMLButton = HTMLDoc.getElementById("submit")
HTMLButton.Click
The part above works properly, but the following part works only if I use "F8" to execute one by one
Set HTMLAllhref = HTMLDoc.GetElementsByTagName ("a")
For Each link in HTMLAllhref
If InStr(link.innerText, studentid) Then
IE.Navigate link.href
Exit For
End If
Next
End Sub
1) If works with F8 it is usually a timing issue and you need to determine where a wait needs to happen.
As a minimum be sure to use a proper wait after each .Navigate2, .Click and .Submit to allow for page loading:
While IE.Busy Or ie.readyState < 4: DoEvents: Wend
Potentially also after:
HTMLInput.FireEvent("onchange")
2) The recommended method is .navigate2, not .navigate.

Pull value from website (HTML div class) using Excel VBA

I'm trying to automate going to a website and pulling the ratings from several apps.
I've figured out how to navigate and login to the page.
How do I pull the element - the number "3.3" in this case - from this specific section into Excel.
Being unfamiliar with HTML in VBA, I got this far following tutorials/other questions.
Rating on website and the code behind it
Sub PullRating()
Dim HTMLDoc As HTMLDocument
Dim ie As InternetExplorer
Dim oHTML_Element As IHTMLElement
Dim sURL As String
On Error GoTo Err_Clear
sURL = "https://www.appannie.com/account/login/xxxxxxxxxx"
Set ie = New InternetExplorer
ie.Silent = True
ie.navigate sURL
ie.Visible = True
Do
'Wait until the Browser is loaded
Loop Until ie.readyState = READYSTATE_COMPLETE
Set HTMLDoc = ie.Document
HTMLDoc.all.Email.Value = "xxxxxxxxx#xxx.com"
HTMLDoc.all.Password.Value = "xxxxx"
For Each oHTML_Element In HTMLDoc.getElementById("login-form")
If oHTML_Element.Type = "submit" Then oHTML_Element.Click: Exit For
Next
Dim rating As Variant
Set rating = HTMLDoc.getElementsByClassName("rating-number ng-binding")
Range("A1").Value = rating
'ie.Refresh 'Refresh if required
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub
The code below will let you extract text from first element with class name "rating-number ng-binding" in HTML document. By the way GetElementsByClassName is supported since IE 9.0. I use coding compatible also with older versions in my example.
Dim htmlEle1 as IHTMLElement
For Each htmlEle1 in HTMLDoc.getElementsByTagName("div")
If htmlEle1.className = "rating-number ng-binding" then
Range("A1").Value = htmlEle1.InnerText
Exit For
End if
Next htmlEle1
While Ryszards code should do the trick if you want to use the code you have already written then here is the alterations I believe you need to make.
For Each oHTML_Element In HTMLDoc.getElementById("login-form")
If oHTML_Element.Type = "submit" Then oHTML_Element.Click: Exit For
Next
'Need to wait for page to load before collecting the value
Loop Until ie.readyState = READYSTATE_COMPLETE
Dim rating As IHTMLElement
Set rating = HTMLDoc.getElementsByClassName("rating-number ng-binding")
'Need to get the innerhtml of the element
Range("A1").Value = rating.innerhtml

How to get a particular link from a web page's source code?

i need a macro in VBA that is able to extract all HTML source code from an url contained in a EXCEL cell and put it line by line in all different Excel cells.
I've previously searched different solutions on the net but not finding the right one.
Thanks for helping ;)
EDIT:
thanks to the libraries just insert i could also test another macro that i've previously found on the net:
Sub Naviga()
Dim texto As String
Dim objIE As Object
Dim DestUrl As String
DestUrl = "http://www.google.it"
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = False
objIE.Navigate2 DestUrl
Do
DoEvents
Loop Until objIE.ReadyState = READYSTATE_COMPLETE
Range("A" & 1).Value = objIE.document.body.innerHTML
End Sub
and it's works, but unfortunately i would like that the link was acquired directly from a cell in excel, and when the line is copied, the next line, start with the next cell, the cell below.
How can i modify the macro?
EDIT 2:
The solution is near, i've just fixed the code, now is more clean:
Sub EstrSorgPag()
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False
IE.navigate Range("H1")
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Range("A" & 1).Value = IE.document.body.innerHTML
End Sub
but lacks the last part where the macro should copy the content cell by cell (A1,A2,A3,A4... and so on)
EDIT 3:
Hello guys, i wrote this short code that extract all links from a web page's source code:
Sub EstraiURLdaWeb()
Dim doc As HTMLDocument
Dim output As Object
Set IE = New InternetExplorer
IE.Visible = False
IE.navigate Range("L1")
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set doc = IE.document
Set output = doc.getElementsByTagName("a")
i = 5
For Each link In output
Range("A" & i).Value = link
i = i + 1
Next
MsgBox "Fatto!"
End Sub
But i would need to extract this in particular:
<li class="bubble"><span>Main</span></li>
how can I do?
Verify the <a>'s InnerHTML or InnerText
If you already got all the <a> tag elements, you can loop through them all (you got this already) and create a logical condition, if each particular element contains the keyword you are looking for.
Set output = doc.getElementsByTagName("a")
For Each link In output
If link.InnerHTML = "Main" Then
Range("A" & i).Value2 = link
End If
Next
Combine more GetElement(s) methods
To get more narrow collection of HTML elements, you can combine multiple GetElement(s) methods. Like so:
You can get all the HTML elemens with specific class:
Set BubbleCollection = doc.getElementsByClassName("bubble")
Then you can scan this collection for <a> tags:
Set output = BubbleCollection.getElementsByTagName("a")
Check how many elements you've got (optional for debugging/refining the search):
Debug.Print output.length

Cycling Through List of URLs Using Excel VBA

I am much more familiar with Excel now, but one thing is still baffling me - how to cycle through URLs in a loop. My current conundrum is that I have this list of URLs of packages, and need to obtain the status of each package on each page using its HTML. What I currently have to cycle through the list is:
Sub TrackingDeliveryStatusResults()
Dim IE As Object
Dim URL As Range
Dim wb1 As Workbook, ws1 As Worksheet
Dim filterRange As Range
Dim copyRange As Range
Dim lastRow As Long
Set wb1 = Application.Workbooks.Open("\\S51\******\Folders\******\TrackingDeliveryStatus.xls")
Set ws1 = wb1.Worksheets("TrackingDeliveryStatusResults")
Set IE = New InternetExplorer
With IE
.Visible = True
For Each URL In Range("C2:C & lastRow")
.Navigate URL.Value
While .Busy Or .ReadyState <> 4: DoEvents: Wend
MsgBox .Document.body.innerText
Next
End With
End Sub
And the list of URLs
My goal here is:
Cycle through each URL (inserts URL in IE and keeps going without opening new tabs)
Obtain the status of the item for each URL from the HTML element
FedEx: Delivered (td class="status")
UPS: Delivered (id="tt_spStatus")
USPS: Arrived at USPS Facility (class= "info-text first)
Finish the loop and save as a csv if at all possible (I've already done that, so I'm just posting the code portion I'm having a problem with).
My understanding is that I have to code a different if statement for each different url, since all of them have different HTML tags for their delivery status. Loops are simple, but to loop through webpages is new to me. The code has been throwing me errors no matter what changes I make.
The IE object opens up but then Excel hits an error and the code stops running.
OK Ill start with the proper syntax for you to get your code going and I will edit this answer for further code
Sub Sample()
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Application.EnableEvents = True
Dim wsSheet As Worksheet, Rows As Long, links As Variant, IE As Object, link As Variant
Set wb = ThisWorkbook
Set wsSheet = wb.Sheets("Sheet1")
Set IE = New InternetExplorer
Rows = wsSheet.Cells(wsSheet.Rows.Count, "A").End(xlUp).Row
links = wsSheet.Range("A1:A" & Rows)
With IE
.Visible = True
For Each link In links
.navigate (link)
While .Busy Or .ReadyState <> 4: DoEvents: Wend
MsgBox .Document.body.innerText
Next link
End With
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
This will get you looping I think you had some general syntax issues which you can see the difference in my code in order to loop through in the for each the link has to be of type object or variant and links I set to variant assuming it will default to a string