Getting information from HTML page via VBA

From VBA, I am trying to access the "username" field on a web page so that I can type in the appropriate username.
The problem is that the page's HTML contains more than one element with the same name, "LOGON_USERID", and I can't figure out how to access the right one.
As you can see in the image "part of the HTML code", the line I'm trying to access is the highlighted one, but there are also two other elements with the same name above it.
part of the HTML code
I tried lots of different approaches (different methods, variable types, etc.), but since I'm not familiar with HTML I can't manage to get what I want.
Sub Pum()
Dim ie As New InternetExplorer
'Dim IEDoc As IHTMLElementCollection
Dim IEDoc As HTMLDocument
Dim name As Object
Dim nameList As HTMLInputElement
Dim WRONGS As DispHTMLElementCollection
Dim Elems As HTMLElementCollection
Dim i As Integer
ie.navigate "thewebsiteinquestion"
ie.Visible = False
WaitIE ie
Set IEDoc = ie.document
'MsgBox IEDoc.DocumentElement.
'Elems = IEDoc.getElementsByTagName("INPUT")
MsgBox TypeName(IEDoc.getElementById("LOGON_USERID").all)
Set Elems = IEDoc.getElementById("LOGON_USERID")
'For i = 0 To 5
MsgBox Elems.Length
'Next i
For Each name In Elems.Children
MsgBox name.nodeName
MsgBox name.Attributes
MsgBox name.all
Next
'If ((NameStr Isnot Nothing And (NameStr.Length <> 0)) Then
'If NameStr = "LOGON_USERID" Then
'If TypeName(IEDoc.all("LOGON_USERID")) = "HTMLInputElement" Then
'MsgBox TypeName(IEDoc.all("LOGON_USERID"))
'Set names = IEDoc.all.Item("text")
'TypeName (InputUsernameTextzone)
'Dim Question As IHTMLElement
'Question = InputUsernameTextzone.parentElement
'MsgBox TypeName(InputUsernameTextzone.parentElement.getAttribute("name"))
'InputUsernameTextzone.parentElement
'CELLULE.value = "qtc2464"
WaitIE ie
Set ie = Nothing
Set IEDoc = Nothing
End Sub
I tried two other similar codes using different methods but I still have no results. Hopefully you can help me.
If you need more information, let me know.

The other two input elements are of different type (they are hidden) so you could use querySelector with attribute type=text to find your desired element.
Dim userid As HTMLInputElement
Set userid = IEDoc.querySelector("input[name='LOGON_USERID'][type='text']")
If Not userid Is Nothing Then
    ' Continue with user id element
Else
    MsgBox "LOGON_USERID not found on the page"
End If

I am a newbie at this, but in case it helps anyone, here's a simplified version of the macro I made:
Sub Access_Puma()
Dim ie As New InternetExplorer
Dim IEDoc As HTMLDocument
Dim userid As HTMLInputElement
Dim userpwd As HTMLInputElement
ie.navigate "thewebsitetoaccess"
ie.Visible = True
WaitIE ie
Set IEDoc = ie.document
Set userid = IEDoc.querySelector("input[name='LOGON_USERID'][type='text']")
If Not userid Is Nothing Then
userid.value = "myusername"
Else
MsgBox "LOGON_USERID not found on the page"
End If
Set userpwd = IEDoc.querySelector("input[name='LOGON_PASSWD'][type='password']")
If Not userpwd Is Nothing Then
userpwd.value = "mypassword"
Else
MsgBox "LOGON_PASSWD not found on the page"
End If
End Sub
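Both macros above call WaitIE, which is never shown in the thread. Below is a minimal sketch of what such a helper might look like, assuming all it needs to do is block until the browser reports that the page has finished loading. The body is a guess, not the original author's code, and it relies on the same Microsoft Internet Controls reference the macros already need.

```vba
' Hypothetical helper: the original WaitIE is not shown in the thread.
' Blocks until the given Internet Explorer instance reports that it has
' finished loading, yielding to Excel so the application stays responsive.
Sub WaitIE(ByVal ie As InternetExplorer)
    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
End Sub
```

A page that builds its form with script after loading may still need an extra check that the element exists, which is why the Nothing test around querySelector is worth keeping.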

Related

Webpage search filter inputs not detected when apply button is pressed via Excel VBA

I'm trying to write a VBA program which will find the person holding a specific position at a specific company via LinkedIn. I've already figured out how to open the LinkedIn search window, open all filters, and input the desired company name and position, but once I hit the "Apply" button to apply those filters, they simply aren't recognized as filters. It clicks the button as if I never input anything into those filter boxes at all.
I've figured out that the problem is not my method of clicking the apply button, but instead the lack of input recognition. My input strings are only recognized as filters if their respective filter area/box is clicked on before or after the string is added.
With all that being said, I know that the solution I'm looking for is a way to input the string and then click into that same filter box before I click the apply filters button.
Below is my code to input my specific filters into LinkedIn and click the apply button.
Sub Fill()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim AllFiltersButton As MSHTML.IHTMLElement
Dim ApplyButton As MSHTML.IHTMLElement2
Dim ApplyButtons As MSHTML.IHTMLElementCollection2
Dim Company As String, Position As String
Company = Range("A2").Value
Position = ("""" + Range("B1").Value + """")
IE.Visible = True
IE.navigate "https://www.linkedin.com/search/results/people/?facetGeoRegion=%5B%22us%3A70%22%5D&origin=FACETED_SEARCH"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
Set HTMLButtons = HTMLDoc.getElementsByTagName("button")
For Each AllFiltersButton In HTMLButtons
If AllFiltersButton.className = "search-filters-bar__all-filters button-tertiary-medium-muted flex-shrink-zero mr3" Then AllFiltersButton.Click
Next AllFiltersButton
Set HTMLInput = HTMLDoc.getElementById("search-advanced-company")
HTMLInput.Value = Company
Set HTMLInput = HTMLDoc.getElementById("search-advanced-title")
HTMLInput.Value = Position
Set ApplyButtons = HTMLDoc.getElementsByTagName("button")
For Each ApplyButton In ApplyButtons
If ApplyButton.className = "search-advanced-facets__button--apply button-primary-large" Then ApplyButton.Click
Next ApplyButton
I have all the necessary references selected (Microsoft HTML Object Library, Microsoft Internet Controls, Microsoft Office 15.0 Object Library etc.)
and this portion of the code seems to work flawlessly.
After spending hours trying to figure out a way to click in these search/filter boxes, I stumbled across a method which should work, but I can't seem to adapt the code for my specific circumstance.
Set evt = ie.Document.createEvent("keyboardevent")
evt.initEvent "change", True, False
PW.all(0).dispatchEvent evt
My attempt to adapt this method to click into the company search box, which doesn't work, is as follows:
Dim vSelect As HTMLSelectElement
Dim eventClick As Object
Set vSelect = HTMLDoc.getElementById("search-advanced-company")
Set eventClick = HTMLDoc.createEvent("click")
eventClick.initEvent "change", True, False
vSelect.dispatchEvent eventClick
How should I adapt this code to click into a search/filter box?
If anyone could help me out in any way, it would be much appreciated.
Thanks in advance.
EDIT: I'm getting mixed information as to whether data scraping on LinkedIn is permitted or not. Just to clarify, I will not be using this code for data scraping to avoid any infringement issues.
After resorting to the often avoided and disrespected SendKeys function, I was finally able to make it work.
Below is the corrected code:
Sub Fill()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLInputs As MSHTML.IHTMLElementCollection2
Dim AllFiltersButton As MSHTML.IHTMLElement
Dim ApplyButton As MSHTML.IHTMLElement
Dim ApplyButtons As MSHTML.IHTMLElementCollection3
Dim Company As String, Position As String
Company = Range("A2").Value
Position = ("""" + Range("B1").Value + """")
IE.Visible = True
IE.navigate "https://www.linkedin.com/search/results/people/?facetGeoRegion=%5B%22us%3A70%22%5D&origin=FACETED_SEARCH"
Do While IE.ReadyState <> READYSTATE_COMPLETE
DoEvents 'yield to Excel while the page loads
Loop
Set HTMLDoc = IE.Document
Set HTMLButtons = HTMLDoc.getElementsByTagName("button")
For Each AllFiltersButton In HTMLButtons
If AllFiltersButton.className = "search-filters-bar__all-filters button-tertiary-medium-muted flex-shrink-zero mr3" Then AllFiltersButton.Click
Next AllFiltersButton
Set HTMLInputs = HTMLDoc.getElementsByTagName("input")
HTMLInputs(2).Value = Position
HTMLInputs(2).Focus
Application.SendKeys "{LEFT}", True
HTMLInputs(3).Value = Company
HTMLInputs(3).Focus
Application.SendKeys "{LEFT}", True
Set ApplyButtons = HTMLDoc.getElementsByTagName("button")
For Each ApplyButton In ApplyButtons
If ApplyButton.className = "search-advanced-facets__button--apply button-primary-large" Then ApplyButton.Click
Next ApplyButton
End Sub
A big "Thank you!" to @ashleedawg for the suggestion leading to the solution!
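For anyone who would rather avoid SendKeys, the event idea from the question can be sketched roughly as below. One thing that stands out in the attempt above is that createEvent was given "click": createEvent expects the name of an event interface (such as "HTMLEvents"), and the event type itself goes to initEvent. The sketch assumes the page renders in IE9 or later document mode, where createEvent and dispatchEvent exist. FireEvent is a made-up helper name, and whether LinkedIn's filter boxes actually listen for "input" or "change" is an untested assumption.

```vba
' Sketch only: FireEvent is a hypothetical helper, not part of any library.
' Fires a DOM event on an element so that page scripts listening for it
' are notified. Requires IE9+ document mode.
Sub FireEvent(ByVal doc As Object, ByVal target As Object, ByVal eventType As String)
    Dim evt As Object
    Set evt = doc.createEvent("HTMLEvents")   ' event interface, not the event type
    evt.initEvent eventType, True, False      ' type, bubbles, cancelable
    target.dispatchEvent evt
End Sub

' Possible usage inside Fill, after the filter panel has opened:
'   Set HTMLInput = HTMLDoc.getElementById("search-advanced-company")
'   HTMLInput.Value = Company
'   FireEvent HTMLDoc, HTMLInput, "input"
'   FireEvent HTMLDoc, HTMLInput, "change"
```

If the page reacts to neither event, SendKeys as shown above remains the approach that is known to have worked here.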

Parse Saved HTML File VBA

I have an HTML file saved locally on the desktop. It contains a table of statistics from which I need to pull specific data, paste it into an Excel workbook table, and then email it.
I've got the rest of the process working. I'm just struggling to figure out how to parse the HTML file, and all the other examples I've seen parse a website rather than a locally saved HTML file.
Apologies if this is a bit of a beginner question, but I'm finding it hard to make sense of the other examples I've seen.
Thank you for any assistance.
Thank you to everyone for your examples and for pointing me in the right direction! The example posted below copies the data from an HTML file stored on the user's desktop and pastes it into a new worksheet in Excel.
Option Explicit
Sub ParseHTML()
Dim URL As String
Dim IE As InternetExplorer
Dim htmldoc As MSHTML.IHTMLDocument 'Document object
Dim eleColtr As MSHTML.IHTMLElementCollection 'Element collection for tr tags
Dim eleColtd As MSHTML.IHTMLElementCollection 'Element collection for td tags
Dim htmlTables As MSHTML.IHTMLElementCollection 'Element collection for table tags
Dim eleRow As MSHTML.IHTMLElement 'Row elements
Dim eleCol As MSHTML.IHTMLElement 'Column elements
Dim wksOut As Worksheet
Dim rngOut As Range
Dim intTableIndex As Integer
Dim intRowIndex As Integer
Dim intColIndex As Integer
URL = Environ("userProfile") & "\desktop\FileName.HTML"
'Open InternetExplorer.
Set IE = New InternetExplorer
'Navigate to URL.
With IE
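.navigate URL
.Visible = False
'Wait for the file to finish loading before reading the document.
Do While .Busy Or .readyState <> READYSTATE_COMPLETE
DoEvents
Loop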
'Extract html information to objects.
Set htmldoc = IE.document
Set htmlTables = htmldoc.getElementsByTagName("table")
Set eleColtr = htmlTables(intTableIndex).getElementsByTagName("tr")
'Extract table to a new blank worksheet.
On Error Resume Next
Set wksOut = ThisWorkbook.Worksheets("WorksheetName")
If Err.Number <> 0 Then
Set wksOut = ThisWorkbook.Worksheets.Add(After:=Worksheets(Worksheets.Count))
wksOut.Name = "WorksheetName"
End If
With wksOut
.Cells.Clear
.Cells.NumberFormat = "General"
.Cells.ColumnWidth = 2
End With
On Error GoTo 0
'This section populates Excel
intRowIndex = 0
For Each eleRow In eleColtr
Set eleColtd = htmlTables(intTableIndex).getElementsByTagName("tr")(intRowIndex).getElementsByTagName("td") 'get all the td elements in that specific tr
Set rngOut = wksOut.Range("A1000000").End(xlUp).Offset(1, 0)
intColIndex = 0
For Each eleCol In eleColtd
rngOut.Offset(0, intColIndex) = eleCol.innerText
intColIndex = intColIndex + 1
Next eleCol
intRowIndex = intRowIndex + 1
Next eleRow
wksOut.Cells.EntireColumn.AutoFit
'Cleanup
IE.Quit
Set IE = Nothing
Set htmldoc = Nothing
Set htmlTables = Nothing
Set eleColtr = Nothing
Set eleColtd = Nothing
Set wksOut = Nothing
Set rngOut = Nothing
End With
End Sub
Please note that Excel may throw a run-time Automation error on the line:
Set IE = New InternetExplorer
If this happens, try creating the browser at Medium integrity instead:
Set IE = New InternetExplorerMedium
For more information on Internet Explorer integrity levels, please see
https://blogs.msdn.microsoft.com/ieinternals/2011/08/03/default-integrity-level-and-automation/
As Tim mentioned, I could instead open the file in Excel and copy and paste the values, which runs a lot faster:
Sub CopyHTML()
    Dim Wb As Workbook
    Dim Ws As Worksheet
    Dim HtmlWb As Workbook
    Set Wb = ActiveWorkbook
    Set Ws = Wb.Sheets("Sheet1")
    'Opens the HTML file as a workbook
    Set HtmlWb = Workbooks.Open(Environ("userProfile") & "\desktop\FileName.html")
    'Copies the range straight to cell B5 of Sheet1 in the original workbook
    HtmlWb.Sheets(1).Range("A1:AJ21").Copy Destination:=Ws.Range("B5")
    Application.CutCopyMode = False
    HtmlWb.Close SaveChanges:=False
End Sub
Thanks for the advice, Tim!
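A third option, if starting Internet Explorer is undesirable, is to read the file's text and load it into an in-memory HTML document. The sketch below is only that: LoadLocalHtml is a made-up name, and it assumes the file is plain ANSI text that can be read byte for byte into a VBA string. It uses late binding, so no extra reference is required.

```vba
' Sketch only: LoadLocalHtml is a hypothetical helper name.
' Reads a local HTML file into a string and parses it with an in-memory
' "htmlfile" document, so no browser window is needed.
Function LoadLocalHtml(ByVal filePath As String) As Object
    Dim fileNum As Integer
    Dim html As String
    Dim doc As Object

    ' Read the whole file in one go (assumes an ANSI-encoded file).
    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    html = Space$(LOF(fileNum))
    Get #fileNum, , html
    Close #fileNum

    ' Parse the markup without navigating anywhere.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = html
    Set LoadLocalHtml = doc
End Function

' Possible usage:
'   Dim doc As Object
'   Set doc = LoadLocalHtml(Environ("userProfile") & "\desktop\FileName.HTML")
'   MsgBox doc.getElementsByTagName("table").Length
```

The returned document can then be walked with getElementsByTagName in the same way as in ParseHTML above.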

Unable to click at hyperlink on webpage with anchor tag

After testing different approaches, I'm still stuck trying to find the right way in Visual Basic for Applications to trigger the element below:
I want to click a hyperlink whose text does not stay the same; it shows a different number on every attempt. Below is my VBA code:
Dim MyBrowser As InternetExplorer
Dim MyHTML_Element As IHTMLElement
Dim myURL As String
Dim htmlInput As HTMLInputElement
Dim htmlColl As IHTMLElementCollection
Dim p As String
Dim link As Object
Dim I As Integer
Dim ie As SHDocVw.InternetExplorer
Dim doc As MSHTML.HTMLDocument
myURL = "url............."
Set MyBrowser = New InternetExplorer
MyBrowser.Silent = True
MyBrowser.navigate myURL
MyBrowser.Visible = True
Do
Loop Until MyBrowser.readyState = READYSTATE_COMPLETE
Set HTMLDoc = MyBrowser.Document
If htmldoc.all.item(i).innerText = Range("K20").Value Then ' Range is equal to cell value "4000123486736"
htmldoc.all.item(i).Click <------- not woking both lines
Please also see inspects on IE appended below:
Of course this cannot work
If htmldoc.all.item(i).innerText = Range("K20").Value Then ' Range is equalto cell value "4000123486736"
htmldoc.all.item(i).Click <------- not woking both lines
because there is no loop that defines i.
I suggest to loop through all link tags <a> only:
Dim LinkItem As Variant
For Each LinkItem In HTMLDoc.getElementsByTagName("a")
    If LinkItem.innerText = Range("K20").Value Then
        LinkItem.Click
        Exit For 'stop looping when link was found
    End If
Next LinkItem

VBA auto login unsuccessful Error 91

Basically, I would like to log in to a website automatically. I could find the element with id="username", so I tried IE.Document.getElementById("username").value = "xxxxxxx". However, it does not work, since there is no value attribute on the input in the DOM. When I instead edit the HTML DOM by hand and create a new attribute, value="xxxxxxx", the username shows up in the input on the website. I wonder if I could do the same from VBA and finish the auto login. Thanks a lot! Error 91 is shown in this case.
Sub Nomuralogin()
Dim IE As InternetExplorer
Dim Stockcode As String
Dim Stocktext As String, Textchange As String
Dim HTMLDoc As MSHTML.HTMLDocument
Dim IEField As HTMLInputElement
Dim i As Integer, nAsset As Integer
Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "https://www.nomuranow.com/portal/site/nnextranet/en/#curtain-login"
IE.Visible = True
Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set HTMLDoc = IE.Document
Set IEField = HTMLDoc.getElementById("username")
IEField.Value = "abc@gmail.com"
'HTMLDoc.all.item("username").value = "abc@gmail.com"
Application.Wait Now + TimeValue("00:00:03")
Application.DisplayAlerts = False
IE.Quit
Set IE = Nothing
End Sub
Pic 1 shows there is no Value input:
Pic 2 shows I have created property and added a value input:
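Run-time error 91 is "Object variable or With block variable not set". In this macro the most likely place for it is the IEField.Value line, which would mean getElementById("username") returned Nothing at the moment it was called. One possible cause, and this is only an assumption since the page source isn't shown, is that the login form is built by script after readyState has already reached complete. Below is a hedged sketch that polls for the element for a few seconds and lets the caller check for Nothing before using it; WaitForElement is a made-up helper name.

```vba
' Sketch only: WaitForElement is a hypothetical helper name.
' Polls the document until an element with the given id exists or the
' timeout expires. Returns Nothing if the element never appears.
' (Timer resets at midnight, so a run spanning midnight may wait longer.)
Function WaitForElement(ByVal doc As Object, ByVal elementId As String, _
                        Optional ByVal timeoutSeconds As Double = 10) As Object
    Dim startTime As Double
    Dim found As Object

    startTime = Timer
    Do
        Set found = doc.getElementById(elementId)
        If Not found Is Nothing Then Exit Do
        DoEvents
    Loop While Timer - startTime < timeoutSeconds

    Set WaitForElement = found
End Function

' Possible usage in Nomuralogin:
'   Set IEField = WaitForElement(HTMLDoc, "username")
'   If IEField Is Nothing Then
'       MsgBox "username field not found on the page"
'   Else
'       IEField.Value = "xxxxxxx"
'   End If
```

If the element is still not found, it may live inside a frame, in which case it has to be looked up in that frame's document rather than in HTMLDoc.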

Pull value from website (HTML div class) using Excel VBA

I'm trying to automate going to a website and pulling the ratings from several apps.
I've figured out how to navigate and login to the page.
How do I pull the element (the number "3.3" in this case) from this specific section into Excel?
Being unfamiliar with HTML in VBA, I got this far following tutorials/other questions.
Rating on website and the code behind it
Sub PullRating()
Dim HTMLDoc As HTMLDocument
Dim ie As InternetExplorer
Dim oHTML_Element As IHTMLElement
Dim sURL As String
On Error GoTo Err_Clear
sURL = "https://www.appannie.com/account/login/xxxxxxxxxx"
Set ie = New InternetExplorer
ie.Silent = True
ie.navigate sURL
ie.Visible = True
Do
'Wait until the Browser is loaded
Loop Until ie.readyState = READYSTATE_COMPLETE
Set HTMLDoc = ie.Document
HTMLDoc.all.Email.Value = "xxxxxxxxx@xxx.com"
HTMLDoc.all.Password.Value = "xxxxx"
For Each oHTML_Element In HTMLDoc.getElementById("login-form")
If oHTML_Element.Type = "submit" Then oHTML_Element.Click: Exit For
Next
Dim rating As Variant
Set rating = HTMLDoc.getElementsByClassName("rating-number ng-binding")
Range("A1").Value = rating
'ie.Refresh 'Refresh if required
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub
The code below extracts the text of the first div element with class name "rating-number ng-binding" in the HTML document. Note that getElementsByClassName is only supported from IE 9 onwards, so my example loops over the div elements instead, which also works in older versions.
Dim htmlEle1 As IHTMLElement
For Each htmlEle1 In HTMLDoc.getElementsByTagName("div")
    If htmlEle1.className = "rating-number ng-binding" Then
        Range("A1").Value = htmlEle1.innerText
        Exit For
    End If
Next htmlEle1
While Ryszard's code should do the trick, if you want to keep the code you have already written, here are the alterations I believe you need to make.
For Each oHTML_Element In HTMLDoc.getElementById("login-form")
    If oHTML_Element.Type = "submit" Then oHTML_Element.Click: Exit For
Next
'Need to wait for the page to load before collecting the value
Do
    DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set HTMLDoc = ie.Document
Dim rating As IHTMLElement
'getElementsByClassName returns a collection, so take its first item
Set rating = HTMLDoc.getElementsByClassName("rating-number ng-binding")(0)
'Need to get the innerHTML of the element
Range("A1").Value = rating.innerHTML