Scrape SVG HTML

I am wondering if anyone could tell me if this is possible as I've hit a bit of a brick wall. Not looking for someone to do this for me, just a nudge in the right direction would be great.
I'm scraping the link using VBA.
More specifically I'm trying to scrape the circle tag co-ordinates of the SVG element. I am able to drill down to the div class sqw_icons no problem.
I can return the properties of the SVG, such as height and width. But I can't seem to work out a way to get to the g class elements beneath that.
My impression is that it's possible because they are HTML Collections.
Any ideas would be very useful.
Here is the code:
Sub ImportSqw()
Dim ie As InternetExplorer
Dim html As HTMLDocument
Set ie = New InternetExplorer
ie.Visible = False
ie.Navigate "http://epl.squawka.com/english-barclays-premier-league/17-05-2016/man-utd-vs-b-mouth/matches"
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set html = ie.Document
Set ie = Nothing
Application.StatusBar = ""
Cells.Clear
Dim levelOne As IHTMLElement
Dim levelTwo As IHTMLElementCollection
Dim levelThree As IHTMLElement
Dim levelFour As IHTMLElementCollection
Dim levelFive As IHTMLElement
Dim levelSix As IHTMLElementCollection
Set levelOne = html.getElementById("mc-pitch-view")
Set levelTwo = levelOne.Children
For Each levelThree In levelTwo
If levelThree.className = "sqw_pitch_container" Then
Set levelFour = levelThree.Children
End If
Next
For Each levelFive In levelFour
If levelFive.className = "sqw_icons" Then
Set levelSix = levelFive.Children
End If
Next
Set html = Nothing
End Sub
The item in levelSix doesn't have Children properties.
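One direction that might be worth trying, offered as a sketch and not a confirmed fix: SVG nodes are not ordinary HTML elements, so instead of walking Children level by level, the circles can be queried from the document by tag name and their attributes read late-bound. This assumes the page renders in an IE9+ standards document mode, that the circle elements already exist when the page reports it has loaded, and that each one carries cx and cy attributes. None of that has been checked against the page in question, and ListCircleCoordinates is a hypothetical name.

```vba
' Hypothetical helper: prints the cx/cy attributes of every <circle> in the document.
' html is the loaded document (ie.Document), passed late-bound as Object so that
' SVG nodes are not forced into an HTML element interface.
Sub ListCircleCoordinates(ByVal html As Object)
    Dim circles As Object
    Dim i As Long

    Set circles = html.getElementsByTagName("circle")
    Debug.Print "Circles found: " & circles.Length

    For i = 0 To circles.Length - 1
        With circles.Item(i)
            Debug.Print .getAttribute("cx"), .getAttribute("cy")
        End With
    Next i
End Sub
```

It would be called as ListCircleCoordinates html, before the line Set html = Nothing.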

Related

Website login / Debug.Print output not shown in the Immediate window

I want to log in to a website.
After a lot of trial and error I finally wrote code that works fairly well,
but I ran into some problems.
The first problem was an error that said
no connection to server
I solved this problem with:
Set ie = CreateObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
The next problem is that when I run the code step by step (F8) it runs fine, but when I run it with F5 I get an error at the line
Set UserNameInput = LoginForm.getElementsByClassName("prePopulatedCredentials")(0)
This I do not understand.
The next problem concerns Debug.Print.
Although I opened the Immediate window (Ctrl+G), the Debug.Print output is not shown.
Why can't I see the Debug.Print result?
I want to find out which button I have to trigger from the VBA code.
In my case I have these options:
Search, new Search, go back or Sign In
For Sign In I found a solution on another website:
Sub TestLogin()
Dim...
Dim LoginForm As MSHTML.HTMLFormElement
.. code
LoginForm.submit
For another website
I want to find the buttons
Search, new Search, go back
The code is almost the same:
Sub Get()
Dim...
Dim HTML Button AS MSHTML.IHTMLELEMENT
... Code
Set Buttons= HTMLDoc.GetElementsByTagName("button")
For Each HTMLButton In HTMLButtons
Debug.Print HTMLButton.className, HTMLButton.tagName, HTMLButton.Id, HTMLButton.innerText
Next Button
End Sub
In this case the code runs until the line
Set Buttons= HTMLDoc.GetElementsByTagName("button")
and then it jumps straight to End Sub.
Here is my code for the first website, with the problem that depends on running it with F8 or F5:
Sub TestLogin()
Dim ie As SHDocVw.InternetExplorerMedium
Dim doc As MSHTML.HTMLDocument
Dim LoginForm As MSHTML.HTMLFormElement
Dim UserNameInput As MSHTML.HTMLInputElement
Dim PasswordInput As MSHTML.HTMLInputElement
Set ie = New SHDocVw.InternetExplorer
' Solution found on the web for the Net.Framework problem
Set ie = CreateObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
ie.Visible = True
ie.Navigate Sheet1.Range("B1").Text
Do While ie.ReadyState <> READYSSTATE_COMPLETE And ie.Busy
Loop
Set doc = ie.Document
Set LoginForm = doc.getElementById(Sheet1.Range("B4").Text)
Set UserNameInput = LoginForm.getElementsByClassName("prePopulatedCredentials")(0)
Set PasswordInput = LoginForm.getElementsByClassName("prePopulatedCredentials ")(1)
UserNameInput.Value = Sheet1.Range("B2").Text
PasswordInput.Value = Sheet1.Range("B3").Text
Stop
'LoginForm.submit
End Sub
And here is the code for the second website, with the Debug.Print problem:
Sub Get()
Dim ie As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim UserNameInput As MSHTML.HTMLInputElement
Dim VornameInput As MSHTML.HTMLInputElement
Dim HTMLButton As MSHTML.IHTMLElement
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim SecurityWindow As Object
Set ie = New SHDocVw.InternetExplorer
Set ie = CreateObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
ie.Visible = True
ie.Navigate Tabelle1.Range("B1").Text
Do While ie.ReadyState <> READYSSTATE_COMPLETE And ie.Busy
Loop
Set HTMLDoc = ie.Document
Set UserNameInput = HTMLDoc.getElementById("sur")
Set VornameInput = HTMLDoc.getElementById("given")
UserNameInput.Value = Sheets1.Range("B2").Text
VornameInput.Value = Sheets1.Range("B3").Text
Stop
Set HTMLButtons = HTMLDoc.getElementsByTagName("btn")
For Each HTMLButton In HTMLButtons
Debug.Print HTMLButtons.className, HTMLButtons.tagName, HTMLButtons.ID, HTMLButtons.innerText
Next HTMLButton
End Sub
What have I done wrong or missed?
Thanks for any help.
And, most important:
Take care, all of you.
Best wishes, stay safe and healthy!
Pete
For your first issue:
It looks like a timing-related issue.
I suggest making your code wait for a few seconds before executing the problematic line of code.
This may give the page time to load the element and resolve the issue.
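The waiting described here could be sketched as a small polling helper instead of a fixed pause. This is only a sketch under assumptions: WaitForElementById is a hypothetical name, the 10-second default timeout is arbitrary, and it assumes the login form eventually appears under the id stored in cell B4.

```vba
' Hypothetical helper: polls until an element with the given id exists in the
' loaded document, or until the timeout elapses. Returns Nothing on timeout.
' Note: Timer resets at midnight, so a wait spanning midnight would end early
' only by hitting the timeout check incorrectly; acceptable for a sketch.
Function WaitForElementById(ByVal ie As Object, ByVal elementId As String, _
                            Optional ByVal timeoutSeconds As Double = 10) As Object
    Dim startTime As Double
    Dim found As Object

    startTime = Timer
    Do
        DoEvents                      ' let the browser keep loading
        Set found = Nothing
        On Error Resume Next          ' the document may not be available yet
        Set found = ie.Document.getElementById(elementId)
        On Error GoTo 0
        If Not found Is Nothing Then Exit Do
    Loop While Timer - startTime < timeoutSeconds

    Set WaitForElementById = found
End Function
```

In the login macro it might be used as Set LoginForm = WaitForElementById(ie, Sheet1.Range("B4").Text), followed by a check for Nothing. It may also be worth looking at the wait loop in the posted code: the constant is spelled READYSSTATE_COMPLETE, which without Option Explicit is treated as an empty variable, and combining the two conditions with And ends the loop as soon as either one is false, so the loop can finish before the page has loaded.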
For your second issue:
As I mentioned in my previous comment, you can create a new thread with detailed information. We will check it and try to provide further suggestions.
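For reference, here is a sketch of the button loop from the question with two changes, based only on the code as posted. getElementsByTagName expects an HTML tag name, so "btn" would match nothing unless the page defines such elements, and an empty collection means the loop body (and Debug.Print) never runs. Inside the loop, the properties are read from the loop variable instead of from the collection. The sketch assumes the page uses standard button elements; ListButtons is a hypothetical name.

```vba
' Hypothetical helper: lists every <button> element of the loaded document
' in the Immediate window (Ctrl+G).
Sub ListButtons(ByVal HTMLDoc As Object)
    Dim HTMLButtons As Object
    Dim HTMLButton As Object

    Set HTMLButtons = HTMLDoc.getElementsByTagName("button")

    ' Printing the count first shows whether anything matched at all
    Debug.Print "Buttons found: " & HTMLButtons.Length

    For Each HTMLButton In HTMLButtons
        Debug.Print HTMLButton.className, HTMLButton.tagName, HTMLButton.ID, HTMLButton.innerText
    Next HTMLButton
End Sub
```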

Referencing an iframe in VBA

As my previous question no longer fits, I'm creating a new one.
I have the following code to open a window. In that window I want to open the sorting "function" and then select which attribute to sort by.
Opening the page works.
What doesn't work is addressing the options of the sorting drop-down menu so that the right one can be selected. Debug.Print always shows a length of 0 on the last code line.
The code:
Sub testalt()
Dim HTMLDoc As MSHTML.HTMLDocument
Dim ie As InternetExplorerMedium
Dim enumm As String
Dim ifrm As MSHTML.HTMLDocument
Dim selection As MSHTML.IHTMLElementCollection
enumm = "E2711846"
Set ie = New InternetExplorerMedium
ie.Visible = True
ie.navigate "https://plm.corp.int:10090/enovia/common/emxFullSearch.jsp?pageConfig=tvc:pageconfig//tvc/search/AutonomySearch.xml&tvcTable=true&showInitialResults=true&cancelLabel=Close&minRequiredChars=3&genericDelete=true&selection=multiple&txtTextSearch=" & [enumm] & "&targetLocation=popup"
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = ie.document
Set ifrm = HTMLDoc.frames(1).document
Set selection = ifrm.getElementsByTagName("select")
Debug.Print selection.Length
End Sub
This is, I think, the relevant HTML part of the opened page. Sorry, I don't know how to copy it.
This is the iframe, which is at the top:
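A diagnostic sketch that might help narrow this down, under stated assumptions: the frames collection is zero-based, so frames(1) is the second frame and not the first; the frame's own content may still be loading after the outer page reports completion; and a frame served from a different domain cannot be read at all. ListSelectsPerFrame is a hypothetical name, and nothing here has been tested against the page in question.

```vba
' Hypothetical helper: for each frame of the loaded document, prints how many
' <select> elements its document contains, or reports that it is not accessible.
Sub ListSelectsPerFrame(ByVal HTMLDoc As Object)
    Dim i As Long
    Dim frameDoc As Object

    For i = 0 To HTMLDoc.frames.Length - 1
        Set frameDoc = Nothing
        On Error Resume Next          ' cross-domain frames raise an access error
        Set frameDoc = HTMLDoc.frames(i).document
        On Error GoTo 0

        If frameDoc Is Nothing Then
            Debug.Print "Frame " & i & ": document not accessible"
        Else
            Debug.Print "Frame " & i & ": " & _
                        frameDoc.getElementsByTagName("select").Length & " select element(s)"
        End If
    Next i
End Sub
```

Calling ListSelectsPerFrame HTMLDoc a few seconds after the page has loaded would show which frame index, if any, actually holds the drop-down.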

Webpage search filter inputs not detected when apply button is pressed via Excel VBA

I'm trying to write a VBA program which will find the person holding a specific position at a specific company via LinkedIn. I've already figured out how to open the LinkedIn search window, open all filters, and input the desired company name and position, but once I hit the "Apply" button to apply those filters, they simply aren't recognized as filters. It clicks the button as if I never input anything into those filter boxes at all.
I've figured out that the problem is not my method of clicking the apply button, but instead the lack of input recognition. My input strings are only recognized as filters if their respective filter area/box is clicked on before or after the string is added.
With all that being said, I know that the solution I'm looking for is a way to input the string and then click into that same filter box before I click the apply filters button.
Below is my code to input my specific filters into LinkedIn and click the apply button.
Sub Fill()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim AllFiltersButton As MSHTML.IHTMLElement
Dim ApplyButton As MSHTML.IHTMLElement2
Dim ApplyButtons As MSHTML.IHTMLElementCollection2
Dim Company As String, Position As String
Company = Range("A2").Value
Position = ("""" + Range("B1").Value + """")
IE.Visible = True
IE.navigate "https://www.linkedin.com/search/results/people/?facetGeoRegion=%5B%22us%3A70%22%5D&origin=FACETED_SEARCH"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
Set HTMLButtons = HTMLDoc.getElementsByTagName("button")
For Each AllFiltersButton In HTMLButtons
If AllFiltersButton.className = "search-filters-bar__all-filters button-tertiary-medium-muted flex-shrink-zero mr3" Then AllFiltersButton.Click
Next AllFiltersButton
Set HTMLInput = HTMLDoc.getElementById("search-advanced-company")
HTMLInput.Value = Company
Set HTMLInput = HTMLDoc.getElementById("search-advanced-title")
HTMLInput.Value = Position
Set ApplyButtons = HTMLDoc.getElementsByTagName("button")
For Each ApplyButton In ApplyButtons
If ApplyButton.className = "search-advanced-facets__button--apply button-primary-large" Then ApplyButton.Click
Next ApplyButton
I have all the necessary references selected (Microsoft HTML Object Library, Microsoft Internet Controls, Microsoft Office 15.0 Object Library etc.)
and this portion of the code seems to work flawlessly.
After spending hours trying to figure out a way to click in these search/filter boxes, I stumbled across a method which should work, but I can't seem to adapt the code for my specific circumstance.
Set evt = ie.Document.createEvent("keyboardevent")
evt.initEvent "change", True, False
PW.all(0).dispatchEvent evt
My attempt to adapt this method to click into the company search box, which doesn't work, is as follows:
Dim vSelect As HTMLSelectElement
Dim eventClick As Object
Set vSelect = HTMLDoc.getElementById("search-advanced-company")
Set eventClick = HTMLDoc.createEvent("click")
eventClick.initEvent "change", True, False
vSelect.dispatchEvent eventClick
How should I adapt this code to click into a search/filter box?
If anyone could help me out in any way, it would be much appreciated.
Thanks in advance.
EDIT: I'm getting mixed information as to whether data scraping on LinkedIn is permitted or not. Just to clarify, I will not be using this code for data scraping to avoid any infringement issues.
After resorting to the often avoided and disrespected SendKeys function, I was finally able to make it work.
Below is the corrected code:
Sub Fill()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLInputs As MSHTML.IHTMLElementCollection2
Dim AllFiltersButton As MSHTML.IHTMLElement
Dim ApplyButton As MSHTML.IHTMLElement
Dim ApplyButtons As MSHTML.IHTMLElementCollection3
Dim Company As String, Position As String
Company = Range("A2").Value
Position = ("""" + Range("B1").Value + """")
IE.Visible = True
IE.navigate "https://www.linkedin.com/search/results/people/?facetGeoRegion=%5B%22us%3A70%22%5D&origin=FACETED_SEARCH"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
Set HTMLButtons = HTMLDoc.getElementsByTagName("button")
For Each AllFiltersButton In HTMLButtons
If AllFiltersButton.className = "search-filters-bar__all-filters button-tertiary-medium-muted flex-shrink-zero mr3" Then AllFiltersButton.Click
Next AllFiltersButton
Set HTMLInputs = HTMLDoc.getElementsByTagName("input")
HTMLInputs(2).Value = Position
HTMLInputs(2).Focus
Application.SendKeys "{LEFT}", True
HTMLInputs(3).Value = Company
HTMLInputs(3).Focus
Application.SendKeys "{LEFT}", True
Set ApplyButtons = HTMLDoc.getElementsByTagName("button")
For Each ApplyButton In ApplyButtons
If ApplyButton.className = "search-advanced-facets__button--apply button-primary-large" Then ApplyButton.Click
Next ApplyButton
End Sub
A big "Thank you!" to @ashleedawg for the suggestion leading to the solution!
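For readers who would rather avoid SendKeys, here is a sketch of the event-dispatch idea from the question. createEvent takes the name of an event interface such as "HTMLEvents" and not an event name, so "click" is not a valid argument there; the event name goes into initEvent. Whether the page's scripts react to these synthetic events is an assumption that has not been tested, and SetValueAndNotify is a hypothetical name.

```vba
' Hypothetical helper: sets an input's value, then fires "input" and "change"
' events on it so that page scripts listening for them are notified.
' Requires a document mode that supports createEvent/dispatchEvent (IE9+).
Sub SetValueAndNotify(ByVal doc As Object, ByVal target As Object, ByVal newValue As String)
    Dim evt As Object

    target.Focus
    target.Value = newValue

    Set evt = doc.createEvent("HTMLEvents")
    evt.initEvent "input", True, False      ' bubbles, not cancelable
    target.dispatchEvent evt

    Set evt = doc.createEvent("HTMLEvents")
    evt.initEvent "change", True, False
    target.dispatchEvent evt
End Sub
```

With the ids from the question, a call might look like SetValueAndNotify HTMLDoc, HTMLDoc.getElementById("search-advanced-company"), Company.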

Unable to click at hyperlink on webpage with anchor tag

After testing different approaches, I am stuck in Visual Basic for Applications trying to find the right way to trigger the element below:
I want to click a hyperlink whose text does not stay the same; it shows a different number on every attempt. Below is my VBA code:
Dim MyBrowser As InternetExplorer
Dim MyHTML_Element As IHTMLElement
Dim myURL As String
Dim htmlInput As HTMLInputElement
Dim htmlColl As IHTMLElementCollection
Dim p As String
Dim link As Object
Dim I As Integer
Dim ie As SHDocVw.InternetExplorer
Dim doc As MSHTML.HTMLDocument
myURL = "url............."
Set MyBrowser = New InternetExplorer
MyBrowser.Silent = True
MyBrowser.navigate myURL
MyBrowser.Visible = True
Do
Loop Until MyBrowser.readyState = READYSTATE_COMPLETE
Set HTMLDoc = MyBrowser.Document
If htmldoc.all.item(i).innerText = Range("K20").Value Then ' Range is equal to cell value "4000123486736"
htmldoc.all.item(i).Click <------- both lines not working
Please also see the IE inspector output appended below:
Of course this cannot work
If htmldoc.all.item(i).innerText = Range("K20").Value Then ' Range is equal to cell value "4000123486736"
htmldoc.all.item(i).Click <------- both lines not working
because there is no loop that defines i.
I suggest looping through only the link tags (<a>):
Dim LinkItem As Variant
For Each LinkItem In HTMLDoc.getElementsByTagName("a")
    If LinkItem.innerText = Range("K20").Value Then
        LinkItem.Click
        Exit For 'stop looping when link was found
    End If
Next LinkItem

Getting information from HTML page via VBA

From VBA, I am trying to access the "username" field on a web page so that I can type in the appropriate username.
The problem is that the page's HTML contains more than one element with the same name, "LOGON_USERID", and I can't figure out how to access the right one.
As you can see in the image "part of the HTML code", the line I'm trying to access is the highlighted one, but there are also 2 other elements with the same name above it.
part of the HTML code
I tried lots of different ways (using different methods or variable types etc), but since I'm not familiar with HTML I can't manage to get what I want.
Sub Pum()
Dim ie As New InternetExplorer
'Dim IEDoc As IHTMLElementCollection
Dim IEDoc As HTMLDocument
Dim name As Object
Dim nameList As HTMLInputElement
Dim WRONGS As DispHTMLElementCollection
Dim Elems As HTMLElementCollection
Dim i As Integer
ie.navigate "thewebsiteinquestion"
ie.Visible = False
WaitIE ie
Set IEDoc = ie.document
'MsgBox IEDoc.DocumentElement.
'Elems = IEDoc.getElementsByTagName("INPUT")
MsgBox TypeName(IEDoc.getElementById("LOGON_USERID").all)
Set Elems = IEDoc.getElementById("LOGON_USERID")
'For i = 0 To 5
MsgBox Elems.Length
'Next i
For Each name In Elems.Children
MsgBox name.nodeName
MsgBox name.Attributes
MsgBox name.all
Next
'If ((NameStr Isnot Nothing And (NameStr.Length <> 0)) Then
'If NameStr = "LOGON_USERID" Then
'If TypeName(IEDoc.all("LOGON_USERID")) = "HTMLInputElement" Then
'MsgBox TypeName(IEDoc.all("LOGON_USERID"))
'Set names = IEDoc.all.Item("text")
'TypeName (InputUsernameTextzone)
'Dim Question As IHTMLElement
'Question = InputUsernameTextzone.parentElement
'MsgBox TypeName(InputUsernameTextzone.parentElement.getAttribute("name"))
'InputUsernameTextzone.parentElement
'CELLULE.value = "qtc2464"
WaitIE ie
Set ie = Nothing
Set IEDoc = Nothing
End Sub
I tried two other similar codes using different methods but I still have no results. Hopefully you can help me.
If you need more information, let me know.
The other two input elements are of a different type (they are hidden), so you could use querySelector with the attribute type=text to find your desired element.
Dim userid As HTMLInputElement
Set userid = IEDoc.querySelector("input[name='LOGON_USERID'][type='text']")
If Not userid Is Nothing Then
    ' Continue with user id element
Else
    MsgBox "LOGON_USERID not found on the page"
End If
I am a newbie at this, but in case it helps anyone, here's the simplified version of the macro I made:
Sub Access_Puma()
    Dim ie As New InternetExplorer
    Dim IEDoc As HTMLDocument
    Dim userid As HTMLInputElement
    Dim userpwd As HTMLInputElement
    ie.navigate "thewebsitetoaccess"
    ie.Visible = True
    WaitIE ie
    Set IEDoc = ie.document
    Set userid = IEDoc.querySelector("input[name='LOGON_USERID'][type='text']")
    If Not userid Is Nothing Then
        userid.value = "myusername"
    Else
        MsgBox "LOGON_USERID not found on the page"
    End If
    Set userpwd = IEDoc.querySelector("input[name='LOGON_PASSWD'][type='password']")
    If Not userpwd Is Nothing Then
        userpwd.value = "mypassword"
    Else
        MsgBox "LOGON_PASSWD not found on the page"
    End If
End Sub
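The macro calls WaitIE, whose definition is not shown in the post. A plausible minimal version, given purely as an assumption about what such a helper does, could look like this:

```vba
' Assumed shape of WaitIE; the original definition is not shown in the post.
' Blocks until Internet Explorer reports that the page has finished loading.
Sub WaitIE(ByVal ie As Object)
    ' 4 is the numeric value of READYSTATE_COMPLETE
    Const READYSTATE_COMPLETE_VALUE As Long = 4

    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE_VALUE
        DoEvents                      ' keep Excel responsive while waiting
    Loop
End Sub
```

Combining the two conditions with Or keeps the loop running while either one indicates that loading is still in progress.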