How to scrape option values from a website using VBA

I am trying to fetch the names and values of locations from a website page.
For example: I want to take the value 10 and the label "Johannesburg OR Tambo International Airport", insert them into cells B3 and B4 respectively, and then loop over all optgroups. I get the error "Object doesn't support this property or method." I'm sure my code has multiple issues. Any assistance will be greatly appreciated.
My code is as follows:
Sub test1()
''''''''''''''''''''''''''''This part declares the variables and their types.
Dim appIE As Object
Dim ws As Worksheet
Dim wb As Workbook
Dim o
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
i = 2
Set wb = Application.Workbooks("Test2")
Set ws = wb.Worksheets("Europcar Branches")
Set appIE = CreateObject("internetexplorer.application")
'Navigate to Europcar
'Open internet explorer
With appIE
.Navigate "https://www.europcar.co.za"
.Visible = True
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Application.Wait (Now + TimeValue("0:00:03"))
Do While appIE.busy
DoEvents
Application.Wait (Now + TimeValue("0:00:05"))
Loop
Application.Wait (Now + TimeValue("0:00:02"))
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Set entry = appIE.document.getElementById("PickupBranch_BranchID_id")
For Each o In entry.getElementsByName("optgroup")
Cells(i, 3).Value = o.Value
For Each p In entry.getElementsByName("optgroup").Options
Cells(i, 4).Value = p.innerText
i = i + 1
Exit For
Next
Exit For
Next
'
'.Navigate "https://www.europcar.co.za"
'.Visible = True
Application.Wait (Now + TimeValue("0:00:01"))
Do While appIE.busy
DoEvents
Application.Wait (Now + TimeValue("0:00:03"))
Loop
End With
appIE.quit
Set appIE = Nothing
End Sub
A section of Html is as follows:
<select name="PickupBranch_BranchID" class="pick-up-select responsive-select" id="PickupBranch_BranchID_id" style="display: none;" data-placeholder="Pickup Location">
<option value=""></option>
<optgroup value="0" label="Airports">
<option value="10">Johannesburg OR Tambo International Airport</option>
<option value="20">Cape Town International Airport</option>
<option value="76">King Shaka International Airport</option>
<option value="48">Lanseria Airport</option>
<option value="89">Bloemfontein Airport</option>
<option value="70">East London Airport</option>
<option value="61">George Airport</option>
<option value="91">Kimberley Airport </option>
<option value="14">Polokwane Airport</option>
<option value="95">Kruger Mpumalanga Int Airport</option>
<option value="138">Malelane Airport</option>
<option value="79">Margate Airport</option>
<option value="44">CSIR Pretoria</option>
<option value="13">Pietermaritzburg Airport</option>
<option value="7">Port Elizabeth Airport</option>
<option value="84">Richards Bay Airport</option>
<option value="75">Umtata Airport</option>
<option value="103">Upington Airport</option>
<option value="52">Wonderboom Airport</option>
<option value="46">Germiston Rand Airport</option>
</optgroup>
<optgroup value="3" label="Gauteng">
<option value="133">Boksburg Easyway</option>
<option value="42">Braamfontein</option>
<option value="134">Bryanston Easyway </option>
<option value="43">Centurion</option>
<option value="135">Constantia Kloof Easyway</option>
<option value="45">Fourways</option>
<option value="154">Johannesburg Parkstation</option>
<option value="125">Kramerville</option>
<option value="121">Meadowdale</option>
<option value="50">Megawatt Park</option>
<option value="155">Menlyn Easyway</option>
<option value="47">Mogale City (Krugersdorp Agency)</option>
<option value="11">Pretoria Hatfield</option>
<option value="53">Randburg</option>
<option value="161">Rosebank Gautrain Station</option>
<option value="158">Sandton Gautrain Station</option>
<option value="55">Sandton Town</option>
<option value="59">Vanderbijlpark</option>
</optgroup>
</select>

The following shows how to do this for one dropdown (it gathers the options from all the optgroups within it). It avoids using a browser and uses a faster XMLHTTP request instead. I use getElementById to get the parent select element, and then getElementsByTagName to retrieve the child option elements. I loop from 1 to skip the empty first option.
References (VBE > Tools > References):
Microsoft HTML Object Library
VBA:
Option Explicit
Public Sub GetOptions()
Dim html As Object, ws As Worksheet, headers()
Dim i As Long, r As Long, c As Long, numRows As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set html = New HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.europcar.co.za/", False
.send
html.body.innerHTML = .responseText
Dim pickupBranches As Object, pickupBranchResults()
Set pickupBranches = html.getElementById("PickupBranch_BranchID_id").getElementsByTagName("option")
headers = Array("Pickup Location", "option value")
numRows = pickupBranches.Length - 1
ReDim pickupBranchResults(1 To numRows, 1 To 2)
For i = 1 To numRows
pickupBranchResults(i, 1) = pickupBranches.item(i).innerText
pickupBranchResults(i, 2) = pickupBranches.item(i).Value
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(pickupBranchResults, 1), UBound(pickupBranchResults, 2)) = pickupBranchResults
End With
End With
End Sub
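The question also asked for the optgroup that each option belongs to. A minimal sketch of that variation follows, assuming the same reference to Microsoft HTML Object Library. The sub name ListOptionsByGroup and the hard-coded, cut-down markup are illustrative only, so the loop can be tried without a network request; with the live page you would load .responseText as above instead.

```vba
Option Explicit

' Sketch only: requires a reference to "Microsoft HTML Object Library".
Public Sub ListOptionsByGroup()
    Dim doc As New HTMLDocument
    Dim grp As Object, opt As Object

    ' Cut-down copy of the markup from the question, used as test input.
    doc.body.innerHTML = _
        "<select id=""PickupBranch_BranchID_id"">" & _
        "<option value=""""></option>" & _
        "<optgroup label=""Airports"">" & _
        "<option value=""10"">Johannesburg OR Tambo International Airport</option>" & _
        "<option value=""20"">Cape Town International Airport</option>" & _
        "</optgroup>" & _
        "<optgroup label=""Gauteng"">" & _
        "<option value=""133"">Boksburg Easyway</option>" & _
        "</optgroup></select>"

    ' Outer loop: each optgroup inside the select.
    ' Inner loop: only the options nested in that optgroup.
    For Each grp In doc.getElementById("PickupBranch_BranchID_id").getElementsByTagName("optgroup")
        For Each opt In grp.getElementsByTagName("option")
            Debug.Print grp.getAttribute("label"), opt.Value, opt.innerText
        Next opt
    Next grp
End Sub
```

Because the inner loop starts from the optgroup element rather than the select, the empty placeholder option is never visited, so no index offset is needed.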

Related

Select tag option in VBA

Set objCollection = ie.document.getElementsByTagName("select")
For Each opt In ie.document.getElementsByTagName("option")
If opt.innerText = "1000" Then
Debug.Print "found it"
opt.Selected = True
Exit For
End If
Next opt
I want it to appear like this:
That's how it appears:
That is the HTML code
<select size="1" name="ocorrencias_listar_length" aria-controls="ocorrencias_listar">
<option value="10">10</option>
<option value="25" selected="selected">25</option>
<option value="50">50</option>
<option value="100">100</option>
<option value="250">250</option>
<option value="500">500</option>
<option value="1000">1000</option>
</select>

Using VBA to choose from Dropdown in HTML

I am trying to use VBA to select from a dropdown list. The HTML code is below
<select name="template" class="chzn-select stdText allow_single_deselect"
id="template" style="width: 315px; display: none; visibility: visible;"
onchange="newDoc.doAfterTemplateNew('templateRow',this);" size="3"
data-automation-id="authorTemplateDropdown" data-placeholder="Choose template...">
<option></option>
<option value="1">1. Company_Comment (CC)</option>
<option value="3">1. Company_Flash (CF)</option>
<option value="79">1. Company_Report (CR)</option>
<option value="21">2. Sector_Comment (SC)</option>
Right now I am trying to use:
.document.querySelector("Select[name=template] option[value=1]").Selected = True
However, I am getting error 8070000c. Thank you for the help!
This worked for me. Note the quotes around the option value in the attribute selector.
Sub tester()
'Added reference to Microsoft HTML Object Library
Dim doc As New HTMLDocument, opt As Object, slct As Object
doc.body.innerHTML = Range("A1").Value 'load HTML from cell for testing
Set slct = doc.querySelector("select[name=template]") 'select object
Set opt = doc.querySelector("select[name=template] option[value='1']") 'option
Debug.Print "before", slct.selectedIndex ' >> -1
opt.Selected = True
Debug.Print "after", slct.selectedIndex ' >> 1
End Sub
HTML in A1:
<select name="template" id="template" size="3" data-placeholder="Choose template...">
<option></option>
<option value="1">1. Company_Comment (CC)</option>
<option value="3">1. Company_Flash (CF)</option>
<option value="79">1. Company_Report (CR)</option>
<option value="21">2. Sector_Comment (SC)</option>
</select>

VBA to run event after selecting option out of select list on a web page

I'm currently working on an Access database project and I've decided it's time to take a bigger step by trying website automation. Since I'm new to the "Internet Explorer Object" area, I would appreciate your help with something. I have discovered how to locate the HTML elements I want to work with and how to read or set their values. The problem is that the website I'm working on has a hidden button, and to make it appear you have to click the first option in a dropdown list. I've tried everything I found online, but I still can't make the button appear, so I cannot "click" it via VBA. Here's the VBA code I use:
Dim objIE As InternetExplorer
Dim oHTML As HTMLOptionElement
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.Navigate "https://registration.dikaiomata.gr/user_registration/"
Do While objIE.Busy = True Or objIE.ReadyState <> 4: DoEvents: Loop
For Each oHTML In objIE.Document.getElementById("appSelect").getElementsByTagName("option")
If oHTML.value = "gaee2020" Then
oHTML.Click
End If
Next
Here is the dropdownlist HTML code:
<div class="form-group">
<script type="text/javascript">
var BASE_APP_URL = 'https://registration.dikaiomata.gr/user_registration/apps/index.php?m=';
</script>
<select class="form-control form-control-sm" id="appSelect">
<option value="0">-- Επιλέξτε --</option>
<option value="gaee2020">Ενιαία Αίτηση Ενίσχυσης 2020 (ΕΑΕ2020)</option>
<option value="gaee2019">Ενιαία Αίτηση Ενίσχυσης 2019 (ΕΑΕ2019)</option>
<option value="gaee2018">Ενιαία Αίτηση Ενίσχυσης 2018 (ΕΑΕ2018)</option>
<option value="gaee2017">Ενιαία Αίτηση Ενίσχυσης 2017 (ΕΑΕ2017)</option>
<option value="M101AEO">Μέτρο 10.1 "Αμπελώνες Ελαιώνες Ορνιθοπανίδα"</option>
<option value="M1018">Μέτρο 10 (Δράσεις 10.1.08 – 10.1.07)</option>
<option value="M1014">Δράση 10.1.04 "Μείωση της ρύπανσης νερού από γεωργική δραστηριότητα"</option>
<option value="M1019">Δράση 10.1.09 "Διατήρηση Απειλούμενων Αυτόχθονων Φυλών Αγροτικών Ζώων"</option>
<option value="bpe">Μεταβιβάσεις Δικαιωμάτων Βασικής Ενίσχυσης (ΜΔΒΕ) 2015-2020</option>
<option value="FarmersTab">Καρτέλα Αγρότη</option>
<option value="Organics16">Μέτρο 11 του ΠΑΑ 2014-2020 (Βιολογικά)</option>
<option value="gaee2016">Ενιαία Αίτηση Ενίσχυσης 2016 (ΕΑΕ2016)</option>
<option value="gaee2015">Ενιαία Αίτηση Ενίσχυσης 2015 (ΕΑΕ2015)</option>
<option value="gaee2014">Ενιαία Αίτηση Ενίσχυσης 2014 (ΕΑΕ2014)</option>
<option value="RDIIS">Ο.Π.Σ.Α.Α. 2014-2020</option>
</select>
</div>
And here is the HTML code for the button I want to reach:
<a href="#modal4pyli" class="btn btn-success" data-toggle="modal" data-target="#modal4pyli">
Κωδικός υποβολής <i class="fa fa-chevron-right"></i>
</a>
All I want is to click that button so I can proceed with the rest of my code.
Thank you in advance!
As it turns out, all I needed to do was trigger the appropriate event to make the button appear, using the code below:
Set event_onChange = objIE.document.createEvent("HTMLEvents")
event_onChange.initEvent "change", True, False
objIE.document.querySelector("#appSelect").value = "gaee2020"
objIE.document.querySelector("#appSelect").dispatchEvent event_onChange

select given value from dropdown list and click to add button

I can log in to my website and then navigate to a web page where I need to select a value from a dropdown box and then click an Add button, all using VBA.
I have tried, but I am not able to do that.
My dropdown list HTML code is:
<select id="input_ifxlist_opts"><option value="43.66.18.70>11">SAL-EC-S1>sp_wan</option>
<option value="43.72.38.250>11">SDT-EC-S1>sp_wan</option>
<option value="43.95.88.9>3">SISC-CE2>Gi0/2</option>
<option value="43.95.88.5>3">SISC-CE1>Gi0/2</option>
<option value="43.88.32.237>11">SID-EC-S1>sp_wan</option>
<option value="43.95.74.54>2">SOEM_PG-CE1V.virtela.net>Gi0/0/1</option>
<option value="43.95.66.1>2">SAL-CE1>Gi0/1</option>
<option value="43.76.42.10>2">SEK-CE1V>Gi0/0</option>
<option value="43.95.94.5>2">SEV-CE2>Gi0/1</option>
<option value="43.95.78.9>2">SI-CE2>Gi0/1</option>
<option value="43.95.88.13>3">SID-CE1>Gi0/1</option>
<option value="43.95.76.5>1">SOK-CE1>Gi0/0</option>
<option value="43.95.86.9>37">SOMEA-CE1>Gi0/1.102</option>
<option value="43.95.92.9>2">SPH-CE1>Gi0/0/1</option>
<option value="43.95.70.2>2">STWN-CE1V>Gi0/1</option>
<option value="43.95.74.2>3">SOEM_KL-CE3>Gi0/1</option>
<option value="43.95.74.62>2">SOEM_KL-CE1V.virtela.net>Gi0/0/1</option>
<option value="43.95.74.46>2">SOMAS-CE1V>Gi0/0/1</option>
<option value="43.95.72.33>2">SDT-CE1.virtela.net>Gi0/0/1</option>
<option value="43.95.72.45>2">SOTHAI-CE>Gi0/0/1</option>
<option value="43.95.72.41>2">STT-CE1V.virtela.net>Gi0/0/1</option>
<option value="43.95.72.37>2">STTB-CE1.virtela.net>Gi0/0/1</option>
<option value="43.74.61.6>11">SOEM-PG-EC-S1>sp_wan</option>
<option value="43.95.92.2>2">SPHWNS-CE>Gi0/0/1</option>
<option value="43.95.65.1>4">GDC-CE1>Gi0/2</option>
<option value="43.72.61.5>11">SOTHAI-EC-S1>sp_wan</option>
<option value="146.215.74.110>3">IBP-CE1>Gi0/1</option>
<option value="43.95.86.9>2">SOMEA-CE1>Gi0/1</option>
<option value="43.95.88.5>2">SISC-CE1>Gi0/1</option>
<option value="43.95.88.9>2">SISC-CE2>Gi0/1</option>
</select>
My Add button HTML code is:
<button type="button" onclick="addToList(document.forms['queryform'].input_ifxlist,document.getElementById('input_ifxlist_opts'))">Add</button>
Sub login_page()
Dim ieApp As SHDocVw.InternetExplorer
Dim iedoc As MSHTML.HTMLDocument
'Dim ieApp As InternetExplorer
'Dim ieDoc As Object
'Dim ieTable As Object
'Dim clip As DataObject
Set ieApp = New SHDocVw.InternetExplorer
ieApp.Visible = True
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
Do Until ieApp.readyState = READYSTATE_COMPLETE: DoEvents: Loop
Set iedoc = ieApp.document
'fill in the login form – View Source from your browser to get the control names
With iedoc.forms(0)
.user.Value = "id"
.Password.Value = "Pass"
.submit
End With
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
ieApp.navigate "http:"
Do While ieApp.Busy: DoEvents: Loop
End Sub
You can select a particular value by using an attribute = value selector, e.g.
ieApp.document.querySelector("[value='43.72.38.250>11']").Selected = True
You can also set the index on the select element:
ieApp.document.querySelector("#input_ifxlist_opts").SelectedIndex = 2
You can click the button with an attribute = value selector using the starts-with (^=) operator:
ieApp.document.querySelector("[onclick^=addToList]").click
You could use a more compact syntax for proper page waits:
While ieApp.Busy Or ieApp.readyState < 4: DoEvents: Wend
You could also try:
ieApp.document.parentWindow.ExecScript "document.forms ['queryform'].input_ifxlist.value='43.96.83.197>11'"
ieApp.document.querySelector("[onclick^=addToList]").click

Pull data from HTML dropdown

I'm looking for a way to extract data from the available options from a website dropdown box, specifically the second optgroup "All fund companies".
Extract of the HTML code I'm scraping:
</div><div class="large-5 medium-5 columns spacer-bottom padding-left-none"><div class="select_wrap"><select id="search-company" name="companyid" class="default">
<option value="">Search by company</option>
<optgroup label="Popular companies">
<option value="4">Hargreaves Lansdown</option>
<option value="1908">Lindsell Train</option>
<option value="55">Jupiter</option>
<option value="191">Legal & General</option>
</optgroup>
<optgroup label="All fund companies">
<option value="218">Aberdeen</option>
<option value="1080">Aberforth Unit Trust Managers</option>
<option value="141">Allianz Global Investors</option>
<option value="3472">Alquity Investment Management Limited</option>
<option value="1324">Amati Global Investors Ltd</option>
VBA:
Set htmlObj = html.getElementById("search-company")
For Each Child In htmlObj.getElementByClassName("optgroup")(1).Children
sqlId = Child.Value
sqlCompany = Child.innerText
Debug.Print (sqlId & " - " & sqlCompany)
Next
Thanks Jeeped...
This is the code that managed to solve the problem... not sure if it's the most efficient way of doing it, but it works :)
Set htmlObj = html.getElementById("search-company")
For Each Child In htmlObj.Children
If Child.Label = "All fund companies" Then Set htmlObj2 = Child
Next
For Each Child In htmlObj2.Children
sqlId = Child.Value
sqlCompany = Child.innerText
Debug.Print (sqlId & " - " & sqlCompany)
Next
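As a possible alternative to the two loops above, a CSS attribute selector can target the options of one optgroup directly. This is a sketch under assumptions: it needs a reference to Microsoft HTML Object Library, the sub name ListAllFundCompanies and the shortened markup are illustrative only, and querySelectorAll is only available when the document is parsed in a mode that supports it. The result is walked by index with .Length and .Item, since For Each over a querySelectorAll result is often reported to be unreliable from VBA.

```vba
Option Explicit

' Sketch only: requires a reference to "Microsoft HTML Object Library".
Public Sub ListAllFundCompanies()
    Dim doc As New HTMLDocument
    Dim matches As Object, i As Long

    ' Shortened copy of the markup from the question, used as test input.
    doc.body.innerHTML = _
        "<select id=""search-company"">" & _
        "<option value="""">Search by company</option>" & _
        "<optgroup label=""Popular companies"">" & _
        "<option value=""4"">Hargreaves Lansdown</option>" & _
        "</optgroup>" & _
        "<optgroup label=""All fund companies"">" & _
        "<option value=""218"">Aberdeen</option>" & _
        "<option value=""141"">Allianz Global Investors</option>" & _
        "</optgroup></select>"

    ' Match only options nested in the optgroup with the wanted label.
    Set matches = doc.querySelectorAll( _
        "#search-company optgroup[label='All fund companies'] option")

    ' Walk the node list by index rather than with For Each.
    For i = 0 To matches.Length - 1
        Debug.Print matches.Item(i).Value & " - " & matches.Item(i).innerText
    Next i
End Sub
```

If the selector is not supported in a given environment, the Children-based loops above remain the safer fallback.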