VBA to click a dynamic href - html

I'm trying to click a link on a website with the tag:
<a href="/dbget-bin/www_bget?dr:D01441:>D01441</a>
However, I'm doing this after searching for a unique item (I have an array of >9000 unique items), and the "D01441" part is different for each item, and I don't know in advance what it will be for each. The following code is in a loop that goes through each item and searches for it one at a time. After searching, I would like to click on a link that appears (the code above) and do more things on that next web page.
Dim IE As Object
Dim ele As Object
Set IE = CreateObject("InternetExplorer.Application")
...
For Each ele In IE.document.getElementsByTagName("a")
If ele.Href = "/dbget-bin/www_bget?dr:D01441" Then
ele.Click
Exit For
End If
Next
The above code doesn't work and I'm not sure why. But once I get it to work, I don't know how to modify the "D01441" part so that I can click on any searched item's link. Here's more html around the link I want:
<tbody>
<tr> ... </tr>
<tr>
<td class = "data1">
<a href = "/dbget-bin/www_bget?dr:D01441:>D01441</a>
</td>
<td class = "data1">..</td>
<td class = "data1">..</td>
...
EDIT: To try to deal with the changing "D01441", I tried using InStr but it doesn't work either:
For Each ele In IE.document.getElementsByTagName("a")
If InStr(ele.Href, "/dbget-bin/www_bget?dr:") = 1 Then
MsgBox "There"
ele.Click
Exit For
End If
Next

CSS selectors:
Try using a CSS selector combination applied via querySelector method of document to target the common start part of the href.
Applying the selector combination:
IE.document.querySelector("a[href^='/dbget-bin/www_bget?dr:']").Click
Understanding the selector combination:
This uses a CSS selector combination to target the element with:
a[href^='/dbget-bin/www_bget?dr:']
This matches an a element whose href attribute value starts with
'/dbget-bin/www_bget?dr:'. The ^= operator means "starts with".
Side note:
If the page has several a elements whose href starts with /dbget-bin/www_bget?dr:, querySelector returns only the first one in document order. If that is not the link you want, seeing more HTML would help. I think there are a few problems with that HTML sample, because in theory a more selective CSS query would be .data1 a[href^='/dbget-bin/www_bget?dr:'], which also requires the parent element's class of data1 (the leading "." is the class selector).
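A likely reason the two loops in the question never match is that the .Href property returns the fully resolved URL (scheme and host included) rather than the relative text written in the markup, so neither the equality test nor InStr(...) = 1 can succeed. The following is a minimal sketch of a loop-based alternative, assuming IE is the already-navigated InternetExplorer object from the question; the names ENTRY_PREFIX, IsEntryHref and ClickFirstEntryLink are hypothetical helpers introduced only for illustration.

```vba
' Prefix that every entry link shares (taken from the question's markup).
Private Const ENTRY_PREFIX As String = "/dbget-bin/www_bget?dr:"

' Hypothetical helper: True when href contains the entry prefix anywhere,
' so it works for both relative and fully resolved URLs.
Function IsEntryHref(ByVal href As String) As Boolean
    IsEntryHref = InStr(1, href, ENTRY_PREFIX, vbBinaryCompare) > 0
End Function

' Hypothetical helper: clicks the first matching link on the page.
' Returns False when no link matches.
Function ClickFirstEntryLink(ByVal IE As Object) As Boolean
    Dim ele As Object
    For Each ele In IE.document.getElementsByTagName("a")
        If IsEntryHref(ele.href) Then
            ele.Click
            ClickFirstEntryLink = True
            Exit Function
        End If
    Next ele
End Function
```

Inside the search loop this could be called as If Not ClickFirstEntryLink(IE) Then Debug.Print "no entry link found", which also covers searches that return no result.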

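@QHarr's answer is the elegant and best solution, but...
To address your issue of getting the part number from the href, you can use InStrRev like this. It searches from the end of the string, which matters because ele.Href is reported as a full URL, so the first colon is the one after the scheme (http:) rather than the one before the part number.
Dim partNumber As String
Dim colonPosition As Long
For Each ele In IE.document.getElementsByTagName("a")
    'Position of the last colon in the URL (0 if there is none)
    colonPosition = InStrRev(ele.Href, ":")
    If colonPosition > 0 Then
        'Everything after the last colon, e.g. D01441
        partNumber = Mid$(ele.Href, colonPosition + 1)
        Debug.Print partNumber
    End If
Next ele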

Related

Internet Explorer click an icon and a button of a specific line of a table

I asked a similar question two days ago (see my previous question on a related problem), but I have now stumbled on a similar, yet somewhat different, problem.
I have a report of many lines with the same structure. I need to click an icon that is on the nth line. The report is structured in cells, so I know that my icon is in the first position (column) of that report. After I have clicked that icon, I'll also have to click a button in the 10th column.
I already know how to access the page in question with this code:
Sub click_button_no_hlink()
Dim i As Long
Dim IE As Object
Dim Doc As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application") 'create IE instance
IE.Visible = True
IE.Navigate "https://apex.xyz.qc.ca/apex/prd1/f?p=135:LOGIN_DESKTOP::::::" ' Address of web page
While IE.Busy: DoEvents: Wend 'loading page
This first part is easy, isn't it? And I know how to handle it. Afterwards I tried different variations around this, but each either does nothing or gives me an error message. Obviously I don't fully understand what I'm doing with the "querySelector" thing…
dim step_target as string
step_target = 2
'identify all the lines of my table containing lines, containing icons
'and button to click on
Set objCollection = IE.document.getElementsByClassName("highlight-row")
i = 0
Do While i < objCollection.Length
'cell 2 is the one containing the step I'm targetting
If objCollection.Item(i).Cells(2).innerText = step_target Then
'that's not doing anything
objCollection.Item(i).Cells(9).Click
'tried many syntax around this with no luck
IE.document.querySelector([objCollection.Item(i).Cells(9)]).FireEvent ("onclick")
End If
i = i + 1
Loop
Here's images of the code of the page
Showing all the lines of the report
Showing all code lines of a particular line
and now the code of that first icon I need to click on (this is where I need help ;-) how can I call that action)
and finally the code of that button I also need to click on
Again, I thank you all in advance, for the time you'll take to help me along this.
For the first element, you could try an attribute selector combined with a descendant combinator and a type selector:
ie.document.querySelector("[headers='ID_DET_DEM_TRAV_STD'] a").click
For the second, you could try an attribute selector combined with a descendant combinator and an input type selector:
ie.document.querySelector("[headers='BOUTON1'] input").click
An alternative for the second is:
ie.document.querySelector("[value=Fait]").click
Typically, if you want to select by position, e.g. the 1st and 10th columns, you would use:
td:nth-of-type(1)
td:nth-of-type(10)
You would also use tr:nth-of-type(n) to get the right row, e.g. first row, first column, and then add any child type selector you might need:
ie.document.querySelector("tr:nth-of-type(1) td:nth-of-type(1)")
With a child a tag:
ie.document.querySelector("tr:nth-of-type(1) td:nth-of-type(1) a")
With a child input tag, for the 4th row and 10th column:
IE.document.querySelector("tr:nth-of-type(4) td:nth-of-type(10) input").Click
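Since the row number changes from run to run, the positional selector can be built from variables instead of being typed by hand. This is only a sketch: CellSelector is a hypothetical helper name, and it assumes the rows and cells are plain tr and td elements as in the selectors above.

```vba
' Hypothetical helper: builds a CSS selector for the cell at a 1-based
' row/column position, optionally narrowed to a child tag inside that cell.
Function CellSelector(ByVal rowNumber As Long, ByVal colNumber As Long, _
                      Optional ByVal childTag As String = "") As String
    Dim selector As String
    selector = "tr:nth-of-type(" & rowNumber & ") td:nth-of-type(" & colNumber & ")"
    ' Append the child tag (e.g. "a" or "input") only when one was given
    If Len(childTag) > 0 Then selector = selector & " " & childTag
    CellSelector = selector
End Function
```

For example, IE.document.querySelector(CellSelector(4, 10, "input")).Click would target the same element as the last line above. Note that nth-of-type counts rows within their parent, so a header row or a nested table shifts the numbering.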

How to click on a button on a webpage using <td> and <tr>?

I am trying to click on the first "Completed" button in the highlighted part of the webpage below.
Here is a piece of the code of the web page:
I tried to click on the first "Completed" button in many different ways, such as:
For Each element In ie3.getElementsByTagName("main_table_data_right_border main_table_data_bottom_border")(5)
If element.innerText = "Completed" Then
' Application.Wait (Now + TimeValue("0:03:00"))
element.Click
Application.Wait (Now + TimeValue("0:00:20"))
Exit For
Else
End If
Next
Or
doc.querySelector("#divPage > table.advancedSearch_table > tbody").getElementsByTagName("tr")(3).getElementsByTagName("td")(5).Children(0).Click
But none of them seem to work. When I debug the code and I go through this part and this particular line, nothing really happens. So the button is not being clicked.
Can anyone help me with that?
You could use the getElementsByTagName method to find the hyperlink. Please refer to the following sample:
VBA code to find the hyperlink and click it (in this sample, I only target the specific cell in the first data row; if you want to process every hyperlink, use a For Each statement to loop through the rows).
Sub Test()
    Dim ie As Object
    Dim doc As Object
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.Navigate "http://localhost:54382/HtmlPage47.html"
    'Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop
    Set doc = ie.document
    'Row index 1 is the first data row; cell index 5 is the Status column
    doc.getElementsByTagName("tr")(1).getElementsByTagName("td")(5).getElementsByTagName("a")(0).Click
End Sub
Code in the Web page:
<div>
<table class="main_table" style="text-align:center;">
<tr class="main_table_header">
<td></td>
<td>Export Type</td>
<td>Criteria</td>
<td>Rep./List</td>
<td>Creation Date</td>
<td>Status</td>
<td>Reference</td>
</tr>
<tr class="main_table_data">
<td>
<input id="Checkbox1" type="checkbox" />
</td>
<td>Activites</td>
<td>Process Date from 2019/07/02 to 2019/07/02</td>
<td>For an advanced search</td>
<td>2019/07/03</td>
<td><a onclick="javascript:alert('hello AA')" id="link1" href="#">Completed</a> (601 lines)</td>
<td>662602308</td>
</tr>
<tr class="main_table_data">
<td>
<input id="Checkbox1" type="checkbox" />
</td>
<td>Activites</td>
<td>Process Date from 2019/07/02 to 2019/07/02</td>
<td>For an advanced search</td>
<td>2019/07/03</td>
<td><a onclick="javascript:alert('hello BB')" href="#">Completed</a> (601 lines)</td>
<td>662602308</td>
</tr>
<tr class="main_table_data">
<td>
<input id="Checkbox1" type="checkbox" />
</td>
<td>Activites</td>
<td>Process Date from 2019/07/02 to 2019/07/02</td>
<td>For an advanced search</td>
<td>2019/07/03</td>
<td><a onclick="javascript:alert('hello CC')" href="#">Completed</a> (601 lines)</td>
<td>662602308</td>
</tr>
<tr class="main_table_data">
<td>
<input id="Checkbox1" type="checkbox" />
</td>
<td>Activites</td>
<td>Process Date from 2019/07/02 to 2019/07/02</td>
<td>For an advanced search</td>
<td>2019/07/03</td>
<td><a onclick="javascript:alert('hello DD')" href="#">Completed</a> (601 lines)</td>
<td>662602308</td>
</tr>
</table>
</div>
I see you are a bit confused as to how to access HTML elements, so I'll take this opportunity to demonstrate the logic of doing so in a very detailed manner, which I also believe to be very intuitive. There are other ways to do it, but I believe the following one is the most comprehensive and intuitive one and ideal for a beginner.
Firstly, I will go ahead and assume that ie3 is an InternetExplorer object.
When you use this object to navigate to a page, you can access the html of that page by using the ie3.document, which holds an HTML document object.
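To take full advantage of the HTML document object you should add a reference to the Microsoft HTML Object Library (Tools > References in the VBA editor). This library lets you declare typed HTML elements, which makes your life easier. The InternetExplorer type used below additionally requires a reference to Microsoft Internet Controls.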
In your case, the elements you want to be able to access are
HTML tables and their rows and cells
HTML anchor elements (<a>)
So my declarations would be the following:
Dim ie3 As New InternetExplorer 'To be used to navigate to the page of interest
Dim doc As HTMLDocument 'this will hold the HTML document corresponding to the page
Dim toBeClicked As HTMLAnchorElement 'To be used to store the <a></a> element
Dim table As HTMLTable 'To be used to store the table element
Dim tableRow As HTMLTableRow 'To be used to store a row of the table element
Dim tableCell As HTMLTableCell 'To be used to store a cell of the table element
Assuming that you have already used ie3 to navigate to the website of interest, you can store its HTML document in doc like so:
Set doc = ie3.document
Once you have access to the HTML document of the webpage, you can also get access to its elements in a number of ways, some more targeted than others. Below I am demonstrating the most common methods to do that, using the table element as an example.
If the table has a unique ID, you can get access to it by using the .getElementById() method. This method returns a single element. In your case, the table you're after, doesn't have an ID.
If the table belongs to a class, you can get access to it by using the .getElementsByClassName() method. This method returns a collection of elements, all of which belong to the same class. To get access to a member of this collection you can use a (item index) kind of notation. The first member has an index of 0. In your case the table belongs to class "advancedSearch_table", which happens to only have one member.
If there's no class or ID you can use the .getElementsByTagName method. This method returns a collection of all the elements that have the same tag. In your case you would need all the tables in the document. To get access to a member of this collection you can use a (item index) kind of notation. The first member has an index of 0. Tags in HTML look like so <tagName attribute="something">Something</tagName>.
Below I demonstrate all three methods. You can use either one of the first two:
Set table = doc.getElementsByClassName("advancedSearch_table")(0)
Set table = doc.getElementsByTagName("table")(0)
Set table = doc.getElementById("ID of the table") 'only for demonstration purposes, it doesn't apply to your case, as the table has no ID.
Keep in mind that in your case, there is only one table in the document and there's only one element that belongs to the class "advancedSearch_table". This means that you need the first element of the corresponding collections. That's why I use 0 as index.
By the same logic as above, now that the table has been stored, you can get access to its rows and cells. More specifically, you need the 5th cell of the 4th row. That's where the link that you want to click is:
Set tableRow = table.getElementsByTagName("tr")(3)
Set tableCell = tableRow.getElementsByTagName("td")(4)
Finally, now that the cell of interest has been stored, you can access the anchor element and click it. Again, there's only one anchor element in the cell, so it's going to be the first one in the corresponding collection:
Set toBeClicked = tableCell.getElementsByTagName("a")(0)
toBeClicked.Click
BONUS
If you want to click on all the "Completed" links, one by one, you need to loop through the corresponding elements. Here are two ways to do it:
Click on the anchor in the 5th cell of each row:
For Each tableRow In table.Rows
Set toBeClicked = tableRow.getElementsByTagName("td")(4).getElementsByTagName("a")(0)
toBeClicked.Click
Next tableRow
Loop through all rows and through all cells of the table, find the inner text that you're looking for and click the corresponding anchor:
For Each tableRow In table.Rows
    For Each tableCell In tableRow.Cells
        If tableCell.innerText = "Something" Then
            Set toBeClicked = tableCell.getElementsByTagName("a")(0)
            toBeClicked.Click
        End If
    Next tableCell
Next tableRow
Here, once you click the "Completed" hyperlink, JavaScript is executed and opens an Excel file. You can call that script directly with ie3.Navigate "javascript:openExcelFile('t83_Kerrfinancialadvisorsinc/455X3/ExportActivity_66260230820190703122002139.xlsx')"
Since it's tied with a hyperlink, you can also try using
element.Click
element.FireEvent ("onclick")
or you can use execScript
Call ie3.document.parentWindow.execScript("your script in webpage", "JavaScript")

Retrieving the text between the <div> with VBA

I am trying to get a text string from inside a div on a webpage, but I can't seem to figure out how it is stored in the element.
Set eleval = objIE.Document.getElementsByClassName("outputValue")(0)
Debug.Print (eleval.innerText)
I have tried this and variations thereof, but my string just reads as "".
I mainly need help on how this type of data is referenced in VBA.
<div class="outputValue">"text data that I want"</div>
Here is a screenshot of the page in question, I cannot give a link since it requires a company login to reach.
With the .querySelector method, make sure the page is fully loaded before attempting this.
Example delays can be added with Application.Wait Now + TimeSerial(h, m, s)
Set eleval = objIE.Document.querySelector("div[class='outputValue']")
Debug.Print eleval.innerText
If it is the first of its className on the page you could also use:
Set eleval = objIE.Document.querySelector(".outputValue")
If there is more than one and it is at a later index you can use
Set eleval = objIE.Document.querySelectorAll(".outputValue")
And then access items from the nodeList returned with
Debug.Print eleval.Item(0).innerText 'or replace 0 with the appropriate index.
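A fixed delay either wastes time or is too short when the page is slow. As an alternative, here is a rough sketch that polls for the element until it appears or a timeout expires. WaitForElement is a hypothetical helper name, the 10-second default is an arbitrary assumption, and the sketch relies on querySelector yielding Nothing when there is no match. Timer resets at midnight, so a wait that spans midnight would need extra handling.

```vba
' Hypothetical helper: polls the document until an element matching
' cssSelector exists or timeoutSeconds have elapsed.
' Returns the element, or Nothing when the timeout was reached.
Function WaitForElement(ByVal doc As Object, ByVal cssSelector As String, _
                        Optional ByVal timeoutSeconds As Double = 10) As Object
    Dim startTime As Double
    Dim found As Object
    startTime = Timer
    Do
        Set found = doc.querySelector(cssSelector)
        If Not found Is Nothing Then Exit Do
        DoEvents   ' keep the host application responsive while waiting
    Loop While Timer - startTime < timeoutSeconds
    Set WaitForElement = found
End Function
```

It could be used as Set eleval = WaitForElement(objIE.Document, ".outputValue"), followed by a check that eleval is not Nothing before reading innerText.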
Dim elaval as Variant
elaval = Trim(Doc.getElementsByTagName("div")(X).innerText)
msgbox elaval
Where X is the instance of your class div

Get data from a web table with table tag

I have this code in HTML:
<table cellspacing = "0" cellpadding = "0" width = "100%" border="0">
<td class="TOlinha2"><span id="Co">140200586125</span>
I already have a VBA function that accesses a web site, logs in and goes to the right page. Now I'm trying to take the td tags inside a table in HTML. The value I want is 140200586125, but I want a lot of td tags, so I intend to use a for loop to get those tds and put them in a worksheet.
I have tried both:
.document.getElementByClass()
and:
.document.getElementyById()
but neither worked.
Appreciate the help. I'm from Brazil, so sorry about any English mistakes.
There is not enough HTML to determine if the TOlinha2 is a consistent class name for all the tds within the table of interest; and is limited only to this table. If it is then you can indeed use .querySelectorAll
You could use the CSS selector:
ie.document.querySelectorAll(".TOlinha2")
Where "." stands for className.
You cannot iterate over the returned NodeList with a For Each Loop. See my question Excel crashes when attempting to inspect DispStaticNodeList. Excel will crash and you will lose any unsaved data.
You have to loop over the length of the nodeList, e.g.
Dim i As Long
For i = 0 To nodeList.Length - 1
Debug.Print nodeList(i).innerText
Next i
Sometimes you need different syntax which is:
Debug.Print nodeList.Item(i).innerText
You can further narrow this CSS selector with more qualifying elements, e.g. the element must be within a tbody (i.e. in a table), inside a tr (table row), and have the class name TOlinha2:
ie.document.querySelectorAll("tbody tr .TOlinha2")
Since you mentioned you need to retrieve multiple <td> tags, it would make more sense to retrieve the entire collection rather than using getElementById() to get them one-at-a-time.
Based on your HTML above, this would match all <span> nodes within a <td> with a class='TOlinha2':
Dim node, nodeList
Set nodeList = ie.document.querySelectorAll("td.TOlinha2 > span")
For Each node In nodeList
MsgBox node.innerText ' This should return the text within the <span>
Next

extracting text from a specific <h> element using GetElementById

I have created a VBS script file that looks at an XML data file.
Within the XML data file, the HTML data I need is embedded within a CDATA section:
<![CDATA['other interesting HTML data here']]>
I have stripped out this HTML data using XPath and inserted it into a Div element (myDiv) that is held in a variable (it's not written to a document).
So for example, the contents of myDiv.innerHTML looks like this;
<table>
<tr><td>text in cell 1</td></tr>
<tr><td><h1 id="myId1">my text for H1</h1></td></tr>
<tr><td><h2 id="myId2">my text for h2</h2></td></tr>
</table>
What I want to do at first is simply select the appropriate tag with the Id that matches "myId1", therefore, I used a statement like this;
MyIdText = MyDiv.getElementById("myId1")
However, the application I am using says "Err 438, Object doesn't support this property or method".
I am a bit of a newbie with code and can understand some of the basic fundamentals, but get a bit lost when it becomes more complex (sorry). I have looked through other postings on this board, and all of them seem to relate to HTML and JavaScript, not VBScript (the application I am using will not allow JavaScript).
Am I using the code wrong?
getElementById() is a method of the document object, not of individual elements, so you should write document.getElementById("myId1"). This tells the browser to search the whole document for the specified ID. MyDiv is an element, which has no such method, so your code generates the error above.
To extract the text inside the specific H element:
MyIdText = document.getElementById("myId1").innerText
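Because the div in the question is never added to the document, document.getElementById cannot see its contents. One possible workaround, sketched here under that assumption, is to walk the element's descendants with getElementsByTagName("*"), which elements do support, and compare each id. FindDescendantById is a hypothetical helper name. The sketch is written as VBA; in VBScript the type declarations (the As clauses) would have to be dropped.

```vba
' Hypothetical helper: returns the first descendant of container whose id
' equals wantedId, or Nothing when there is no such descendant.
Function FindDescendantById(ByVal container As Object, _
                            ByVal wantedId As String) As Object
    Dim node As Object
    ' "*" matches every descendant element regardless of tag name
    For Each node In container.getElementsByTagName("*")
        If node.ID = wantedId Then
            Set FindDescendantById = node
            Exit Function
        End If
    Next node
End Function
```

With the div from the question this would be used as Set heading = FindDescendantById(MyDiv, "myId1"), followed by heading.innerText once heading has been checked against Nothing.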
Many thanks for the help. Unfortunately, I know a little VBS and even less about the DOM, and I am trying to learn both by experimenting. There are certain restrictions within the environment/application I am working with (it's called ASCE and it's a tool for managing Safety Cases, but that's not important right now).
However, so that we are comparing apples with apples, I have tried to experiment within an HTML page to give me a better understanding of what the DOM/VBS commands can actually do. I have had some partial success, but still can't understand why it falls over where it does.
Here is the exact file I am experimenting with, I have added comment text for each section;
<html>
<head>
<table border=1>
<tr>
<td>text in cell 1</td>
</tr>
<tr>
<td><h1 id="myId1">my text for H1</h1></td>
</tr>
<tr>
<td><h1 id="myId2">my text for h2</h2></td>
</tr>
</table>
<script type="text/vbscript">
DoStuff
Sub DoStuff
' Section 1: Get a node with the Id value of "myId1" from the above HTML
' and assign it to the variable 'GetValue'
' This works fine :-)
Dim GetValue
GetValue = document.getElementById("myId1").innerHTML
MsgBox "the text=" & GetValue
' Section 2: Create a query that assigns to the variable 'MyH1Tags' all of the <h1>
' tags in the document.
' I assumed that this would be a collection of <h1> tags so I set up a loop to iterate
' through however many there were, but this fails as the browser says that this object
' doesn't support this property or method - This is where I am stuck
Dim MyH1Tags
Dim H1Tag
MyH1Tags = document.getElementsByTagName("h1") ' this works
For Each H1Tag in MyH1Tags ' this is where it falls over
MSgbox "Hello"
Next
' Section 3: Create a new Div element 'NewDiv' and then insert some HTML 'MyHTML'
' into 'NewDiv'. Create a query 'MyHeadings' that extracts all h1 headings from 'NewDiv'
' then loop round for however many h1 headings there are in 'MyHeadings'
' and display the text content. This works Ok
Dim NewDiv
Dim MyHTML
Dim MyHeadings
Dim MyHeading
Set NewDiv = document.createElement("DIV")
MyHTML="<h1 id=""a"">heading1</h1><h2 id=""b"">Heading2</h2>"
NewDiv.innerHTML=MyHTML
Set MyHeadings = NewDiv.getElementsByTagName("h1")
For Each MyHeading in MyHeadings
Msgbox "MyHeading=" & MyHeading.innerHTML
Next
'Section 4: Do a combination of Section 1 (that works) and Section 3 (that works)
' by creating a new Div element 'NewDiv2' and then paste into it some HTML
' 'MyHTML2' and then attempt to create a query that extracts the inner HTML from
' an id attribute with the value of "a". But this doesnt work either.
' I have tried "Set MyId = NewDiv2.getElementById("a").innerHTML" and
' also tried "Set MyId = NewDiv2.getElementById("a")" and it always falls over
' at the same line.
Dim NewDiv2
Dim MyHTML2
Dim MyId
Set NewDiv2 = document.createElement("DIV")
MyHTML2="<h1 id=""a"">heading1</h1><h2 id=""b"">Heading2</h2>"
NewDiv2.innerHTML=MyHTML
MyId = NewDiv2.getElementById("a").innerHTML
End Sub
</script>
</head>
<body>