VBA scraping with same class name but different innertext - html

I am scraping a value from a website, but it turns out the value I need shares its class name with other elements.
HTML code:
<tr class="table_bdrow1_style">
<td></td>
<td style="text-align:center" class="table_bdtext_style">1.</td>
<td style="text-align:center" class="table_bdtext_style">
<div id="a">
"0.8948"
</div>
</td>
<td style="text-align:center" class="table_bdtext_style">December 19, 2016</td>
</tr>
I need the value in the second cell (0.8948) and the date in the third cell (December 19, 2016), but the code I am using only returns the first value (1.).
extract1 = IE.Document.getElementsByClassName("table_bdtext_style")(1).innerText
Cells(4, "A").Value = extract1
How can I extract the second and third values but not the first? Can anyone help? Thanks a lot!

Pass the index of the element you want to the collection returned by getElementsByClassName:
' for second tag
extract2 = IE.Document.getElementsByClassName("table_bdtext_style")(2).innerText
' for third tag
extract3 = IE.Document.getElementsByClassName("table_bdtext_style")(3).innerText
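
For illustration, the indexing can be reproduced outside the browser. The sketch below is Python rather than VBA (an assumption made so that it runs standalone) and applies the standard library's ElementTree to the row from the question alone. The quotation marks around 0.8948 are assumed to be developer-tools display and are left out. Collections of this kind are zero-based, so within this single row the three cells sit at 0, 1 and 2. On the live page the asker's index 1 returned "1.", which suggests one earlier element with the same class, hence 2 and 3 in the answer.

```python
import xml.etree.ElementTree as ET

# The row from the question, trimmed to well-formed XML.
ROW = """
<tr class="table_bdrow1_style">
  <td></td>
  <td style="text-align:center" class="table_bdtext_style">1.</td>
  <td style="text-align:center" class="table_bdtext_style">
    <div id="a">0.8948</div>
  </td>
  <td style="text-align:center" class="table_bdtext_style">December 19, 2016</td>
</tr>
"""

row = ET.fromstring(ROW)

# findall returns a zero-based list, much like getElementsByClassName
# returns a zero-based collection. The first <td> has no class and is skipped.
cells = row.findall("td[@class='table_bdtext_style']")
texts = ["".join(cell.itertext()).strip() for cell in cells]

for index, text in enumerate(texts):
    print(index, text)
```

If the printed index of a value differs from the one that works on the real page, another element with the same class probably precedes the row.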

Related

Watir following_sibling Not Seeing Element

I'm trying to get an element's text and it somehow isn't working.
## html:
<table class="infobox vcard">
(snip)
<tbody>
<tr>(stuff)</tr>
<tr>(stuff)</tr>
<tr>
<th scope="row" class="infobox-label" style="padding-right: 0.5em;">Website</th>
<td class="infobox-data" style="line-height: 1.35em;">
<a rel="nofollow" class="external text" href="https://group.pingan.com">https://group.pingan.com</a>
</td>
</tr>
<tr>(stuff)</tr>
</tbody>
## code:
website = b.element(class: 'vcard').element(visible: 'Website').following_sibling(index: 1).text
## error:
/Users/rich/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/watir-7.1.0/lib/watir/locators/element/selector_builder.rb:149:in `raise_unless': expected one of [TrueClass, FalseClass], got "Website":String (TypeError)
That line with the text "Website" is the only `row` in that `table`, which is why I'm using it. I thought the `.following_sibling` method would handle this, but it's not usable for some reason and I don't know why. This is on `Wikipedia`.
Does anybody know what I'm doing wrong? How can I fix this?
There are 2 issues:
The exception occurs on element(visible: 'Website'). The :visible locator expects either true (the element can be seen by a person) or false (the element cannot be seen by a person). As you want to locate the cell by its text, use the :text locator.
For following_sibling, the :index locator is 0-based. Using index: 1 looks for the second sibling, which doesn't exist. The cell with the link is index: 0, which is the default, so it does not need to be specified explicitly.
Putting it together gives:
browser.element(class: 'vcard').element(text: 'Website').following_sibling.text
=> "https://group.pingan.com"
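
The zero-based sibling index can be pictured with a small standalone sketch. It is written in Python with the standard library's ElementTree, not Watir, and only models the idea: the siblings after the "Website" header form a list with a single entry, so position 0 exists and position 1 does not. The row is trimmed from the question's HTML.

```python
import xml.etree.ElementTree as ET

# The "Website" row from the question, trimmed to well-formed XML.
ROW = """
<tr>
  <th scope="row" class="infobox-label">Website</th>
  <td class="infobox-data">
    <a rel="nofollow" class="external text" href="https://group.pingan.com">https://group.pingan.com</a>
  </td>
</tr>
"""

row = ET.fromstring(ROW)
children = list(row)

# Locate the header cell by its text, which is what a text-based lookup does.
label = next(c for c in children if (c.text or "").strip() == "Website")

# Everything after the header cell: its following siblings, zero-based.
following = children[children.index(label) + 1:]

website = "".join(following[0].itertext()).strip()
print(len(following), website)
```

Asking this list for position 1 raises an IndexError, which mirrors the missing second sibling described in the answer.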

Generate HTML tag with more than one attribute in SQL request

Could you please help me understand how I can generate XML/HTML with more than one attribute?
I have this SQL code:
select
[td/@align] = 'center', td = format(GETDATE(),'dd.MM.yyyy'), null
for xml path('tr')
This code returns as its result:
<tr>
<td align="center">16.09.2020</td>
</tr>
and I need
<tr>
<td align="center" style="background-color: red;">16.09.2020</td>
</tr>
I can't work out how to do this.
If I try something like [td/@align/@style], SQL Server raises an error:
Column name 'td/@align/@style' contains an invalid XML identifier as required by FOR XML; '@'(0x0040) is the first character at fault
Are you looking for this?
select 'center' AS [td/@align]
,'background-color: red;' AS [td/@style]
,format(GETDATE(),'dd.MM.yyyy') AS [td]
for xml path('tr')
It yields this:
<tr>
<td align="center" style="background-color: red;">16.09.2020</td>
</tr>
You can think of the columns of one row as the tag's value and its attributes, which are assigned through the alias after AS. So, for more attributes, just add a new value with the corresponding alias, td/@....

Find specific element position in XPath after checking a condition

I have the following html I am working with: (a chunk of it here)
<table class="detailTable">
<tbody>
<tr>
<td class="detailTitle" align="top">
<h3>Credit Limit:</h3>
<h3>Current Balance:</h3>
<h3>Pending Balance:</h3>
<h3>Available Credit:</h3>
</td>
<td align="top">
<p>$677.77</p>
<p>$7.77</p>
<p>$7.77</p>
<p>$677.77</p>
</td>
<td class="detailTitle">
<h3>Last Statement Date:</h3>
<h4>Payment Address</h4>
</td>
<td>
<p> 05/19/2015 </p>
<p class="attribution">
</td>
</tr>
</tbody>
</table>
I need to first check whether "Statement Date" exists, and then find its position. Then I need to get its value, which is in the corresponding <p> tag. I need to do this using XPath. Any suggestions?
So far I have tried //table[@class='detailTable'][1]//td[2]//p[position(td[contains(.,'Statement Date')])] but it doesn't work.
This is one possible way (formatted for readability):
//table[@class='detailTable']
//tr
/td[*[contains(.,'Statement Date')]]
/following-sibling::td[1]
/*[position()
=
count(
parent::td
/preceding-sibling::td[1]
/*[contains(.,'Statement Date')]/preceding-sibling::*
)+1
]
Explanation:
..../td[*[contains(.,'Statement Date')]] : from the beginning up to this part, the XPath finds the td element in which at least one child contains the text "Statement Date"
/following-sibling::td[1] : from the previously matched td, navigate to the nearest following sibling td ...
/*[position() = count(parent::td/preceding-sibling::td[1]/*[contains(.,'Statement Date')]/preceding-sibling::*)+1] : ...and return the child element whose position equals the position of the element containing "Statement Date" in the previous td. Note that count(preceding-sibling::*)+1 gives the position index of the element containing the text "Statement Date".
You can do it this way:
//table[@class='detailTable'][1]//td[@class="detailTitle" and contains(./h3, 'Statement Date')]/following-sibling::td[1]/p[1]/text()
This will find the <td> that contains the Statement Date heading, and get the <td> immediately after it. Then it gets the text content of the first p in that <td>.
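
The positional matching described in the first answer can be cross-checked without an XPath 1.0 engine. The following is a minimal sketch in Python, assuming the standard library's ElementTree, which supports only a subset of XPath (no following-sibling axis), so the sibling and position steps are done in plain Python. value_for_label is a hypothetical helper name, and the unclosed <p class="attribution"> from the question is closed here so that the snippet parses as XML.

```python
import xml.etree.ElementTree as ET

TABLE = """
<table class="detailTable">
  <tbody>
    <tr>
      <td class="detailTitle" align="top">
        <h3>Credit Limit:</h3>
        <h3>Current Balance:</h3>
        <h3>Pending Balance:</h3>
        <h3>Available Credit:</h3>
      </td>
      <td align="top">
        <p>$677.77</p>
        <p>$7.77</p>
        <p>$7.77</p>
        <p>$677.77</p>
      </td>
      <td class="detailTitle">
        <h3>Last Statement Date:</h3>
        <h4>Payment Address</h4>
      </td>
      <td>
        <p> 05/19/2015 </p>
        <p class="attribution"></p>
      </td>
    </tr>
  </tbody>
</table>
"""


def value_for_label(table, label):
    """Return the text paired with `label`, or None if the label is absent."""
    for row in table.iter("tr"):
        cells = list(row)
        # Pair each cell with the cell that immediately follows it.
        for title_cell, value_cell in zip(cells, cells[1:]):
            for position, heading in enumerate(title_cell):
                if label in "".join(heading.itertext()):
                    values = list(value_cell)
                    # Take the child at the same position in the next cell.
                    if position < len(values):
                        return "".join(values[position].itertext()).strip()
    return None


table = ET.fromstring(TABLE)
print(value_for_label(table, "Statement Date"))
```

A return value of None covers the "check if it exists" part of the question, and any other return value is the text at the matching position.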

Scraping HTML by Class in VBA

I have the following HTML:
<div class="property-title visible-xs">
<a href="/property/473902/Office-Lot">
<h2><b> 2nd Floor, Block D5, Solaris Dutamas, No. 1, Jalan Dutamas 1, 50480, Kuala Lumpur</b></h2>
</a>
</div>
<p style="color: #0071ee;">Office Lot</p>
<h4><b>RM 880,000</b></h4>
<div>
<table>
<!-- <tr><td>Office Lot</td></tr> -->
<tr>
<td>Property Code</td><td>:</td><td>PB473902</td>
</tr>
<tr>
<td>Auction Date</td><td>:</td><td>2016-02-26</td>
</tr>
<tr>
<td>Built up </td><td>:</td><td>754 sq.ft </td>
</tr>
<tr>
<td>Tenure</td><td>:</td><td>Freehold</td>
</tr>
I used the following code to extract the details "2nd Floor, Block D5,....":
objIE1.Document.getElementsByClassName("property-title visible-xs").getElementsByTagName ("a")
but it doesn't seem to return the result I need. Please help.
The HTML shown appears multiple times on the page.
This will work:
extract1 = objIE1.Document.getElementsByClassName("property-title visible-xs")(0).getElementsByTagName("a")(0).innerText
Cells(1,1).Value = extract1
When a method name contains getElementsBy (plural, "Elements"), such as getElementsByClassName or getElementsByTagName, it returns a collection of elements, so you need to specify which one you want. Here it is the first one, which has index 0 because the collection is zero-based. When a method name contains getElementBy (singular, "Element"), such as getElementById, it returns a single element, so no index is needed.
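
The collection-versus-single-element distinction is not specific to VBA. As a rough analogy in Python (an assumption made so that the example runs standalone; it does not touch Internet Explorer), ElementTree's findall returns a list that must be indexed, while find returns one element directly. The HTML is the title block from the question wrapped in a hypothetical <body> root. Unlike getElementsByClassName, the ElementTree predicate matches the whole class attribute string exactly.

```python
import xml.etree.ElementTree as ET

PAGE = """
<body>
  <div class="property-title visible-xs">
    <a href="/property/473902/Office-Lot">
      <h2><b> 2nd Floor, Block D5, Solaris Dutamas, No. 1, Jalan Dutamas 1, 50480, Kuala Lumpur</b></h2>
    </a>
  </div>
</body>
"""

page = ET.fromstring(PAGE)

# Plural-style lookup: a list comes back, so pick an entry by zero-based index.
title_divs = page.findall(".//div[@class='property-title visible-xs']")
links = title_divs[0].findall("a")
address = "".join(links[0].itertext()).strip()

# Singular-style lookup: one element comes back, so no index is needed.
first_link = page.find(".//div[@class='property-title visible-xs']/a")

print(address)
print(first_link.get("href"))
```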

How to use xpath to select a value in a drop down list located in a particular row?

I am testing a page using Selenium WebDriver. I have rows of data that represent 'requests', and in the last column of each row there is a drop-down list that lets the user 'approve' or 'reject' the request.
I need to be able to select the approve option on the drop down list of a row whose 'Name' column is equal to a variable (in this instance say the variable is 'John').
In this test the user will be approving 'John's' request by selecting approve. How do I use xpath to ensure I am selecting the correct drop down element for the right person (right row)? Will I need to include a select element within an xpath somehow?
An example of the select element method to select a drop down element:
new SelectElement(this.Driver.FindElement(By.Name("orm")).FindElement(By.Name("Tutors"))).SelectByText(tutorName);
<form name="RequestsForm" action="SubmitRequest.aspx" method="POST">
<h2 class="blacktext" align="center">Course approvals</h2>
<table class="cooltable" width="90%" border="0" cellspacing="1" cellpadding="1">
<tbody>
<tr>
<td class="heading">
<b>Name</b>
</td>
<td class="heading">
<b>Request Date</b>
</td>
<td class="heading">
<b>Approved</b>
</td>
</tr>
<tr>
<td>
John
<input id="T1" type="text" value="888" name="T1">
</td>
<td>1/3/2015</td>
<td>
<select id="D1" class="selecttext" size="1" name="D1">
<option>?</option>
<option value="Approved">Approved</option>
<option>Rejected</option>
</select>
</td>
</tr>
</tbody>
</table>
Using XPath, this gets the position where the Name column is in your table:
count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1
You can use that position to get the corresponding table cell in the other rows. This selects the corresponding td in the second row (where ... stands for the path inside count() above):
//table[@class='cooltable']/tbody/tr[2]/td[count( ... )+1]
Appending /text() will extract the text (with spaces). Using normalize-space() will trim the text so you can compare it with John:
normalize-space(//table[@class='cooltable']/tbody/tr[2]/td[count( ... )+1]/text()) = 'John'
To select only the tr which contains John in the Name column, you leave only the td in the predicate. Now it returns a node-set of all tr elements which match the predicate text = John:
//table[@class='cooltable']/tbody/tr[normalize-space(td[count( ... )+1]/text()) = 'John']
Finally, if you append //select/option[@value='Approved'] to that expression, you select the option whose value attribute is Approved within that tr. Here is the full XPath expression:
//table[@class='cooltable']/tbody/tr[normalize-space(td[count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1]/text()) = 'John']//select/option[@value='Approved']
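
The same column-then-row logic can be checked step by step outside Selenium. This is a minimal sketch in Python, assuming the standard library's ElementTree, which supports only a subset of XPath (no count() or preceding-sibling), so the column position is computed in Python instead. approve_option is a hypothetical helper name, and the <input> tag is self-closed so that the table parses as XML.

```python
import xml.etree.ElementTree as ET

TABLE = """
<table class="cooltable" width="90%" border="0" cellspacing="1" cellpadding="1">
  <tbody>
    <tr>
      <td class="heading"><b>Name</b></td>
      <td class="heading"><b>Request Date</b></td>
      <td class="heading"><b>Approved</b></td>
    </tr>
    <tr>
      <td>
        John
        <input id="T1" type="text" value="888" name="T1"/>
      </td>
      <td>1/3/2015</td>
      <td>
        <select id="D1" class="selecttext" size="1" name="D1">
          <option>?</option>
          <option value="Approved">Approved</option>
          <option>Rejected</option>
        </select>
      </td>
    </tr>
  </tbody>
</table>
"""


def approve_option(table, person):
    """Return the 'Approved' <option> in the row whose Name cell equals `person`."""
    rows = table.findall("./tbody/tr")
    # Step 1: zero-based position of the Name column in the header row.
    headers = ["".join(cell.itertext()).strip() for cell in rows[0]]
    name_col = headers.index("Name")
    for row in rows[1:]:
        cells = list(row)
        # Step 2: cell.text is the text before the first child element,
        # which is trimmed here the way normalize-space() trims td/text().
        if name_col < len(cells) and (cells[name_col].text or "").strip() == person:
            # Step 3: search for the option inside that row only.
            return row.find(".//select/option[@value='Approved']")
    return None


table = ET.fromstring(TABLE)
option = approve_option(table, "John")
print(option.get("value"))
```

Looking up the header position first keeps the lookup independent of where the Name column sits, which is the point of the count(...)+1 step in the XPath.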