I'm trying to get an element's text and it somehow isn't working.
## html:
<table class="infobox vcard">
(snip)
<tbody>
<tr>(stuff)</tr>
<tr>(stuff)</tr>
<tr>
<th scope="row" class="infobox-label" style="padding-right: 0.5em;">Website</th>
<td class="infobox-data" style="line-height: 1.35em;">
<a rel="nofollow" class="external text" href="https://group.pingan.com">https://group.pingan.com</a>
</td>
</tr>
<tr>(stuff)</tr>
</tbody>
## code:
website = b.element(class: 'vcard').element(visible: 'Website').following_sibling(index: 1).text
## error:
/Users/rich/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/watir-7.1.0/lib/watir/locators/element/selector_builder.rb:149:in `raise_unless': expected one of [TrueClass, FalseClass], got "Website":String (TypeError)
That line with the text "Website" is the only `row` in that `table`, which is why I'm using it. I thought using the `.following_sibling` method would sort this out, but it's not usable for some reason and I don't know why. This is on `Wikipedia`.
Does anybody know what I'm doing wrong? How can I fix this?
There are 2 issues:
The exception is raised by element(visible: 'Website'). The :visible locator expects either true (the element can be seen by a person) or false (the element cannot be seen by a person). Since you want to locate the cell by its text, use the :text locator.
For following_sibling, the :index locator is 0-based. Using index: 1 looks for the second sibling, which doesn't exist. The cell with the link is at index: 0, which is the default, so it does not need to be specified explicitly.
Putting it together gives:
browser.element(class: 'vcard').element(text: 'Website').following_sibling.text
=> "https://group.pingan.com"
I have this block in Thymeleaf where I'm trying to show the name of a protocolVersion instead of the id. I only have the id, so I pass the list of protocolVersion (protocolVersions) and iterate over it to show the one that matches the id (1007 is just a test).
<th:block th:each="item: ${protocolVersions}">
<tr th:if="${item.id == 1007}">
<td th:text="${item.name}"></td>
</tr>
</th:block>
I'm getting the error message:
Exception evaluating SpringEL expression: "item.id == 1007"
I've also tried something like this, as I've found it in one of the questions here:
<td th:if="${protocolVersions.?[id == '${excelMongoDoc.protocolVersionId}']}"
th:text="${protocolVersions[id == '${excelMongoDoc.protocolVersionId}'].name}">
</td>
But this is not working either. Can anyone help please?
Thanks
The expression is simply: ${protocolVersions.^[id==1007].name}. So you could do this in your table:
<tr>
<td th:text="${protocolVersions.^[id==1007].name}"></td>
</tr>
If you want to check against a variable rather than hardcoding 1007, you could do something like this:
<th:block th:with="check=1007">
<td th:text="${protocolVersions.^[id==#root.check].name}"></td>
</th:block>
I'm scraping a value from a website, but it turns out the value I need shares the same class name as the others.
HTML code
<tr class="table_bdrow1_style">
<td></td>
<td style="text-align:center" class="table_bdtext_style">1.</td>
<td style="text-align:center" class="table_bdtext_style">
<div id="a">
"0.8948"
</div>
</td>
<td style="text-align:center" class="table_bdtext_style">December 19, 2016</td>
</tr>
I need the value of the second line (0.8948) and the third line, the date value (December 19, 2016), but the code I am using only shows me the first value (1).
extract1 = IE.Document.getElementsByClassName("table_bdtext_style")(1).innerText
Cells(4, "A").Value = extract1
I'm not sure how I can extract the second and third values but not the first. Can anyone help? Thanks a lot!
Just assign the respective index in your extract call:
' for second tag
IE.Document.getElementsByClassName("table_bdtext_style")(2).innerText
' for third tag
IE.Document.getElementsByClassName("table_bdtext_style")(3).innerText
I have the following html I am working with: (a chunk of it here)
<table class="detailTable">
<tbody>
<tr>
<td class="detailTitle" align="top">
<h3>Credit Limit:</h3>
<h3>Current Balance:</h3>
<h3>Pending Balance:</h3>
<h3>Available Credit:</h3>
</td>
<td align="top">
<p>$677.77</p>
<p>$7.77</p>
<p>$7.77</p>
<p>$677.77</p>
</td>
<td class="detailTitle">
<h3>Last Statement Date:</h3>
<h4>Payment Address</h4>
</td>
<td>
<p> 05/19/2015 </p>
<p class="attribution">
</td>
</tr>
</tbody>
</table>
I need to first check if "Statement Date" exists, and then find its position. Then get its value, which is in a corresponding <p> tag. I need to do this using XPath. Any suggestions?
So far I tried using //table[@class='detailTable'][1]//td[2]//p[position(td[contains(.,'Statement Date')])] but it doesn't work.
This is one possible way (formatted for readability):
//table[@class='detailTable']
//tr
/td[*[contains(.,'Statement Date')]]
/following-sibling::td[1]
/*[position()
=
count(
parent::td
/preceding-sibling::td[1]
/*[contains(.,'Statement Date')]/preceding-sibling::*
)+1
]
Explanation:
..../td[*[contains(.,'Statement Date')]] : From the beginning up to this part, the XPath finds the td element in which at least one child contains the text "Statement Date".
/following-sibling::td[1] : From the previously matched td, navigate to the nearest following sibling td...
/*[position() = count(parent::td/preceding-sibling::td[1]/*[contains(.,'Statement Date')]/preceding-sibling::*)+1] : ...and return the child element whose position equals the position of the element containing the text "Statement Date" in the previous td. Note that count(preceding-sibling::*)+1 is used here to get the position index of the element containing the text "Statement Date".
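The same position-matching idea can be followed step by step outside XPath. This is only a minimal Ruby sketch, assuming the table has been tidied into well-formed XML (the unclosed <p> is closed) and that REXML is installed. The names TABLE_XML and value_for are hypothetical, and the element walk is a plain-Ruby stand-in for the expression, not a replacement for it.

```ruby
require 'rexml/document'

# Shortened, well-formed copy of the question's table (an assumption: real
# pages may need an HTML parser instead of an XML one).
TABLE_XML = <<~'XML'
  <table class="detailTable">
    <tbody>
      <tr>
        <td class="detailTitle">
          <h3>Credit Limit:</h3>
          <h3>Current Balance:</h3>
        </td>
        <td>
          <p>$677.77</p>
          <p>$7.77</p>
        </td>
        <td class="detailTitle">
          <h3>Last Statement Date:</h3>
          <h4>Payment Address</h4>
        </td>
        <td>
          <p> 05/19/2015 </p>
          <p class="attribution"></p>
        </td>
      </tr>
    </tbody>
  </table>
XML

# Hypothetical helper: finds the heading containing `label`, then returns the
# text of the element at the same child position in the next <td>.
def value_for(doc, label)
  REXML::XPath.each(doc, '//td') do |td|
    children = td.elements.to_a
    # Position of the matching heading among this <td>'s child elements.
    index = children.index { |child| child.text.to_s.include?(label) }
    next if index.nil?

    value_cell = td.next_element # nearest following sibling element
    target = value_cell && value_cell.elements.to_a[index]
    return target.text.to_s.strip if target
  end
  nil # label not found anywhere
end

doc = REXML::Document.new(TABLE_XML)
puts value_for(doc, 'Statement Date')
puts value_for(doc, 'Current Balance')
```

Returning nil when the label is absent covers the "first check if it exists" part of the question.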
You can do it this way:
//table[@class='detailTable'][1]//td[@class="detailTitle" and contains(./h3, 'Statement Date')]/following-sibling::td[1]/p[1]/text()
This will find the <td> that contains the Statement Date heading, and get the <td> immediately after it. Then it gets the text content of the first p in that <td>.
I have HTML code as shown:
<div class="property-title visible-xs">
<a href="/property/473902/Office-Lot">
<h2><b> 2nd Floor, Block D5, Solaris Dutamas, No. 1, Jalan Dutamas 1, 50480, Kuala Lumpur</b></h2>
</a>
</div>
<p style="color: #0071ee;">Office Lot</p>
<h4><b>RM 880,000</b></h4>
<div>
<table>
<!-- <tr><td>Office Lot</td></tr> -->
<tr>
<td>Property Code</td><td>:</td><td>PB473902</td>
</tr>
<tr>
<td>Auction Date</td><td>:</td><td>2016-02-26</td>
</tr>
<tr>
<td>Built up </td><td>:</td><td>754 sq.ft </td>
</tr>
<tr>
<td>Tenure</td><td>:</td><td>Freehold</td>
</tr>
and I used the following code to extract the details "2nd Floor, Block D5,...."
objIE1.Document.getElementsByClassName("property-title visible-xs").getElementsByTagName ("a")
but it doesn't seem to give the result I need. Please help.
The HTML shown appears multiple times on the page.
This will work:
extract1 = objIE1.Document.getElementsByClassName("property-title visible-xs")(0).getElementsByTagName ("a")(0).innerText
Cells(1,1).Value = extract1
When a method name contains getElementsBy (plural, "Elements"), such as getElementsByClassName or getElementsByTagName, it returns a collection of elements, so you need to specify which one you want. In this case it is the first, which has index 0. When a method name contains getElementBy (singular, "Element"), such as getElementById, it returns a single element, so no index is needed because there is no collection.
I have searched and searched for 3 days straight now trying to get a data scraper to work and it seems like I have successfully parsed the HTML table that looks like this:
<tr class='ds'>
<td class='ds'>Length:</td>
<td class='ds'>1/8"</td>
</tr>
<tr class='ds'>
<td class='ds'>Width:</td>
<td class='ds'>3/4"</td>
</tr>
<tr class='ds'>
<td class='ds'>Color:</td>
<td class='ds'>Red</td>
</tr>
However, I cannot seem to get it to print to CSV correctly.
The Ruby code is as follows:
Specifications = {
:length => ['Length:','length','Length'],
:width => ['width:','width','Width','Width:'],
:Color => ['Color:','color'],
.......
}.freeze
def specifications
@specifications ||= xml.css('tr.ds').map{|row| row.css('td.ds').map{|cell| cell.children.to_s } }.map{|record|
specification = Specifications.detect{|key, value| value.include? record.first }
[specification.to_s.titleize, record.last] }
end
And the csv is printing into one column (what seems to be the full arrays):
[["", nil], ["[:finishtype, [\"finish\", \"finish type:\", \"finish type\", \"finish type\", \"finish type:\"]]", "Metal"], ["", "1/4\""], ["[:length, [\"length:\", \"length\", \"length\"]]", "18\""], ["[:width, [\"width:\", \"width\", \"width\", \"width:\"]]", "1/2\""], ["[:styletype, [\"style:\", \"style\", \"style:\", \"style\"]]"........
I believe the issue is that I have not specified which values to return but I wasn't successful anytime I tried to specify the output. Any help would be greatly appreciated!
Try changing
[specification.to_s.titleize, record.last]
to
[specification.last.first.titleize, record.last]
detect returns a [key, value] pair, e.g. [:length, ["Length:", "length", "Length"]], which to_s turns into
"[:length, [\"Length:\", \"length\", \"Length\"]]". With last.first you can extract just the "Length:" part of it.
If you encounter attributes that don't match anything in Specifications, you can drop those rows by changing the code to:
xml.css('tr.ds').map{|row| row.css('td.ds').map{|cell| cell.children.to_s } }.map{|record|
specification = Specifications.detect{|key, value| value.include? record.first }
[specification.last.first.titleize, record.last] if specification
}.compact
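The [key, aliases] pair that detect returns, and the nil it returns when nothing matches, can be seen in isolation. This is a minimal sketch in plain Ruby, assuming the rows have already been parsed into [label, value] arrays. SPEC_ALIASES, labelled_rows and the 'Weight:' row are hypothetical, and String#capitalize stands in for titleize, which comes from ActiveSupport rather than core Ruby.

```ruby
# Lookup table shaped like the question's Specifications hash (shortened).
SPEC_ALIASES = {
  length: ['Length:', 'length', 'Length'],
  width:  ['width:', 'width', 'Width', 'Width:'],
  color:  ['Color:', 'color']
}.freeze

# Hypothetical helper: maps [label, value] rows to [name, value] pairs and
# drops rows whose label is not listed in SPEC_ALIASES.
def labelled_rows(records)
  records.map { |label, value|
    # detect on a Hash returns the whole [key, aliases] pair, or nil.
    match = SPEC_ALIASES.detect { |_key, aliases| aliases.include?(label) }
    # match.last is the alias list; its first entry is used as the name.
    [match.last.first.capitalize, value] if match
  }.compact
end

# Stand-in for the parsed table rows; 'Weight:' has no entry above.
records = [['Length:', '1/8"'], ['Width:', '3/4"'], ['Weight:', '2 lb'], ['Color:', 'Red']]

# One comma-separated line per kept row.
labelled_rows(records).each { |name, value| puts "#{name},#{value}" }
```

Because each result is a two-element array, it can be written as one CSV row with two columns instead of one stringified array.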