I have the following HTML I am working with (a chunk of it is shown here):
<table class="detailTable">
<tbody>
<tr>
<td class="detailTitle" align="top">
<h3>Credit Limit:</h3>
<h3>Current Balance:</h3>
<h3>Pending Balance:</h3>
<h3>Available Credit:</h3>
</td>
<td align="top">
<p>$677.77</p>
<p>$7.77</p>
<p>$7.77</p>
<p>$677.77</p>
</td>
<td class="detailTitle">
<h3>Last Statement Date:</h3>
<h4>Payment Address</h4>
</td>
<td>
<p> 05/19/2015 </p>
<p class="attribution">
</td>
</tr>
</tbody>
</table>
I need to first check whether "Statement Date" exists, and then find its position. Then I need to get its value, which is in the corresponding <p> tag. I need to do this using XPath. Any suggestions?
So far I tried using //table[@class='detailTable'][1]//td[2]//p[position(td[contains(.,'Statement Date')])] but it doesn't work.
This is one possible way (formatted for readability):
//table[@class='detailTable']
//tr
/td[*[contains(.,'Statement Date')]]
/following-sibling::td[1]
/*[position()
=
count(
parent::td
/preceding-sibling::td[1]
/*[contains(.,'Statement Date')]/preceding-sibling::*
)+1
]
Explanation:
..../td[*[contains(.,'Statement Date')]] : up to this point, the XPath finds the td element that has at least one child containing the text "Statement Date".
/following-sibling::td[1] : from that td, move to the nearest following sibling td ...
/*[position() = count(parent::td/preceding-sibling::td[1]/*[contains(.,'Statement Date')]/preceding-sibling::*)+1] : ... and return the child element whose position equals the position of the element containing "Statement Date" in the previous td. Here count(preceding-sibling::*)+1 gives the 1-based position of the element containing "Statement Date" among its siblings.
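To try the position-matching expression outside a browser, here is a minimal sketch using the JDK's built-in javax.xml.xpath API. It assumes the markup has been made well-formed (the unclosed <p class="attribution"> is closed here), and the class name DetailTableLookup, the SAMPLE constant and the valueFor helper are hypothetical names introduced only for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class DetailTableLookup {
    // Well-formed copy of the table from the question (assumption: real pages
    // may need an HTML-tolerant parser before XPath can be applied).
    static final String SAMPLE =
        "<table class='detailTable'><tbody><tr>"
        + "<td class='detailTitle'><h3>Credit Limit:</h3><h3>Current Balance:</h3>"
        + "<h3>Pending Balance:</h3><h3>Available Credit:</h3></td>"
        + "<td><p>$677.77</p><p>$7.77</p><p>$7.77</p><p>$677.77</p></td>"
        + "<td class='detailTitle'><h3>Last Statement Date:</h3><h4>Payment Address</h4></td>"
        + "<td><p> 05/19/2015 </p><p class='attribution'></p></td>"
        + "</tr></tbody></table>";

    // Hypothetical helper: returns the trimmed value paired with the label,
    // or "" when the label is absent. The label is pasted into the
    // expression, so it must not contain a single quote.
    static String valueFor(String xml, String label) {
        String heading = "*[contains(.,'" + label + "')]";
        String expr = "//table[@class='detailTable']//tr"
            + "/td[" + heading + "]"
            + "/following-sibling::td[1]"
            + "/*[position() = count(parent::td/preceding-sibling::td[1]/"
            + heading + "/preceding-sibling::*) + 1]";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Evaluating to a string yields the text of the first matched node.
            return xpath.evaluate(expr, new InputSource(new StringReader(xml))).trim();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valueFor(SAMPLE, "Statement Date"));
        System.out.println(valueFor(SAMPLE, "Current Balance"));
    }
}
```

An empty result doubles as the "does it exist" check: if no heading contains the label, no node is selected and the helper returns an empty string.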
You can do it this way:
//table[@class='detailTable'][1]//td[@class="detailTitle" and contains(./h3, 'Statement Date')]/following-sibling::td[1]/p[1]/text()
This will find the <td> that contains the Statement Date heading, and get the <td> immediately after it. Then it gets the text content of the first p in that <td>.
I have a complex HTML structure with lots of tables and divs, and the structure might change. How can I write an XPath that skips the elements in between?
For example:
<table>
<tr>
<td>
<span>First Name</span>
</td>
<td>
<div>
<table>
<tbody>
<tr>
<td>
<div>
<table>
<tbody>
<tr>
<td>
<img src="1401-2ATd8" alt="" align="middle">
</td>
<td><span><input atabindex="2" id="MainLimitLimit" type="text"></span></td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</table>
I have to get the input element relative to the "First Name" span, e.g.:
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]/div/table/tbody/tr/td/table/tbody/tr/td[2]/input")
But can I skip the HTML in between and access the input element directly? Something like this:
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]//input[contains(@id,'MainLimitLimit')]")
You can try this XPath:
//td[contains(span,'First Name')]/following-sibling::td[1]//input[contains(@id, 'MainLimitLimit')]
Explanation:
Select the <td><span>First Name</span></td> element:
//td[contains(span,'First Name')]
Then get the <td> element next to that <td> element:
/following-sibling::td[1]
Then get the <input> element anywhere within the <td> element selected in the second step:
//input[contains(@id, 'MainLimitLimit')]
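A quick way to check that the three steps combine as described is to run the expression with the JDK's javax.xml.xpath API. This is only a sketch: it assumes a well-formed copy of the markup (img and input are self-closed, and the atabindex attribute is left out), and LabelledInputLookup, SAMPLE, EXPR and findInput are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class LabelledInputLookup {
    // Well-formed copy of the nested tables from the question.
    static final String SAMPLE =
        "<table><tr>"
        + "<td><span>First Name</span></td>"
        + "<td><div><table><tbody><tr><td><div><table><tbody><tr>"
        + "<td><img src='1401-2ATd8' alt='' align='middle'/></td>"
        + "<td><span><input id='MainLimitLimit' type='text'/></span></td>"
        + "</tr></tbody></table></div></td></tr></tbody></table></div></td>"
        + "</tr></table>";

    // Label cell, then its next sibling cell, then any input below it.
    static final String EXPR =
        "//td[contains(span,'First Name')]"
        + "/following-sibling::td[1]"
        + "//input[contains(@id, 'MainLimitLimit')]";

    // Hypothetical helper: returns the matched input element, or null.
    static Element findInput(String xml) {
        try {
            return (Element) XPathFactory.newInstance().newXPath()
                .evaluate(EXPR, new InputSource(new StringReader(xml)), XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element input = findInput(SAMPLE);
        System.out.println(input == null ? "not found" : input.getAttribute("id"));
    }
}
```

Because the last step uses //, the number of div and table wrappers between the second cell and the input does not matter, which is the point of the answer.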
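You can use //, which matches descendants at any depth. Go up from the span to the row first, because the td is not a descendant of the span:
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]//input[contains(@id,'MainLimitLimit')]")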
You can use the "First Name" span as a predicate. Try the XPath below:
//td[preceding-sibling::td[span[contains(text(), 'First Name')]]]//input[contains(@id,'MainLimitLimit')]
I have HTML code as shown:
<div class="property-title visible-xs">
<a href="/property/473902/Office-Lot">
<h2><b> 2nd Floor, Block D5, Solaris Dutamas, No. 1, Jalan Dutamas 1, 50480, Kuala Lumpur</b></h2>
</a>
</div>
<p style="color: #0071ee;">Office Lot</p>
<h4><b>RM 880,000</b></h4>
<div>
<table>
<!-- <tr><td>Office Lot</td></tr> -->
<tr>
<td>Property Code</td><td>:</td><td>PB473902</td>
</tr>
<tr>
<td>Auction Date</td><td>:</td><td>2016-02-26</td>
</tr>
<tr>
<td>Built up </td><td>:</td><td>754 sq.ft </td>
</tr>
<tr>
<td>Tenure</td><td>:</td><td>Freehold</td>
</tr>
I used the following code to extract the details "2nd Floor, Block D5,....":
objIE1.Document.getElementsByClassName("property-title visible-xs").getElementsByTagName("a")
but it doesn't seem to get the result I need. Please help.
The HTML shown appears multiple times on the page.
This will work:
extract1 = objIE1.Document.getElementsByClassName("property-title visible-xs")(0).getElementsByTagName("a")(0).innerText
Cells(1,1).Value = extract1
When a method name contains getElementsBy (plural "Elements"), such as getElementsByClassName or getElementsByTagName, it returns a collection of elements, so you need to specify which one you want. Here that is the first one, which has index 0 because these collections are zero-based. When a method name contains getElementBy (singular "Element"), such as getElementById, it returns a single element, so no index is needed.
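The same collection-versus-single-element rule shows up in other DOM bindings. As a hedged illustration in Java rather than VBA, the sketch below uses the JDK's org.w3c.dom API, where getElementsByTagName returns a NodeList that has to be indexed with item(0). It assumes a well-formed copy of the snippet, and FirstLinkText, SAMPLE and firstLinkText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstLinkText {
    // Well-formed copy of the property-title block from the question.
    static final String SAMPLE =
        "<div class='property-title visible-xs'>"
        + "<a href='/property/473902/Office-Lot'>"
        + "<h2><b> 2nd Floor, Block D5, Solaris Dutamas, No. 1, Jalan Dutamas 1, 50480, Kuala Lumpur</b></h2>"
        + "</a></div>";

    // Hypothetical helper: text of the first <a> element, or "" if there is none.
    static String firstLinkText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Plural "Elements": the result is a collection, even for one match.
            NodeList links = doc.getElementsByTagName("a");
            if (links.getLength() == 0) {
                return "";
            }
            // Index 0 is the first element of the zero-based collection.
            return links.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLinkText(SAMPLE));
    }
}
```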
I am testing a page using Selenium WebDriver. I have rows of data that represent 'requests', and in the last column of each row there is a drop-down list that lets the user either 'approve' or 'reject' the request.
I need to be able to select the approve option in the drop-down list of the row whose 'Name' column is equal to a variable (in this instance, say the variable is 'John').
In this test the user approves John's request by selecting approve. How do I use XPath to make sure I select the drop-down element in the right row for the right person? Do I need to include the select element in the XPath somehow?
An example of the select element method to select a drop down element:
new SelectElement(this.Driver.FindElement(By.Name("orm")).FindElement(By.Name("Tutors"))).SelectByText(tutorName);
<form name="RequestsForm" action="SubmitRequest.aspx" method="POST">
<h2 class="blacktext" align="center">Course approvals</h2>
<table class="cooltable" width="90%" border="0" cellspacing="1" cellpadding="1">
<tbody>
<tr>
<td class="heading">
<b>Name</b>
</td>
<td class="heading">
<b>Request Date</b>
</td>
<td class="heading">
<b>Approved</b>
</td>
</tr>
<tr>
<td>
John
<input id="T1" type="text" value="888" name="T1">
</td>
<td>1/3/2015</td>
<td>
<select id="D1" class="selecttext" size="1" name="D1">
<option>?</option>
<option value="Approved">Approved</option>
<option>Rejected</option>
</select>
</td>
</tr>
</tbody>
</table>
Using XPath, this gets the position of the Name column in your table:
count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1
You can use that position to get the corresponding table cell in the other rows. This selects the corresponding td in the second row (where ... stands for the path inside count() above):
//table[@class='cooltable']/tbody/tr[2]/td[count( ... )+1]
Appending /text() extracts the text (including surrounding whitespace). Wrapping it in normalize-space() trims the text so that you can compare it with John:
normalize-space(//table[@class='cooltable']/tbody/tr[2]/td[count( ... )+1]/text()) = 'John'
To select only the tr that contains John in the Name column, move the test into a predicate on tr and keep the td path relative to that row. It now returns a node-set of all tr elements whose Name cell is John:
//table[@class='cooltable']/tbody/tr[normalize-space(td[count( ... )+1]/text()) = 'John']
Finally, appending //select/option[@value='Approved'] selects the option whose value attribute is Approved within that tr. Here is the full XPath expression:
//table[@class='cooltable']/tbody/tr[normalize-space(td[count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1]/text()) = 'John']//select/option[@value='Approved']
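To see that the full expression picks the option in the right row, here is a minimal sketch that builds it from a name variable and evaluates it with the JDK's javax.xml.xpath API instead of Selenium. It assumes a well-formed copy of the form, and the second data row (Mary, T2, D2) is invented purely to show the row filter; ApprovalOptionLookup, SAMPLE and selectIdFor are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ApprovalOptionLookup {
    static final String OPTIONS =
        "<option>?</option><option value='Approved'>Approved</option><option>Rejected</option>";

    // Well-formed copy of the form; the Mary row is a made-up second request.
    static final String SAMPLE =
        "<form name='RequestsForm'><table class='cooltable'><tbody>"
        + "<tr><td class='heading'><b>Name</b></td>"
        + "<td class='heading'><b>Request Date</b></td>"
        + "<td class='heading'><b>Approved</b></td></tr>"
        + "<tr><td> John <input id='T1' type='text' value='888' name='T1'/></td>"
        + "<td>1/3/2015</td><td><select id='D1' name='D1'>" + OPTIONS + "</select></td></tr>"
        + "<tr><td> Mary <input id='T2' type='text' value='999' name='T2'/></td>"
        + "<td>2/3/2015</td><td><select id='D2' name='D2'>" + OPTIONS + "</select></td></tr>"
        + "</tbody></table></form>";

    // Hypothetical helper: id of the select whose row has the given name in
    // the Name column, or "" if no row matches. The name must not contain a
    // single quote because it is pasted into the expression.
    static String selectIdFor(String xml, String name) {
        String nameColumn = "count(//table[@class='cooltable']/tbody/tr[1]"
            + "/td[b = 'Name']/preceding-sibling::td)+1";
        String expr = "//table[@class='cooltable']/tbody"
            + "/tr[normalize-space(td[" + nameColumn + "]/text()) = '" + name + "']"
            + "//select/option[@value='Approved']";
        try {
            Element option = (Element) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODE);
            return option == null ? "" : ((Element) option.getParentNode()).getAttribute("id");
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectIdFor(SAMPLE, "John"));
        System.out.println(selectIdFor(SAMPLE, "Mary"));
    }
}
```

In a Selenium test the same expression string could be passed to By.xpath; the helper here only reports which select the matched option belongs to.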
I have a table without any class or id (there are more tables on the page) with this structure:
<table cellpadding="2" cellspacing="2" width="100%">
...
<tr>
<td class="cell_c">...</td>
<td class="cell_c">...</td>
<td class="cell_c">...</td>
<td class="cell">SOME_ID</td>
<td class="cell_c">...</td>
</tr>
...
</table>
I want to get only the row that contains <td class="cell">SOME_ID</td>, where SOME_ID is an argument.
UPD.
Currently I am doing it this way:
doc = Jsoup.connect("http://www.bank.gov.ua/control/uk/curmetal/detail/currency?period=daily").get();
Elements rows = doc.select("table tr");
Pattern p = Pattern.compile("^.*(USD|EUR|RUB).*$");
for (Element trow : rows) {
Matcher m = p.matcher(trow.text());
if(m.find()){
System.out.println(m.group());
}
}
But why do I need Jsoup if most of the work is done by the regexp? Just to download the HTML?
If you have a generic HTML structure that is always the same, and you want a specific element that has no unique ID or other identifying attribute, you can use the CSS selector syntax in Jsoup to specify where in the DOM tree the element is located.
Consider this HTML source:
<html>
<head></head>
<body>
<table cellpadding="2" cellspacing="2" width="100%">
<tbody>
<tr>
<td class="cell">I don't want this one...</td>
<td class="cell">Neither do I want this one...</td>
<td class="cell">Still not the right one..</td>
<td class="cell">BINGO!</td>
<td class="cell">Nothing further...</td>
</tr> ...
</tbody>
</table>
</body>
</html>
We want to select and parse the text from the fourth <td> element.
We select the <td> element whose zero-based index among its siblings is 3 by using td:eq(3). In the same way, td:lt(3) selects all <td> elements with an index below 3. As you've probably figured out, eq and lt stand for "equal" and "less than".
Without using first() you will get an Elements object, but we only want the first one so we specify that. We could use get(0) instead too.
So, the following code
Element e = doc.select("td:eq(3)").first();
System.out.println("Did I find it? " + e.text());
will output
Did I find it? BINGO!
Some good reading in the Jsoup cookbook!