Here is the html code:
<table>
<tr class="WhiteRow">
<td align="center">
<input id="SelectedDelivery1" type="checkbox" onclick="HandleClick(this.name,this.checked,"")" value="Y" name="SelectedDelivery1">
</td>
<td valign="top">
<span></span>
<span class="bold">Instrument Search</span>
<br>
abc (TRANSFER)
</td>
<td align="center">5 minutes</td>
<td class="noborder" align="right">
<td class="noborder" align="right">
<td class="noborder" align="right">
<td class="noborder" align="right">
</tr>
<tr>
<td align="center">
<input id="SelectedDelivery2" type="checkbox" onclick="HandleClick(this.name,this.checked,"")" value="Y" name="SelectedDelivery1">
</td>
<td valign="top">
<span></span>
<span class="bold">Instrument Search</span>
<br>
abc (CAVEAT)
</td>
...
</tr>
</table>
I would like to target the <tr> containing <span class="bold">Instrument Search</span> and abc (TRANSFER). That tr may not be the first element in the table.
So far I tried
//td/span[text()="Instrument Search"]/ancestor::tr
which only satisfy one of the condition, and there are a few tr that satisfy the selector.
Could you please advise me how to target both of them
Use the following XPath expression:
//tr[contains(., 'abc (TRANSFER)') and contains(td/span[#class = 'bold'], 'Instrument Search')]
If possible, you should always use expressions that are unidirectional, because a "backwards" axis like ancestor:: could be a costly move. That's the advantage over the solution you have found already.
If the span[#class = 'bold'] cannot contain anything else than "Instrument Search", you should modifiy the expression above to:
//tr[contains(., 'abc (TRANSFER)') and td/span[#class = 'bold'] = 'Instrument Search']
The location of "abc (TRANSFER)" is still not very precise, if it is required in a certain place (e.g. always inside a td element) you'd have to further restrict the above.
EDIT Respondin to your comment:
abc (TRANSFER) is inside td tag, it's just a text field
Then use
//tr[contains(td, 'abc (TRANSFER)') and td/span[#class = 'bold'] = 'Instrument Search']
I found myself an answer after crawling through the syntax.
Please let me know if there is any other better ways
//td/span[text()="Instrument Search"]/ancestor::td/text()[contains(., "TRANSFER")]/ancestor::tr
Related
I have a complex html structure with lot of tables and divs.. and also the structure might change. How to find xpath by skipping the elements in between.
for example :
<table>
<tr>
<td>
<span>First Name</span>
</td>
<td>
<div>
<table>
<tbody>
<tr>
<td>
<div>
<table>
<tbody>
<tr>
<td>
<img src="1401-2ATd8" alt="" align="middle">
</td>
<td><span><input atabindex="2" id=
"MainLimitLimit" type="text"></span></td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</table>
I have to get the input element with respect to the "First Name" span
eg :
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]/div/table/tbody/tr/td/table/tbody/tr/td[2]/input")
but.. can I skip the between htmls and directly access the input element.. something like?
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]//input[contains#id,'MainLimitLimit')]")
You can try this Xpath :
//td[contains(span,'First Name')]/following-sibling::td[1]//input[contains(#id, 'MainLimitLimit')]
Explanation :
select <td><span>First Name</span></td> element :
//td[contains(span,'First Name')]
then get <td> element next to above <td> element :
/following-sibling::td[1]
then get <input> element within <td> element selected in the 2nd step above :
//input[contains(#id, 'MainLimitLimit')]
You can use // which means at any level
By.xpath("//span[contains(text(), 'First Name')]//td[2]/input[contains#id,'MainLimitLimit')]")
you can use the "First Name" span as a predicate. Try the code below
//td[preceding-sibling::td[span[contains(text(), 'First Name')]]]//input[contains(#id,'MainLimitLimit')]
I am unable to locate first span with XPath I tried:
//*[#id='student-grid']/div[2]/div[1]/table/tbody/tr[1]/td/span/span[contains(text(), 'Edit School')]
to select span with text - Edit Student button
<tbody role="rowgroup">
<tr class="" data-uid="2f3646c6-213a-4e91-99f9-0fbaa5f7755d" role="row" aria-selected="false">
<td class="select-row" role="gridcell">
<td class="font-md" role="gridcell">marker, Lion</td>
<td role="gridcell">TESTLINK_1_ArchScenario</td>
<td role="gridcell">1st</td>
<td role="gridcell">Not Started</td>
<td role="gridcell"/>
<td role="gridcell"/>
<td role="gridcell">QA Automation TestLink Folders</td>
<td class="k-cell-action" role="gridcell"/>
<td class="k-cell-action detail-view-link font-md" role="gridcell">
<span class="button-grid-action kendo-lexia-tooltip icon-pencil" role="button" title="Edit Student">
<span>Edit Student</span>
</span>
</td>
<td class="k-cell-action archive-link font-md" role="gridcell">
<span class="button-grid-action kendo-lexia-tooltip icon-archive" role="button" title="Archive Student">
<span>Archive Student</span>
</span>
</td>
</tr>
</tbody>
If you want to select span with text - Edit Studen try any of this:
//span[#title='Edit Student']/span
//span[text()='Edit Student']
If you want to select Edit Studen with role="button" try any of this:
//span[#title='Edit Student'][#role='button']
//span[#role='button'][./span[text()='Edit Student']]
//span[#role='button'][./span[.='Edit Student']]
simply use can use any of this xpaths
//span[contains(text(),'Edit Student')]
//*[contains(text(),'Edit Student')]
//span [#class='button-grid-action kendo-lexia-tooltip icon-pencil']/span
//span [#title='Edit Student']/span
//span [contains(#title,'Edit Student')]/span
//span [contains(#class,'button-grid-action kendo-lexia-tooltip icon-pencil')]/span
To select the outer span, this XPath,
//span[#role='button' and normalize-space()='Edit School']
will select span elements with a button #role and a normalized string value of Edit School.
To select the inner span, this XPath,
//span[text()='Edit School']
will select span elements with an immediate text node child whose value is Edit School.
You can, of course, further qualify the heritage in either case as needed.
See also
Testing text() nodes vs string values in XPath
This should get the span text
//span[.='Edit Student']
I`m using Scrapy Python to try to grep data from the site.
How I can grep this structure with Xpath?
<div class="foo">
<h3>Need this text_1</h3>
<table class="thesamename">
<tbody>
<tr>
<td class="tmp_year">
45767
</td>
<td class="tmp_outcome">
<b>Win_1</b><br>
<span class="tmp_category">TEST_1</span>
</td>
</tr>
<tr>
<td class="tmp_year">
1232004
</td>
<td class="tmp_outcome">
<b>Win_2</b><br>
<span class="tmp_category">TEST_2</span>
</td>
</tr>
<tr>
<td class="tmp_year">
122004
</td>
<td class="tmp_outcome">
<b>Win_3</b><br>
<span class="tmp_category">TEST_3</span>
</td>
</tr>
</tbody>
<h3>Need this text_2</h3>
<table class="thesamename">
<tbody>
<td class="tmp_year">
234
</td>
<td class="tmp_outcome">
<b>Win_E</b><br>
<span class="tmp_category">TEST_E</span>
</td>
</tr>
<tr>
<td class="tmp_year">
3476
</td>
<td class="tmp_outcome">
<b>Win_C</b><br>
<span class="tmp_category">TEST_C</span>
</td>
</tr>
</tbody>
<h3>Need this text_3</h3>
<table class="thesamename">
<tbody>
<tr>
<td class="tmp_year">
85567
</td>
<td class="tmp_outcome">
<b>Win_T</b><br>
<span class="tmp_category">TEST_T</span>
</td>
</tr>
<tr>
<td class="tmp_year">
435656
</td>
<td class="tmp_outcome">
<b>Win_A</b><br>
<span class="tmp_category">TEST_A</span>
</td>
</tr>
<tr>
<td class="tmp_year">
980
</td>
<td class="tmp_outcome">
<b>Win_Z</b><br>
<span class="tmp_category">TEST_Z</span>
</td>
</tr>
</tbody>
I would like to have output with this structure:
"Section": {
Need this text_1 :
[45767 : Win_1 : TEST_1]
[1232004 : Win_2 : TEST_2]
[122004: Win_3 : TEST_3]
,
Need this text_2:
[234 : Win_E : TEST_E]
[3476 : Win_C : TEST_C]
,
Need this text_3:
[85567 : Win_T : TEST_T]
[435656 : Win_A : TEST_A]
[980: Win_Z : TEST_Z]
}
How can I create the proper xpath select to take this structure?
I can take separately all "h3" , all "a" then all tags with class but how I can match?
GREP YOU SAY?! LOL Well, You would be entirely wron to name it so but for the sake ofkeeping the jargon cleanfor understanding your just parsing/extracting.... So new to scrapy? or web dev sideof things? No matter... Theres no way I couldexpect to teach you in one answer here how to xpth/regex like a pro... only wayis for you to keep at but I throw in my input.
First of all, xpath is amazingly usefull wen it comes to websites that are necessarily build to stadard, which doesnt make them bad per say but in the html snipet you gave... its structured all right soo.. Id recommend css extract .. THESE ARE THE VALUES...
year = response.css('td.tmp_year a::text').extract()
outcome = response.css('td.tmp_outcome b::text').extract()
category= response.css('span.tmp_category::text').extract()
PRO-TIP: For what ever case you deem it neccesary, you can save a web page asan HTML file and use scrapy shell by referencing the direct file path to it... So I save you html snippet to a file on my desktop then ran...
scrapy shell file:///home/scriptso/Desktop/letsGREPlol.html
ANYWAYS... as far as xpath... since you asked lol... cake. lets compare the xpath with the cssand tell me you can see... it? lol
response.css('td.tmp_outcome b::text').extract()
so is a td tag....and the class name is tmp_outcome, thn the next node is a bold tag... of which where the text is thusly declaring it as text with the ::text
response.xpath('//td[#class="tmp_outcome"]/b/text()').extract()
So xpath is basically saying we star with a patter inthe entire site of the td tag... and class= tmp_outcome, then the bold, then in xpath to declare type /text() is for text.... /#href is for.. yeah you guessedit
I have a complex html structure with lot of tables and divs.. and also the structure might change. How to find xpath by skipping the elements in between.
for example :
<table>
<tr>
<td>
<span>First Name</span>
</td>
<td>
<div>
<table>
<tbody>
<tr>
<td>
<div>
<table>
<tbody>
<tr>
<td>
<img src="1401-2ATd8" alt="" align="middle">
</td>
<td><span><input atabindex="2" id=
"MainLimitLimit" type="text"></span></td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</table>
I have to get the input element with respect to the "First Name" span
eg :
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]/div/table/tbody/tr/td/table/tbody/tr/td[2]/input")
but.. can I skip the between htmls and directly access the input element.. something like?
By.xpath("//span[contains(text(), 'First Name')]/../../td[2]//input[contains#id,'MainLimitLimit')]")
You can try this Xpath :
//td[contains(span,'First Name')]/following-sibling::td[1]//input[contains(#id, 'MainLimitLimit')]
Explanation :
select <td><span>First Name</span></td> element :
//td[contains(span,'First Name')]
then get <td> element next to above <td> element :
/following-sibling::td[1]
then get <input> element within <td> element selected in the 2nd step above :
//input[contains(#id, 'MainLimitLimit')]
You can use // which means at any level
By.xpath("//span[contains(text(), 'First Name')]//td[2]/input[contains#id,'MainLimitLimit')]")
you can use the "First Name" span as a predicate. Try the code below
//td[preceding-sibling::td[span[contains(text(), 'First Name')]]]//input[contains(#id,'MainLimitLimit')]
I have one table, in this table two columns and 5 rows are there. In First column have a check box and second column have a data. I want get the second column row value of checked items.
My table id was "tb1", checkbox id "cb1" and second field id "da1".
I want the result like "Data2 and Data5" That means whatever I check, that particular second column(<td>) value.
This is possible, please help me.
If you are using cb2 and da2 for row two, then in javascript you could do:
If data column is td element only:
document.getElementById("da2").innerText
If data column is input element:
document.getElementById("da2").value;
JS Bin
Bind the event on every checkbox and the find second field" with the help of closest
JavaScript
$(function(){
$("#tb1").on('click','input',function(){
var value =($(this).closest('tr').find('[id^=da]').text())
alert(value)
})
})
HTML
<table id="tb1" cellpadding="5" border="1" cellspacing='0' width="200">
<tr>
<td id="cb1"><input type='checkbox' /> </td>
<td id="da1">Data 1</td>
</tr>
<tr>
<td id="cb2"><input type='checkbox' /> </td>
<td id="da2">Data 2</td>
</tr>
<tr>
<td id="cb3"><input type='checkbox' /> </td>
<td id="da3">Data 3</td>
</tr>
<tr>
<td id="cb4"><input type='checkbox' /> </td>
<td id="da4">Data 4</td>
</tr>
</table>