Xpath to select tr based on specific td not containing text - html

I'm trying to write a Ruby/Selenium script to click the first checkbox in a table where the row does not contain certain values (plural) in //tr/td[6].
Each tr in the table is structured as follows:
<tr>
<td>
<input name = "checkbox" type = "checkbox"></input>
</td>
<td> irrelevant </td>
<td> irrelevant </td>
<td> irrelevant </td>
<td> irrelevant </td>
<td> text I care about </td>
</tr>
I need the XPath for the checkbox of the first tr in the table where //tr/td[6] does not contain "badtext" and does not contain "badtext2".
I'm not exactly sure how to write an XPath for this. Hopefully I explained this well enough.

You need to use the not(expression) function, like this:
//tr/td[6][not(contains(text(),'badtext'))][not(contains(text(),'badtext2'))]
and your final XPath would look like this:
//tr/td[6][not(contains(text(),'badtext'))][not(contains(text(),'badtext2'))]/../td/input[@name='checkbox']
Note that text() only sees the td's own text nodes. If you want to test all the text inside the cell, including text in nested elements, use the element's string value instead:
contains(.,'badtext')
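One way to sanity-check an expression like this before wiring it into Selenium is to run it against a small sample document. The sketch below is only an illustration under a few assumptions: REXML (shipped with Ruby) stands in for the browser's XPath engine, the four-row table and its id attributes are made up, and SAMPLE_TABLE and CHECKBOX_XPATH are hypothetical names.

```ruby
require 'rexml/document'

# Made-up sample: rows 1 and 2 carry unwanted text in td[6], rows 3 and 4 do not.
# The id attributes are hypothetical and only tell the checkboxes apart.
SAMPLE_TABLE = <<~XML
  <table>
    <tr><td><input name="checkbox" type="checkbox" id="row1"/></td>
        <td>x</td><td>x</td><td>x</td><td>x</td><td>some badtext here</td></tr>
    <tr><td><input name="checkbox" type="checkbox" id="row2"/></td>
        <td>x</td><td>x</td><td>x</td><td>x</td><td>only badtext2</td></tr>
    <tr><td><input name="checkbox" type="checkbox" id="row3"/></td>
        <td>x</td><td>x</td><td>x</td><td>x</td><td>text I care about</td></tr>
    <tr><td><input name="checkbox" type="checkbox" id="row4"/></td>
        <td>x</td><td>x</td><td>x</td><td>x</td><td>also fine</td></tr>
  </table>
XML

# Same shape as the expression above, using contains(., ...) on the sixth cell.
CHECKBOX_XPATH = "//tr/td[6][not(contains(., 'badtext'))][not(contains(., 'badtext2'))]/../td/input[@name='checkbox']"

doc = REXML::Document.new(SAMPLE_TABLE)
matches = REXML::XPath.match(doc, CHECKBOX_XPATH)

# Every checkbox whose row passed both filters, in document order.
puts matches.map { |input| input.attributes['id'] }.inspect

# The first of them is the one the script would click.
puts matches.first.attributes['id']
```

In a Selenium script the same string could be passed to driver.find_element(:xpath, CHECKBOX_XPATH), which returns the first matching element; driver is assumed here to be an already-open Selenium::WebDriver session.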

Related

Xpath to an element with two different contains text

I have the HTML below, and I use this XPath to find the "Show" cell:
xpath=//*[@id="Some_id"]/div/table/tbody/tr/td[contains(text (), "Show")]
It works, but I need to find the "Show" of a particular item, in this case the "Main" item, so I need something like this:
xpath=//*[@id="Some_id"]/div/table/tbody/tr/td[contains(text (), "Show")]/preceding-sibling::td[contains(text (), "Main")]
but it doesn't work. Thanks
<tr class="even">
<td title="Main">
<strong>Main</strong>
</td>
<td>
text/html
</td>
<td>
Another text
</td>
<td>
Some text here
</td>
<td>
No
</td>
<td>
Show
</td>
</tr>
You can try this XPath. It first selects the tr element that contains the specific td, and then selects the required a tag inside that row.
"//tr[@class='even' and td[@title='Main']]/td/a[text()='Show']"
EDIT: This XPath worked for the OP:
"//*[@id='Some_id']/div[1]/table/tbody/tr[@class='even']/td/a[contains(text (), 'Show')]"
You want to find the "Show" that belongs to the "Main" row in the table.
Here is what I would do:
xpath=//td[@title='Main']/following-sibling::td[contains(text(), 'Show')]
More on xpath axes: http://www.w3schools.com/xsl/xpath_axes.asp
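To see how the sibling axes behave, here is a minimal sketch that evaluates both directions against a cut-down copy of the row. It rests on assumptions: REXML (shipped with Ruby) stands in for the browser's XPath engine, the extra "Other" row and the id attributes are invented for the demonstration, and the constant names are hypothetical.

```ruby
require 'rexml/document'

# Cut-down sample: an extra "Other" row is added so the filter has something to exclude.
# The id attributes are hypothetical labels for the two "Show" cells.
MAIN_ROW_SAMPLE = <<~XML
  <table>
    <tr class="odd">
      <td title="Other"><strong>Other</strong></td>
      <td>No</td>
      <td id="other-show">Show</td>
    </tr>
    <tr class="even">
      <td title="Main"><strong>Main</strong></td>
      <td>No</td>
      <td id="main-show">Show</td>
    </tr>
  </table>
XML

# Start at the "Main" cell, then walk forward to the sibling cell containing "Show".
SHOW_IN_MAIN_ROW_XPATH = "//td[@title='Main']/following-sibling::td[contains(text(), 'Show')]"

# Shape of the attempt from the question: start at "Show" and walk backwards.
ASKER_ATTEMPT_XPATH = "//td[contains(text(), 'Show')]/preceding-sibling::td[contains(text(), 'Main')]"

doc = REXML::Document.new(MAIN_ROW_SAMPLE)

found = REXML::XPath.match(doc, SHOW_IN_MAIN_ROW_XPATH)
puts found.map { |td| td.attributes['id'] }.inspect

attempt = REXML::XPath.match(doc, ASKER_ATTEMPT_XPATH)
puts attempt.size
```

The backwards attempt has two problems: its final step selects the "Main" cell rather than the "Show" cell, and text() does not see "Main" at all because that word sits inside the nested strong element.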

Xpath help to Find Unique value

I want to find the first tr that contains the text PONumber:, but I can't get it to work. I can find it with //table/tbody/tr/td[contains(text(),'PONumber')], but that returns 2 elements (it also matches PreviousPONumber:). I only want the first one.
<tr>
<td class="clsLabel" align="right"> PONumber: </td>
<td class="clsInput"> PN659 </td>
</tr>
<tr>
<td class="clsLabel" align="right"> PreviousPONumber: </td>
<td class="clsInput"/>
</tr>
You can use the following XPath to match exactly the element you want:
//tr/td[normalize-space(.)='PONumber:']
You can use something like
(//tr/td[contains(text(),'PONumber')])[1]
Wrap the XPath in parentheses and append [1] to return only the first match. Alternatively, you could use something like:
//tr/td[contains(text(),'PONumber') and not(contains(text(),'Previous'))]
so "Previous" will be excluded from the search results
You can limit the XPath result to return only the first matched by using [1] :
(//table/tbody/tr/td[contains(.,'PONumber')])[1]
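The difference between the loose and the narrowed expressions can be checked with a short script. This is a sketch under assumptions: REXML (shipped with Ruby) stands in for the real XPath engine, the two rows from the question are wrapped in a table element to form a single document, and the constant names are hypothetical.

```ruby
require 'rexml/document'

# The two rows from the question, wrapped in a <table> so they form one document.
PO_SAMPLE = <<~XML
  <table>
    <tr>
      <td class="clsLabel" align="right"> PONumber: </td>
      <td class="clsInput"> PN659 </td>
    </tr>
    <tr>
      <td class="clsLabel" align="right"> PreviousPONumber: </td>
      <td class="clsInput"/>
    </tr>
  </table>
XML

# Substring test: matches both label cells.
PO_LOOSE_XPATH = "//tr/td[contains(text(),'PONumber')]"

# Substring test plus an exclusion: drops the "Previous..." label.
PO_STRICT_XPATH = "//tr/td[contains(text(),'PONumber') and not(contains(text(),'Previous'))]"

doc = REXML::Document.new(PO_SAMPLE)

loose = REXML::XPath.match(doc, PO_LOOSE_XPATH).map { |td| td.text.strip }
strict = REXML::XPath.match(doc, PO_STRICT_XPATH).map { |td| td.text.strip }

puts loose.inspect
puts strict.inspect

# Asking the library for only the first match plays the same role as (...)[1].
puts REXML::XPath.first(doc, PO_LOOSE_XPATH).text.strip
```

Taking the first match depends on the PONumber row coming before the PreviousPONumber row, while the exclusion does not, so the narrowed expression is the sturdier choice if the row order might change.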

Only parsing outer element

I am writing a scraper with Nokogiri, and I want to scrape a large HTML file.
Currently, I am scraping a large table; here is a small fragment:
<table id="rptBidTypes__ctl0_dgResults">
<tr>
<td align="left">S24327</td>
<td>
Airfield Lighting
<div>
<div>
<table cellpadding="5px" border="2" cellspacing="1px" width="100%" bgcolor=
"black">
<tr>
<td bgcolor="white">Abstract:<br />
This project is for the purchase and delivery, of various airfield
lighting, for a period of 36 months, with two optional 1 year renewals,
in accordance with the specifications, terms and conditions specified in
the solicitation.</td>
</tr>
</table>
</div>
</div>
</td>
</tr>
</table>
And here is the Ruby code I am using to scrape:
document = doc.search("table#rptBidTypes__ctl0_dgResults tr")
document[1..-1].each do |v|
cells = v.search 'td'
if cells.inner_html.length > 0
data = {
number: cells[0].text,
}
end
ScraperWiki::save_sqlite(['number'], data)
end
Unfortunately this isn't working for me. I only want to extract S24327, but I am getting the content of every table cell. How do I only extract the content of the first td?
Keep in mind that under this table, there are many table rows following the same format.
In CSS, table tr means tr anywhere underneath the table, including nested tables. But table > tr means the tr must be a direct child of the table.
Also, it appears you only want the cell values, so you don't need to iterate. This will give you all such cells (the first in each row):
doc.search("table#rptBidTypes__ctl0_dgResults > tr > td[1]").map(&:text)
The content of the first td would be:
doc.at("table#rptBidTypes__ctl0_dgResults td").text
The problem is that your search is matching two different things: the <tr> tag nested directly within the table with id rptBidTypes__ctl0_dgResults, and the <tr> tag within the table nested inside that parent table. When you loop through document[1..-1] you're actually selecting the second <tr> tag rather than the first one.
To select just the direct child <tr> tag, use:
document = doc.search("table#rptBidTypes__ctl0_dgResults > tr")
Then you can get the text for the <td> tag with:
document.css('td')[0].text #=> "S24327"

Multiple classes with barely the same name

I've got a little question, and it seems there's no place on the internet where I can find the answer except here :p
So I've got an HTML page with some tables. Those tables have rows (as usual :p), and in those rows there are some inputs.
I want to add a rule to my CSS file which affects all those rows. The rows have ids that follow nearly the same naming pattern.
Here's my code:
<table>
<tr id="tr_creneau_1">
<td>
<input />
</td>
</tr>
<tr id="tr_creneau_2">
<td>
<input />
</td>
</tr>
</table>
<table>
<tr id="tr_logo_1">
<td>
<input />
</td>
</tr>
</table>
In the end, I want a CSS rule that affects all the inputs in the tr_* rows.
You can try with:
tr[id^="tr_"] input
But this is a CSS3 attribute selector, so very old browsers may not support it. Alternatively, you can simply use:
tr input
or add a class to every row with that id and match that class
You can try:
tr[id^="tr_"] { --your css here-- }
It matches every tr whose id starts with tr_.
If the string doesn't need to be at the start of the id attribute, but can appear anywhere in it, you can use:
tr[id*="tr_"]
If the above doesn't work, I would suggest going for a class-based approach.
You could always add a CSS class to each table row you wish to target. e.g.
<table>
<tr id="tr_creneau_1" class="style-me">
<td>
<input />
</td>
</tr>
<tr id="tr_creneau_2" class="style-me">
<td>
<input />
</td>
</tr>
<tr id="somethingElse">
no input, so no class needed
</tr>
</table>
Then style as so:
table .style-me input {
background-color: red;
}

Form tag won't enclose elements inside a table

I've run into a curious problem; I've got a form inside a <tr>, however the form refuses to wrap any tags inside it. I've made a quick JSFiddle here to play around with. Firebug reports that the form isn't wrapping anything:
The <form> element is greyed out and not wrapping anything. The HTML for this test is below
<table>
<form>
<tr>
<td>Input</td>
<td>Another input</td>
</tr>
<tr>
<td colspan="4"><span>Other stuff</span></td>
</tr>
</form>
<tr>
<td>
Rows not affected by the form
</td>
</tr>
<tr>
<td>
Rows not affected by the form
</td>
</tr>
</table>
As can be seen, the form holds two trs in the written markup. I read here that this is invalid, so my question is can I create a form that holds two or more trs and an arbitrary amount of other elements inside a table? The table has other rows in it not associated with the form, so putting a <form> round the entire table is unhelpful, although seeing as the other rows won't have any inputs for the form (POST request), I suppose a form could be put around the entire table.
Which is a better solution; whole-table wrap, or a working fix for just enclosing the needed rows in a form tag? I know I could put a table inside a td > form, but then the column widths wouldn't be the same in the nested table, which is why I came to ask this question.
You cannot interrupt the <table> structure with any tags besides <thead>, <tfoot>, <tbody>, <tr>, <th>, or <td>. Your <form> tag needs to sit entirely inside a single <td>, or your entire <table> needs to be placed within the <form> tag.
<table>
<tr>
<td>
<form>
...form data...
</form>
</td>
</tr>
</table>
..or..
<form>
<table>
...
</table>
</form>
Basically, you can only put a form inside a td, so you could put those 2 rows inside a new table that you create inside a td, like this:
<table><tr><td><form><table><tr>...</tr><tr>...</tr></table></form></td></tr><tr>...</tr><tr>...</tr></table>
The <form> tag can only be placed inside a <td> element or outside the <table> in this case.
If I were you, I'd just put the <form> around the whole table since you said there won't be any other forms within it.
Or, you could replace the <table> completely with <div>s instead that use display: table; or display: table-cell;.