Select a row based on the contents of a cell with xpath - html

I have a table that consists of multiple rows that each contain 5 cells, like this:
<tr>
<td></td>
<td>123456</td>
<td>statusText</td>
<td><a>linkText</a></td>
<td>editButton</td>
</tr>
The 123456 could be any string of random letters and numbers. I want to be able to select a link based on the contents of the second cell in the table. I've been trying something like this:
//tr[contains(td, '123456')]
to get me to the cell, but it either returns every row or nothing, depending on how I tweak the xpath.

I've been trying something like this:
//tr[contains(td, '123456')]
to get me to the cell, but it either returns every row or nothing, depending on how I tweak the xpath
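You get what you asked for. In XPath 1.0, contains(td, '123456') converts the node-set td to a string by taking only its first node, so the above expression selects any tr element (row) in the document whose first td child has a string value containing '123456'. In your rows the first cell is empty, so nothing matches.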
But you want:
//tr/td[text() = '123456']
This selects every td element (cell) in the document that is a child of a tr and has a text node child whose string value is '123456'.
There can be different variations, depending on whether a td may have more than one text node and on whether the white space in a text node should be normalized, but the question doesn't say whether either of these applies in this particular case.

I'd research something like //tr[string(td[2]) = '123456']. If this does not work, I'd look up XPath axes.
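As a rough sketch of that suggestion, assuming Python with lxml: the sample rows, the href values and the helper name find_link are made up for illustration, and the link is assumed to sit in the fourth cell as in the question's row.

```python
from lxml import html

# Hypothetical sample: two rows shaped like the one in the question.
SAMPLE = """
<table>
<tr><td></td><td>abc999</td><td>statusText</td><td><a href="/edit/abc999">otherLink</a></td><td>editButton</td></tr>
<tr><td></td><td>123456</td><td>statusText</td><td><a href="/edit/123456">linkText</a></td><td>editButton</td></tr>
</table>
"""


def find_link(markup, cell_value):
    """Return the <a> in the row whose second cell equals cell_value, or None."""
    tree = html.fromstring(markup)
    # normalize-space() tolerates stray whitespace around the cell text;
    # $value is an XPath variable supplied as a keyword argument.
    links = tree.xpath(
        "//tr[normalize-space(td[2]) = $value]/td[4]/a", value=cell_value
    )
    return links[0] if links else None


link = find_link(SAMPLE, "123456")
print(link.text, link.get("href"))
```

Testing td[2] explicitly avoids the first-node behaviour of contains(td, ...), and passing the value as a variable avoids quoting problems if the identifier ever contains a quote character.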

Related

Use Xpath to Get Row Number

[View of the table and first row] Ultimately, I need to click on an edit button in td1 within a tr that is dynamic.
My plan was to find that tr[#] based on the text in td2 (the email address that is the identifier).
//table/tr/td[2][contains(text(),'me@address.com')]
[The HTML code] This correctly highlights td2 of the row I need to capture. I'd like to get that tr# and then use the next line to click the element in tr[#]/td1, but I am stuck.
You can use the preceding-sibling axis to select the desired element:
//table/tr/td[2][contains(text(),'me@address.com')]/preceding-sibling::td
This finds your td and then selects the preceding td, which contains your desired link.
It is also worth noting that the contains(text(),'me@address.com') test you are using is case-sensitive.
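A minimal sketch of the same idea in Python with lxml, with a case-insensitive match added via translate(). The sample rows, the href values and the helper name edit_cell_for are assumptions for illustration; the real page's markup is not shown in the question.

```python
from lxml import html

# Hypothetical sample: edit link in td1, email address in td2.
SAMPLE = """
<table>
<tr><td><a href="/edit/1">Edit</a></td><td>other@address.com</td></tr>
<tr><td><a href="/edit/2">Edit</a></td><td>Me@Address.com</td></tr>
</table>
"""

LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = LOWER.upper()


def edit_cell_for(markup, email):
    """Return the td just before the td[2] that contains email, ignoring case."""
    tree = html.fromstring(markup)
    # translate() lower-cases the cell text so the comparison ignores case.
    expr = (
        "//tr/td[2][contains(translate(., $upper, $lower), $needle)]"
        "/preceding-sibling::td[1]"
    )
    cells = tree.xpath(expr, upper=UPPER, lower=LOWER, needle=email.lower())
    return cells[0] if cells else None


cell = edit_cell_for(SAMPLE, "me@address.com")
print(cell.find("a").get("href"))
```

XPath 1.0 has no lower-case() function, which is why translate() is used here; it only folds the ASCII letters listed in the two strings.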

Extract a single row from a table

I’m trying to extract a single row from a table.
I'm using Google Sheets to create the links, and cell D3 contains this URL:
https://www.wsj.com/market-data/quotes/AAPL/options
I have several links in cell D3 to go through.
The phrase "Last Trade" appears several times in different tables, but I'M ONLY INTERESTED IN THE VERY FIRST TABLE FROM THE TOP that contains it. Once the phrase is found, I want to extract the ROW just above it.
Below is the IMPORTXML formula. It needs modification so that it pulls that last row.
=IMPORTXML(D3,"//tr[td1/@class='acenter inthemoney'][last()]")
Any help would be greatly appreciated.
Thanks.
For that row you will need:
(//tr[@class='last_trade_row'])[1]/preceding-sibling::tr[1]
And then pick the right td; it's unclear which td you want. If you wanted the third td, the XPath would be:
(//tr[@class='last_trade_row'])[1]/preceding-sibling::tr[1]/td[3]
It's always the first table that ends with the words LAST TRADE, and the row above it is the one I'm looking to extract. In this case, this is the row I'm looking to extract; below is the picture.
https://www.wsj.com/market-data/quotes/AAPL/options
In the above case where you want the first td the XPath will then be
(//tr[@class='last_trade_row'])[1]/preceding-sibling::tr[1]/td[1]
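To see how that expression behaves, here is a small sketch in Python with lxml rather than IMPORTXML. The sample markup and numbers are invented, and the class name last_trade_row is taken from the answer above; the live page's structure is not verified here.

```python
from lxml import html

# Hypothetical sample: two tables, each ending with a "Last Trade" row.
SAMPLE = """
<div>
<table>
<tr><td>1.10</td><td>2.20</td><td>3.30</td></tr>
<tr><td>4.40</td><td>5.50</td><td>6.60</td></tr>
<tr class="last_trade_row"><td colspan="3">Last Trade</td></tr>
</table>
<table>
<tr><td>7.70</td><td>8.80</td><td>9.90</td></tr>
<tr class="last_trade_row"><td colspan="3">Last Trade</td></tr>
</table>
</div>
"""


def row_above_first_marker(markup):
    """Return the cell texts of the row just above the first marker row."""
    tree = html.fromstring(markup)
    # (...)[1] keeps only the first marker row in document order;
    # preceding-sibling::tr[1] is the nearest row before it.
    cells = tree.xpath(
        "(//tr[@class='last_trade_row'])[1]/preceding-sibling::tr[1]/td"
    )
    return [cell.text_content().strip() for cell in cells]


print(row_above_first_marker(SAMPLE))
```

The parentheses matter: without them, [1] would apply per parent and could match the first marker row of every table instead of the first one in the document.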

How do I find the value of X cells after a matched contains with xpath and lxml

I have a document with multiple rows that has a value in the 4th TD element that I can't figure out how to retrieve. There is nothing unique in the tags so I have to match based on the word TOTAL, and then get the value I need from the 4th TD in the existing row. This is one TR for illustration:
<TR>
<TD ALIGN="right" COLSPAN="30" bgcolor=d8caca><div class=small4>SECTION TOTAL</div></TD>
<TD ALIGN="right" COLSPAN="8" bgcolor=d8caca> </TD>
<TD ALIGN="right" COLSPAN="13" bgcolor=gold><div class=small4> 11.907531</div>
</TD>
<TD ALIGN="right" COLSPAN="13" bgcolor=gold><div class=small4> $773.10</div></TD>
</TR>
I want to match on the word "TOTAL" and then get the value exactly three cells later, or in this case, $773.10.
This successfully gathers each of the "TOTAL" text in an array without issue:
titles = tree.xpath("//tr/td[contains(., 'TOTAL')]//text()")
However, I am unable to get the values in the last element. I've tried numerous variations of the following searching for the TOTAL and then trying to use following or following-sibling to no avail:
totals = tree.xpath("//tr/td[contains(., 'TOTAL')]/../following::td[4]/div/text()")
...but I either get an array of the non-breakable space from the immediate next TD after the TOTAL, no data at all, or "element" references that when expanded to text are null. How do I properly get the value inside td[4] in the existing TR after the contains is matched?
I am trying to get every occurrence, not just one, so that the titles and totals arrays are a 1:1 match. If there is a way to do a key=>value pairing, that'd be even better.
You can use the following-sibling axis to get the td elements located after the td that contains the text "TOTAL" within the same parent, then filter the result with the predicate [last()] to keep only the last such td, and finally return the child div/text():
query = "//tr/td[contains(., 'TOTAL')]/following-sibling::td[last()]/div/text()"
totals = tree.xpath(query)
xpathtester demo: http://www.xpathtester.com/xpath/5cf0aa473d030da66de1bec73bcb8795
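For the key=>value pairing asked about in the question, one possible sketch is to loop over the matching td elements and evaluate a relative XPath from each one. It assumes the document is parsed with lxml.html (which lower-cases tag names); the second sample row and the helper name pair_totals are made up for illustration.

```python
from lxml import html

# First row modelled on the question; the second row is invented.
SAMPLE = """
<table>
<TR>
<TD ALIGN="right" COLSPAN="30"><div class=small4>SECTION TOTAL</div></TD>
<TD ALIGN="right" COLSPAN="8">&nbsp;</TD>
<TD ALIGN="right" COLSPAN="13"><div class=small4> 11.907531</div></TD>
<TD ALIGN="right" COLSPAN="13"><div class=small4> $773.10</div></TD>
</TR>
<TR>
<TD><div class=small4>GRAND TOTAL</div></TD>
<TD>&nbsp;</TD>
<TD><div class=small4> 20.5</div></TD>
<TD><div class=small4> $1000.00</div></TD>
</TR>
</table>
"""


def pair_totals(markup):
    """Return (title, value) pairs, one per td that contains 'TOTAL'."""
    tree = html.fromstring(markup)
    pairs = []
    for cell in tree.xpath("//tr/td[contains(., 'TOTAL')]"):
        title = cell.text_content().strip()
        # Relative XPath: the last td after this cell in the same row.
        value = cell.xpath("string(following-sibling::td[last()])").strip()
        pairs.append((title, value))
    return pairs


print(pair_totals(SAMPLE))
```

Evaluating the second expression relative to each matched cell keeps titles and values aligned even if some row lacks a value. If the titles are unique, dict(pair_totals(SAMPLE)) gives a plain mapping.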

XPath for label and drop down combined

I am trying to create an XPath that will allow me to verify if a row in a table where label is "X" has the correct drop down value.
The XPath for the label is
//*[@id="mainContent"]/table/tbody/tr/td/center/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[14]/td[1]/b
The XPath for the drop down is
//*[@id="mainContent"]/table/tbody/tr/td/center/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[14]/td[2]/select
How can I modify these so I only need one XPath?
Assuming that the "X" label is unique:
//td[. = 'X']/following-sibling::td[1]/select
Or, you can retain some,
//*[@id="mainContent"]//td[. = 'X']/following-sibling::td[1]/select
or all,
//*[@id="mainContent"]/table/tbody/tr/td/center/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[14]/td[1]/following-sibling::td[1]/select
of the original path as necessary to meet whatever generality/specificity is required given the data on the page.
Similar to kjhughes' answer, but I would use
//td[1][b = 'X']/../td[2]/select
[b = 'X'] more closely matches what the OP asked, since the td could have other content besides the label (including whitespace-only nodes). And using td[1] and td[2] ensures that we use the first two columns, as the OP did.
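A short sketch of the combined lookup in Python with lxml, including the check of the selected option. The sample markup, labels and option values are invented, and the helper name selected_value_for is hypothetical.

```python
from lxml import html

# Hypothetical sample: label in td[1]/b, drop-down in td[2].
SAMPLE = """
<div id="mainContent">
<table>
<tr><td><b>W</b></td><td><select><option>a</option><option selected>b</option></select></td></tr>
<tr><td> <b>X</b> </td><td><select><option selected>yes</option><option>no</option></select></td></tr>
</table>
</div>
"""


def selected_value_for(markup, label):
    """Return the text of the selected option in the row labelled label."""
    tree = html.fromstring(markup)
    selects = tree.xpath(
        '//*[@id="mainContent"]//td[1][b = $label]/../td[2]/select',
        label=label,
    )
    if not selects:
        return None
    # Relative XPath from the <select>: the option carrying "selected".
    chosen = selects[0].xpath("option[@selected]")
    return chosen[0].text if chosen else None


print(selected_value_for(SAMPLE, "X"))
```

Matching on b rather than on the whole td means the whitespace around the label in the second row does not break the comparison.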

Spring bean comma separating values, but I want to overwrite

Alright, so I'm pretty new to Spring, but I was asked to resolve a bug. So in our application, we have a page that queries a database based on an id. However, not all entries are unique to the id. The id and date pair, on the other hand, do define unique entries.
So this page takes in an id. If there is only a single entry related to this id, everything works fine. However, if there are multiple entries, the page displays a radio button selection of the various dates that pertain to that id. We use something like:
<form:radiobutton id="loadDate" path="loadDate" value="${date}" label="${date}" />
Later on the same page, we want to display the data for that option. As part of it, we display the date of that selection:
<form:input id="aiLoadDate" path="loadDate" maxlength="22" size="22" class="readonly" readonly="true"/>
The problem is that when this happens, the variable (or bean? I'm not quite sure about Spring yet..) loadDate (a string) ends up being the same date twice, separated with a comma. I'm guessing the problem here is the "path="loadDate"" that is common to both lines.
Instead of appending the date to the already existing one like a csv, I'd like it to overwrite the current entry instead. Is there a way to do this?
Spring is not the direct cause of your problem. When an HTML form is submitted, each element appears in the request as a name=value pair. If two or more elements in the form have the same name (the name attribute, not the id), the request carries one name=value pair per element, and when those values are bound to a single String property the result is name=value,value (one value per element with the duplicated name).
Option 1: stop using an input as a display element. Just display the date in a span (or div or paragraph or whatever). If you want the look of an input box (border, etc.), use CSS to create a class that has the look you want and attach the class to the span (or div or paragraph, etc.) in which you display the date.
Option 2: continue using an input as a display element. Disabled input elements are not added to the request when the form is submitted. In the form:input, set disabled="true".
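The duplicated-name effect can be illustrated without Spring at all. The sketch below uses Python's standard urllib.parse only to show what a form body with two loadDate fields looks like; the date value is made up, and the comma join merely imitates what the question describes, it is not Spring's code.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical submission: the radio button and the read-only input
# both carry the name "loadDate", so the browser sends the field twice.
fields = [("loadDate", "2020-01-01"), ("loadDate", "2020-01-01")]
body = urlencode(fields)
parsed = parse_qs(body)

# Joining the two values imitates the comma-separated string in the question.
joined = ",".join(parsed["loadDate"])

# A disabled input is left out of the submission, so only one value remains.
body_disabled = urlencode(fields[:1])
parsed_disabled = parse_qs(body_disabled)

print(body)
print(joined)
print(parsed_disabled["loadDate"])
```

With only one field of that name in the request, there is nothing left to join, which is why either option above removes the doubled date.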