xpath expression - select element where parent contains a specific text - html

I am struggling to find the correct XPath expression to select an input element whose parent or sibling element contains a specific text.
In the example below, I would like to select the "input" element where, in the same tr row, a td element with a specific text exists.
my example path - returns no match
//input[contains(../../../td/text(),"15-935-331")]
source code
<tr>
<td>xxxx, yyyyy</td>
<td>Mr</td>
<td></td>
<td> 15-935-331</td>
<form id="betreuerModel" action="xxxx" method="POST">
<td class="tRight">
<input value="Bearbeiten" id="bearbeiten" name="bearbeiten" class="submit" title="Bearbeiten" type="submit"/>
</td>
</form>
</tr>
<tr>
// .. next row with same structure
</tr>

In XPath 1.0, the contains function, when given a node-set, converts it to a string using only the very first node in that node-set. In your case, that is the text of <td>xxxx, yyyyy</td>.
You could instead refactor your expression so that the predicate operates on all the nodes to check, and the contains function operates on a single item:
//input[../../../td/text()[contains(., "15-935-331")]]
This will get any input element, where the parent's parent's parent contains a td element with a text node containing the text 15-935-331.
A perhaps easier way to specify this would be to use ancestor::tr[1]/td in place of ../../../td.
//input[ancestor::tr[1]/td/text()[contains(.,"15-935-331")]]
This would get the first tr in the ancestor hierarchy, and operate on that.
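The difference between the two forms can be emulated outside an XPath engine. The following is a minimal sketch in Python using only the standard library's xml.etree.ElementTree, which does not implement contains() itself. The helper names contains_first_node and contains_any_node are hypothetical and merely mimic the XPath 1.0 rules described above.

```python
import xml.etree.ElementTree as ET

# The row from the question, parsed as XML for illustration.
ROW = """
<tr>
  <td>xxxx, yyyyy</td>
  <td>Mr</td>
  <td></td>
  <td> 15-935-331</td>
  <form id="betreuerModel" action="xxxx" method="POST">
    <td class="tRight">
      <input value="Bearbeiten" id="bearbeiten" name="bearbeiten"
             class="submit" title="Bearbeiten" type="submit"/>
    </td>
  </form>
</tr>
"""


def contains_first_node(cells, needle):
    """Mimic contains(node-set, needle): only the first node is converted."""
    if not cells:
        return False
    return needle in (cells[0].text or "")


def contains_any_node(cells, needle):
    """Mimic node-set[contains(., needle)]: every node is tested in turn."""
    return any(needle in (cell.text or "") for cell in cells)


row = ET.fromstring(ROW)
cells = row.findall("td")  # the td children of tr, like ../../../td from the input

first_only = contains_first_node(cells, "15-935-331")
any_node = contains_any_node(cells, "15-935-331")
print(first_only, any_node)
```

Under these assumptions the first check fails because only "xxxx, yyyyy" is inspected, while the per-node check finds the fourth cell.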

As an alternative to the solution posted by Keith, you can use the following XPath expression:
//tr[td[contains(., "15-935-331")]]/form//input
This makes it a bit more independent of the actual structure of the HTML. It selects the tr which contains a td containing the given text, and from that tr it takes the input element anywhere below the form element.

Related

XPath to separately select each of two values in a table cell?

<td _ngcontent-wp class="align-middle">
"4.79728"
<small _ngcontent-wp class="neo_red_dark"> -0.08% </small>
</td>
My XPath as follows:
(//table[@class="table"]/tbody/tr/td[3])[1]
It works, but it gets two values together (4.79728 -0.08%). How can I get them separately?
You can get the value before the space and after the space using:
substring-before() and substring-after()
or change your XPath to target the text() descendants of the td instead of the td itself (which is producing the calculated text value).
In order to select "4.79728":
(//table[@class="table"]/tbody/tr/td[3])[1]/text()
In order to select -0.08%:
(//table[@class="table"]/tbody/tr/td[3])[1]/small/text()
You should indicate with XPath questions which XPath version you are using.
If it's version 1.0, remember that the set of data types you can return is very limited: a single string, number, or boolean, or a node-set. And some APIs only allow you to return a node-set.
Your current query is returning a node-set containing one node, namely a td element, whose string value contains the concatenation of all the text within. You could return a node-set containing all the text nodes individually by appending //text() to the query. But of course, it won't always be the case that the two numbers are in separate text nodes.
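The distinction between an element's string value and its individual text nodes can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration under assumptions: the cell below is a simplified, well-formed version of the one in the question (the _ngcontent-wp attribute is dropped, and the quotes around the number are assumed to be a browser display artifact).

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical version of the cell from the question.
CELL = '<td class="align-middle">4.79728<small class="neo_red_dark"> -0.08% </small></td>'

td = ET.fromstring(CELL)

# String value of the td: all descendant text concatenated.
string_value = "".join(td.itertext())

# Text node that is a direct child of td, comparable to td/text().
own_text = (td.text or "").strip()

# Text node inside the small child, comparable to td/small/text().
small_text = (td.find("small").text or "").strip()

print(repr(string_value), own_text, small_text)
```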

How to correctly use `XPath` locators, such as `parent`, `child`, `following-sibling`?

I have some form:
<div class="form_item--2c8WB">
<label>
<span class="label--2VxxL required--2nkmI">
"Text"
::after
</span>
<br>
<input type="password" name="newPasswordRepeat" autocomplete="new-password"
aria-invalid="true" aria-required="true"
aria-errormessage="vee_Text2">
</label>
<div class="errors--qVgtm">
<div>Text3</div>
</div>
</div>
I need to find a path to the Text3 element, but specifically starting from the input element.
My way:
//input[@name='newPasswordRepeat']/../../div/div
The path is valid, but it is a long way to go, and I would like to use the following-sibling axis instead. However, I can't get it to work.
For example, I'm trying to use the parent:: axis:
//input[@name='newPasswordRepeat']/parent::
//input[@name='newPasswordRepeat']::parent::
//input[@name='newPasswordRepeat']::parent
//input[@name='newPasswordRepeat']/parent
//input[@name='newPasswordRepeat']/::parent
None of these work; only this one does:
//input[@name='newPasswordRepeat']/..
I also can't get following-sibling to work, and in that case there is no abbreviated alternative (such as .. or .).
How do I correctly use XPath locators such as parent, child, and following-sibling?
It's always axis::node_test (compare this answer where I explain various XPath terms).
For example
parent::div selects the parent node if it's a <div> (that's the node test).
ancestor::div selects all (!) ancestor nodes that are <div>s.
following-sibling::div selects all (!) following siblings that are <div>s.
Most of the time there is no guarantee that only a single node is selected. Therefore it's sensible to also have some sort of [predicate] that narrows down the selection to prevent false positives - for example we could verify the @class attribute.
//input[@name='newPasswordRepeat']/parent::label/following-sibling::div[starts-with(@class, 'errors')]/div
Of course parent::label can be shortened to .. if we don't care what kind of element the parent is.
Rather than select a target and have to traverse up to a parent, consider using a predicate on the parent in the first place:
//label[input/@name='newPasswordRepeat']/following-sibling::*[1]/div
will select the div child of the element immediately following the label which contains the targeted input element. No parent:: axis is required.
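To make the individual axis steps concrete, here is a rough emulation in Python using only xml.etree.ElementTree. That module supports neither parent:: nor following-sibling::, so the sketch walks the tree by hand with a hypothetical parent_of map. The markup is a simplified, well-formed version of the form from the question, which is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed version of the form from the question.
FORM = """
<div class="form_item--2c8WB">
  <label>
    <span class="label--2VxxL required--2nkmI">Text</span>
    <br/>
    <input type="password" name="newPasswordRepeat"/>
  </label>
  <div class="errors--qVgtm">
    <div>Text3</div>
  </div>
</div>
"""

root = ET.fromstring(FORM)

# ElementTree elements do not know their parent, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}

# //input[@name='newPasswordRepeat']
inp = root.find(".//input[@name='newPasswordRepeat']")

# parent::label (axis = parent, node test = label)
label = parent_of[inp]
assert label.tag == "label"

# following-sibling::* (all siblings after label, in document order)
siblings = list(parent_of[label])
following = siblings[siblings.index(label) + 1:]

# [starts-with(@class, 'errors')] narrows the selection, then /div
errors = [e for e in following
          if e.tag == "div" and e.get("class", "").startswith("errors")]
message = errors[0].find("div").text
print(message)
```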

What is the role of parentheses in XPath 1.0?

In Chrome DevTools > Elements, when I search for //tr/td/span I find an element (because such an element exists on my page).
When I search for (//tr)/td/span or (//tr/td)/span I also find this element.
But neither //tr(/td)/span nor //tr/(td)/span nor //tr/(td/)span find anything.
What is the meaning of these parentheses in XPath?
Parentheses in XPath are used as they are in other programming languages:
Function argument grouping: e.g: //tr/td[contains(.,"e")]
Evaluation precedence indication: e.g: normal arithmetic expression grouping as well as leading path grouping (trace LocationPath through to PrimaryExpr in the XPath grammar) as in (//td)[1] to find the first td in the document as opposed to //td[1] which finds the td elements that are the first child of their respective parent elements.
They're also used in
node tests: e.g: node(), element(), ...
processing instructions: e.g: PageBreak().
Your examples that do not find anything (e.g. //tr(/td)/span, //tr/(td)/span [1], etc.) have parentheses embedded within the path that do not fall into any of the above categories. Such uses of parentheses are syntactically invalid and should have been reported as such rather than silently failing.
[1] Note that this expression would actually be syntactically valid under XPath 2.0/3.0. Thanks, @Andersson, for noticing.
I don't think the parentheses mean anything in your case, but they can be used to return the required node or node-set depending on the index passed.
For instance, HTML is like below:
<table>
<tr>
<td>
<span>first</span>
</td>
<td>
<span>second</span>
</td>
</tr>
<tr>
<td>
<span>third</span>
</td>
<td>
<span>fourth</span>
</td>
</tr>
</table>
(//tr)[1]/td will return cells for first row only (first, second)
(//tr)[2]/td - for second row (third, fourth)
(//tr/td)[1] - first cell of first row (first). Note that //tr/td[1] returns the first cell of each row (first, third)
...
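The effect of the grouping can be reproduced with Python's standard xml.etree.ElementTree as a small sketch. ElementTree only implements a subset of XPath and does not accept a parenthesized path, so indexing the result list is used here as a stand-in for (path)[n]. The helper span_text is hypothetical.

```python
import xml.etree.ElementTree as ET

TABLE = """
<table>
  <tr>
    <td><span>first</span></td>
    <td><span>second</span></td>
  </tr>
  <tr>
    <td><span>third</span></td>
    <td><span>fourth</span></td>
  </tr>
</table>
"""

table = ET.fromstring(TABLE)


def span_text(cells):
    """Return the span text of each given td element."""
    return [cell.find("span").text for cell in cells]


# //tr/td[1]: the predicate applies per parent, so one cell per row.
first_in_each_row = span_text(table.findall(".//tr/td[1]"))

# (//tr/td)[1]: the predicate applies to the whole result, emulated by slicing.
first_overall = span_text(table.findall(".//tr/td")[:1])

# (//tr)[2]/td: cells of the second row only.
second_row = span_text(table.findall(".//tr")[1].findall("td"))

print(first_in_each_row, first_overall, second_row)
```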

A CSS selector or xpath to iterate over rows with specific attributes

Imagine I am given a table like this:
<table>
<tr><td>A</td></tr>
<tr><td><a href="...">B</a></td></tr>
<tr><td><a href="...">C</a></td></tr>
<tr><td>D</td></tr>
<tr><td><a href="...">E</a></td></tr>
<tr><td>B</td></tr>
</table>
I'd like to construct a CSS selector (preferred) or an XPath (accepted) that picks out the n'th row that contains an a anchor, such that:
selector(1) => B
selector(2) => C
selector(3) => E
CSS selectors
At this point, I'm pretty sure that CSS won't do the job, but
'table tr:nth-child(' + n + ')'
will pick out the n'th row, but that selects rows whether or not they have an a anchor. Similarly,
'table tr:nth-child(' + n + ') a'
will pick out rows with an a anchor, but only if n is 2, 3 or 5.
XPath
With XPath, this matches all the tr that have an a
`//table//tr//a/ancestor::tr`
but I can't figure out how to select the n'th match. In particular,
`//table//tr//a/ancestor::tr[position() = 2]`
doesn't appear to select anything.
You can't do this with a CSS selector [1] for a number of reasons:
There is no parent selector, and
There is no selector for matching the nth child satisfying an arbitrary selector.
Your XPath is incorrect because a/ancestor::tr[position() = 2] returns the second tr ancestor of the a element. That is, the [position() = 2] predicate is connected to the ancestor:: axis. This XPath would match the middle-level tr in the following HTML:
<table>
<tr><td><table>
<tr><td><table>
<tr><td>
</table>
</table>
</table>
In your HTML, each a element has only one tr ancestor, so this will not select anything.
The XPath you should use is:
(//table//tr[descendant::a])[2]
This matches the second tr element that contains an a descendant.
[1] In Selectors 4, a potential solution would be table tr:nth-match(2 of :has(a)).
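The idea behind (//table//tr[descendant::a])[n], which is to filter the rows first and then index into the filtered list, can be sketched in Python with the standard xml.etree.ElementTree. The table below assumes that rows 2, 3 and 5 hold anchors, as the expected results in the question imply. The href values and the function name nth_row_with_anchor are placeholders invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed layout: anchors in rows 2, 3 and 5 (href values are placeholders).
TABLE = """
<table>
  <tr><td>A</td></tr>
  <tr><td><a href="#">B</a></td></tr>
  <tr><td><a href="#">C</a></td></tr>
  <tr><td>D</td></tr>
  <tr><td><a href="#">E</a></td></tr>
  <tr><td>B</td></tr>
</table>
"""

table = ET.fromstring(TABLE)


def nth_row_with_anchor(table, n):
    """Return the n-th (1-based) tr that has an a descendant."""
    # Step 1: keep only rows with an anchor, like tr[descendant::a].
    rows = [tr for tr in table.iter("tr") if tr.find(".//a") is not None]
    # Step 2: index into the filtered list, like (...)[n].
    return rows[n - 1]


labels = ["".join(nth_row_with_anchor(table, n).itertext()).strip()
          for n in (1, 2, 3)]
print(labels)
```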
If I understand you correctly, you can find the nth td which has an <a href like so (you want C to be the 2nd match?):
(/table//tr/td[a[@href]])[2]
If you can't guarantee a td element, you can wild card the path and elements:
(/table//tr//*[a[@href]])[2]
Answers from @BoltClock and @StuartLC both work. But now that I know parentheses in XPath can control operator precedence, a more straightforward solution seems to be:
(//table//tr//a)[2]
Am I missing something?

XPath Expression Problem

I have the following HTML snippet, http://paste.enzotools.org/show/1209/ , and I want to extract the tag that has a text() descendant with the value of "172.80" (it's the fourth node from that snippet). My attempts so far have been:
'descendant::td[@class="roomPrice figure" and contains(descendant::text(), "172.80")]'
'descendant::td[@class="roomPrice figure" and contains(div/text(), "172.80")]'
'descendant::td[@class="roomPrice figure" and div[contains(text(), "172.80")]]'
but none of them selects anything.
Does anyone have any suggestions?
When passing a node set to a function call, note that if the function signature doesn't declare a node-set argument, the node set is converted using only its first node.
So, I think you need this XPath expression:
descendant::td[@class="roomPrice figure"][div[text()[contains(.,'172.80')]]]
Test for a text node child of div
or
descendant::td[@class="roomPrice figure"]
[div[descendant::text()[contains(.,'172.80')]]]
Test for a text node descendant of div
or
descendant::td[@class="roomPrice figure"]
[descendant::text()[contains(.,'172.80')]]
Test for a text node descendant of td
I believe you want something like this:
<xsl:for-each select="//td[contains(string(.), '172.80')]">
The string() function will give you all the text in the current and descendant nodes, whereas text() selects only the text nodes that are direct children of the current (context) node.
Of course, you can extend the XPath selector to filter on the class names too...
<xsl:for-each select="//td[contains(string(.), '172.80')][@class='roomPrice figure']">
And as stated in the comments above, your posted XML/HTML is invalid as it stands.
My understanding is that you want to select the td element with the specified class that has a descendant text node containing the value "172.80".
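The difference between testing the first text child and testing the whole string value can be sketched in Python with the standard xml.etree.ElementTree. The original snippet from the question is not available here, so the cell below is entirely hypothetical: it assumes the price sits after an inline element inside the div.

```python
import xml.etree.ElementTree as ET

# Hypothetical cell; the real markup from the question is not reproduced here.
CELL = '<td class="roomPrice figure"><div>from <b>EUR</b> 172.80</div></td>'

td = ET.fromstring(CELL)
div = td.find("div")

# First text node child of div, which is all contains(div/text(), ...) would see.
first_text_child = div.text or ""

# String value of td, comparable to string(.): every descendant text joined.
string_value = "".join(td.itertext())

found_in_first_text = "172.80" in first_text_child
found_in_string_value = "172.80" in string_value
print(found_in_first_text, found_in_string_value)
```

With markup shaped like this, the first text child is only "from ", so a test against it misses the price, while the string value contains it.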
I'm assuming the context node is the <tr> (or some ancestor of it).
The attempts you listed all suffer from the problem that contains() converts its first argument to a single string, using only the first node of the nodeset. So if the td or div has a descendant or child text node before the one that contains "172.80", the one containing "172.80" will not be noticed.
Try this:
'descendant::td[@class="roomPrice figure" and
descendant::text()[contains(., "172.80")]]'