How can I use a wildcard in XPath?

cover: $main//*[has-class("aligncenter wp-image-121146 size-large")]//img
cover: $main//img[has-class("aligncenter wp-image-121146 size-large")]
The string has a static part, aligncenter wp-image-, and a dynamic part "". What I want to do here is match all the possible values of "".
In bash it is something like this:
"aligncenter wp-image-"*
How can I do that in XPath?

I think this will work:
'//E[@class="aligncenter" and contains(concat(" ", @class, "wp-image-"), " C ")]'
This might be more robust. I haven't used the and operator much, and it looks like you're expecting the classes to be in that order:
'//E[contains(concat(" ", @class, "aligncenter wp-image-"), " C ")]'
I haven't tested it though, either way, this should help you:
https://gist.github.com/glenpierce/400d5b569094b902f06789d80757454e

There is not enough information to give an exact XPath solution to this question as you didn't provide sample XML/HTML, and both of the attempted XPath expressions (if they actually are) are invalid.
To answer the title: you can't use a wildcard like that in XPath 1.0, which is the most widely implemented version of XPath. But you can use the contains() or starts-with() function. The latter seems more suitable in this particular case. For example, the following XPath returns all img elements, anywhere in the document, whose class attribute value starts with the substring 'aligncenter wp-image-':
//img[starts-with(@class, 'aligncenter wp-image-')]
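Outside an XPath engine, the same prefix test can be sketched in Python. This is only an illustration, assuming markup that the standard-library xml.etree.ElementTree can parse; the sample markup and the helper name imgs_with_class_prefix are made up for the example. ElementTree's XPath subset has no starts-with(), so the prefix check is done in plain Python instead of in the path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the question did not include the real page.
SAMPLE = """
<div>
  <img class="aligncenter wp-image-121146 size-large" src="a.jpg"/>
  <img class="aligncenter wp-image-99 size-medium" src="b.jpg"/>
  <img class="alignleft wp-image-7" src="c.jpg"/>
</div>
"""

def imgs_with_class_prefix(root, prefix):
    """Return img elements whose class attribute starts with prefix."""
    return [img for img in root.iter("img")
            if img.get("class", "").startswith(prefix)]

root = ET.fromstring(SAMPLE)
matches = imgs_with_class_prefix(root, "aligncenter wp-image-")
print([img.get("src") for img in matches])  # ['a.jpg', 'b.jpg']
```

Like starts-with(), this depends on the classes appearing in exactly that order at the start of the attribute.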

Related

Xpath: Get Text After Element With Containing Text

I am looking for a way to get text which is not inside an HTML element:
<div class="col-sm-4">
<strong>Handelnde Personen:</strong><br><br>
<strong>Geschäftsführer</strong><br>
Mr John Doe<br>
Privatperson<br>
.....<br>
<br>
I want to get "Mr John Doe".
The only way I see is looking for a strong element which contains "Geschäftsführer" and then look for the following text.
My idea so far:
//strong[contains(text(), 'Gesch')]/br/../text()
... I simply can't make it work.
Also, is there a "wildcard" for strings? That I could use
*esch*ftsf*hr*
for "Geschäftsführer"?
I highly appreciate your help, thanks!
Try
//strong[starts-with(., 'Gesch')]/following-sibling::text()[1]
As for wildcard matching, with XPath 2.0 you can use regular expressions:
//strong[matches(., '.*esch.*ftsf.*hr.*')]
With XPath 3.0 you could also use the Unicode collation algorithm
//strong[compare(., 'Geschäftsführer',
'http://www.w3.org/2013/collation/UCA?strength=primary') = 0]
(strength=primary ignores case and accents)
But to get anything more advanced than XPath 1.0 in the browser, you would need to deploy Saxon-JS.
Another option with 1.0 is to use translate() to remove case and umlauts:
//strong[translate(., 'ABCD..XYZÄÖÜäöüß', 'abcd..xyzaouaous') = 'geschaftsfuhrer']
Note, in all these examples I have used "." rather than "text()" to get the string value of an element - this is recommended practice.
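To make the translate() and matches() ideas concrete, here is a rough Python counterpart. It is a sketch, not XPath itself: str.maketrans stands in for translate() (with the full alphabet spelled out via the string module), and re.search stands in for matches(). The helper name fold is made up for the example.

```python
import re
import string

# XPath 1.0 translate() maps each character of its second argument to the
# character at the same position in its third; str.maketrans does the same.
SRC = string.ascii_uppercase + "ÄÖÜäöüß"
DST = string.ascii_lowercase + "aouaous"
FOLD = str.maketrans(SRC, DST)

def fold(text):
    """Lowercase ASCII letters and replace German umlauts and ß."""
    return text.translate(FOLD)

name = "Geschäftsführer"
folded = fold(name)
print(folded)  # geschaftsfuhrer

# Rough counterpart of matches(., '.*esch.*ftsf.*hr.*'): re.search, like
# XPath matches(), succeeds when the pattern matches anywhere in the string.
wildcard_hit = re.search(r"esch.*ftsf.*hr", name) is not None
print(wildcard_hit)  # True
```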

Is it possible to read the text of an li using XPath with different attributes?

I am aware that I can directly use:
driver.FindElement(By.XPath("//ul[3]/li/ul/li[7]")).Text
to get the text, but I am trying to get the text by using XPath and a combination of different attributes like text(), contains(), etc.
//ul[3]/li/ul/li//[text()='My Data']
Please suggest different ways I can handle this, other than the one I mentioned.
<li class="ng-binding ng-scope selectedTreeElement" ng-click="orgSelCtrl.selectUserSessionOrg(child);" ng-class="{selectedTreeElement: child.organizationId == orgSelCtrl.SelectedOrg.organizationId}" ng-repeat="child in node.childOrgs" style="background-color: transparent;"> My Data </li>
It looks like you have an extra "/" in your XPath and you are missing a leading dot:
//ul[3]/li/ul/li//[text()='My Data']
try this:
.//ul[3]/li/ul/li[text()='My Data']
But XPath is used here only to find elements, not to read their attributes. If you need to read an attribute or the text inside an element, you need to use Selenium after the search.
.Text of a WebElement would just return you the text of an element.
If you want to make expectations about the text, check the text() inside the XPath expression, e.g.:
//ul[3]/li/ul/li[text()='My Data']
or, using contains():
//ul[3]/li/ul/li[contains(text(), 'My Data')]
There are other functions you can make use of, see Functions - XPath.
You can also combine it with other conditions. For instance:
//ul[3]/li/ul/li[contains(@class, 'selectedTreeElement') and contains(text(), 'My Data')]

Regex selects first to last instead of just first

I'm trying to use String.sub! in Ruby and it substitutes way too much.
This is the regex I'm using. You can see it's matching too much: http://rubular.com/r/IUav4KEFWH
<rb>.+<\/rb>
It selects from the first to the last, and I want it to select just the first pair.
Is there another version of sub I'm not aware of, or a better way to sub?
It would be easy to turn off multi-line and put them on separate lines, but I don't want to sacrifice multi-line matching.
Your regex is too greedy:
<rb>.+<\/rb>
Make it non-greedy using:
<rb>.+?<\/rb>
It matches from the first <rb> tag up until the very last </rb> tag because + is a greedy operator, meaning it will match as much as it can while still allowing the remainder of the regular expression to match.
You want to use +? for a non-greedy match meaning "one or more — preferably as few as possible".
<rb>.+?</rb>
Note: Using a parser to extract data from HTML is recommended over using regular expressions.
You can try this variant:
<rb>(?>(?!<\/rb>).)*+<\/rb>
Or if you want:
<rb>[^<]+<\/rb>
See the difference between .*? and [^<]+ in this demo
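The greedy versus non-greedy behaviour is easy to check in isolation. The question is about Ruby, but the sketch below uses Python's re module purely for illustration; the + and +? quantifiers behave the same way there. The sample string is made up.

```python
import re

# Made-up sample with two <rb> pairs on one line.
text = "<rb>one</rb> and <rb>two</rb>"

# Greedy: .+ runs to the last </rb> that still lets the pattern match.
greedy = re.search(r"<rb>.+</rb>", text).group(0)
# Non-greedy: .+? stops at the first </rb>.
lazy = re.search(r"<rb>.+?</rb>", text).group(0)
# Negated class: [^<]+ cannot cross a "<", so it also stops at the first pair.
negated = re.search(r"<rb>[^<]+</rb>", text).group(0)

print(greedy)   # <rb>one</rb> and <rb>two</rb>
print(lazy)     # <rb>one</rb>
print(negated)  # <rb>one</rb>

# Replace only the first pair (count=1), similar in spirit to a single sub.
replaced = re.sub(r"<rb>.+?</rb>", "X", text, count=1)
print(replaced)  # X and <rb>two</rb>
```

In Python, . does not match newlines unless re.DOTALL is passed, so a multi-line input would need that flag for the .+? variants; the [^<]+ variant matches across newlines without it.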

Use XPath to find links containing two things

I'm using XPath to parse an HTML document to find a specific link. The specific link has a domain name in it and the character '#'.
//a[@*[contains(., 'domain')]] | //a[@*[contains(., '#')]]
This returns links with 'domain' OR '#' in them, and I need 'domain' AND '#'.
I've been trying to use:
//a[@*[contains(., 'domain')]] & //a[@*[contains(., '#')]]
But that's no good.
You can read about XPath operators here. The & operator does not exist.
Also, there is no need to select the element twice.
You could use either
//a[@*[contains(., 'domain')]][@*[contains(., '#')]]
or
//a[@*[contains(., 'domain')] and @*[contains(., '#')]]
Should be as easy as:
//a[@*[contains(., 'domain')]][@*[contains(., '#')]]
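The difference between the union (|) and requiring both conditions can be sketched in plain Python. This is an illustration only, assuming markup that the standard-library xml.etree.ElementTree can parse; ElementTree's XPath subset has no contains(), so the predicate is written as a hypothetical helper, any_attr_contains, and the sample links are made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample links; only the first has both 'domain' and '#'.
SAMPLE = """
<body>
  <a href="http://domain.example/page#top">both</a>
  <a href="http://domain.example/page">domain only</a>
  <a href="http://other.example/#x">hash only</a>
</body>
"""

def any_attr_contains(el, needle):
    """Counterpart of the predicate @*[contains(., needle)]."""
    return any(needle in value for value in el.attrib.values())

root = ET.fromstring(SAMPLE)
links = list(root.iter("a"))

# Like the | union: a link qualifies if either condition holds.
either = [a.text for a in links
          if any_attr_contains(a, "domain") or any_attr_contains(a, "#")]
# Like chained predicates or the and operator: both conditions must hold.
both = [a.text for a in links
        if any_attr_contains(a, "domain") and any_attr_contains(a, "#")]

print(either)  # ['both', 'domain only', 'hash only']
print(both)    # ['both']
```

As with the XPath versions, the two conditions may be satisfied by different attributes of the same element.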

Selenium: test if element contains some text

With Selenium IDE, how can I test if an element's inner text contains a specific string? For example:
<p id="fred">abcde</p>
'id=fred' contains "bcd" = true)
The Selenium-IDE documentation is helpful in this situation.
The command you are looking for is assertText, the locator would be id=fred and the text for example *bcd*.
It can be done with a simple wildcard:
verifyText
id="fred"
*bcd*
See selenium IDE Doc
You can also use:
assertElementPresent
css=p#fred:contains('bcd')
A solution with XPath:
Command: verify element present
Target: xpath=//p[@id='fred' and contains(.,'bcd')]
Are you able to use jQuery? If so, try something like:
$("p#fred:contains('bcd')").css("text-decoration", "underline");
It seems regular expressions might work:
"The simplest character set is a character. The regular expression "the" contains three
character sets: "t," "h" and "e". It will match any line with the string "the" inside it.
This would also match the word "other". "
(From site: http://www.grymoire.com/Unix/Regular.html)
If you are using Visual Studio, there is functionality for evaluating strings with regular expressions of all kinds (not just contains):
using System.Text.RegularExpressions;
Regex.IsMatch("YourInnerText", @"^[a-zA-Z]+$");
The expression I posted will check if the string contains ONLY letters.
According to my link, your regular expression would then be "bcd" or some string you construct at runtime. Or:
Regex.IsMatch("YourInnerText", @"bcd");
(Something like that anyway)
Hope it helped.
You can use the command assertTextPresent or verifyText