How to select HTML element by XPATH if attribute name contains space?

For example:
<li big class="attribute"></li>
In Selenium, selecting it would look like this:
driver.find_element(By.XPATH, '//*[@big class="attribute"]')
How can I select the element by XPath? The expression above is rejected as invalid.
Selecting by class alone, like //*[@class="attribute"], doesn't work either.

If you want to select the element by both attributes, the correct code would be:
driver.find_element(By.XPATH, '//li[@big and @class="attribute"]')
Note that big appears to be a separate boolean attribute (one that may have no explicit value), not part of an attribute name that contains a space.
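As a quick check that big and class are two separate attributes, here is a minimal sketch using Python's standard-library html.parser rather than Selenium. The AttrCollector class name is made up for illustration; the snippet only shows how an HTML parser tokenizes the tag from the question.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record every start tag together with its parsed attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a valueless attribute gets None
        self.seen.append((tag, attrs))


collector = AttrCollector()
collector.feed('<li big class="attribute"></li>')

# Each attribute is reported separately, so "big class" is not one name
for tag, attrs in collector.seen:
    print(tag, attrs)
```

The parser reports big with no value and class with the value "attribute", which is why the XPath needs two tests joined with and.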

Related

How to search for tag + attribute in Chrome devtools element inspector?

I perform a simple search in DevTools, but the number of results drops drastically for no apparent reason:
What's more, if I view the source and do the same search, the number of results for <link rel is just 58, not 184. Do you know why?
Here is the page if you need to examine.
For these "complex" queries you'll have to use XPath selectors:
//link[@rel]
//link[contains(@rel,'style')]
or CSS selectors:
link[rel]
link[rel*="style"]
For a trivial CSS selector like a, use html a instead to ensure it isn't matched as literal text.
List of supported queries
Devtools uses CDP command DOM.performSearch and judging by the implementation it tries to match these types of queries:
text - inside #text nodes (like textContent in js)
text - inside tag names
text - inside attribute names
text - inside attribute values
<tag - matching at the start of a tag name
</tag - matching a closing tag
tag> - matching at the end of a tag name
<tag> - matching an entire tag name
"text - matching at the start of an attribute value
text" - matching at the end of an attribute value
"text" - matching an entire attribute value
//a[contains(., 'foo')] - XPath selector
a#foo.class[attr] - CSS selector
As you can see the literal text matching is limited to the first four types, and it won't find things that span more than one type like attr="value" that spans two types.
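If you want to sanity-check what the two XPath queries above should count, independent of DevTools, here is a rough sketch using Python's standard-library ElementTree. The head fragment is a made-up example, and ElementTree only implements a subset of XPath (no contains()), so that part is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed head fragment used only for illustration
doc = ET.fromstring(
    "<head>"
    '<link rel="stylesheet" href="a.css"/>'
    '<link rel="icon" href="favicon.ico"/>'
    '<link href="no-rel.css"/>'
    "</head>"
)

# Same idea as //link[@rel]: link elements that carry a rel attribute
with_rel = doc.findall(".//link[@rel]")

# ElementTree has no contains(), so //link[contains(@rel,'style')]
# is emulated with a substring test on the attribute value
style_links = [el for el in with_rel if "style" in el.get("rel")]

print(len(with_rel), len(style_links))
```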

Selecting element based on attribute order in XPath?

I am working on a project using the Html-Agility-Pack and I need to build a list of each link that has an href attribute as its first attribute. What XPath expression would be used for this?
Example (I would want to only select the first):
<a href="http://someurl.com"/>
<a id="someid" href="http://someurl.com"/>
No, don't do that.
You really don't want to select elements based upon the ordering of their attributes, because attribute order is arbitrary in HTML and XML. Find another criterion to limit your selections:
attribute presence or attribute value
child element presence or string value
preceding element value, possibly a label
etc
You want to choose a criterion that's invariant across all instances of the HTML/XML documents you may encounter. Attribute order is not such a criterion.
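To illustrate selecting on attribute presence instead of attribute order, here is a small sketch with Python's standard-library ElementTree (not Html-Agility-Pack). It assumes, purely for the sake of the example, that "has href but no id" is the criterion that distinguishes the wanted links; your real criterion may well differ.

```python
import xml.etree.ElementTree as ET

# The two links from the question, wrapped in a hypothetical body element
doc = ET.fromstring(
    "<body>"
    '<a href="http://someurl.com"/>'
    '<a id="someid" href="http://someurl.com"/>'
    "</body>"
)

# Parsed attributes are a name -> value mapping, so test for presence
# or absence of a name rather than for its position in the tag
links_without_id = [
    a for a in doc.iter("a")
    if "href" in a.attrib and "id" not in a.attrib
]

print(len(links_without_id))
```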

XPath based on id attribute value that starts with something?

For example if I have multiple anchor elements on a site and the easiest way to get them is via their ID, but the IDs look like this:
lots of html...
hop1
...lots of html...
hop2
...lots of html...
hop3
...lots of html
Is it possible to select the href attributes of all anchor elements whose id has the "foo_" part of the id? In other words, can I add a wildcard in an attribute's value in XPath?
This XPath expression, which works with all versions of XPath,
//a[starts-with(@id,"foo_")]/@href
will select all a/@href attributes whose a has an id attribute value that starts with "foo_".
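For comparison, here is a minimal sketch of the same selection in plain Python with the standard-library ElementTree. ElementTree does not implement starts-with(), so the prefix test is done with str.startswith; the markup, ids, and href values below are placeholders, not the asker's page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; ids and href values are placeholders
doc = ET.fromstring(
    "<div>"
    '<a id="foo_1" href="/one">hop1</a>'
    '<a id="bar_1" href="/other">skip</a>'
    '<a id="foo_2" href="/two">hop2</a>'
    "</div>"
)

# Equivalent in spirit to //a[starts-with(@id,"foo_")]/@href
hrefs = [
    a.get("href")
    for a in doc.iter("a")
    if a.get("id", "").startswith("foo_")
]

print(hrefs)
```

With a full XPath engine you would pass the expression itself; the list comprehension is only a stand-in for engines that lack the function.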
Yes, you can use the matches() function, which requires XPath 2.0 or later (for example in XSLT 2.0):
Starting with foo_: //a/@id[matches(.,'^foo_\d+')]
Containing foo_: //a/@id[matches(.,'foo_\d+')]
Please specify which language you are asking about.

XPath: Way to match text inside an arbitrary number of nested elements?

Is it possible for one XPath expression to match all the following <a> elements using the text in the element, in this case "Link"?
Examples:
<a>Link</a>
<a><span>Link</span></a>
<a><div>Link</div></a>
<a><div><span>Link</span></div></a>
This simple XPath expression,
//a[contains(., 'Link')]
will select the a elements of all of your examples because . represents the current node (a), and contains() will check the string value of a to see if it contains 'Link'. The string value of a already conveniently abstracts away from any descendant elements.
This even simpler XPath expression,
//a[. = 'Link']
will also select the a elements in all of your examples. It's appropriate to use if the string value of a will exactly equal, rather than just contain, "Link".
Note: The above expressions will also select Li<br/>nk, which may or may not be desirable.
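Here is a small sketch of the string-value idea using Python's standard-library ElementTree, assuming Python 3.7+ (where the [.='text'] predicate compares against the full text content including descendants). The body wrapper and the extra "Other" link are made up for the example; contains() is not available in ElementTree, so it is emulated by joining itertext().

```python
import xml.etree.ElementTree as ET

# Nesting variants plus the <br/> case from the note; the wrapper is hypothetical
doc = ET.fromstring(
    "<body>"
    "<a>Link</a>"
    "<a><span>Link</span></a>"
    "<a><div>Link</div></a>"
    "<a><div><span>Link</span></div></a>"
    "<a>Li<br/>nk</a>"
    "<a>Other</a>"
    "</body>"
)

# Like //a[. = 'Link']: compare the whole string value of each a
exact = doc.findall(".//a[.='Link']")

# Like //a[contains(., 'Link')], emulated by concatenating all text nodes
containing = [a for a in doc.iter("a") if "Link" in "".join(a.itertext())]

print(len(exact), len(containing))
```

Both approaches pick up the Li<br/>nk case as well, since the text on either side of the br element is concatenated into one string value.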
You could use the following:
//a[(.//*|.)[contains(text(), "Link")]]
This will select a elements that contain the text "Link" or a elements that have a descendant element that contains the text "Link".
//a - Select all a elements
( - Open OR grouping
.//* - Select all the descendant elements
| - Or..
. - Select the current node
) - Close OR grouping
[contains(text(), "Link")] - If they contain the text "Link"
Alternatively, you could also use:
//a[(.//*|.)[.="Link"]]

Select attribute content XPath

I have an XPath
//*[@class]
I would like to make an XPath to select the content inside this attribute.
<li class="tab-off" id="navList0">
So in this case I would like to extract the text "tab-off", is this possible with XPath?
Your original //*[@class] XPath query returns all elements which have a class attribute. What you want is //*[@class]/@class to retrieve the attribute itself.
In case you just want the value and not the attribute name, try string(//*[@class]/@class) instead, which returns the value of the first match.
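A minimal sketch of the same extraction in Python with the standard-library ElementTree follows. ElementTree cannot return attribute nodes directly, so the elements are selected with [@class] and the value is read with get(); the ul wrapper and the second li are invented for the example.

```python
import xml.etree.ElementTree as ET

# The li from the question plus a hypothetical sibling without a class
doc = ET.fromstring(
    '<ul><li class="tab-off" id="navList0"/><li id="navList1"/></ul>'
)

# Select every element that has a class attribute, then read its value
classes = [el.get("class") for el in doc.findall(".//*[@class]")]

# Rough counterpart of string(...): the first value, or "" if nothing matched
first = classes[0] if classes else ""

print(classes, first)
```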
If you are specifically grabbing the data from an li tag, you can do this:
//li[@class]
and loop through the result set to find the one whose class attribute is "tab-off". Or
//li[@class='tab-off']
if you're in a position to hard-code the value.
I assume you have already put your file through an XML parser like a DOMParser. This will make it much easier to extract any other values you may need on a specific tag.