Get inner href of an href link with xpath - html

I just started out with Python and learning about xpath expressions.
I'm trying to select a div, find an a element with a given class inside it, check that element's href, extract part of the href, and then continue from there.
div class: dropdown-menu and a class: dropdown-item
My url: https://www.something.com/library/category/stuff
My xpath expression: response.xpath("//div[@class='dropdown-menu']//a[@class='dropdown-item']//a[contains(@href, 'category')]")
It just returns an empty string and I can't figure out why, please advise.

Since an <a> can't really be nested inside an <a>, I suppose you meant to write two conditions for the same <a> here:
response.xpath("//div[@class='dropdown-menu']//a[@class='dropdown-item']//a[contains(@href, 'category')]")
That would be written like this:
response.xpath("//div[@class='dropdown-menu']//a[@class='dropdown-item' and contains(@href, 'category')]")
or like this (predicates, i.e. the filter conditions in the square brackets, can be chained and are evaluated one after another):
response.xpath("//div[@class='dropdown-menu']//a[@class='dropdown-item'][contains(@href, 'category')]")
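To see the difference outside Scrapy, here is a minimal sketch that uses only Python's standard library. The markup is invented for illustration. Note that xml.etree.ElementTree implements only a subset of XPath (no contains()), so the substring test on href is done in Python here; with Scrapy's response.xpath the full expressions above can be used as written.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, made up to mirror the class names in the question.
HTML = """
<html><body>
  <div class="dropdown-menu">
    <a class="dropdown-item" href="/library/category/stuff">Stuff</a>
    <a class="dropdown-item" href="/library/about">About</a>
  </div>
</body></html>
"""
root = ET.fromstring(HTML)

# The original expression looks for an <a> nested inside the dropdown-item <a>.
# No such element exists, so nothing is selected.
nested = root.findall(
    ".//div[@class='dropdown-menu']//a[@class='dropdown-item']//a"
)

# Both conditions applied to the same <a>: the class test runs in XPath,
# the href substring test runs in Python (ElementTree has no contains()).
items = root.findall(".//div[@class='dropdown-menu']//a[@class='dropdown-item']")
hrefs = [a.get("href") for a in items if "category" in a.get("href", "")]

print(nested)
print(hrefs)
```

From there, the part of the href you need can be taken with ordinary string methods such as split("/").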

Related

Get Xpath for href from a tag based on span tag value

I am a beginner to xpath and I am unable to get XPath to get link from 'a' tag for below HTML code.
Get HREF value where span class value is "Upholstered" as shown in the snapshot.
Here, I want this value "/furniture/Bedrooms/Queen-Beds/_/N-8ddZ1z141u9?qf=styles_Upholstered" using Xpath.
Can you help me out please
According to your description, the relevant XPath query to get the href value where the class value is "Upholstered" will be something like:
//a[@class='Upholstered']/@href
However you forgot to add your actual HTML code (at least partial) so the above answer might not be 100% accurate.
Reference material:
XPath Language Specification
XPath Tutorial
Using the XPath Extractor in JMeter
Use the XPath below to extract the URL of your <a> tag:
//ul[@class='facetOptions']/li/a[@role='checkbox']/@href

Xpath - Retrieve text from within a span with mailto href

I have this piece of HTML code. I've already tried several xpath selectors but don't seem to be able to get the "Ask us" text from within the span with class "someClass".
<span class="someClass">Ask us</span>
Thanks in advance.
You can reach the text content of the link with "/text()".
This XPath snippet works for me on your example:
//span[@class="someClass"]/a/text()
string(//span[@class="someClass"])
If you want the string() function to concatenate all child text, you must pass it a single node instead of a node-set.
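The difference between selecting a text node and taking the string value of an element can be sketched with Python's standard library. The markup below is an assumption: the question's title mentions a mailto link inside the span, so a made-up <a href="mailto:..."> is placed there. ElementTree has no text() or string(), so el.text and itertext() stand in for them.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the mailto link and its address are invented.
HTML = """
<div>
  <span class="someClass"><a href="mailto:ask@example.com">Ask us</a></span>
</div>
"""
root = ET.fromstring(HTML)
span = root.find(".//span[@class='someClass']")

# Text that sits directly inside the span, before its first child.
# There is none here, because the text belongs to the nested <a>.
direct_text = span.text

# Rough counterpart of .../a/text(): the text of the nested link.
link_text = span.find("a").text

# Rough counterpart of string(span): all descendant text concatenated.
string_value = "".join(span.itertext())

print(direct_text, link_text, string_value)
```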

Parsing awful HTML: How do I recognize boundaries with xpath?

This is almost going to sound like a joke, but I promise you this is real life. There is a site on the internet, one which you have all used, that does not believe in css classes. Everything is defined directly in the style tag on an element. It's horrifying.
My problem though is that it also makes the html extraordinarily difficult to parse. The structure that I've got to go on looks something like this:
<td>
<a name="<random_string>"></a>
<div style="generic-style, used by other elements">
<div style="similarly generic style">{some_stuff}</div>
</div>
<a name="<random_string>"></a>
...
</td>
Basically, I've got these a tags that are forming the boundaries of the reviews, whose only defining information is the random string that is their name. I don't actually care about the anchor tags, but I would like to grab the reviews between them using xpath.
I've looked into sibling queries, but they don't seem to be well suited for alternating boundaries. I also looked into the Kayessian method of xpath queries, which (aside from having an awesome name) only seems well suited to grab a particular div, rather than all divs between the anchor tags.
Any thoughts on how I could grab the divs here?
If //td/div[../a[@name]] works for you, then the following should also work:
//td[a/@name]/div
This way you don't need to go back and forth (or rather down and up). For a more specific selector, you may want to try the following:
//td/div[preceding-sibling::*[1][self::a/@name]][following-sibling::*[1][self::a/@name]]
The XPath selects div elements having all of the following properties:
td/div : is a child of a <td> element
[preceding-sibling::*[1][self::a/@name]] : preceded directly by an <a> element having a name attribute
[following-sibling::*[1][self::a/@name]] : followed directly by an <a> element having a name attribute
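The three conditions listed above can also be checked by hand, which may help when verifying what the stricter selector picks. This is only a sketch in Python's standard library: ElementTree does not support the sibling axes, so the neighbours are compared in plain Python, and the markup (anchor names, styles, review text) is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup shaped like the structure in the question.
HTML = """
<table><tr><td>
  <a name="x1"></a>
  <div style="generic"><div style="generic">first review</div></div>
  <a name="x2"></a>
  <div style="generic"><div style="generic">second review</div></div>
  <a name="x3"></a>
  <div style="generic">trailing block, no anchor after it</div>
</td></tr></table>
"""


def is_named_anchor(el):
    """True for an <a> element that carries a name attribute."""
    return el.tag == "a" and el.get("name") is not None


td = ET.fromstring(HTML).find(".//td")
children = list(td)

# Keep each div whose immediate previous and next siblings are named anchors.
bounded = [
    el
    for prev, el, nxt in zip(children, children[1:], children[2:])
    if el.tag == "div" and is_named_anchor(prev) and is_named_anchor(nxt)
]
review_texts = ["".join(div.itertext()).strip() for div in bounded]

# The looser selector keeps every div child of the td.
loose = td.findall("./div")

print(review_texts)
print(len(loose))
```

In this sample the looser form returns three divs, while the neighbour check returns only the two that sit between anchors.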
I figured it out! It turns out that xpath will allow for relative attribute assertions. I am not sure if this behavior is desired, but it happens to work in this case! Here's the xpath:
//td/div[../a[@name]]
Nice and clean, the ../a[@name] basically just says:
Go up a level, and make sure on that level of the hierarchy there's an a element with a name attribute

Watir get INNER html of <span>

I'm trying to find a navigation link by iterating through a handful of spans with class 'menu-item-text'. My goal is to compare what is inside the span tags to see if it is the right navigation control to click (there are no hard ids to go by.) My code is like this:
navlinks = @browser.spans(:class, 'menu-item')
navlinks.each do |this|
  puts "'#{this.text}'"
  if this.text == link_name
    this.click
    break
  end
end
I know for sure I'm getting the correct elements. However, text is always an empty string. My second idea was to use .html instead of .text, but that returns something like this:
<span class="menu-item">Insights</span>
What I want is the "Insights" text inside the span, not the full html that includes the tag markup. I have also tried using this.span.text, but that did not work either.
How can I target exclusively the inner html of an element through watir's content grabbing methods?
Thanks!
Assuming you are using Watir-Webdriver v0.6.9 or later, an inner_html method is available for getting the inner HTML.
For the span:
<span class="menu-item">Insights</span>
You could do:
@browser.span(:class => 'menu-item').inner_html
#=> "Insights"
Similarly, you could try using this method in your loop instead of .text.
Note that depending on the uniqueness of your text, you might be able to simply check if the text appears in the element's (outer) HTML:
@browser.span(:class => 'menu-item', :html => /#{link_name}/).click

Xpath and innerHTML

What XPath expression can I use to find all the anchor (just 'a') elements whose actual text (the innerHTML) is Logout?
Something like
//a[@innerHTML='Logout']
Would that be correct?
No, it would be incorrect. innerHTML is a property, part of the object model, while XPath operates on tags and attributes. Unless your a tag actually has an attribute named innerHTML, this wouldn't work.
If you want to compare the value of the tag itself, you can use the . (dot) to refer to the tag:
a[.='Logout']
However, I must add, just in case you're using jQuery: I'm not sure if it will work with jQuery. jQuery does not support XPath fully, only basic stuff.
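For comparison outside the browser, here is a small sketch with Python's standard library (3.7 or later, where ElementTree understands the [.='text'] predicate). The links are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with three links, two of which read "Logout".
HTML = """
<div>
  <a href="/login">Login</a>
  <a href="/logout">Logout</a>
  <a href="/signout"><b>Log</b>out</a>
</div>
"""
root = ET.fromstring(HTML)

# innerHTML is not an attribute in the markup, so this selects nothing.
by_attribute = root.findall(".//a[@innerHTML='Logout']")

# The dot compares the element's full text content, descendants included.
by_text = root.findall(".//a[.='Logout']")
logout_hrefs = [a.get("href") for a in by_text]

print(by_attribute)
print(logout_hrefs)
```

The third link matches as well because the dot test looks at the concatenated text of the element and its descendants, not at the markup inside it.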