Why is contains(text(), "string" ) not working in XPath? - html

I have written this expression //*[contains(text(), "Brand:" )] for the below HTML code.
<div class="info-product mt-3">
<h3>Informazioni prodotto</h3>
Brand: <span class="brand_title font-weight-bold text-uppercase">Ava</span><br> SKU: 8002910009960<br> Peso Lordo: 0.471 kg <br> Dimensioni: 44.00 × 145.00 × 153.00 mm<br>
<p class="mt-2">
AVA BUCATO A MANO E2 GR.380</p>
</div>
The XPath that I have written is not working. I want to select the node that contains the text Brand:. Can someone tell me my mistake?

Your XPath,
//*[contains(text(), "Brand:")]
in XPath 1.0 will select all elements whose first text node child contains a "Brand:" substring. In XPath 2.0 it is an error to call contains() with a sequence of more than one item as the first argument.
This XPath,
//*[text()[contains(., "Brand:")]]
will select all elements with a text node child whose string value contains a "Brand:" substring.
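To make the difference concrete, here is a minimal sketch that emulates both tests in plain Python with xml.dom.minidom (the standard library has no full XPath engine, so this is an imitation of the semantics, not an XPath evaluation). The markup is a shortened, well-formed version of the question's HTML, with `<br>` written as `<br/>`; the helper names are made up for the illustration. The div's first text node is just the line break before `<h3>`, which is why the original expression finds nothing.

```python
from xml.dom import minidom

# Simplified, well-formed version of the question's markup (assumption:
# <br> is written as <br/> so that a plain XML parser accepts it).
SNIPPET = """<div class="info-product mt-3">
<h3>Informazioni prodotto</h3>
Brand: <span class="brand_title">Ava</span><br/> SKU: 8002910009960<br/>
<p class="mt-2">AVA BUCATO A MANO E2 GR.380</p>
</div>"""


def text_children(element):
    """Return the data of every text node that is a direct child of element."""
    return [n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE]


def first_text_contains(element, needle):
    # Mirrors XPath 1.0 contains(text(), needle): only the first text node counts.
    texts = text_children(element)
    return bool(texts) and needle in texts[0]


def any_text_contains(element, needle):
    # Mirrors text()[contains(., needle)]: any text node child may match.
    return any(needle in t for t in text_children(element))


div = minidom.parseString(SNIPPET).documentElement
print(repr(text_children(div)[0]))         # the first text node of the div
print(first_text_contains(div, "Brand:"))  # the question's expression
print(any_text_contains(div, "Brand:"))    # the corrected expression
```

With this snippet the first check fails and the second succeeds, which matches the behavior described above.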
See also
XPath 1.0 vs 2.0+ different contains() behavior explanation
Testing text() nodes vs string values in XPath

Related

How to Xpath from Form

I have html code like:
<form class="variations_form cart" action="https://example.com/name-of-product" method="post" enctype='multipart/form-data' data-product_id="386" data-product_variations="[{"attributes":{"attribute_pa_czas-realizacji":"24h"},"availability_html":"<p class=\"stock out-of-stock\">Brak w magazynie<\/p>\n","backorders_allowed":false,"dimensions":{"length":"","width":""}]">
I would like to extract "Brak w magazynie".
I have tried xpath:
//*[text() = 'Brak w magazynie']
but it doesn't work. Any idea how to do it? :)
You can use the following XPath expressions to locate this element:
//form[@class='variations_form cart']
Or
//form[@action='https://example.com/name-of-product']
Or
//form[@action='https://example.com/name-of-product' and @class='variations_form cart']
Then extract the text from the element that was found.
Update:
If you want to select elements whose data-product_variations attribute contains Brak w magazynie, you can use an XPath like this:
//form[@class='variations_form cart' and contains(@data-product_variations, 'Brak w magazynie')]
Or
//form[@action='https://example.com/name-of-product' and contains(@data-product_variations, 'Brak w magazynie')]
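Note that the phrase is not a text node at all: it sits inside a JSON string stored in the data-product_variations attribute, which is why //*[text() = 'Brak w magazynie'] matches nothing. XPath can locate the form, but pulling the phrase out takes a decoding step afterwards. A rough sketch of that step in Python, assuming the attribute value has already been read and is valid JSON (ATTR_VALUE below is a shortened, hypothetical stand-in for the real value, and the helper names are invented for the example):

```python
import json
from html.parser import HTMLParser

# Hypothetical, shortened attribute value: what a parser would hand back for
# data-product_variations after decoding HTML entities (assumed valid JSON).
ATTR_VALUE = (
    '[{"attributes": {"attribute_pa_czas-realizacji": "24h"}, '
    '"availability_html": "<p class=\\"stock out-of-stock\\">Brak w magazynie<\\/p>\\n", '
    '"backorders_allowed": false}]'
)


class TextOnly(HTMLParser):
    """Collects the text content of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def availability_texts(attr_value):
    """Return the plain-text availability message of every variation."""
    texts = []
    for variation in json.loads(attr_value):
        parser = TextOnly()
        parser.feed(variation.get("availability_html", ""))
        parser.close()
        texts.append("".join(parser.parts).strip())
    return texts


print(availability_texts(ATTR_VALUE))
```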

HTML: How to refer to span.title inside a class?

I am building a webscraper and I have this block of HTML code:
<div class = 'example-1'>
<ul class = 'example-2'>
<li>
<span title = 'data1' > 155 </span>
/
<span title = 'data2' > 155 </span>
And I want to scrape the numbers 155 and 145 inside the span title
In my code using scrapy, I identified this as:
'size': detail.css('ul.example-2 ::text').get(),
but it is not returning me anything. How do I fix this?
The correct CSS selectors are:
span[title="data1"]
span[title="data2"]
Alternatively, you can select both at the same time with:
span[title^="data"]
I am unfamiliar with scrapy syntax, but I believe your scrapy selector should look something like this:
response.css('span[title^="data"]::text').getall()
Further info:
In CSS, square brackets denote an attribute selector.
You can select:
an element that has an attribute: span[title]
an element with a specific attribute value: span[title="data1"]
an element whose attribute value starts with a pattern: span[title^="data"]
an element whose attribute value ends with a pattern: span[title$="1"]
and more.
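For anyone who wants to check the prefix match without installing scrapy, here is a rough standard-library imitation of what span[title^="data"]::text collects. It is only a sketch: html.parser is not a CSS engine, the class name SpanTitlePrefix is made up, the closing tags in the markup are assumed, and nested spans are not handled.

```python
from html.parser import HTMLParser

# Cleaned-up version of the question's markup (closing tags are assumed).
HTML = """<div class='example-1'>
<ul class='example-2'>
<li>
<span title='data1'> 155 </span>
/
<span title='data2'> 155 </span>
</li></ul></div>"""


class SpanTitlePrefix(HTMLParser):
    """Collects text inside <span> elements whose title starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.inside = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        title = dict(attrs).get("title") or ""
        if tag == "span" and title.startswith(self.prefix):
            self.inside = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside = False

    def handle_data(self, data):
        # Text outside a matching span (such as the "/" separator) is ignored.
        if self.inside:
            self.texts[-1] += data


parser = SpanTitlePrefix("data")
parser.feed(HTML)
parser.close()
print([t.strip() for t in parser.texts])
```

The values come back with their surrounding spaces, so they are stripped before printing; the same is likely to be needed on the scrapy side.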

(Reverse) Traverse XPath Query for Accessing a DIV with a particular Text Value

Working with a DOM that has the same HTML loop 100+ times that looks like this
<div class="intro">
<div class="header">
<h1 class="product-code"> <span class="code">ZY001</span> <span class="intro">ZY001 Title/Intro</span> </h1>
</div>
<div>
<table>
<tbody>
<tr>
<td>Available</td>
<td> S </td>
<td> M </td>
<td> XL </td>
</tr>
I was previously using this XPath query to get all the node values back (all 100+ instances, along with the variable number of td nodes that follow Available):
//div[@class='intro']/div/table/tbody/tr/td[contains(text(),'Available')]/following-sibling::td
object(DOMNodeList)[595]
public 'length' => int 591
Now I need to target the product-code / code specifically, to retrieve all the td values for a particular code.
Because the div that contains the unique identifier (in the example above, ZY001) is not a direct ancestor, my thinking is that I have to do a reverse XPath query.
Here's one of my attempts:
//h1[@class='product-code']/span[contains(@class, 'code') and text() = 'ZY001']/../../div[@class='intro']/div/table/tbody/tr/td[contains(text(),'Available')]/following-sibling::td
As I am defining /span[contains(@class, 'code') and text() = 'ZY001'] and then attempting to traverse the DOM backwards twice using /../../, I was hoping/expecting to get back the div[@class='intro'] with the text ZY001 immediately above it, or rather a public 'length' => int 1.
But all my attempts thus far have resulted in 0 results. Not false, indicating an improper XPath, but 0.
How can I modify my XPath Query to get back the single instance in the one-of-many <div class="intro">'s that contain the <h1 class="product-code">/<span class="code"> text value ZY001?
Use
//h1[@class='product-code']/span[contains(@class, 'code') and text() = 'ZY001']/../../../div/table/tbody
instead of
//h1[@class='product-code']/span[contains(@class, 'code') and text() = 'ZY001']/../../div[@class='intro']/div/table/tbody
The span's parent is the h1 and its grandparent is div.header, so a third .. is needed to reach the div.intro itself.
You can use any of the XPaths below for that:
//div[@class='intro' and .//h1[@class='product-code']/span[@class='code' and text()='ZY001']]//tbody/tr[td[text()='Available']]/td[2]
//div[@class='intro' and .//span[@class='code' and text()='ZY001']]//tbody/tr[td[text()='Available']]/td[2]
//div[@class='intro' and .//span[@class='code' and text()='ZY001']]//tr[td[text()='Available']]/td[2]
The leading dot in .// matters: it keeps the inner test relative to each div, whereas a bare // inside the predicate searches the whole document and would match every div.
Change td[2] to td[3] and td[4] to get the 3rd and 4th td respectively.
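The "find the block first, then search inside it" idea can be sketched outside XPath as well. Below is a small Python version using xml.etree.ElementTree; it is not the PHP DOM code from the question. The second product block (ZY002 and its size) is invented for the illustration, and the helper name sizes_for is hypothetical.

```python
import xml.etree.ElementTree as ET

# Two product blocks in the question's layout; the second code (ZY002)
# and its size are made up for the illustration.
DOC = """<root>
<div class="intro">
  <div class="header">
    <h1 class="product-code"><span class="code">ZY001</span> <span class="intro">ZY001 Title/Intro</span></h1>
  </div>
  <div><table><tbody><tr>
    <td>Available</td><td> S </td><td> M </td><td> XL </td>
  </tr></tbody></table></div>
</div>
<div class="intro">
  <div class="header">
    <h1 class="product-code"><span class="code">ZY002</span> <span class="intro">ZY002 Title/Intro</span></h1>
  </div>
  <div><table><tbody><tr>
    <td>Available</td><td> L </td>
  </tr></tbody></table></div>
</div>
</root>"""


def sizes_for(root, code):
    """Return the cells that follow 'Available' inside the block for `code`."""
    for block in root.iter("div"):
        if block.get("class") != "intro":
            continue
        # Relative search: only spans inside this particular block are tested.
        codes = [s.text.strip()
                 for s in block.findall(".//span[@class='code']") if s.text]
        if code not in codes:
            continue
        for row in block.iter("tr"):
            cells = [(td.text or "").strip() for td in row.findall("td")]
            if cells and cells[0] == "Available":
                return cells[1:]
    return []


root = ET.fromstring(DOC)
print(sizes_for(root, "ZY001"))
```

Each block is tested on its own, so a code that appears in one block never causes cells from another block to be returned.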

Selenium WebDriver how to verify Text from Span Tag

I'm trying to verify the text in the span by using WebDriver. There is the span tag:
<span class="value">
/Company Home/IRP/tranzycja
</span>
I tried something like this:
driver.findElement(By.xpath("//span[@id='/Company Home/IRP/tranzycja']'"));
driver.findElement(By.cssSelector("span./Company Home/IRP/tranzycja"));
but neither of these works.
Any help would be really appreciated. Thanks
More code:
<span id="uniqName_64_0" class="alfresco-renderers-PropertyLink alfresco-renderers-Property pointer small" data-dojo-attach-point="renderedValueNode" widgetid="uniqName_64_0">
<span class="inner" tabindex="0" data-dojo-attach-event="ondijitclick:onLinkClick">
<span class="label">
In folder:
</span>
<span class="value">
/Company Home/IRP/tranzycja
</span>
</span>
uniqName shouldn't be used as a target because there are a lot of them and they change.
Here is the full HTML code:
http://www.filedropper.com/spantag
Here I am assuming you are trying to verify the text in the span tag,
i.e. '/Company Home/IRP/tranzycja'.
Try the code below:
String expected_String = "/Company Home/IRP/tranzycja";
String actual_String = driver.findElement(By.xpath("//span[@class='alfresco-renderers-PropertyLink alfresco-renderers-Property pointer small']//span[@class='value']")).getText();
if (expected_String.equals(actual_String)) {
    System.out.println("Text is Matched");
} else {
    System.out.println("Text is not Matched");
}
You can try using XPath ('some text' can be replaced by a variable, like @Rupesh suggested):
driver.findElement(By.xpath("//span/span[@class='value'][normalize-space(.) = 'some text']"))
or
driver.findElement(By.xpath("//span/span[@class='value'][contains(text(),'some text')]"))
(Be aware that findElement returns only the first match. With contains(), span elements with the text 'some text 1' and 'some text 2' both match, and only the first occurrence will be returned.)
Of course, both calls will throw NoSuchElementException if an element with the given text is not found on the page. If you're using Java, you can easily catch that exception and print a proper message if needed.
One possible xpath to find that <span> element :
//span[normalize-space(.) = '/Company Home/IRP/tranzycja']
I think you're going to want to use something like
driver.findElement(By.xpath("//span[@class='value']")).getText();
The getText() call will get the text within that span.
You can use the text() node test inside the XPath, wrapped in normalize-space() because the span's text is surrounded by line breaks. I hope this will resolve your problem.
String str1 = driver.findElement(By.xpath("//span[normalize-space(text())='/Company Home/IRP/tranzycja']")).getText();
System.out.println(str1);
Output = /Company Home/IRP/tranzycja
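To see why an exact comparison against the span's raw text tends to fail here while normalize-space() matches, a tiny Python approximation may help. It is a sketch only: the function below imitates normalize-space() using Python's own notion of whitespace (slightly broader than XPath's), and the raw string assumes the line breaks shown in the question's markup.

```python
def normalize_space(text):
    """Approximate XPath normalize-space(): trim and collapse whitespace runs."""
    return " ".join(text.split())


# Text content of the span as it appears in the page source, line breaks included.
raw = "\n/Company Home/IRP/tranzycja\n"
expected = "/Company Home/IRP/tranzycja"

print(raw == expected)                   # exact comparison, like text()='...'
print(normalize_space(raw) == expected)  # like normalize-space(.)='...'
```

The first comparison is false because of the surrounding line breaks; the second is true once they are trimmed.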

how to retrieve data from html between <span> and </span>

I want to get the rating, from 1 to 5, in Amazon customer reviews.
I checked the source and found that this part looks like:
<div style="margin-bottom:0.5em;">
<span style="margin-right:5px;"><span class="swSprite s_star_5_0 " title="5.0 out of 5 stars" ><span>5.0 out of 5 stars</span></span> </span>
<span style="vertical-align:middle;"><b>Works great right out of the box with Surface Pro</b>, <nobr>October 5, 2013</nobr></span>
</div>
I want to get 5.0 out of 5 stars from
<span>5.0 out of 5 stars</span></span> </span>
How can I use xpathSApply to get it?
Thank you!
I would recommend using the selectr package, which uses css selectors in place of xpath.
library(XML)
doc <- htmlParse('
<div style="margin-bottom:0.5em;">
<span style="margin-right:5px;">
<span class="swSprite s_star_5_0 " title="5.0 out of 5 stars" >
<span>5.0 out of 5 stars</span></span> </span>
<span style="vertical-align:middle;">
<b>Works great right out of the box with Surface Pro</b>,
<nobr>October 5, 2013</nobr></span>
</div>', asText = TRUE
)
library(selectr)
xmlValue(querySelector(doc, 'div > span > span > span'))
UPDATE: If you are looking to use xpath, you can use the css_to_xpath function in selectr to figure out the appropriate xpath command, which in this case turns out to be
"descendant-or-self::div/span/span/span"
I do not know R well, but I can give you the XPath string. It seems you want the text of the first span that has no attributes, which would be:
//span[not(@*)][1]/text()
You can pass this string to xpathSApply.
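For comparison, the same child-step path can be tried outside R. The sketch below uses Python's xml.etree.ElementTree on the snippet from the question (it happens to be well-formed XML, which is an assumption about the real page), and also shows that the rating is repeated in the title attribute. The variable names are made up.

```python
import xml.etree.ElementTree as ET

SNIPPET = """<div style="margin-bottom:0.5em;">
<span style="margin-right:5px;"><span class="swSprite s_star_5_0 " title="5.0 out of 5 stars" ><span>5.0 out of 5 stars</span></span> </span>
<span style="vertical-align:middle;"><b>Works great right out of the box with Surface Pro</b>, <nobr>October 5, 2013</nobr></span>
</div>"""

div = ET.fromstring(SNIPPET)

# Path of child steps, equivalent to div/span/span/span from the CSS answer.
by_path = div.find("./span/span/span").text

# Alternative: the same rating is repeated in the title attribute.
by_title = div.find(".//span[@title]").get("title")

# The numeric part can then be split off if only the score is needed.
rating = float(by_path.split()[0])
print(by_path, by_title, rating)
```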