Find xpath nearest element - html

I need to build an XPath for an element that sits just before another element on the page. I have a string (FIO) that I can find using XPath, and I need to anchor on it. I don't understand how to do it.
This is the XPath I can find on the page:
/html/body/div[1]/div[2]/section/div/div[1]/div/ul/li[2]//div[1]/span[contains(., '"+FIO+"')]
Look at the screenshot: I need to find string 1, which has this XPath:
/html/body/div[1]/div[2]/section/div/div[1]/div/ul/li[2]/ul/li[4]/ul/li[1]/div/div/a
String 2, which contains my parameter (FIO), has this XPath:
/html/body/div[1]/div[2]/section/div/div[1]/div/ul/li[2]/ul/li[4]/ul/li[1]/div/div/div[1]/span
I shortened it and inserted a variable:
/html/body/div[1]/div[2]/section/div/div[1]/div/ul/li[2]//div[1]/span[contains(., '"+FIO+"')]
How can I get an XPath to element 2 by anchoring on element 1? Maybe with following-sibling?
Sorry, I can't copy the code correctly, only like this:
</div>
</div>
<ul>
<li>
<div class="structure2__item1">
<div class="structure2__item2" style="">
<a class="structure2__position" href=https://**>
"String 2"
</a>
<div class="structure2__name" style="">
<span>String_FIO</span>
</div>
</div>
</div>
</li>
<li>

//div[child::span[contains(text(), "String_FIO")]]/preceding-sibling::a
This selects the a element that comes before the div containing the matching span, within the same parent.
(Next time, please follow the posting standards mentioned in the comments.)
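A rough way to try the idea offline, using only Python's standard library. This is a sketch under assumptions: xml.etree.ElementTree does not support the preceding-sibling axis or contains(), so it swaps in an equivalent expression for this markup (match the div by its span text, step up with .., then take the a child). The fragment is a well-formed stand-in for the pasted HTML, the href is left out because the original URL is elided, and position_for is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the fragment in the question (assumption);
# the href attribute is omitted because the original URL is elided.
FRAGMENT = """
<ul>
  <li>
    <div class="structure2__item1">
      <div class="structure2__item2">
        <a class="structure2__position">String 2</a>
        <div class="structure2__name">
          <span>String_FIO</span>
        </div>
      </div>
    </div>
  </li>
</ul>
"""


def position_for(root, fio):
    """Return the <a> next to the div whose <span> text equals fio, or None.

    Hypothetical helper. ElementTree compares the span's complete text, so
    this is an exact match rather than contains(). The value is pasted into
    the expression, so it must not contain a single quote.
    """
    # div[span='...'] -> the name div; .. -> shared parent; a -> the link
    return root.find(f".//div[span='{fio}']/../a")


root = ET.fromstring(FRAGMENT)
link = position_for(root, "String_FIO")
print(link.text)
```

In a browser or Selenium, where full XPath 1.0 is available, the preceding-sibling expression above can be used directly.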

Related

Scrapy, how to extract a subtext from <b>

I have HTML like this:
<section id="SECTION_A">
<h4>List</h4>
<a style="text-decoration: none;" href="#list" data-toggle="collapse">
<div class="ITEM">
TEXT
</div>
</a>
<div id="IDENTIFICATION" class="collapse">
</div>
<a style="text-decoration: none;" href="#list" data-toggle="collapse">
<div class="ITEM2">
TEXT2
</div>
</a>
<div id="IDENTIFICATION2" class="collapse">
<div><b>TITLE</b>: CONTENT</div>
<div><b>TITLE2</b>: CONTENT2</div>
</div>
</section>
I've stored it in a selector like this, because the HTML has several sections with similar structure, tags and repeated data:
sectionA = response.xpath('//section[@id="SECTION_A"]')
Now, I want to extract the ITEMS and their IDENTIFICATIONS and write them into a file.
Extracting the ITEM gave no problem with:
item = sectionA.xpath('.//div/@class[contains(.,"ITEM")]').extract()
And it returns:
[u'ITEM', u'ITEM2']
But I cannot extract the TEXT of the ITEMS, I've tried:
item = sectionA.xpath('.//div/@class[contains(.,"ITEM")]/text()').extract()
But it returns an empty list.
I'm also unable to extract the IDENTIFICATIONS. One problem is that they may have no content or several entries, so I've tried to extract a selector for them from the sectionA selector like this:
identifications = sectionA.xpath('.//div/@id[contains(.,"IDENTIFICATION")]')
It returns a selector similar to sectionA, but when I search in it with the following I get nothing:
for id in identifications:
    title = signature.xpath('.//div')
I've tried several combinations like .//div/b or .//b or just .// but I got nothing.
Anyone know how I can get the ITEM-TEXT and IDENTIFICATIONS-CONTENT from an html like this?
The problem is not in the steps you applied; it is a logical mistake. You are not getting the text inside the 'ITEM' class because of an extra / in the expression.
In the code that you wrote:
item = sectionA.xpath('.//div/@class[contains(.,"ITEM")]').extract()
it returns [u'ITEM', u'ITEM2'] because of the / before @class in .//div/@class, which means: return the value of every class attribute that contains the substring "ITEM". When you append /text(), the expression is still pointing at the @class attribute, and an attribute has no text node beneath it, so it returns [].
What you instead want to do is :
item = sectionA.xpath('.//div[contains(@class,"ITEM")]/text()').extract()
Here the output of sectionA.xpath('.//div[contains(@class,"ITEM")]') is a list of selectors for the div elements themselves:
[<Selector xpath='.//div[contains(@class,"ITEM")]' data=u'<div class="ITEM">'>, <Selector xpath='.//div[contains(@class,"ITEM")]' data=u'<div class="ITEM2">'>]
A similar mistake is made in the extraction of the "IDENTIFICATIONS": .//div/@id[...] selects the id attributes rather than the div elements, and an attribute has nothing beneath it to search, so any further query on it returns nothing. There is a second, more serious logical problem: the loop variable is id, but the query is run on signature. Also note that .//div in title = signature.xpath('.//div') matches every descendant div at any depth, not only the direct children (and without the leading dot, //div would search the whole document). This may not be a problem unless other divs are nested deeper than the ones you want. So a better way is something like the following, adapted to your requirements:
>>> identification = sectionA.xpath('.//div[contains(@id,"IDENTIFICATION")]')
>>> for ident in identification:
...     print(ident.xpath('div/b').extract())
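To see how the relative path div/b behaves on both an empty and a filled IDENTIFICATION block without running a spider, here is a small sketch with Python's standard xml.etree.ElementTree instead of Scrapy (a swapped-in library, not what the question uses). ElementTree has no contains(), so the substring check on the id is done in plain Python; the trimmed-down SECTION string and the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the section from the question (assumption:
# only the IDENTIFICATION blocks are kept).
SECTION = """
<section id="SECTION_A">
  <div id="IDENTIFICATION" class="collapse"></div>
  <div id="IDENTIFICATION2" class="collapse">
    <div><b>TITLE</b>: CONTENT</div>
    <div><b>TITLE2</b>: CONTENT2</div>
  </div>
</section>
"""

section = ET.fromstring(SECTION)
pairs = []
for block in section.findall("div"):
    # Keep only the divs whose id contains "IDENTIFICATION".
    if "IDENTIFICATION" not in block.get("id", ""):
        continue
    # "div/b" is relative to the current block: child div, then child b.
    # An empty block simply yields no matches.
    for b in block.findall("div/b"):
        # The text after </b> is stored as the tail of the b element.
        content = (b.tail or "").lstrip(": ").strip()
        pairs.append((block.get("id"), b.text, content))

print(pairs)
```

In Scrapy itself, the same split between title and content would need separate queries for the b text and the text that follows it.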

xpath working in chrome console but not in protractor script

html:
<div class="view doc">
<div class="view-doc-heading-dec mt10 ng-binding" id="docSummaryHeader"> Document Title </div>
<div class="view-doc-inner mt11 ng-binding" id="docBodyHeader">
</div>
I want to retrieve 'Document Title' from the elements above with XPath:
$x('//*[@id="docSummaryHeader"]')[0]
This works in the Chrome console, but
element(by.xpath('//*[@id="docSummaryHeader"]'))
in Protractor doesn't allow [0].
If I use
element(by.xpath('//*[@id="docSummaryHeader"]'))
it matches multiple elements in the current HTML.
Find all elements and get the desired one by index:
element.all(by.xpath('//*[@id="docSummaryHeader"]')).get(0);
or:
element.all(by.xpath('//*[@id="docSummaryHeader"]')).first();
Or you can use XPath indexing (1-based); the parentheses make the index apply to the whole result set rather than to each parent's children:
element(by.xpath('(//*[@id="docSummaryHeader"])[1]'))
Actually you don't need xpath here:
$$('#docSummaryHeader').first();
Consider using CSS selector instead.

double slash for xpath. Selenium Java Webdriver

I am using Selenium WebDriver. I have a doubt about the xpath.
If I have the following code example:
<div>
<div>
<div>
<a>
<div>
</div>
</a>
</div>
</div>
</div>
And I want to locate the element which is in the last <div>. I think I have 2 options with the xpath.
First option is with single slash:
driver.findElement(By.xpath("/div/div/div/a/div")).click();
Second option is using double slash (and here is where I have the doubt).
driver.findElement(By.xpath("//a/div")).click();
Does it search for the <a> directly? What happens if the HTML example is just part of a larger document that contains more <a> elements? Where exactly would this expression look?
What happens, for example, if I do it like this:
driver.findElement(By.xpath("//div")).click();
Would it look at every <div> found in the HTML code?
First of all, avoiding // is usually the right thing to do - so, the first expression you show is perfect.
Would it look at every <div> found in the HTML code?
Yes, exactly. An XPath expression like
//div
will select all div elements in the document, regardless of where they are.
What happens if the HTML example is just part of a larger document that contains more <a> elements? Where exactly would this expression look?
Then, let us make the HTML "bigger":
<div>
<a>
<p>C</p>
</a>
<div>
<div>
<a>
<div>A</div>
</a>
</div>
<a>
<div>B</div>
</a>
</div>
</div>
As you can see, I have added two more a elements - only one of them contains a div element. Assuming this new document as the input, there will now be a difference between
/div/div/div/a/div
which will select only <div>A</div> as the result, and
//a/div
which will select both <div>A</div> and <div>B</div>, because the exact position of a in the tree is now irrelevant. Neither expression selects anything from the first a element, because it contains a p rather than a div.
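The difference can be checked quickly outside the browser. The sketch below uses Python's standard xml.etree.ElementTree rather than Selenium (an assumption made so it runs without a driver). ElementTree evaluates paths relative to an element, so with the outermost div as root, ./div/div/a/div stands in for /div/div/div/a/div and .//a/div stands in for //a/div.

```python
import xml.etree.ElementTree as ET

# The "bigger" document from the answer above.
DOC = """
<div>
  <a><p>C</p></a>
  <div>
    <div>
      <a><div>A</div></a>
    </div>
    <a><div>B</div></a>
  </div>
</div>
"""

root = ET.fromstring(DOC)  # root is the outermost <div>

# Fixed path: every step must match one level of the tree.
fixed = [d.text for d in root.findall("./div/div/a/div")]

# Descendant search: any <a> at any depth, then its <div> children.
anywhere = [d.text for d in root.findall(".//a/div")]

print(fixed)
print(anywhere)
```

Note that Selenium's findElement returns only the first match in document order, so with //a/div a click would land on <div>A</div> here; findElements would return both.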

Selenium - Find a child element under a DIV

Can someone help me with the below HTML:
<div id="ext-156" class="menuBar">
<a id="ext-234" href="javascript:void(0);" class="active">
<i id="ext-365" class="menuItem"></i>
</a>
</div>
I am looking for the element with class "menuItem" and only from inside the div with class "menuBar" in Selenium.
Well, depending on what language you're using, the method call will be different, but the selector should be the same across language bindings:
css:
"div.menuBar .menuItem"
xpath:
"//div[@class='menuBar']//*[@class='menuItem']"
In Java, the call would look like this:
driver.findElement(By.cssSelector("div.menuBar .menuItem"));
You can use XPath: //div[@class='menuBar']//*[@class='menuItem'].
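To illustrate what the div[@class='menuBar'] prefix buys you, here is a small sketch with Python's standard xml.etree.ElementTree (used here only so the example runs without a browser). The wrapping body element and the extra i element with id "outside" are hypothetical additions, there to show an element with the same class that the scoped expression should skip.

```python
import xml.etree.ElementTree as ET

# The markup from the question, wrapped in <body>, plus one hypothetical
# menuItem element outside the menu bar (id="outside" is an assumption).
PAGE = """
<body>
  <i id="outside" class="menuItem"></i>
  <div id="ext-156" class="menuBar">
    <a id="ext-234" href="javascript:void(0);" class="active">
      <i id="ext-365" class="menuItem"></i>
    </a>
  </div>
</body>
"""

root = ET.fromstring(PAGE)

# Scoped: only menuItem elements somewhere inside the menuBar div.
scoped = [e.get("id") for e in
          root.findall(".//div[@class='menuBar']//*[@class='menuItem']")]

# Unscoped: every menuItem element in the document.
unscoped = [e.get("id") for e in root.findall(".//*[@class='menuItem']")]

print(scoped)
print(unscoped)
```

One caveat: @class='menuBar' is an exact string comparison, so it would miss an element with class="menuBar wide", whereas the CSS selector div.menuBar would still match it.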

Xpath get element with condition

I have a block of code and need to get data out of it. I've tried different versions of XPath expressions, but with no success.
<div>
<div class="some_class">
<a title="id" href="some_href">
<nobr>1<br>
</a>
</div>
<div class="some_other_class">
<a title="name" href="some_href">
<nobr>John<br>
</a>
</div>
</div>
<div>
<div class="some_class">
<a title="id" href="some_href">
<nobr>2<br>
</a>
</div>
<div class="some_other_class">
<a title="name" href="some_href">
<nobr>John<br>
</a>
</div>
</div>
// and many blocks like this
So these div blocks are identical except for the content of their sub-elements. I need an XPath query that gets John's href from the block whose <a title="id"> equals 1.
I've tried something like this:
//div[./div/nobr='1' AND ./div/nobr='John']
to get only the div that contains the data I need; after that it wouldn't be hard to get John's href.
I've also managed to get John's href with:
//a[./nobr='John'][@title='name']/@href
but that way it doesn't depend on the value of the <a title="id"...> element, and it has to depend on it.
Any suggestions?
I think what you want is
//div/div[a/@title='id']/following-sibling::div[1]/a/@href
which, given a well-formed input document, will return (individual results separated by --------):
href="some_href"
-----------------------
href="some_href"
You did not explain it very clearly though, as kjhughes has noted, and perhaps your sample HTML is not ideal.
Regarding your attempted path expressions, as the input is HTML, it is hard to know whether
<nobr>John<br>
means that "John" is inside the nobr element or not.
Thanks Mathias, your example was helpful, but since there are many elements with @title='id', it isn't a reliable solution that will always catch the right elements.
I've managed a workaround: first catch the whole div, then extract the href I need.
//div[./div/a[@title='name']/nobr='John' and ./div/a[@title='id']/nobr='1']
//a[./nobr='John'][@title='name']/@href
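The two-step workaround can be sketched in Python with the standard xml.etree.ElementTree, under some loud assumptions: the input has been made well-formed with the text inside nobr (the ambiguity mentioned above), the blocks sit under a made-up root element, and the href values are invented so the two blocks can be told apart. name_href is a hypothetical helper name. ElementTree has no "and" operator in predicates, so the two conditions are checked one after the other per block.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the blocks in the question. The <root> wrapper and
# the distinct href values are assumptions made for illustration.
DOC = """
<root>
  <div>
    <div class="some_class"><a title="id" href="id_href_1"><nobr>1</nobr></a></div>
    <div class="some_other_class"><a title="name" href="name_href_1"><nobr>John</nobr></a></div>
  </div>
  <div>
    <div class="some_class"><a title="id" href="id_href_2"><nobr>2</nobr></a></div>
    <div class="some_other_class"><a title="name" href="name_href_2"><nobr>John</nobr></a></div>
  </div>
</root>
"""


def name_href(root, id_value, name):
    """Return the href of the name link in the block with the given id."""
    for block in root.findall("div"):
        # Step 1: does this block hold both the wanted id and the wanted name?
        id_link = block.find(f"div/a[@title='id'][nobr='{id_value}']")
        name_link = block.find(f"div/a[@title='name'][nobr='{name}']")
        if id_link is not None and name_link is not None:
            # Step 2: read the href from the name link of that same block.
            return name_link.get("href")
    return None


root = ET.fromstring(DOC)
print(name_href(root, "1", "John"))
print(name_href(root, "2", "John"))
```

With a full XPath 1.0 engine the same thing fits in one expression by appending the second path to the first, keeping both conditions in the outer predicate.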