XPath testing that string ends with substring? - html

Given that the HTML contains:
<div tagname="779853cd-355b-4242-8399-dc15f95b3276_Destination" class="panel panel-default"></div>
How do we write the following expression in XPath:
Find a <div> element whose tagname attribute ends with the string 'Destination'
I've been searching for days and I can't come up with something that works. Among many, I tried for example:
div[contains(@tagname, 'Destination')]

XPath 2.0
//div[ends-with(@tagname, 'Destination')]
XPath 1.0
//div[substring(@tagname, string-length(@tagname)
- string-length('Destination') + 1) = 'Destination']
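The "+ 1" in the XPath 1.0 version is there because substring() counts positions from 1. Here is a minimal Python sketch of the same arithmetic. The helper names xpath_substring and ends_with are made up for illustration, and the stand-in only covers integer start positions, not the rounding and NaN rules of the real substring().

```python
def xpath_substring(s: str, start: int) -> str:
    """Rough stand-in for XPath 1.0 substring(s, start) with an integer start.

    XPath positions are 1-based, so position 1 is Python index 0.
    A start below 1 simply begins at the first character.
    """
    return s[max(start, 1) - 1:]


def ends_with(value: str, suffix: str) -> bool:
    # Same formula as the XPath 1.0 predicate:
    # substring(v, string-length(v) - string-length(suffix) + 1) = suffix
    start = len(value) - len(suffix) + 1
    return xpath_substring(value, start) == suffix


tagname = "779853cd-355b-4242-8399-dc15f95b3276_Destination"
print(ends_with(tagname, "Destination"))                  # True
print(ends_with(tagname + "_some_text", "Destination"))   # False
```

Without the "+ 1" the substring would start one character too early and the comparison would never succeed.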

You can use ends-with (XPath 2.0):
//div[ends-with(@tagname, 'Destination')]

XPath 2 or 3: There's always regex.
.//div[matches(@tagname,".*_Destination$")]
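Note that this pattern also requires an underscore before Destination, so a value that is exactly "Destination" would not match. Since matches() succeeds when the pattern is found anywhere in the string, the leading ".*" is optional and only the "$" anchor matters. A rough Python sketch of the difference, assuming Python's re behaves like the XPath regex dialect for patterns this simple:

```python
import re

values = [
    "779853cd-355b-4242-8399-dc15f95b3276_Destination",
    "Destination",
    "779853cd-355b-4242-8399-dc15f95b3276_Destination_some_text",
]

# re.search, like XPath's matches(), succeeds if the pattern matches anywhere
# in the string, so only the trailing "$" anchor pins it to the end.
with_underscore = [v for v in values if re.search(r".*_Destination$", v)]
suffix_only = [v for v in values if re.search(r"Destination$", v)]

print(len(with_underscore))  # 1
print(len(suffix_only))      # 2
```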

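You could use the XPath below, which works with XPath 1.0:
//div[string-length(substring-before(@tagname, 'Destination')) >= 0 and string-length(substring-after(@tagname, 'Destination')) = 0 and contains(@tagname, 'Destination')]
It requires that the attribute contains Destination, allows any text (or none) before the first occurrence of Destination, and requires that no text follows that occurrence. The first condition is always true, since a string length is never negative, so it can be dropped. Because substring-after works from the first occurrence, a value in which Destination appears more than once (for example Destination_Destination) is not matched.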
Test input :
<root>
<!--Ends with Destination-->
<div tagname="779853cd-355b-4242-8399-dc15f95b3276_Destination" class="panel panel-default"></div>
<!--just Destination-->
<div tagname="Destination" class="panel panel-default"></div>
<!--Contains Destination-->
<div tagname="779853cd-355b-4242-8399-dc15f95b3276_Destination_some_text" class="panel panel-default"></div>
<!--Doesn't contain destination-->
<div tagname="779853cd-355b-4242-8399-dc15f95b3276" class="panel panel-default"></div>
</root>
Test output:
<div class="panel panel-default"
tagname="779853cd-355b-4242-8399-dc15f95b3276_Destination"/>
<div class="panel panel-default" tagname="Destination"/>
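If you want to double-check the expected result outside an XPath engine, here is a small sketch using Python's standard xml.etree.ElementTree. Its XPath subset has no string functions, so the suffix test is done in Python with str.endswith instead; this is a cross-check, not a replacement for the expression above.

```python
import xml.etree.ElementTree as ET

TEST_INPUT = """
<root>
  <!--Ends with Destination-->
  <div tagname="779853cd-355b-4242-8399-dc15f95b3276_Destination" class="panel panel-default"></div>
  <!--just Destination-->
  <div tagname="Destination" class="panel panel-default"></div>
  <!--Contains Destination-->
  <div tagname="779853cd-355b-4242-8399-dc15f95b3276_Destination_some_text" class="panel panel-default"></div>
  <!--Doesn't contain destination-->
  <div tagname="779853cd-355b-4242-8399-dc15f95b3276" class="panel panel-default"></div>
</root>
"""

root = ET.fromstring(TEST_INPUT.strip())

# Keep the tagname of every div whose tagname attribute ends with the suffix.
matches = [
    div.get("tagname")
    for div in root.iter("div")
    if div.get("tagname", "").endswith("Destination")
]

for name in matches:
    print(name)
```

This prints the same two tagname values as the test output above.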

Another solution that is XPath 1.0 compatible:
//div[contains(concat(@tagname, 'UNIQUE'), concat('Destination', 'UNIQUE'))]
I used this approach to search a KeePass database with its XPath expression search feature. This expression:
//Entry/String[contains(concat(Key, 'UNIQUE'), '/UNIQUE')]
found all entries that have a custom string field whose Key ends in '/'.
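The trick works because the suffix followed by the marker can only be found at the very end of the value followed by the marker, as long as the marker text never occurs in the data itself. A short Python sketch of that logic (SENTINEL and ends_with_via_contains are illustrative names, not part of any library):

```python
SENTINEL = "UNIQUE"  # must be text that never occurs in the attribute values


def ends_with_via_contains(value: str, suffix: str) -> bool:
    # Mirrors contains(concat(value, 'UNIQUE'), concat(suffix, 'UNIQUE')):
    # the suffix followed by the sentinel can only be found at the very end.
    return (suffix + SENTINEL) in (value + SENTINEL)


print(ends_with_via_contains("3276_Destination", "Destination"))            # True
print(ends_with_via_contains("3276_Destination_some_text", "Destination"))  # False
# False positive when the value itself contains the sentinel:
print(ends_with_via_contains("DestinationUNIQUE_x", "Destination"))         # True
```

The last call shows the limitation: pick a marker that cannot appear in your data.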

Related

Extract class attribute using xpath

I have the following html:
<div class="g-recaptcha" data-sitekey="6LdWKrUUAAAAAP3b4V05YVzvFNJNAUrDb0RoJZf7" data-callback="reValidateP" data-expired-callback="reInvalidateP" style="clear:left;">
How can I extract sitekey value attribute via Xpath?
XPath 1.0 solution:
string(//div[@class="g-recaptcha"]/@data-sitekey)
Output: 6LdWKrUUAAAAAP3b4V05YVzvFNJNAUrDb0RoJZf7
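For comparison, a minimal sketch of the same lookup with Python's standard xml.etree.ElementTree. The closing </div> and the wrapping <body> element are added here only so the fragment parses as XML; a real page would usually need an HTML parser instead.

```python
import xml.etree.ElementTree as ET

# The closing </div> is an addition so that the fragment is well-formed XML.
html = (
    '<div class="g-recaptcha" '
    'data-sitekey="6LdWKrUUAAAAAP3b4V05YVzvFNJNAUrDb0RoJZf7" '
    'data-callback="reValidateP" data-expired-callback="reInvalidateP" '
    'style="clear:left;"></div>'
)
root = ET.fromstring(f"<body>{html}</body>")

# ElementTree supports the [@attr='value'] predicate; the attribute itself
# is then read with .get() rather than with an /@data-sitekey step.
div = root.find(".//div[@class='g-recaptcha']")
# Like string() in XPath, fall back to an empty string when nothing matches.
sitekey = div.get("data-sitekey", "") if div is not None else ""
print(sitekey)
```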

XPath to select link containing text?

I tried to use this XPath:
//*[contains(normalize-space(text()),'Jira')]
Also tried:
//*[contains(text(),'Jira')]
In the below HTML example, there is space before and after text "Jira". I am not able to click on the link:
<a href="#/crm/usergroup-edit?id=572a3c84e4b07f6189958700"
ng-repeat="gp in groups | filter : userGroupSearch | orderBy:'-name':1"
class="ng-scope">
<div class="inventoryPanel" ng-style="myStyle" style="width: 15.8%;">
<h4 class="ng-binding">
<div class="groupIcon G">
<div class="text ng-binding">P</div>
</div>Jira
</h4>
</div>
</a>
The following XPath selects all a elements whose string value contains the substring Jira:
//a[contains(.,'Jira')]
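The text() attempts fail because, in XPath 1.0, passing text() to contains() uses only the first text node of the element, and here that node is just whitespace; "Jira" sits in a later text node after the nested div. The "." form uses the element's whole string value instead. A rough illustration with Python's standard xml.etree.ElementTree (the Angular attributes are left out to keep it short):

```python
import xml.etree.ElementTree as ET

# Angular attributes from the original markup are omitted for brevity.
snippet = """
<a href="#/crm/usergroup-edit?id=572a3c84e4b07f6189958700" class="ng-scope">
  <div class="inventoryPanel">
    <h4 class="ng-binding">
      <div class="groupIcon G">
        <div class="text ng-binding">P</div>
      </div>Jira
    </h4>
  </div>
</a>
"""
a = ET.fromstring(snippet.strip())
h4 = a.find(".//h4")

# h4.text is the text before the first child element: whitespace only.
first_text_node = h4.text
# Joining all descendant text approximates the XPath string value of <a>.
string_value = "".join(a.itertext())

print("Jira" in first_text_node)  # False
print("Jira" in string_value)     # True
```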

Get div class title content text using xpath

I need to get the text "ELECTRONIC ARTS" (this value changes with the data) using XPath, locating it through the div with class "title" and text "Offered By" (which is the same in every case). I have tried various XPath expressions but couldn't get the result I want. Any help would be appreciated.
<div class="meta-info">
<div class="title"> Offered By</div>
<div class="content">ELECTRONIC ARTS</div> </div>
This is one possible XPath expression to start with, which you can then simplify or extend with more criteria as needed (formatted for readability):
//div[
@class='meta-info'
and
div[@class='title' and normalize-space()='Offered By']
]/div[@class='content']
Explanation:
//div[@class='meta-info' and ... : find a div element whose class attribute equals "meta-info" and ...
div[@class='title' and normalize-space()='Offered By']] : ... which has a child div whose class attribute equals "title" and whose whitespace-normalized content equals "Offered By"
/div[@class='content'] : from that div (the <div class="meta-info">, to be clear), return the child div whose class attribute equals "content"
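The three steps can also be sketched in Python with the standard xml.etree.ElementTree, which handles the attribute predicates but has no normalize-space(), so the whitespace normalization is done by hand. The function name offered_by and the wrapping <body> element are made up for this illustration.

```python
import xml.etree.ElementTree as ET

html = """
<div class="meta-info">
  <div class="title"> Offered By</div>
  <div class="content">ELECTRONIC ARTS</div>
</div>
"""
root = ET.fromstring(f"<body>{html}</body>")


def offered_by(root):
    # Step 1: every div with class="meta-info".
    for meta in root.iterfind(".//div[@class='meta-info']"):
        titles = meta.findall("div[@class='title']")
        # Step 2: " ".join(s.split()) plays the role of normalize-space().
        if any(" ".join("".join(t.itertext()).split()) == "Offered By" for t in titles):
            # Step 3: return the text of the sibling content div.
            content = meta.find("div[@class='content']")
            if content is not None:
                return "".join(content.itertext())
    return None


print(offered_by(root))  # ELECTRONIC ARTS
```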
Using the examples on Mozilla:
var xpath = document.evaluate("//div[@class='content']", document, null, XPathResult.STRING_TYPE, null);
document.write('The text found is: "' + xpath.stringValue + '".');
console.log(xpath);
<div class="meta-info">
<div class="title"> Offered By</div>
<div class="content">ELECTRONIC ARTS</div>
</div>
By the way, I think document.querySelector or document.querySelectorAll are much more convenient in this situation:
var content = document.querySelector('.meta-info .content').innerText;
document.write('The text found is: "' + content + '".');
console.log(content);
<div class="meta-info">
<div class="title"> Offered By</div>
<div class="content">ELECTRONIC ARTS</div>
</div>

HTML element to contain id or name from ko.observable using foreach

Below I have a for-each loop using knockout.js.
<div data-bind="foreach:Stuff">
<div class="row">
<span data-bind="text: $data.name"></span>
</div>
</div>
I need to have the HTML Element with an id or name or something that reflects a unique value related to the $data.name value, as another method runs asynchronously, and needs to know which HTML element to update.
Ideally, it would look something like this, I guess:
<div data-bind="foreach:Stuff">
<div class="row">
<span id="data-bind='text: $data.name'" data-bind="text: $data.name"></span>
</div>
</div>
I have found a knockout syntax that applies values during runtime to specified attributes:
<div data-bind="foreach:Stuff">
<div class="row">
<span data-bind="attr: { id: $data.name}"></span>
</div>
</div>
Are you looking for this?
<div data-bind="foreach:Stuff">
<div class="row">
<span data-bind="text: $data.name, attr: { id: $data.name }"></span>
</div>
</div>
I believe name is an observable here, so whenever name changes in Stuff the displayed text updates automatically; attr: { id: ... } simply gives the element a dynamic id using the built-in bindings.

Parse html page with mechanize to receive the appropriate array

I have the following html code on the page received by mechanize (agent.get):
<div class="b-resumehistorylist-views">
<!-- first date start-->
<div class="b-resumehistory-date">date1</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time1</div>
company1</div>
<!-- second date start -->
<div class="b-resumehistory-date">date2</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time2</div>
company2
</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time3</div>
company3</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time4</div>
company4</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time5</div>
company5</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time6</div>
company6</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time7</div>
company7</div>
...
</div>
I need to find each date inside the div with class="b-resumehistorylist-views".
Then I need to find all the divs between two date divs and link each item to its date.
The problem is that each item (div class="b-resumehistory-company") is not nested inside its date div (div class="b-resumehistory-date"); they are siblings.
In the end I need the following array:
array = [ [date1, time1, company1, companylink1], [date2, time2, company2, companylink2], [date2, time3, company3, companylink3], [date2, time4, company4, companylink4] ]
I know that I must use the search method with the text() option, but I cannot find the solution.
My code can currently parse all the company information in each div class="b-resumehistory-company", but I still need to find the right date.
It would be the same thing as before, just some of the class attributes have been changed:
doc = agent.get(someurl).parser
doc.css('.b-resumehistory-company').map do |x|
  date = x.at('./preceding-sibling::div[@class="b-resumehistory-date"][1]').text
  [date, x.at('.b-resumehistory-time').text, x.at('a').text, x.at('a')[:href]]
end
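The preceding-sibling::div[...][1] step picks, for each company div, the nearest date div that comes before it. The same pairing can be done in a single pass by remembering the most recent date while walking the children in order. Below is a sketch of that alternative in Python with the standard xml.etree.ElementTree, not the Nokogiri code above. It uses a shortened copy of the sample markup and leaves out the link, since the sample shows no a elements.

```python
import xml.etree.ElementTree as ET

html = """
<div class="b-resumehistorylist-views">
  <div class="b-resumehistory-date">date1</div>
  <div class="b-resumehistory-company">
    <div class="b-resumehistory-time">time1</div>
    company1</div>
  <div class="b-resumehistory-date">date2</div>
  <div class="b-resumehistory-company">
    <div class="b-resumehistory-time">time2</div>
    company2
  </div>
  <div class="b-resumehistory-company">
    <div class="b-resumehistory-time">time3</div>
    company3</div>
</div>
"""
container = ET.fromstring(html.strip())

rows = []
current_date = None
for child in container:  # direct children, in document order
    cls = child.get("class")
    if cls == "b-resumehistory-date":
        # Remember the latest date; it applies to the company divs that follow.
        current_date = child.text.strip()
    elif cls == "b-resumehistory-company":
        time_div = child.find("div[@class='b-resumehistory-time']")
        # The company name is the text that follows the nested time div.
        company = (time_div.tail or "").strip()
        rows.append([current_date, time_div.text.strip(), company])

print(rows)
```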