XPath syntax for getting nodes after another node?

I have something like this:
<div id = "colors">
<div id = "color_red" ></div>
<div id = "color_blue" ></div>
<div id = "color_green"></div>
<div id = "color_black"></div>
<!-- and so on -->
</div>
I'm trying to select all the divs after the color_blue div with:
//div[@id="colors"]/following-sibling::div[@id="color_blue"]/div[starts-with(@id, 'color_')]
That doesn't work.
I also tried:
//div[@id="colors"]/div[starts-with(@id, 'color_')][following-sibling::div[@id="color_blue"]]
No luck with that either.

This XPath,
//div[@id="colors"]/div[@id="color_blue"]/following-sibling::div
will select all div siblings following the one with @id="color_blue" within the @id="colors" div.
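To make the behaviour of the following-sibling axis concrete, here is a minimal Python sketch. It is an illustration rather than an XPath engine: the standard library's xml.etree.ElementTree supports only a subset of XPath with no following-sibling axis, so the helper `following_sibling_divs` (a hypothetical name) emulates the axis by slicing the parent's list of children. The markup is the question's sample with the comment removed.

```python
import xml.etree.ElementTree as ET

HTML = """
<div id="colors">
  <div id="color_red"></div>
  <div id="color_blue"></div>
  <div id="color_green"></div>
  <div id="color_black"></div>
</div>
""".strip()


def following_sibling_divs(root, parent_id, anchor_id):
    """Emulate //div[@id=parent_id]/div[@id=anchor_id]/following-sibling::div."""
    # iter() also visits the root element, so the parent may be the root itself.
    parent = next(
        (d for d in root.iter("div") if d.get("id") == parent_id), None
    )
    if parent is None:
        return []
    children = list(parent)
    for index, child in enumerate(children):
        if child.tag == "div" and child.get("id") == anchor_id:
            # Everything after the anchor, in document order, is a following sibling.
            return [c for c in children[index + 1:] if c.tag == "div"]
    return []


root = ET.fromstring(HTML)
ids = [d.get("id") for d in following_sibling_divs(root, "colors", "color_blue")]
print(ids)  # ['color_green', 'color_black']
```

With an XPath 1.0 engine available, the expression above can be evaluated directly and the helper is unnecessary.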

Related

Select specific HTML tags from a list of HTML tags via an XPath selector

I want to get some specific information from this HTML code:
<div class="main">
<div class="a"><div><a>linkname1</a></div></div> <!-- I DON'T want get the text of this 'a' tag -->
<div class="b">xxx</div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname2</a></div></div> <!-- I want get the text of this 'a' tag -->
<div class="a"><div><a>linkname3</a></div></div> <!-- I want get the text of this 'a' tag -->
<div class="a"><div><a>linkname4</a></div></div> <!-- I want get the text of this 'a' tag -->
<div class="a"><div><a>linkname5</a></div></div> <!-- I want get the text of this 'a' tag -->
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname6</a></div></div> <!-- I DON'T want get the text of this 'a' tag -->
<div class="a"><div><a>linkname7</a></div></div> <!-- I DON'T want get the text of this 'a' tag -->
<div class="a"><div><a>linkname8</a></div></div> <!-- I DON'T want get the text of this 'a' tag -->
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname9</a></div></div> <!-- I DON'T want get the text of this 'a' tag -->
<div class="a"><div><a>linkname10</a></div></div> <!-- I DON'T want get the text of this 'a' tag -->
</div>
I want to collect, in an array, the link text from the second block of class 'a' divs (between the first div with class 'c' and the second div with class 'c'). How can I do that with an XPath selector? Is it possible? I can't figure out how to do it.
With my example, the expected result is :
linkname2
linkname3
linkname4
linkname5
Thank you :)
Your question is a set-operation question, as explained in this SO answer: How to perform set operations in XPath 1.0.
Applied to your specific situation, you should use an intersection like this:
(: intersection :)
$set1[count(. | $set2) = count($set2)]
set1 should be the set of siblings following div[@class='c'] and
set2 should be the set of siblings preceding div[@class='d'].
Now, putting both together according to the above formula with
set1 = "div[@class='c'][1]/following-sibling::*" and
set2 = "div[@class='d'][1]/preceding-sibling::*"
the XPath expression could look like this:
div[@class='c'][1]/following-sibling::*[count(. | current()/div[@class='d'][1]/preceding-sibling::*) = count(current()/div[@class='d'][1]/preceding-sibling::*)]
Output:
linkname2
linkname3
linkname4
linkname5
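The intersection formula can be hard to read, so here is a small Python sketch of the same idea using only the standard library. It is an emulation, not an XPath evaluation: set1 and set2 are built by slicing the children of the main div, and the names `first_index` and `union_size` are hypothetical helpers. The markup is a shortened version of the question's sample. Note also that current() is an XSLT function, so the expression above assumes an XSLT context; this sketch sidesteps that by working on the element tree directly.

```python
import xml.etree.ElementTree as ET

HTML = """<div class="main">
<div class="a"><div><a>linkname1</a></div></div>
<div class="b">xxx</div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname2</a></div></div>
<div class="a"><div><a>linkname3</a></div></div>
<div class="a"><div><a>linkname4</a></div></div>
<div class="a"><div><a>linkname5</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname6</a></div></div>
</div>"""

main = ET.fromstring(HTML)
children = list(main)


def first_index(cls):
    """Index of the first child whose class attribute equals cls."""
    return next(i for i, el in enumerate(children) if el.get("class") == cls)


# set1: siblings after the first 'c'; set2: siblings before the first 'd'.
set1 = children[first_index("c") + 1:]
set2 = children[:first_index("d")]


def union_size(node, nodes):
    """Size of the union of {node} and nodes, compared by node identity."""
    return len({id(n) for n in nodes} | {id(node)})


# $set1[count(. | $set2) = count($set2)]:
# a node of set1 is kept when adding it to set2 does not make set2 larger.
between = [n for n in set1 if union_size(n, set2) == len(set2)]
names = [a.text for n in between for a in n.findall("./div/a")]
print(names)  # ['linkname2', 'linkname3', 'linkname4', 'linkname5']
```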
You can try this expression:
/div/div[position() > 3 and position() < 8]/div/a/text()
I found one possible solution :)
//following::div[@class='a' and count(preceding::div[@class="c"]) = 1]/div/a/text()

Draw table, and put div with specific class to new row

I have some rendered data, and I want to display it "in table".
How can I implement this using CSS? (Maybe flex could help me)
<div class="test">1</div>
<div class="test">2</div>
<div class="newRow">3</div>
<div class="test">4</div>
<div class="newRow">5</div>
<div class="test">6</div>
Expected result, something like this:
1 3 5
2 4 6
The divs are rendered by an Angular 2 loop (ngFor), and there may be two or more of them. I need a more dynamic solution that depends on class="newRow", which marks the end of a row.
Perhaps try this:
<div>
<div class="test">1</div>
<div class="test">3</div>
<div class="test">5</div>
</div>
<div>
<div class="test">2</div>
<div class="test">4</div>
<div class="test">6</div>
</div>
CSS:
.test{
display:inline;
}
Fiddle here: https://jsfiddle.net/d0Ltnenp/2/

How to select subchildren starting with a specific one via XPath

I have the following HTML tree (Note that the id is "msg-uniqueRandomNumber"):
<div class="elements">
<div class="grp" id="msg-128736"> </div>
<div class="grp" id="msg-312422"> </div>
<div class="grp" id="msg-012312"> </div>
<div class="grp" id="msg-567243"> </div>
</div>
I want to match a group of elements where the first one is a specific id.
Example: Match every class grp starting with msg-012312.
Result should be:
<div class="grp" id="msg-012312"> </div>
<div class="grp" id="msg-567243"> </div>
Choroba's nice explanation and fine answer are correct (+1), but here's a simpler XPath that will work:
//div[@class="grp" and not(./following-sibling::div[@id="msg-012312"])]
Read as
Select all of the grp div elements that do not appear
before the div with an id of msg-012312.
To select a div of the given class and id, use
//div[@class="grp" and @id="msg-012312"]
To select the following siblings, you can use
following-sibling::div[@class="grp"]
Putting both node-sets together with the union operator |:
( //div[@class="grp" and @id="msg-012312"]
| //div[@class="grp" and @id="msg-012312"]/following-sibling::div[@class="grp"] )
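As a rough illustration of what both expressions select, here is a Python sketch using only the standard library. It does not evaluate XPath (ElementTree has no following-sibling axis); instead the hypothetical helper `from_id_onwards` walks the children in document order and starts collecting once it reaches the given id, which is the same result as the union of the div and its following siblings.

```python
import xml.etree.ElementTree as ET

HTML = """<div class="elements">
<div class="grp" id="msg-128736"> </div>
<div class="grp" id="msg-312422"> </div>
<div class="grp" id="msg-012312"> </div>
<div class="grp" id="msg-567243"> </div>
</div>"""


def from_id_onwards(parent, start_id, cls="grp"):
    """Return the div with start_id plus its following div siblings of class cls."""
    selected = []
    seen_start = False
    for child in parent:
        if child.tag != "div" or child.get("class") != cls:
            continue  # only divs of the requested class are candidates
        if child.get("id") == start_id:
            seen_start = True
        if seen_start:
            selected.append(child)
    return selected


elements = ET.fromstring(HTML)
matched_ids = [d.get("id") for d in from_id_onwards(elements, "msg-012312")]
print(matched_ids)  # ['msg-012312', 'msg-567243']
```

If the id does not occur among the children, the helper returns an empty list, which matches the union expression but differs from the not(following-sibling...) expression, which would then select every grp div.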

select the most nested element with same classes

I'm trying to build a special menu, but I have a problem selecting the most deeply nested element (a div). The menu is dynamic, so the number of divs nested inside one div can change (parents get new children), and I need to select the last one (the most deeply nested) without using additional classes or IDs.
Here is the code I have written so far:
<div id="strategy">
<div class="selected">
0
<div class="selected">
some text
<div class="selected"> this is the last div, but it can be anytime changed and more childs of this element can be created</div>
</div>
</div>
<div class="selected">
1
</div>
<div>
2
</div>
</div>
Here is some of the CSS I tried:
div.selected:only-of-type {background: #F00;}
I also tried nth-last-child and only-child. I think I have tried everything, but there must be some way to do it.
If you're open to jQuery...
$(document).ready(function() {
  var $target = $('#strategy').children();
  // Descend one level at a time until there are no more children
  while ($target.length) {
    $target = $target.children();
  }
  var last = $target.end(); // You need .end() to get to the last matched set
  var lastHtml = last.html();
  $('body').append('<strong>deepest child is: ' + lastHtml + '</strong>');
  last.css('color', 'blue');
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="strategy">
<div class="selected">
0
<div class="selected">
some text
<div class="selected"> this is the last div, but it can be anytime changed and more childs of this element can be created</div>
</div>
</div>
<div class="selected">
1
</div>
<div>
2
</div>
</div>

Parsing with Xpath

Consider the following HTML:
<div class='data'>
<div class='user_name'>Lankesh</div>
<div class='user_details'>
<div class='country'>Srilanka</div>
<div class='age'>9</div>
</div>
<div class='user_name'>Bob</div>
<div class='user_details'>
<div class='country'>US</div>
<div class='age'>54</div>
</div>
<div class='user_name'>Deiter</div>
<div class='user_details'>
<div class='country'>Germany</div>
<div class='age'>34</div>
</div>
<div class='user_name'>Yakob</div>
<div class='user_details'>
<div class='country'>Syria</div>
<div class='age'>90</div>
</div>
<div class='user_name'>Qureshi</div>
<div class='user_details'>
<div class='country'>Afgan</div>
<div class='age'>56</div>
</div>
<div class='user_name'>Smith George</div>
<div class='user_details'>
<div class='country'>India</div>
<div class='age'>23</div>
</div>
</div>
And the following Ruby code:
require 'nokogiri'
sample_html = File.open("r.htm", "r").read
n = Nokogiri::HTML::parse sample_html
xpaths = {}
xpaths[:name] = "//div[@class = 'user_name']/text()"
xpaths[:country] = "//div[@class = 'country']/text()"
xpaths[:age] = "//div[@class = 'age']/text()"
full_path = xpaths.values.join(" | ")
n.xpath(full_path).each do |i|
puts i
end
This works to extract the data, but how can I chunk it (name, age and country) so that I can load the parsed data into a structure more easily?
Since the name is outside the user_details block, I am unable to write a query like //div[@class = 'user_details'] and extract each attribute.
I know I can chunk the array into groups of 3, but I am looking for an XPath-based solution, because my actual data has a varying number of child properties.
Silly, but: is there any way to inject characters into the extracted text during parsing?
Any ideas?
Let me start out by saying it would be better to adjust the HTML to wrap each user block in its own containing div:
<div class='user'>
<div class='name'>John</div>
<div class='details'>
<div class='country'>US</div>
...
</div>
</div>
Then you could simply query each user block separately using "//div[@class = 'user']". You are probably not in control of the HTML, though.
Given the current situation, I would propose simply obtaining the user_name divs and the user_details divs and zipping them together. You can then build a Hash from each user's details based on the child divs (.xpath("div")); this works for any number of detail fields, using each div's class attribute as the Hash key and its text as the value. Note that this implementation only handles single-level user_details, and it would need adjusting if some user_details child divs had no class attribute. Judging from your example input, they all do.
require 'pp'
require 'nokogiri'
sample_html = File.open("r.htm", "r").read
n = Nokogiri::HTML::parse sample_html
user_names = n.xpath("//div[@class = 'user_name']")
user_details = n.xpath("//div[@class = 'user_details']")
users = user_names.zip(user_details).map do |name, details|
{
name: name.text,
details: Hash[details.xpath("div").map { |d| [d['class'].to_sym, d.text] }]
}
end
pp users
# [{:name=>"Lankesh", :details=>{:country=>"Srilanka", :age=>"9"}},
# {:name=>"Bob", :details=>{:country=>"US", :age=>"54"}},
# {:name=>"Deiter", :details=>{:country=>"Germany", :age=>"34"}},
# {:name=>"Yakob", :details=>{:country=>"Syria", :age=>"90"}},
# {:name=>"Qureshi", :details=>{:country=>"Afgan", :age=>"56"}},
# {:name=>"Smith George", :details=>{:country=>"India", :age=>"23"}}]