How to select subchildren starting with a specific one via XPath - html

I have the following HTML tree (Note that the id is "msg-uniqueRandomNumber"):
<div class="elements">
<div class="grp" id="msg-128736"> </div>
<div class="grp" id="msg-312422"> </div>
<div class="grp" id="msg-012312"> </div>
<div class="grp" id="msg-567243"> </div>
</div>
I want to match a group of elements where the first one has a specific id.
Example: match every element of class grp, starting with msg-012312.
Result should be:
<div class="grp" id="msg-012312"> </div>
<div class="grp" id="msg-567243"> </div>

Choroba's nice explanation and fine answer are correct (+1), but here's a simpler XPath that will work:
//div[@class="grp" and not(./following-sibling::div[@id="msg-012312"])]
Read as
Select all of the grp div elements that do not appear
before the div with an id of msg-012312.

To select a div of the given class and id, use
//div[@class="grp" and @id="msg-012312"]
To select the following siblings, you can use
following-sibling::div[@class="grp"]
Putting both nodesets together with the union operator |:
( //div[@class="grp" and @id="msg-012312"]
| //div[@class="grp" and @id="msg-012312"]/following-sibling::div[@class="grp"] )
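
As a rough cross-check of what the union expression is meant to return, here is a minimal Python sketch using only the standard library. Note the assumptions: xml.etree.ElementTree does not support the following-sibling axis, so the sketch emulates "the matching div plus its following grp siblings" with itertools.dropwhile instead of running the XPath itself. The helper name grp_from and the SAMPLE string are made up for illustration.

```python
from itertools import dropwhile
import xml.etree.ElementTree as ET

# Sample markup copied from the question.
SAMPLE = """
<div class="elements">
  <div class="grp" id="msg-128736"> </div>
  <div class="grp" id="msg-312422"> </div>
  <div class="grp" id="msg-012312"> </div>
  <div class="grp" id="msg-567243"> </div>
</div>
"""

def grp_from(container, start_id):
    """Return the grp divs from the one with start_id onward (inclusive).

    Hypothetical helper: skips children until the id matches, then keeps
    everything after that point. Returns an empty list if the id is absent.
    """
    groups = container.findall("div[@class='grp']")
    return list(dropwhile(lambda el: el.get("id") != start_id, groups))

container = ET.fromstring(SAMPLE)
ids = [el.get("id") for el in grp_from(container, "msg-012312")]
print(ids)
```

With a full XPath 1.0 engine you would evaluate the expressions above directly; this sketch only illustrates the expected selection.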

Related

CSS selector for the element without any classname or attribute

Is it possible to write a CSS selector matching the element which does not contain any attributes or class names?
For example, I have html like the following (but with tons of divs and dynamic class names) and I want to match the second div (it does not contain class)
<div class="xeuugli x2lwn1j x1cy8">
<div>
<div class="xeuugli x2lwn1j x1cy8">
<div class="xeuugli x2lwn1j n94">
<div class="x8t9es0 x10d9sdx xo1l8bm xrohj xeuugli">$0,00</div>
</div>
</div>
<div class="xeuugli x2lwn1j x1cy8zghib x19lwn94">
<span class="x8t9es0 xw23nyj xeuugli">Helloworld.</span>
</div>
</div>
</div>
P.S. Getting the div like div:nth-child(2) is not a solution.
P.P.S. Could you please advise in general why the dynamic class names are used in the development?
Well, if you can't use classes, maybe try giving it an ID if possible, like
<div class="xeuugli x2lwn1j x1cy8">
<div id="myId">
<div class="xeuugli x2lwn1j x1cy8">
<div class="xeuugli x2lwn1j n94">
<div class="x8t9es0 x10d9sdx xo1l8bm xrohj xeuugli">$0,00</div>
</div>
</div>
<div class="xeuugli x2lwn1j x1cy8zghib x19lwn94">
<span class="x8t9es0 xw23nyj xeuugli">Helloworld.</span>
</div>
</div>
</div>
and then you can select the element by its ID with the CSS #id selector, like so:
#myId {
/*stuff here*/
}
If you can't use IDs either, you could get creative: wrap the content in a grouping element that you promise never to use anywhere else on the page, such as <section> or <article>, and then use
const elem = document.getElementsByTagName("article")[0];
elem.style.border = '2px solid red';
Here getElementsByTagName returns a live HTMLCollection (array-like, but not a true array) of all elements with that tag name; in this case it contains only the element you need, so [0] picks it. You can then apply the styles you need from JavaScript.

XPath syntax for getting nodes after another node?

I have something like this:
<div id = "colors">
<div id = "color_red" ></div>
<div id = "color_blue" ></div>
<div id = "color_green"></div>
<div id = "color_black"></div>
<!-- and so on -->
</div>
I'm trying to select all the divs after the color_blue div with:
//div[@id="colors"]/following-sibling::div[@id="color_blue"]/div[starts-with(@id, 'color_')]
That doesn't work.
I also tried:
//div[@id="colors"]/div[starts-with(@id, 'color_')][following-sibling::div[@id="color_blue"]]
No luck with that either.
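Your first attempt looks for color_blue among the siblings of the colors div rather than among its children, and your second selects the divs that have color_blue as a following sibling, that is, the ones before it.
This XPath,
//div[@id="colors"]/div[@id="color_blue"]/following-sibling::div
will select all div siblings following the one with @id="color_blue" within the @id="colors" div.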
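
To illustrate which nodes that expression should return, here is a small Python sketch with the standard library. It is an emulation under assumptions, not the XPath itself: xml.etree.ElementTree has no following-sibling axis, so the hypothetical helper siblings_after slices the parent's child list after the anchor element.

```python
import xml.etree.ElementTree as ET

# Sample markup based on the question (the "and so on" comment is omitted).
SAMPLE = """
<div id="colors">
  <div id="color_red"></div>
  <div id="color_blue"></div>
  <div id="color_green"></div>
  <div id="color_black"></div>
</div>
"""

def siblings_after(parent, anchor_id):
    """Return the div children of parent that come after the child with anchor_id.

    Hypothetical helper: the anchor itself is excluded, matching the
    behaviour of the following-sibling axis. Returns [] if no child matches.
    """
    children = list(parent)
    for index, child in enumerate(children):
        if child.get("id") == anchor_id:
            return [c for c in children[index + 1:] if c.tag == "div"]
    return []

colors = ET.fromstring(SAMPLE)
after_blue = [el.get("id") for el in siblings_after(colors, "color_blue")]
print(after_blue)
```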

XPath return value on sibling value multiple with conditions

This is how the HTML is structured; I am attempting to obtain the value of <div> if the sibling <p> is equal to type1
<div class="zsg-lg">
<div class="hdp-fact-ataglance">
<div class="media-bd">
<p>
type1
<div>
value
<div class="zsg-lg">
<div class="hdp-fact-ataglance">
<div class="media-bd">
<p>
type2
<div>
value2
Here's my XPath that's currently not working, I'm pretty confused on how to structure it.
div[contains(@class, "zsg-lg")]/div[contains(@class, "hdp-fact-ataglance")]/div[contains(@class, "media-bd") and [p == "Type"]]/div/text()
I would suggest this:
normalize-space(
//div[contains(@class, "zsg-lg")]
/div[contains(@class, "hdp-fact-ataglance")]
/div[
contains(@class, "media-bd")
and
normalize-space(p/text())="type1"
]
/div
/text()
)
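Looks like the syntax was a little off (XPath compares with a single =, and a predicate cannot be nested after "and" with its own brackets); this worked:
div[contains(@class, "zsg-lg")]/div[contains(@class, "hdp-fact-ataglance")]/div[contains(@class, 'media-bd') and p = 'type1']/div/text()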
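
The same lookup can be sketched in Python with the standard library, as a rough illustration only. Assumptions: the markup in the question is not well-formed, so SAMPLE below is a hand-closed version of it under a made-up root element; ElementTree does not support contains(), so the class is matched exactly; and the helper value_for_type is hypothetical. The strip() calls play the role of normalize-space.

```python
import xml.etree.ElementTree as ET

# Hand-closed version of the question's markup (an assumption, see above).
SAMPLE = """
<root>
  <div class="zsg-lg"><div class="hdp-fact-ataglance"><div class="media-bd">
    <p> type1 </p><div> value </div>
  </div></div></div>
  <div class="zsg-lg"><div class="hdp-fact-ataglance"><div class="media-bd">
    <p> type2 </p><div> value2 </div>
  </div></div></div>
</root>
"""

def value_for_type(root, wanted):
    """Return the text of the div whose sibling p equals wanted, or None."""
    for block in root.iterfind(".//div[@class='media-bd']"):
        label = block.find("p")
        # Compare the trimmed label text, like normalize-space(p) = wanted.
        if label is not None and (label.text or "").strip() == wanted:
            value = block.find("div")
            return (value.text or "").strip() if value is not None else None
    return None

root = ET.fromstring(SAMPLE)
print(value_for_type(root, "type1"))
```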

Draw table, and put div with specific class to new row

I have some rendered data, and I want to display it "in table".
How can I implement this using CSS? (Maybe flex could help me)
<div class="test">1</div>
<div class="test">2</div>
<div class="newRow">3</div>
<div class="test">4</div>
<div class="newRow">5</div>
<div class="test">6</div>
Expected result, something like this:
1 3 5
2 4 6
The divs are rendered by an Angular 2 loop (ngFor), and there may be two or more of them. I need a more dynamic solution that depends on class="newRow" marking the end of a row.
Perhaps try this:
<div>
<div class="test">1</div>
<div class="test">3</div>
<div class="test">5</div>
</div>
<div>
<div class="test">2</div>
<div class="test">4</div>
<div class="test">6</div>
</div>
CSS:
.test{
display:inline;
}
Fiddle here: https://jsfiddle.net/d0Ltnenp/2/

Parse html page with mechanize to receive the appropriate array

I have the following html code on the page received by mechanize (agent.get):
<div class="b-resumehistorylist-views">
<!-- first date start-->
<div class="b-resumehistory-date">date1</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time1</div>
company1</div>
<!-- second date start -->
<div class="b-resumehistory-date">date2</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time2</div>
company2
</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time3</div>
company3</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time4</div>
company4</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time5</div>
company5</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time6</div>
company6</div>
<div class="b-resumehistory-company">
<div class="b-resumehistory-time">time7</div>
company7</div>
...
</div>
I need to find each date inside the div with class="b-resumehistorylist-views".
Then I need to find all the divs between two date divs and link each item to its particular date.
The problem is that the items (div class=b-resumehistory-company) are not nested inside their date div (div class=b-resumehistory-date); they are siblings of it.
In the end I need to get the following array:
array = [ [date1, time1, company1, companylink1], [date2, time2, company2, companylink2], [date2, time3, company3, companylink3],[date2, time4, company4, companylink4] ]
I know that I must use method search with text() option, but I cannot find the solution.
My code right now can parse all companies information between div class=b-resumehistory-company, but I need to find right date.
It would be the same thing as before, just some of the class attributes have been changed:
doc = agent.get(someurl).parser
doc.css('.b-resumehistory-company').map{|x| [x.at('./preceding-sibling::div[@class="b-resumehistory-date"][1]').text , x.at('.b-resumehistory-time').text, x.at('a').text, x.at('a')[:href]]}
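
The idea behind preceding-sibling::div[...][1] is "the nearest date div above this item". As a hedged illustration of that grouping, here is a standard-library Python sketch that walks the children in document order and remembers the last date seen. It is not the mechanize/Nokogiri code: SAMPLE is a shortened copy of the question's markup, the helper rows_by_date is hypothetical, and the company link is left out because the sample markup contains no a elements.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the markup from the question (comments removed).
SAMPLE = """
<div class="b-resumehistorylist-views">
  <div class="b-resumehistory-date">date1</div>
  <div class="b-resumehistory-company">
    <div class="b-resumehistory-time">time1</div>
    company1</div>
  <div class="b-resumehistory-date">date2</div>
  <div class="b-resumehistory-company">
    <div class="b-resumehistory-time">time2</div>
    company2
  </div>
  <div class="b-resumehistory-company">
    <div class="b-resumehistory-time">time3</div>
    company3</div>
</div>
"""

def rows_by_date(container):
    """Return [date, time, company] rows, pairing each item with the nearest date above it."""
    rows = []
    current_date = None
    for child in container:
        css_class = child.get("class")
        if css_class == "b-resumehistory-date":
            current_date = (child.text or "").strip()
        elif css_class == "b-resumehistory-company":
            time_div = child.find("div[@class='b-resumehistory-time']")
            if time_div is None:
                continue  # skip items without a time div
            # The company name is the text that follows the time div.
            rows.append([
                current_date,
                (time_div.text or "").strip(),
                (time_div.tail or "").strip(),
            ])
    return rows

rows = rows_by_date(ET.fromstring(SAMPLE))
print(rows)
```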