CSS selector for nth nested child - html

Is there a CSS selector to select the nth element with class "someclass" when the elements you need to match are "nested":
<div>
<p><span></span></p>
<p><span class="someclass"></span></p>
<p><span></span></p>
<p><span class="someclass"></span></p>
<p><span class="someclass"></span></p>
<p><span class="someclass"></span></p>
<p><span></span></p>
.
.
.
</div>
It's not possible to move the class to the p tags, unfortunately.
I've tried:
div p div.someclass:nth-child(2n)
div p div.someclass:nth-of-type(2n)
Neither does exactly what I need.
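For context, :nth-child and :nth-of-type count an element's position among its siblings inside the same parent. In the markup above each span is the only child of its p, so that position is always 1. A minimal sketch in Python (standard library only) of counting the matches in document order instead; the helper name every_nth_with_class is hypothetical, and the markup is the sample above with the elided rows left out:

```python
import xml.etree.ElementTree as ET

html = """<div>
<p><span></span></p>
<p><span class="someclass"></span></p>
<p><span></span></p>
<p><span class="someclass"></span></p>
<p><span class="someclass"></span></p>
<p><span class="someclass"></span></p>
<p><span></span></p>
</div>"""

def every_nth_with_class(root, tag, cls, n):
    """Return every n-th <tag> carrying class `cls`, counted in document order."""
    matches = [el for el in root.iter(tag) if cls in el.get("class", "").split()]
    return matches[n - 1::n]

root = ET.fromstring(html)
picked = every_nth_with_class(root, "span", "someclass", 2)

# Report which <p> (1-based) each picked span sits in.
positions = [i + 1 for i, p in enumerate(root) if any(child in picked for child in p)]
print(positions)  # [4, 6]
```

This runs outside the browser, so it only illustrates the counting logic; it is not a CSS selector.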

Related

CSS selector for the element without any classname or attribute

Is it possible to write a CSS selector matching the element which does not contain any attributes or class names?
For example, I have HTML like the following (but with tons of divs and dynamic class names), and I want to match the second div (the one with no class):
<div class="xeuugli x2lwn1j x1cy8">
<div>
<div class="xeuugli x2lwn1j x1cy8">
<div class="xeuugli x2lwn1j n94">
<div class="x8t9es0 x10d9sdx xo1l8bm xrohj xeuugli">$0,00</div>
</div>
</div>
<div class="xeuugli x2lwn1j x1cy8zghib x19lwn94">
<span class="x8t9es0 xw23nyj xeuugli">Helloworld.</span>
</div>
</div>
</div>
P.S. Selecting the div with something like div:nth-child(2) is not a solution.
P.P.S. Could you also explain, in general, why dynamic class names are used in development?
Well, if you can't use classes, maybe try giving it an ID if possible, like
<div class="xeuugli x2lwn1j x1cy8">
<div id="myId">
<div class="xeuugli x2lwn1j x1cy8">
<div class="xeuugli x2lwn1j n94">
<div class="x8t9es0 x10d9sdx xo1l8bm xrohj xeuugli">$0,00</div>
</div>
</div>
<div class="xeuugli x2lwn1j x1cy8zghib x19lwn94">
<span class="x8t9es0 xw23nyj xeuugli">Helloworld.</span>
</div>
</div>
</div>
and then you can select the element with the CSS #id selector like so:
#myId {
/*stuff here*/
}
If you can't have IDs either, we could get really creative by wrapping it in a grouping element that you promise never to use anywhere else on the page, like <section> or <article>, and then you could use
const elem = document.getElementsByTagName("article")[0];
elem.style.border = '2px solid red';
Here getElementsByTagName returns a live HTMLCollection (an array-like object, not a true array) of all elements with that tag name, so in our case its first entry is the only element you need. You can then apply the CSS you need to it via JavaScript.
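If the goal is only to locate the attribute-less div when processing the markup outside the browser, the check can be expressed directly. A minimal sketch in Python using the standard library, assuming the HTML is well-formed enough to parse as XML; the variable name bare_divs is just illustrative:

```python
import xml.etree.ElementTree as ET

html = """<div class="xeuugli x2lwn1j x1cy8">
<div>
<div class="xeuugli x2lwn1j x1cy8">
<div class="xeuugli x2lwn1j n94">
<div class="x8t9es0 x10d9sdx xo1l8bm xrohj xeuugli">$0,00</div>
</div>
</div>
<div class="xeuugli x2lwn1j x1cy8zghib x19lwn94">
<span class="x8t9es0 xw23nyj xeuugli">Helloworld.</span>
</div>
</div>
</div>"""

root = ET.fromstring(html)

# An element with an empty attribute dict has no class, id, or anything else.
bare_divs = [el for el in root.iter("div") if not el.attrib]

print(len(bare_divs))     # 1
print(len(bare_divs[0]))  # 2 (its two child divs)
```

In a stylesheet, the closest equivalent is probably negating specific attributes, for example div:not([class]):not([id]), since CSS has no selector for "no attributes at all".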

Xpath node-set nesting order selection

Is there an XPath 1.0 expression that I could use, starting from the div[@id='rootTag'] context, to select the different nested span descendants based on how deeply they are nested?
For example, could you use something like span[2] to select the second most deeply nested span tag rather than the second span child of the same parent element?
<div id='rootTag'>
<span>Test</span>
<div>
<span>Test</span>
<span>Test</span>
</div>
</div>
<span>Test</span>
</div>
<div>
<div>
<div>
<div>
<span>Test</span>
</div>
<span>Test</span>
</div>
</div>
</div>
</div>
It's a bit (a lot...) of a hack, but it can be done this way:
Assume your html is like this:
levels = """<div id='rootTag'>
<span>Level2</span>
<div>
<span>Level3</span>
<div>
<span>Level4</span>
</div>
</div>
<div>
<span>Level3</span>
</div>
<div>
<div>
<div>
<div>
<span>Level6</span>
</div>
<span>Level5</span>
</div>
</div>
</div>
</div>"""
We then do this:
#First collect the data:
from lxml import etree #you have to make sure your html is well-formed, or it won't work
root = etree.fromstring(levels)
tree = etree.ElementTree(root)
#collect the paths of all <span> elements
paths = [tree.getpath(e) for e in root.iter('span')]
#determine the nesting level of each <span> element
nests = [e.count('/') for e in paths] #or, alternatively:
#nests = [tree.getpath(e).count('/') for e in root.iter('span')]
From here, we use the nesting level in the nests list to look up the corresponding entry in the paths list (both lists are in the same order). For example, to get the <span> element with the deepest nesting level:
deepest = nests.index(max(nests))
print(paths[deepest],root.xpath(paths[deepest])[0].text)
Output:
/div/div[3]/div/div/div/span Level6
Or to extract the <span> element with a level 4 nesting:
print(paths[nests.index(4)],root.xpath(paths[nests.index(4)])[0].text)
Output:
/div/div[1]/div/span Level4
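If lxml is not available, the same nesting levels can be computed with a recursive walk over the tree. This is a minimal sketch using only the standard library, not the approach above; the generator name spans_with_depth is hypothetical, and the root element is counted as depth 1 so the numbers line up with the slash counts used earlier:

```python
import xml.etree.ElementTree as ET

levels = """<div id='rootTag'>
<span>Level2</span>
<div>
<span>Level3</span>
<div>
<span>Level4</span>
</div>
</div>
<div>
<span>Level3</span>
</div>
<div>
<div>
<div>
<div>
<span>Level6</span>
</div>
<span>Level5</span>
</div>
</div>
</div>
</div>"""

def spans_with_depth(node, depth=1):
    """Yield (depth, element) for every <span>; the root element has depth 1."""
    if node.tag == "span":
        yield depth, node
    for child in node:
        yield from spans_with_depth(child, depth + 1)

root = ET.fromstring(levels)

# Sort by depth only; ties keep document order because sorted() is stable.
by_depth = sorted(spans_with_depth(root), key=lambda pair: pair[0])

deepest_depth, deepest = by_depth[-1]
second_depth, second = by_depth[-2]
print(deepest_depth, deepest.text)  # 6 Level6
print(second_depth, second.text)    # 5 Level5
```

Indexing into by_depth from the end gives the "second most deeply nested" span the question asks about, which a plain span[2] predicate cannot express.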

Xpath: select div with an anchor descendant whose depth is unknown

Sample html:
<div>
<div class="foo bar baz"> <!-- target 1 -->
<div>
<span>
hello world
</span>
</div>
</div>
<div class="foo bar">foo</div>
<div class="bar"> <!-- target 2 -->
<div>
<div>
<span>
hello world
</span>
</div>
</div>
</div>
</div>
I want to select divs that 1) have the class name bar and 2) have an <a> descendant whose href contains hello.
My problem is that the <a> tag could be nested at different levels. How do I handle this correctly?
You can use the relative descendant-or-self axis (.//) to look for the <a> element at any nesting depth:
//div[contains(concat(' ', @class, ' '), ' bar ')][.//a[contains(@href,'hello')]]
Related discussion: How can I find an element by CSS class with XPath?
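The two predicates in that expression (a whole-word class match, then a descendant search at any depth) can be spelled out in plain Python. A minimal sketch with the standard library; the markup is an assumption, namely the sample with each span replaced by an <a href> containing "hello", and the helper names has_class and has_matching_link are hypothetical:

```python
import xml.etree.ElementTree as ET

# Assumed markup: the question's sample with <a href="..."> in place of <span>.
html = """<div>
<div class="foo bar baz"><div><a href="/hello/1">hello world</a></div></div>
<div class="foo bar">foo</div>
<div class="bar"><div><div><a href="/hello/2">hello world</a></div></div></div>
</div>"""

def has_class(el, name):
    """Whole-word class test, like contains(concat(' ', @class, ' '), ' name ')."""
    return name in el.get("class", "").split()

def has_matching_link(el, needle):
    """True if any <a> below el (at any depth) has `needle` in its href."""
    return any(needle in a.get("href", "") for a in el.iter("a"))

root = ET.fromstring(html)
targets = [
    div for div in root.iter("div")
    if has_class(div, "bar") and has_matching_link(div, "hello")
]
print([div.get("class") for div in targets])  # ['foo bar baz', 'bar']
```

The middle div is skipped because it has the class but no matching link, which mirrors how the second XPath predicate filters the result of the first.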

How to access div element text based on adjacent text

I have the following HTML code and am trying to access "QA1234", which is the value of the Serial Number. Can you let me know how I can access this text?
<div class="dataField">
<div class="dataName">
<span id="langSerialNumber">Serial Number</span>
</div>
<div class="dataValue">QA1234</div>
</div>
<div class="dataField">
<div class="dataName">
<span id="langHardwareRevision">Hardware Revision</span>
</div>
<div class="dataValue">05</div>
</div>
<div class="dataField">
<div class="dataName">
<span id="langManufactureDate">Manufacture Date</span>
</div>
<div class="dataValue">03/03/2011</div>
</div>
I assume you are trying to get the "QA1234" text in terms of being the "Serial Number". If that is correct, you basically need to:
1. Locate the "dataField" div that includes the serial number span.
2. Get the "dataValue" within that div.
One way is to get all the "dataField" divs and find the one that includes the span:
parent = browser.divs(class: 'dataField').find { |div| div.span(id: 'langSerialNumber').exists? }
p parent.div(class: 'dataValue').text
#=> "QA1234"
parent = browser.divs(class: 'dataField').find { |div| div.span(id: 'langManufactureDate').exists? }
p parent.div(class: 'dataValue').text
#=> "03/03/2011"
Another option is to find the serial number span and then traverse up to the parent "dataField" div:
parent = browser.span(id: 'langSerialNumber').parent.parent
p parent.div(class: 'dataValue').text
#=> "QA1234"
parent = browser.span(id: 'langManufactureDate').parent.parent
p parent.div(class: 'dataValue').text
#=> "03/03/2011"
I find the first approach more robust to changes, since it does not depend on how deeply the serial number span is nested within the "dataField" div. However, on pages with a lot of fields it may be slower.
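The first approach (scan the "dataField" divs, keep the one containing the label span, then read its "dataValue") does not depend on the browser driver. A minimal sketch of the same logic in Python with the standard library, for cases where the page source is already available as a string; the wrapping root div and the function name value_for are assumptions made for the example:

```python
import xml.etree.ElementTree as ET

# The fields from the question, wrapped in one root element (an assumption)
# so the fragment parses as a single document.
html = """<div>
<div class="dataField">
<div class="dataName">
<span id="langSerialNumber">Serial Number</span>
</div>
<div class="dataValue">QA1234</div>
</div>
<div class="dataField">
<div class="dataName">
<span id="langHardwareRevision">Hardware Revision</span>
</div>
<div class="dataValue">05</div>
</div>
<div class="dataField">
<div class="dataName">
<span id="langManufactureDate">Manufacture Date</span>
</div>
<div class="dataValue">03/03/2011</div>
</div>
</div>"""

def value_for(root, span_id):
    """Return the dataValue text of the dataField that contains span#span_id."""
    for field in root.iter("div"):
        if field.get("class") != "dataField":
            continue
        # .// searches at any depth, so the span's nesting level does not matter.
        if field.find(f".//span[@id='{span_id}']") is not None:
            return field.find("div[@class='dataValue']").text
    return None

root = ET.fromstring(html)
print(value_for(root, "langSerialNumber"))     # QA1234
print(value_for(root, "langManufactureDate"))  # 03/03/2011
```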

Two lines in h1 tag

I need to fit two lines in one h1 tag (instead of making two separated h1 tags).
How can I create a line break inside of h1 tag?
Using:
<h1>Line 1 <br/> Line 2</h1>
A W3C-validated method is
<h1>Line 1 <span style = "display: block;">Line 2</span></h1>
Summarizing all clever answers, this is what https://validator.w3.org says for each one:
Validated:
<h1>Line 1 <br/> Line 2</h1>
<h1>Line 1<br>Line 2</h1>
<h1>Line 1 <span style = "display: block;">Line 2</span></h1>
Invalid:
<h1>
<p>Line1</p>
<p>Line2</p>
</h1>
Reason:
Error: Element p not allowed as child of element h1 in this context
<h1>
<div>line1</div>
<div>line2</div>
</h1>
Reason:
Error: Element div not allowed as child of element h1 in this context.
Tested code:
<!DOCTYPE html>
<html>
<head>
<title>test</title>
</head>
<body>
<h1>Line 1 <br/> Line 2</h1>
<h1>Line 1<br>Line 2</h1>
<h1>
<p>Line1</p>
<p>Line2</p>
</h1>
<h1>Line 1 <span style = "display: block;">Line 2</span></h1>
<h1>
<div>line1</div>
<div>line2</div>
</h1>
</body>
</html>
An h1 can contain inline markup, so you can simply write <h1>foo<br>bar</h1>.
Quote from the standard showing that br inside h1 is valid
Let's teach more people to read the current standard
4.3.6 "The h1, h2, h3, h4, h5, and h6 elements" says:
Content model: Phrasing content.
Then we click the definition of "Phrasing content", which leads to 3.2.5.2.5 "Phrasing content" which says:
Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs.
..., br, ..., span, ...
So we see that br is in the huge list of phrasing content elements, and therefore can go inside h1.
This also shows us that another option would be to do something like:
<h1><span>ab</span><span>cd</span></h1>
and then give the spans display: block; with CSS so that each one starts on its own line (inline-block alone keeps them on the same line when they fit).
You can use the line break tag
<h1>heading1 <br> heading2</h1>
You can also build the heading with PHP:
$str = "heading1 heading2 heading3";
// Join the words with <br> so no stray break is left after the last one.
$h1 = '<h1>' . implode('<br>', explode(' ', $str)) . '</h1>';
echo $h1;