I want to select the p elements that have a ul as their immediately following sibling in the sample XML below.
<root>
<p>abc</p>
<br>
<p>def</p>
<br>
<p>FEATURES</p>
<ul>
<li>design</li>
<li>softness</li>
</ul>
<p>SIZING</p>
<ul>
<li>17'' x 24''</li>
<li>20'' x 32''</li>
<li>24'' x 38''</li>
</ul>
<p>CONSTRUCTION & CARE</p>
<ul>
<li>Nylon</li>
<li>Latex backing</li>
<li>Machine wash</li>
<li>Made in the USA</li>
</ul>
<p>SUSTAINABILITY FEATURES</p>
</root>
I tried the XPath //root/p[following-sibling::ul], but it did not return the desired result.
Try this one. It keeps a p only when its first following sibling is a ul:
//p[following-sibling::*[position()=1 and self::ul]]
"//root/ul/preceding-sibling::p[1]"
For each ul, this takes the nearest preceding p sibling. Note that it still matches when some other element sits between that p and the ul.
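For anyone who wants to check the logic outside an XPath engine, here is a minimal Python sketch of the same condition: keep a p only when the very next sibling element is a ul. It uses the standard library's xml.etree.ElementTree, which does not support the following-sibling axis, so the siblings are compared pairwise instead. The sample is a shortened, well-formed variant of the question's XML (the self-closed br and the escaped & are my assumptions), and p_before_ul is only an illustrative name. It also shows why //root/p[following-sibling::ul] returns too much: abc and def do have a ul somewhere after them, just not directly after.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed variant of the sample (assumption: <br/> is
# self-closed and "&" is escaped so that an XML parser accepts it).
SAMPLE = """
<root>
  <p>abc</p>
  <br/>
  <p>def</p>
  <br/>
  <p>FEATURES</p>
  <ul><li>design</li></ul>
  <p>SIZING</p>
  <ul><li>17'' x 24''</li></ul>
  <p>CONSTRUCTION &amp; CARE</p>
  <ul><li>Nylon</li></ul>
  <p>SUSTAINABILITY FEATURES</p>
</root>
"""


def p_before_ul(root):
    """Return the text of every <p> whose next sibling element is a <ul>."""
    children = list(root)
    return [
        cur.text
        for cur, nxt in zip(children, children[1:])
        if cur.tag == "p" and nxt.tag == "ul"
    ]


headings = p_before_ul(ET.fromstring(SAMPLE))
print(headings)
```

With lxml or another full XPath 1.0 engine the expression from the answer can be used directly; the pairwise loop is only a stand-in for it.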
Hi, I am trying to write a regexp that extracts only the li tags inside ul tags (not ol).
Text:
<ul><li>some text</li></ul>
<ol><li>some text</li></ol>
Extracted
<ul>**<li>**some text</li></ul>
<ol><li>some text</li></ol>
Could you help me?
Solution 1
Regex solution
/(?<=<ul>\s*(?:<li>.*?<\/li>\s*)*)<li>.*?<\/li>/gi
If you work in a team and someone else may read your code, I advise using Solution 2. It is simpler and easier to understand when reading the code.
Solution 2
Do it in 2 steps:
Delete all <ol>...</ol> nodes;
Take all <li>...</li> nodes.
*I assume your html is valid and you have no <li> outside <ul> or <ol>.
Code example in JavaScript:
let html = `
<ul>
<li>take this node 1</li>
<li>take this node 2</li>
</ul>
<ol>
<li>exclude this node</li>
<li>exclude this node</li>
</ol>
<ul>
<li>take this node 3</li>
<li>take this node 4</li>
</ul>
<ol>
<li>exclude this node</li>
<li>exclude this node</li>
</ol>
`;
let htmlWithoutOl = html.replace(/<ol>.*?<\/ol>/gis, '');
let matches = htmlWithoutOl.matchAll(/<li>.*?<\/li>/gis);
for (const match of matches) {
console.log(match[0]);
}
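If regular expressions feel too fragile here, the same idea (ignore everything inside ol, keep the li items inside ul) can be sketched with a small event-based parser. The following is a minimal Python version using the standard library's html.parser; the class name UlItemCollector and the sample string are made up for illustration, and it assumes flat, non-nested lists like the ones in the question.

```python
from html.parser import HTMLParser


class UlItemCollector(HTMLParser):
    """Collect the text of <li> elements whose enclosing list is a <ul>.

    Assumption: lists are not nested inside each other.
    """

    def __init__(self):
        super().__init__()
        self.list_stack = []  # currently open <ul>/<ol> tags, innermost last
        self.in_item = False  # True while inside an <li> we want to keep
        self.items = []       # collected text, one entry per kept <li>

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self.list_stack.append(tag)
        elif tag == "li" and self.list_stack and self.list_stack[-1] == "ul":
            self.in_item = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag in ("ul", "ol") and self.list_stack:
            self.list_stack.pop()
        elif tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item:
            self.items[-1] += data


html = (
    "<ul><li>take 1</li><li>take 2</li></ul>"
    "<ol><li>skip</li></ol>"
    "<ul><li>take 3</li></ul>"
)
collector = UlItemCollector()
collector.feed(html)
print(collector.items)
```

The parser only tracks which list is open, so it does not depend on how the tags are spaced or broken across lines, which is where the regex versions tend to break.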
I have the following web page:
<div id="childcategorylist" class="link-list-container links__listed" data-reactid="7">
<div data-reactid="8">
<strong data-reactid="9">Categories</strong>
</div>
<div data-reactid="10">
<ul id="categoryLink" aria-label="shop by category" data-reactid="11">
<li data-reactid="12">
Contact Lenses
</li>
<li data-reactid="14">
Beauty
</li>
<li data-reactid="16">
Personal Care
</li>
I want a CSS selector for the links (href attributes) under the li tags, i.e. for Contact Lenses, Beauty and Personal Care. How do I write it?
I am writing it in the following way:
#childcategorylist li
This gives me the following output:
['<li class="titleitem" data-reactid="16"><strong data-reactid="17">Categories</strong></li>']
Please help!
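I am not an expert in Scrapy, but HTML elements usually expose their text in some way.
If not, you could use a regexp to extract the text between > and </, like:
import re
txt = someArraycontainingStrings[0]
# Capture the text between ">" and the next "</"; [^<]* also allows spaces and line breaks
x = re.search(r">([^<]*)</", txt)
If there is a match, x.group(1) holds the text, which may give you the results you need.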
I want to get the parent of a specific element (the span with class mw-headline) and then get the first element that follows this parent.
<h2>
<span class="mw-headline" id="Botany">Botany
</span>
<span class="mw-editsection">
<span class="mw-editsection-bracket">
</span>edit<span class="mw-editsection-bracket">
</span>
</span>
</h2>
<ul>
<li><i>Malus</i>, the genus of all apples and crabapples</li>
<li>Cashew apple, the fruit that grows with the cashew nut</li>
<li>Custard apple, several fruits</li>
<li>Love apple:
<ul>
<li>Tomato</li>
<li><i>Syzygium samarangense</i>, a plant species in the Myrtaceae family</li>
</ul>
</li>
<li>Mammee apple (disambiguation)</li>
<li>May apple (<i>Podophyllum peltatum</i>)</li>
<li>Oak apple, a type of gall that grows on oak trees</li>
<li>Rose apple (disambiguation), several fruits</li>
<li>Thorn apple (disambiguation):
<ul>
<li><i>Crataegus</i> species</li>
<li><i>Datura</i> species</li>
</ul>
</li>
<li>Wax apple (<i>Syzygium samarangense</i>)</li>
<li>Hedge apple (<i>Maclura pomifera</i>)</li>
</ul>
I want to get the first ul after the h2 tag that contains the span with class mw-headline.
With XPath, I have a very simple solution:
$x('//span[@class="mw-headline"]/following::ul[1]')
But I don't know how to express "get the parent" (.. in XPath) and "next element" (following:: in XPath) in a CSS selector for my case.
Please give me a solution using a CSS selector.
Thanks & Best Regards,
Phuong Hoang
To start from the parent h2 rather than from the span itself, the XPath should be:
$x('//h2[.//span[@class="mw-headline"]]/following::ul[1]')
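To make the "go to the parent, then take the next ul" step concrete, here is a rough Python sketch with the standard library's xml.etree.ElementTree. It is not a CSS selector. It walks the siblings by hand, so it corresponds to following-sibling::ul[1] rather than following::ul[1]. The wrapping div, the second heading with id "Other" and the function name first_ul_after_headline are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down, well-formed sample. The wrapping <div> and the second
# heading ("Other") are hypothetical additions for the demonstration.
SAMPLE = """
<div>
  <h2><span class="mw-headline" id="Botany">Botany</span></h2>
  <ul><li>Malus</li><li>Cashew apple</li></ul>
  <h2><span class="mw-headline" id="Other">Other</span></h2>
  <ul><li>Something else</li></ul>
</div>
"""


def first_ul_after_headline(container, headline_id):
    """Return the first <ul> sibling after the <h2> holding the given headline."""
    children = list(container)
    for index, child in enumerate(children):
        if child.tag != "h2":
            continue
        # The "parent" step: only accept an h2 that contains the wanted span.
        span = child.find("span[@class='mw-headline']")
        if span is None or span.get("id") != headline_id:
            continue
        # The "next element" step: first <ul> among the later siblings.
        for sibling in children[index + 1:]:
            if sibling.tag == "ul":
                return sibling
    return None


root = ET.fromstring(SAMPLE)
botany_list = first_ul_after_headline(root, "Botany")
print([li.text for li in botany_list])
```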
I'm trying to create a recursive list using Thymeleaf. I'm using a simple Java object to model a node, which has two fields: a description and an array list of child nodes. I'm using the following HTML/Thymeleaf to process the structure, but it isn't recursively iterating through to the next level down.
My Java code looks as follows:
public class Node {
public String description;
public ArrayList<Node> children;
}
My Thymeleaf/HTML code is as follows:
<html>
...
<body>
<div th:fragment="fragment_node" th:remove="tag">
<ul th:if="${not #lists.isEmpty(node.children)}" >
<li th:each="child : ${node.children}"
th:text="${child.description}"
th:with="node = ${child}"
th:include="this::fragment_node">List Item</li>
</ul>
</div>
</body>
</html>
If my data structure looks as follows:
Main node 1
Child node 1
Child node 2
Main node 2
Child node 3
Child node 4
I'd expect to get:
<ul>
<li>Main Node 1</li>
<li>
<ul>
<li>Child node 1</li>
<li>Child node 2</li>
</ul>
</li>
<li>Main Node 2</li>
<li>
<ul>
<li>Child node 3</li>
<li>Child node 4</li>
</ul>
</li>
</ul>
However, I only get:
<ul>
<li>Main Node 1</li>
<li>Main Node 2</li>
</ul>
Can anyone spot why this may not be working?
The cause of the problem
You use th:text to put the description into the <li>, and you also th:include the fragment inside that same <li> tag.
Both attributes set the body of the tag, and th:text wins by default, so the output of your th:include is replaced by the description text.
Direct solution to your source code
.....
<li th:each="child : ${node.children}" th:inline="text" th:with="node = ${child}">
[[${child.description}]]
<ul th:replace="this::fragment_node">List Item</ul>
</li>
.....
Even though the above will work as you want, I personally see some design issues in your Thymeleaf page.
Better solution using fragment parameters
...
<ul th:fragment="fragment_node(node)" th:unless="${#lists.isEmpty(node.children)}" >
<li th:each="child : ${node.children}" th:inline="text">
[[${child.description}]]
<ul th:replace="this::fragment_node(${child})"></ul>
</li>
</ul>
...
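The recursion that the parameterised fragment performs can be sketched outside Thymeleaf as well. The Python below is only an illustration of the shape of the output, not of how Thymeleaf works internally: render plays the role of fragment_node(node), the empty-string return mirrors the th:unless check, and the Node dataclass is a stand-in for the Java class from the question. The wrapping "root" node is an assumption, since the question does not show how the two main nodes are held together.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Node:
    """Stand-in for the Java Node class: a description plus child nodes."""
    description: str
    children: List["Node"] = field(default_factory=list)


def render(node):
    """Render the children of `node` as a nested <ul>, or "" if there are none."""
    if not node.children:  # mirrors th:unless="${#lists.isEmpty(node.children)}"
        return ""
    items = "".join(
        # Description first, then the recursive call for the child's own children.
        f"<li>{escape(child.description)}{render(child)}</li>"
        for child in node.children
    )
    return f"<ul>{items}</ul>"


# Hypothetical root node that holds the two main nodes from the question.
tree = Node("root", [
    Node("Main node 1", [Node("Child node 1"), Node("Child node 2")]),
    Node("Main node 2", [Node("Child node 3"), Node("Child node 4")]),
])
markup = render(tree)
print(markup)
```

Note that, like the templates above, this puts each nested ul inside the li of its parent node rather than in a separate li.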
I need to find an element by its class and apply some logic to it.
My code structure is as follows.
<div class="class">
<form>
<ul>
<li>xxx</li><li>xxx</li>
</ul>
<ul>
<li>xxx</li><li>xxx</li>
</ul>
<ul class="ul_class">
<li>
<input ....><a ...><span ..></span>
<a href="#" title="View History" class="hstry">
<span class="hide"> </span></a>
</li>
<li>xxx</li>
</ul>
</form>
How can I find the element with class hstry inside the ul with class ul_class?
Just use a normal CSS selector to find nested classes like the following:
$( 'ul.ul_class .hstry' )
Note the whitespace between both classes. Without it, it would match an element having both classes, instead of an element with class hstry which is below some <ul> element with class ul_class.
If you want the content, try
var hstry = $('body').find('.hstry').html();
Then you can operate with this variable any way you want.
Using jquery:
$("ul.ul_class").find(".hstry");
$('ul.ul_class .hstry').html(); //for html content
$('ul.ul_class .hstry').text(); //for text data