selecting the last node from a list of matches - html

I'm trying to find the last node that matches a given XPath using the last() function. The problem is that the last element of the path also has a constraint specified.
"//div[@id='someId']/ul/li/div[@class='class1 class2 ']/span[@class='someType2 ']"
If I use
"//div[@id='someId']/ul/li/div[@class='class1 class2 ']/span[@class='someType2 ' and last()]"
it still matches multiple nodes. Maybe one of the reasons is that the last div tag in the path contains 2 span elements. Please help me select the last node that matches the above path.
Thanks and Regards,
Vamyip

If your XML is XHTML, why not use CSS selectors?
If I'm not mistaken, the selectors should be as follows (the second one relies on :last, which is a jQuery extension rather than standard CSS):
#someId > ul > li > div.class1.class2 > span.someType2
#someId > ul > li > div.class1.class2 > span.someType2:last
I was using XPath on HTML pages too, but when CSS selectors became widespread I found that they are better supported across browsers than XPath.

Use:
(//div[@id='someId']/ul/li/div[@class='class1 class2 ']
      /span[@class='someType2 '])
          [last()]
Note the parentheses surrounding the expression that starts with //. This is a FAQ: [] binds more strongly than //, so the parentheses are needed to override the default precedence.
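To make the precedence point concrete, here is a rough sketch that uses plain arrays as stand-ins for node-sets (no real XPath engine is involved, and the names spansPerDiv, lastPerParent and lastOverall are made up for illustration). Without the parentheses, [last()] is evaluated once per parent; with them, it is applied to the whole combined result.

```javascript
// Each inner array stands for the matching span children of one div.
const spansPerDiv = [["s1", "s2"], ["s3"], ["s4", "s5"]];

// Like //div/span[last()]: the predicate is applied per parent,
// so one span per div survives.
const lastPerParent = spansPerDiv.map(spans => spans[spans.length - 1]);

// Like (//div/span)[last()]: the predicate is applied to the full
// node-set in document order, so exactly one span survives.
const allSpans = spansPerDiv.flat();
const lastOverall = allSpans[allSpans.length - 1];

console.log(lastPerParent, lastOverall);
```

This is why the unparenthesised expression in the question can still return several nodes.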

In Selenium, you can also use JavaScript to retrieve elements. How about something like this?
dom=var list1 =
    document.getElementById('someId').
    getElementsByTagName('li');
var finallist = new Array();
for (var i=0; i<list1.length; i++) {
  var list2 = list1[i].getElementsByClassName("class1 class2");
  for (var j=0; j<list2.length; j++) {
    var list3 = list2[j].getElementsByClassName("someType2");
    for (var k=0; k<list3.length; k++) {
      finallist.push(list3[k]);
    }
  }
}
finallist.pop()
http://seleniumhq.org/docs/04_selenese_commands.html#locating-by-dom

Related

Chrome extension: Add style to element if it contains particular text

I want to add styling to an element only if it contains a particular string. i.e. if(el contains str) {el:style}.
If I wanted just the links containing w3.org to be pink, how would I find <a href="http://www.w3.org/1999/xhtml">article</a> inside the innerHTML and then style the word "article" on the page?
So far, I can turn ALL the links pink but I can't selectively target the ones containing "www.w3.org".
var links = [...document.body.getElementsByTagName("a")];
for (var i = 0; i < links.length; i++) {
links[i].style["color"] = "#FF00FF";
}
How would I apply this ONLY to the elements containing the string "w3.org"?
I thought this would be so simple at first! Any and all help is appreciated.
getElementsByTagName cannot filter by partial href values or by contained text when it builds the initial list, but you can filter the list afterwards using plain JavaScript:
var links = [...document.body.getElementsByTagName("a")];
for (var i = 0; i < links.length; i++) {
  if (links[i]['href'].indexOf('www.w3.org') == -1) { continue };
  links[i].style["color"] = "#FF00FF";
}
That assumes you want to filter by the href. If you mean the literal link text, use links[i]['text'] instead.
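The same filter can be pulled into a small helper. This is only a sketch: the function name styleMatchingLinks is made up, and the plain objects below stand in for real anchor elements so the snippet runs outside a browser. On a page you would pass in the array of anchors instead.

```javascript
// Colour every link whose href contains `needle`; return the matches.
// Works on anything that has `href` and `style` properties.
function styleMatchingLinks(links, needle, color) {
  const matched = [];
  for (const link of links) {
    if (String(link.href).includes(needle)) {
      link.style.color = color;
      matched.push(link);
    }
  }
  return matched;
}

// Stand-ins for <a> elements (assumption: real anchors expose the same fields).
const fakeLinks = [
  { href: "http://www.w3.org/1999/xhtml", style: {} },
  { href: "http://example.com/", style: {} },
];
const pinkLinks = styleMatchingLinks(fakeLinks, "w3.org", "#FF00FF");
console.log(pinkLinks.length);
```

In a browser, the CSS attribute selector a[href*="w3.org"] passed to document.querySelectorAll should give the same set of links without a manual loop.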

Can you target a specific element among the results of a CSS selector independent of its location or relation? [duplicate]

Given some CSS selector that returns a set of matching elements from the document, is there any way within CSS to take the resulting set and target the nth result?
The nth-of-type and nth-child pseudo-classes will not work, to my understanding, because they do not treat all possible matches as one linear list. For example:
<div>
  <span class="aClass" /> <!-- found by :nth-of-type(1) -->
  <span class="aClass" /> <!-- found by :nth-of-type(2) -->
  <div>
    <span class="aClass" /> <!-- found by :nth-of-type(1) -->
  </div>
</div>
I want to be able to treat all these occurrences as a linear list of 3 elements, and target one of them independently of where in the document they may be located.
I don't think this is possible as you described it. A general rule of CSS is that queries can delve deeper, and occasionally they can move "sideways" along the tree through a set of neighbors (and for that matter, only in one direction), but they can never take information from one node, traverse upward, go into a neighbor, and apply that information to another node. An example:
<div>
<div class="relevant">
<!-- *whistles spookily* - "Zis WILL be the last time you see me!" -->
</div>
<span class="myCssTarget"></span>
</div>
The comment in that HTML is a space that is, for all intents and purposes, "invisible" to myCssTarget. If I added any HTML inside of there, then it could never directly affect the span outside.
I could offer further suggestions if you offer a specific situation, but this may be either a call for a redesign of the components you're putting in, or perhaps a JavaScript-based solution.
I just saw some clarification to the question. Here is a much simpler fiddle that collects all spans with "aClass" into a list that lets you target the nth span. It still uses jQuery instead of CSS.
https://jsfiddle.net/h2e0xgwf/6/
$(document).ready(function(){
  var nTh = 5; // change this to whichever N you wish
  var allSpans = $("div > span.aClass");
  $(allSpans[nTh-1]).html($(allSpans[nTh-1]).html() + " : found the " + nTh + "th element").css("background-color", "blue").css("color","white");
});
As far as I know, there is no way to do that within CSS. You can select the nth element with a given class name using JavaScript
var elem = document.getElementsByClassName('aClass').item(n-1)
or with jQuery
var elem = $('.aClass').toArray().filter(function(elem, i){
  return i==(n-1);
})[0];
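What makes these work is that the matches come back as one flat list in document order, regardless of nesting. A rough sketch of that idea follows, using a hand-made tree of plain objects instead of a real DOM; the { tag, classes, children } shape and the names collectByClass and nthMatch are invented for this example.

```javascript
// Depth-first walk: collects matching nodes in document order.
function collectByClass(node, className, found = []) {
  if (node.classes && node.classes.includes(className)) found.push(node);
  for (const child of node.children || []) {
    collectByClass(child, className, found);
  }
  return found;
}

// 1-based lookup; returns undefined when there are fewer than n matches.
function nthMatch(root, className, n) {
  return collectByClass(root, className)[n - 1];
}

// Mirrors the markup from the question: two spans, then a nested div with one.
const tree = {
  tag: "div",
  children: [
    { tag: "span", classes: ["aClass"], id: "first" },
    { tag: "span", classes: ["aClass"], id: "second" },
    { tag: "div", children: [{ tag: "span", classes: ["aClass"], id: "third" }] },
  ],
};
console.log(nthMatch(tree, "aClass", 3).id);
```

In a browser, document.querySelectorAll('.aClass')[n - 1] gives the same kind of result, since querySelectorAll also returns its matches in document order.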
If I understood you correctly, you want a linear list of all spans with class="aClass" that are direct children of a div.
That means that in your example you will have 2 lists of spans: the first list will have 2 elements and the second will have 1.
You then wish to change the style of every nth child; for example, changing the style of the first child would affect 2 of the 3 spans (the first span directly under each div), and changing the second child would affect only 1 of the 3 spans.
If that is what you are looking for, I don't believe it can be done in CSS, but it can be done in jQuery. I created a fiddle with an example in case my understanding of your question is correct.
https://jsfiddle.net/h2e0xgwf/4/
$(document).ready(function(){
  var nTh = 3; // change this to whichever N you wish
  var rowsOfSpans = new Array();
  var divsWithChildren = $("div:parent");
  for(var i = 0; i < divsWithChildren.length; i++){
    rowsOfSpans[i] = $(divsWithChildren[i]).children("span.aClass");
  }
  for(var i = 0; i < rowsOfSpans.length; i ++){
    for(var j =0; j < rowsOfSpans[i].length; j++){
      if(j == nTh-1){
        // THIS IS THE NTH ELEMENT
        $(rowsOfSpans[i][j]).html($(rowsOfSpans[i][j]).html() + " : found the " + nTh + "th element").css("background-color", "blue").css("color","white");
      }
    }
  }
});

Loop in jade with curly brackets

I am really struggling to master Jade. I want to do something very, very simple: print out "some text" 3 times. I have a mixin function:
mixin outputText()
  - for (var i = 0; i <= 3; i++)
    span some text
This works fine. Now I want to output more text on a second line, so I first need to use {}, since there will later be 2 spans on 2 different lines. So first, I surround the current loop with curly brackets:
- for (var i = 0; i <= 3; i++){
    span some text
- }
But I get the error: unexpected token "indent"
I have seen someone here doing the EXACT same thing. Why won't it work for me?
Might I recommend iteration? If you are working with values, this is perfect:
ul
  each val, index in ['zero', 'one', 'two']
    li= val
    li Some Text
However, if you are simply looking to repeat lines, you could do this:
- var n = 0
ul
  while n++ < 3
    li Some text
A handy guide by Jade
Try this. When the loop body is surrounded with curly brackets, don't indent the code inside the for loop:
- for (var i = 0; i <= 3; i++){
span some hello
- }
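One more thing worth checking: the "- for (...)" line is plain JavaScript, so the loop bounds behave exactly as they do in JavaScript. A small sketch (the helper repeatSpan is made up for this illustration and is not part of Jade) shows that starting at 0 with i <= 3 produces four spans, while i < 3 produces the three the question asks for.

```javascript
// Build the markup a loop would emit; `keepGoing` plays the role of the
// loop condition so both variants can be compared side by side.
function repeatSpan(text, keepGoing) {
  let html = "";
  for (let i = 0; keepGoing(i); i++) {
    html += "<span>" + text + "</span>";
  }
  return html;
}

const withLessOrEqual = repeatSpan("some text", i => i <= 3); // 0, 1, 2, 3
const withLessThan = repeatSpan("some text", i => i < 3);     // 0, 1, 2

// Count how many spans each variant produced.
const countSpans = html => html.split("<span>").length - 1;
console.log(countSpans(withLessOrEqual), countSpans(withLessThan));
```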

highlight words in html using regex & javascript - almost there

I am writing a jquery plugin that will do a browser-style find-on-page search. I need to improve the search, but don't want to get into parsing the html quite yet.
At the moment my approach is to take an entire DOM element and all nested elements and simply run a regex find/replace for a given term. In the replace I will simply wrap a span around the matched term and use that span as my anchor to do highlighting, scrolling, etc. It is vital that no characters inside any html tags are matched.
This is as close as I have gotten:
(?<=^|>)([^><].*?)(?=<|$)
It does a very good job of capturing all characters that are not in an html tag, but I'm having trouble figuring out how to insert my search term.
Input: Any html element (this could be quite large, eg <body>)
Search Term: 1 or more characters
Replace Txt: <span class='highlight'>$1</span>
UPDATE
The following regex does what I want when I'm testing with http://gskinner.com/RegExr/...
Regex: (?<=^|>)(.*?)(SEARCH_STRING)(?=.*?<|$)
Replacement: $1<span class='highlight'>$2</span>
However, I am having some trouble using it in my JavaScript. With the following code, Chrome gives me the error "Invalid regular expression: /(?<=^|>)(.*?)(Mary)(?=.*?<|$)/: Invalid group".
var origText = $('#'+opt.targetElements).data('origText');
var regx = new RegExp("(?<=^|>)(.*?)(" + $this.val() + ")(?=.*?<|$)", 'gi');
$('#'+opt.targetElements).each(function() {
var text = origText.replace(regx, '$1<span class="' + opt.resultClass + '">$2</span>');
$(this).html(text);
});
It's breaking on the group (?<=^|>) - is this something clumsy or a difference in the Regex engines?
UPDATE
The reason this regex breaks on that group is that JavaScript did not support regex lookbehinds at the time (they were only added in ES2018). For reference & possible solutions: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript.
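One way to sidestep lookbehind entirely is to split the markup on tags and run the replacement only on the text segments in between. The sketch below does that; the function names escapeRegExp and highlightOutsideTags are made up here, and it assumes the markup has no stray < or > inside attribute values or text. The search term is escaped so that regex metacharacters typed by the user are matched literally.

```javascript
// Escape regex metacharacters so the term is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every occurrence of `term` that lies outside a tag in a span.
function highlightOutsideTags(html, term, cssClass) {
  const termRe = new RegExp(escapeRegExp(term), "gi");
  // Splitting with a capturing group keeps the tags in the result:
  // even indices are text segments, odd indices are the tags themselves.
  return html
    .split(/(<[^>]*>)/)
    .map((part, i) =>
      i % 2 === 1
        ? part
        : part.replace(termRe, m => '<span class="' + cssClass + '">' + m + "</span>")
    )
    .join("");
}

const highlighted = highlightOutsideTags(
  '<p class="mary">Mary had a little lamb, mary.</p>',
  "mary",
  "highlight"
);
console.log(highlighted);
```

Note that the class="mary" attribute is left alone because it sits inside a tag segment.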
Just use jQuery's built-in text() method. It will return all the characters in a selected DOM element.
For the DOM approach (docs for the Node interface): run over all child nodes of an element. If the child is an element node, recurse into it. If it's a text node, search in the text (node.data), and if you want to highlight or change something, shorten the text of the node up to the found position, then insert a highlight span with the matched text and another text node for the rest of the text.
Example code (adjusted, origin is here):
(function iterate_node(node) {
  if (node.nodeType === 3) { // Node.TEXT_NODE
    var text = node.data,
        pos = text.search(/any regular expression/g), //indexOf also applicable
        length = 5; // or whatever you found
    if (pos > -1) {
      node.data = text.substr(0, pos); // split into a part before...
      var rest = document.createTextNode(text.substr(pos+length)); // a part after
      var highlight = document.createElement("span"); // and a part between
      highlight.className = "highlight";
      highlight.appendChild(document.createTextNode(text.substr(pos, length)));
      node.parentNode.insertBefore(rest, node.nextSibling); // insert after
      node.parentNode.insertBefore(highlight, node.nextSibling);
      iterate_node(rest); // maybe there are more matches
    }
  } else if (node.nodeType === 1 && node.className !== "highlight") { // Node.ELEMENT_NODE
    // spans inserted above are skipped so a match is never wrapped twice
    for (var i = 0; i < node.childNodes.length; i++) {
      iterate_node(node.childNodes[i]); // run recursive on DOM
    }
  }
})(content); // any dom node
There's also highlight.js, which might be exactly what you want.

HTML Parsing - Get Innermost HTML Tags

When I parse HTML I wish to obtain only the innermost tags for the entire document. My intention is to semantically parse data from the HTML doc.
So if I have some html like this
<html>
<table>
<tr><td>X</td></tr>
<tr><td>Y</td></tr>
</table>
</html>
I want <td>X</td> and <td>Y</td> alone. Is this possible using Beautiful Soup or lxml?
In .NET I've used the HtmlAgilityPack library to make HTML parsing easy. It loads the DOM and lets you select by nodes; in your case, select the nodes with no children. Maybe that helps.
After you have made sure your document is well-formed (by parsing it using lxml, for example), you could use XPath to query for all nodes that have no further child elements.
//*[count(*) = 0]
That's one of the few situations where you could actually use a Regular Expression to parse the HTML string.
\<(\w+)[^>]*>[^\<]*\</\1\s*>
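As a quick check of that pattern, here it is applied to the sample document from the question. This is only a sketch in JavaScript: the backslashes before < are dropped because JavaScript does not need them, and the variable names are made up. It assumes simple, well-formed markup like the sample; it is not a general HTML parser.

```javascript
// An opening tag, text with no further tags, then the matching closing tag.
const innermostTagRe = /<(\w+)[^>]*>[^<]*<\/\1\s*>/g;

const sampleHtml = [
  "<html>",
  "<table>",
  "<tr><td>X</td></tr>",
  "<tr><td>Y</td></tr>",
  "</table>",
  "</html>",
].join("\n");

// match() with the g flag returns every full match, or null if none.
const innermost = sampleHtml.match(innermostTagRe) || [];
console.log(innermost);
```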
If you can use DOM handling (i.e. in a browser), you can follow the parentNode property of every tag, recursively count its ancestors, and keep the element with the largest count.
In JavaScript pseudocode (tested on Firefox):
// Number of ancestors above the given node (the document itself counts as one).
function recursiveCountParentNodeOn(node) {
  return node.parentNode ? 1 + recursiveCountParentNodeOn(node.parentNode) : 0;
}
var allElements = document.getElementsByTagName("*");
var maxElementReference, maxParentNodeCount = 0;
var i;
for (i = 0; i < allElements.length; i++) {
  var count = recursiveCountParentNodeOn(allElements[i]);
  if (maxParentNodeCount < count) {
    maxElementReference = allElements[i];
    maxParentNodeCount = count;
  }
}