What is a way to uniquely identify all DOM nodes in an HTML document? To illustrate what I mean, here is a (fictional) example:
Script X randomly selects a DOM node from document.html.
Script X needs to tell script Y which DOM node it has chosen.
How does script X uniquely identify the DOM node it has chosen so that script Y knows exactly which node it is in document.html?
I'm really interested in how to uniquely identify the DOM node so that script Y can find it and manipulate it. Preferably, it should work with text nodes as well. I was thinking of XPath, but I'm not sure how to generate a unique XPath to any given node.
You should be able to determine a unique XPath by working backwards from the node to the root node, and tracking the node you're on, and which sibling it is, such that you get something like:
/a[1]/b[2]/c[101]/text()
so that's the 101st c node under the second b node, and so on. That path is unique, and it can be passed around and resolved against the original document.
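A minimal sketch of that walk in Python, using the standard library's xml.dom.minidom as a stand-in for whatever DOM the two scripts share. The helper name xpath_for and the sample document are made up for illustration, and only element and text nodes are handled:

```python
from xml.dom import minidom, Node


def xpath_for(node):
    """Return a positional XPath for an element or text node (hypothetical helper)."""
    steps = []
    while node.nodeType != Node.DOCUMENT_NODE:
        parent = node.parentNode
        if node.nodeType == Node.ELEMENT_NODE:
            # Siblings that the step name[k] would count: same-named elements.
            peers = [n for n in parent.childNodes
                     if n.nodeType == Node.ELEMENT_NODE and n.tagName == node.tagName]
            label = node.tagName
        elif node.nodeType == Node.TEXT_NODE:
            peers = [n for n in parent.childNodes if n.nodeType == Node.TEXT_NODE]
            label = "text()"
        else:
            raise ValueError("only element and text nodes are handled in this sketch")
        # XPath positions are 1-based; compare by identity, not equality.
        index = next(i for i, n in enumerate(peers, start=1) if n is node)
        steps.append(f"{label}[{index}]")
        node = parent
    return "/" + "/".join(reversed(steps))


doc = minidom.parseString("<a><b/><b><c>x</c><c>hello</c></b></a>")
target = doc.getElementsByTagName("c")[1].firstChild  # the text node "hello"
path = xpath_for(target)
print(path)
```

For the sample document this yields /a[1]/b[2]/c[2]/text()[1]. Script Y can evaluate the same string against its own copy of the document, provided both copies are parsed the same way.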
You might want to take a look at XPathGen https://github.com/amouat/XPathGen
It will create a unique XPath of the form /node()[1]/node()[1] for a given DOM node. However, there are some issues with XPath, namely non-coalesced text nodes and "prolog" nodes, which cannot be uniquely identified purely with XPath. For example if you have the following document in DOM:
<a>b</a>
And add a text node to become:
<a>bc</a>
The XPath to nodes b and c will be the same, but you will still have separate DOM nodes (unless you call normalize on the document). If you need to handle this situation you will need to store offsets and lengths for text nodes.
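Here is a small sketch of that situation in Python with xml.dom.minidom (chosen only because it ships with the standard library; the offset bookkeeping is one possible scheme, not something XPathGen is claimed to do):

```python
from xml.dom import minidom

doc = minidom.parseString("<a>b</a>")
a = doc.documentElement
a.appendChild(doc.createTextNode("c"))  # second, non-coalesced text node

before = [child.data for child in a.childNodes]  # two separate DOM text nodes
serialized = a.toxml()                           # serializes as one run of text

# Assumed addressing scheme: (offset, length) of each text node
# within the concatenated text of the parent.
offsets = []
position = 0
for child in a.childNodes:
    offsets.append((position, len(child.data)))
    position += len(child.data)

a.normalize()  # merges adjacent text nodes
after = [child.data for child in a.childNodes]

print(before, serialized, offsets, after)
```

Before normalize() the element has two text children that XPath cannot tell apart; the (offset, length) pairs are what distinguishes them.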
Well, an XPath expression that results in a single node should be unique. What do you mean by "how to generate a unique XPath to any given node"?
Ordinal child positions along XPath axes. Nodes are strongly ordered, and so saying:
child 1 of child 3 of child 4 of child 5.
should do it.
In XSLT, what is the difference between the "current node" and the "context node"? You can find both terms used here: http://www.w3.org/TR/xslt.
When would you use one or the other? How do you refer to each?
The current node is whatever the template is currently operating on. Normally this happens to also be the context node, but the context node has special meaning within a nested XPath expression (the part in square brackets). There, it refers to whatever node is currently being tested for a match. Hence, the context node changes within the XPath expression, but not the current node.
The context node can be abbreviated with a dot (.) or sometimes left out entirely. This is probably a little confusing, because outside of a nested expression, a dot signifies the current node. (In that case the current node happens to be the context node, so one might say that it is the current node only proximately, and it is more properly called the context node. But even the spec calls it the current node here.)
Since a dot gives you the context node, in a nested XPath expression the user needs a way to refer back to the current node, the one being processed by the current template. You can do this via the current() function.
Distinguishing these two is useful in some cases. For instance, suppose you have some XML like this:
<a>
<b>
<c>foo<footnote fn="1"/></c>
<d>bar</d>
</b>
<b>
<c>baz</c>
<d>aak<footnote fn="2"/></d>
</b>
<b>
<c>eep</c>
<d>blech<footnote fn="2"/></d>
</b>
<footnote-message fn="1">Batteries not included.</footnote-message>
<footnote-message fn="2">Some assembly required.</footnote-message>
</a>
Now suppose you want to convert it to LaTeX like this:
foo\footnote{Batteries not included.}
bar
baz
aak\footnote{Some assembly required.}
eep
blech\footnotemark[2]
The trick is to tell whether a footnote has already been used or not. If this is the first time you've encountered the footnote, you want to write a \footnote command; otherwise you want to write a \footnotemark command. You could use XSL code like this:
<xsl:choose>
<xsl:when test="count(preceding::*[./@fn = current()/@fn]) = 0">\footnote{...}</xsl:when>
<xsl:otherwise>\footnotemark[...]</xsl:otherwise>
</xsl:choose>
Here we are comparing the context-node fn attribute (from the results of the preceding::* node-set) to the current-node fn attribute. (You don't actually have to say ./@fn; you could just say @fn.)
So in short, the context node leaves you inside the XPath predicate; the current node reaches outside the predicate, back to the node being processed by the current template.
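If it helps, the same test can be written as an ordinary nested loop. This is a Python analogue using xml.etree.ElementTree, not XSLT; the names is_first_use, current and context are mine, and the closing tags of the footnote-message elements are written so that the sample is well-formed. The outer variable plays the role of current(), the inner one the role of the context node:

```python
import xml.etree.ElementTree as ET

SOURCE = """<a>
  <b><c>foo<footnote fn="1"/></c><d>bar</d></b>
  <b><c>baz</c><d>aak<footnote fn="2"/></d></b>
  <b><c>eep</c><d>blech<footnote fn="2"/></d></b>
  <footnote-message fn="1">Batteries not included.</footnote-message>
  <footnote-message fn="2">Some assembly required.</footnote-message>
</a>"""

root = ET.fromstring(SOURCE)
in_document_order = list(root.iter())


def is_first_use(current):
    """Rough equivalent of count(preceding::*[@fn = current()/@fn]) = 0."""
    position = next(i for i, e in enumerate(in_document_order) if e is current)
    # Simplification: this slice also contains ancestors, which the
    # preceding:: axis excludes. They carry no fn attribute here, so the
    # result is the same for this document.
    earlier = in_document_order[:position]
    # 'context' changes on every iteration, like the context node in the
    # predicate; 'current' stays fixed, like current().
    return not any(context.get("fn") == current.get("fn") for context in earlier)


results = [(fn.get("fn"), is_first_use(fn)) for fn in root.iter("footnote")]
print(results)
```

The first use of each footnote number comes out True (write \footnote) and the repeated use of footnote 2 comes out False (write \footnotemark).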
Context Node
The context node is part of the XPath evaluation context and varies with each location step:
step1 / step2 / step3 / ...
where each step is
axis::node-test[predicate]
Each step is evaluated with respect to the context nodes set by the preceding steps.
Each step then selects nodes that become the context node for following steps.
When evaluating predicate, the context node is the node along axis that has passed node-test.
The context node can be accessed as . (a single dot).
Current Node
The current node is part of the XSLT processing model (see note 1):
The current node is the node in the source XML document best matched by an XSLT template.
The current node becomes the starting context node for each XPath expression in the matched template.
The current node can be accessed as current() within XPath predicates.
Note 1: Although insignificant to understanding the basic difference between context node and current node, note that in XSLT 2.0 the description of the evaluation context has been changed. The concepts of current node and current node list have been replaced by the XPath concepts of context item, context position, and context size.
I need to search through its contents with a recursive function, so that it returns a boolean response depending on whether the value I read was found or not. I dunno how to make it work. Here's the type for the tree I defined:
text=string[30];
list=^nodeL;
nodeL=record
title:text;
ISBN:text;
next:list;
end;
tree=^nodeT;
nodeT=record
cod:text;
l:list;
LC:tree;
RC:tree;
end;
This looks like a "please do my assignment for me" post, which I won't do. I will try to help you do the assignment yourself.
I don't know exactly what your assignment is, so I'm going to have to make some guesses.
I think your assignment is to write a recursive function that will search a tree and return a boolean response depending on whether a value (input to the function) is found or not.
I don't know how the tree gets its content. You say, you defined the tree type, so I'm guessing that means you are not provided with a tree that already has content. So, at least for testing purposes, you are going to have to write code to add content to the tree (so you can search it).
I don't know exactly what kind of tree you are supposed to create. Usually trees have rules about how the items are arranged in the tree. A common type is the binary search tree, where for each node, the items in the left subtree (if present) are "less than" the node's item, and the items in the right subtree (if present) are "greater than" it. You probably need this rule when adding items (i.e. content) to the tree.
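To make the ordering rule concrete, here is a generic sketch in Python rather than Pascal. It is only the general pattern, not your assignment: the names TreeNode, insert and contains are invented, and each node holds a single string key instead of your record layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    key: str
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def insert(node, key):
    """Insert key below node, keeping smaller keys left and larger keys right."""
    if node is None:
        return TreeNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node  # duplicates are ignored


def contains(node, key):
    """Recursive search: an empty subtree means the key is not present."""
    if node is None:
        return False
    if key == node.key:
        return True
    if key < node.key:
        return contains(node.left, key)
    return contains(node.right, key)


root = None
for code in ["M", "C", "T", "A", "F"]:
    root = insert(root, code)

print(contains(root, "F"), contains(root, "Z"))
```

The two base cases (empty subtree, key found) are what stop the recursion; everything else just picks which side to descend into.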
I think you need to change your definition of the tree node, nodeT (I could be wrong). A tree is a kind of linked structure; it does not usually contain linked lists. Usually each tree node contains one item of data (not a list of items).
If I were doing this assignment (and learning to program in Pascal) I would do the following (in this order):
Make sure I understand linked lists (at least singly linked lists). Write at least one program to add data to a linked list and search it (do not use recursion).
Make sure I understand recursion. Read some tutorials on recursion (that do not use linked lists, or trees). For example "First Textbook Examples of Recursion". Write at least one program that uses recursion (do not use linked lists or trees).
Make sure I understand trees. Read some tutorials on trees. For example, "Binary Search Trees".
Do the assignment.
P.S. You might want to change the name of your text type from "text", because, in Pascal, "text" is the name of a predefined type, for text files.
How can I get the output of this generator? out.next() or next(out) doesn't work:
out=nx.tree_all_pairs_lowest_common_ancestor(G)
print(out)
<generator object tree_all_pairs_lowest_common_ancestor at 0x000002BE4EF90D48>
nx.tree_all_pairs_lowest_common_ancestor is only aimed at working on certain graph structures, as mentioned in the docs. In the case no root is specified, as in your case, the function will do as follows:
If root is not specified, find the exactly one node with in degree 0 and
use it. Raise an error if none are found, or more than one is. Also check
for any nodes with in degree larger than 1, which would imply G is not a
tree.
So it is likely that your graph either has multiple root nodes or has none, i.e. your graph is not a tree. Because the result is a generator, that check only runs when you start consuming it, which is why print(out) succeeds but next(out) fails. You can either search locally using a breadth-first search, or specify the root node of the subtree to operate on in nx.tree_all_pairs_lowest_common_ancestor.
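The lazy behaviour is easy to reproduce without networkx. The sketch below uses a made-up generator function, lazy_pairs, that performs a similar "exactly one root" check on a plain edge list; it is not networkx's implementation, just an illustration of why creating the generator works while the first next() raises:

```python
def lazy_pairs(edges):
    """Hypothetical stand-in: validates the edge list only once iteration starts."""
    nodes = {node for edge in edges for node in edge}
    has_parent = {child for _, child in edges}
    roots = nodes - has_parent  # nodes with in-degree 0
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    for parent, child in edges:
        yield parent, child


out = lazy_pairs([("a", "c"), ("b", "c")])  # two roots: a and b
print(out)  # a generator object; nothing has been checked yet

try:
    next(out)  # the body runs now, and so does the root check
    error = None
except ValueError as exc:
    error = str(exc)

good = list(lazy_pairs([("r", "x"), ("r", "y")]))  # a proper tree
print(error, good)
```

With a valid tree, list(out), a for loop, or next(out) all work as usual.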
I need to travel through all the ancestors or descendants of a matched AST node to later use that info to modify parts of the input source code.
I tried to look for ways to do that. I looked at the getParents member function of the ASTContext class. I could use that to go up the AST hierarchy and visit all the ancestor nodes of my currently matched node. The problem is that when I get the parent node, I no longer have the context for that node to try and get its parent. I could try to rebuild the ASTContext for the new node, but that seems to be another big task on its own, if it is possible at all.
The lowest NodeKind (lowest in the C hierarchy) I'm looking for is a callExpr and the highest I'm looking for is a functionDecl.
How can I obtain all the ancestors or descendants of a matched AST node after the match returns control to MatchCallback?
It may be possible to keep reaching for a parent declaration recursively until you reach TranslationUnitDecl, however, I would instead suggest actually iterating over the declarations in TranslationUnitDecl and working your way toward the FunctionDecl instead.
You can create a recursive function which finds all TagDecl in a translation unit, searches all methods in that class for the FunctionDecl you specify, and also recursively consumes TagDecls within that TagDecl, until you have nothing left to consume.
This would allow you to more easily keep a complete record of the specific AST nodes you want, and would probably be less confusing to write.
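However, if you choose to work your way backward, you can try something like this (untested):
// FD is assumed to point at the matched FunctionDecl.
FunctionDecl *FD;
DeclContext *PC = FD->getParent();
// Walk up the enclosing declaration contexts until the translation unit.
while (PC && !isa<TranslationUnitDecl>(Decl::castFromDeclContext(PC))) {
  // consume PC here
  PC = PC->getParent();
}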
For descendants (children), you'll just have to cast to a type that has children and iterate over them.
I am a total rookie and am trying to query data from a website and import it into a Google Docs spreadsheet. I have used Firebug/FirePath to find the XPath, but when I paste the XPath into a cell as =importxml(Url, query), it errors.
Here it is the url: http://www.sportfishingreport.com/pages/boatdetail.php?boat_id=781
Boat Trip Type Anglers Catch
03-22-2015 Full Day 21 48 Ocean Whitefish, 210 Rockfish, 21 Lingcod
Can someone help me write the XPath? The one that FirePath gives me produces an error in Google Docs.
Thanks in advance, Jess
There are no tables in the source HTML of the second page you have indicated (that is, http://www.channelislandssportfishing.com/fish-counts). If anything, those tables are generated by JavaScript, and such content cannot be found by IMPORTXML, because it operates on the raw source HTML.
But what you get from Firepath is endlessly complicated anyway, because the tool tends to return path expressions that rely on positions of nodes, rather than actual values, or IDs, or names. If you look at the source HTML, the portion of HTML that contains "Erna B" looks like
<a href="/erna-b-sportfishing">Erna B</a>
And there is in fact a trivial XPath expression that selects this content, because the href attribute value is unique. To have "Erna B" appear in a cell in Google Sheets, use
=IMPORTXML("http://www.channelislandssportfishing.com/fish-counts","//a[@href='/erna-b-sportfishing']")
For all other cells, look for similar properties that uniquely identify nodes, and turn those into path expressions.
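To see the difference between a value-based and a position-based path outside of Sheets, here is a small Python sketch using xml.etree.ElementTree. The HTML fragment is invented for illustration (in particular the /other-boat link); only the Erna B href comes from the page:

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed fragment standing in for the page source.
SNIPPET = """<div>
  <ul>
    <li><a href="/other-boat">Other Boat</a></li>
    <li><a href="/erna-b-sportfishing">Erna B</a></li>
  </ul>
</div>"""

root = ET.fromstring(SNIPPET)

# Value-based: keeps working if list items are added or reordered.
by_value = root.findall(".//a[@href='/erna-b-sportfishing']")

# Position-based, in the style FirePath tends to produce:
# breaks as soon as the order of the list changes.
by_position = root.findall("./ul/li[2]/a")

print(by_value[0].text, by_position[0].text)
```

Both paths select the same link in this fragment, but only the first one says what it is looking for.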