I'm using a descendant accessor like so:
var myxml:XMLList = new XMLList;
...
myxml..node
and would like to replace it w/
const sNode:String = 'node';
myxml..[{sNode}]
This sort of thing has worked before.
const sAttrib:String = 'attrib';
myxml.@[{sAttrib}]
works, but trying the same sort of thing w/a descendant accessor causes a compiler error.
Yes, I could do
myxml.descendants(sNode)
but I'd rather do it w/operators, if I can.
The XML might be something like:
<map>
<node>
<node />
</node>
</map>
I'm fairly new to XQuery so forgive me if this is extremely simple.
Essentially I'm searching a corpus of xml data for the word "has", and then I want to be able to return the word that follows immediately after "has" e.g. if the sentence was "has there been a fire?" I would like to return the word "there".
The XML corpus structure looks like this:
<s n="129">
<w c5="NP0" hw="indonesia" pos="SUBST">Indonesia</w>
<w c5="VHZ" hw="have" pos="VERB">has</w>
<w c5="AJ0" hw="large" pos="ADJ">large</w>
<w c5="NN2" hw="industry" pos="SUBST">industries</w>
<c c5="PUN">,</c>
<w c5="AV0" hw="recently" pos="ADV">recently</w>
<w c5="VVN" hw="develop" pos="VERB">developed</w>
</s>
In this sample of data, I'd like the word "large" as it immediately follows "has".
My current XQuery code looks like this:
<hascount>
{
for $v in
doc ("KS0.xml")/bncDoc/stext/div/u/s/w
where
$v = "has"
return ($v)
}
</hascount>
It simply returns all the instances of has at the moment. How would I change this code to be able to perform what my intended task is above?
Thank you in advance.
Try this code:
let $markup:=doc ("KS0.xml")
return $markup//w[matches(.,'^has$')]/following-sibling::w[1]
So I've found the answer to my own question.
This can be done by using the XPath axis "following-sibling".
The implementation of this in XQuery would be:
<hascount>
{
for $v in
doc ("KS0.xml")/bncDoc/stext/div/u/s/w
where
$v = "has"
return ($v/following-sibling::*[1])
}
</hascount>
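For readers without an XQuery processor at hand, the same "next sibling" idea can be sketched in Python. This is only an illustration under stated assumptions: it uses the standard-library xml.etree.ElementTree, whose limited XPath support has no following-sibling axis, so the sibling step is done by pairing each child with the one after it. The helper name words_after is hypothetical, and the sample is the sentence from the question rather than KS0.xml.

```python
import xml.etree.ElementTree as ET

# Sample sentence copied from the question; a real run would parse KS0.xml instead.
SAMPLE = """<s n="129">
<w c5="NP0" hw="indonesia" pos="SUBST">Indonesia</w>
<w c5="VHZ" hw="have" pos="VERB">has</w>
<w c5="AJ0" hw="large" pos="ADJ">large</w>
<w c5="NN2" hw="industry" pos="SUBST">industries</w>
</s>"""


def words_after(sentence, target):
    """Return the text of each element that directly follows a <w> equal to target."""
    children = list(sentence)
    found = []
    # Pair every child with its immediate following sibling.
    for current, following in zip(children, children[1:]):
        if current.tag == "w" and (current.text or "").strip() == target:
            found.append((following.text or "").strip())
    return found


sentence = ET.fromstring(SAMPLE)
print(words_after(sentence, "has"))
```

Like following-sibling::*[1], this takes whatever element comes next, so a punctuation element such as c would also be returned if it directly followed "has".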
How can I merge the content of identical, repeated elements throughout a document using XQuery?
Sample document:
<webMessage xmlns="http://www.website.gov.uk/CM/envelope">
<EnvelopeVersion>2.0</EnvelopeVersion>
<Header>
<MessageDetails>
<Class>Web-CT600</Class>
<Qualifier/>
<Function/>
</MessageDetails>
<SenderDetails>
<IDAuthentication>
<SenderID/>
<Authentication>
<Method/>
<Role/>
<Value/>
</Authentication>
</IDAuthentication>
</SenderDetails>
</Header>
<webTalkDetails>
<Keys>
<Key Type="UTR">2274792909</Key>
</Keys>
<ChannelRouting>
<Channel>
<URI/>
<Product/>
<Version/>
</Channel>
</ChannelRouting>
</webTalkDetails>
<Body>
<IRenvelope xmlns="http://www.website.gov.uk/taxation/CT/3">
<IRheader>
<Keys>
<Key Type="UTR">2274792909</Key>
</Keys>
<PeriodEnd/>
<DefaultCurrency/>
<IRmark Type="generic">n1uS2MiavBsb6YwL82MK</IRmark>
<Sender/>
</IRheader>
<CompanyReturn ReturnType="new">
<CompanyInformation>
<CompanyName/>
<RegistrationNumber/>
<Reference/>
<PeriodCovered>
<From>2013-01-07</From>
<To>2014-01-07</To>
</PeriodCovered>
</CompanyInformation>
<Turnover>
<Total>45893</Total>
</Turnover>
<CompanyCalculation>
<Income>
<TradingAndProfessional>
<Profits>95517</Profits>
<NetProfits>51276</NetProfits>
</TradingAndProfessional>
</Income>
</CompanyCalculation>
<AttachedFiles>
<Xsubmission>
<Accounts>
<Instance>
<EncodedInlineSubmission> TEXT I WANT TO JOIN</EncodedInlineSubmission>
</Instance>
</Accounts>
<Computations>
<Instance>
<EncodedInlineSubmission> MORE TEXT I WANT TO JOIN</EncodedInlineSubmission>
</Instance>
</Computations>
</Xsubmission>
</AttachedFiles>
</CompanyTaxReturn>
</IRenvelope>
</Body>
So in this XML I want to combine the text of all the instances of EncodedInlineSubmission and put it into one single element, so that it reads:
<EncodedInlineSubmission> TEXT I WANT TO JOIN MORE TEXT I WANT TO JOIN</EncodedInlineSubmission>
Update: added an element constructor around the returned string.
You can use fn:string-join() to join a sequence of strings with a joiner string. You'll need to evaluate an XPath expression that selects all the nodes you want to join, and then retrieve their string values.
Here's an example:
declare namespace env = "http://www.website.gov.uk/CM/envelope";
declare namespace ct = "http://www.website.gov.uk/taxation/CT/3";
let $nodes := $doc/env:webMessage/env:Body//ct:EncodedInlineSubmission
return element EncodedInlineSubmission { fn:string-join($nodes/fn:string(), " ") }
Notes:
Assume $doc is bound to the document-node of your sample document
you may need a different XPath expression
your sample is not well-formed
You can simply construct a new element using the concatenated values of all the elements and the name of the first one:
for $x in //*:Xsubmission
let $encoded := $x//*:EncodedInlineSubmission
return element {$encoded[1]/local-name()} {string-join($encoded)}
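As a cross-check outside XQuery, here is a minimal Python sketch of the same join, assuming the standard-library xml.etree.ElementTree. The SAMPLE string is a trimmed, well-formed stand-in for the document in the question (an assumption, since the original sample does not parse). Note that in the sample the EncodedInlineSubmission elements sit below IRenvelope, which redeclares the default namespace, so that is the namespace matched here.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the sample document (hypothetical reduction).
SAMPLE = """<webMessage xmlns="http://www.website.gov.uk/CM/envelope">
  <Body>
    <IRenvelope xmlns="http://www.website.gov.uk/taxation/CT/3">
      <Accounts><Instance>
        <EncodedInlineSubmission> TEXT I WANT TO JOIN</EncodedInlineSubmission>
      </Instance></Accounts>
      <Computations><Instance>
        <EncodedInlineSubmission> MORE TEXT I WANT TO JOIN</EncodedInlineSubmission>
      </Instance></Computations>
    </IRenvelope>
  </Body>
</webMessage>"""

CT_NS = "http://www.website.gov.uk/taxation/CT/3"

root = ET.fromstring(SAMPLE)

# Collect the trimmed text of every matching element, in document order.
parts = [
    (elem.text or "").strip()
    for elem in root.iter(f"{{{CT_NS}}}EncodedInlineSubmission")
]

# Build one new element holding the joined text.
merged = ET.Element("EncodedInlineSubmission")
merged.text = " ".join(parts)
print(ET.tostring(merged, encoding="unicode"))
```

The text is stripped before joining so that the single space used as the joiner is the only separator; drop the strip() call if the leading spaces in the source matter.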
What does the @ in data.@state mean?
<s:State name="normal" basedOn="{data.@state}"/>
Thank you.
@ is the E4X attribute identifier operator.
var myXML:XML =
<order>
<item id='1'>
<menuName>burger</menuName>
<price>3.95</price>
</item>
<item id='2'>
<menuName>fries</menuName>
<price>1.45</price>
</item>
</order>
trace(myXML.item[0].@id); // Output: 1
As others have stated, @ is the E4X attribute identifier.
In the context you have provided, I must assume that data is an XMLList. But it may be an XML variable. In the context of Flex it may also be an XMLListCollection, which is just a wrapper around an XMLList used as the dataProvider to a Flex list-based class.
I assume that the data variable must point to something like this:
<someElement state="someStateValue"> </someElement>
And therefore, data.@state should return the value 'someStateValue'
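For comparison only, the same attribute lookup can be sketched in Python with the standard-library xml.etree.ElementTree. This is not E4X and not what Flex does internally; it just shows the equivalent read of a state attribute on the element shape assumed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical element matching the shape assumed above.
data = ET.fromstring('<someElement state="someStateValue"> </someElement>')

# Element.get() returns the attribute value, or None if the attribute is absent.
print(data.get("state"))
print(data.get("missing"))
```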
I'm processing an HTML page with a variable number of p elements with a css class "myclass", using Python + Selenium RC.
When I try to select each node with this xpath:
//p[@class='myclass'][n]
(with n a natural number)
I get only the first p element with this css class for every n, unlike the situation if I iterate through selecting ALL p elements with:
//p[n]
Is there any way I can iterate through elements by css class using xpath?
XPath 1.0 doesn't provide an iterating construct.
Iteration can be performed on the selected node-set in the language that is hosting XPath.
Examples:
In XSLT 1.0:
<xsl:for-each select="someExpressionSelectingNodes">
<!-- Do something with the current node -->
</xsl:for-each>
In C#:
using System;
using System.IO;
using System.Xml;
public class Sample {
public static void Main() {
XmlDocument doc = new XmlDocument();
doc.Load("booksort.xml");
XmlNodeList nodeList;
XmlNode root = doc.DocumentElement;
nodeList=root.SelectNodes("descendant::book[author/last-name='Austen']");
//Change the price on the books.
foreach (XmlNode book in nodeList)
{
book.LastChild.InnerText="15.95";
}
Console.WriteLine("Display the modified XML document....");
doc.Save(Console.Out);
}
}
XPath 2.0 has its own iteration construct:
for $varname1 in someExpression1,
$varname2 in someExpression2,
. . . . . . . . . . .
$varnameN in someExpressionN
return
SomeExpressionUsingTheVarsAbove
Now that I look again at this question, I think the real problem is not in iterating, but in using //.
This is a FAQ:
//p[@class='myclass'][1]
selects every p element that has a class attribute with value "myclass" and that is the first such child of its parent. Therefore this expression may select many p elements, none of which is really the first such p element in the document.
When we want to get the first p element in the document that satisfies the above predicate, one correct expression is:
(//p)[@class='myclass'][1]
Remember: The [] operator has a higher priority (precedence) than the // abbreviation.
Whenever you need to index the nodes selected by //, always put the expression to be indexed in parentheses.
Here is a demonstration:
<nums>
<a>
<n x="1"/>
<n x="2"/>
<n x="3"/>
<n x="4"/>
</a>
<b>
<n x="5"/>
<n x="6"/>
<n x="7"/>
<n x="8"/>
</b>
</nums>
The XPath expression:
//n[@x mod 2 = 0][1]
selects the following two nodes:
<n x="2" />
<n x="6" />
The XPath expression:
(//n)[@x mod 2 = 0][1]
selects exactly the first n element in the document with the wanted property:
<n x="2" />
Try this first with the following transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select="//n[@x mod 2 = 0][1]"/>
</xsl:template>
</xsl:stylesheet>
and the result is two nodes.
<n x="2" />
<n x="6" />
Now, change the XPath expression as below and try again:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select="(//n)[@x mod 2 = 0][1]"/>
</xsl:template>
</xsl:stylesheet>
and the result is what we really wanted -- the first such n element in the document:
<n x="2" />
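Since the original question was asked from Python, the difference between the two expressions can also be sketched there. This is an emulation under stated assumptions: the standard-library xml.etree.ElementTree does not support mod or parenthesized XPath expressions, so the predicate is applied in plain Python (the helper is_even is hypothetical) and the two selection semantics are reproduced by hand.

```python
import xml.etree.ElementTree as ET

DOC = """<nums>
  <a><n x="1"/><n x="2"/><n x="3"/><n x="4"/></a>
  <b><n x="5"/><n x="6"/><n x="7"/><n x="8"/></b>
</nums>"""

root = ET.fromstring(DOC)


def is_even(n):
    """Stand-in for the predicate [@x mod 2 = 0]."""
    return int(n.get("x")) % 2 == 0


# Like //n[@x mod 2 = 0][1]: the first match among the children of each parent.
per_parent = []
for parent in root.iter():
    matches = [c for c in parent if c.tag == "n" and is_even(c)]
    if matches:
        per_parent.append(matches[0])

# Like (//n)[@x mod 2 = 0][1]: the first match in the whole document.
all_matches = [n for n in root.iter("n") if is_even(n)]
first_in_doc = all_matches[:1]

print([n.get("x") for n in per_parent])
print([n.get("x") for n in first_in_doc])
```

The first list has one entry per parent that contains a match, while the second has at most one entry for the entire document, which mirrors the two XSLT results above.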
Maybe all your p elements with this class are at the same level, so with //p[@class='myclass'] you receive the array of paragraphs with the specified class. So you should iterate through it using indexes, i.e.
//p[@class='myclass'][1], //p[@class='myclass'][2],...,//p[@class='myclass'][last()]
I don't think you're using the "index" for its real purpose. The //p[selection][index] syntax is actually telling you which matching element within its parent it should be. So //p[selection][1] says that your selected p must be the first matching p among its parent's children, and //p[selection][2] says it must be the second. Depending on your HTML, it's likely this isn't what you want.
Given that you're using Selenium and Python, there's a couple ways to do what you want, and you can look at this question to see them (there are two options given there, one in selenium Javascript, the other using the server-side selenium calls).
Here's a C# code snippet that might help you out.
The key here is the Selenium function GetXpathCount(). It should return the number of occurrences of the Xpath expression you are looking for.
You can enter //p[@class='myclass'] in XPather or any other XPath analysis tool to verify that multiple results are indeed returned. Then you just iterate through the results in your code.
In my case, it was all the list items in a UL that needed to be iterated, i.e. //li[@class='myclass']/ul/li, so based on your requirements it should be something like:
int numProductsInLeftNav = Convert.ToInt32(selenium.GetXpathCount("//p[@class='myclass']"));
List<string> productsInLeftNav = new List<string>();
for (int i = 1; i <= numProductsInLeftNav; i++) {
string productName = selenium.GetText("//p[@class='myclass'][" + i + "]");
productsInLeftNav.Add(productName);
}
What are the main differences between SelectNodes and GetElementsByTagName?
SelectNodes is a .NET/MSXML-specific method that gets a list of matching nodes for an XPath expression. XPaths can select elements by tag name but can also do lots of other, more complicated selection rules.
getElementsByTagName is a DOM Level 1 Core standard method available in many languages (but spelled with a capital G in .NET). It selects elements only by tag name; you can't ask it to select elements with a certain attribute, or elements with tag name a inside other elements with tag name b or anything clever like that. It's older, simpler, and in some environments faster.
SelectNodes takes an XPath expression as a parameter and returns all nodes that match that expression.
GetElementsByTagName takes a tag name as a parameter and returns all tags that have that name.
SelectNodes is therefore more expressive, as you can write any GetElementsByTagName call as a SelectNodes call, but not the other way around. XPath is a very robust way of expressing sets of XML nodes, offering more ways of filtering than just name. XPath, for example, can filter by tag name, attribute names, inner content and various aggregate functions on tag children as well.
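The gap in expressiveness can be illustrated with a small Python sketch. This is an analogy, not the .NET API: it assumes the standard-library xml.etree.ElementTree, where iter(tag) plays the role of a tag-name lookup and findall() accepts a limited subset of XPath. The bookstore document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only for this illustration.
DOC = """<bookstore>
  <book><title>Emma</title><author>Austen</author></book>
  <book><title>Moby-Dick</title><author>Melville</author></book>
  <magazine><title>Monthly</title></magazine>
</bookstore>"""

root = ET.fromstring(DOC)

# Tag-name lookup: every <book>, with no further filtering possible.
by_tag = [b.findtext("title") for b in root.iter("book")]

# Path lookup with a predicate: only books whose <author> child has the text 'Austen'.
by_path = [b.findtext("title") for b in root.findall(".//book[author='Austen']")]

print(by_tag)
print(by_path)
```

Any tag-name lookup can be rewritten as a path (here .//book), but the predicate in the second query has no tag-name-only equivalent; the filtering would have to be done by hand afterwards.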
SelectNodes() is a Microsoft extension to the Document Object Model (DOM) (msdn).
SelectNodes, as mentioned by Welbog and others, takes an XPath expression. I would like to mention a difference from GetElementsByTagName() that matters when XML nodes need to be deleted.
Answer and code provided by user chilberto on the MSDN forum:
The next test illustrates the difference by performing the same function (removing the person nodes) but using the GetElementsByTagName() method to select the nodes. Though the same object type is returned, its construction is different. The list returned by SelectNodes() is a collection of references back to the XML document. That means we can remove nodes from the document in a foreach without affecting the list of references. This is shown by the count of the nodelist not being affected. The list returned by GetElementsByTagName() is a collection that directly reflects the nodes in the document. That means that as we remove the items from the parent, we actually affect the collection of nodes. This is why the nodelist cannot be manipulated in a foreach and the loop had to be changed to a while loop.
.NET SelectNodes()
[TestMethod]
public void TestSelectNodesBehavior()
{
XmlDocument doc = new XmlDocument();
doc.LoadXml(@"<root>
<person>
<id>1</id>
<name>j</name>
</person>
<person>
<id>2</id>
<name>j</name>
</person>
<person>
<id>1</id>
<name>j</name>
</person>
<person>
<id>3</id>
<name>j</name>
</person>
<business></business>
</root>");
XmlNodeList nodeList = doc.SelectNodes("/root/person");
Assert.AreEqual(5, doc.FirstChild.ChildNodes.Count, "There should have been a total of 5 nodes: 4 person nodes and 1 business node");
Assert.AreEqual(4, nodeList.Count, "There should have been a total of 4 nodes");
foreach (XmlNode n in nodeList)
n.ParentNode.RemoveChild(n);
Assert.AreEqual(1, doc.FirstChild.ChildNodes.Count, "There should have been only 1 business node left in the document");
Assert.AreEqual(4, nodeList.Count, "There should have been a total of 4 nodes");
}
.NET GetElementsByTagName()
[TestMethod]
public void TestGetElementsByTagNameBehavior()
{
XmlDocument doc = new XmlDocument();
doc.LoadXml(@"<root>
<person>
<id>1</id>
<name>j</name>
</person>
<person>
<id>2</id>
<name>j</name>
</person>
<person>
<id>1</id>
<name>j</name>
</person>
<person>
<id>3</id>
<name>j</name>
</person>
<business></business>
</root>");
XmlNodeList nodeList = doc.GetElementsByTagName("person");
Assert.AreEqual(5, doc.FirstChild.ChildNodes.Count, "There should have been a total of 5 nodes: 4 person nodes and 1 business node");
Assert.AreEqual(4, nodeList.Count, "There should have been a total of 4 nodes");
while (nodeList.Count > 0)
nodeList[0].ParentNode.RemoveChild(nodeList[0]);
Assert.AreEqual(1, doc.FirstChild.ChildNodes.Count, "There should have been only 1 business node left in the document");
Assert.AreEqual(0, nodeList.Count, "All the nodes have been removed");
}
With SelectNodes() we get a collection of references to the nodes of the XML document, and we can work with those references. If we delete a node, the change is visible in the XML document, but the collection of references stays the same (although the reference to the deleted node now points to null, which leads to a System.NullReferenceException). I do not really know how this is implemented. I suppose that if we use XmlNodeList nodeList = GetElementsByTagName() and delete a node with nodeList[i].ParentNode.RemoveChild(nodeList[i]), it frees/deletes the reference in the nodeList variable.
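The snapshot-versus-live distinction is not specific to .NET. As a loose analogy only (this says nothing about how XmlNodeList is implemented), the standard-library xml.etree.ElementTree in Python shows the same hazard: iterating an element directly walks its live child list, while findall() returns an ordinary list that is unaffected by later removals. The function names below are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
  <person><id>1</id></person>
  <person><id>2</id></person>
  <person><id>1</id></person>
  <person><id>3</id></person>
  <business/>
</root>"""


def remove_live(root):
    # Iterating the element itself walks the live child list while it shrinks,
    # so some children are skipped.
    for child in root:
        if child.tag == "person":
            root.remove(child)
    return len(root.findall("person"))


def remove_snapshot(root):
    # findall() returns an ordinary list, so removals do not disturb the loop.
    for child in root.findall("person"):
        root.remove(child)
    return len(root.findall("person"))


live_left = remove_live(ET.fromstring(DOC))
snapshot_left = remove_snapshot(ET.fromstring(DOC))
print(live_left, snapshot_left)
```

The snapshot loop removes every person element, whereas the live loop leaves some behind, which is the same reason the GetElementsByTagName() test above needs a while loop instead of a foreach.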