XSLT - Remove nodes when current node is equal to previous node - html

I need to write an XSLT stylesheet for XML in the format below.
<books>
<book>
<a>name</a>
<a>name</a>
<b>name</b>
<b>name</b>
</book>
</books>
I need to eliminate duplicate child nodes under certain conditions.
A node should be removed only if it is equal to the previous node (current node == previous node).
I.e., if the previous node (element) is <a> and the current node (element) is also <a>, then one of them should be removed.
The output for the above should be:
`<a>name</a>`
`<b>name</b>`
Please help me do this.

In XSLT 2 or 3 you can easily group adjacent sibling elements by their node name with for-each-group select="*" group-adjacent="node-name()" and simply output the first item in each group (which is equal to the context item .):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="book">
<xsl:copy>
<xsl:for-each-group select="*" group-adjacent="node-name()">
<xsl:copy-of select="."/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/6qVRKw4/1

As I understand it, you want to omit a leaf element (one without child elements)
if its immediately preceding sibling:
is also a leaf element,
has the same name,
has the same text content.
So the most intuitive solution (I think) is to write an empty template,
matching just these nodes:
<xsl:template match="*[not(*)][preceding-sibling::*[1][not(*)]
[name() = current()/name()][text() = current()/text()]]"/>
A brief description of the match attribute:
*[not(*)] - Every element without any child element (leaf element).
[ - Start of the second predicate.
preceding-sibling::*[1] - Take the first preceding sibling.
[not(*)] - It must not have any child element.
[name() = current()/name()] - It must have the same name as the
"starting" element.
[text() = current()/text()] - It must have the same text as the
"starting" element.
] - End of the second predicate.
Of course, the script must also contain an identity template.
For a working example, with a slightly extended source, see http://xsltransform.net/jxN8Nqm
If the same-text requirement is not needed, delete the corresponding
predicate fragment.
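A minimal sketch of a complete stylesheet along these lines, written for XSLT 1.0. The pattern above calls current() inside a match pattern and uses the current()/name() step syntax, both of which need XSLT 2.0 or later, so this variant expresses the comparison relative to the matched element instead. It compares string values rather than text() nodes, which should be equivalent for leaf elements. Only the sample input from the question is assumed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy every node unless a more specific template matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: drop a leaf element whose immediately preceding sibling
       element is also a leaf, has the same name and the same string value.
       name() of an empty node-set is '', so a first child is never dropped. -->
  <xsl:template match="*[not(*)]
                        [name() = name(preceding-sibling::*[1][not(*)])]
                        [. = preceding-sibling::*[1]]"/>
</xsl:stylesheet>
```

To compare names only, as in the original question, remove the last predicate.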

Related

XSLT creates an empty space after converting to HTML

I don't get it.
My xml input:
<?xml version="1.0" encoding="UTF-8"?>
<results>
<error file="mixed.cpp" line="11" id="unreadVariable" severity="style" msg="Variable 'wert' is assigned a value that is never used."/>
<error file="mixed.cpp" line="13" id="unassignedVariable" severity="style" msg="Variable 'b' is not assigned a value."/>
<error file="mixed.cpp" line="11" id="arrayIndexOutOfBounds" severity="error" msg="Array 'wert[2]' accessed at index 3, which is out of bounds."/>
<error file="mixed.cpp" line="15" id="uninitvar" severity="error" msg="Uninitialized variable: b"/>
<error file="mixed.cpp" line="5" id="unusedFunction" severity="style" msg="The function 'func' is never used."/>
<error file="*" line="0" id="unmatchedSuppression" severity="style" msg="Unmatched suppression: missingIncludeSystem"/>
</results>
using this xsl file:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:template match="error">
<tr>
<td><xsl:value-of select="@file"/></td>
<td><xsl:value-of select="@line"/></td>
<td><xsl:value-of select="@test"/></td>
<td><xsl:value-of select="@severity"/></td>
<td><xsl:value-of select="@msg"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
But the first line I get is empty:
empty line
<tr><td>mixed.cpp</td><td>11</td><td/><td>style</td><td>Variable 'wert' is assigned a value that is never used.</td></tr>
Where is the empty line coming from?
The built-in default templates kick in for nodes that your error template does not match, and the default template for text nodes just outputs the text. Since your input contains whitespace-only text nodes and you have no template matching results, the whitespace inside results (before and after each error element) becomes part of the output.
There are multiple ways to fix this. A typical method is to write a low-priority template that matches the text nodes you do not want in the output. For example, if you add the following, the whitespace will disappear:
<xsl:template match="text()" />
Another approach would be to positively match your structure. I.e., if you would add the following, the whitespace also disappears, because now you match the root element and subsequently only apply templates on the elements that you are interested in (and not also the text nodes under results).
<xsl:template match="results">
<xsl:apply-templates select="error" />
</xsl:template>
A third approach would be to add a whitespace-stripping declaration, but this may influence the input XML if your actual stylesheet is larger and would depend on whitespace elsewhere. This would only strip the whitespace on the results element:
<xsl:strip-space elements="results"/>
All three solutions work; which one is most suitable depends on your project as a whole.
Remember that in XSLT 1.0 and XSLT 2.0, nodes that no template matches are handled by the built-in default templates (which are invisible): for elements these apply templates to the children, and for text nodes they simply output the text value. In XSLT 3.0 you have more control over this process:
<!-- XSLT 3.0 only -->
<xsl:mode on-no-match="shallow-skip" />
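Putting the second approach together with the stylesheet from the question, a complete XSLT 1.0 version might look like the sketch below. The third column uses @id on the assumption that the question's @test was meant to be the id attribute, since the input has no test attribute (which is why that cell came out as an empty <td/>).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Match the root element explicitly and process only its error children,
       so the whitespace text nodes inside results never reach the built-in
       text template. -->
  <xsl:template match="results">
    <xsl:apply-templates select="error"/>
  </xsl:template>

  <xsl:template match="error">
    <tr>
      <td><xsl:value-of select="@file"/></td>
      <td><xsl:value-of select="@line"/></td>
      <!-- Assumption: @id replaces the question's @test. -->
      <td><xsl:value-of select="@id"/></td>
      <td><xsl:value-of select="@severity"/></td>
      <td><xsl:value-of select="@msg"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

With this version the output should start directly with the first tr element, without a leading blank line.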

Populate Drop Down List with XML Node Distinct Values Only

I'm trying to produce a search filter that will work with an existing XML data file and I'm starting with trying to display just unique values from one of the nodes in a drop down list box.
An example of the XML file can be seen at...
http://kirk.rapweb.co.uk/testing/filter/tidy/plain/products.xml
I've managed to use XSLT to display the XML in a more readable format...
http://kirk.rapweb.co.uk/testing/filter/tidy/products.xml
The same data with filter applied...
http://kirk.rapweb.co.uk/testing/filter/tidy/filter/products.xml
The values I would like to list in the 1st drop down list box filtered out...
http://kirk.rapweb.co.uk/testing/filter/tidy/distinct/products.xml
All the data pulled into a HTML page...
http://kirk.rapweb.co.uk/testing/filter/tidy/html/
I'm struggling to work out how to get the data from...
http://kirk.rapweb.co.uk/testing/filter/tidy/distinct/products.xml
into a HTML page drop down list box.
Can anyone offer advice, point me in the right direction or confirm that I'm on the right track?
The end result that I'm trying to achieve is to have 2 drop down boxes.
Drop Down List 1 would contain... Brakes, Exhaust, Lighting
Drop Down List 2 would contain...
If Brakes were selected previously... Discs and Drums, Pads and Shoes.
If Exhaust were selected previously... Centre, Rear.
If Lighting were selected previously... Headlamps, Rear Lights.
Right now I'd just like to focus on populating Drop Down Box 1 with the data discussed above.
With your updated description it looks like you want to create a dropdown with the distinct group values. This is an example of a grouping problem and in XSLT 1.0, the way to do this is with a technique called Muenchian grouping. It goes like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" omit-xml-declaration="yes"/>
<xsl:key name="kGroup" match="product" use="group"/>
<xsl:template match="/">
<select name="productGroup" id="productGroup">
<xsl:apply-templates
select="dataroot/product[generate-id() =
generate-id(key('kGroup', group)[1])]" />
</select>
</xsl:template>
<xsl:template match="product">
<option value="{group}">
<xsl:value-of select="group"/>
</option>
</xsl:template>
</xsl:stylesheet>
When run on your input XML, it produces this:
<select name="productGroup" id="productGroup">
<option value="Brakes">Brakes</option>
<option value="Exhaust">Exhaust</option>
<option value="Lighting">Lighting</option>
</select>
Then you would use the same JavaScript as you're using now to put the result inside a particular element in the HTML DOM that can be easily found by its ID.
Now as for the next step, once the above is working, you would put a JavaScript event on this dropdown so that it runs another transform when an item is selected. You would get this selected value and then you can pass this into yet another XSLT as a parameter value. This page has good information on passing parameters into an XSLT in JavaScript. The XSLT would look like this (yet again with Muenchian grouping):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" omit-xml-declaration="yes"/>
<xsl:key name="kType" match="product" use="type"/>
<xsl:param name="group" />
<xsl:template match="/">
<select name="productType" id="productType">
<xsl:apply-templates
select="dataroot/product[generate-id() =
generate-id(key('kType', type)[1])][group = $group]" />
</select>
</xsl:template>
<xsl:template match="product">
<option value="{type}">
<xsl:value-of select="type"/>
</option>
</xsl:template>
</xsl:stylesheet>
When the parameter value "Lighting" is passed in as the group parameter and this is run on your input XML, this produces:
<select name="productType" id="productType">
<option value="Headlamps">Headlamps</option>
<option value="Rear Lights">Rear Lights</option>
</select>
I guess you are under the assumption that you do not know how many different 'groups' there are going to be in the data (I mean, someone could add a new group anytime, no?).
One of the standard techniques in XSLT 1.0 for grouping elements is 'Muenchian Grouping' (XSLT 2.0 implements native grouping functions and elements), which is based on comparing generated ids.
In the following solution I assume that the data from the XML document is not already grouped.
<xsl:key name="group-key"
match="product"
use="group" />
<xsl:template match="dataroot">
<select>
<xsl:for-each select="product[generate-id(.) = generate-id(key('group-key', group)[1])]">
<!-- We sort the group names alphabetically. If the names of the groups are already ordered alphabetically, the xsl:sort can be omitted. -->
<xsl:sort select="group" />
<option><xsl:value-of select="group" /></option>
</xsl:for-each>
</select>
</xsl:template>
The result of applying this template (adding the xsl:stylesheet element) to your original 'product.xml' is:
<select>
<option>Brakes</option>
<option>Exhaust</option>
<option>Lighting</option>
</select>
Note: even if we remove the 'xsl:sort' element, the output is going to be the same because the data is already ordered in 'product.xml'.
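For comparison, the native grouping mentioned above could be sketched as follows, assuming an XSLT 2.0 (or later) processor is available and the same dataroot/product/group structure as in the templates above:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="dataroot">
    <select>
      <!-- One iteration per distinct value of the group child element. -->
      <xsl:for-each-group select="product" group-by="group">
        <!-- Sort the groups by their key; omit if document order is wanted. -->
        <xsl:sort select="current-grouping-key()"/>
        <option>
          <xsl:value-of select="current-grouping-key()"/>
        </option>
      </xsl:for-each-group>
    </select>
  </xsl:template>
</xsl:stylesheet>
```

No xsl:key or generate-id() comparison is needed here; current-grouping-key() returns the group value shared by the products in each group.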

How do I use XSL to select one entry in a table and the entries above and below it?

I have an xml feed for displaying a football league table. I would like to be able to filter the table to only display one particular team and a variable number of entries above and below it.
For example, out of a league of 24 teams I would like to display team A together with the 3 teams positioned above it and the 3 teams positioned below it. I am currently using XSL to format the data, but I can't figure out how to do this. Is it even possible with XSL?
Once you have managed to select your "particular team", you can use the following-sibling and preceding-sibling axes to select your nearby items, using position() to limit the number of them.
Here is a simplified example:
Let's have this XML document (every element represents a "team") and the document represents all 10 teams, already sorted by points:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
<num>09</num>
<num>10</num>
</nums>
If the team we are interested in is represented by <num>05</num> then this transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vgivenTeam"
select="/*/*[. = 5]"/>
<xsl:template match="/">
<xsl:copy-of select=
"$vgivenTeam
| $vgivenTeam/preceding-sibling::*[not(position() > 3)]
| $vgivenTeam/following-sibling::*[not(position() > 3)]
"/>
</xsl:template>
</xsl:stylesheet>
when applied on the above XML document, produces the wanted result -- the three teams preceding the given one, the given team and the three teams following the given one:
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
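Since the question asks for a variable number of entries, the same transformation can be parameterized. This is a sketch under the assumption that the team and the window size are supplied as stylesheet parameters; the names pTeam and pSpan are hypothetical and not part of the original answer.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical parameters: the team to centre on and how many
       neighbours to show on each side. -->
  <xsl:param name="pTeam" select="5"/>
  <xsl:param name="pSpan" select="3"/>

  <xsl:variable name="vgivenTeam" select="/*/*[. = $pTeam]"/>

  <xsl:template match="/">
    <!-- On the preceding-sibling axis position() counts backwards from the
         context node, so position() &lt;= $pSpan picks the nearest siblings. -->
    <xsl:copy-of select=
      "$vgivenTeam
     | $vgivenTeam/preceding-sibling::*[position() &lt;= $pSpan]
     | $vgivenTeam/following-sibling::*[position() &lt;= $pSpan]"/>
  </xsl:template>
</xsl:stylesheet>
```

With the default values this selects the same seven elements as above. If the given team is near the top or bottom of the table, fewer siblings exist on that side and the result is simply shorter.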

XPath Expression: Select elements between A HREF="expr" tags

I haven't found an explicit way to select all nodes that exist between two anchors (<a></a> tag pairs) in an HTML file.
The first anchor has the following format:
Second anchor:
I've verified that both can be selected using starts-with (note that I'm using HTML Agility Pack):
HtmlNode n0 = html.DocumentNode.SelectSingleNode("//a[starts-with(@href,'file://START')]");
HtmlNode n1 = html.DocumentNode.SelectSingleNode("//a[starts-with(@href,'file://END')]");
With this in mind, and with my amateurish XPath skills, I wrote the following expression to get all tags between the two anchors:
html.DocumentNode.SelectNodes("//*[not(following-sibling::a[starts-with(@href,'file://START0')]) and not (preceding-sibling::a[starts-with(@href,'file://END0')])]");
This runs, but it selects the entire HTML document!
I need to, for example for the following HTML fragment:
<html>
...
<p>First nodes</p>
<p>First nodes
<span>X</span>
</p>
<p>First nodes</p>
...
</html>
remove both anchors and the three P elements (including, of course, the inner SPAN).
Any way to do this?
I don't know if XPath 2.0 offers better ways to achieve this.
EDIT (special case!)
I should also handle the case where:
"Select tags between X and X', where X is <p></p>"
So instead of:
<!-- xhtml to be extracted -->
I should handle also:
<p>
</p>
<!-- xhtml to be extracted -->
<p>
</p>
Thank you very much, again.
Use this XPath 1.0 expression:
//a[starts-with(@href,'file://START')]/following-sibling::node()
[count(.| //a[starts-with(@href,'file://END')]/preceding-sibling::node())
=
count(//a[starts-with(@href,'file://END')]/preceding-sibling::node())
]
Or, use this XPath 2.0 expression:
//a[starts-with(@href,'file://START')]/following-sibling::node()
intersect
//a[starts-with(@href,'file://END')]/preceding-sibling::node()
The XPath 2.0 expression uses the XPath 2.0 intersect operator.
The XPath 1.0 expression uses the Kayessian (after @Michael Kay) formula for the intersection of two node-sets:
$ns1[count(.|$ns2) = count($ns2)]
Verification with XSLT:
This XSLT 1.0 transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
" //a[starts-with(@href,'file://START')]/following-sibling::node()
[count(.| //a[starts-with(@href,'file://END')]/preceding-sibling::node())
=
count(//a[starts-with(@href,'file://END')]/preceding-sibling::node())
]
"/>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<html>...
<p>First nodes</p>
<p>First nodes
<span>X</span>
</p>
<p>First nodes</p>
...
</html>
produces the wanted, correct result:
<p>First nodes</p>
<p>First nodes
<span>X</span>
</p>
<p>First nodes</p>
This XSLT 2.0 transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
" //a[starts-with(@href,'file://START')]/following-sibling::node()
intersect
//a[starts-with(@href,'file://END')]/preceding-sibling::node()
"/>
</xsl:template>
</xsl:stylesheet>
when applied on the same XML document (above) again produces exactly the wanted result.
I've added a special case that I should handle
To handle this special case you can work in the same way, I mean use the Kayessian (and use XPath Visualizer as well ;-)). The intersecting node-sets change as follows:
Intersecting node-set C
"//p[.//a[starts-with(@href,'file://START')]]
/following-sibling::node()"
All following siblings of the p containing a START anchor.
Intersecting node-set D
"./following-sibling::p[.//a[starts-with(@href,'file://END')]]
/preceding-sibling::node()"
All preceding siblings of the p containing an END anchor, where that p is a following sibling of the current node.
Now you can perform the intersection as:
C ∩ D
That is
"//p[.//a[starts-with(@href,'file://START')]]
/following-sibling::node()[
count(.| ./following-sibling::p
[.//a[starts-with(@href,'file://END')]]
/preceding-sibling::node())
=
count(./following-sibling::p
[.//a[starts-with(@href,'file://END')]]
/preceding-sibling::node())
]"
If you need to manage both situations, you can proceed with the union of the intersecting node-sets as
(A ∩ B) ∪ (C ∩ D)
Where:
the union is expressed with the XPath union operator |,
the node-sets A and B are those already shown in @Dimitre's answer,
the node-sets C and D are those shown in my answer.

Is there an elegant way to add multiple HTML classes with XSLT?

Let's say I'm transforming a multiple-choice quiz from an arbitrary XML format to HTML. Each choice will be represented as an HTML <li> tag in the result document. For each choice, I want to add an HTML class of correct to the <li> if that choice was the correct answer. Additionally, if that choice was the one selected by the user, I want to add a class of submitted to the <li>. Consequently, if the choice is the correct one as well as the submitted one, the <li> should have a class of correct submitted.
As far as I know, white-space separated attribute values aren't a part of the XML data model and thus cannot directly be created via XSLT. However, I have a feeling there's a better way of doing this than littering the code with one conditional for every possible combination of classes (which would be acceptable in this example, but unwieldy in more complex scenarios).
How can I solve this in an elegant way?
Example of Desired Result:
<p>Who trained Obi-Wan Kenobi?</p>
<ul>
<li>Mace Windu</li>
<li class="correct submitted">Qui-Gon Jinn</li>
<li>Ki-Adi-Mundi</li>
<li>Yaddle</li>
</ul>
Firstly, there is nothing wrong with whitespace in attribute values in XML: roughly speaking, attribute value normalization converts whitespace characters to spaces and collapses adjacent spaces to a single space when a document is parsed, but whitespace is definitely allowed. EDIT: See below for more on this.
Matthew Wilson's approach fails to include whitespace between the possible values, as you mention in your comment thereto. However, his approach is fundamentally sound. The final piece of the jigsaw is your dislike of redundant spaces: these can be eliminated by use of the normalize-space XPath function.
The following stylesheet puts all the bits together - note that it doesn't do anything with its input document, so for testing purposes you can run it against any XML document, or even against itself, to verify that the output meets your requirements.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="foo0" select="false()"/>
<xsl:variable name="bar0" select="true()"/>
<xsl:variable name="foo1" select="true()"/>
<xsl:variable name="bar1" select="false()"/>
<xsl:variable name="foo2" select="true()"/>
<xsl:variable name="bar2" select="true()"/>
<xsl:template match="/">
<xsl:variable name="foobar0">
<xsl:if test="$foo0"> foo</xsl:if>
<xsl:if test="$bar0"> bar</xsl:if>
</xsl:variable>
<xsl:variable name="foobar1">
<xsl:if test="$foo1"> foo</xsl:if>
<xsl:if test="$bar1"> bar</xsl:if>
</xsl:variable>
<xsl:variable name="foobar2">
<xsl:if test="$foo2"> foo</xsl:if>
<xsl:if test="$bar2"> bar</xsl:if>
</xsl:variable>
<li>
<xsl:attribute name="class">
<xsl:value-of select="normalize-space($foobar0)"/>
</xsl:attribute>
</li>
<li>
<xsl:attribute name="class">
<xsl:value-of select="normalize-space($foobar1)"/>
</xsl:attribute>
</li>
<li>
<xsl:attribute name="class">
<xsl:value-of select="normalize-space($foobar2)"/>
</xsl:attribute>
</li>
</xsl:template>
</xsl:stylesheet>
EDIT: Further to the question of spaces separating discrete components within the value of an attribute: The XML Spec defines a number of possible valid constructs as attribute types, including IDREFS and NMTOKENS. The first case matches the Names production, and the second case matches the NMTokens production; both these productions are defined as containing multiple values of the appropriate type, delimited by spaces. So space-delimited lists of values as the value of a single attribute are an inherent component of the XML information set.
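Applied to the quiz from the question, the same technique might look like the sketch below. The input format is hypothetical (the question does not show one): a question element containing a prompt element and several choice elements, each carrying optional correct="true" and submitted="true" attributes.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical input:
       <question>
         <prompt>...</prompt>
         <choice correct="true" submitted="true">...</choice>
         <choice>...</choice>
       </question> -->
  <xsl:template match="question">
    <p><xsl:value-of select="prompt"/></p>
    <ul>
      <xsl:apply-templates select="choice"/>
    </ul>
  </xsl:template>

  <xsl:template match="choice">
    <!-- Each class name is emitted with a leading space;
         normalize-space() later trims the surplus. -->
    <xsl:variable name="classes">
      <xsl:if test="@correct = 'true'"> correct</xsl:if>
      <xsl:if test="@submitted = 'true'"> submitted</xsl:if>
    </xsl:variable>
    <li>
      <!-- Emit the class attribute only when at least one class applies. -->
      <xsl:if test="normalize-space($classes) != ''">
        <xsl:attribute name="class">
          <xsl:value-of select="normalize-space($classes)"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:value-of select="."/>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

Adding a further class then means adding one more xsl:if line to the variable, rather than one branch per combination of classes.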
Off the top of my head, you can build up a space-separated list with something like:
<li>
<xsl:attribute name="class">
<xsl:if test="...">correct</xsl:if>
<xsl:if test="...">submitted</xsl:if>
</xsl:attribute>
</li>
As far as I know, white-space separated attribute values aren't a part of the XML data model and thus cannot directly be created via XSLT
Unless you are converting to an XML language (which HTML is not, though XHTML is), you shouldn't worry about the XML validity of the XSLT output. This can be anything, and doesn't need to conform to XML!