I'm fairly new to XSLT. What I'm trying to do is transform a book in XML into HTML. A basic example would be this:
<book>
<title>
Some important title
</title>
<section>
<title>animal</title>
<kw>RealAnimal</kw>
<kw>something|something more about it</kw>
<para>Some really important facts</para>
<section>
<title>something</title>
<kw>something else</kw>
<para>Enter Text</para>
</section>
<section>
<title>Even more</title>
<kw>and more</kw>
<para>hell of a lot more</para>
</section>
</section>
</book>
A section can have an unknown number of subsections, so obviously I need to handle this with recursion. So far I have designed two templates, to handle a book and a section, based on my needs.
<xsl:template match="book">
<html>
<body>
<h1><xsl:value-of select="title" /></h1>
<xsl:apply-templates select="section" />
</body>
</html>
</xsl:template>
<xsl:template match="section[title]">
<li><xsl:value-of select="title" /></li>
<!-- do something more here -->
</xsl:template>
What I can't figure out is whether I can get my current recursion depth, because I want to decide which kind of header to use based on the depth.
Also, the book is supposed to consist of two parts: its normal content at the beginning (a header with paragraphs below it) and an index at the end. This leads me to believe that I need to process it in two different ways within one document, but how would I do that? Any hints or code would be greatly appreciated.
So I figured out how to make section and subsection headers with numbers, like a list in Word.
<xsl:number level="multiple" />
gives me x.y for a subsection, based on the parent section's position and its own position. What I now want is the number of groups, since it groups the values based on the depth, but I can't figure out how.
What I'd expect is that it transforms to
<h1>Some important title</h1>
...
<h2> animal </h2>
...
<h3> something </h3>
...
<h3> Even more </h3>
and if I were to add another section to the "something" section, it would be h4, and so on...
Solved it like this:
<xsl:param name="depth"/>
<xsl:choose>
<xsl:when test="6 > $depth">
<xsl:element name="h{$depth}">
<xsl:number level="multiple" />.
<xsl:value-of select="title" />
</xsl:element>
</xsl:when>
<xsl:otherwise>
<h6><xsl:number level="multiple" />. <xsl:value-of select="title" /></h6>
</xsl:otherwise>
</xsl:choose>
Well, what I'm trying to do is set h2 for a section that is a child of
book, and h3 for a subsection of a section.
Here's one way you could do this - with unlimited nesting of subsections:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/book">
<html>
<body>
<h1><xsl:value-of select="title" /></h1>
<xsl:apply-templates select="section" />
</body>
</html>
</xsl:template>
<xsl:template match="section">
<h2><xsl:value-of select="title" /></h2>
<xsl:apply-templates select="subsection">
<xsl:with-param name="depth" select="3"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="subsection">
<xsl:param name="depth"/>
<xsl:element name="h{$depth}">
<xsl:value-of select="title" />
</xsl:element>
<xsl:apply-templates select="subsection">
<xsl:with-param name="depth" select="$depth + 1"/>
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
Note that this is recursive and unlimited; AFAIK, HTML will run out of levels after h6.
Edit:
A subsection isn't named subsection; it's just a section that is a child of
another section.
Well, then this could be simpler. Or at least shorter.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/book">
<html>
<body>
<h1><xsl:value-of select="title" /></h1>
<xsl:apply-templates select="section">
<xsl:with-param name="depth" select="2"/>
</xsl:apply-templates>
</body>
</html>
</xsl:template>
<xsl:template match="section">
<xsl:param name="depth"/>
<xsl:element name="h{$depth}">
<xsl:value-of select="title" />
</xsl:element>
<xsl:apply-templates select="section">
<xsl:with-param name="depth" select="$depth + 1"/>
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
Edit 2:
I'm supposed to set h2-h5 for the first 4, and h6 after that.
If you mean you want to limit the heading to a maximum of h6 regardless of the section's depth, then change this:
<xsl:with-param name="depth" select="$depth + 1"/>
to:
<xsl:with-param name="depth" select="$depth + ($depth &lt; 6)"/>
You might try tinkering with "count(ancestor::*)" if you really want to know how deep you are. However, I'd suggest taking a look at automatic numbering first, just in case it does the trick. It even handles nested items pretty handily.
"XSLT's xsl:number instruction makes it easy to insert a number into your result document. Its value attribute lets you name the number to insert, but if you really want to add a specific number to your result, it's much simpler to add that number as literal text. When you omit the value attribute from an xsl:number instruction, the XSLT processor calculates the number based on the context node's position in the source tree or among the nodes being counted through by an xsl:for-each instruction, which makes it great for automatic numbering."
XML.com reference page
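To make the two suggestions above concrete, here is a minimal XSLT 1.0 sketch. It assumes (as in the question) that subsections are simply section elements nested inside other section elements. The variable names depth and level are my own, and capping the heading at h6 is an assumption taken from the asker's comments, not something either technique requires.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/book">
    <html>
      <body>
        <h1><xsl:value-of select="normalize-space(title)"/></h1>
        <xsl:apply-templates select="section"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="section">
    <!-- A top-level section has no section ancestors, so it gets depth 2 -->
    <xsl:variable name="depth" select="count(ancestor::section) + 2"/>
    <!-- HTML has no heading beyond h6, so clamp the level there -->
    <xsl:variable name="level">
      <xsl:choose>
        <xsl:when test="$depth &gt; 6">6</xsl:when>
        <xsl:otherwise><xsl:value-of select="$depth"/></xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <xsl:element name="h{$level}">
      <!-- Hierarchical number built from the positions of the nested sections -->
      <xsl:number level="multiple" count="section" format="1.1"/>
      <xsl:text>. </xsl:text>
      <xsl:value-of select="normalize-space(title)"/>
    </xsl:element>
    <xsl:apply-templates select="section"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the depth is computed from the source tree, no parameter has to be passed down through apply-templates. Counting ancestor::section rather than ancestor::* keeps the result independent of any wrapper elements around the sections.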
I am trying to make an XSLT transformation from XML. I want to transform font style tags into HTML tags, but I am doing something wrong.
My XML file is like this one :
<root>
<p>
<span>
<i/>
italic
</span>
<span>
<i/>
<b/>
bold-italic
</span>
<span>
normal
</span>
</p>
</root>
What I want is HTML with the same tags but my XSLT transformation does not work:
HTML:
<p>
<i>italic</i>
<i><b>bold-italic</b></i>
normal
</p>
I was trying an xsl:if condition but it does not work; I do not know what I am doing wrong:
XSLT:
<xsl:template match="p">
<p>
<xsl:for-each select="span">
<xsl:if test="i">
<i>
<xsl:value-of select="."/>
</i>
</xsl:if>
<xsl:if test="b">
<b>
<xsl:value-of select="."/>
</b>
</xsl:if>
</xsl:for-each>
</p>
</xsl:template>
Do you know how to repair my code ?
Can you have more than just b and i elements? It may be possible to do this with a generic solution, that creates a nested element for each child element of a span element.
This solution uses a recursive template that matches span, but with a parameter containing the index number of the child element that needs to be output. When this index exceeds the number of child elements, the text is output.
Try this XSLT too:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="span">
<xsl:param name="num" select="1"/>
<xsl:variable name="childElement" select="*[$num]"/>
<xsl:choose>
<xsl:when test="$childElement">
<xsl:element name="{local-name($childElement)}">
<xsl:apply-templates select=".">
<xsl:with-param name="num" select="$num + 1"/>
</xsl:apply-templates>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
This does assume that all span elements contain only the elements you want to nest, in addition to the text.
You can test the contents of span element using an XPath expression with a predicate which tests for its contents, and match different templates for each situation. Since you need b and i for bold-italic, you should use that expression in one of your predicates.
The stylesheet below does the transformation using only templates (without the need of a for-each). I'm assuming the contents of your <span> elements is text (not mixed content):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*"/>
<xsl:template match="p">
<p><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="span[i]">
<i><xsl:value-of select="."/></i>
</xsl:template>
<xsl:template match="span[b]">
<b><xsl:value-of select="."/></b>
</xsl:template>
<xsl:template match="span[i and b]">
<i><b><xsl:value-of select="."/></b></i>
</xsl:template>
</xsl:stylesheet>
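One caveat on the stylesheet above: a span containing both i and b matches all three span patterns, and all three have the same default priority (0.5). XSLT 1.0 treats that as an error which a processor may recover from by picking the template that comes last, so the result depends on template order. A hedged variant that makes the intent explicit with a priority attribute (the value 1 is just a number higher than the default, and normalize-space is my addition to trim the indentation in the sample input) might look like this:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:strip-space elements="*"/>

  <xsl:template match="p">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Default priority of a pattern with a predicate is 0.5 -->
  <xsl:template match="span[i]">
    <i><xsl:value-of select="normalize-space(.)"/></i>
  </xsl:template>

  <xsl:template match="span[b]">
    <b><xsl:value-of select="normalize-space(.)"/></b>
  </xsl:template>

  <!-- Explicit higher priority: wins over both templates above,
       regardless of where it appears in the stylesheet -->
  <xsl:template match="span[i and b]" priority="1">
    <i><b><xsl:value-of select="normalize-space(.)"/></b></i>
  </xsl:template>
</xsl:stylesheet>
```

A span with neither i nor b falls through to the built-in rules, which just output its text.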
I have an xml that has a description node:
<config>
<desc>A <b>first</b> sentence here. The second sentence with some link The link. The <u>third</u> one.</desc>
</config>
I am trying to split the sentences using the dot as a separator while keeping any HTML tags in the HTML output.
What I have so far is a template that splits the description, but the HTML tags are lost in the output due to the normalize-space and substring-before functions.
My current template is given below:
<xsl:template name="output-tokens">
<xsl:param name="sourceText" />
<!-- Force a . at the end -->
<xsl:variable name="newlist" select="concat(normalize-space($sourceText), ' ')" />
<!-- Check if we have really a point at the end -->
<xsl:choose>
<xsl:when test ="contains($newlist, '.')">
<!-- Find the first . in the string -->
<xsl:variable name="first" select="substring-before($newlist, '.')" />
<!-- Get the remaining text -->
<xsl:variable name="remaining" select="substring-after($newlist, '.')" />
<!-- Check if our string is not in fact a . or an empty string -->
<xsl:if test="normalize-space($first)!='.' and normalize-space($first)!=''">
<p><xsl:value-of select="normalize-space($first)" />.</p>
</xsl:if>
<!-- Recursively apply the template for the remaining text -->
<xsl:if test="$remaining">
<xsl:call-template name="output-tokens">
<xsl:with-param name="sourceText" select="$remaining" />
</xsl:call-template>
</xsl:if>
</xsl:when>
<!--If no . was found -->
<xsl:otherwise>
<p>
<!-- If the string does not contains a . then display the text but avoid
displaying empty strings
-->
<xsl:if test="normalize-space($sourceText)!=''">
<xsl:value-of select="normalize-space($sourceText)" />.
</xsl:if>
</p>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
and I am using it in the following manner:
<xsl:template match="config">
<xsl:call-template name="output-tokens">
<xsl:with-param name="sourceText" select="desc" />
</xsl:call-template>
</xsl:template>
The expected output is:
<p>A <b>first</b> sentence here.</p>
<p>The second sentence with some link The link.</p>
<p>The <u>third</u> one.</p>
A good question, and not an easy one to solve. Especially, of course, if you're using XSLT 1.0 (you really need to tell us if that's the case).
I've seen two approaches to the problem. Both involve breaking it into smaller problems.
The first approach is to convert the markup into text (for example replace <b>first</b> by [b]first[/b]), then use text manipulation operations (xsl:analyze-string) to split it into sentences, and then reconstitute the markup within the sentences.
The second approach (which I personally prefer) is to convert the text delimiters into markup (convert "." to <stop/>) and then use positional grouping techniques (typically <xsl:for-each-group group-ending-with="stop"/> to convert the sentences into paragraphs.)
Here is one way to implement the second approach suggested by Michael Kay using XSLT 2.
This stylesheet demonstrates a two-pass transformation where the first pass introduces <stop/> markers after each sentence and the second pass encloses all groups ending with a <stop/> in a paragraph.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<!-- two-pass processing -->
<xsl:template match="/">
<xsl:variable name="intermediate">
<xsl:apply-templates mode="phase-1"/>
</xsl:variable>
<xsl:apply-templates select="$intermediate" mode="phase-2"/>
</xsl:template>
<!-- identity transform -->
<xsl:template match="@*|node()" mode="#all" priority="-1">
<xsl:copy>
<xsl:apply-templates select="@*|node()" mode="#current"/>
</xsl:copy>
</xsl:template>
<!-- phase 1 -->
<!-- insert <stop/> "milestone markup" after each sentence -->
<xsl:template match="text()" mode="phase-1">
<xsl:analyze-string select="." regex="\.\s+">
<xsl:matching-substring>
<xsl:value-of select="regex-group(0)"/>
<stop/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
<!-- phase 2 -->
<!-- turn each <stop/>-terminated group into a paragraph -->
<xsl:template match="*[stop]" mode="phase-2">
<xsl:copy>
<xsl:for-each-group select="node()" group-ending-with="stop">
<p>
<xsl:apply-templates select="current-group()" mode="#current"/>
</p>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
<!-- remove the <stop/> markers -->
<xsl:template match="stop" mode="phase-2"/>
</xsl:stylesheet>
This is my humble solution, based on the second suggestion in @Michael Kay's answer.
Unlike @Jukka's answer (which is very elegant indeed), I'm not using xsl:analyze-string, as the XPath 1.0 functions contains and substring-after are enough to accomplish the split. I've also started the match pattern from the config element.
Here's the transform:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<!-- two pass processing -->
<xsl:template match="config">
<xsl:variable name="pass1">
<xsl:apply-templates select="node()"/>
</xsl:variable>
<xsl:apply-templates mode="pass2" select="$pass1/*"/>
</xsl:template>
<!-- 1. Copy everything as is (identity) -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- 1. Replace "text. text" with "text<dot/> text" -->
<xsl:template match="text()[contains(.,'. ')]">
<xsl:value-of select="substring-before(.,'. ')"/>
<dot/>
<xsl:value-of select="substring-after(.,'. ')"/>
</xsl:template>
<!-- 2. Group by examining in population order ending with dot -->
<xsl:template match="desc" mode="pass2">
<xsl:for-each-group select="node()"
group-ending-with="dot">
<p><xsl:apply-templates select="current-group()" mode="pass2"/></p>
</xsl:for-each-group>
</xsl:template>
<!-- 2. Identity -->
<xsl:template match="node()|@*" mode="pass2">
<xsl:copy>
<xsl:apply-templates select="node()|@*" mode="pass2"/>
</xsl:copy>
</xsl:template>
<!-- 2. Replace dot with mark -->
<xsl:template match="dot" mode="pass2">
<xsl:text>.</xsl:text>
</xsl:template>
</xsl:stylesheet>
Applied on the input shown in your question, produces:
<p>A <b>first</b> sentence here.</p>
<p>The second sentence with some link The link.</p>
<p>The <u>third</u> one.</p>
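A note on the first pass of the stylesheet above: as far as I can tell, the text() template replaces only the first ". " in each text node, so a text node containing two sentence breaks (such as the one between the b and u elements in the sample input) would be split only once. A recursive named template, still using only contains, substring-before and substring-after, is one way around that. The sketch below covers the first pass only; the template name split-at-dots and its text parameter are hypothetical names of mine.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity: copy everything that is not handled below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hand every text node to the recursive splitter -->
  <xsl:template match="text()">
    <xsl:call-template name="split-at-dots">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Emits the text with every ". " replaced by a <dot/> marker -->
  <xsl:template name="split-at-dots">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '. ')">
        <xsl:value-of select="substring-before($text, '. ')"/>
        <dot/>
        <!-- Recurse on whatever follows the first ". " -->
        <xsl:call-template name="split-at-dots">
          <xsl:with-param name="text" select="substring-after($text, '. ')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The second pass (grouping with group-ending-with="dot") can stay as it is. Note that a final sentence ending in "." without a trailing space gets no dot marker; it simply ends up in the last group.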
this might do the trick:
http://symphony-cms.com/download/xslt-utilities/view/20816/
/J
I have looked up solutions on Stack Overflow, but none of them seem to work for me. Here is my question. Let's say I have the following text:
Source:
<greatgrandparent>
<grandparent>
<parent>
<sibling>
Hey, im the sibling .
</sibling>
<description>
$300$ <br/> $250 <br/> $200! <br/> <p> Yes, that is right! <br/> You can own a ps3 for only $200 </p>
</description>
</parent>
<parent>
... (SAME FORMAT)
</parent>
... (Several more parents)
</grandparent>
</greatgrandparent>
Output:
<newprice>
$300$ <br/> $250 <br/> $200! <br/> Yes, that is right! <br/> You can own a ps3 for only $200
</newprice>
I can't seem to find a way to do that.
Current XSL:
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="greatgrandparents">
<xsl:apply-templates />
</xsl:template>
<xsl:template match = "grandparent">
<xsl:for-each select = "parent" >
<newprice>
<xsl:apply-templates>
</newprice>
</xsl:for-each>
</xsl:template>
<xsl:template match="description">
<xsl:element name="newprice">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<xsl:template match="p">
<xsl:apply-templates/>
</xsl:template>
Use templates to define behavior on specific elements
<!-- after standard identity template -->
<xsl:template match="description">
<xsl:element name="newprice">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<xsl:template match="p">
<xsl:apply-templates/>
</xsl:template>
The first template says to swap description with newprice. The second one says to ignore the p element.
If you're unfamiliar with the identity template, take a look here for a few examples.
EDIT: Given the new example, we can see that you want to extract only the description element and its contents. Notice that the template action starts with the match="/" template. We can use this to control where our stylesheet starts and thus skip much of the riffraff we want to filter out.
Change the <xsl:template match="/"> to something more like:
<xsl:template match="/">
<xsl:apply-templates select="//description"/>
<!-- use a more specific XPath if you can -->
</xsl:template>
So altogether our solution looks like this:
<xsl:stylesheet
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
exclude-result-prefixes="xs">
<xsl:template match="/">
<xsl:apply-templates select="//description" />
</xsl:template>
<!-- this is the identity template -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="description">
<xsl:element name="newprice">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<xsl:template match="p">
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
Shouldn't the contents of <description> be inside a CDATA section? And then probably disable output escaping on xsl:value-of.
You should look into xsl:copy-of.
You would probably wind up with something like:
<xsl:template match="description">
<xsl:copy-of select="."/>
</xsl:template>
Probably the shortest solution is this one:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="description">
<newprice>
<xsl:copy-of select="node()"/>
</newprice>
</xsl:template>
<xsl:template match="text()[not(ancestor::description)]"/>
</xsl:stylesheet>
When this transformation is applied on the provided XML document, the wanted result is produced:
<newprice>
$300$ <br /> $250 <br /> $200! <br /> <p> Yes, that is right! <br /> You can own a ps3 for only $200 </p>
</newprice>
Do note:
The use of <xsl:copy-of select="node()"/> to copy all the subtree rooted in description, without the root itself.
How we override (with a specific, empty template) the XSLT built-in template, preventing any text nodes that are not descendants of a <description> element from being output.
I want to select an abstract, along with HTML format elements using XSLT. Here is an example the XML:
<PUBLDES>The <IT>European Journal of Cancer (including EJC Supplements),</IT>
is an international comprehensive oncology journal that publishes original
research, editorial comments, review articles and news on experimental oncology,
clinical oncology (medical, paediatric, radiation, surgical), translational
oncology, and on cancer epidemiology and prevention. The Journal now has online
submission for authors. Please submit manuscripts at
<SURL>http://ees.elsevier.com/ejc</SURL> and follow the instructions on the
site.<P/>
The <IT>European Journal of Cancer (including EJC Supplements)</IT> is the
official Journal of the European Organisation for Research and Treatment
of Cancer (EORTC), the European CanCer Organisation (ECCO), the European
Association for Cancer Research (EACR), the the European Society of Breast
Cancer Specialists (EUSOMA) and the European School of Oncology (ESO). <P/>
Supplements to the <IT>European Journal of Cancer</IT> are published under
the title <IT>EJC Supplements</IT> (ISSN 1359-6349). All subscribers to
<IT>European Journal of Cancer</IT> automatically receive this publication.<P/>
To access the latest tables of contents, abstracts and full-text articles
from <IT>EJC</IT>, including Articles-in-Press, please visit <URL>
<HREF>http://www.sciencedirect.com/science/journal/09598049</HREF>
<HTXT>ScienceDirect</HTXT>
</URL>.</PUBLDES>
How do I get say 45 words from it, along with HTML tags in it. When I use substring() or concat() it removes the tags (like <IT> etc.).
You would probably be better off doing this programmatically, rather than with pure XSLT, but if you have to use XSLT, here is one way to do it. It does involve multiple stylesheets, although if you were able to use extension functions, you could make use of node-sets and combine them into one big (and nasty) stylesheet.
The first stylesheet would copy the initial XML, but 'tokenise' any text it finds, so that each word in the text becomes a separate 'WORD' element.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!-- Copy existing nodes and attributes -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Match text nodes -->
<xsl:template match="text()">
<xsl:call-template name="tokenize">
<xsl:with-param name="string" select="."/>
</xsl:call-template>
</xsl:template>
<!-- Splits a string into separate elements for each word -->
<xsl:template name="tokenize">
<xsl:param name="string"/>
<xsl:param name="delimiter" select="' '"/>
<xsl:choose>
<xsl:when test="$delimiter and contains($string, $delimiter)">
<xsl:variable name="word" select="normalize-space(substring-before($string, $delimiter))"/>
<xsl:if test="string-length($word) > 0">
<WORD>
<xsl:value-of select="$word"/>
</WORD>
</xsl:if>
<xsl:call-template name="tokenize">
<xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
<xsl:with-param name="delimiter" select="$delimiter"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="word" select="normalize-space($string)"/>
<xsl:if test="string-length($word) > 0">
<WORD>
<xsl:value-of select="$word"/>
</WORD>
</xsl:if>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The XSLT template used to 'tokenize' a string of text, I took from this question here:
tokenizing-and-sorting-with-xslt-1-0
(Note that in XSLT2.0, I believe there is a tokenize function, which would simplify the above)
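For comparison, here is roughly what the tokenising pass could look like with the XSLT 2.0 tokenize() function. This is only a sketch and assumes an XSLT 2.0 processor is available; it also splits on any run of whitespace (via normalize-space), which is slightly more lenient than the single-space delimiter used above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy existing nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Wrap each whitespace-separated word of a text node in a WORD element.
       normalize-space() collapses whitespace runs and trims the ends, so
       splitting on a single space yields no empty tokens; a whitespace-only
       text node yields no tokens at all. -->
  <xsl:template match="text()">
    <xsl:for-each select="tokenize(normalize-space(.), ' ')">
      <WORD><xsl:value-of select="."/></WORD>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```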
This would give you XML like so...
<PUBLDES>
<WORD>The</WORD>
<IT>
<WORD>European</WORD>
<WORD>Journal</WORD>
<WORD>of</WORD>
....
And so on...
Next, it is a case of traversing this XML document, using another XSLT document, outputting only up to the first 45 WORD elements. To do this, I repeatedly apply a template, keeping a running total of the number of WORD elements found so far. When matching a node, there are three possibilities:
Match a WORD element: Output it. Carry on processing from the next sibling if the total is not reached.
Match an element where the number of words below it is less than the total: Copy the whole element, and then carry on processing from the next sibling if the total is not reached.
Match an element where the number of words below it would exceed the total: Copy the current node (but not its children) and continue processing at its first child.
Here is the style sheet, in all its hideousness
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:variable name="WORDCOUNT">6</xsl:variable>
<!-- Match root element -->
<xsl:template match="/">
<xsl:apply-templates select="descendant::*[1]" mode="word">
<xsl:with-param name="previousWords">0</xsl:with-param>
</xsl:apply-templates>
</xsl:template>
<!-- Match any node -->
<xsl:template match="node()" mode="word">
<xsl:param name="previousWords"/>
<!-- Number of words below the element (at any depth) -->
<xsl:variable name="childWords" select="count(descendant::WORD)"/>
<xsl:choose>
<!-- Matching a WORD element -->
<xsl:when test="local-name(.) = 'WORD'">
<!-- Copy the word -->
<WORD>
<xsl:value-of select="."/>
</WORD>
<!-- If there are still words to output, continue processing at next sibling -->
<xsl:if test="$previousWords + 1 &lt; $WORDCOUNT">
<xsl:apply-templates select="following-sibling::*[1]" mode="word">
<xsl:with-param name="previousWords">
<xsl:value-of select="$previousWords + 1"/>
</xsl:with-param>
</xsl:apply-templates>
</xsl:if>
</xsl:when>
<!-- Match a node where the number of words below it is within allowed limit -->
<xsl:when test="$childWords &lt;= $WORDCOUNT - $previousWords">
<!-- Copy the element -->
<xsl:copy>
<!-- Copy all its descendants -->
<xsl:copy-of select="*|@*"/>
</xsl:copy>
<!-- If there are still words to output, continue processing at next sibling -->
<xsl:if test="$previousWords + $childWords &lt; $WORDCOUNT">
<xsl:apply-templates select="following-sibling::*[1]" mode="word">
<xsl:with-param name="previousWords">
<xsl:value-of select="$previousWords + $childWords"/>
</xsl:with-param>
</xsl:apply-templates>
</xsl:if>
</xsl:when>
<!-- Match nodes where the number of words below it would exceed current limit -->
<xsl:otherwise>
<!-- Copy the node -->
<xsl:copy>
<!-- Continue processing at very first child node -->
<xsl:apply-templates select="descendant::*[1]" mode="word">
<xsl:with-param name="previousWords">
<xsl:value-of select="$previousWords"/>
</xsl:with-param>
</xsl:apply-templates>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
If you were outputting just the first 4 words, say, this would give you the following output
<PUBLDES>
<WORD>The</WORD>
<IT>
<WORD>European</WORD>
<WORD>Journal</WORD>
<WORD>of</WORD>
</IT>
</PUBLDES>
Of course, you would then need yet another transformation to remove the WORD elements, and just leave the text. This should be fairly straight-forward....
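A sketch of what that final clean-up transformation might look like, assuming the intermediate document uses WORD elements exactly as shown above. The original whitespace was lost during tokenising, so this simply puts one space after every word, which means spacing around punctuation and at element boundaries will only be approximate.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop the indentation-only text nodes of the intermediate document -->
  <xsl:strip-space elements="*"/>

  <!-- Copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each WORD element by its text followed by a single space -->
  <xsl:template match="WORD">
    <xsl:value-of select="."/>
    <xsl:text> </xsl:text>
  </xsl:template>
</xsl:stylesheet>
```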
This is all very nasty though, but it is the best I could come up with for now!
I am writing this because I have really hit the wall and cannot go ahead. In my database I have escaped HTML like this: "&lt;p&gt;My name is Freddy and I was".
I want to show it as HTML OR strip the HTML tags in my XSL template. Both solutions will work for me and I will choose the quicker solution.
I have read several posts online but cannot find a solution. I have also tried disable-output-escaping with no success. Basically it seems the problem is that somewhere in the XSL execution the engine is changing this &lt;p&gt; into this: &amp;lt;p&amp;gt;.
It is converting the & into &amp;. If it helps, here is my XSL code. I have tried several combinations with and without the output tag on the top.
Any help will be appreciated. Thanks in advance.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" omit-xml-declaration="yes"/>
<xsl:template match="DocumentElement">
<div>
<xsl:attribute name="id">mySlides</xsl:attribute>
<xsl:apply-templates>
<xsl:with-param name="templatenumber" select="0"/>
</xsl:apply-templates>
</div>
<div>
<xsl:attribute name="id">myController</xsl:attribute>
<xsl:apply-templates>
<xsl:with-param name="templatenumber" select="1"/>
</xsl:apply-templates>
</div>
</xsl:template>
<xsl:template match="DocumentElement/QueryResults">
<xsl:param name="templatenumber">tobereplace</xsl:param>
<xsl:if test="$templatenumber=0">
<div>
<xsl:attribute name="id">myController</xsl:attribute>
<div>
<xsl:attribute name="class">article</xsl:attribute>
<h2>
<a>
<xsl:attribute name="class">title</xsl:attribute>
<xsl:attribute name="title"><xsl:value-of select="Title"/></xsl:attribute>
<xsl:attribute name="href">/stories/stories-details/articletype/articleview/articleid/<xsl:value-of select="ArticleId"/>/<xsl:value-of select="SEOTitle"/>.aspx</xsl:attribute>
<xsl:value-of select="Title"/>
</a>
</h2>
<div>
<xsl:attribute name="style">text-indent: 25px;</xsl:attribute>
<xsl:attribute name="class">articlesummary</xsl:attribute>
<xsl:call-template name="removeHtmlTags">
<xsl:with-param name="html" select="Summary" />
</xsl:call-template>
</div>
</div>
</div>
</xsl:if>
<xsl:if test="$templatenumber=1">
<div>
<xsl:attribute name="id">myController</xsl:attribute>
<span>
<xsl:attribute name="class">jFlowControl</xsl:attribute>
aa
</span>
</div>
</xsl:if>
</xsl:template>
<xsl:template name="removeHtmlTags">
<xsl:param name="html"/>
<xsl:choose>
<xsl:when test="contains($html, '&lt;')">
<xsl:value-of select="substring-before($html, '&lt;')"/>
<!-- Recurse through HTML -->
<xsl:call-template name="removeHtmlTags">
<xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$html"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Based on the assumption that you have this HTML string,
<p>My name is Freddy &amp; I was
then if you escape it and store it in a database it would become this:
&lt;p&gt;My name is Freddy &amp;amp; I was
Consequently, if you retrieve it as XML (without unescaping it beforehand), the result would be this:
&amp;lt;p&amp;gt;My name is Freddy &amp;amp;amp; I was
and <xsl:value-of select="." disable-output-escaping="yes" /> would produce:
&lt;p&gt;My name is Freddy &amp;amp; I was
You are getting exactly the same thing you have in your database, but of course you see the HTML tags in the output. So what you need is a mechanism that does the following string replacements:
"&amp;lt;" with "&lt;" (effectively changing &lt; to < in unescaped output)
"&amp;gt;" with "&gt;" (effectively changing &gt; to > in unescaped output)
"&amp;quot;" with "&quot;" (effectively changing &quot; to " in unescaped output)
"&amp;amp;" with "&amp;" (effectively changing &amp; to & in unescaped output)
From your XSL I have inferred the following test input XML:
<DocumentElement>
<QueryResults>
<Title>Article 1</Title>
<ArticleId>1</ArticleId>
<SEOTitle>Article_1</SEOTitle>
<Summary>&amp;lt;p&amp;gt;Article 1 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
</QueryResults>
<QueryResults>
<Title>Article 2</Title>
<ArticleId>2</ArticleId>
<SEOTitle>Article_2</SEOTitle>
<Summary>&amp;lt;p&amp;gt;Article 2 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
</QueryResults>
</DocumentElement>
I have changed the stylesheet you supplied and implemented such a replacement mechanism. If you apply the following XSLT 1.0 template to it:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="my:namespace"
exclude-result-prefixes="my"
>
<xsl:output method="html" omit-xml-declaration="yes"/>
<my:unescape>
<my:char literal="&lt;" escaped="&amp;lt;" />
<my:char literal="&gt;" escaped="&amp;gt;" />
<my:char literal="&quot;" escaped="&amp;quot;" />
<my:char literal="&amp;" escaped="&amp;amp;" />
</my:unescape>
<xsl:template match="DocumentElement">
<div id="mySlides">
<xsl:apply-templates mode="slides" />
</div>
<div id="myController">
<xsl:apply-templates mode="controller" />
</div>
</xsl:template>
<xsl:template match="DocumentElement/QueryResults" mode="slides">
<div class="article">
<h2>
<a class="title" title="{Title}" href="{concat('/stories/stories-details/articletype/articleview/articleid/', ArticleId, '/', SEOTitle, '.aspx')}">
<xsl:value-of select="Title"/>
</a>
</h2>
<div class="articlesummary" style="text-indent: 25px;">
<xsl:apply-templates select="document('')/*/my:unescape/my:char[1]">
<xsl:with-param name="html" select="Summary" />
</xsl:apply-templates>
</div>
</div>
</xsl:template>
<xsl:template match="DocumentElement/QueryResults" mode="controller">
<span class="jFlowControl">
<xsl:text>aa </xsl:text>
<xsl:value-of select="Title" />
</span>
</xsl:template>
<xsl:template match="my:char">
<xsl:param name="html" />
<xsl:variable name="intermediate">
<xsl:choose>
<xsl:when test="following-sibling::my:char">
<xsl:apply-templates select="following-sibling::my:char[1]">
<xsl:with-param name="html" select="$html" />
</xsl:apply-templates>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$html" disable-output-escaping="yes" />
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:call-template name="unescape">
<xsl:with-param name="html" select="$intermediate" />
</xsl:call-template>
</xsl:template>
<xsl:template name="unescape">
<xsl:param name="html" />
<xsl:choose>
<xsl:when test="contains($html, @escaped)">
<xsl:value-of select="substring-before($html, @escaped)" disable-output-escaping="yes"/>
<xsl:value-of select="@literal" disable-output-escaping="yes" />
<xsl:call-template name="unescape">
<xsl:with-param name="html" select="substring-after($html, @escaped)"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$html" disable-output-escaping="yes"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Then this output HTML is produced:
<div id="mySlides">
<div class="article">
<h2>
<a class="title" title="Article 1" href="/stories/stories-details/articletype/articleview/articleid/1/Article_1.aspx">Article 1</a>
</h2>
<div class="articlesummary" style="text-indent: 25px;">
<p>Article 1 summary &amp; description.</p>
</div>
</div>
<div class="article">
<h2>
<a class="title" title="Article 2" href="/stories/stories-details/articletype/articleview/articleid/2/Article_2.aspx">Article 2</a>
</h2>
<div class="articlesummary" style="text-indent: 25px;">
<p>Article 2 summary &amp; description.</p>
</div>
</div>
</div>
<div id="myController">
<span class="jFlowControl">aa Article 1</span>
<span class="jFlowControl">aa Article 2</span>
</div>
Note
the use of a temporary namespace and embedded elements (<my:unescape>) to create a list of characters to replace
the use of recursion to emulate an iterative replacement of all affected characters in the input
the use of the implicit context within the unescape template to transport the information which character is to be replaced at the moment
Furthermore note:
the use of template modes to get different output for the same input (this replaces your templatenumber parameter)
most of the time there is no need for <xsl:attribute> elements. They can safely be replaced by inline notation (attributename="{attributevalue}")
the use of the concat() function to create the URL
Generally speaking, it is a bad idea to store escaped HTML in a database (more generally speaking: It is a bad idea to store HTML in a database.). You set yourself up to get all kinds of problems, this being one of them. If you can't change this setup, I hope that the solution helps you.
I cannot guarantee that it does the right thing in all situations, and it may open up security holes (think XSS), but dealing with this was not part of the question. In any case, consider yourself warned.
I need a break now. ;-)
You shouldn't store escaped HTML in your database. If your database contained the actual "<" character, then the "disable-output-escaping" command would do what you wanted.
If you can't change the data then you'll have to unescape the data before you perform the transform.
Add this line to your stylesheet
<xsl:output method="html" indent="yes" version="4.0"/>
It is a bad idea to store HTML in a database
What? How are you supposed to store it then? In an XML doc so you have to use XSLT anyway? As a web developer, we've always used SQL databases to store user-defined HTML data. There's nothing wrong with that method as long as it is sanitized properly for your purposes.