Using XSLT to span text between two empty nodes - html

I have an XML file with a series of pairings like the following:
<metamark function="let-stand" spanTo="#meta-93"/>some text between the two empty nodes<anchor xml:id="meta-93"/>
In other words, the text is always preceded by a metamark tag with @function='let-stand' and a @spanTo attribute with a unique value. And the text is always followed by an anchor tag whose @xml:id value matches the @spanTo value on the metamark (without the leading #).
When transforming such text via XSLT into HTML, I would like to wrap it in a span tag as follows:
<span class="dotted">some text between the two empty nodes</span>
How can I achieve this? Note that the two empty nodes and the text between them will always be siblings. The value I've put in the span's @class is arbitrary; I'm just using "dotted" for demonstration purposes here.

The basic idea is, for each metamark, to:
create a span tag,
select the following siblings of the current metamark
that themselves have, as a following sibling, the anchor tag with the matching id (the end point, exclusive),
and apply templates to them.
Of course, you have to block "normal" template application within the parent tag of your metamark tags, otherwise the wrapped nodes would be output a second time.
Try the following transformation:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="html" doctype-public="XSLT-compat"
encoding="UTF-8" indent="yes" />
<xsl:template match="metamark">
<xsl:element name="span">
<xsl:attribute name="class" select="'dotted'"/>
<xsl:variable name="termId" select="substring(@spanTo, 2)"/>
<xsl:variable name="srcRange" select="following-sibling::node()
[following-sibling::anchor[@xml:id=$termId]]"/>
<xsl:apply-templates select="$srcRange"/>
</xsl:element>
<xsl:text>
</xsl:text>
</xsl:template>
<!-- In "main" process only "metamark" tags -->
<xsl:template match="main">
<xsl:apply-templates select="metamark"/>
</xsl:template>
<!-- HTML envelope -->
<xsl:template match="/">
<html>
<body>
<xsl:text>
</xsl:text>
<xsl:apply-templates />
</body>
</html>
</xsl:template>
<!-- Identity transform -->
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
</xsl:transform>
I tried it for the following XML sample:
<?xml version="1.0" encoding="utf-8"?>
<main>
<metamark function="let-stand" spanTo="#meta-93"/>Aaaaaa bbbbbbb<anchor xml:id="meta-93"/>
<metamark function="let-stand" spanTo="#meta-94"/>Eeeeee <b>bbb</b> ccc<anchor xml:id="meta-94"/>
<metamark function="let-stand" spanTo="#meta-95"/>Ffffff bbbbbbb<anchor xml:id="meta-95"/>
</main>
and got result:
<!DOCTYPE html PUBLIC "XSLT-compat">
<html>
<body>
<span class="dotted">Aaaaaa bbbbbbb</span>
<span class="dotted">Eeeeee <b>bbb</b> ccc</span>
<span class="dotted">Ffffff bbbbbbb</span>
</body>
</html>
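If only an XSLT 1.0 processor is available, the same idea can be sketched roughly as follows. The select attribute on xsl:attribute used above is XSLT 2.0 syntax, so this variant writes the span as a literal result element instead. It assumes the same main wrapper element as the sample input, and I have not run it against every processor, so treat it as a starting point rather than a finished stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Wrap everything between a metamark and its matching anchor. -->
  <xsl:template match="metamark[@function='let-stand']">
    <!-- Strip the leading '#' from spanTo to get the anchor's xml:id. -->
    <xsl:variable name="termId" select="substring(@spanTo, 2)"/>
    <span class="dotted">
      <xsl:apply-templates
          select="following-sibling::node()[following-sibling::anchor[@xml:id = $termId]]"/>
    </span>
  </xsl:template>

  <!-- Assumed wrapper element: only the metamarks drive the output,
       so the nodes between the markers are not emitted twice. -->
  <xsl:template match="main">
    <xsl:apply-templates select="metamark"/>
  </xsl:template>

  <!-- Identity transform for the nodes that end up inside the span. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

As in the 2.0 version, this relies on each metamark/anchor pair being siblings and on the pairs not overlapping.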

Related

Transformation of nested HTML span element

We are trying to transform an HTML template with placeholders into the final HTML using XSLT.
The (simplified) HTML template looks like:
<p>
<span class="condition" id="v6">Some text here
<span class="placeholder" id="v1" />
</span>
</p>
The transformation should
replace every span element with placeholder class;
hide or show each span element that contains a condition class
The (simplified) XSLT we have is:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:local="urn:local"
xmlns:s0="http://www.w3.org/1999/xhtml">
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />
<!-- Take the HTML template -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!--Replace every placeholder in the HTML template with the value from the XML data-->
<xsl:template match="span[@class='placeholder']">
<xsl:variable name="this" select="current()/@id"/>
<xsl:value-of select="'replaced'" />
</xsl:template>
<!-- Handle conditions based on custom logic -->
<xsl:template match="span[@class='condition']">
<xsl:variable name="this" select="current()/@id"/>
<xsl:if test="$this = 'v6'">
<xsl:value-of select="current()" />
</xsl:if>
</xsl:template>
</xsl:stylesheet>
If the placeholder span is not nested inside the condition span, everything works fine. With the HTML example above, however, it doesn't work; the output is:
<p>Some text here</p>
We would like it to be:
<p>
Some text here
replaced
</p>
It seems as if the condition template is executed, but the placeholder template is not executed or is somehow overridden; so basically, the placeholder span elements are not being replaced.
Does anyone have an explanation for this, and for what we are doing wrong?
Thanks!
Meanwhile, I found a solution:
<!-- Handle conditions based on custom logic -->
<xsl:template match="span[@class='condition']">
<xsl:variable name="this" select="current()/@id"/>
<xsl:if test="$this = 'v6'">
<!-- changed line: apply templates instead of xsl:value-of -->
<xsl:apply-templates select="node()"/>
</xsl:if>
</xsl:template>
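Instead of taking the string value of the current node with xsl:value-of (which flattens all descendants to plain text), I apply templates to its child nodes, so the nested placeholder template gets a chance to match.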

XML encoded HTML to XSLT

How do I take XML-encoded HTML and create an XSLT for it? I have the XML/HTML page linked to the XSLT file, and it shows the text from the document but will not show the link or picture. The image in the XML/HTML is in a folder called images within the folder where the XML and XSLT are.
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="XLST.xslt"?>
<html>
<head>
<title>CATS! CATS! CATS!</title>
</head>
<body>
<h1>CATS</h1>
<p>
Visit Google...
</p>
Cats like milk!
<p>
<image> <img src="Cats.jpg" alt="Cats, so cute!"/></image>
</p>
</body>
</html>
And the XSLT file:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="html">
<xsl:apply-templates select="body" />
</xsl:template>
<xsl:template match="body">
<img alt="">
<xsl:attribute name="src">
<xsl:value-of select="//Cats"/>
</xsl:attribute>
</img>
</xsl:template>
</xsl:stylesheet>
Transforming an existing xhtml document to an almost identical xhtml document with just a few additions isn't something you would normally do, but it's relatively easy. Start with a stylesheet that only includes the identity template:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This will output the xhtml exactly as-is. You can then add templates to supplement your source xhtml. For example, adding this:
<xsl:template match="body">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<img alt="" src="(insert url here)"/>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
will add an image at the start of the body element. It copies the element itself, then its attributes, inserts an image, then processes the child nodes.
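The same pattern could be pointed at the markup in the question. The following is only a sketch under two assumptions of mine: that the non-HTML image wrapper around img is what should be dropped, and that the picture path needs the images/ folder prefix mentioned in the question. Neither is confirmed by the asker.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Identity template: copy everything unless a more specific template matches. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: drop the <image> wrapper but keep processing its children. -->
  <xsl:template match="image">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

  <!-- Assumption: the picture lives in an "images" subfolder, so prefix the path. -->
  <xsl:template match="img/@src">
    <xsl:attribute name="src">
      <xsl:value-of select="concat('images/', .)"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
```

Because the specific templates have a higher default priority than the identity template, they win for image elements and for src attributes on img, while everything else is copied unchanged.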

Adding HTML tags with XSLT on the fly

I have an XML document containing this:
<d1/>
<p1>...</p1>
<p2>...</p2>
<d2/>
<p3>...</p3>
<d3/>
Here the pn are elements that may contain subelements and other content, and each dn marks where an HTML div wrapping the following p elements should begin. There is no corresponding closing marker; the end is only implied by the next dn tag. The desired HTML output is this:
<div>
<p1>...</p1>
<p2>...</p2>
</div>
<div>
<p3>...</p3>
</div>
I have written an XSLT to introduce the <div> and </div> tags on the fly, using the following:
<xsl:text disable-output-escaping="yes">&lt;div&gt;</xsl:text>
and
<xsl:text disable-output-escaping="yes">&lt;/div&gt;</xsl:text>
and this works in Safari, but it fails in Firefox, which makes me suspect that it's not the right way to do it.
Do you have a better idea that will work on every browser?
Thanks a lot in advance.
Firefox does not support disable-output-escaping because it does not serialize the result tree. What you have is really a grouping problem; one way to solve it is to use a key:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes"/>
<xsl:key name="group" match="body/*[not(starts-with(local-name(), 'd'))]" use="generate-id(preceding-sibling::*[starts-with(local-name(), 'd')][1])"/>
<xsl:template match="body">
<xsl:copy>
<xsl:apply-templates select="*[starts-with(local-name(), 'd')]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[starts-with(local-name(), 'd')]">
<div>
<xsl:copy-of select="key('group', generate-id())"/>
</div>
</xsl:template>
</xsl:transform>
That would, however, create an empty div for the last d element in your sample, so you might want to change the last template to
<xsl:template match="*[starts-with(local-name(), 'd')]">
<xsl:if test="key('group', generate-id())">
<div>
<xsl:copy-of select="key('group', generate-id())"/>
</div>
</xsl:if>
</xsl:template>
You could use a technique known as "sibling recursion".
Given a well-formed input such as:
XML
<root>
<d1/>
<p1>a</p1>
<p2>b</p2>
<d2/>
<p3>c</p3>
<d3/>
</root>
the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/root">
<body>
<xsl:apply-templates select="*[starts-with(name(), 'd')][position()!=last()]"/>
</body>
</xsl:template>
<xsl:template match="*[starts-with(name(), 'd')]">
<div>
<xsl:apply-templates select="following-sibling::*[1][not(starts-with(name(), 'd'))]"/>
</div>
</xsl:template>
<xsl:template match="/root/*[not(starts-with(name(), 'd'))]">
<xsl:copy-of select="."/>
<xsl:apply-templates select="following-sibling::*[1][not(starts-with(name(), 'd'))]"/>
</xsl:template>
</xsl:stylesheet>
will return:
<body>
<div>
<p1>a</p1>
<p2>b</p2>
</div>
<div>
<p3>c</p3>
</div>
</body>

Convert 'embedded' XML doc into CDATA output in XSLT (1.0)

Given an input XML document like this:
<?xml version="1.0" encoding="utf-8"?>
<root>
<title> This contains an 'embedded' HTML document </title>
<document>
<html>
<head><title>HTML DOC</title></head>
<body>
Hello World
</body>
</html>
</document>
</root>
How can I extract that 'inner' HTML document, render it as CDATA, and include it in my output document?
The output document will be an HTML document that contains a text box showing the elements as text (so it will display the 'source view' of the inner document).
I have tried this:
<xsl:template match="document">
<xsl:value-of select="*"/>
</xsl:template>
But this only renders the Text Nodes.
I have tried this:
<xsl:template match="document">
<![CDATA[
<xsl:value-of select="*"/>
]]>
</xsl:template>
But this escapes the actual XSLT and I get:
<xsl:value-of select="*"/>
I have tried this:
<xsl:output method="xml" indent="yes" cdata-section-elements="document"/>
[...]
<xsl:template match="document">
<document>
<xsl:value-of select="*"/>
</document>
</xsl:template>
This does insert a CDATA section, but the output still contains just text (stripped elements):
<?xml version="1.0" encoding="UTF-8"?>
<html>
<head>
<title>My doc</title>
</head>
<body>
<h1>Title: This contains an 'embedded' HTML document </h1>
<document><![CDATA[
HTML DOC
Hello World
]]></document>
</body>
</html>
There are two confusions you need to clear up here.
First, you probably want xsl:copy-of rather than xsl:value-of. The latter returns the string value of an element, the former returns a copy of the element.
Second, the cdata-section-elements attribute on xsl:output affects the serialization of text nodes, but not of elements and attributes. One way to get what you want would be to serialize the HTML yourself, along the lines of the following (not tested):
<xsl:template match="document/descendant::*">
<xsl:value-of select="concat('&lt;', name())"/>
<!--* attributes are left as an exercise for the reader ... *-->
<xsl:text>&gt;</xsl:text>
<xsl:apply-templates/>
<xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
</xsl:template>
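One way the attribute handling left open above might be filled in is sketched below. It is untested and rests on a few assumptions: the mode name serialize is my own choice, cdata-section-elements is kept on xsl:output so the generated text lands in a CDATA section, and no attribute value or text node contains characters that would need re-escaping (the sketch writes them out as-is).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" cdata-section-elements="document"/>

  <!-- Emit the embedded markup as text inside <document>. -->
  <xsl:template match="document">
    <document>
      <xsl:apply-templates mode="serialize"/>
    </document>
  </xsl:template>

  <!-- Hypothetical "serialize" mode: write each element as literal text. -->
  <xsl:template match="*" mode="serialize">
    <xsl:value-of select="concat('&lt;', name())"/>
    <!-- One name="value" pair per attribute; values are not re-escaped. -->
    <xsl:for-each select="@*">
      <xsl:value-of select="concat(' ', name(), '=&quot;', ., '&quot;')"/>
    </xsl:for-each>
    <xsl:text>&gt;</xsl:text>
    <!-- Text nodes fall through to the built-in rule, which copies them. -->
    <xsl:apply-templates mode="serialize"/>
    <xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Empty elements come out as an open tag followed by a close tag rather than in self-closing form, which is usually acceptable for a source view.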
But the quicker way would be something like the following solution (squeamish readers, stop reading now), pointed out to me by my friend Tommie Usdin. Drop the cdata-section-elements attribute from xsl:output and replace your template for the document element with:
<xsl:template match="document">
<document>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:copy-of select="./html"/>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</document>
</xsl:template>

XSLT: all elements to div elements with class attribute = original element name

I've got an XML document in the following form (but much larger):
<entry>
<lemma>coaster</lemma>
<sense>
<trans>Untersetzer</trans>
</sense>
</entry>
What I want to get by xsl-transformation is this:
<div class="entry">
<div class="lemma">coaster</div>
<div class="sense">
<div class="trans">Untersetzer</div>
</div>
</div>
Not that complicated: transform all elements to div elements with a class attribute equal to the original element name.
Could anybody please give me a hint as to what an appropriate XSL stylesheet should look like?
Thanks!
You can do it like this (XSLT 1.0):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="*">
<div class="{local-name()}">
<xsl:apply-templates/>
</div>
</xsl:template>
</xsl:stylesheet>
Note that the stylesheet skips any attributes it encounters.
EDIT after comment
If you want to keep attributes, you just have to skip any existing class attributes (because you create a new one). For example:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="*">
<div class="{local-name()}">
<xsl:apply-templates select="node()|@*"/>
</div>
</xsl:template>
<xsl:template match="@*">
<xsl:if test="name() != 'class'">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>