I have XML that has encoded HTML data. I am trying to render the data but can't seem to figure out how. Best I can tell is I need to disable-output-escaping="yes" twice but not sure how to do that.
For example, this is a snippet of my XML:
<root>
<node value="&amp;lt;b&amp;gt;body&amp;lt;/b&amp;gt;" />
</root>
My XSLT is outputting HTML. Here is the rendered output (the HTML source) with various options:
<xsl:value-of select="@value" /> outputs &amp;lt;b&amp;gt;hi&amp;lt;/b&amp;gt;
<xsl:value-of select="@value" disable-output-escaping="yes" /> outputs &lt;b&gt;hi&lt;/b&gt;
I would like it to output <b>hi</b> to the HTML source so it's actually rendered as a bolded hi. Does that make sense? Is that possible?
Escaping is the process of turning < into &lt;. If you disable escaping, it will leave < as <. What you want to achieve is to turn &lt; into <, which would normally be called "unescaping".
In the normal course of events, a parser performs unescaping, while a serializer performs escaping. So if you want to unescape characters, you need to put them through a parsing process, which means you need to take the content of the @value attribute and put it through an operation like fn:parse-xml-fragment() in XPath 3.0, or an equivalent extension function in your chosen processor.
Assuming SharePoint, as a Microsoft .NET product, uses XslCompiledTransform, you could try to implement the unescaping and parsing with extension "script" (C# or VB or JScript.NET code embedded in XSLT) as follows:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="msxsl mf">
<msxsl:script language="C#" implements-prefix="mf">
<msxsl:using namespace="System.IO"/>
public string Unescape(string input)
{
XmlDocument doc = new XmlDocument();
XmlDocumentFragment frag = doc.CreateDocumentFragment();
frag.InnerXml = input;
return frag.InnerText;
}
public XPathNavigator ParseXml(string xmlInput)
{
using (StringReader sr = new StringReader(xmlInput))
{
return new XPathDocument(sr).CreateNavigator();
}
}
</msxsl:script>
<xsl:output method="html" doctype-public="XSLT-compat" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
<xsl:template match="/">
<html>
<head>
<title>Test</title>
</head>
<xsl:apply-templates/>
</html>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node">
<div>
<xsl:copy-of select="mf:ParseXml(mf:Unescape(@value))" />
</div>
</xsl:template>
</xsl:stylesheet>
If you have access to an XSLT processor (like any version of Saxon 9.7 or Exselt or the latest Altova or XmlPrime) supporting the XPath 3 functions parse-xml and parse-xml-fragment you can write that template without extension functions (in a version="3.0" stylesheet) as
<xsl:template match="node">
<div>
<xsl:copy-of select="parse-xml(string(parse-xml-fragment(@value)))"/>
</div>
</xsl:template>
Output your result with disable-output-escaping, then treat it again in another XSL with disable-output-escaping.
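The two-pass idea can be sketched roughly as follows, assuming the attribute value is escaped twice as the question suggests, and that the processor serializes its output (disable-output-escaping is an optional feature and has no effect when the result tree is passed on in memory). The file names pass1.xsl and pass2.xsl and the intermediate root/node structure are made up for illustration.

```xml
<!-- pass1.xsl (hypothetical name): removes one level of escaping.
     Run it on the original XML and save the serialized result to a file. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/">
    <root>
      <xsl:for-each select="root/node">
        <!-- The attribute's string value is written without re-escaping -->
        <node><xsl:value-of select="@value" disable-output-escaping="yes"/></node>
      </xsl:for-each>
    </root>
  </xsl:template>
</xsl:stylesheet>

<!-- pass2.xsl (hypothetical name): run on the saved output of pass1.xsl.
     The text of each node element now contains literal markup, which is
     again written without escaping. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <body>
        <xsl:for-each select="root/node">
          <div><xsl:value-of select="." disable-output-escaping="yes"/></div>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The two stylesheets are separate files shown in one listing. Because this relies on serializing and re-parsing between the passes, the parsing approaches shown earlier are usually more robust.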
I'm trying to transform an XML document into a single line and wrap it in a one-element JSON object, using XSLT 1.0.
The problem is that XSL generates double quotes in the xmlns definitions, so the resulting JSON is invalid.
This is my input:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<otm:Transmission xmlns:otm='http://xmlns.oracle.com/apps/otm/transmission/v6.4' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
<otm:TransmissionHeader/>
<otm:TransmissionBody>
<otm:GLogXMLElement>
<otm:Invoice>
<otm:Payment>
<otm:PaymentHeader>
<otm:DomainName>CompanyX</otm:DomainName>
<otm:TransactionCode>EX</otm:TransactionCode>
<otm:InvoiceDate>
<otm:GLogDate>20220414000000</otm:GLogDate>
</otm:InvoiceDate>
</otm:PaymentHeader>
</otm:Payment>
</otm:Invoice>
</otm:GLogXMLElement>
</otm:TransmissionBody>
</otm:Transmission>
This is what I'm getting:
{"jsonElement":"<otm:Transmission xmlns:otm="http://xmlns.oracle.com/apps/otm/transmission/v6.4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><otm:TransmissionHeader/><otm:TransmissionBody><otm:GLogXMLElement><otm:Invoice><otm:Payment><otm:PaymentHeader><otm:DomainName>CompanyX</otm:DomainName><otm:TransactionCode>EX</otm:TransactionCode><otm:InvoiceDate><otm:GLogDate>20220414000000</otm:GLogDate></otm:InvoiceDate></otm:PaymentHeader></otm:Payment></otm:Invoice></otm:GLogXMLElement></otm:TransmissionBody></otm:Transmission>"}
The XSL that I use:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:otm='http://xmlns.oracle.com/apps/otm/transmission/v6.4'>
<xsl:output method="text" indent="no" suppress-indentation="otm:Transmission"/>
<xsl:strip-space elements="*" />
<xsl:template match="/">
{"jsonElement":"<xsl:apply-templates select="*"/>"}
</xsl:template>
<xsl:template match="@* | *">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
As you can see, the JSON is invalid due to double quotes in the xmlns definitions.
I tried several approaches and I am not able to get rid of the double quotes. In the input, they are single quotes but XSL is generating them differently.
What would be the best approach to have a valid JSON result?
The XML in the JSON has to be 1:1 copy of the input but transformed into a single line and I can only use XSLT 1.0
An XSLT 1.0 processor is going to reject the suppress-indentation attribute, and it's going to output the text of the source document without markup. Like @MartinHonnen, I don't see how any XSLT processor can give you the output you claim to be getting.
In XSLT 3.0 you can do
<xsl:output method="json"/>
<xsl:template match="/">
<xsl:map-entry key="'jsonElement'"
select="serialize(., map{'method':'xml'})"/>
</xsl:template>
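Since the question is restricted to XSLT 1.0, one possible workaround is to write the markup out by hand with the text output method, using single quotes so the result stays valid inside a JSON string. The following is only a sketch under several assumptions: the mode name serialize and the variable apos are made up here, and text or attribute values containing characters such as &, <, ', " or \ are not escaped, so it is not a general 1:1 serializer. Comments and processing instructions are dropped, and it relies on the namespace axis, which not every processor supports.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- A single quote, used to delimit attribute values in the output -->
  <xsl:variable name="apos">'</xsl:variable>

  <xsl:template match="/">
    <xsl:text>{"jsonElement":"</xsl:text>
    <xsl:apply-templates select="*" mode="serialize"/>
    <xsl:text>"}</xsl:text>
  </xsl:template>

  <xsl:template match="*" mode="serialize">
    <xsl:value-of select="concat('&lt;', name())"/>
    <!-- Emit only namespace declarations not already in scope on the parent -->
    <xsl:for-each select="namespace::*[name() != 'xml']">
      <xsl:variable name="prefix" select="name()"/>
      <xsl:variable name="uri" select="string(.)"/>
      <xsl:if test="not(../../namespace::*[name() = $prefix and string(.) = $uri])">
        <xsl:text> xmlns</xsl:text>
        <xsl:if test="$prefix != ''">
          <xsl:value-of select="concat(':', $prefix)"/>
        </xsl:if>
        <xsl:value-of select="concat('=', $apos, $uri, $apos)"/>
      </xsl:if>
    </xsl:for-each>
    <!-- Attributes, written with single quotes and no escaping -->
    <xsl:for-each select="@*">
      <xsl:value-of select="concat(' ', name(), '=', $apos, ., $apos)"/>
    </xsl:for-each>
    <xsl:choose>
      <xsl:when test="node()">
        <xsl:text>&gt;</xsl:text>
        <!-- Text nodes fall through to the built-in rule, which copies them -->
        <xsl:apply-templates mode="serialize"/>
        <xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
      </xsl:when>
      <xsl:otherwise>/&gt;</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

If the real data can contain characters that need escaping, a recursive named template for string replacement would have to be added, or the JSON wrapping done outside XSLT.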
I would like to use a single XSL stylesheet to produce multiple output formats (XML and HTML for now).
I would like to select the output format by means of a stylesheet parameter.
So the code I have is as follows:
<xd:doc scope="stylesheet">
<xd:desc>
<xd:p><xd:b>Created on:</xd:b> July 1, 2015</xd:p>
<xd:p><xd:b>Author:</xd:b> me</xd:p>
<xd:p>A stylesheet to test the application of XYZABC</xd:p>
<xd:p>takes a single parameter - xslt_output_format</xd:p>
<xd:p>valid inputs - xml html</xd:p>
</xd:desc>
</xd:doc>
<xsl:output name="xml_out" encoding="UTF-8" indent="yes" method="xml" />
<xsl:output name="html_out" encoding="ISO-8859-1" indent="yes" method="html"/>
<xsl:template match="/">
<xsl:choose>
<xsl:when test="$xslt_output_format = 'xml'">
<data>
<p>This is some test xml output</p>
</data>
</xsl:when>
<xsl:when test="$xslt_output_format = 'html'">
<html>
<head>
<title>HTML Test Output</title>
</head>
<body>
<p>This is some test html output</p>
</body>
</html>
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
If I pass 'xml' as the parameter I get
This is some test xml output
and if I pass 'html' I get
HTML Test Output
This is some test html output
That doesn't seem to respect my request for ISO-8859-1 encoding on the HTML (which I was just using to test that it was working).
Michael Kay's XSLT 2.0 and XPath 2.0 tome is a little vague and definitely short of examples on using multiple xsl:output statements (sorry Mike).
So I am just asking am I using it correctly?
Can I achieve what I am aiming for?
TIA
Feargal
I think you need to use xsl:output together with xsl:result-document http://www.w3.org/TR/xslt20/#creating-result-trees, so try along the lines of
<xsl:template match="/">
<xsl:choose>
<xsl:when test="$xslt_output_format = 'xml'">
<xsl:result-document format="xml_out" href="output.xml">
<data>
<p>This is some test xml output</p>
</data>
</xsl:result-document>
</xsl:when>
<xsl:when test="$xslt_output_format = 'html'">
<xsl:result-document format="html_out" href="output.html">
<html>
<head>
<title>HTML Test Output</title>
</head>
<body>
<p>This is some test html output</p>
</body>
</html>
</xsl:result-document>
</xsl:when>
</xsl:choose>
</xsl:template>
I would probably use templates and modes to distinguish the two different ways of processing but the advice on using xsl:output and xsl:result-document remains the same.
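A sketch of that templates-and-modes variant might look like the following. The mode names xml and html are arbitrary, the xsl:param declaration and its default value are assumptions (the question's listing does not show how the parameter is declared), and any value other than 'html' falls back to the XML branch here.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed declaration; not shown in the question -->
  <xsl:param name="xslt_output_format" select="'xml'"/>

  <xsl:output name="xml_out" encoding="UTF-8" indent="yes" method="xml"/>
  <xsl:output name="html_out" encoding="ISO-8859-1" indent="yes" method="html"/>

  <!-- Dispatcher: picks the named output format and the matching mode -->
  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="$xslt_output_format = 'html'">
        <xsl:result-document format="html_out" href="output.html">
          <xsl:apply-templates select="." mode="html"/>
        </xsl:result-document>
      </xsl:when>
      <xsl:otherwise>
        <xsl:result-document format="xml_out" href="output.xml">
          <xsl:apply-templates select="." mode="xml"/>
        </xsl:result-document>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/" mode="xml">
    <data>
      <p>This is some test xml output</p>
    </data>
  </xsl:template>

  <xsl:template match="/" mode="html">
    <html>
      <head>
        <title>HTML Test Output</title>
      </head>
      <body>
        <p>This is some test html output</p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Keeping the per-format content in separate modes means each format can grow its own set of templates without the xsl:choose getting longer.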
xsl:output by itself doesn't allow you to make any run-time selection of output method.
In 2.0 you can use xsl:output in conjunction with xsl:result-document: the xsl:result-document can select a named xsl:output declaration, or it can override some of its attributes selectively.
Another option (also available in 1.0) is to override xsl:output from the calling API: if you're using JAXP look at Transformer.setOutputProperty().
On the Saxon command line you can set output properties using the syntax !indent=yes (the "!" needs to be written as "\!" in some shells).
I have the following XML and XSLT to transform to HTML.
XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<te>t1</te>
</root>
XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="yes" />
<xsl:template match="root">
<html>
<div>
<xsl:variable name="name1" select="te" />
**
<xsl:value-of select="CtrlList['$name1']" />
**
</div>
<script language="javascript">var List={
"t1":"test"
}</script>
</html>
</xsl:template>
</xsl:stylesheet>
So my objective is to get the value of "te" from the XML, look it up in the JavaScript object "List", and return the value test while transforming with the XSLT. So I should get the value test as output.
Can anyone figure out what is wrong in the XSLT?
When you look at your XSLT, it may seem like there is JavaScript there, but all XSLT sees is that it is outputting an element named "script", with an attribute "language", which contains some text. It is also worth noting that xsl:value-of is used to get the value from the input document, but your script element is actually part of the result tree, and so not accessible to xsl:value-of.
However, it is possible to extend XSLT so it can use JavaScript functions, but this is very much processor dependent, and you should not think of it the same way as embedding JavaScript in HTML. Have a look at this question, as an example:
How to include javaScript file in xslt
So, in your case, your XSLT would be something like this (note that this particular example will only work in Microsoft's MSXML processor):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:user="http://mycompany.com/mynamespace"
exclude-result-prefixes="msxsl user">
<xsl:output method="xml" indent="yes" />
<msxsl:script language="JScript" implements-prefix="user">
var List={
"t1":"test"
}
function lookup(key) {
return List[key];
}
</msxsl:script>
<xsl:template match="root">
<html>
<div>
<xsl:variable name="name1" select="te"/>
<xsl:value-of select="user:lookup(string($name1))"/>
</div>
</html>
</xsl:template>
</xsl:stylesheet>
Of course, it might be worth asking why you want to use JavaScript in your XSLT. It may be possible to achieve the same result using purely XSLT, which would certainly make your XSLT more portable.
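As a rough illustration of the pure-XSLT route, the lookup table could be embedded in the stylesheet itself and read back with document(''). The lk namespace URI and the lk:list and lk:entry element names are made up for this sketch, and document('') only works when the processor knows the stylesheet's own URI (for example when it is loaded from a file).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:lk="http://example.com/lookup"
    exclude-result-prefixes="lk">
  <xsl:output method="html" indent="yes"/>

  <!-- Lookup table stored as top-level data in a non-XSLT namespace -->
  <lk:list>
    <lk:entry key="t1">test</lk:entry>
  </lk:list>

  <xsl:template match="root">
    <html>
      <div>
        <xsl:variable name="name1" select="te"/>
        <!-- document('') loads the stylesheet; pick the entry whose key matches -->
        <xsl:value-of select="document('')/*/lk:list/lk:entry[@key = $name1]"/>
      </div>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, the div should contain the text test. The same table could also live in a separate XML file and be loaded with document('lookup.xml'), where that file name is again hypothetical.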
I want to match multiple values of an attribute for replacing. For example,
<div class="div h1 full-width"></div>
should produce div, h1 and full-width as separate matches.
I want to do this to prefix the classes, so instead of div h1 full-width it should be pre-div pre-h1 pre-full-width.
The regex I have so far is
(?<=class=["'])(\b-?[_a-zA-Z]+[_a-zA-Z0-9-]*\b)+
This matches only the first class, which is of course all this pattern can match. I tried to make the lookbehind take more than just class=" but I just end up with it consuming everything and leaving nothing to replace.
I want to make a pattern that matches each value individually between the quotes of the class attribute.
I want to do this for an Ant build script that processes all files and replaces class="value1 value2 value3" with prefixed values. I've done this with little trouble for the classes in CSS files, but HTML seems to be a lot trickier.
The Java regexp package is used to process the pattern, and the Ant task used is replaceregexp.
The Ant implementation of the above pattern is:
<target name="prefix-class" depends="">
<replaceregexp flags="g">
<regexp pattern="(?&lt;=class=['&quot;])(\b-?[_a-zA-Z]+[_a-zA-Z0-9-]*\b)+"/>
<substitution expression=".${prefix}\1"/>
<fileset dir="${dest}"/>
</replaceregexp>
</target>
I don't think you can find n (in your case 3) different class entries and substitute them in one simple regexp. If you need to do this in Ant, I think you have to write your own Ant task. A better way would be XSLT; are you familiar with XSLT?
I gave up on Ant's ReplaceRegExp and solved my problem with XSLT, transforming XHTML to XHTML.
The following code adds a prefix to all values of an element's class attribute. The XHTML source document must be well-formed to be parsed.
<xsl:stylesheet version="2.0"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xhtml xsl xs">
<xsl:output method="xml" version="1.0" encoding="UTF-8"
doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1.dtd"
indent="yes" omit-xml-declaration="yes"/>
<xsl:param name="prefix" select="'oo-'"/>
<xsl:template match="/">
<xsl:apply-templates select="./@*|./node()" />
</xsl:template>
<!--remove these atts from output, default xhtml values from dtd -->
<xsl:template match="xhtml:a/@shape"/>
<xsl:template match="@rowspan"/>
<xsl:template match="@colspan"/>
<xsl:template match="@class">
<xsl:variable name="replace_regex">
<xsl:value-of select="$prefix"/>
<xsl:text>$1</xsl:text>
</xsl:variable>
<xsl:attribute name="class">
<xsl:value-of select="fn:replace( . , '(\w+)' , $replace_regex )"/>
</xsl:attribute>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
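One caveat with the '(\w+)' pattern: in XPath 2.0 regular expressions \w does not match a hyphen, so a class such as full-width is treated as two words and each half gets the prefix. A possible alternative, sketched here as a stripped-down stylesheet without the XHTML-specific parts, is to split the attribute on whitespace with tokenize() and prefix each token. The default prefix value 'pre-' is taken from the question's example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="prefix" select="'pre-'"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Prefix every whitespace-separated class name; the items of the
       selected sequence are joined with single spaces in the attribute -->
  <xsl:template match="@class">
    <xsl:attribute name="class"
        select="for $c in tokenize(normalize-space(.), ' ')
                return concat($prefix, $c)"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to class="div h1 full-width", this yields class="pre-div pre-h1 pre-full-width".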
I'm starting to use XSLT and wrote this script:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8" />
<xsl:template match="span[@class='thumb']" >
Link: <xsl:value-of select="$base" /><xsl:value-of select="a/@href" />
</xsl:template>
<xsl:template match="/">
Base href: <xsl:value-of select="$base" />
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
And using this command:
xsltproc --html --param base "'http://example.com'" lista.xslt test.html
I need to get a list of links, but I get the whole page as output. What's wrong? How can I get it to work?
There are some default templates which are unseen here. The really easy way to resolve it is to just explicitly limit to the span elements you're matching as below. Otherwise, you can override the default templates.
<xsl:template match="/">
Base href: <xsl:value-of select="$base" />
<xsl:apply-templates select="//span[@class='thumb']" />
</xsl:template>
There's a default template that matches essentially everything if you let it. The <xsl:apply-templates/> call in your root template invokes it.
That's part of the problem. The rest can probably be taken care of by matching just the stuff you're looking for, directly in the top-level template.
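The other route mentioned above, overriding the default templates, could look roughly like this. The built-in rule for text nodes copies their content to the output, which is where the rest of the page comes from; an empty template for text() suppresses it. The xsl:param declaration for base is an addition so the stylesheet stands on its own, and the explicit line breaks are a stylistic choice of this sketch.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="utf-8"/>

  <!-- Supplied on the command line, e.g. via xsltproc's param option -->
  <xsl:param name="base"/>

  <xsl:template match="/">
    <xsl:value-of select="concat('Base href: ', $base, '&#10;')"/>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="span[@class='thumb']">
    <xsl:value-of select="concat('Link: ', $base, a/@href, '&#10;')"/>
  </xsl:template>

  <!-- Override the built-in rule so stray text nodes produce no output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The built-in rule for elements still recurses through the whole document, so every matching span is reached wherever it sits in the page.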