Output php processing instruction inside attribute value - html

In my XSLT (2.0 - the output method is html) I have this:
<img>
<xsl:attribute name="href">
<xsl:text disable-output-escaping="yes">&lt;?php echo get_url(); ?&gt;</xsl:text>
</xsl:attribute>
</img>
The output I want is as follows:
<img href="<?php echo get_url(); ?>">
The output I get is as follows:
<img href="<?php echo get_url(); ?&gt;">
I tried a number of different things (CDATA marked sections etc.) to get ">" in the output instead of "&gt;", but nothing seems to work. It is strange that the less-than sign comes out fine but the greater-than sign doesn't. I'm using Saxon-PE 9.5.1.7.

Use a character map with some characters you don't need elsewhere, here is an example (https://www.w3.org/TR/xslt20/#character-maps) adapted from the XSLT 2.0 spec:
<img href="«?php echo get_url(); ?»"/>
and
<xsl:output method="html" use-character-maps="m1"/>
<xsl:character-map name="m1">
<xsl:output-character character="«" string="&lt;"/>
<xsl:output-character character="»" string="&gt;"/>
</xsl:character-map>
Online example is at http://xsltransform.net/93dEHFP.
As for disable-output-escaping, as far as I know it does not work in attribute values. The result you get is not an effect of disable-output-escaping but simply of xsl:output method="html" (https://www.w3.org/TR/xslt-xquery-serialization/#HTML_ATTRIBS), which mandates: 'The HTML output method MUST NOT escape "<" characters occurring in attribute values.'

Related

Value in CDATA tag not being displayed in XSL file?

I want to replace the & symbol inside of a piece of text that is generated dynamically with its encoded value %26 to prevent it from breaking the URL string. I am storing the text inside hrefvalue variable. My goal is to replace & with %26 to be output in the final HTML code in the browser.
For example:
"listening & comprehension" should become "listening %26 comprehension"
I am using <![CDATA[ to preserve %26 but this seems not to be working. I still end up with "listening & comprehension" in the browser. Why?
<xsl:variable name="hrefvalue" select="./node()" />
<xsl:choose>
<xsl:when test="contains($hrefvalue, '&amp;')">
<xsl:variable name="string-before" select="substring-before($hrefvalue, '&amp;')" />
<xsl:variable name="string-after" select="substring-after($hrefvalue, '&amp;')" />
<xsl:variable name="ampersand"><![CDATA[%26]]></xsl:variable>
<xsl:value-of select="concat($string-before, $ampersand, $string-after)" disable-output-escaping="yes" />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$hrefvalue" disable-output-escaping="no" />
</xsl:otherwise>
</xsl:choose>
I am working on a system that uses XSL 1.0.
Firstly, the CDATA is completely unnecessary and irrelevant. The only purpose of CDATA is to allow & and < to be written without escaping, and if those special characters are not present, the CDATA section makes no difference.
Also, if you're generating an attribute (which seems likely if it's a URL) then disable-output-escaping has no effect. But again, it's probably not needed.
Your code only deals with one ampersand in a string.
But your code looks fine. If there's a problem, it's in the part of the code that you haven't shown us. Try to construct a complete reproducible example: a complete stylesheet and source document that we can actually run to see if we can reproduce the problem.
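On the point that the code only deals with one ampersand: a minimal sketch of how all of them could be handled in XSLT 1.0, using a recursive named template. The template name encode-ampersands and the item input element are made up for illustration and are not taken from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: writes $text with every "&" replaced by "%26". -->
  <xsl:template name="encode-ampersands">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&amp;')">
        <!-- Emit everything before the first "&", then the encoded form. -->
        <xsl:value-of select="substring-before($text, '&amp;')"/>
        <xsl:text>%26</xsl:text>
        <!-- Recurse on the remainder so later ampersands are handled too. -->
        <xsl:call-template name="encode-ampersands">
          <xsl:with-param name="text" select="substring-after($text, '&amp;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Assumed input: <item>listening &amp; reading &amp; comprehension</item> -->
  <xsl:template match="/item">
    <xsl:call-template name="encode-ampersands">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

No CDATA and no disable-output-escaping are involved here, since %26 contains no characters that the serializer would escape.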

Getting HTML elements via XPath in bash

I was trying to parse a page (Kaggle Competitions) with xpath on MacOS as described in another SO question:
curl "https://www.kaggle.com/competitions/search?SearchVisibility=AllCompetitions&ShowActive=true&ShowCompleted=true&ShowProspect=true&ShowOpenToAll=true&ShowPrivate=true&ShowLimited=true&DeadlineColumnSort=Descending" -o competitions.html
cat competitions.html | xpath '//*[@id="competitions-table"]/tbody/tr[205]/td[1]/div/a/@href'
That's just getting a href of a link in a table.
But instead of returning the value, xpath starts validating .html and returns errors like undefined entity at line 89, column 13, byte 2964.
Since man xpath doesn't exist and xpath --help ends with nothing, I'm stuck. Also, many similar solutions relate to xpath from GNU distributions, not in MacOS.
Is there a correct way of getting HTML elements via XPath in bash?
Getting HTML elements via XPath in bash from an HTML file (that is not valid XML):
One possibility is to use xsltproc (I hope it is available on macOS). xsltproc has an option --html to accept HTML as input, but with it you need an XSLT stylesheet:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" />
<xsl:template match="/*">
<xsl:value-of select="//*[@id='competitions-table']/tr[205]/td[1]/div/a/@href" />
</xsl:template>
</xsl:stylesheet>
Notice that the XPath is changed: there is no tbody in the input file.
Call xsltproc:
xsltproc --html test.xsl competitions.html 2> /dev/null
Here xsltproc's complaints about errors in the HTML are ignored (sent to /dev/null).
The output is: /c/R
To use different XPath expressions from the command line, you can use an XSLT template file and replace the __xpath__ placeholder.
E.g. XSLT template:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" />
<xsl:template match="/*">
<xsl:value-of select="__xpath__" />
</xsl:template>
</xsl:stylesheet>
Then use, e.g., sed for the replacement:
sed -e "s,__xpath__,//*[@id='competitions-table']/tr[205]/td[1]/div/a/@href," test.xslt.tmpl > test.xsl
xsltproc --html test.xsl competitions.html 2> /dev/null

How to store superscript in XML attribute and read using XSL?

I have a requirement where I need create an XML document dynamically. Some of the attributes of the nodes of this XML contain superscript Reg etc. My question is how should I store such superscript characters in XML and then read it using XSL to render as HTML. A sample XML is shown below:
<?xml version="1.0" encoding="utf-8"?>
<node name="Some text <sup>®</sup>"/>
I know this cannot be stored under a sup tag inside an attribute, as it breaks the XML. I also tried using &lt;sup&gt; and &lt;/sup&gt; in place of the opening and closing tags, but then they are rendered as the literal text <sup> in the HTML instead of actually making the content superscript.
Please let me know the solution for this problem. I have control over generation of XML. I can write it the correct way, If I know what is the right way to store superscripts.
Since you're using XSL to transform the input into HTML, I would suggest using a different method to encode the fact that some things need to be superscripts. Make up your own simple markup, for example
<node name="Some text [[®]]"/>
The markup can be anything that you can uniquely identify later and doesn't occur naturally in your data. Then in your XSL process the attribute values that can contain this markup with a custom template that converts the special markup to <sup> and </sup>. This allows you to keep the document structure (i.e. not move these string values to text nodes) and still achieve your goal.
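A minimal sketch of such a custom template in XSLT 1.0, assuming the [[ and ]] delimiters from the example above. The template name expand-sup and the wrapping span element are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical helper: turns every [[...]] run in $text into <sup>...</sup>. -->
  <xsl:template name="expand-sup">
    <xsl:param name="text"/>
    <xsl:choose>
      <!-- Only act when an opening marker is followed by a closing one. -->
      <xsl:when test="contains($text, '[[') and contains(substring-after($text, '[['), ']]')">
        <xsl:value-of select="substring-before($text, '[[')"/>
        <sup>
          <xsl:value-of select="substring-before(substring-after($text, '[['), ']]')"/>
        </sup>
        <!-- Recurse on whatever follows the closing marker. -->
        <xsl:call-template name="expand-sup">
          <xsl:with-param name="text" select="substring-after(substring-after($text, '[['), ']]')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Assumed input: <node name="Some text [[®]]"/> -->
  <xsl:template match="node">
    <span>
      <xsl:call-template name="expand-sup">
        <xsl:with-param name="text" select="@name"/>
      </xsl:call-template>
    </span>
  </xsl:template>
</xsl:stylesheet>
```

An unmatched [[ is left in the output as-is rather than guessed at.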
Because attributes can only contain values (no nodes), the solution is to store markup (nodes) inside elements:
<node>
<name>Some text <sup>®</sup></name>
</node>
If it's only single characters like ® that need to be made superscript, then you can leave the XML without workarounds like <sup>, i.e. like
<node name="Some text ®"/>
and look for the to-be-superscripted characters during processing. A template like this might help:
<xsl:template match="node/@name">
<xsl:param name="nameString" select="string()"/>
<!-- We're stepping through the string character by character -->
<xsl:variable name="firstChar" select="substring($nameString,1,1)"/>
<xsl:choose>
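<!-- '®' can be extended to be a longer string of single characters
that are meant to be turned into superscript. The empty-string check
is needed because contains() is true for an empty second argument. -->
<xsl:when test="$firstChar != '' and contains('®',$firstChar)">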
<sup><xsl:value-of select="$firstChar"/></sup>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$firstChar"/>
</xsl:otherwise>
</xsl:choose>
<!-- If we didn't yet step through the whole string,
chop off the first character and recurse. -->
<xsl:if test="$firstChar!=''">
<xsl:apply-templates select=".">
<xsl:with-param name="nameString" select="substring($nameString,2)"/>
</xsl:apply-templates>
</xsl:if>
</xsl:template>
This approach is however not very efficient, especially if you have lots of name attributes and/or very long name attributes. If your application is performance critical, then better do some testing whether the impact on processing times is justifiable.

Is there a way to detect numeric string in xslt?

I am now doing an HTML to XML XSLT transformation, pretty straightforward. But I have one slight problem that is left unsolved.
For example, in my source html, a node looks like:
<p class="Arrow"><span class="char-style-override-12">4</span><span class="char-style-override-13"> </span>Sore, rash, growth, discharge, or swelling.</p>
As you can see, the first child node <span> has a value of 4; it is actually rendered as an arrow point in the browser (maybe some encoding issue; it is treated as a numeric value in my XML editor).
So my question is this. I wrote a template to match the tag, then pass its text content to another template match:
<xsl:template match="text()">
<xsl:variable name="noNum">
<xsl:value-of select="normalize-space(translate(.,'4',''))"/>
</xsl:variable>
<xsl:copy-of select="$noNum"/>
</xsl:template>
As you can see, this is definitely not a good solution: it will replace every occurrence of that digit in the string, not only the first character. So I wonder if there is a way to remove only the first character IF it is a number, maybe using a regular expression? Or am I going the wrong way, and is there a better way to solve this problem (e.g. changing the encoding)?
Any idea is welcomed! Thanks in advance!
Just use this:
<xsl:variable name="test">4y4145</xsl:variable>
<xsl:if test= "not(string(number(substring($test,1,1)))='NaN')">
<xsl:message terminate="no">
<xsl:value-of select="substring($test,2)"/>
</xsl:message>
</xsl:if>
This is an XSLT 1.0 solution. I think regex is overkill for this.
Output:
[xslt] y4145
Use this single XPath expression:
concat(translate(substring(.,1,1), '0123456789', ''),
substring(.,2)
)
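For context, a minimal sketch of how that expression might sit in a full XSLT 1.0 stylesheet. It assumes the p element from the question as input, and the output element name item is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Assumed input: the <p class="Arrow"> element shown in the question. -->
  <xsl:template match="p[@class='Arrow']">
    <!-- The string value of p includes the text of its span children. -->
    <xsl:variable name="s" select="string(.)"/>
    <item>
      <!-- Drop the first character only if it is a digit, keep the rest,
           then trim the whitespace left behind by the second span. -->
      <xsl:value-of select="normalize-space(concat(
          translate(substring($s, 1, 1), '0123456789', ''),
          substring($s, 2)))"/>
    </item>
  </xsl:template>
</xsl:stylesheet>
```

Digits later in the string are untouched, because translate() is applied to the first character alone.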

xslt need to select a single quote

I need to do this:
<xsl:with-param name="callme" select="'validateInput(this,'an');'"/>
I've read "Escape single quote in xslt concat function" and it tells me to replace ' with &apos;. I've done that, yet it's still not working.
Does anyone know how to fix this?
<xsl:with-param name="callme" select="'validateInput(this,&apos;an&apos;);'"/>
Something simple that can be used in any version of XSLT:
<xsl:variable name="vApos">'</xsl:variable>
I am frequently using the same technique for specifying a quote:
<xsl:variable name="vQ">"</xsl:variable>
Then you can intersperse any of these variables into any text using the standard XPath function concat():
concat('This movie is named ', $vQ, 'Father', $vApos, 's secrets', $vQ)
So, this transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:variable name="vApos">'</xsl:variable>
<xsl:variable name="vQ">"</xsl:variable>
<xsl:template match="/">
<xsl:value-of select=
"concat('This movie is named ', $vQ, 'Father', $vApos, 's secrets', $vQ)
"/>
</xsl:template>
</xsl:stylesheet>
produces:
This movie is named "Father's secrets"
In XSLT 2.0 you can escape the character used as a string delimiter by doubling it, so
<xsl:with-param name="callme" select="'validateInput(this,''an'');'"/>
Another solution is to use variables:
<xsl:variable name="apos">'</xsl:variable>
<xsl:variable name="quot">"</xsl:variable>
<xsl:with-param name="callme" select="concat('validateInput(this,', $apos, 'an', $apos, ');')"/>
This is a little tricky, but you need to invert the apostrophe and quotes, like this:
<xsl:with-param name="callme" select='"validateInput(this,&apos;an&apos;);"' />
You're enclosing a string within one set of quotes, and the attribute value that contains it in another. In XSLT, which quotes you use are interchangeable in both cases, as long as you don't use the same one.
Previously, your &apos; was resolved by the XML parser when the value of the select attribute was read, so the select value became 'validateInput(this,'an');'. Although this is technically a valid attribute value, the XSLT processor fails to parse it as an XPath expression, because it tries to read it as a string literal that is terminated prematurely before the an, since the same apostrophe is used there as was used to enclose the string.
Use &quot; rather than &apos; (by using &apos; you are effectively nesting single quotation marks inside single quotation marks; you need to alternate single and double quotation marks as you nest, escaping as necessary).