HTML to Formatting Objects conversion - html

This is a sample of my XML output (which I cannot change because it is produced by CRM software over which I have no control):
<description><div><strong>Why</strong> is this not bold?</div> </description>
The <description> tags are proper XML tags, while <div> and <strong> are output as plain text in the PDF (see pic).
[Image: PDF output]
I am trying to have them rendered properly and found this article online https://www.ibm.com/developerworks/library/x-xslfo2app/index.html but unfortunately the downloads are not available anymore. Can anyone explain to me how to apply the templates provided in the article to my XSLT file?
The XSLT for the snippet provided is currently this:
<fo:table margin-bottom="10mm">
<fo:table-column column-width="100%"/>
<fo:table-body>
<fo:table-row>
<fo:table-cell xsl:use-attribute-sets="testBorderLeft">
<fo:block font-weight="bold">Detailed Description</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell xsl:use-attribute-sets="testBorderLeft">
<fo:block>
<xsl:value-of select="/hash/description"/>
</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-body>
</fo:table>
How can I incorporate, for example, this?
<xsl:template match="strong">
<fo:inline font-weight="bold">
<xsl:apply-templates select="*|text()"/>
</fo:inline>
</xsl:template>
Coding is really not my job, but I have to do this for work, so if you could be patient and very very clear it would be great. Thanks!
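A minimal sketch of one way to wire such a template in, assuming the <div> and <strong> tags really are child elements of <description>. If the CRM delivers them as escaped text (for example &lt;strong&gt;), templates cannot match them and this will not help. The idea is to replace xsl:value-of, which returns only the text, with xsl:apply-templates, and to place the matching templates at the top level of the stylesheet. The div template and the page master below are illustrative assumptions, not taken from the article, and the table is left out to keep the sketch short:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <!-- Hypothetical A4 page master, only here to make the sketch complete -->
        <fo:simple-page-master master-name="page"
            page-width="210mm" page-height="297mm" margin="20mm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-weight="bold">Detailed Description</fo:block>
          <fo:block>
            <!-- apply-templates instead of value-of: the children of
                 description are handed to the templates below -->
            <xsl:apply-templates select="/hash/description/node()"/>
          </fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Templates are direct children of xsl:stylesheet, never nested in another template -->
  <xsl:template match="div">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="strong">
    <fo:inline font-weight="bold">
      <xsl:apply-templates/>
    </fo:inline>
  </xsl:template>

</xsl:stylesheet>
```

In the existing stylesheet, the only change inside the table cell would be swapping the xsl:value-of line for the xsl:apply-templates line. The two small templates go next to the other top-level templates.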

Related

Is there something invalidating my code from being accepted as an XML file?

I'm getting an error saying that this body of code isn't valid as part of an XSL file. That's the only feedback that this document system gives me. This is part of a larger document, but I thought I would try isolating the issue to see if it's in the first half of the document:
<xsl:template match="ws:Worker_Sync">
<File etv:separator="&#xa;">
<Header etv:separator=";">
<HeaderItem1>EEID</HeaderItem1>
<HeaderItem2>FirstNAME</HeaderItem2>
<HeaderItem3>LastNAME</HeaderItem3>
<HeaderItem4>INTNAME</HeaderItem4>
<HeaderItem5>OHDATE</HeaderItem5>
<HeaderItem6>SALARY</HeaderItem6>
</Header>
<xsl:apply-templates select="ws:Worker"/>
<Footer>
<WorkerCount etv:number="totalCount">
</WorkerCount>
<DateTimeviaXpath>
<etv:class etv:name="FormatDates" etv:dateFormat="yyyy-MM-dd"/>
<etv:class etv:name="FormatTime" etv:timeFormat="HH-mm-ss"
<xsl:value-of select="current-dateTime"/>
</DateTimeviaXpath>
</Footer>
</File>
</xsl:template>
This etv:class element
<etv:class etv:name="FormatTime" etv:timeFormat="HH-mm-ss"
needs to be
<etv:class etv:name="FormatTime" etv:timeFormat="HH-mm-ss"/>

XSLT Transformation of Hyperlinks

I have an XSLT web page that transforms a table I have extracted from MS Access. In the xml document I have some hyperlinks such as C:\My Work\My HTML test.htm and as far as I know the white space is preserved. My problem is that when I transform this to a Hyperlink the link changes to file:///C:\My%20Work\My%20HTML%20test.htm which does not work. I have other links that are formed in the normal way (without spaces) that work fine so I can pinpoint the issue to the addition of the %20.
I have the command:
<xsl:preserve-space elements="clmAttach1Link clmAttach2Link"/>
in the XSL document. The code to transform the XML is:
<a>
<xsl:attribute name="href">
<xsl:value-of select="clmAttach1Link"/>
</xsl:attribute>
<xsl:value-of select="clmAttach2Name"/>
</a>
This code displays all of the information correctly; it just does not link to the local files.
Can anyone help me transform the hyperlinks to retain the spaces so I can link to local files?
Thanks
What is the correct link supposed to be? Should it contain spaces? If not, you could try to use:
<xsl:strip-space>
and have a path without spaces.
Spaces in URLs are not considered safe, but you can try using <xsl:text>:
<xsl:text disable-output-escaping="yes">&lt;a href="</xsl:text>
<xsl:value-of disable-output-escaping="yes" select="clmAttach1Link"/>
<xsl:text disable-output-escaping="yes">"&gt;</xsl:text>
<xsl:value-of disable-output-escaping="yes" select="clmAttach2Name"/>
<xsl:text disable-output-escaping="yes">&lt;/a&gt;</xsl:text>
or only value-of and concat:
<xsl:value-of disable-output-escaping="yes" select="concat(
'&lt;a href=&quot;', clmAttach1Link, '&quot;&gt;', clmAttach2Name ,'&lt;/a&gt;')"/>

Import HTML Drupal module with custom XSL template

I have a requirement to import an HTML website into Drupal, and I have decided to use the Import HTML module to do it.
I have to be able to grab just the text from html page (inside tag) without the html tags.
For this, I'm trying to create a custom xsl template based on the default template: html2simplehtml.xsl.
Currently my import is working fine with html2simplehtml.xsl template.
Here is an example of the resulting node body from the import:
<div class="container-narrow">
<div class="masthead">
<ul class="nav nav-pills pull-right">
<li class="active">
Home
</li>
<li>
Applications
</li>
<li>
Middleware
</li>
Now, the requirement is to get only:
Home
Applications
Middleware
I have found this to remove html tags:
<!-- This will remove the tag -->
<xsl:template name="remove-html">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="contains($text, '&lt;')">
<xsl:value-of select="substring-before($text, '&lt;')"/>
<xsl:call-template name="remove-html">
<xsl:with-param name="text" select="substring-after($text, '&gt;')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
but I am not sure where to put it and how to call it using this:
<!-- Calling the template that removes tag -->
<xsl:call-template name="remove-html">
<xsl:with-param name="text" select="{HtmlBody}"/>
</xsl:call-template>
How can I do this?
I'm not very familiar with the way Drupal calls your XSLT, but let's assume it's a simple XSLT 1.0 processor using some HTML page as input and generating the output that you showed above. Let's further assume that the original HTML is well formed with all required closing tags, so that it is in fact XHTML which the XSLT processor can parse. (This is not true for the HTML you included in your question, by the way.)
So what you want to do is basically prevent all tags in the XML/XHTML input from showing up in the output. I think the easiest way to achieve this is to use the <xsl:value-of select="..."/> instruction. Assuming that you currently copy all the child elements of the <body></body> section of your XHTML like this:
<xsl:template match="body">
<xsl:copy-of select="*"/>
</xsl:template>
instead you could do this:
<xsl:template match="body">
<xsl:value-of select="."/>
</xsl:template>
<xsl:value-of> converts the XML subtree into a string, which is done (simply put) by concatenating all the text nodes it contains. This does not, however, take care of white space yet. If you want to eliminate the unwanted white space, you could wrap the expression like this:
<xsl:template match="body">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
Now for the template you originally wanted to use: it does in fact remove tags from its input, too. But if I read the code correctly, the input must already be a STRING, not an XML node set. So it works in other contexts where you have a literal XML representation in a string. If you tried to use it here, you would first have to convert your XML into a string explicitly, e.g. with <xsl:value-of>. At that point the string would already be stripped of its tags (as described above), so the template would effectively do nothing but return the same string that it was passed as a parameter. So IMHO, you will not need this template at all.
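Note that normalize-space(.) on the whole body collapses everything into a single line. If each menu entry should end up on its own line, one possible variation is to apply the same value-of idea per <li>. This is only a sketch, assuming the input is well-formed and its elements are in no namespace (real XHTML in the http://www.w3.org/1999/xhtml namespace would need a prefix in the XPath):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain text output: no tags can leak into the result -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Assumption: the wanted text lives in li elements, as in the sample -->
    <xsl:for-each select="//li">
      <!-- String value of the item, with surrounding white space trimmed -->
      <xsl:value-of select="normalize-space(.)"/>
      <!-- One line break after each item -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Whether the Import HTML module accepts text output here is something I have not verified; if it expects markup, the same loop could wrap each value in an element instead.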

Allow html styling of text from XML element.

I have been using a bit of XSLT to style my XML into something readable. However, there is one thing I have not been able to figure out.
I was wondering how you can apply styling to the text inside the XML elements. For instance, this is what part of my XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="mystylesheet.xsl" type="text/xsl"?>
<Collection>
<Tals>
<Talent Indent="0">Weapon Training</Talent>
<Cost>1</Cost>
<Description>Confers <b>proficiency</b> of <i>two weapons</i>, either melee or ranged. This talent make be aquired multiple times</Description>
</Tals>
I would like to know how I could get my Description element to output in an HTML format, so you can see the bold text and italic text.
This is how I am catching my Description element from my XML in mystylesheet.xsl:
<b>Description: </b><xsl:value-of select="Description"/>
Any help would be greatly appreciated.
If my understanding is right, you would like to copy the content of Description.
This can easily be done by changing the <xsl:value-of select="Description"/> to
<xsl:apply-templates select="Description/node()"/>
To make it work, you also have to add an "identity transform template":
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
Update:
Alternatively, you can also use
<xsl:copy-of select="Description/node()"/>
But the "identity transform template" is the better solution, because it is possible to add still more specialized templates.
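Putting the pieces together, a minimal sketch of a complete stylesheet could look like the following. The surrounding <html>/<body>/<p> markup and the Tals template are assumptions made for illustration, since the original mystylesheet.xsl is not shown:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="Collection/Tals"/>
      </body>
    </html>
  </xsl:template>

  <!-- Hypothetical layout for one entry -->
  <xsl:template match="Tals">
    <p>
      <b>Description: </b>
      <!-- Process the children of Description instead of taking its string value -->
      <xsl:apply-templates select="Description/node()"/>
    </p>
  </xsl:template>

  <!-- Identity transform: copies <b>, <i> and text through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The more specific templates (match="/" and match="Tals") take priority over the identity template, so only the content reached through Description/node() is copied verbatim.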

How to let xsl:fo allow table cell to take empty blocks and print multiple lines

I have one row in which one of the columns holds a list of data. Say the 3rd column has 0 or more rows, produced with <xsl:for-each select="./parts">. For some reason the code I have doesn't seem to work, and I am not sure how to implement it. I am getting this error:
org.apache.fop.events.LoggingEventListener processEvent The following
feature isn't implemented by Apache FOP, yet: table-layout="auto" (on
fo:table) (No context info available) [4/1/13 19:14:38:002 CDT]
00000053 SystemErr R org.apache.fop.fo.ValidationException:
"fo:table-cell" is missing child elements. Required content model:
marker* (%block;)+ (No context info available)
I have this code and this doesn't work.
<xsl:for-each select="./List">
<fo:table-row>
<fo:table-cell border="solid 1px" text-align="center">
<fo:block font-size="8pt"><xsl:value-of select="group" /></fo:block>
</fo:table-cell>
<fo:table-cell border="solid 1px" text-align="left">
<xsl:for-each select="./parts">
<fo:block font-size="8pt"><xsl:value-of select="partNumber" /><fo:leader />
</fo:block>
</xsl:for-each>
</fo:table-cell>
</fo:table-row>
</xsl:for-each>
Try removing strict validations:
fopFactory.setStrictValidation(false);
OK, so the problem you're getting is coming from this block.
<fo:table-cell border="solid 1px" text-align="left">
<xsl:for-each select="./parts">
<fo:block font-size="8pt"><xsl:value-of select="partNumber" /><fo:leader />
</fo:block>
</xsl:for-each>
</fo:table-cell>
As I believe others have already pointed out, if you have 0 parts elements, then your table cell has no block child. The way I see it, there are two easy fixes. First, try wrapping your for-each statement in another block element like so.
<fo:table-cell border="solid 1px" text-align="left">
<fo:block>
<xsl:for-each select="./parts">
<fo:block font-size="8pt"><xsl:value-of select="partNumber" /><fo:leader />
</fo:block>
</xsl:for-each>
</fo:block>
</fo:table-cell>
If you find it has unwanted effects on your formatting, you can play around with the padding and other properties so that the added block still preserves your alignment. This will fix your issue for sure. A slightly more complex alternative would be to use an xsl:choose statement that tests to see if there is at least one part before trying to iterate over them, otherwise it inserts an empty block.
<fo:table-cell border="solid 1px" text-align="left">
<xsl:choose>
<xsl:when test="count(./parts) > 0">
<xsl:for-each select="./parts">
<fo:block font-size="8pt"><xsl:value-of select="partNumber" /><fo:leader />
</fo:block>
</xsl:for-each>
</xsl:when>
<xsl:otherwise>
<fo:block>&#160;</fo:block>
</xsl:otherwise>
</xsl:choose>
</fo:table-cell>
Although this is longer, it is also more extensible, for example, if at a future point you wanted it to display the list of parts if there is data, and if not display another value (which also may or may not be present), you can simply add another when block to mitigate that change in the logic.
One last note: the block that I put in the otherwise statement contains &#160;, which is just an encoding for a single (non-breaking) white space. If you want your 'empty' block to still reserve a line of space (i.e. stop it from collapsing when there is no text content), you can use the white space to keep the block from collapsing. Otherwise, if you don't care whether it collapses, just remove the white space.
It seems sometimes your code:
<xsl:for-each select="./parts">
<fo:block font-size="8pt"><xsl:value-of select="partNumber" /><fo:leader />
</fo:block>
</xsl:for-each>
does not output anything. You need to put the result into a variable and check it. If there is no value, output an empty block to avoid this error.
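A rough sketch of that variable-based check, written as a template for one row. It assumes XSLT 1.0 and the element names from the question (List, group, parts, partNumber). The variable name partCount is made up for illustration, and the template is meant to be applied from inside an existing fo:table-body:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Emits one table row per List element -->
  <xsl:template match="List">
    <!-- Count the parts once, then decide what the cell must contain -->
    <xsl:variable name="partCount" select="count(parts)"/>
    <fo:table-row>
      <fo:table-cell border="solid 1px" text-align="center">
        <fo:block font-size="8pt"><xsl:value-of select="group"/></fo:block>
      </fo:table-cell>
      <fo:table-cell border="solid 1px" text-align="left">
        <!-- No parts: emit an empty block so the cell still has a block child -->
        <xsl:if test="$partCount = 0">
          <fo:block font-size="8pt"/>
        </xsl:if>
        <!-- One block per part; produces nothing when there are no parts -->
        <xsl:for-each select="parts">
          <fo:block font-size="8pt"><xsl:value-of select="partNumber"/></fo:block>
        </xsl:for-each>
      </fo:table-cell>
    </fo:table-row>
  </xsl:template>

</xsl:stylesheet>
```

In the original stylesheet this would replace the xsl:for-each over ./List with <xsl:apply-templates select="List"/>.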
If you look in the W3C XSL-FO spec for fo:table-cell you will see
Contents:
(%block;)+
The + means "one or more", i.e. at least one child is obligatory,
and the %block; entity is defined as follows by the W3C:
The parameter entity, "%block;" in the content models below, contains the following formatting objects:
block
block-container
table-and-caption
table
list-block
So Navin Rawat is right, you need to ensure that something is inside your cell.
/ Colm