Special Characters in HTML Element

What I'm trying to do is output a percent sign (%) directly into a <td> tag. Below is my code:
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td class="item_container" %%=v(@Item_Container_Style)=%%>
...
When I test the XSL I get the following error:
SAXParseException: Expected an attribute name (Set_A_Custom.xsl, line 205, column 38)
So basically it's seeing "%%=v(@Item_Container_Style)=%%" as invalid markup, but I need this code to be there.
If you are wondering why I am doing this, it is because I am writing the XSL to output HTML that contains AMPscript (a proprietary ExactTarget scripting language). You don't need to know anything about AMPscript to help me out, though; I just need to output the percent sign (%) in the HTML and everything will work.
Any ideas? For the record, I'm using XSLT 1.0. Thanks all!

An XSLT stylesheet must itself be well-formed XML, so you can't include this kind of construct directly in the stylesheet. If the XSLT processor you're using supports disable-output-escaping then you would be able to do something like
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<xsl:text disable-output-escaping="yes"><![CDATA[<td class="item_container" %%=v(@Item_Container_Style)=%%>]]></xsl:text>
...
<xsl:text disable-output-escaping="yes"><![CDATA[</td>]]></xsl:text>
</tr>
</table>
If it does not allow disable-output-escaping then your only option is to use the text output method, and write all the tags you want to output as text with the angle brackets escaped (or in CDATA).

What I'm trying to do is output a percent sign (%) directly into a <td> tag.
Not possible with the "html" or "xml" output modes. XSLT has been designed to create syntactically sane HTML, you cannot make it do anything else.
Of course you could switch to the "text" output mode and do whatever you like, but generating HTML this way is a lot harder.
Alternatively you can use disable-output-escaping, if your XSLT processor supports it, but this will quickly turn your XSLT stylesheet into a mess if you need to do it in many places.
That being said, here's a proposal. In XSLT you use the "html" output mode and this:
<td
class="item_container"
amp-1="%%=v({@Item_Container_Style})%%"
amp-2="%%=v({@Some_Other_Element})%%"
>
some text %%=v(<xsl:value-of select="Other_Stuff" />)%% more text
</td>
That is syntactically valid XSLT which covers both cases (multiple placeholders in attributes, multiple placeholders in the text) and creates syntactically valid HTML:
<td
class="item_container"
amp-1="%%=v(item container style content)%%"
amp-2="%%=v(some other element content)%%"
>
some text %%=v(other stuff)%% more text
</td>
and then you use a post-processing step to convert that HTML into AMPscript:
Regex-replace \bamp-\d+="(%%[\s\S]*?%%)" with $1, which would result in
<td
class="item_container"
%%=v(item container style content)%%
%%=v(some other element content)%%
>
some text %%=v(other stuff)%% more text
</td>
Handling HTML with regular expressions is generally strongly discouraged, but this might just be a narrow enough use case.
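The post-processing step could be sketched in Python as follows, assuming the XSLT output is already available as a string. The pattern is the one given above; the function name strip_amp_placeholders and the sample markup are made up for illustration. Any characters the HTML serializer escaped inside the attribute values would most likely still be escaped after the replacement.

```python
import re

# Pattern from the answer: an amp-N="..." attribute whose value is a %%...%% block.
AMP_ATTR = re.compile(r'\bamp-\d+="(%%[\s\S]*?%%)"')


def strip_amp_placeholders(html: str) -> str:
    """Replace each amp-N="%%...%%" attribute with the bare %%...%% block."""
    return AMP_ATTR.sub(r"\1", html)


# Hypothetical serializer output, shortened to a single start tag.
sample = (
    '<td class="item_container" '
    'amp-1="%%=v(item container style content)%%" '
    'amp-2="%%=v(some other element content)%%">'
)
result = strip_amp_placeholders(sample)
print(result)
```

The lazy quantifier keeps each match inside a single attribute, so two placeholders on the same tag are handled separately.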

AMPscript appears to have a standards-based syntax as an alternative to its proprietary syntax:
Delimiter Comparison
The table below demonstrates the similarities between standard AMPscript delimiters and server-side delimiters.
Standard AMPscript Delimiter | Tag-based AMPscript Delimiter
%%[                          | <script runat=server language=ampscript>
etc.
Does this help you?

Related

How to select elements containing special characters in XPath?

I am trying to exclude three <td> elements from a result set:
<td>
🥇
</td>
<td>
🥈
</td>
<td>
🥉
</td>
I've tried using:
td[not(contains(., '🥈'))]
For example, but the element I don't want still comes back...

In the XPath expression, you need to use the escape conventions of the host language. Using &-escaping is fine if the host is XSLT, but if it's JavaScript, for example, you'll need to use backslash escaping.

To avoid the labyrinth of escaping conventions, just use literal Unicode characters themselves, which can be searched and then copy-and-pasted from sites such as Compart:
Char Entity Ref | Literal Unicode | XPath
&#129351;       | 🥇              | //td[not(contains(.,'🥇'))]
&#129352;       | 🥈              | //td[not(contains(.,'🥈'))]
&#129353;       | 🥉              | //td[not(contains(.,'🥉'))]
Here's a single XPath 2.0+ expression that will select all td elements in the document except those consisting of only the targeted special characters:
//td[not(normalize-space() = ('🥇', '🥈','🥉'))]
In XPath 1.0, you'd have to write out the clauses separately:
//td[not(normalize-space() = '🥇') and
not(normalize-space() = '🥈') and
not(normalize-space() = '🥉')]
Rearrange using De Morgan's laws to taste. Go back to contains() if you truly want to test for substring containment rather than string-value equality.
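To illustrate how the host language decides the escaping, here is a small Python sketch using only the standard library. ElementTree's XPath subset has no contains() or normalize-space(), so the filter is written in plain Python; the sample row and the helper name normalize_space are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# A numeric character reference is resolved by the XML parser, so the
# parsed text is the literal character, not the escape sequence.
parsed = ET.fromstring("<td>&#129351;</td>").text

# Hypothetical row: three medal cells (with stray whitespace) and two others.
doc = ET.fromstring(
    "<tr><td> 🥇 </td><td>\n🥈\n</td><td>🥉</td><td>Alice</td><td>4th</td></tr>"
)

MEDALS = {"🥇", "🥈", "🥉"}


def normalize_space(text):
    """Rough equivalent of XPath normalize-space() for a text value."""
    return " ".join((text or "").split())


# Same idea as //td[not(normalize-space() = ('🥇', '🥈', '🥉'))]
kept = [td.text for td in doc.iter("td") if normalize_space(td.text) not in MEDALS]
print(kept)
```

In Python source the characters can be written literally or as \U0001F947-style escapes; either way the comparison runs against the decoded character.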

Notepad++, find and replace, convert html to xml tags

This is a huge record of mixed HTML and XML tags which I want to clean.
I want to replace all the HTML tags with XML ones. I tried the following, but it didn't work:
Find:
<tr>
<td class="fid">FID</td>
<td class="fidvalue">(.*)</td>
</tr>
Replace:
<fid>\1<fid>
This should replace all similar blocks (there are more than 300 occurrences) while keeping the contents of each value cell.
What's the appropriate regex to use?

I might be missing something in your question, as Notepad++ (as of v6.9.2) does not have multi-line inputs in its find-and-replace dialog. However, I was able to achieve what you seem to want by specifying the line breaks manually (assuming that your file uses carriage return plus line feed) and using the Regular expression search mode:
Find: <tr>\r\n<td class="fid">FID</td>\r\n<td class="fidvalue">(.*)</td>\r\n</tr>
Replace: <fid>\1</fid>
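For comparison, the same find-and-replace can be sketched in Python. This assumes CRLF line endings, as in the answer above, and that each value sits on a single line; the sample record is invented for illustration.

```python
import re

# Same pattern as the Notepad++ search, with the line breaks written as \r\n.
PATTERN = re.compile(
    r'<tr>\r\n<td class="fid">FID</td>\r\n<td class="fidvalue">(.*)</td>\r\n</tr>'
)

# Hypothetical record: one HTML block followed by an element that must stay untouched.
record = (
    "<tr>\r\n"
    '<td class="fid">FID</td>\r\n'
    '<td class="fidvalue">12345</td>\r\n'
    "</tr>\r\n"
    "<name>kept as is</name>\r\n"
)

converted = PATTERN.sub(r"<fid>\1</fid>", record)
print(converted)
```

Because . does not match a line feed by default, (.*) cannot run past the end of the value line, so each block is replaced on its own.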

Using XPath to select table that includes specific class

I have an HTML table that I need to select using XPath. The table may or may not contain multiple classes, but I only want tables that include a specific class.
Here is a sample HTML snippet:
<html>
<body>
<table class="no-border">
<tr>
<th colspan="2">Blah Blah Blah</th>
</tr>
<tr>
<td>Content</td>
<td>
<table class="info no-border">
<tr>
<!-- Inner table content -->
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>
I need to use XPath to retrieve ONLY the table that includes the class info. I've tried using /html/body/table/tr/td/table[@class='info*'], but that doesn't work. The table I'm trying to retrieve may exist ANYWHERE in the HTML document - technically, not ANYWHERE, but there may be varying levels of hierarchy between the outer and inner table.
If anyone can point me in the right direction, I'd be grateful.

The closest you can do is with the contains function:
//table[contains(@class,'info')]
But please be aware that this would capture a table with the class information, or anything else that has the info substring. As far as I know XPath can't distinguish whole-word matches. So you'd have to filter results to check for this possible condition.

What you'd ideally need is a CSS selector like table.info. Some XPath engines and toolkits for XML/HTML parsing do support these selectors, which are translated to XPath expressions internally, e.g. cssselect if you use Python (it works with lxml), or Nokogiri for Ruby.
In the general case, to emulate a CSS selector like table.info with XPath, a common trick or pattern is to use contains() combined with concat() and space characters. In your case, it looks like this:
.//table[contains(concat(' ', normalize-space(@class), ' '), ' info ')]
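The difference between a plain substring test and the space-padded test can be sketched in Python with the standard library. ElementTree's XPath subset lacks contains(), so the check is a small helper instead; has_class and the sample markup (including the decoy "information" table) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the wanted table plus a decoy whose class merely contains "info".
html = """
<body>
  <table class="no-border">
    <tr><td>
      <table class="info no-border"><tr><td>inner</td></tr></table>
      <table class="information"><tr><td>decoy</td></tr></table>
    </td></tr>
  </table>
</body>
""".strip()


def has_class(element, name):
    """Whole-token test: pad the normalized class list and the name with spaces."""
    padded = " " + " ".join(element.get("class", "").split()) + " "
    return f" {name} " in padded


root = ET.fromstring(html)
tables = list(root.iter("table"))  # document order, at any depth

substring_hits = [t.get("class") for t in tables if "info" in t.get("class", "")]
token_hits = [t.get("class") for t in tables if has_class(t, "info")]
print(substring_hits)
print(token_hits)
```

Padding both the attribute value and the searched name with spaces is what turns substring containment into a whole-token match.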

I know that you did not ask for this answer, but I think it will help you to make your queries more precise.
//table[ (contains(@class,"result-cont") or contains(@class,"resultCont")) and not(contains(@class,"hide")) ]
This will get classes that contain 'result-cont' or 'resultCont', and do not have the 'hide' class.

XPath 1.0 is, indeed, fairly limited in its string processing. You can do modest amounts of processing with starts-with(), substring() and similar functions. See this answer for creating something similar to a regex.
XSLT 2.0 (which not all browsers and software support) has support for regex.

Parsing through text with incorrect attribute definition

While trying to parse an HTML document as XML (with an XML declaration added at the beginning) I've run into a problem with an attribute inside a tag.
<tr>
<td class="yfnc_tabledata1" nowrap align="right">Jun 4, 2013</td>
<td class="yfnc_tabledata1" align="right">453.22</td>
<td class="yfnc_tabledata1" align="right">454.43</td>
<td class="yfnc_tabledata1" align="right">447.39</td>
<td class="yfnc_tabledata1" align="right">449.31</td>
<td class="yfnc_tabledata1" align="right">10,454,600</td>
<td class="yfnc_tabledata1" align="right">449.31</td>
</tr>
While normally it wouldn't matter (since my XSLT code doesn't actually reference it), I am getting an error:
ERROR: 'Attribute name "nowrap" associated with an element type "td" must be followed by the ' = ' character.'
ERROR: 'com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Attribute name "nowrap" associated with an element type "td" must be followed by the ' = ' character.'
So I was wondering if there's a way to make it suppress or ignore those errors. (I'm looking for a way of doing it that doesn't involve a separate pass over the input to remove all nowrap attributes first.)
(For reference, xml : http://pastebin.com/TLD4bZkq , xslt : http://pastebin.com/dPzDzeAX )

The data you're trying to process isn't XML, so the XML parser is right to produce an error.
Depending on what XSLT processor you're using and how you call it you might be able to use an HTML parser instead of an XML parser to parse your HTML into a DOM tree which you then pass to the XSLT processor, rather than having the processor parse the file itself.
But remember that XSLT expects namespace-well-formed XML and if the parser's output doesn't conform to this then you will have problems. For example, in Java (which is what I'm most familiar with), for a DOM Document to be usable by XSLT it must have been produced by a namespace-aware parser even if the document in question doesn't actually use any namespaces.
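The difference between the two kinds of parser can be sketched in Python with the standard library (the answer above is written with Java in mind, so this is only an illustration of the idea). The class name CellCollector is hypothetical, and the sketch only collects attributes and cell text; it does not build a DOM tree for an XSLT processor.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

row = '<tr><td class="yfnc_tabledata1" nowrap align="right">Jun 4, 2013</td></tr>'

# An XML parser rejects the valueless nowrap attribute as not well-formed.
try:
    ET.fromstring(row)
    xml_ok = True
except ET.ParseError:
    xml_ok = False


class CellCollector(HTMLParser):
    """Collects the attributes of every <td> start tag and the text inside each cell."""

    def __init__(self):
        super().__init__()
        self.attrs = []
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.attrs.append(dict(attrs))  # valueless attributes map to None

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells.append(data)


collector = CellCollector()
collector.feed(row)
collector.close()
print(xml_ok, collector.attrs, collector.cells)
```

An HTML parser treats nowrap as a boolean attribute, which is why the same input that fails XML parsing goes through here.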

use a xslt variable to define table padding

I would like to declare a variable in the XSLT file that styles my XML. I intend to use the variable to pad the table cells; how far the padding extends from the left of the table will depend on what data is being read.
So I was wondering if it's possible to use this padding1 variable on a table cell. This is how I am currently attempting it, without much success: when I load the XML I now get a blank screen.
Anyway this is my code
<xsl:variable name="padding1" select="15"/>
<td style="padding-left: padding1;" colspan="2" bgcolor="#C0C0C0">
In the above code I am declaring a variable named padding1 and giving it the value 15. I would then like to use it for padding-left, much like you would with a value such as 15px.

You can use an AVT to reference your variable:
<td style="padding-left: {$padding1}px;" colspan="2" bgcolor="#C0C0C0">...</td>

You should probably include the unit in your variable. You can even do math operations with the unit intact:
<xsl:variable name="content-width">180mm</xsl:variable>
...
<fo:table-column column-width="{$content-width}*0.5"/>
