Go html/template & escaping - html

Based on the example in the html/template documentation, I don't fully understand why the less-than and greater-than signs appear to be escaped inconsistently in my experiment:
https://golang.org/pkg/html/template/#hdr-Introduction
Does this warrant a bug report? I am holding off since I am relatively new to Go.
$ go version
go version go1.16 linux/amd64
I saw similar behavior with go1.15.8.
package main

import (
	htmltemplate "html/template"
	"os"
	texttemplate "text/template"
)

type MyVars struct {
	Flavor string
}

func main() {
	Vars := MyVars{
		Flavor: "##### html #####",
	}
	htmlTmpl, _ := htmltemplate.ParseFiles("index.html")
	htmlTmpl.Execute(os.Stdout, Vars)
	Vars = MyVars{
		Flavor: "##### text #####",
	}
	textTmpl, _ := texttemplate.ParseFiles("index.html")
	textTmpl.Execute(os.Stdout, Vars)
}
$ cat index.html
{{ .Flavor }}
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
< span >Hello< /span >
<span>Hello</span>
{{ "<" }}span{{ ">" }}Hello{{ "<" }}/span{{ ">" }}
$ ./experiment
##### html #####
&lt;?xml version="1.0" encoding="UTF-8" standalone="no"?> # Why is only < escaped?
&lt; span >Hello&lt; /span > # Why is only < escaped?
<span>Hello</span> # Why is neither < nor > escaped?
&lt;span&gt;Hello&lt;/span&gt;
##### text #####
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
< span >Hello< /span >
<span>Hello</span>
<span>Hello</span>

1: {{ .Flavor }}
2: <?xml version="1.0" encoding="UTF-8" standalone="no"?>
3: < span >Hello< /span >
4: <span>Hello</span>
5: {{ "<" }}span{{ ">" }}Hello{{ "<" }}/span{{ ">" }}
The < on lines 2 and 3 is text. The HTML template package escapes < in text to prevent a document reader from misinterpreting the < as the start of a tag.
The > on lines 2 and 3 is written as is to the output. There is no security benefit to escaping the >.
The < and > on line 4 are part of a tag. Tags are not escaped.
The < and > on line 5 are the value of an expression. The HTML template package fully escapes expression results.

Related

Unable to use <link> tag in Razor View

If I use the following code the view renders fine.
But if I change the <url> tag to <link>, as the RSS spec requires, the view will not render and throws an error saying that the tag is invalid, so the error is occurring at the link tag. No matter what I try, the link tag inside the Razor foreach will not compile correctly.
@inherits Umbraco.Web.Mvc.UmbracoTemplatePage<ContentModels.RSsfeed>
@using ContentModels = Umbraco.Web.PublishedContentModels;
@{
Layout = null;
Response.ContentType = "text/xml";
var rootNode = Umbraco.TypedContentAtRoot().First();
var newsNodes = umbraco.uQuery.GetNodesByType("newsDetail");
}<?xml version="1.0"?>
<!-- News Aritcles -->
<rss version="2.0" xmlns:newsArticles="https://xxx.xxxxxxx.xxx/news">
<channel>
<title>News Aritcles</title>
<link>https://xxx.xxxxxxx.xxx/news</link>
<description>News Aritcles</description>
<language>en-us</language>
<ttl>1440</ttl>
@foreach(var newsNode in newsNodes){
var newsContent = UmbracoContext.Current.ContentCache.GetById(newsNode.Id);
string nnDescription = newsContent.GetPropertyValue("description").ToString();
string nnPublishDate = newsContent.GetPropertyValue("publishDate").ToString();
<item>
<title>@newsNode.Name</title>
<url>https://xxx.xxxxxxx.xxx@{@newsNode.Url}</url>
<description>@nnDescription</description>
<pubDate>@nnPublishDate</pubDate>
<guid>https://xxx.xxxxxxx.xxx@{@newsNode.Url}</guid>
</item>
}
</channel>
</rss>
<link/> is a void element, and so only has a start tag and no end tag - See W3C HTML Language Reference
You could output the tag like this
@("<link>" + newsNode.Url + "</link>")
Hope this helps

Jade: HTML-escape content (not buffered code)

This jade:
h1 a < b
Produces this HTML:
<h1>a < b</h1>
How can I get it to automatically escape the <? (i.e. not typing in &lt; myself)
<h1>a &lt; b</h1>
The line h1 a < b is a shortcut for h1!= "a < b", which outputs the text unescaped.
Use the escaping form instead:
h1= "a < b"

Extract data from html/xml

I'm using Webharvest to retrieve data from websites. It converts the HTML pages to XML documents and then extracts the data I want based on the XPath provided.
Now I'm working on a page like this: pastebin, where I have marked the blocks I'd like to get. Each block should be returned as a single unit.
The XPath of the first element of a block is: //div[@id="layer22"]/b/span[@style="background-color: #FFFF99"]
I tested it and it gives all "block start" elements.
The XPath of the last element of a block is: //div[@id="layer22"]/a[contains(.,"Join")]
I tested it and it gives all the "block end" elements.
The xPath should return a set of blocks as:
(xPath)[1] = block 1
(xPath)[2] = block 2
....
Thank you in advance
Use (for the first wanted result):
($first)[1] | ($last)[1]
|
($first)[1]/following::node()
[count(.|($last)[1]/preceding::node()) = count(($last)[1]/preceding::node())]
where you need to substitute $first with:
//div[@id="layer22"]/b/span[@style="background-color: #FFFF99"]
and substitute $last with:
//div[@id="layer22"]/a[contains(.,"Join")]
To get the k-th result, substitute in the final expression ($first)[1] with ($first)[{k}] and ($last)[1] with ($last)[{k}], where {k} should be replaced by the number k.
This technique follows directly from the well-known Kayessian formula for set intersection in XPath 1.0:
$ns1[count(.|$ns2) = count($ns2)]
which selects the intersection of the two node-sets $ns1 and $ns2 .
Here is XSLT verification with a simple example:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>03</num>
<num>07</num>
<num>10</num>
</nums>
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="v1" select=
"(//num[. = 3])[1]/following-sibling::*"/>
<xsl:variable name="v2" select=
"(//num[. = 7])[1]/preceding-sibling::*"/>
<xsl:template match="/">
<xsl:copy-of select=
"$v1[count(.|$v2) = count($v2)]"/>
</xsl:template>
</xsl:stylesheet>
applies the XPath expression and the selected nodes are copied to the output:
<num>04</num>
<num>05</num>
<num>06</num>

Ruby Builder - XML output is encoding HTML entities

I have a small ruby script which uses Builder.
require 'rubygems'
require 'builder'
content = <<eos
SOME TEXT, GOES TO UPPERCASE
other text
<em>italics<em>
eos
xml = Builder::XmlMarkup.new
xml.instruct! :xml, :version => '1.0'
xml.book :id => 1.0 do
xml.keyPic "keyPic1.jpg"
xml.parts do
xml.part :partId => "1", :name => "name" do
xml.chapter :title => "title", :subtitle => "subtitle" do
xml.text content
end
end
end
end
p xml
When running from the CLI (Cygwin), I get the following:
<?xml version="1.0" encoding="UTF-8"?>
<book id="1.0">
<keyPic>keyPic1.jpg</keyPic>
<parts>
<part partId="1" name="name">
<chapter title="title" subtitle="subtitle">
<text>
SOME TEXT, GOES TO UPPERCASE
other text
&lt;em&gt;italics&lt;em&gt;
</text>
</chapter>
</part>
</parts>
</book><inspect/>
However, the output I would like between the <text> tags is:
<text>
SOME TEXT, GOES TO UPPERCASE
other text
<em>italics<em/>
</text>
I have tried using the htmlentities gem 'decoding' the content but to no avail.
Use the << operation to insert your text without modification.
xml.text do |t|
t << content
end
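The same distinction between escaped text and raw insertion exists in other XML writers. As a comparison only, here is a minimal sketch using Go's encoding/xml, where a field tagged ",chardata" is escaped and a field tagged ",innerxml" is written verbatim. The type names escapedText and rawText and the helper marshal are hypothetical, and the sample string uses a properly closed em element.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// escapedText writes Body as character data, so markup is escaped.
type escapedText struct {
	XMLName xml.Name `xml:"text"`
	Body    string   `xml:",chardata"`
}

// rawText writes Body verbatim, so markup passes through unchanged.
type rawText struct {
	XMLName xml.Name `xml:"text"`
	Body    string   `xml:",innerxml"`
}

// marshal is a hypothetical helper that returns the XML as a string.
func marshal(v interface{}) string {
	out, err := xml.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	content := "<em>italics</em>"
	fmt.Println(marshal(escapedText{Body: content})) // <text>&lt;em&gt;italics&lt;/em&gt;</text>
	fmt.Println(marshal(rawText{Body: content}))     // <text><em>italics</em></text>
}
```

As with Builder's << operator, raw insertion skips escaping entirely, so the inserted string has to be well-formed XML on its own.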

How to let special XML chars unconverted in flex?

That's my xml node:
<node att="something &lt; something else"> </node>
When I write in my code:
trace(xml.node.@att.toString());
it prints out the string:
something < something else
My problem is that I need to print out the original string:
something $lt; something else //put $ instead of &
Does anyone know how to solve this?
Thanks in advance.
Hey !
Just tried this and it seems to work... : )
var xml:XML=<node att="something &lt; something else"> </node>;
trace(xml.toXMLString());//<node att="something &lt; something else"/>
trace(xml.@att.toXMLString());//something &lt; something else
The basic XML parser will unescape them.
If you want the escaped character back, you'll need to convert it again using ActionScript 3.
You can also escape it twice in the source, using &amp;lt;
Option 1 - USE <![CDATA[ ]]>
<chapter title="hello" nr="111" src="">
<page>
<text><![CDATA[To adjust <br/><br/>. . . ]]></text>
<text><![CDATA[To adjust <br/><br/>. . . ]]></text>
</page>
</chapter>
Option 2 - use &lt; and &gt; for < and >
Then <br/> = &lt;br/&gt;