AntiXss.HtmlEncode vs AntiXss.GetSafeHtmlFragment

Can anyone please let me know the difference between these two?
AntiXss.HtmlEncode() vs AntiXss.GetSafeHtmlFragment()

HtmlEncode actually encodes tags:
AntiXss.HtmlEncode("<b>hello</b><script>");
//Output: &lt;b&gt;hello&lt;/b&gt;&lt;script&gt;
GetSafeHtmlFragment (AntiXss v4.0) returns HTML fragments with tags intact:
Sanitizer.GetSafeHtmlFragment("<b>hello2</b><script>")
//Output: <b>hello2</b>
Update
Many consider the latest version of Microsoft's AntiXSS library broken. I've started using HTML Sanitizer as a decent replacement.

It should also be mentioned that AntiXss.GetSafeHtmlFragment does encode characters too. A double quote changes to &quot;, a plus sign turns into &#43;, etc.

I would also add that GetSafeHtmlFragment messes up your CSS by adding x_ in front of styles, and removes your HTML entity encoding. It is a less than beautiful thing.
Herc

Related

Why do some strings contain "&nbsp;" and some " ", when my input is the same (" ")?

My problem occurs when I try to use some data/strings in a p-element.
I start off with data like this:
data: function() {
  return {
    reportText: {
      text1: "This is some subject text",
      text2: "This is the conclusion",
    }
  }
}
I use this data as follows in my (vue-)html:
<p> {{ reportText.text1 }} </p>
<p> {{ reportText.text2 }} </p>
In my browser, when I inspect my elements I get to see the following results:
<p>This is some subject text</p>
<p>This is the conclusion</p>
As you can see, there is suddenly a difference: one p element uses &nbsp; and the other uses regular spaces, even though I started off with both strings only using regular spaces. I know &nbsp; and a regular space technically represent the same thing, but the problem with the &nbsp; string is that it gets treated as a string with 1 large word instead of multiple separate words. This screws up my layout and I can't solve this by using certain CSS properties (word-wrap etc.).
Other things I have tried:
Tried sanitizing the strings by using .replace('&nbsp;', ' '), but that doesn't do anything. I assume this is because it basically is the same, so there is nothing to really replace. Same reason why I have to use a code block on Stack Overflow to make the distinction between &nbsp; and a regular space.
Logged the data from Vue to see if there is any noticeable difference, but I can't see any. If I log the data/reportText I again only see strings with regular spaces.
So I have the following questions:
Why does this happen? I can't seem to find any logical explanation why it sometimes uses &nbsp; and sometimes uses regular spaces. It seems random, but I am sure I am missing something.
Any other things I could try to follow the path my string takes, so I can see where the transformation from a regular space to &nbsp; happens?

Per the comments, the solution devised ended up being a simple unicode character replacement targeting the \u00A0 unicode code point (i.e. replacing unicode non-breaking spaces with ordinary spaces):
str.replace(/\u00A0/g, ' ')
Explanation:
JavaScript typically allows the use of unicode characters in two ways: you can input the rendered character directly, or you can use a unicode code point (i.e. in the case of JavaScript, a hexadecimal code prefixed with \u like \u00A0). It has no concept of an HTML entity (i.e. a character sequence between a & and ; like &nbsp;).
The inspector tool for some browsers, however, utilizes the HTML concept of the HTML entity and will often display unicode characters using their corresponding HTML entities where applicable. If you check the same source code in Chrome's inspector vs. Firefox's inspector (as of writing this answer, anyway), you will see that Chrome uses HTML entities while Firefox uses the rendered character result. While it's a handy feature to be able to see non-printable unicode characters in the inspector, Chrome's use of HTML entities is only a convenience feature, not a reflection of the actual contents of your source code.
With that in mind, we can infer that your source code contains unicode characters in their fully rendered form. Regardless of the form of your unicode character, the fix is identical: you need to target these unicode space characters explicitly and replace them with ordinary spaces.
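To see which kind of space a string really contains, independent of what the inspector shows, one option is to look at the character codes directly. The following is only a minimal sketch; the helper names describeSpaces and normalizeSpaces are made up for illustration and are not part of Vue or any library.

```javascript
// Hypothetical helpers for illustration; not part of Vue or any library.

// List the position and code point of every ordinary or non-breaking space.
function describeSpaces(str) {
  const found = [];
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code === 0x20 || code === 0xa0) {
      const hex = code.toString(16).toUpperCase().padStart(4, "0");
      found.push({ index: i, codePoint: "U+" + hex });
    }
  }
  return found;
}

// Replace every non-breaking space (U+00A0) with an ordinary space (U+0020).
function normalizeSpaces(str) {
  return str.replace(/\u00A0/g, " ");
}

const reportSample = "This\u00A0is some";
console.log(describeSpaces(reportSample));
console.log(normalizeSpaces(reportSample) === "This is some");
```

Logging the code points this way makes the difference visible even in consoles that render both characters identically.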

Inserting HTML inside quotes

I want a page break inside the title attribute of a link, but when I put one in, it appears correct in a browser, but returns 7 errors when I validate it.
This is the code.
<a href="images/Bosses/Lord Yarkan Large.jpg" class="hastipz" target="_blank" title="Lord Yarkan, a level 80 Unique from Silkroad Online -- Click for a Larger Image">
<img class="bosspic" src="images/Bosses/Lord Yarkan.jpg" style="float:right; position:relative;" alt="Lord Yarkon; Silkroad Unique"/>
</a>
The reason is because the title attribute appears in a tooltip, and I need a page break inside that tooltip. How can I add a page break inside the quotes without returning errors?

I found this forum post:
There are two approaches:
1) Use the character entity for a carriage return, which is &#13;
Thus:
<...title="Exemplary&#13;website">
(For a full list of character entities, try Googling "HTML Character Codes".)
2) to do any additional styling to your "tooltips", Google "CSS tooltips"
1) is non-standard though. Works on IE/Chrome, not with Firefox. The new spec appears to recommend &#10; (newline) instead.

Do you need to validate for work?
If not, do not worry about the errors if it works as you want it.
Validation is not the goal. It is a tool to help build better websites, which is the goal. ;-)
If you must have it validate, you could try to use some script to switch out a specific keyword / set of characters for a <br /> at dom ready. Although this is untested and I am not sure it wouldn't throw errors, too.
EDIT
As requested, a little jQuery to switch out a word:
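// Replace the "lineBreak" placeholder in each link's title with a newline.
// Only links that actually have a title are selected, so attr() never returns undefined.
$('a[title]').each(function(){
    var a = $(this).attr('title');
    var b = a.replace('lineBreak','\n');
    $(this).attr('title', b);
});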
Example: http://jsfiddle.net/jasongennaro/qRQaq/1/
Nb:
I used "lineBreak" as the keyword, as this is unlikely to be matched. "br" might be
I replaced it with the \n line break character.
You should try the \n line break character on its own... might work without needing to replace anything.
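For reference, the same keyword swap can be sketched without jQuery. This is only an illustration under assumptions: the helper name insertLineBreaks is made up, and "lineBreak" is the placeholder keyword from the answer above. It also handles titles that contain the keyword more than once, since replace() with a string pattern only swaps the first match.

```javascript
// Hypothetical helper: swap every occurrence of a placeholder keyword for a newline.
// String.prototype.replace with a string pattern only replaces the first match,
// so split/join is used here to catch all of them.
function insertLineBreaks(title, keyword = "lineBreak") {
  if (typeof title !== "string") {
    return title; // leave missing titles untouched
  }
  return title.split(keyword).join("\n");
}

// In a browser, this could be applied to every link that has a title:
// document.querySelectorAll("a[title]").forEach((a) => {
//   a.title = insertLineBreaks(a.title);
// });

console.log(JSON.stringify(insertLineBreaks("Lord YarkanlineBreakClick for a Larger Image")));
```

Whether the tooltip actually renders the newline still depends on the browser, as noted above.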

FreeMarker cannot seem to parse HTML 5 data-* attributes, chokes on dash

I wrote a simple custom directive, and have it pass all attributes through as regular element attributes. The syntax of the tag as follows:
<#link_to controller="unobtrusive" action="do-get" data-target="result">Do Get
Unfortunately, I get an exception:
Caused by: freemarker.core.ParseException: Encountered "-" at line 32, column 56 in unobtrusive/index.ftl.
Was expecting:
"=" ...
This is because it cannot seem to parse data-target attribute. When I change it to "data_target" with the underscore, all is fine.... but I really would need the dash: "-".
Can someone help?
Thanks,
Igor

Try this tip from the FAQ
<#link_to controller="unobtrusive" action="do-get" "data-target"="result">
I haven't tried this personally so can't vouch if it will work.

I just got stuck on the same problem: <#form.textarea ... data-maxCount="100" />. It seems that FreeMarker misinterprets special characters in names... Freemarker FAQ

As of 2.3.22, you can use - (and . and :) as part of any name if you precede it with a \, like in <#link_to data\-target=...>. (It's not too cute, but - is already used as the subtraction operator, and fixing this wouldn't be backward compatible, and so must wait for a major FTL version increase.)

Should characters in <pre><code> be html encoded?

Writing documentation in HTML requires some code examples. What to do with characters that should be replaced with &amp; and &gt; etc.? Should they be encoded in this case too? When I have these characters inside of <pre><code> tags, they display like they should as far as I can see.

Yes, you should use HTML entities inside of <pre> and <code>. Some browsers are forgiving, but leaving < and > as non-entities won't work in all cases.

If you don't, you'll eventually come across code like: print "</code>". And it won't work unescaped.
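A minimal sketch of preparing a code sample for placement inside <pre><code>, assuming the sample is available as a plain string. The function name escapeForCodeSample is hypothetical. The ampersand is replaced first so that the entities produced by the later steps are not escaped a second time.

```javascript
// Hypothetical helper: encode the characters that would otherwise be parsed as markup.
function escapeForCodeSample(source) {
  return source
    .replace(/&/g, "&amp;") // must run first, or "&lt;" would become "&amp;lt;"
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const codeSample = 'print "</code>"';
const markup = "<pre><code>" + escapeForCodeSample(codeSample) + "</code></pre>";
console.log(markup);
```

With the sample escaped, the only literal </code> in the markup is the real closing tag.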

escaping html inside comment tags

Escaping HTML is fine - it will remove <'s and >'s etc.
I've run into a problem where I am outputting a filename inside a comment tag, e.g. <!-- ${filename} -->
Of course things can be bad if you don't escape, so it becomes:
<!-- <c:out value="${filename}"/> -->
The problem is that if the file has "--" in the name, all the HTML gets screwed, since you're not allowed to have <!-- -- -->.
The standard HTML escape doesn't escape these dashes, and I was wondering if anyone is familiar with a simple / standard way to escape them.

Definition of a HTML comment:
A comment declaration starts with <!, followed by zero or more comments, followed by >. A comment starts and ends with "--", and does not contain any occurrence of "--".
Of course the parsing of a comment is up to the browser.
Nothing strikes me as an obvious solution here, so I'd suggest you str_replace those double dashes out.

There is no good way to solve this. You can't just escape them because comments are read in plaintext. You will have to do something like put a space between the hyphens, or use some sort of code for hyphens (like [HYPHEN]).

Since it is obvious that you cannot directly display the '--'s, you can either encode them or use the fn:escapeXml or fn:replace tags for appropriate replacements.
JSTL documentation

There's no universally working way to escape those characters in HTML unless the - characters come in multiples of four, so -- won't work in Firefox but ---- will. So it all depends on the browser. For example, in Internet Explorer 8 it is not a problem; those characters are handled properly. The same goes for Google Chrome. However, Firefox, even the latest version (3.0.4), doesn't handle these characters well.

You shouldn't be trying to HTML-escape, the contents of comments are not escapable and it's fine to have a bare ‘>’ or ‘&’ inside.
‘--’ is its own, unrelated problem and is not really fixable. If you don't need to recover the exact string, just do a replacement to get rid of them (eg. replace with ‘__’).
If you do need to get a string through completely unmolested to a JavaScript that will be reading the contents of the comment, use a string literal:
<!-- 'my-string' -->
which the script can then read using eval(commentnode.data). (Yes, a valid use for eval() at last!)
Then your escaping problem becomes how to put things in JS string literals, which is fairly easily solvable by escaping the ‘'’ and ‘-’ characters:
<!-- 'Bob\x27s\x2D\x2Dstring' -->
(You should probably also escape ‘<’, ‘&’ and ‘"’, in case you ever want to use the same escaping scheme to put a JS string literal inside a <script> block or inline handler.)
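The string-literal approach above could be sketched roughly as follows. The function name toCommentSafeLiteral is made up for illustration. Escaping the backslash and line terminators goes beyond what the answer lists and is an assumption added here so that the value survives the round trip through eval().

```javascript
// Hypothetical helper: wrap a value in a single-quoted JS string literal that
// contains no quote, hyphen, '<', '&' or '"' characters, so it can sit inside
// an HTML comment without ever producing "--".
// Assumption beyond the answer: backslashes and line terminators are escaped too.
function toCommentSafeLiteral(value) {
  const escaped = value.replace(/[\\'\-<&"\r\n\u2028\u2029]/g, (ch) => {
    const code = ch.charCodeAt(0);
    const hex = code.toString(16).toUpperCase();
    return code < 0x100 ? "\\x" + hex.padStart(2, "0") : "\\u" + hex.padStart(4, "0");
  });
  return "'" + escaped + "'";
}

const literal = toCommentSafeLiteral("Bob's--string");
console.log("<!-- " + literal + " -->");

// A reading script would get the comment text (with its surrounding spaces)
// from the comment node's data and evaluate it, as described above.
console.log(eval(" " + literal + " ") === "Bob's--string");
```

Because only escape sequences are emitted for the risky characters, the server side can generate the literal with any language that can do a character-by-character replacement.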
