Usage: Escape HTML problem - html

I ran into an interesting problem.
In our webpage a user can write their own description. We escape all text to make it easy to write (<3 shows up properly and isn't the start of a tag). This also avoids any problems with users trying to inject their JavaScript code, hide something, or do anything else with HTML.
A side effect is that when a user writes
Hi
My name is
it shows up as
Hi My name is
Initially we (really I) wrote var desc = (SafeHtml)obj.desc.HtmlEscape.replace("\n", "\n<br>"). However, this doesn't replace anything, because by that point \n has already been escaped to &#10;, since all characters < 0x20 (I think) need an escape to be represented in HTML.
So my question is, am I doing things right? I changed the replace to ("&#10;", "\n<br/>");. Is this the right way? Escape everything and then replace the characters you deem 'legal'? At the moment I can't think of any other characters to escape.

That's how I'd do it: escape everything, and then replace safe escaped sequences. That said, I don't think you need to escape all characters < 0x20. I'd leave 0x0A (newline, decimal 10) and 0x0D (carriage return, decimal 13) alone in the escaping step, and then replace them with <br />. It doesn't make much difference though.
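
The order of operations described above (escape first, then introduce the one tag you allow) can be sketched in Python. This is only an illustration, not the asker's C# code: `render_description` is a hypothetical helper name, and the standard library's `html.escape` stands in for whatever escaping routine the site uses. Unlike the asker's routine, `html.escape` leaves newlines untouched, so the replacement looks for the raw character rather than an escaped sequence.

```python
import html


def render_description(text: str) -> str:
    """Escape user text for HTML, then turn line breaks into <br /> tags."""
    # html.escape rewrites &, <, > and quotes; it leaves \n and \r untouched.
    escaped = html.escape(text)
    # Normalise Windows (\r\n) and bare \r line endings to \n.
    normalised = escaped.replace("\r\n", "\n").replace("\r", "\n")
    # Only now add markup, so <br /> is the single tag we chose to allow.
    return normalised.replace("\n", "<br />\n")


print(render_description("Hi\nMy name is <3"))
```

Doing the replacement after escaping matters: if the `<br />` were inserted first, the escaping step would turn it into visible text.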

Related

How to limit simple form input to 50 characters

Is it possible to limit a simple form input to only 50 characters without javascript?
I have used the max_length attribute; however, this includes blank spaces, which is not what I want.
I've attempted to use pattern (as suggested in another post), but I can't seem to get that to work either.
Thanks
I don't know why you don't want it to include blanks.
Usually I use max_length including blanks and leave it to the user to trim their excess whitespace. I'm not disagreeing, I honestly don't know what your requirement is.
If you want to allow leading and trailing whitespace, but are willing to leave it to the user to replace excess whitespace within the text to one whitespace character then this is the pattern you want:
<input pattern="^\s*.{0,50}\s*$">
Sometimes for multiline regular expressions, \A is used instead of ^ and \z is used instead of $, but I'm not sure HTML supports that in their regular expressions.
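
To see what that pattern accepts, here is a rough server-side sketch in Python. It assumes Python's `re` module behaves closely enough to the browser's regex engine for this pattern; `LIMIT_PATTERN` and `is_valid` are made-up names. `fullmatch` mirrors the way the HTML pattern attribute must match the whole value.

```python
import re

# Same expression as the pattern attribute, without the explicit anchors:
# fullmatch() already requires the whole value to match.
LIMIT_PATTERN = re.compile(r"\s*.{0,50}\s*")


def is_valid(value: str) -> bool:
    """Return True if value is at most 50 characters plus outer whitespace."""
    return LIMIT_PATTERN.fullmatch(value) is not None


print(is_valid("  " + "a" * 50 + "  "))  # leading/trailing blanks are free
print(is_valid("a" * 51))                # one character too many
```

Note that `.` also matches a space, so whitespace inside the text still counts toward the 50, as the answer says.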

How do I type html in a markdown file without it rendering?

I want to type the following sentence in a markdown file: she says <h1> is large. I can do it in StackOverflow with three backticks around h1, but this doesn't work for a .md file. I've also tried a single backtick, single quote, double quote, hashtags, spacing, <code>h1</code> and everything else I could think of. Is there a way to do this?
You can escape the < characters by replacing them with &lt;, which is the HTML escape sequence for <. Your sentence would then be:
she says &lt;h1> is large
As a side note, the original Markdown "spec" has the following to say:
However, inside Markdown code spans and blocks, angle brackets and ampersands are always encoded automatically. This makes it easy to use Markdown to write about HTML code. (As opposed to raw HTML, which is a terrible format for writing about HTML syntax, because every single < and & in your example code needs to be escaped.)
...which means that, if you're still getting tags when putting them in backticks, whatever renderer you're using isn't "compliant" (to the extent that one can be compliant with that document), and you might want to file a bug.
Generally, you can surround the code in single backticks to automatically escape the characters. Otherwise just use the HTML escapes: &lt; for < and &gt; for >.
i.e.
she says &lt;h1&gt; is large or she says `<h1>` is large
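
If the Markdown source is generated by a script, the entity form can be produced automatically. A small sketch using Python's standard `html.escape` (the variable names are illustrative only):

```python
import html

sentence = "she says <h1> is large"
# quote=False leaves " and ' alone; only &, < and > are rewritten.
escaped = html.escape(sentence, quote=False)
print(escaped)
```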
A backslash (\) can be used to escape < and >.
Ex: she says \<h1> is large

IE(11) does not escape "<" character when concatenated with an alphabetical character

I posted a similar question earlier but all replies missed the point or just assumed something basic/simple, so I'll try to explain again. Please read on if you wish to help...
I want to be able to type something like <this> is visible and have it show up on a rendered page. When I type the same text without this site's code-text, this is what I get: "something like is visible". Notice that the text between the "<" and ">" is missing.
In fact, I had to add the ">" character, otherwise this text would never have shown up. This issue does not happen if the "<" character is not concatenated (i.e.: "something like < this> is visible").
The reason for that is that IE believes I am creating a tag. I want to escape the "<" special character.
Conversion does not work (i.e.: converting "<" to &lt; or &#60;).
Thank you.
I can't insert examples into my comment, but have you tried the following:
Something like &lt;this&gt; is visible.
You can use this page as a reference.

JSON escape space characters

How would I escape space characters in a JSON string? Basically my problem is that I've gotten into a situation where the program that reads the string can use HTML tags for formatting, but I need to be able to use these HTML tags without adding more spaces to the string. so things like
<u>text</u>
is fine, for adding underline formatting
but something like
<font size="14">text</font>
is not fine, because the <font> tag with the size attribute adds an extra space to the string. I know, funny criteria, but at this point that's what has happened.
My first speculative solution would be to have some kind of \escape character that JSON can put in between font and size that will solve my "space" problems, something that the HTML will ignore but leave the human readable string in the code without actual spaces.
ex. <font\&size="14">text</font>
displays as: text
kind of like &nbsp; but better?
any solutions?
You can use \u0020 to escape the ' ' character in JSON.
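
A quick way to check this is with Python's standard `json` module. `dumps_without_spaces` is a hypothetical helper, not part of any library. The sketch only removes literal spaces from the JSON text; the decoded string still contains a real space, so whether this helps depends on which stage of the asker's program objects to the space.

```python
import json


def dumps_without_spaces(value: str) -> str:
    """Serialise a string to JSON with every space written as \\u0020."""
    # In the JSON text of a string, a literal space always stands for a
    # space character, so replacing it with its escape is lossless.
    return json.dumps(value).replace(" ", "\\u0020")


original = '<font size="14">text</font>'
encoded = dumps_without_spaces(original)
print(encoded)              # JSON text with no literal space
print(json.loads(encoded))  # decodes back to the original string
```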

escaping html inside comment tags

Escaping HTML is fine: it will remove <'s and >'s etc.
I've run into a problem where I am outputting a filename inside a comment tag, e.g. <!-- ${filename} -->
Of course things can be bad if you don't escape, so it becomes:
<!-- <c:out value="${filename}"/> -->
The problem is that if the file has "--" in the name, all the HTML gets screwed up, since you're not allowed to have <!-- -- -->.
The standard HTML escape doesn't escape these dashes, and I was wondering if anyone is familiar with a simple / standard way to escape them.
Definition of a HTML comment:
A comment declaration starts with <!, followed by zero or more comments, followed by >. A comment starts and ends with "--", and does not contain any occurrence of "--".
Of course the parsing of a comment is up to the browser.
Nothing strikes me as an obvious solution here, so I'd suggest you str_replace those double dashes out.
There is no good way to solve this. You can't just escape them because comments are read in plaintext. You will have to do something like put a space between the hyphens, or use some sort of code for hyphens (like [HYPHEN]).
Since it is obvious that you cannot directly display the '--'s, you can either encode them or use the fn:escapeXml or fn:replace functions for appropriate replacements.
JSTL documentation
There's no universally working way to escape those characters in HTML unless the - characters come in multiples of four: -- won't work in Firefox, but ---- will. So it all depends on the browser. For example, in Internet Explorer 8 it is not a problem; those characters are handled properly. The same goes for Google's Chrome. However, Firefox, even the latest version (3.0.4), doesn't handle these characters well.
You shouldn't be trying to HTML-escape, the contents of comments are not escapable and it's fine to have a bare ‘>’ or ‘&’ inside.
‘--’ is its own, unrelated problem and is not really fixable. If you don't need to recover the exact string, just do a replacement to get rid of them (eg. replace with ‘__’).
If you do need to get a string through completely unmolested to a JavaScript that will be reading the contents of the comment, use a string literal:
<!-- 'my-string' -->
which the script can then read using eval(commentnode.data). (Yes, a valid use for eval() at last!)
Then your escaping problem becomes how to put things in JS string literals, which is fairly easily solvable by escaping the ‘'’ and ‘-’ characters:
<!-- 'Bob\x27s\x2D\x2Dstring' -->
(You should probably also escape ‘<’, ‘&’ and ‘"’, in case you ever want to use the same escaping scheme to put a JS string literal inside a <script> block or inline handler.)
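
Both options from this answer can be sketched in Python (the original question uses JSP, so treat this as an illustration of the escaping rules rather than a drop-in fix). The function names are made up. The second helper also escapes backslashes, control characters and U+2028/U+2029, which the answer does not mention but which would otherwise break a JS string literal.

```python
def strip_double_hyphens(name: str) -> str:
    """Lossy option: replace each "--" so the comment stays well-formed."""
    return name.replace("--", "__")


def js_literal_for_comment(value: str) -> str:
    """Lossless option: a single-quoted JS string literal with no '-'."""
    risky = set("'\"-<&\\")
    parts = []
    for ch in value:
        code = ord(ch)
        if ch in risky or code < 0x20:
            # \xHH escapes cover quotes, hyphens, backslashes, controls.
            parts.append("\\x%02X" % code)
        elif ch in "\u2028\u2029":
            # Line/paragraph separators end a JS string in older engines.
            parts.append("\\u%04X" % code)
        else:
            parts.append(ch)
    return "'" + "".join(parts) + "'"


print("<!-- " + strip_double_hyphens("my--file.txt") + " -->")
print("<!-- " + js_literal_for_comment("Bob's--string") + " -->")
```

The first helper cannot be reversed; the second can be read back by a script, as described above.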