How to use HTML entity codes instead of characters in HTML

I'm writing a post about quote characters such as curly quotes. I want to print the string &ldquo; to demonstrate the HTML entity for the character “, but when I open the HTML page the entity is rendered as the character itself.
I think this has to do with the default charset of HTML5 documents being UTF-8, but I've seen pages like this one showing the raw code.
Is there any way to do this in HTML?

When you output &ldquo;, the browser converts the entity to the character “. To output the code as-is you must also encode its ampersand: &amp;ldquo;
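The double-encoding idea can be sketched in Python with the standard library's html.escape. This is only an illustration: the helper name show_entity_literally is hypothetical, and &ldquo; is assumed to be the entity the question is about.

```python
import html


def show_entity_literally(entity: str) -> str:
    """Escape the ampersand so a browser prints the entity code itself."""
    # quote=False leaves quotation marks alone; only &, < and > are escaped.
    return html.escape(entity, quote=False)


literal = show_entity_literally("&ldquo;")
print(literal)                 # markup to place in the page source
print(html.unescape(literal))  # what the browser would display
```

Unescaping once, as a browser does, gives back the entity code as visible text rather than the curly quote.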

Related

Proper Way to Escape the | Character Using HTML Entities

To escape the ampersand character in HTML, I use the &amp; HTML entity, for example:
Link
If I have the following code in my HTML, how would I escape the | character?
Link
HTML Tidy is complaining, claiming an illegal character was found in my HTML.
I tried using &brvbar; and several other HTML entities, but Tidy says "malformed URI reference."
You wouldn't.
The problem (as the message says) is that the character is illegal in URLs. It is perfectly fine in HTML.
You need to apply encoding for URLs which would be %7C.
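As a rough sketch of the two separate encoding layers, the Python snippet below percent-encodes the | for the URL and then HTML-escapes the finished URL for use in an href attribute. The page name and parameter names are made up for illustration.

```python
import html
from urllib.parse import quote

# Layer 1: URL encoding. The pipe becomes %7C inside the query value.
value = "a|b"
encoded_value = quote(value, safe="")

# Hypothetical link target with two query parameters.
url = "page?x=" + encoded_value + "&y=2"

# Layer 2: HTML encoding. Only the ampersand needs an entity here.
href = html.escape(url)

print(encoded_value)
print(href)
```

The pipe is handled entirely by the URL layer, so no HTML entity for it is ever needed.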
I don't know why Tidy is complaining about it, but this character is not problematic in HTML or in a URL. | is not a reserved character and can be used in a URL as is. You can percent-encode every character, but there is really no need for it.
What I presume Tidy is complaining about is the =. You have two of them, and the second one is invalid.
There is no need to encode this character in HTML entities. It has no special meaning in HTML.

Why doesn't nbsp display as nbsp in the URL

I am following a tutorial in which a web application written in PHP blacklists spaces in the input (the 'id' parameter). The task is to add other characters that bypass this blacklist but are still interpreted by the MySQL database in the back end. What works is a URL constructed like so:
http://192.168.2.15/sqli-labs/Less-26/?id=1'%A0||%A0'1
Now, my question is simply that if '%A0' indicates an NBSP, then why is it that when I go to a site like http://www.url-encode-decode.com, and try to decode the URL http://192.168.2.15/sqli-labs/Less-26/?id=1'%A0||%A0'1, it gets decoded as http://192.168.2.15/sqli-labs/Less-26/?id=1'�||�'1.
Instead of the question mark inside a black box, I was expecting to see a blank space.
I suspect that this is due to differences between character encodings.
The value A0 represents nbsp in the ISO-8859-1 encoding (and probably in other extended-ASCII encodings too). The page at http://www.url-encode-decode.com appears to use the UTF-8 encoding.
Your problem is that the single byte A0 is not a valid character on its own in UTF-8. The equivalent nbsp character in UTF-8 is represented by the two bytes C2 A0.
Decoding http://192.168.2.15/sqli-labs/Less-26/?id=1'%C2%A0||%C2%A0'1 will produce the nbsp characters that you expected.
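The difference between the two charsets can be reproduced with Python's urllib.parse.unquote, which decodes as UTF-8 by default and substitutes the replacement character for invalid bytes. This is a sketch using only the query value from the question, not the site's actual decoder.

```python
from urllib.parse import unquote

raw = "1'%A0||%A0'1"

# Default: UTF-8 with errors="replace". A lone A0 byte is invalid, so it
# turns into U+FFFD, the question mark in a black box.
as_utf8 = unquote(raw)

# Decoding the same bytes as ISO-8859-1 gives the non-breaking space.
as_latin1 = unquote(raw, encoding="latin-1")

# The UTF-8 form of nbsp is the two-byte sequence C2 A0.
fixed = unquote("1'%C2%A0||%C2%A0'1")

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(fixed))
```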
Regardless of why the encoding error occurs, try %20 as a replacement for the space.
Later on you can use str_replace to swap the space for a &nbsp;:
echo str_replace(" ", "&nbsp;", $_GET["id"]);
Maybe the script on this site does not work properly. If you use it in your PHP code it should work properly.
echo urldecode( '%A0' );
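outputs the single byte 0xA0, which is a non-breaking space in ISO-8859-1 and therefore invisible on the page.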

In Rich Text Editor Special Character are getting stored as Special Character not as their HTML code

We are using the Rich Text Editor in CQ, with special characters.
Whenever we add special characters by our button in the RTE, the character is added but is saved as the character in the source too, rather than the encoded HTML entity.
We are calling:
doc.execCommand("InsertHTML", false, htmlToInsert);
In htmlToInsert, we are sending the HTML code value of the special character, like &yen; for yen, but it is saving the character ¥ for yen, not &yen;.
We need to store HTML code values only. Please help me in achieving this.

Escape special (HTML tag) characters in XML attribute?

As part of an XML node attribute, I need to pass up HTML characters as part of an attribute value, such as hello" />. I can't use CDATA as part of the value of the node, as lots of other systems use this method and I cannot afford to break or rewrite that process, so I'm stuck with this.
I can't HTML encode the values, as they're used inside an email and are subsequently output literally as HTML-encoded values (&lt;br &gt;hello, for example).
Is there a way to escape HTML (specifically, the < character) and allow me to keep un-encoded HTML inline as an attribute? Thanks.
The XML characters <, >, & and " must be escaped exactly as in HTML, with entities such as &lt;. XML APIs then return and store the original characters. Other HTML character entities should be converted to UTF-8 characters. Numeric entities, both hex (&#xFC; for ü) and decimal (&#252; for the same ü), are simple to convert, but named entities (&bull;) require a library if you want complete coverage.
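A minimal Python sketch of this round trip follows, assuming a hypothetical element called node with a value attribute. The XML API escapes the markup on output and hands back the original characters on input, while html.unescape plays the role of the library for named HTML entities.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical element; the attribute value contains raw HTML.
elem = ET.Element("node", {"value": "<br />hello"})

# Serializing escapes < and > inside the attribute value.
serialized = ET.tostring(elem, encoding="unicode")

# Parsing returns the original, unescaped characters.
round_tripped = ET.fromstring(serialized).get("value")

# Named and numeric HTML entities converted to real characters.
converted = html.unescape("&bull; &#xFC; &#252;")

print(serialized)
print(round_tripped)
print(converted)
```

The consumer that reads the attribute through an XML parser therefore sees the unencoded HTML, even though the file on disk holds the escaped form.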

HTML Character Encoding

When outputting HTML content from a database, some encoded characters are being properly interpreted by the browser while others are not.
For example, %20 properly becomes a space, but %AE does not become the registered trademark symbol.
Am I missing some sort of content encoding specifier?
(note: I cannot realistically change the content to, for example, &reg;, as I do not have control over the input editor's generated markup)
%AE is not valid HTML-safe ASCII.
You can view the table here: http://www.ascii.cl/htmlcodes.htm
It looks like you are dealing with a Microsoft Word encoding (windows-1252 or something similar), which will not convert to HTML-safe text unless you translate it somewhere in between.
The byte AE is the ISO-8859-1 representation of the registered trademark symbol. If you don't see anything, then the URL decoder is apparently using another charset to URL-decode it. In UTF-8, for example, this byte on its own does not represent any valid character.
To fix this, you need to URL-decode it using ISO-8859-1, or to convert the existing data to be URL-encoded using UTF-8.
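Both fixes can be sketched with Python's urllib.parse. This is an illustration of the charset mismatch only and does not reflect the asker's actual server stack.

```python
from urllib.parse import quote, unquote

# Decoding %AE as UTF-8 (the default) fails and yields U+FFFD.
broken = unquote("%AE")

# Fix 1: decode using ISO-8859-1, where byte AE is the registered sign.
symbol = unquote("%AE", encoding="latin-1")

# Fix 2: store the data URL-encoded as UTF-8 instead (two bytes, C2 AE).
reencoded = quote(symbol)

print(repr(broken), symbol, reencoded)
```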
That said, you should not confuse HTML (XML) encoding like &reg; with URL encoding like %AE.
The '%20' encoding is URL encoding. It's only useful for URLs, not for displaying HTML.
If you want to display the reg character in an HTML page, you have two options: Either use an HTML entity, or transmit your page as UTF-8.
If you do decide to use the entity code, it's fairly simple to convert them en masse, since you can use numeric entities and don't have to use the named ones, i.e. use &#174; rather than &reg;.
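One way to do such a bulk conversion, sketched in Python, is the built-in xmlcharrefreplace error handler, which replaces every non-ASCII character with its decimal numeric entity. The sample text is invented for illustration.

```python
import html

text = "Acme\u00ae costs \u00a5100"  # "Acme® costs ¥100"

# Every character outside ASCII becomes &#NNN; with its decimal code point.
as_entities = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(as_entities)

# A browser (or html.unescape) turns the entities back into characters.
restored = html.unescape(as_entities)
```

Because the result is pure ASCII, it displays correctly whatever charset the page is served with.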
If you need to know entity codes for every character, I find this cheat-sheet very helpful: http://www.evotech.net/blog/2007/04/named-html-entities-in-numeric-order/
What server side language are you using? Check for a URL Decode function.
If you are using PHP you can use urldecode(), but be careful with + characters: urldecode() turns each + into a space, whereas rawurldecode() leaves it alone.
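The same pitfall with plus signs exists in other languages. As a Python analogue (not the PHP functions themselves), unquote_plus treats + as a space while unquote leaves it untouched:

```python
from urllib.parse import unquote, unquote_plus

sample = "a+b%20c"

# Form-style decoding: + and %20 both become spaces.
form_decoded = unquote_plus(sample)

# Plain percent-decoding: only %20 becomes a space, + survives.
plain_decoded = unquote(sample)

print(repr(form_decoded))
print(repr(plain_decoded))
```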