How to insert these non-ascii characters as html content? - html

Any non-ASCII character can be written as a numeric character reference of the form &#xYYYY;.
As per below code,
Editor is Sublime Text.
How do I represent these emoticons in html?

I found this Sublime Text 3 Plugin to insert emojis into the editor.
https://packagecontrol.io/packages/Emoji
Is this what you are looking for?

As long as you save the file you're editing using the UTF-8 character encoding, and make sure it is delivered with a suitable content type header, such as Content-Type: text/html; charset=utf-8, you don't need to do anything at all: the emoji can be typed or pasted directly into the markup.
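As a rough illustration of "UTF-8 bytes plus a matching header", here is a minimal WSGI sketch. The function name app and the page content are made up for the example; only the header value comes from the answer above.

```python
# Minimal WSGI sketch: literal emoji in the markup, encoded as UTF-8,
# delivered with a Content-Type header that declares the same charset.
# "app" and PAGE are illustrative names, not part of the original answer.

PAGE = "<!DOCTYPE html><html><body><div>😀 😐 😳 😫 💩</div></body></html>"


def app(environ, start_response):
    body = PAGE.encode("utf-8")  # the bytes actually sent over the wire
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, one option is the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever(), then open http://localhost:8000/ in a browser.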
Another option, as others have noted, is adding them as HTML entities instead. In order to do that you would need to know their character codes. How to do that differs between different environments, but there are multiple questions on SO about that.
Here's how you could do it in Python (Python 3, you'd need to use u"" strings in earlier versions):
chars = [
    "😀",
    "😐",
    "😳",
    "😫",
    "💩",
]
for char in chars:
    # Hexadecimal numeric character reference, e.g. 😀 -> &#x1f600;
    print("{}: &#x{:x};".format(char, ord(char)))

You can represent emoticons in HTML by their Unicode code point, formatted as &#xAAAAA; where AAAAA is the hexadecimal code point (see any Unicode list):
<div>&#x1F601;</div>
I am using Sublime Text as my editor; when you view the page in a browser, the reference is rendered as the emoji itself.

Related

£ getting converted to ? by HTML Tidy, EncodingType?

I am cleaning a HTML file using HTML Tidy, well the .NET version called TidyManaged, and my "£" symbols are being converted to "?"
ie:
Income (£)
becomes:
Income (�)
I believe it is to do with encoding types. In TidyManaged, one can specify the input encoding type and output encoding type, including such things as Latin1, utf8, utf16, win1252.
The XHTML doc will ultimately get converted into a DOC which uses win1252.
So what should my input and output encoding be to preserve £ symbols?
Many thanks.
Well, when I've used other charsets it's always been different. I'm not fluent in them, but I do know that to create symbols and punctuation you can use a 'code' rather than the literal character. I've never seen win1252, but Google says the pound sign is 0x00A3.
Try putting that somewhere in your document.
I know in HTML I would put &pound; (or the numeric form &#163;) for a pound sign. So in HTML:
<p>&pound;0.00</p>
Where I got the code
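The symptoms in the question are typical of an input/output encoding mismatch. The sketch below uses only Python's built-in codecs, not TidyManaged, so it illustrates the mechanism rather than the library's behavior; the variable names are made up.

```python
# How "£" turns into "?" or "�" when encodings do not match.
pound = "£"

utf8_bytes = pound.encode("utf-8")      # b'\xc2\xa3' (two bytes)
cp1252_bytes = pound.encode("cp1252")   # b'\xa3'     (one byte)

# Forcing the character into a charset that lacks it yields a literal "?".
ascii_fallback = pound.encode("ascii", errors="replace")

# Reading win1252 bytes as if they were UTF-8 yields U+FFFD, shown as "�".
misread_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

# Reading UTF-8 bytes as win1252 yields the classic "Â£" mojibake.
misread_as_cp1252 = utf8_bytes.decode("cp1252")

print(ascii_fallback, repr(misread_as_utf8), misread_as_cp1252)
```

If this is what is happening, the input encoding should match how the source file was actually saved, and the output encoding should match what the next step in the pipeline expects (win1252 here, going by the question).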

Displaying UTF-8 codes from JSON file as Emoticons

I am loading a JSON file that contains some UTF-8 codes, that represent emoticons.
The JSON content looks as follows:
"Studying! \uf4d6"
"Winning \uf40e\uf3c1 #4mile"
"Cheer me on \uf603 #werunamsterdam"
These UTF-8 codes are displayed as blocks in the browser. But when I look at this Unicode reference in Firefox, the codes are actually recognized!
(for example, UF4D6 is a book)
How do I convert the code from my json so that a browser can display them?
The code points from \uE000 to \uF8FF are in a private use area, so there aren't any standard glyphs associated with them.
You can, however, create your own font with suitable icons at these code points. This can be done quite easily using online tools like IcoMoon. Alternatively, use a string replacement routine to swap these characters with suitable markup (e.g., replace \uf4d6 with <img src="/icons/book.png" alt="[Book]" />)
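A string replacement routine along those lines could look like the following sketch. Only the book mapping comes from the answer above; the function names, the fallback placeholder, and the idea of escaping the rest of the text are assumptions made for the example.

```python
import html
import json

# Mapping from private-use characters to markup. Only the book entry is taken
# from the answer above; extend it with whatever icons you actually have.
ICON_MARKUP = {
    "\uf4d6": '<img src="/icons/book.png" alt="[Book]" />',
}


def is_private_use(ch):
    """True for code points in the BMP private use area U+E000..U+F8FF."""
    return 0xE000 <= ord(ch) <= 0xF8FF


def replace_private_use(text, icons=ICON_MARKUP, fallback="[?]"):
    """Swap private-use characters for markup and HTML-escape everything else."""
    parts = []
    for ch in text:
        if is_private_use(ch):
            parts.append(icons.get(ch, fallback))
        else:
            parts.append(html.escape(ch))
    return "".join(parts)


# json.loads turns the \uXXXX escape into the actual character first.
status = json.loads('"Studying! \\uf4d6"')
print(replace_private_use(status))
```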
These emoticons are encoded as regular characters as defined in Unicode, i.e. they're no different from the letter "A" or "%". All you need is a font that has glyphs for these "characters". Since not everyone can be expected to have such fonts installed (apparently you don't), if you want maximum compatibility, there are libraries for most languages that replace these characters with equivalent images. Google for one that suits your needs.

Properly display utf-8 HTML characters in non-multibyte Vim

I've been attempting to open .epub files in Vim for reading (yes it's silly, let's ignore that for now) and I'm having trouble with how the internal HTML of epubs displays characters such as ’ and ” among other things.
Vim displays ’ as â~#~Y while opening the file with less gives me <E2><80><99>. I'm not sure how Vim deals with this (it seems to treat ~# and ~Y as single characters) and as such I'm not sure how to go about replacing the special HTML characters with their UTF-8 equivalent.
Is there an encoding setting that will display this properly? Or a way to manually input these characters such that I could create a search and replace macro?
Thanks
It looks like Vim doesn't properly detect the UTF-8 encoding; you can check with
:setlocal fileencoding?
and force UTF-8 with
:edit ++enc=utf-8 file.epub
(or tweak your 'fileencodings' option to have it automatically detected).
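To see why a wrong encoding guess produces three odd characters instead of one quote, here is a small Python sketch (outside Vim, so only an illustration of the byte-level effect). It decodes the bytes that less reported, E2 80 99, both ways.

```python
# The three bytes reported by less for the apostrophe.
raw = b"\xe2\x80\x99"

# Decoded as UTF-8 they form a single character: U+2019, right single quote.
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1, every byte becomes its own character: "â" followed by
# two unprintable control characters, which editors show as escape notation.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```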

My Browser won't interpret "ΧΨ" when I load the website I'm building

I pretty much built this website in firebug, then when I copied the code into a text document and tried loading it, firefox wouldn't interpret the "ΧΨ" in the source. However, it does a fantastic job using them while I'm typing this.
Wassup wid dat?
You can't just type a non-ASCII character into an HTML document and expect it to work: either the file's encoding has to be declared correctly, or you should use the proper character code. See this list:
http://htmlhelp.com/reference/html40/entities/symbols.html
You can use the entity name, decimal, or hex form to represent your characters, like this:
<p>&Chi;&Psi;</p>
That's the HTML representation of "ΧΨ".
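For reference, the three forms can be generated and checked with Python's standard html module. This is only a quick sketch; the variable names are arbitrary.

```python
import html

text = "\u03a7\u03a8"  # Greek capital chi and psi

named = "&Chi;&Psi;"
decimal = "".join("&#{};".format(ord(c)) for c in text)
hexadecimal = "".join("&#x{:X};".format(ord(c)) for c in text)

print(named, decimal, hexadecimal)

# All three forms decode back to the same two characters.
for form in (named, decimal, hexadecimal):
    assert html.unescape(form) == text
```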
Cheers

HTML Character Encoding

When outputting HTML content from a database, some encoded characters are being properly interpreted by the browser while others are not.
For example, %20 properly becomes a space, but %AE does not become the registered trademark symbol.
Am I missing some sort of content encoding specifier?
(note: I cannot realistically change the content to, for example, &reg; as I do not have control over the input editor's generated markup)
%AE is not valid for HTML safe ASCII,
You can view the table here: http://www.ascii.cl/htmlcodes.htm
It looks like you are dealing with Windows Word encoding (windows-1252?? something like that) it really will NOT convert to html safe, unless you do some sort of translation in the middle.
The byte AE is the ISO-8859-1 representation of the registered trademark sign. If you don't see anything, then apparently the URL decoder is using a different charset to URL-decode it. In UTF-8, for example, a lone AE byte does not represent any valid character.
To fix this, you need to URL-decode it using ISO-8859-1, or to convert the existing data to be URL-encoded using UTF-8.
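The difference is easy to reproduce with Python's urllib.parse (used here purely as an illustration; the question does not say which server-side language is involved).

```python
from urllib.parse import quote, unquote

# %AE decoded as ISO-8859-1 gives the registered trademark sign.
as_latin1 = unquote("%AE", encoding="latin-1")

# Decoded as UTF-8 (the default), the lone byte is invalid and is replaced
# with U+FFFD, the replacement character.
as_utf8 = unquote("%AE")

# URL-encoding the sign as UTF-8 produces two bytes instead of one.
utf8_encoded = quote("\u00ae")

# %20 is plain ASCII, so it decodes to a space under either charset.
space = unquote("%20")

print(repr(as_latin1), repr(as_utf8), utf8_encoded, repr(space))
```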
That said, you should not confuse HTML (XML) encoding like &reg; with URL encoding like %AE.
The '%20' encoding is URL encoding. It's only useful for URLs, not for displaying HTML.
If you want to display the reg character in an HTML page, you have two options: Either use an HTML entity, or transmit your page as UTF-8.
If you do decide to use the entity code, it's fairly simple to convert them en masse, since you can use numeric entities; you don't have to use the named entities, i.e. use &#174; rather than &reg;.
If you need to know entity codes for every character, I find this cheat-sheet very helpful: http://www.evotech.net/blog/2007/04/named-html-entities-in-numeric-order/
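In Python, one way to do the en-masse conversion to numeric entities is the built-in xmlcharrefreplace error handler. The helper name below is made up for this sketch. Note that it leaves ASCII characters such as < and & untouched, so it is not a substitute for HTML escaping.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal reference like &#174;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_entities("Acme\u00ae costs \u00a35"))
```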
What server side language are you using? Check for a URL Decode function.
If you are using PHP you can use urldecode(), but you should be careful about + characters, which it decodes to spaces.