I'm making a website that is in Croatian, and I need to use signs like: "č", "ć", "ž", "đ" and "š". They are currently displayed as little boxes.
Info:
I use Notepad++.
I set the encoding there to UTF-8.
I put the following line of HTML in: <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
However, it does not work. Even Notepad++ can't display my characters using UTF-8, which suggests that I should probably use something else...
http://webdesign.maratz.com/lab/utf_table/
Use HTML entities, for example:
č : &#269;
ž : &#382;
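If you would rather generate such numeric references than look them up, here is a minimal Python sketch. The function name `to_numeric_entities` is made up for illustration; it relies only on the standard `xmlcharrefreplace` error handler of `str.encode`.

```python
def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric reference."""
    # Encoding to ASCII fails for characters like "č"; the
    # "xmlcharrefreplace" handler substitutes "&#NNN;" instead of raising.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
```

For instance, `to_numeric_entities("č")` gives `&#269;`, and plain ASCII text passes through unchanged.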
This sounds more like a font issue than a character encoding issue. If it were a character encoding issue, each character would most likely be displayed as two or more unrelated characters (for example, Å¡ in place of š). Boxes, however, typically mean the character encoding is correct but that specific character is not available in the font being used (which is especially common with lesser-used fonts). This would explain why it behaves incorrectly in both the website and Notepad++.
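To see what a genuine encoding mismatch looks like (as opposed to missing glyphs), this small Python sketch encodes text as UTF-8 and then deliberately decodes the bytes as ISO-8859-1. The helper name `misread_as_latin1` is hypothetical, and ISO-8859-1 is just one plausible wrong encoding.

```python
def misread_as_latin1(text: str) -> str:
    """Simulate a viewer that reads UTF-8 bytes as ISO-8859-1."""
    # Each non-ASCII character becomes two or more bytes in UTF-8, and
    # ISO-8859-1 maps every byte to its own character, so one letter
    # turns into several unrelated ones rather than a box.
    return text.encode("utf-8").decode("latin-1")
```

With this, `misread_as_latin1("š")` yields the two characters `Å¡`. If you see boxes instead of pairs like that, the encoding is probably fine and the font is the suspect.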
To fix the issue, simply use a different font in your editor and website.
Note: I recommend a widely used font for the best chance of it working. Specifying a generic name in the website (e.g. serif or sans-serif) will probably have even better results, as the OS/browser would decide on the best font to use.
In short, be consistent about your character encoding throughout.
Configure your editor to save in the encoding you want
If you use any server side programming, make sure it isn't transcoding your data
If you use a database, make sure it is configured to use the same encoding
Configure your server to emit a Content-Type header that specifies that encoding
Use the meta tag in your question
The W3C provides useful material on encodings that starts here.
A useful site for special characters and their character codes: CopyPaste Character
To 'type' them, use the Alt codes.
However, to use them in your site, you'd better use the HTML codes, which you can find on CopyPaste Character.
As a test, try this:
<span style="font-family:Arial Unicode MS">
č ć ž đ š
</span>
You should be able to see your characters correctly.
I've just copied a line from your question along with your meta tag and pasted them into a plain text file in vi.
It works just fine - all characters are displayed fine: http://www.dusystems.com/tmp/1.html
If you can't do the same with your editor then the problem is with the editor and not character sets and encodings.
If you're on Windows you can use its built-in Notepad to edit UTF-8 files. Open Notepad, type all of your special characters, add the meta tag. When doing Save As select UTF-8 from the Encoding drop-down in the dialog. Save as something.html and open in IE. It will 100% work.
Related
I started learning HTML + CSS a week or two ago, and I'm facing a problem. I'm European, so I need to use special characters like á, ã, ç, etc. a lot. Is there any way I can do that without using the corresponding code for each letter every time I need one? Something like a piece of code I can put at the beginning of the HTML document that would make all the special characters accepted.
Decide which encoding you want to use for your site; if you don't have any preference, use UTF-8.
Save the .html file in that encoding in your text editor. Consult the help of your specific text editor for how to choose the encoding the file gets saved in.
Add <meta charset="utf-8"> to your <head> to instruct the browser to treat the page as UTF-8 encoded.
Preferably also configure your web server to output a Content-Type: text/html; charset=utf-8 HTTP header, since that takes precedence if present. Consult the manual of your web server for how to do that.
Write literally any character you can input directly as is into your document and enjoy.
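The "save in that encoding" and "add the meta tag" steps above can be sketched in Python as follows. The page content and the `save_utf8` helper are illustrative assumptions; the point is only that the bytes on disk and the declared charset agree.

```python
from pathlib import Path

# A tiny page whose declared charset matches how it will be saved.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Test</title>
</head>
<body>
<p>á, ã, ç</p>
</body>
</html>
"""


def save_utf8(path: Path, html: str = PAGE) -> bytes:
    """Write the page as UTF-8 and return the raw bytes that hit the disk."""
    # Passing encoding explicitly avoids relying on the platform default,
    # which may be a legacy code page on some systems.
    path.write_text(html, encoding="utf-8")
    return path.read_bytes()
```

Calling `save_utf8(Path("test.html"))` and opening the file in a browser should then show the accented letters as typed.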
Further reading:
https://www.w3.org/International/tutorials/tutorial-char-enc/
Handling Unicode Front To Back In A Web App
What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text
UTF-8 all the way through
I'm writing a paragraph that requires me to use a Greek word that means something else, but when I put the Greek word into my text editor and save it, it looks weird in my browser. I tried using a span but it still shows the same weird code.
<p>Music is an art form whose medium is sound and silence. Its common elements are pitch
(which governs melody and harmony), rhythm (and its associated concepts tempo, meter, and
articulation), dynamics, and the sonic qualities of timbre and texture. The word derives
from Greek <span lang="el">μουσική</span> (mousike; "art of the Muses").</p>
Perhaps your page is being interpreted using the wrong charset. Try adding <meta charset="UTF-8"> inside your <head> element. This tells the browser how to interpret non-ASCII characters like the ones you described.
Make sure this tag is present in your head:
<meta charset="utf-8">
When that tag is not present, browsers typically fall back to a legacy single-byte encoding such as windows-1252 (often labelled "ANSI") or ISO 8859-1. Such encodings can represent at most 256 characters. The UTF-8 encoding allows you to use Unicode characters directly in your HTML page.
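As a quick illustration of that 256-character limit, the following Python sketch checks whether a string can be represented in a given encoding at all. The function name `fits_in` is hypothetical.

```python
def fits_in(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        # At least one character has no representation in this encoding.
        return False
```

Here `fits_in("μουσική", "iso-8859-1")` is false because ISO 8859-1 has no Greek letters, while `fits_in("μουσική", "utf-8")` is true.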
That meta tag tells the browser to interpret your HTML page with the correct character set.
Check out the Wikipedia page on ISO 8859-1 for more info. Also, here's the UTF-8 Wikipedia page.
Edit
As Juhana pointed out in the comments, make sure your editor is set to the appropriate encoding as well (most programming/web-specific editors, like Sublime for example, should do this by default, but other multi-purpose text editors may not.)
Currently, I have my webpage set to Unicode/UTF-8. When trying to display a special character (for example, em dash, double arrow, etc), it shows up as a question mark symbol. I cannot change these characters to the HTML entity equivalent. How can I circumvent this issue?
A question mark in a lozenge, �, indicates a character-level error: the data contains bytes that do not represent any character, according to the character encoding being applied. This typically happens when the document is declared as UTF-8 encoded but is really in iso-8859-1, windows-1252, or some similar encoding. Windows-1252 is a common default encoding used by various programs on Windows platforms. So you may need to open the file in your authoring program and re-save it as UTF-8 encoded.
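The mismatch described here is easy to reproduce. In this Python sketch (the helper name `decode_as_declared` is an assumption), text containing an em dash is saved as windows-1252 and then decoded as UTF-8 the way a browser would, substituting U+FFFD for the invalid byte.

```python
def decode_as_declared(raw: bytes, declared: str = "utf-8") -> str:
    """Decode bytes using the declared encoding, marking invalid bytes."""
    # errors="replace" inserts U+FFFD (the question mark in a lozenge)
    # wherever the bytes are not valid in the declared encoding.
    return raw.decode(declared, errors="replace")


# "\u2014" is the em dash; windows-1252 stores it as the single byte 0x97,
# which is not a valid byte sequence on its own in UTF-8.
saved_as_cp1252 = "a \u2014 b".encode("windows-1252")
saved_as_utf8 = "a \u2014 b".encode("utf-8")
```

Decoding `saved_as_cp1252` gives `a � b`, while decoding `saved_as_utf8` returns the original text, which is why re-saving the file as UTF-8 fixes the page.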
If problems remain, please post the URL. Posting the code alone is not sufficient, since the character encoding is primarily specified in HTTP headers.
If you see a question mark in a small box, then it might be a font-level problem (lack of glyph in the fonts being used), but this would be very rare for common characters like the em dash. Different browsers have different ways of indicating character- or font-level problems.
Make sure your document is set to the correct character encoding in the actual code editor, as well as in the <meta> tag in the document's <head>. Both are necessary. I spent hours trying to tweak HTML when the only problem was that I needed to change the text encoding setting in Coda.
<head>
<meta charset="utf-8">
See the following screenshot:
Make sure your characters are actually UTF-8 characters. They will probably look something like this:
&#174; (rendered as ®) or U+0020
http://www.kinsmancreative.com/transfer/char/index.php is a handy site for finding the decimal values of commonly used UTF-8 special characters if you need a reference.
I have some pages I'm copying Chinese text into from a Word doc. I have 2 kind of HTML documents. Some are parent pages, with a full head tag, and meta tag, where I have spec'd:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I also have some snippets that just start with <div> because they are include files.
When I paste Chinese into the docs with the meta tag, it pastes fine and I can see the Chinese.
When I paste Chinese into the HTML snippets that start with a <div>, I just get boxes.
How do I paste Chinese into a snippet so that I can see the characters?
In Dreamweaver, I can flip over to visual mode and paste them in, but when I flip back to code mode, it shows me that all the characters have been converted to URL encoded equivalents. On the UTF-8 page, I can paste the Chinese and read it in code mode as Chinese characters.
HOWEVER
The weird thing is I have several include files. One I opened up has no meta tag on it, and it already had Chinese in it from development I did a few weeks back. I can still paste new Chinese into it and it's fine.
So basically, I have regular old HTML files, with no meta tag about UTF-8, and some allow me to paste Chinese into them in code mode and it works fine, and others don't allow it. The structure of these various HTML snippets are nearly identical.
Could this be a Dreamweaver bug, or is there some trick / setting?
You cannot do so reliably without some sort of metadata; in HTML, that is what the <meta> tag is for. (Otherwise the browser has to guess, and falls back to the default that the user has chosen, which in most cases is anything but what one expects.)
Dreamweaver also has to determine the encoding of the files it opens, and in the absence of metadata it will presumably use the same techniques as a web browser (such as detecting byte sequences / character distributions that are only likely to appear in a specific encoding) to come up with a best guess.
This is the likely reason why one snippet opened with the correct encoding and another did not. Once open, Dreamweaver "knows" that a file is encoded with UTF-8, so it can continue to use UTF-8 for that file (and may even add a BOM when saving, to ensure it opens correctly in future) which is probably why saving over the bad one with the good one fixed the issue.
A good editor will let you specify the encoding to use when opening (and saving) a file. You can do this in Dreamweaver via Modify > Page Properties (see: UTF-8 Without BOM?).
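To make the guessing idea concrete, here is a rough Python sketch of how a tool might pick an encoding for a file that carries no metadata. This is not Dreamweaver's actual algorithm; the function name `guess_encoding` and the windows-1252 fallback are assumptions chosen for illustration.

```python
import codecs


def guess_encoding(raw: bytes, fallback: str = "windows-1252") -> str:
    """Guess a file's encoding from its bytes alone."""
    # A UTF-8 byte order mark at the start is an explicit hint.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        # Text in a legacy encoding that contains non-ASCII characters
        # is rarely also valid UTF-8, so a clean decode is a strong signal.
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Nothing conclusive: fall back to a configured default.
        return fallback
```

A snippet saved with a BOM, or one that already contains valid UTF-8 Chinese text, is detected as UTF-8. Note that a pure-ASCII snippet also decodes cleanly, so this sketch reports UTF-8 for it, whereas a real tool may apply its own default instead, which could explain why nearly identical files behave differently.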
I have a website in which I have to put some lines in Arabic. How do I do it?
Where do I get the Arabic text characters, and how do I make the page support Arabic?
I have to put one line per page, and there are a lot of pages, so I can't go around making images and inserting them.
This is the answer that was required, but everybody else answered only one part of many.
Step 1 - Save the document as UTF-8, not as ANSI and not as Notepad's "Unicode" option (which is UTF-16).
Advanced editors don't always make this simple, so go low level:
use Notepad to save the document as meName.html and change the Encoding type to UTF-8.
Step 2 - Mention in your html page that you are going to use such characters by
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
Step 3 - When you put in some characters make sure your container tags have the following 2 properties set
dir='rtl'
lang='ar'
Step 4 - Get the characters from a dedicated tool/editor or an online editor, as I did with Arabic-Keyboard.org
example
<p dir="rtl" lang="ar" style="color:#e0e0e0;font-size:20px;">رَبٍّ زِدْنٍي عِلمًا</p>
NOTE: font type, font family, and font face settings do not affect the encoding of special characters; they only determine which glyphs are used to display them.
The W3C has a good introduction.
In short:
HTML is a text markup language. Text means any characters, not just ones in ASCII.
Save your text using a character encoding that includes the characters you want (UTF-8 is a good bet). This will probably require configuring your editor in a way that is specific to the particular editor you are using. (Obviously it also requires that you have a way to input the characters you want)
Make sure your server sends the correct character encoding in the headers (how you do this depends on the server software you use).
If the document you serve over HTTP specifies its encoding internally, then make sure that is correct too
If anything happens to the document between you saving it and it being served up (e.g. being put in a database, being munged by a server side script, etc) then make sure that the encoding isn't mucked about with on the way.
You can also represent any unicode character with ASCII
You not only have to add the meta tag saying that the page is UTF-8, but also really make the document UTF-8. You can do that with good editors (like Notepad++) by converting the document to "unicode" or "UTF-8 without BOM". Then you can simply use Arabic characters.
As this page is UTF-8, here are some examples (I hope I don't write anything rude here): شغف
If you use a server-side scripting language, make sure that it does not output the page in a different encoding. In PHP, for example, you can set it like this:
header('Content-Type: text/html; charset=utf-8');
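The same idea applies to other server-side languages. As a hedged sketch, a Python WSGI application (the name `app` and the page body are made up for illustration) would declare the charset in its response headers like this:

```python
def app(environ, start_response):
    """Minimal WSGI app that serves a UTF-8 page with a matching header."""
    # Encode the body with the same encoding that the header declares.
    body = '<!DOCTYPE html><meta charset="utf-8"><p>شغف</p>'.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, one option is the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.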
If you don't even know where to get Arabic characters, but you want to display them, then you're doing something wrong.
Save files containing Arabic characters with encoding UTF-8. A good editor allows you to set the character encoding.
In the HTML page, place the following after <head>:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
If you're using XHTML:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
That's it.
An alternative way (without messing with the encoding of a file) is to use HTML escape sequences. This website does that job for you: http://www.htmlescape.net/
Won't you also need to ensure that the area where you display the Arabic is oriented right-to-left?
e.g.
<p dir="rtl">
I edited the HTML page with Notepad++, set the encoding to UTF-8, and it works.
As mentioned above, some text editors do not use UTF-8 as the default encoding for documents.
However, most editors will allow you to change that in the settings, even for each specific document.
Check that you have <meta charset="utf-8"> inside the <head> block.