Custom Google Fonts error in HTML

I have a blog where I use custom fonts from Google Fonts for all of the text in the <body> element, but whenever there is an inverted comma or a double inverted comma in my text, it is not shown as it should be - it is replaced by an unknown character.
I have even looked into the font, and it does include the characters for the inverted commas.

I don't think this has anything to do with your font.
If you look at the source code you will see the characters are already broken there:
This is rather a problem with your encoding. Your site is declared as UTF-8, but the characters seem to be non-UTF-8. You either need to use UTF-8 characters or change the declared encoding of your site (the first option is preferable).
If you change the site encoding to Windows-1252 (which is automatically suggested by Chrome based on the content) everything seems fine:
The question is how did you create this text? Maybe in Word and then copied and pasted it? Or is your blog backend not UTF-8?
Also note there are two different characters: ’ vs. ´.

It's a special character issue. Please check the examples below:
If you want to write "Don't", then you have to use "don’t".
If you want to write something in double quotes, like "highly sought users", then you have to use "“highly sought users”".
I hope this will help you.
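For reference, a minimal sketch of the same text written with the equivalent HTML character references (the surrounding paragraph is just for illustration):
<p>Don&rsquo;t miss the &ldquo;highly sought users&rdquo;.</p>
This renders as: Don’t miss the “highly sought users”. The references work regardless of the page's declared encoding.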

Usually these special characters appear when you copy the text from other sources like MS Word. This can be solved by manually typing the inverted commas when entering or modifying the text in the database.

Related

Words showing differently in the browsers

I am working on a site which has some Norwegian words. When I used "På" inside a <span>, it showed up as "PÃ¥" in the browser. This is happening only for a particular page; for other pages it works fine. I have tried copy-pasting from other working pages, but it had no effect - it still shows "PÃ¥" instead of "På". Why is this happening?
You need to use &aring; instead of å.
See this link for HTML codes:
http://www.ascii.cl/htmlcodes.htm
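For example, a small sketch (the span is only there for illustration):
<span>P&aring;</span>
This renders as "På" even if the page encoding is misconfigured.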
Try converting your special characters to equivalent HTML entities using this converter
The character encoding of the page is wrong: the real encoding differs from the declared encoding. Using entity references for all non-ASCII characters would hide the symptoms (with the attendant risk that later on, when someone inserts an “å” directly, things go wrong again). But the real solution is to remove the conflict.
Check out the tutorial Declaring character encodings in HTML. If you need further help with this, posting the URL (not just copy of all code) is essential.

How to make the website show signs like "č" and "ć"?

I'm making a website that is in Croatian, and I need to use signs like: "č", "ć", "ž", "đ" and "š". They are currently displayed as little boxes.
Info:
I use Notepad++.
I set the encoding there to UTF-8.
I put the following line of HTML in: <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
However, it does not work. Even Notepad++ can't display my characters using UTF-8, so that would suggest that I should probably use something else...
http://webdesign.maratz.com/lab/utf_table/
Use HTML entities, for example
č : &#269;
ž : &#382;
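The remaining characters from the question, for completeness (decimal references):
ć : &#263;
đ : &#273;
š : &#353;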
This sounds more like a font issue than a character encoding issue. If it were a character encoding issue, the characters would most likely be displayed as 2+ ASCII characters. The boxes, however, typically mean the character encoding is correct, but that specific character is not available in the font being used (which is especially common with lesser-used fonts). This would explain why it's behaving incorrectly in both the website and Notepad++.
To fix the issue, simply use a different font in your editor and website.
Note: I recommend a widely used font for the best chance of it working. Specifying a generic name in the website (e.g. serif or sans-serif) will probably have even better results, as the OS/browser would decide on the best font to use.
In short, be consistent about your character encoding throughout.
Configure your editor to save in the encoding you want
If you use any server side programming, make sure it isn't transcoding your data
If you use a database, make sure it is configured to use the same encoding
Configure your server to emit a Content-Type header that specifies that encoding
Use the meta tag in your question
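Putting the checklist together, a minimal sketch of a complete static UTF-8 page for the characters in the question (no server-side code or database assumed):
<!DOCTYPE html>
<html lang="hr">
<head>
  <meta charset="utf-8">
  <title>Test</title>
</head>
<body>
  <p>č ć ž đ š</p>
</body>
</html>
The file itself must still be saved as UTF-8 in the editor, as noted above.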
The W3C provides useful material on encodings that starts here.
A useful site for special characters and their character codes: CopyPaste Character
To 'type' them, use the Alt codes.
However, to use them in your site, it's better to use the HTML codes you can find on CPC.
As a test, try this:
<span style="font-family:Arial Unicode MS">
č ć ž đ š
</span>
You should be able to see your characters correctly.
I've just copied and pasted a line from your question along with your meta tag, placed it into a plain text file in vi.
It works just fine - all characters are displayed fine: http://www.dusystems.com/tmp/1.html
If you can't do the same with your editor then the problem is with the editor and not character sets and encodings.
If you're on Windows you can use its built-in Notepad to edit UTF-8 files. Open Notepad, type all of your special characters, add the meta tag. When doing Save As select UTF-8 from the Encoding drop-down in the dialog. Save as something.html and open in IE. It will 100% work.

HTML Unicode Issue: How to display special characters

Currently, I have my webpage set to Unicode/UTF-8. When trying to display a special character (for example, em dash, double arrow, etc), it shows up as a question mark symbol. I cannot change these characters to the HTML entity equivalent. How can I circumvent this issue?
A question mark in a lozenge, �, indicates a character-level error: the data contains bytes that do not represent any character, according to the character encoding being applied. This typically happens when the document is declared as UTF-8 encoded but is really in ISO-8859-1, Windows-1252, or some similar encoding. Windows-1252 is a common default encoding used by various programs on Windows platforms. So you may need to open the file in your authoring program and re-save it as UTF-8 encoded.
If problems remain, please post the URL. Posting the code alone is not sufficient, since the character encoding is primarily specified in HTTP headers.
If you see a question mark in a small box, then it might be a font-level problem (lack of glyph in the fonts being used), but this would be very rare for common characters like the em dash. Different browsers have different ways of indicating character- or font-level problems.
Make sure your document is set to the correct character encoding in the actual code editor, as well as in the document head (the meta tag). Both are necessary. I spent hours trying to tweak the HTML when the only problem was that I needed to set the text encoding in Coda.
<head>
  <meta charset="utf-8">
</head>
Make sure your characters are actually UTF-8 characters. They will probably look something like this:
&#174; or U+0020
http://www.kinsmancreative.com/transfer/char/index.php is a handy site for finding the decimal values of commonly used UTF-8 special characters if you need a reference.
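For instance, the em dash mentioned in the question can be written with its decimal reference (a small sketch; the sentence is just filler text):
<p>Fast pages &#8212; even on slow connections.</p>
This renders as: Fast pages — even on slow connections.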

Are unicode characters better or more semantic than the simple text versions?

When I copy/paste text from most sites and pdfs, the following characters are almost always in the unicode equivalent:
double quote: " is “ and ” (&ldquo; and &rdquo;)
single quote: ' is ‘ and ’ (&lsquo; and &rsquo;)
ellipsis: ... is … (&hellip;)
I understand ones that can't be represented without unicode like © and ¢, but even for those, I wonder.
When should you use these unicode equivalents? Are they more semantic than not using them? Are they better interpreted by devices (copy/paste/print)? I always find it annoying getting those quote and ellipsis characters because with textmate + programming, you don't use them.
When should you use these unicode equivalents? Are they more semantic than not using them?
Note that these are not “unicode equivalents”. Those characters are available in many character sets other than Unicode, and they are strictly distinct from the alternatives that you propose.
In typography, the left and right versions of the single and double quotation marks are correct. They provide the traditional appearance for those characters that has been used in print media for many years. The ellipsis character provides the correct spacing for an ellipsis that does not naturally occur when using consecutive full stop characters. So the reason all of these are used is to make the text appear correctly to human readers.
Are they better interpreted by devices (copy/paste/print)?
Any system that uses any character set should be designed to correctly handle that character set. If the text is encoded in Unicode, then any recent system (from the last 15 years at least) should be able to handle it, since Unicode is the de facto standard character set for all modern systems.
Not all Unicode-conformant systems will be able to display all characters correctly. This will depend on the fonts available, and even the rendering system that uses the fonts. But any Unicode-conformant system will be able to transmit the characters unaltered (such as in a copy and paste operation).
I always find it annoying getting those quote and ellipsis characters because with textmate + programming, you don't use them.
It is unusual to copy English (or whatever language) text directly into a program without having to add separate delimiters to that text. But most modern programming languages will not have any difficulty handling the text once it is properly delimited.
Any systems that cannot handle Unicode correctly should be updated. Legacy character encodings will have no place in the future.
I think there's a simple explanation: MS Word converts these characters/sequences automatically as you type, and a lot of the text on the internet has been copied from that editor.
Most of the articles I get for my site from other authors are sent as .doc files and I have to convert them. Usually, they contain the characters you've mentioned.
I'd also add one more: the many different types of dashes used instead of the hyphen, and also the low opening double quote (as seen in some European languages).
I usually let them stay in the text (all my pages are unicode). It's just important to remember it when playing around with regex etc (especially the dashes can be tricky and hard to spot).
HTML entities serve a triple purpose:
Being able to use characters that do not belong to the document character set, e.g., inserting a euro symbol in an ISO-8859-1 document.
Escape characters that have a special meaning in HTML, such as angle brackets.
Make it easier to type characters that are not in your keyboard or are not supported by your editor, e.g. a copyright symbol.
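A minimal sketch illustrating all three uses (the text content is just for illustration):
<!-- 1. A character outside the document character set, e.g. the euro sign in an ISO-8859-1 page -->
<p>Price: 100 &euro;</p>
<!-- 2. Escaping characters that have special meaning in HTML -->
<p>Use the &lt;em&gt; element for emphasis.</p>
<!-- 3. A character that is awkward to type on most keyboards -->
<p>&copy; Example Inc.</p>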
Update:
My info is correct but I suspect I've answered the wrong question...
On the web, I would consider that markup adds semantic meaning, content does not. So it doesn't really matter which you use in this context.
Typographers would insist on “ and ”, whereas programmers don't care and just use regular old quotes ".
The key here is interoperability. There are different encoding schemes. As we've all experienced, people paste content into an editor from Word, which typically uses Windows-1252 encoding. When you serve this content up via AJAX, it usually breaks because AJAX uses UTF-8 encoding by default.
Office 2010 now allows for the saving of documents in UTF-8 format. Also, databases have different unicode encoding schemes. The best bet is to use UTF-8 end-to-end.
When you copy-paste text that includes special characters, they will be left as they are. This is perfectly fine if the characters match the charset used by the webpage.
HTML entities are just a convenience for producing specific characters in any character set. Keyboards tend not to have keys to get symbols like ©, so the HTML entity is a shortcut.
I'm going to generalize and say that most of the time the content is UTF-8 (please correct me if I'm wrong). The copied characters are usually copied correctly and everything works great. If they aren't copied correctly, or the charset is subject to change, or you're after i18n support, go with the HTML or XML entities. Otherwise, leave them as they are; the browser will display them just fine.

Special HTML Characters

Ok, so I want to have the characters shown below (in the screenshot) in my HTML page. Seems easy, except I can't find the HTML encoding for them.
Note: I would like to do this without having sized elements; plain ol' text would be fine ^_^.
Cheers.
You can see the Unicode number of the selected character at the bottom of the picture ("U+266A: Eighth Note").
Simply use the last portion in a Unicode character entity: &#x266A; - ♪
If your page is already UTF-8, you can simply paste it in.
Try encoding it as &#9608; - that should do the trick!
In a UTF-8 encoded page, just copy and paste them as-is.
Otherwise, use the number that the dialog gives you for each character, e.g. &#x266A;
However, when working with rather exotic characters, be very wary of font support. See e.g. this question for background: Unicode support in Web standard fonts
This page gives some information about support for the characters you want to use. They seem to be relatively well supported, but a test on Linux and Mac machines won't hurt.
Here is one comprehensive entity reference. If you want to convert symbols into their entity counterparts, I suggest using this converter.
My suggestion is to use a hexadecimal reference (it's easy, don't worry :) ).
For example, the first character you have highlighted in red has a character value of 175, which is AF in hex.
So in short you can encode it using %AF, and so on...
Is it clear, mate? Let me know if you need further explanation or help with this :)
Edit: my post is meant for URL encoding.