I am working on a site which has some Norwegian words. When I use "På" inside a <span>, it shows as "PÃ¥" in the browser. This happens only on one particular page; the others work fine. I have tried copy-pasting from the working pages, but it had no effect. It still shows "PÃ¥" instead of "På". Why is this happening?
You need to use &aring; instead of å.
See this link for HTML codes:
http://www.ascii.cl/htmlcodes.htm
Try converting your special characters to equivalent HTML entities using this converter
The character encoding of the page is wrong: the real encoding differs from the declared encoding. Using entity references for all non-Ascii characters would hide the symptoms (with the pertaining risk that later on, when someone inserts an “å”, things go wrong again). But the solution is to remove the conflict.
Check out the tutorial Declaring character encodings in HTML. If you need further help with this, posting the URL (not just copy of all code) is essential.
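To see how a mismatch between the declared and the real encoding produces exactly this symptom, here is a minimal Python sketch. It assumes the file was saved as UTF-8 and the browser decoded it as windows-1252; that is only a guess, since the page itself was never posted, and the variable names are purely illustrative.

```python
# Assumption: the page bytes are UTF-8 but get decoded as windows-1252.
original = "P\u00e5"                      # "På"

raw_bytes = original.encode("utf-8")      # b'P\xc3\xa5'
misread = raw_bytes.decode("cp1252")      # what the browser would render

print(misread)  # PÃ¥

# Reversing the wrong decode recovers the text, which supports the diagnosis.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired)  # På
```

If the round trip restores the text like this, the fix is to make the declared encoding match the encoding the file is actually saved in, rather than to patch individual characters.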
Related
I have got two HTML documents based on the same template. I built both exactly the same and then changed the contents inside the divs. I'm using the DTD HTML 4.01 Transitional and the charset ISO-8859-15 (for Spanish language, you know accents and so on) in a meta tag inside the head.
And when it comes to validation, one parses and the other doesn't, and I can't figure out why.
It complains about some accents in one of the documents that are also present in the other document which gets no complaints.
I find it funny, but there must be a reason.
I think I found the problem. I opened the file that was giving me trouble in plain Notepad, and there I could see strange characters like “ or ‰ in my code. I removed them and retyped the contents properly and, of course, it passed the parser. I could not see those characters when the file was open in Notepad++, which is why the parser error seemed so strange to me.
I didn't set the encoding in my Notepad++ to ANSI and maybe that was the reason I couldn't see those odd characters.
I can't believe what I'm seeing here! I have a normal, basic HTML form (I haven't changed the enctype). If someone puts a strange Japanese character in the field and posts the form, my database saves an HTML-encoded version of the character. I am not processing the string at all except with a Trim(). I'm using classic ASP (not out of choice, I might add!). I have a feeling this might have something to do with UTF-8/encoding, but I've tried messing around with the meta tag and content type and have been unable to get the character to come through properly. To make things harder, I don't seem to be able to get classic ASP debugging working in VS Express 2010. Any comments appreciated :)
As you can see in this demo and read in the standard (4.10.22.6.4.2), characters that are not supported by the selected encoding (such as Japanese ones in an ISO-8859-* or cp1252 encoding) are encoded as HTML entities.
If you are fine with incorrectly handling user input that contains HTML entities in the clear, you can replace all numeric HTML entities in the user input with the corresponding Unicode character (however, doing so in ASP is hard since there is no inverse function to Server.HTMLEncode and Unicode support is pretty much nonexistent in the first place).
As an alternative, use UTF-8 (and/or a web development platform from this millennium) and all these problems go away. However, since that may not be an option, you may want to unescape the HTML entities in a different program, for example with HttpUtility.HtmlDecode in C#, html_entity_decode in PHP, or HTMLParser.unescape in Python (html.unescape in Python 3).
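As a rough illustration of both halves of this (the browser's fallback and the server-side unescape), here is a small Python sketch. It only approximates the form submission with Python's "xmlcharrefreplace" error handler, and it assumes the form encoding is windows-1252; the sample input is made up.

```python
import html

# Hypothetical user input containing Japanese characters.
user_input = "\u65e5\u672c"  # "日本"

# Approximate what a browser submits when the form encoding is windows-1252:
# characters outside that encoding become numeric character references.
submitted = user_input.encode("cp1252", errors="xmlcharrefreplace")
print(submitted)  # b'&#26085;&#26412;'

# Server side: decode the bytes, then turn the references back into characters.
restored = html.unescape(submitted.decode("cp1252"))
print(restored == user_input)  # True
```

Note that the unescape step also converts any entity the user typed on purpose, which is the "incorrect handling" caveat mentioned above.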
Ok, so I want to have the characters from below in my html page. Seems easy, except I can't find the HTML encoding for them.
Note: I would like to do this without having sized elements, plain ol' text would be fine ^_^.
Cheers.
You can see that they have a unicode number of the selected character - at the bottom of the picture ("U+266A: Eighth Note").
Simply use the last portion in a Unicode character reference: &#x266A; - ♪
If your page is already UTF-8, you can simply paste it in.
Try encoding it as &#9608; - that should do the trick!
In a UTF-8 encoded page, just copy and paste them as-is.
Otherwise, use the number that the dialog gives you for each character, e.g. &#x266A;
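If you have many such characters, the references can be generated instead of looked up one by one. A minimal Python sketch follows; to_char_refs is a hypothetical helper name, not a library function, and it simply writes every non-ASCII character as a hexadecimal reference built from its code point.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)

print(to_char_refs("\u266a"))         # &#x266A;
print(to_char_refs("note: \u266a"))   # note: &#x266A;
```

The result is plain ASCII, so it displays the same no matter which encoding the page declares, although font support still applies.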
However, when working with rather exotic characters, be very wary of font support. See e.g. this question for background: Unicode support in Web standard fonts
This page gives some information about support for the characters you want to use. They seem to be relatively well supported, but a test on Linux and Mac machines won't hurt.
Here is one comprehensive entity reference. If you want to convert symbols into their entity counterparts, I suggest using this converter.
My suggestion is to use a hexadecimal reference (it's easy, don't worry :) ).
For example, the first character you have highlighted in red has the character code 175, which is AF in hex.
So in short you can encode it using %AF, and so on...
is it clear mate? Let me know if you need further explanation or help about this :)
Edit: my post is meant for url encoding.
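For completeness, here is what that percent-encoding looks like in Python. This applies to URLs only, not to HTML markup, and it assumes code 175 is meant in a single-byte encoding such as Latin-1; with UTF-8 the same character takes two bytes.

```python
from urllib.parse import quote, unquote

ch = chr(175)  # code point 175 (0xAF); the glyph shown depends on the code page

print(quote(ch, encoding="latin-1"))             # %AF
print(quote(ch, encoding="utf-8"))               # %C2%AF
print(unquote("%AF", encoding="latin-1") == ch)  # True
```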
I've been asked to add a testimonial to this page...
http://www.orchardkitchens.com/Showroom/testimonials.html
As you will see there are funny characters showing up all over the place, and it has thrown the structure of the page out.
I've since reloaded the backup and the funny chars are still appearing. Any ideas what I need to do??
Please ask if you need more info from me about the problem in hand.
Many thanks,
ETFairfax.
It looks to me as though some of the text was encoded as UTF-8 yet loaded as if it were an ANSI charset, and then an HTML encode was run over it, resulting in these extra characters. You will need to find the source text and rebuild the HTML, ensuring that whatever reads the source text understands that it is in UTF-8 encoding.
Valid HTML might be a start; an HTML document shouldn't start with a meta tag directly. Also, it seems that the charset problem is not with your web page but rather in the backend code. Look at the source; there are numerous things such as
&acirc;&euro;&oelig;
appearing, which are HTML character entities for what UTF-8 encoding yields when interpreted as Latin 1. So you should probably fix your code instead of the HTML (well, that too).
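The two-step damage described in these answers can be reproduced in a few lines of Python. This is only a sketch under assumptions: that the source character was a curly opening quote, that the wrong charset was windows-1252, and that the HTML-encode pass emitted named entities. The backend's real code is unknown.

```python
import html
from html.entities import codepoint2name

original = "\u201c"  # left double quotation mark, as Word inserts it

# Step 1 (assumed): UTF-8 bytes get read as windows-1252.
misread = original.encode("utf-8").decode("cp1252")

# Step 2 (assumed): an HTML-encode pass turns each character into a named entity.
entities = "".join(f"&{codepoint2name[ord(c)]};" for c in misread)
print(entities)  # &acirc;&euro;&oelig;

# Undo both steps in reverse order.
recovered = html.unescape(entities).encode("cp1252").decode("utf-8")
print(recovered == original)  # True
```

Reversing the steps like this can rescue already-damaged text, but the lasting fix is still to read the source as UTF-8 in the first place.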
Your HTML is syntactically invalid. The <!doctype> is missing, the <html> tag is missing, the <head> tag is missing, the meta information cannot be parsed reasonably by the webbrowser.
Fix your HTML first and then retry.
As to the character encoding story, just ensure that you're using one and the same character encoding everywhere: in the datastore, in the source files, in the response headers, et cetera. You may find the introductory text of this article useful to learn a bit more about character encodings. If you actually know/use Java, then you may find the proposed solutions useful as well.
I am reading in HTML from a file and displaying it on a web page:
When I look at in the source I see:
The Club’s summer junior programs
but it shows up as:
The Club�s summer junior program
What is happening here, and why is the � showing up?
Did you set the proper encoding of the html page?
Read here and here.
I'm guessing you (or someone close to you) are copy/pasting from Word and you are seeing the webby effects of Word's [not so] smart quotes. The workaround is to set the character encoding to utf-8 or windows-1252, whichever the file is actually saved in.
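A quick way to see where the � comes from is the Python sketch below. It assumes the file was saved as windows-1252 (where Word's curly apostrophe is the single byte 0x92) and then read as UTF-8; the real file may differ.

```python
# Assumption: the file is windows-1252, so the curly apostrophe is byte 0x92.
raw = "The Club\u2019s".encode("cp1252")   # b'The Club\x92s'

# Read as UTF-8, byte 0x92 is invalid and becomes U+FFFD, the replacement character.
as_utf8 = raw.decode("utf-8", errors="replace")
print(as_utf8)    # The Club�s

# Read with the encoding it was really saved in, the apostrophe survives.
as_cp1252 = raw.decode("cp1252")
print(as_cp1252)  # The Club’s
```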
This is definitely a character encoding issue. It means the page says it has X encoding, but actually it has Y.
A very interesting read by Joel on this topic: http://www.joelonsoftware.com/articles/Unicode.html. It is definitely a must-read if you haven't read it already.
It explains pretty well why these problems occur, how they came to be and how to avoid them :).
Maybe you have copied text from a word processor, like MS Word, which changes straight quotes into opening and closing quote characters. When such text is copied into a text file, it causes these problems.
A simple solution can be to type these quotes again in the text editor.