How do I get rid of this character? - html

This character:

shows up on my site three times, and in all three cases it appears right after a closing div tag. I searched the web and Stack Overflow and found some solutions, but none of them worked for me, so I decided to post here. I am using .NET. I realize this is not sufficient info, but I am new to programming, so I'm not sure what other info you might need. Please let me know. Thanks!

Looks like a byte order mark. Please check your source and output encoding.

Yes, it is the Byte Order Mark (BOM). It was driving me crazy too. I read up on the BOM and tried adding charset="UTF-8" to some script tags, but no go.
I use Dreamweaver and found that when I saved (Save As) some recent HTML files, the option "Include Unicode Signature (BOM)" was checked. I unchecked it and saved, and that got rid of the unwanted characters (I guess it saves the file without the BOM)!
Updating the meta tags' charset to UTF-8 will resolve this too and is recommended (which means dozens of pages for me), but I needed this quick fix.
Also, saving with Notepad++ looks to do the trick as well. Here's a related article on Notepad++ and its BOM settings: "Notepad++ converting ANSI encoded file to UTF-8".
I hope this helps someone!
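
For anyone without Dreamweaver or Notepad++ at hand, the same idea can be sketched in a few lines of Python. This is only a rough sketch, not what either editor does internally; the function name strip_utf8_bom and the sample page bytes are made up for illustration. The UTF-8 BOM is the three bytes EF BB BF, and decoded as Windows-1252 they turn into the three characters ï»¿.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical page content saved with a BOM in front of the markup.
page = codecs.BOM_UTF8 + b"<div>hello</div>"
cleaned = strip_utf8_bom(page)

# Read as Windows-1252, the three BOM bytes become three visible characters.
bom_as_cp1252 = codecs.BOM_UTF8.decode("cp1252")
```

To try it on a real file, read the file in binary mode, pass the bytes through the function, and write the result back in binary mode (on a copy first).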

I use Dreamweaver and found that when I saved (Save As) some recent HTML files, the option "Include Unicode Signature (BOM)" was checked. I unchecked it and saved, and that got rid of the unwanted characters (I guess it saves the file without the BOM)!
This is the perfect solution. It worked for me.
Thanks, everyone.

Related

Words showing differently in the browsers

I am working on a site which has some Norwegian words. When I use "På" inside a <span>, it shows as "PÃ¥" in the browser. This happens only on one particular page; the others work fine. I have tried copy-pasting from the working pages, but it had no effect. Why is this happening?
You need to use &aring; instead of å.
See this link for HTML codes:
http://www.ascii.cl/htmlcodes.htm
Try converting your special characters to equivalent HTML entities using this converter
The character encoding of the page is wrong: the real encoding differs from the declared encoding. Using entity references for all non-ASCII characters would hide the symptoms (with the attendant risk that later on, when someone inserts an “å”, things go wrong again). But the real solution is to remove the conflict.
Check out the tutorial Declaring character encodings in HTML. If you need further help with this, posting the URL (not just copy of all code) is essential.
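
The mismatch between real and declared encoding can be reproduced in a few lines of Python. This is just an illustrative sketch, assuming the page is actually stored as UTF-8 while the browser is told (or guesses) ISO-8859-1; the variable names are made up.

```python
original = "På"

# The file on disk really contains UTF-8: "å" is stored as two bytes, C3 A5.
utf8_bytes = original.encode("utf-8")

# If those bytes are interpreted as ISO-8859-1 (Latin-1), each byte becomes
# its own character, which is the garbage seen in the browser.
misread = utf8_bytes.decode("latin-1")

# Interpreted with the encoding they were written in, the text is intact.
correct = utf8_bytes.decode("utf-8")

print(misread)  # PÃ¥
print(correct)  # På
```

So the bytes themselves are fine; only the label is wrong, which is why making the declared encoding match the real one fixes it without touching the text.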

joomla generated page vs static html getting unknown character in transition

I have been tasked with cleaning up a very messy site, http://www.investravel.com/, built in Joomla. I first copied the entire output source to a static HTML file, http://www.investravel.com/test.html, but I am getting the unknown-character symbol repeated throughout the HTML copy.
Does anybody have any idea why that might be? I find it quite curious, given that both should present the same source to the browser.
It might be worth noting there are two
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
in the original, both spelt slightly differently. I have removed both and added the correct W3C version but still to no avail.
Any help much appreciated.
I just tried saving it with Firefox and it saved everything in UTF-8.
The way I did it was:
Go to the "View" menu, select "Character Encoding", and make sure "Unicode (UTF-8)" is selected (after forcing the encoding, check that all characters are correct; I tried that encoding and at first glance everything seems right).
Then save the page as HTML and open it; all should be OK!
The reason your characters are wrong is probably that you had some other encoding forced; in your case I detected the Western (ISO-8859-1) encoding.
Those are encoded in the database, then they show up as the symbol once they make it to the browser. You will notice the same thing happens with things like the copyright symbol (in the database it is &copy;, but in the source it will show up as the actual symbol). You are not going to be able to make accurate copies of the pages as static HTML if they used a lot of smart quotes and other symbols.
Why would you want to take a dynamic site and make it static in the first place? That seems horribly inefficient.

Validation error: "Byte-Order Mark found in UTF-8 File"

I'm working on a website and, while displaying it on Firefox is fine, on Internet Explorer I've got a lot of problems. I used the W3C validator and I got a lot of strange errors.
Here's the link to the website: http://misenplacecatering.it/
The first validation error, which I think is the most relevant, is this:
Byte-Order Mark found in UTF-8 File. The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to cause problems for some text editors and older browsers. You may want to consider avoiding its use until it is better supported.
and
Line 1, Column 1: Non-space characters found without seeing a doctype first. Expected <!DOCTYPE HTML>.
I've read other questions about this issue, so I tried opening the file with different editors (I always use Vim, anyway), but I don't see any space or anything else before the doctype definition. I even used Notepad++ and its option to remove the BOM, but nothing changed.
How can I fix it?
If using Notepad++, use Convert to UTF-8 without BOM.
If you are using PHP, make sure that any included/required file is either in ASCII or in UTF-8 without a BOM, as PHP doesn't handle non-ASCII files very well (this one gave me a headache once).
You could try converting your files to ASCII, if you don't need any non-ASCII characters.
In your <meta> tag's charset attribute, try writing the value within quotes.
The free text editor PSPad has a hex editing mode which is very handy for seeing exactly what you really have in your text files.
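
If you don't have a hex editor around, a small Python sketch can show the same thing. This is only an illustration, not a replacement for PSPad; the helper name describe_start and the sample bytes are made up. It prints the first few bytes in hex and says whether they begin with the UTF-8 BOM (EF BB BF).

```python
import codecs


def describe_start(data: bytes, count: int = 8) -> str:
    """Show the first `count` bytes as hex and note whether a UTF-8 BOM leads them."""
    head = data[:count]
    hex_part = " ".join(f"{b:02x}" for b in head)
    if data.startswith(codecs.BOM_UTF8):
        note = "UTF-8 BOM present"
    else:
        note = "no UTF-8 BOM"
    return f"{hex_part} ({note})"


# Hypothetical file contents, with and without a BOM before the doctype.
with_bom = describe_start(codecs.BOM_UTF8 + b"<!DOCTYPE HTML>")
without_bom = describe_start(b"<!DOCTYPE HTML>")

print(with_bom)
print(without_bom)
```

For a real file, read it with open(path, "rb") so the bytes are not decoded before you look at them. Keep in mind that with PHP includes the BOM may come from an included file rather than the main one, so each file would need checking.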

HTML Encoding Charset Problem I think?

I've been asked to add a testimonial to this page...
http://www.orchardkitchens.com/Showroom/testimonials.html
As you will see there are funny characters showing up all over the place, and it has thrown the structure of the page out.
I've since reloaded the backup and the funny chars are still appearing. Any ideas what I need to do??
Please ask if you need more info from me about the problem in hand.
Many thanks,
ETFairfax.
Looks to me as though some of the text was encoded as UTF-8 yet loaded as if it were in an ANSI charset, and then an HTML encode was run over it, resulting in these extra characters. You will need to find the source text and rebuild the HTML, ensuring that whatever reads the source text understands that it's in UTF-8 encoding.
Valid HTML might be a start; an HTML document shouldn't start with a meta tag directly. Also, it seems that the charset problem is not with your web page but rather in the backend code. Look at the source; there are numerous things such as
“
appearing which are HTML character entities for things that UTF-8 encoding yields when interpreted as Latin 1. So you should probably fix your code instead of the HTML (well, that too).
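
Here is a rough Python sketch of how that kind of damage can come about, using a single curly quote as a made-up example (I'm not claiming this is the exact text on the page). It assumes the bytes were read as Windows-1252, which is what browsers actually use for pages labelled Latin-1, and the helper to_entities is hypothetical.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace each non-ASCII character that has a named HTML entity with that entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append(f"&{name};")
        else:
            out.append(ch)
    return "".join(out)


quote = "\u201c"  # left curly double quote

# Step 1: UTF-8 bytes (E2 80 9C) wrongly read as Windows-1252 give three characters.
misread = quote.encode("utf-8").decode("cp1252")

# Step 2: an HTML encode over the damaged text bakes the mistake into the markup.
as_entities = to_entities(misread)

# Reversing step 1 recovers the original character, as long as no bytes were lost.
repaired = misread.encode("cp1252").decode("utf-8")

print(as_entities)  # &acirc;&euro;&oelig;
```

Once the entities are in the stored HTML, changing the page's charset won't help, which is why the fix belongs in whatever code reads the source text.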
Your HTML is syntactically invalid. The <!doctype> is missing, the <html> tag is missing, and the <head> tag is missing, so the meta information cannot be parsed reasonably by the web browser.
Fix your HTML first and then retry.
As to the character encoding story, just ensure that you're using one and the same character encoding everywhere: in the datastore, in the source files, in the response headers, and so on. You may find the introductory text of this article useful to learn a bit more about character encodings. If you actually know/use Java, then you may find the proposed solutions useful as well.

apostrophes coming in as �

I am reading in HTML from a file and displaying it on a web page:
When I look at it in the source I see:
The Club’s summer junior programs
but it shows up as:
The Club�s summer junior program
What is happening here, and why is the � showing up?
Did you set the proper encoding of the html page?
Read here and here.
I'm guessing you (or someone close to you) are copy/pasting from Word and you are seeing the webby effects of Word's [not so] smart quotes. The workaround is to set the character encoding to utf-8 or windows-1252, whichever the file is actually saved in.
This is definitely a character encoding issue. It means the page says it has X encoding, but actually it has Y.
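
A small Python sketch of one way this can happen, assuming the file was saved as Windows-1252 (typical for text pasted from Word) while the page is served as UTF-8. That is a guess about the setup, not something visible in the question.

```python
text = "The Club\u2019s summer junior programs"  # \u2019 is the curly apostrophe

# Saved as Windows-1252, the curly apostrophe is the single byte 0x92.
cp1252_bytes = text.encode("cp1252")

# 0x92 on its own is not valid UTF-8, so a UTF-8 decoder substitutes
# U+FFFD, the replacement character that shows up as the question mark diamond.
shown = cp1252_bytes.decode("utf-8", errors="replace")

# Decoded with the encoding the file was really saved in, the text is fine.
fixed = cp1252_bytes.decode("cp1252")

print(shown)
print(fixed)
```

Either declare windows-1252 to match the file, or re-save the file as UTF-8 to match the declaration.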
A very interesting read by Joel on this topic: http://www.joelonsoftware.com/articles/Unicode.html. Definitely a must-read if you haven't read it already.
It explains pretty well why these problems occur, how they came to be, and how to avoid them :).
Maybe you have copied text from a word processor, like MS Word, which changes straight quotes into opening and closing quote characters. When such text is copied into a text file, it causes these problems.
A simple solution can be to type these quotes again in the text editor.