Browser does not follow the page charset - html

I defined a webpage to use iso-8859-1 like the following:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
But when I open the page in the browser, the browser uses UTF-8 to read the page. Why doesn't the browser follow the page charset?

If you have access to your Apache config, look in httpd.conf (or its equivalent) for the following directive:
AddDefaultCharset UTF-8
According to the Apache docs, this overrides the meta declaration that you set: http://httpd.apache.org/docs/2.0/mod/core.html#adddefaultcharset
You could turn it off by replacing the directive with this:
AddDefaultCharset Off

The information that really matters is the actual Content-Type HTTP header sent by the web server. You can inspect it with Firebug or a similar tool. <meta> tags should only matter if you save the file to disk and the HTTP header is lost.
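As a minimal sketch of what "inspecting the header" boils down to, the following Python snippet extracts the charset parameter from a Content-Type header value using only the standard library. The function name charset_from_content_type is made up for illustration and is not part of any tool mentioned above.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# A header as sent by a server configured with AddDefaultCharset UTF-8
print(charset_from_content_type("text/html; charset=UTF-8"))
# A header with no charset parameter, as with AddDefaultCharset Off
print(charset_from_content_type("text/html"))
```

For a live page, the headers object returned by urllib.request.urlopen offers the same get_content_charset() method, so the check can be run against a real response as well.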

Related

how to prevent server I don't own from sending charset=UTF-8 in the http request

I have an old web site in French that I want to preserve and whose HTML files were encoded in iso-8859-1. All HTML files included
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
in the <head> element, however the host of my website changed something in the configuration and now pages are sent from their server with an HTTP header including
content-type: text/html; charset=UTF-8
and unfortunately someone decided this would override the <meta> information.
Do I have to transcode all my HTML files to UTF-8 or is there a faster solution?
Update
In fact the charset was added to the HTTP header's Content-Type field only for HTML content generated by PHP, not for pure HTML files. I'll put the solution I adopted as an answer.
Your options:
Transcode the files
Persuade whoever changed the server configuration to change it again
Change servers
Run every request through a server-side script which outputs a different Content-Type header and then outputs the HTML (while accounting for cache-control headers)
Took me a while to realize the problem occurs only for .php files. The fix I chose is the following: I added the line
ini_set('default_charset', NULL);
at the beginning of every PHP file. A bit tedious, but it seems reasonable to me.

hebrew characters don't show in "UTF-8 without BOM" only "UTF-8"

My html document starts as follows:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
אבגד
If I encode my document as UTF-8, it appears correctly in the browser. If I encode as UTF-8 without BOM (which I understand is more standard) I get unusual characters.
What am I doing wrong?
Your web server is declaring that the encoding is ISO-8859-1, and the browser is respecting that. Ironically enough, using a byte order mark sends a stronger signal to the browser that the encoding must actually be UTF-8. (The exact reason for this is complicated and boring.)
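To make the effect concrete, here is a small Python sketch (not taken from the question) that shows what happens when UTF-8 bytes are read as ISO-8859-1, and what the byte order mark actually adds. Note that browsers treat the ISO-8859-1 label as windows-1252, so the exact garbage shown in a browser can differ slightly from what Python's iso-8859-1 codec produces.

```python
import codecs

text = "אבגד"

utf8_bytes = text.encode("utf-8")      # "UTF-8 without BOM"
with_bom = text.encode("utf-8-sig")    # "UTF-8" with a leading BOM

# Each Hebrew letter is two bytes in UTF-8; read as ISO-8859-1,
# every byte becomes its own (wrong) character.
misread = utf8_bytes.decode("iso-8859-1")
print(len(text), len(misread))
print(repr(misread))

# The BOM is just three extra bytes in front of the same content.
print(with_bom.startswith(codecs.BOM_UTF8))
print(with_bom[len(codecs.BOM_UTF8):] == utf8_bytes)
```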
Fixing your web server depends on what the server is. If this is a static resource on disk served by Apache httpd, then something like AddCharset UTF-8 .html will add the header.
If this resource is served dynamically, then you should make sure you add the proper HTTP headers when producing the response, something like self.send_header('Content-Type', 'text/html; charset=utf-8') for Python's basic http server.
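For the dynamic case, a minimal sketch with Python's standard http.server might look like the following. The names Utf8Handler, PAGE and make_server are invented for this example, and the page content simply mirrors the question.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = (
    "<!DOCTYPE html><html><head><meta charset=\"UTF-8\"></head>"
    "<body>אבגד</body></html>"
)


class Utf8Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGE.encode("utf-8")
        self.send_response(200)
        # Declare the encoding in the HTTP header, matching the bytes we send.
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; remove to see request logs


def make_server(port=0):
    """Create (but do not start) a server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), Utf8Handler)
```

Calling make_server(8000).serve_forever() would serve the page on port 8000; the important part is that the declared charset and the actual byte encoding of the body agree.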

How to set a charset header for a html page?

Even though I use the meta tag below to set the content type and charset, I am not seeing the charset header in the Firefox Firebug debugger.
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
Any help is appreciated.
The meta tag does not affect the HTTP headers sent. (Long ago, it was kind of meant to do such things, but apart from some forgotten experiments, it never did.) It specifies the encoding to be assumed if the HTTP headers do not specify one; so it's really not equivalent to an HTTP header (as the name "http-equiv" suggests) but a replacement, a surrogate, an ersatz for an HTTP header.
The way to set the HTTP headers depends on the server software and its settings.
But if the headers do not specify the encoding, then the meta tag takes effect. You can check via the View → Encoding menu in Firefox which encoding is being applied.
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
This is how you set the charset for HTML files; there is nothing wrong with it.
Why would you use Firebug to check the charset? Just right-click the page, select View Page Info from the context menu, and it will show you the page charset.

How to set the "Content-Type ... charset" in the request header using a HTML link

I have a simple HTML-page with a UTF-8 encoded link.
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<a charset='UTF-8' href='http://server/search?q=%C3%BC'>search for "ü"</a>
</body>
</html>
However, I don't get the browser to include Content-Type:application/x-www-form-urlencoded; charset=utf-8 into the request header. Therefore I have to configure the web server to assume all requests are UTF-8 encoded (URIEncoding="UTF-8" in the Tomcat server.xml file). But of course the admin won't let me do that in the production environment (WebSphere).
I know it's quite easy to achieve using Ajax, but how can I control the request header when using standard HTML links? The charset attribute doesn't seem to work for me (tested in Internet Explorer 8 and Firefox 3.5)
The second part of the required solution would be to set the URL encoding when changing an IFrame's document.location using JavaScript.
This is not possible from HTML alone.
The closest you can get is the accept-charset attribute of the <form> element. Only Internet Explorer adheres to it, and even then it gets it wrong (e.g., CP-1252 is actually being used when it says that it has sent ISO-8859-1). Other browsers ignore it entirely and use the charset specified in the Content-Type header of the response.
Setting the character encoding right is basically fully the responsibility of the server side. The client side should just send it back in the same charset as the server has sent the response in.
In short, you should configure character encoding entirely on the server side. To overcome the inability to edit the URIEncoding attribute, someone here on Stack Overflow wrote a (complex) filter: Detect the URI encoding automatically in Tomcat. You may find it useful as well (note: I haven't tested it).
Note that the meta tag given in your question is ignored when the content is transferred over HTTP. Instead, the HTTP response Content-Type header is used to determine the content type and character encoding. You can inspect the HTTP headers with, for example, Firebug, in the Net panel.
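To illustrate why the encoding the server assumes for the URL matters, here is a short Python sketch that decodes the percent-encoded query value from the question under two different assumptions. It only uses urllib.parse from the standard library and does not model Tomcat or WebSphere in any way.

```python
from urllib.parse import quote, unquote

# "ü" percent-encoded from its UTF-8 bytes, as in the link in the question
encoded = quote("ü", encoding="utf-8")
print(encoded)

# A server that assumes UTF-8 recovers the original character
print(unquote(encoded, encoding="utf-8"))

# A server that assumes ISO-8859-1 sees two characters instead of one
print(unquote(encoded, encoding="iso-8859-1"))
```

The bytes on the wire are identical in both cases; only the server's assumption about how to interpret them differs, which is why the fix has to happen on the server side.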

Webview doesn't show Æ Ø Å properly

I have some content on a webpage which contains æ ø å, but my webview can't show them properly.
Does anyone know what the problem might be?
In order to use UTF-8 characters inside an (X)HTML page you declare the encoding with this meta tag (in the head section of the page):
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
If that alone does not work you may be able to find more useful information here.
You need to ensure that the HTML file is saved as UTF-8 and that the Content-Type header in the HTTP response contains the proper charset. You can verify the headers with Firebug, among other tools.
A <meta> tag for Content-Type only takes effect when the charset is absent from the Content-Type header in the response, which is usually not the case when the HTML file is served over HTTP. However, its presence is useful for offline viewing and for self-documentation.
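The precedence described above can be sketched roughly as follows. This is a deliberately simplified model, not a browser's real sniffing algorithm: the function name effective_encoding, the regular expression, the 1024-byte scan window and the windows-1252 fallback are all assumptions made for illustration.

```python
import codecs
import re
from email.message import Message

# Very rough pattern for both <meta charset=...> and the http-equiv form.
META_RE = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def effective_encoding(content_type_header, body, default="windows-1252"):
    """Guess which encoding a browser would apply to an HTML response body."""
    # 1. A UTF-8 byte order mark is the strongest signal.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # 2. Next comes the charset parameter of the HTTP Content-Type header.
    if content_type_header:
        msg = Message()
        msg["Content-Type"] = content_type_header
        charset = msg.get_content_charset()
        if charset:
            return charset
    # 3. Only then is a <meta> declaration near the top of the file consulted.
    match = META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Otherwise fall back to an assumed default.
    return default


page = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
print(effective_encoding("text/html; charset=ISO-8859-1", page))  # header wins
print(effective_encoding("text/html", page))                      # meta is used
```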