Charset changing when reloading or refreshing the page - html

I have an ASP.NET page with the encoding defined in the header that loads with the proper encoding only once, then shows the data with no encoding (ASCII?).
The HTML header is written like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<meta http-equiv="content-type" content="text/html" />
<meta http-equiv="charset" content="utf-8" />
The data that should appear as UTF-8 is fetched from an nvarchar column in SQL and put in a cookie in ASP.NET. When the page loads the first time, it appears OK:
Salé
Then for all other redirects / refreshes, it appears as:
SalÃ©
The charset meta is in the source when checking the bad page's source in a browser.
Is my header wrong? If not, I'll dig deeper into my code.
[EDIT]
I tried to change the header programmatically with a runat="server" in the <head>, but it wasn't working.
Dim meta As System.Web.UI.HtmlControls.HtmlMeta = New System.Web.UI.HtmlControls.HtmlMeta()
meta.HttpEquiv = "content-type"
meta.Content = Response.ContentType + "; charset=" + Response.ContentEncoding.HeaderName
Page.Header.Controls.Add(meta)
[EDIT]
Found a similar issue: UTF-8 problems with characters from MySQL database (e.g. é as Ã©)
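For reference, this kind of corruption (one accented character turning into two unrelated ones) typically arises when UTF-8 bytes are decoded as ISO-8859-1 or Windows-1252. Below is a minimal Python sketch, not ASP.NET, that reproduces it. It also shows percent-encoding as one common way to carry non-ASCII text through an ASCII-only channel such as a cookie value. The function names are hypothetical, and whether the cookie is the actual culprit on this page is an assumption, not something confirmed in the question.

```python
from urllib.parse import quote, unquote


def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def to_cookie_value(text: str) -> str:
    """Percent-encode text (as UTF-8) so the cookie value is plain ASCII."""
    return quote(text, safe="")


def from_cookie_value(value: str) -> str:
    """Reverse to_cookie_value: percent-decode and interpret as UTF-8."""
    return unquote(value)


original = "Sal\u00e9"                 # "Salé"
garbled = misread_as_latin1(original)  # the two-character corruption
cookie = to_cookie_value(original)     # ASCII-only representation

print(garbled)
print(cookie)
print(from_cookie_value(cookie))
```

If the value survives a round trip when percent-encoded but not when stored raw, the cookie handling is a likely place to look.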

Related

Changing Charset from Access database record

I have an MS Access database that contains many records. The ASP classic pages in the website that loaded records into the database were written years ago in HTML 4.01 transitional using charset iso-8859-1.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
There are some special characters (e.g. é) in some of the database fields. The pages that were coded at the same time as the database input pages display these characters correctly.
However, I have now added some mobile friendly pages to the site which are coded in HTML 5 and use the charset UTF-8.
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
Those pages, using the same data from the same database do not show the special characters correctly. They show a � instead.
I have tried re-coding the charset on the new pages to iso-8859-1 but that does not fix the problem. I have searched this forum and read pages like http://kunststube.net/frontback/ but cannot see where I am going wrong.
Could it be that the MS Access database holds the information in charset iso-8859-1 and I need to change it when I run the "select * from" command in ASP? If so how do I do that? Or am I way off track with that idea?
I know I could change all of the new pages and code them in HTML 4.01 transitional and that will work, but I was hoping to update the old ones in the fullness of time to HTML 5 rather than go backward.
OK I seem to have solved it by using
<%@ language=vbscript codepage=65001 %>

Why max-age is ignored?

I have a simple html page which starts like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta http-equiv="Cache-Control" content="public, must-revalidate">
<meta http-equiv="Cache-Control" content="max-age=88000" />
<script type="text/javascript" src="/js/index.js"></script>
....
However, when I check index.js file in FF web console, I see Cache-Control: "max-age=0". Why is that and how can I fix it? Thanks!
There is no reason to expect a meta tag in an HTML file to affect the HTTP headers sent for a JavaScript file that it refers to (or even the HTTP headers sent for the HTML file itself, for that matter).
The HTTP headers are set by the web server (or, more generally, HTTP server) software in use, possibly as affected by system-wide or directory-wide settings on the server. Long ago, the idea was that certain meta tags might affect the HTTP headers for the HTML document itself, but this was generally not implemented in servers. Instead, browsers may use some meta tags and act as if corresponding HTTP headers had been sent, but a) this only applies to the HTML document itself, if at all and b) it cannot be seen by tools that inspect the HTTP headers actually sent.
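To illustrate that the header comes from the server rather than from the HTML, here is a minimal sketch using Python's built-in http.server. The handler name, the response body and the helper function are all hypothetical; a real site would normally set this in the web server's configuration instead of in code like this.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class CachingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves a tiny script with a caching header."""

    def do_GET(self):
        body = b"console.log('hello');"
        self.send_response(200)
        self.send_header("Content-Type", "application/javascript; charset=utf-8")
        # The caching policy is set here, on the HTTP response itself.
        self.send_header("Cache-Control", "public, max-age=88000")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet: suppress the default request logging.
        pass


def fetch_cache_control() -> str:
    """Start the server on a free local port, request a file, return the header."""
    server = HTTPServer(("127.0.0.1", 0), CachingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/js/index.js"
        # Bypass any proxy configured in the environment for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.headers["Cache-Control"]
    finally:
        server.shutdown()
        server.server_close()


print(fetch_cache_control())
```

The value a tool such as the Firefox web console reports for index.js is whatever the server sent for that file, regardless of any meta tags in the page that references it.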

FB Thinks Meta Tags Are in Body

If you follow this link you will see that there is a paragraph tag in line 3 that should not be there.
This is the actual code that "causes" this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:fb="http://www.facebook.com/2008/fbml">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
I do not get why there would be a paragraph tag closing the head element and making FB believe that the meta tags are in the body instead... do you?
https://developers.facebook.com/tools/debug/
I used the Facebook Open Graph Debugger to solve the issue. On the very bottom there is a link called
Scraped URL See exactly what our scraper sees for your URL
This shows exactly what Facebook catches as code and there was a print of a paragraph-tag from one of the libraries I have been using. I can only recommend to debug it this way because I wasted a lot of time not doing so (because I did not see the link at the bottom of the page).

ù,é it turns in to �

I have a French website; below is my header.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
I am trying to put these characters ù,é but they turn into �
Please tell me why?
Thank you.
I am trying to put these characters ù,é but they turn into �
That is a pretty certain indicator that the text you output is not UTF-8 encoded as you say in the header. My guess would be it's ISO-8859-1 encoded.
This can be because:
- The HTML file you are editing isn't UTF-8 encoded. Save it as UTF-8 - the option for that is often in the "Save As..." dialog of your editor or IDE.
- The database connection you are getting the text from isn't UTF-8 encoded.
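A small Python sketch of why this produces �: bytes written as ISO-8859-1 are not valid UTF-8, so a decoder that was told to expect UTF-8 substitutes the replacement character U+FFFD. The function name is hypothetical, and the sketch only mimics what a browser does when it trusts the charset in the header.

```python
def render_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


text = "\u00f9,\u00e9"                     # "ù,é"
latin1_bytes = text.encode("iso-8859-1")   # how a Latin-1 editor saves it
utf8_bytes = text.encode("utf-8")          # how a UTF-8 editor saves it

print(render_as_utf8(latin1_bytes))  # accented letters become U+FFFD
print(render_as_utf8(utf8_bytes))    # round-trips unchanged
```

The same check works on a real file: read it in binary mode and try decoding it as UTF-8 without errors="replace"; a UnicodeDecodeError means the file was not saved as UTF-8.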
You need to save the HTML file in UTF-8 format. You can also add a lang="fr" attribute to your html tag.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
You can use the following code in head section
<meta http-equiv="encoding" content="text/html" />
I think it will work for you.

Html encoding defaults to "Western (ISO-8859-1)" locally

Let's say I have the following file called index.html in the directory D:\Experimental:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<head>
<title>Minimal XHTML 1.1 Document</title>
</head>
<body>
<p>This is a minimal XHTML 1.1 document.</p>
</body>
</html>
If I open the link
file:///D:/experimental/index.html
I get to see the HTML, but it seems that the character encoding defaults to Western (ISO-8859-1). I can see this when I click View -> Character Encoding in Firefox.
I want to display this in UTF-8 because Western (ISO-8859-1) doesn't display some characters correctly. Does anyone know how to fix this?
You should include:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
in your HEAD element.
Edit
I've just tried your example in Firefox on the Mac, and even without the meta tag, it correctly interprets the document as UTF-8. The Standard seems to indicate that it should use the XML processing instruction, but that you should also use the correct HTTP headers. Since you're not sending headers (because you're not using HTTP) you can specify them with the meta tag.
Maybe try adding
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
in <head> section?
When loading files from disk, your browser does not have an HTTP Content-Type header to read the encoding from, so it guesses. To guess the document encoding, it uses your operating system's current encoding, the actual bytes in the file, and information inside the file itself.
As Jonathan wrote, you can add a
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
element that will help the browser use the correct encoding. Note, however, that this element will often be ignored by browsers if your document is sent from a misconfigured HTTP server that explicitly specifies another encoding in the Content-Type header.
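As a rough illustration of that precedence, here is a simplified Python sketch: a charset in the HTTP Content-Type header wins, then the meta element, then a fallback guess. The function names and the windows-1252 fallback are assumptions made for the example; real browsers also look at byte order marks and apply locale-dependent heuristics that are not modelled here.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type style value, if any."""
    if not value:
        return None
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()  # lowercased charset, or None if absent


def effective_charset(http_header: Optional[str],
                      meta_content: Optional[str],
                      fallback: str = "windows-1252") -> str:
    """Pick the encoding: HTTP header first, then meta tag, then a guess."""
    return (charset_from_content_type(http_header)
            or charset_from_content_type(meta_content)
            or fallback)


# Local file: no HTTP header, so the meta element decides.
print(effective_charset(None, "text/html; charset=utf-8"))

# Misconfigured server: its header overrides the meta element.
print(effective_charset("text/html; charset=ISO-8859-1",
                        "text/html; charset=utf-8"))
```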