I have Thai characters in MySQL but they don't transfer correctly to my ASP-generated web page. I've linked to screenshots of the crucial factors, showing that the data is OK in the table but not on the website, as seen in the last screenshot. Any thoughts on what I'm doing wrong?
http://www.transum.com/Temp/ThaiScript.PNG
The first two pictures are from phpMyAdmin
The third picture is the header of my webpage.
The fourth picture shows what appears on the webpage.
I have already added the UTF-8 instruction to the connection:
Conn.execute ("SET NAMES utf8")
SQL = "select * from Phrases WHERE Checked = TRUE Order by English ASC"
set RSrecord = Conn.execute (SQL)
Response.CharSet = "utf-8"
Without seeing more of your source, my first question (or suggestion) is whether you are HTML encoding your output?
Response.Write Server.HTMLEncode(RSrecord.Fields("Thai_Script"))
If this doesn't work for you, can you show a bit more code?
I think I've worked this one out. First of all I should repeat that if you can possibly install MyODBC v5.1 then do so and don't bother trying to configure for 3.51. Unfortunately it would appear that the older driver is all that a certain large and well known web host seems to offer.
Here's a link to a version of my Russian Cyrillic page. It uses the old driver, (MyODBC 3.51)
http://clubdanceholidays.co.uk/aboutusruansi.asp
The crucial points to note are:
1) In the first line:
<%@ LANGUAGE="VBSCRIPT" CODEPAGE="1252" LCID="2057" %>
Use 1252 rather than 65001 as your codepage value - I realise just how counterintuitive this is, but in this specific situation it works.
2) Your line
Conn.execute ("SET NAMES utf8")
Adding this query immediately before the query to populate my recordset was the missing piece of the jigsaw for me.
3) The Charset meta tag
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
This should definitely be UTF-8. Essentially we're telling the server that it's a Windows-1252 page and the browser that it's a UTF-8 page.
4) What kind of encoding to use when you save the file
This is a little more complicated. If you're using Notepad then you need to use ANSI. The UTF-8 option is actually UTF-8 + BOM, which seems to create a conflict with the codepage definition in point 1. The problem with ANSI comes if you need to hardcode any non-Western characters into your page: you can't insert them as they are, you need to manually encode them first, e.g. for เครื่องบิน you would need to insert à¹€à¸„à¸£à¸·à¹ˆà¸­à¸‡à¸šà¸´à¸™
If you have a more powerful editor than Notepad, (I use Editplus - http://www.editplus.com/ - it's not free but it's quite cheap) you may find two UTF-8 options. Choose the one without BOM.
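To see where that hand-encoded string in point 4 comes from, here is a minimal Python sketch (Python is used purely for illustration; it is not part of the ASP setup). It assumes "ANSI" means Windows-1252 and simply reinterprets the UTF-8 bytes of the Thai word one byte per character.

```python
# Illustration only: UTF-8 bytes of a Thai word viewed through Windows-1252.
thai = "เครื่องบิน"

utf8_bytes = thai.encode("utf-8")       # each Thai character becomes 3 bytes
as_ansi = utf8_bytes.decode("cp1252")   # what an ANSI (Windows-1252) view shows

print(len(thai), len(utf8_bytes))       # 10 characters, 30 bytes
print(as_ansi)

# For this word the reinterpretation is lossless, so it can be reversed.
assert as_ansi.encode("cp1252").decode("utf-8") == thai

# The byte order mark that a "UTF-8 + BOM" save puts at the start of the file.
BOM = b"\xef\xbb\xbf"
```

This is also why the BOM matters: it is three extra bytes ahead of the page content, which a page processed as codepage 1252 does not expect.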
I'm trying to load data from a MySQL DB, from a varchar(35) / utf8_swedish_ci field, through TBS (TinyButStrong) and PHP, using the example (MySQL data merge). My issue is that data loads fine if only ASCII characters are in the fields, but as soon as I add a single Scandinavian special character like ö or ä, the field contents vanish entirely while the other fields in the row display correctly.
My understanding is that the latest versions of TBS automatically use UTF-8 encoding (I have 3.9.0 for PHP 5), so I assumed it would work out of the box. To be safe, I even added the encoding when loading the template, like so:
'$TBS->LoadTemplate('mysql.html','UTF-8');' but to no avail.
Could someone please advise what is causing this?
For UTF-8 to be processed correctly, every element of the chain must be UTF-8.
You have to ensure that your template is UTF-8 : check the entered text and the HTML element <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
You have to ensure that all your PHP scripts are UTF-8 and not Ansi.
You also have to ensure that your MySQL connection is set to receive UTF-8 queries and to return UTF-8 item data. This can be done for example by querying the SQL : SET NAMES 'UTF8'
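One possible explanation for a field vanishing entirely (an assumption, not something confirmed for TBS) is that the latin1 byte for ö is not valid UTF-8, and PHP's htmlspecialchars(), with its PHP 5 era defaults, returns an empty string when given invalid UTF-8. The Python sketch below only illustrates the byte-level mismatch; the sample value is made up.

```python
# Illustration only: a latin1-encoded value is not valid UTF-8.
value = "Göteborg"                       # made-up sample field content

latin1_bytes = value.encode("latin-1")   # what a latin1 connection would return
utf8_bytes = value.encode("utf-8")       # what a UTF-8 connection would return


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(latin1_bytes))   # False
print(is_valid_utf8(utf8_bytes))     # True
print(is_valid_utf8(b"Goteborg"))    # True: pure ASCII is valid in both
```

Pure ASCII content is identical in both encodings, which would match the observation that only rows containing ö or ä are affected.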
I have an ASP page. I am fetching data from a database. In the database the characters are all Russian, but when I fetch that data to show in the web page it renders as '?' marks.
oCommBM.Parameters.Append oCommBM.CreateParameter("@menu", adVarChar, adParamOutput,2000, "0")
I am passing the parameter like that. If I use adVarWChar instead of adVarChar, it shows the Russian characters, but the content below it is not rendered properly.
I checked by executing the stored procedure directly in the database. There it shows fine.
I also added the two lines below to the ASP page.
Response.codePage = 65001
Response.Charset = "UTF-8"
I have tried every possible encoding, both in the ASP page code and in Notepad++'s encoding setting.
Any suggestion is greatly appreciated. Thanks in advance.
First part: retrieving from the DB
When you use adVarWChar, the W is for "Wide", which means that this parameter is a Unicode string. Unicode supports all languages without problems, so it's the way to go.
When you use adVarChar, the string is encoded in some other way, but not in Unicode (it could depend on the server configuration, or something else; I don't know if you can control it. ASP is such an old technology!)
So your best bet is to retrieve Unicode from the DB and let ASP encode it for the browser.
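The '?' marks are what a lossy conversion to a code page without Cyrillic looks like. Here is a minimal Python sketch of that effect, assuming the non-Unicode (adVarChar) path ends up in Windows-1252; the real code page depends on the server, and the sample text is made up.

```python
# Illustration only: Cyrillic forced into a code page that has no Cyrillic.
text = "Привет"

lossy = text.encode("cp1252", errors="replace")
print(lossy)                 # b'??????' : the original letters are gone for good

# Kept as Unicode and encoded as UTF-8 only at output time, nothing is lost.
utf8 = text.encode("utf-8")
assert utf8.decode("utf-8") == text
```

Once the text has been turned into '?' there is nothing left to repair on the page side, which is why the fix belongs at the point where the value leaves the database.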
Second part: show in ASP page
Look at this: Classic ASP: How to write unicode string data in classic ASP?
This shows how to get ASP to encode the Unicode string in a way that the browser will show it correctly.
I am currently working on a site that connects to a DB and fetches information from it. Some of this information has special characters because it is in Polish. For example, in the database I have ę and I get e printed on my web page. I already added the meta tag
<meta charset="ISO-8859-2">
but it doesn't work. It only works if I write &#281; which is not practical and needs a lot of work. My question is whether somebody has managed this: getting the character, like ę, and printing it just like that?
Thanks.
Make sure that:
the data really is in ISO-8859-2
the data isn't being corrupted by the configuration of the database
the HTTP headers aren't claiming the data is encoded a different way
whatever you are using to pull the data out of the database isn't transcoding it
You should also ditch ISO-8859-2 (as it is very legacy) and move to UTF-8.
Use a numeric character reference: &#NNNN; where NNNN is the decimal Unicode code point of the character (for example &#281; for ę).
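If you do go the entity route, it does not have to be done by hand. A minimal Python sketch (the function name is hypothetical, and Python is used only to illustrate the idea; the same conversion can be done in whatever language generates the page):

```python
def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#NNNN; reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_numeric_entities("ę"))          # &#281;
print(to_numeric_entities("Dziękuję"))   # Dzi&#281;kuj&#281;
```

The result is plain ASCII, so it displays the same regardless of which charset the page declares.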
I am currently having an issue saving special characters (® and ©) to a MySQL database.
On a local stack (local to my development machine) the operation runs smoothly and what is entered in the front-end gets saved as what is expected.
With the same front-end code on the centralised server, when saving to the DB the character is preceded by another character: ® is saved as Â®.
This isn't an issue when viewing the record on the front-end as it reverses what it does and displays correctly.
The issue comes from another, separate system which uses the same database and doesn't do any alteration, so the value appears as Â® instead of simply ®.
The reason for my confusion is that it seems to work as required on my local stack but not on the centralised server.
I am looking for ideas as to what to look for which might be causing this on the servers configuration compared to my local.
Thanks in advance
Mark
@Pranav Hosangadi (thanks) covers three areas to check for consistency of encoding. The following solution adds to that. It may also be worth considering (a variation of) @Soaice Mircea's answer (thanks also) for scenarios where this answer doesn't fix the problem, although that wasn't required when I reproduced your problem and found a solution. @Pranav's line of thinking seems to be the right one here, as the point is to use one character set consistently everywhere rather than to use a particular one.
Five things to do:
Ensure the database and its tables use the same charset throughout (you can inspect this in phpMyAdmin, for example), and note this charset for use below. Be aware that a name such as latin1_swedish_ci is a collation rather than a charset; the charset it belongs to is latin1, which is labelled ISO-8859-1 in HTTP and HTML.
Use the PHP header() function with the database charset, e.g.:
header('Content-Type: text/html; charset=ISO-8859-1');
Insert a meta tag in the HTML head, e.g.:
<meta http-equiv="content-type"
content="text/html;charset=ISO-8859-1">
Add accept-charset to the form tag:
<form action="testsubmit.php" method="post" accept-charset="ISO-8859-1">
Set the charset of the MySQL connection, e.g.:
$con = mysql_connect("localhost","test","test");
mysql_set_charset("latin1", $con);
After you select the DB in your PHP code, add this:
mysql_query("set names 'utf8'");
Also check that the column where you store the text uses utf8 encoding.
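The extra character in front of ® is the typical sign of UTF-8 bytes being interpreted as latin1 somewhere between the form and the table. A minimal Python sketch of that mismatch (illustration only; where exactly it happens on the server is an assumption to be checked):

```python
# Illustration only: UTF-8 bytes for ® interpreted as latin1.
original = "®"

stored = original.encode("utf-8").decode("latin-1")
print(stored)                 # Â®

# The front-end "reverses what it does", which is why it still looks right there.
repaired = stored.encode("latin-1").decode("utf-8")
assert repaired == original
```

A second system that reads the column without the reverse step sees the two-character form, which matches the behaviour described in the question.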
I have a custom made CMS that I must migrate to work on Wordpress. Everything worked fine except the charset module.
Since this is Romanian blog content, some special characters are used (such as ă, î, ș, â, Ț). When I insert this content into WordPress's wp_posts table, WordPress displays them as "?".
I've tried all kinds of things, like changing the charset from utf8 to latin1, latin2, and so on, but with no result.
Even more, when I try to replace the special characters with normal ones (e.g. ă with a, î with i), nothing happens and the content remains the same (some characters are actually changed, but not all).
What am I doing wrong, and what must I do to get it right?
Thanks!
Character sets are a complete nightmare. What I'd do is to use mysqldump to dump your database to a sql file. Check to see if the special chars still look right.
Then, using find and replace in a text editor, replace all special characters with the correct HTML entity, e.g. Ă becomes &#258;.
http://meta.wikimedia.org/wiki/Help:Romanian_characters
Then delete your database, set all conceivable settings to utf-8, and import your dump.
WordPress also has an extensive article about character encodings.
Good luck!