How to convert from windows-1257 to utf-8 (mysql)

I'm looking for help with an encoding problem I've been struggling with for a couple of days now.
I have a database with collation "latin1_swedish_ci". When I view a single entry, it shows garbled text:
grieþos informçs
it should be
griežos informēs
OK, so I tried to output the text to the browser with a PHP script and set
<meta http-equiv="Content-Type" content="text/html; charset=windows-1257" />
Now it shows the data correctly ("griežos informēs"). What I need to do is convert this data to UTF-8 so I can use
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
and get everything displayed correctly.
I tried using Notepad2 to create a file with windows-1257 encoding, then copied the text from the database into it and saved it. Same problem.
I even tried creating a table in that database with a utf-8 collation and inserting the data there. No luck: it just shows ? where the special chars should be.
Any help would be appreciated,
Thanks

SELECT CONVERT(CONVERT(CAST(test AS BINARY) USING cp1257) USING utf8) FROM old_table;
(test is the column name)
The CAST to BINARY drops the wrong latin1 label, the inner CONVERT reads the bytes as cp1257, and the outer CONVERT turns them into utf8. That will get you the row in the proper character set. You should create a new table using utf8, then INSERT INTO new_table SELECT CONVERT(....) FROM old_table
You may also want to change the character set of the old table to cp1257 (collation cp1257_general_ci) instead of latin1 (collation latin1_swedish_ci).
In order to do it without losing data, first change the datatype of the column to BLOB, then change it a second time to VARCHAR with character set cp1257, then finally a third time to utf8 with collation utf8_unicode_ci. (I suggest a backup first, just in case.)
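The BLOB round trip above amounts to taking the stored bytes as they are and reading them again as cp1257. A minimal Python sketch of that idea, outside MySQL, using the sample text from the question. The helper name repair_cp1257_as_latin1 is hypothetical, and the sketch assumes the garbled text contains no characters from the 0x80-0x9F range, where MySQL's latin1 (really cp1252) and Python's latin-1 differ.

```python
# Sketch only: repair text whose cp1257 bytes were decoded as latin1.
# repair_cp1257_as_latin1 is a hypothetical helper, not a MySQL or PHP function.

def repair_cp1257_as_latin1(garbled: str) -> str:
    # Step 1: undo the wrong latin1 decoding to get the stored bytes back.
    raw = garbled.encode("latin-1")
    # Step 2: decode those bytes with the charset they were written in.
    return raw.decode("cp1257")


fixed = repair_cp1257_as_latin1("grieþos informçs")
print(fixed)  # griežos informēs

# Once the text is a proper Unicode string, producing UTF-8 is a plain encode.
utf8_bytes = fixed.encode("utf-8")
print(utf8_bytes.hex())
```

The point of the sketch is that no byte is changed in step 1; only the label on the bytes changes, which is why going through BLOB in MySQL does not lose data.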

Use this query right after you open your connection to the database, so that it runs once on every page load:
mysql_query('SET NAMES `utf8`');

Related

tinybutstrong not showing special characters from mysql

I'm trying to load data from a MySQL DB from a varchar(35) / utf8_swedish_ci field through TBS (tinybutstrong) and PHP using the example (MySQL data merge). My issue is that the data loads fine if the fields contain only ASCII characters, but as soon as I add a single Scandinavian special character like ö or ä, the field's contents vanish entirely, while the other fields in the row display correctly.
My understanding is that the latest versions of TBS automatically use UTF-8 encoding (I have 3.9.0 for PHP 5), so I assumed it would work out of the box. To be safe, I even passed the encoding when loading the template, like so:
$TBS->LoadTemplate('mysql.html','UTF-8');
but to no avail.
Could someone please advise what is causing this?
For UTF-8 to work properly, every element of the chain must be UTF-8.
You have to ensure that your template is UTF-8: check the encoding of the entered text and the HTML element <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
You have to ensure that all your PHP scripts are saved as UTF-8 and not ANSI.
You also have to ensure that your MySQL connection is set to receive UTF-8 queries and to return UTF-8 data. This can be done, for example, by running the SQL statement: SET NAMES 'UTF8'

MySQL does not identify character '?' on select

I have a table in MySQL with the following columns:
id name email address borningDate
I have a form in an HTML page that submits this data to a servlet, which is responsible for saving it in the database. Due to charset issues (already fixed), I saved a row like this when trying to store letters with accents:
19 ? ? ? 2015-03-01
and now I want to delete this row.
Yes, doing this:
DELETE FROM table WHERE id=19;
works fine. My didactic question is: why, if I try something like this:
DELETE FROM table WHERE name='?';
does it return 0 rows affected, as if it can't see ? as a valid character?
Try doing
SELECT id, HEX(name), HEX(email), HEX(address), borningDate FROM table
This will tell you what's actually in the database. It probably isn't actually ASCII question marks. The question marks are probably substitution characters applied when MySQL tries to convert the column's character set to the connection's character set.
To investigate this more specifically, run SHOW CREATE TABLE table and look for the character set being used for the text columns. It usually shows up at the end of the table definition as something like DEFAULT CHARSET=utf8, but it might also be specified in the column definition.
Once you know the character set, issue the command SET NAMES charset, for example, SET NAMES utf8. Then reissue your commands and see if you get better results than the ? substitution character. That assumes, of course, that the client program you are using can handle the character set mentioned.
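To illustrate why comparing against '?' can match nothing, here is a small Python sketch of the two ideas above: what HEX() would reveal, and how a question mark can appear only during conversion. The helpers hex_of and lossy_convert and the sample name "João" are hypothetical, chosen just for this example; Python's "replace" error handler stands in for MySQL's substitution behaviour.

```python
# Sketch only: hex_of and lossy_convert are hypothetical helpers that mimic
# MySQL's HEX() output and its '?' substitution during charset conversion.

def hex_of(text: str, charset: str) -> str:
    # Roughly what HEX(column) shows for text stored in the given charset.
    return text.encode(charset).hex().upper()


def lossy_convert(text: str, charset: str) -> str:
    # Characters the target charset cannot represent are replaced by '?'.
    return text.encode(charset, errors="replace").decode(charset)


print(hex_of("?", "utf-8"))            # 3F   (a real ASCII question mark)
print(hex_of("ã", "utf-8"))            # C3A3 (an accented letter, not 3F)
print(lossy_convert("João", "ascii"))  # Jo?o (the '?' exists only in the output)
```

If HEX(name) for the row shows anything other than 3F, the stored value is not a literal question mark, and WHERE name='?' has nothing to match.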

Is there a special code for a half symbol?

Because my database is about postage stamps, there are a lot of half symbols (½) in the descriptions. I can type them into MySQL with no problem, but when I get the results of a query, all the half symbols have been replaced by a question mark.
Is there a special code I should be using when I input the descriptions instead of the half symbol? Otherwise, is there another solution, like changing the character set? I'm using utf-8 at the moment.
It seems like you are facing a character encoding issue. You should use UTF-8 everywhere:
Make sure the column that contains text data is encoded as utf8_...
Check that when you get information from database, you keep this encoding. You can force it by sending SET NAMES utf8; before any request to MySQL.
Check that when you display this information the encoding is UTF-8 (in a webpage, that means <meta charset='utf-8'> in <head>).
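As a rough illustration of why one mismatched link in that chain is enough, the Python sketch below encodes the half symbol in two charsets and then reads the single-byte form as UTF-8. It assumes, purely for the example, that the connection returns latin1 bytes while the page is declared as UTF-8; the actual cause in the question may be different.

```python
# Sketch only: assumes the connection returns latin1 bytes and the page is
# declared as UTF-8. The real setup in the question may differ.

HALF = "\u00bd"  # the half symbol

latin1_bytes = HALF.encode("latin-1")  # one byte: 0xBD
utf8_bytes = HALF.encode("utf-8")      # two bytes: 0xC2 0xBD

# A lone 0xBD byte is not valid UTF-8, so a UTF-8 reader substitutes U+FFFD,
# which many clients draw as a question mark.
shown = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes.hex(), utf8_bytes.hex())
print(shown == "\ufffd")
```

No special code is needed for the symbol itself: it exists in both charsets, so the fix is to make every step agree on one of them.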

Mysql charset and form

I have a problem with MySQL and charsets.
When I insert something into my MySQL database, special chars like "é" change into "è" and some other strange character combinations...
Searching on the web, I read about setting the tables and the general configuration to utf8_unicode_ci. I also added the following line to my webpage
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
But after all this, I still get the same result when I insert something into the database. The characters are still garbled.
What am I missing?
You might want to add this to your code:
mysql_query("SET CHARACTER SET utf8");
It tells MySQL that your PHP script sends and expects UTF-8 on this connection. So if your form data is encoded as UTF-8 too, you shouldn't have any garbled character problems anymore.
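A short Python sketch of the kind of garbling that a missing connection charset typically produces: UTF-8 bytes from the form are taken to be latin1. The helpers garble and ungarble are hypothetical names for this example, and the sketch assumes latin1 is the connection default, which the question does not state.

```python
# Sketch only: assumes the form sends UTF-8 and the connection defaults to
# latin1. garble and ungarble are hypothetical helper names.

def garble(text: str) -> str:
    # UTF-8 bytes misread one byte at a time as latin1.
    return text.encode("utf-8").decode("latin-1")


def ungarble(text: str) -> str:
    # The reverse trip recovers the original text as long as no byte was lost.
    return text.encode("latin-1").decode("utf-8")


print(garble("é"))            # Ã©
print(ungarble(garble("é")))  # é
```

Each accented character becomes two characters because its UTF-8 form is two bytes long, which matches the "strange character combinations" described in the question.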

Inserting Chinese Meta Tags

I have a multilingual site and I am having a problem inserting Chinese meta tags. These are transformed into question marks.
Is there a way I can achieve this?
Many thanks
--EDIT--
The table storing the SEF Urls is in the latin1_swedish_ci character set. How can I change this single table to utf8_general_ci without breaking the URLs?
Many thanks!
Make sure that:
The character encoding you are using includes those characters (UTF-8 is safe)
Your editor is configured to use that character encoding
Your database (if these details are stored in one) is configured to use that encoding
Your webserver is configured to output a charset parameter on the Content-type header (and it uses the correct encoding)
Your browser is not configured to ignore the specified encoding
Use numeric character references.
EDIT
wiki numeric character references
Convert Chinese characters to Unicode
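For the numeric character reference suggestion, a minimal Python sketch of the conversion. The helper name to_ncr and the sample text "中文" are made up for the example. The sketch relies on Python's built-in "xmlcharrefreplace" error handler, which leaves ASCII untouched and does not escape an existing & in the input.

```python
# Sketch only: to_ncr is a hypothetical helper; "中文" is sample text.
import html


def to_ncr(text: str) -> str:
    # Replace every non-ASCII character with a decimal reference like &#20013;
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


encoded = to_ncr("中文")
print(encoded)                 # &#20013;&#25991;
print(html.unescape(encoded))  # 中文
```

Because the result is pure ASCII, it survives a latin1 table or connection unchanged, at the cost of being harder to read and search in the database.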
Are you retrieving the data from a database?
If so, ensure that your connection character set is also set to utf-8.
In MySQL for example you would need to issue this query before any other:
SET NAMES 'utf8';
It could be that you need to encode the Chinese characters to HTML entities, or specify a character set.
Have you checked the character set in your document headers? I usually use UTF-8 to display Chinese characters.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
If you're using a program like Dreamweaver, make sure your files are actually being SAVED in the correct character set as well. We had a problem where characters in a Dreamweaver file were coming through as ???? because the editor itself was set to iso-8859-1.
Maybe your browser, or more exactly the font you selected to display the page, doesn't support Chinese characters. What system and browser is this on?