MySql Database change existing table to UTF8 - mysql

I have a weird problem with MySQL and the Cyrillic alphabet. The database was created with utf8_unicode_ci from the start, but the tables were not. Right now, table data supplied in Cyrillic looks like this: ????????. If I create a new table in utf8 there is no problem. However, if I try to change the encoding of an existing table by using
ALTER TABLE <table_name> CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
which is supposed to convert existing data, or
ALTER TABLE Strategies
CHARACTER SET utf8,
COLLATE utf8_unicode_ci;
which is supposed to change future data, it doesn't work.
I have also changed my my.cnf file and added:
[mysqld]
#
#default-character-set=utf8 this one breaks mysql restart
character-set-server=utf8
skip-character-set-client-handshake
collation-server=utf8_unicode_ci
init-connect='SET NAMES utf8'
init_connect='SET collation_connection = utf8_general_ci'
If I run SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%'; I get:
I also changed it to utf8 directly in phpMyAdmin, and it shows that the table is in utf8, but nothing happens to the existing ????????? or to future Cyrillic input.
Hopefully someone else has experienced this kind of issue. I would be really grateful for any help or suggestions. Thank you.

If a table starts out as latin1 and has latin1-encoded characters in it, use ALTER TABLE ... CONVERT TO CHARACTER SET utf8 (as you did).
Before converting, do two things to check the old encoding:
SHOW CREATE TABLE ... -- to see that the columns say latin1
SELECT HEX(col) ... -- to see what the encoding looks like: é should show E9
I say "before" because it is possible to cram utf8 into latin1 incorrectly. If é shows C3A9 in a latin1 column, the column holds utf8 bytes labeled as latin1, and converting it leads to "double-encoding".
Do likewise after the conversion:
SHOW CREATE TABLE ... -- to see that the columns say utf8
SELECT HEX(col) ... -- to see what the encoding looks like; é should show C3A9
C383C2A9 would indicate double-encoding. A mess.
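The hex values quoted above can be reproduced outside MySQL. The following is a minimal Python sketch, not part of the original answer. It assumes Python's latin-1 codec is an acceptable stand-in for MySQL's latin1 (which is really cp1252; the two agree on the bytes used here), and the helper name hex_of is made up for illustration.

```python
# Illustrative only: reproduce the byte patterns that SELECT HEX(col) would show for "é".

def hex_of(data: bytes) -> str:
    """Format bytes the way MySQL's HEX() prints them (uppercase, no separators)."""
    return data.hex().upper()

text = "é"

latin1_bytes = text.encode("latin-1")   # correct latin1 storage
utf8_bytes = text.encode("utf-8")       # correct utf8 storage

# Double-encoding: UTF-8 bytes are mistaken for latin1 characters
# and then converted to UTF-8 a second time.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

print(hex_of(latin1_bytes))    # E9
print(hex_of(utf8_bytes))      # C3A9
print(hex_of(double_encoded))  # C383C2A9
```

Seeing C3A9 in a column declared latin1, or C383C2A9 in a column declared utf8, is the warning sign in both cases.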
Do not depend on init-connect='SET NAMES utf8' if you connect as root; init-connect is ignored for root and any other SUPER user.
But... You say you put Cyrillic text into a latin1 column? That is "impossible" since latin1 cannot represent anything other than latin-based Western European characters. So... You probably have "double-encoding".
For more debugging, see Trouble with utf8. Note especially "question mark".
To repair double-encoding, see Fixes, and pick the appropriate case. This link also says what you should have done (the 2-step ALTER) instead of what you did.
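The "question mark" symptom can be illustrated with a short Python sketch. This is an analogy rather than MySQL's actual code path: it assumes that a conversion into a Western European character set substitutes "?" for every character it cannot represent, which is what Python's errors="replace" does.

```python
# Illustrative only: what a lossy conversion to a Western European charset does to Cyrillic.

cyrillic = "Привет"

# None of these letters exist in latin-1, so each one is replaced with "?".
lossy = cyrillic.encode("latin-1", errors="replace")

print(lossy)  # b'??????'

# The original letters cannot be recovered from the stored bytes.
recovered = lossy.decode("latin-1")
print(recovered == cyrillic)  # False
```

Once literal question marks are stored, no ALTER TABLE can bring the text back; the data has to be re-inserted over a correctly configured connection.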

Related

How can i change Mysql character_set_system UTF8 to utf8bin

When I dump MySQL data, some of the data has changed because of character_set_system, which is utf8. The server, client, and connection character sets are utf8mb4.
I guess the problem is the difference between the system character set and the server character set.
I am trying to change the system character set from utf8 to utf8mb4 by following
Change MySQL default character set to UTF-8 in my.cnf?
but I cannot.
The title is incorrectly phrased.
"utf8" is a "character set"
"utf8_bin" is a "collation" for the character set utf8.
You cannot change character_set... to collation. You may be able to set some of the collation_% entries to utf8_bin.
But none of that is a valid solution for the problem you go on to describe.
Probably you can find more tips here: Trouble with UTF-8 characters; what I see is not what I stored
To help you further, we need to see the symptoms that got you started down this wrong path.

Error Code: 1267. Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) [duplicate]

I'm getting this strange error while processing a large amount of data...
Error Number: 1267
Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
SELECT COUNT(*) as num from keywords WHERE campaignId='12' AND LCASE(keyword)='hello again 昔 ã‹ã‚‰ ã‚ã‚‹ å ´æ‰€'
What can I do to resolve this? Can I escape the string somehow so this error wouldn't occur, or do I need to change my table encoding somehow, and if so, what should I change it to?
SET collation_connection = 'utf8_general_ci';
then for your databases
ALTER DATABASE your_database_name CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
MySQL sneaks swedish in there sometimes for no sensible reason.
CONVERT(column1 USING utf8)
solved my problem, where column1 is the column that gives me this error.
You should set both your table encoding and connection encoding to UTF-8:
ALTER TABLE keywords CHARACTER SET UTF8; -- run once
and
SET NAMES 'UTF8';
SET CHARACTER SET 'UTF8';
Use the following statement to fix the error.
Be careful with your data: take a backup first if the table already contains data.
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
In general, the best way is to change the table collation. However, I have an old application and cannot really estimate whether this would have side effects. Therefore I tried to convert the string into some other format that avoids the collation problem.
What I found to work is to compare the strings by converting them into a hexadecimal representation of their characters. On the database this is done with HEX(column). For PHP you may use this function:
// Written as a plain function; add "public static" if you place it inside a class.
function strToHex($string)
{
    // Build an uppercase hex string, two digits per byte, to match MySQL's HEX().
    $hex = '';
    for ($i = 0; $i < strlen($string); $i++) {
        $hex .= sprintf('%02X', ord($string[$i]));
    }
    return $hex;
}
When doing the database query, your original UTF-8 string must first be converted into an ISO string (e.g. using utf8_decode() in PHP) before using it in the DB. Because of the collation type, the database cannot contain UTF-8 characters, so the comparison should work even though this changes the original string (UTF-8 characters that do not exist in the ISO charset are converted to ? or removed entirely). Just make sure that when you write data into the database, you use the same UTF-8 to ISO conversion.
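One consequence of the lossy ISO conversion is worth seeing concretely. The sketch below is a Python analogue of the PHP approach, not the answerer's code; the function name to_iso_hex is hypothetical, and errors="replace" is assumed to behave like the "?" substitution described above.

```python
# Illustrative only: a Python analogue of the PHP approach (lossy ISO conversion, then hex).

def to_iso_hex(text: str) -> str:
    """Convert to ISO-8859-1, replacing unrepresentable characters with '?', then hex-encode."""
    return text.encode("latin-1", errors="replace").hex().upper()

print(to_iso_hex("café"))  # 636166E9

# Two different strings collapse to the same value once their characters are replaced.
print(to_iso_hex("昔"))    # 3F
print(to_iso_hex("場"))    # 3F
```

Because every unrepresentable character becomes the same "?", the hex comparison can report a match between strings that were different before conversion.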
My table was originally created with CHARSET=latin1. After converting the table to utf8, some columns were not converted, though that was not obvious.
You can run SHOW CREATE TABLE my_table; to see which column was not converted, or just fix the incorrect character set on the problematic column with the query below (change the varchar length, CHARSET, and COLLATE according to your needs):
ALTER TABLE `my_table` CHANGE `my_column` `my_column` VARCHAR(10) CHARSET utf8
COLLATE utf8_general_ci NULL;
I found that using cast() was the best solution for me:
cast(Format(amount, "Standard") AS CHAR CHARACTER SET utf8) AS Amount
There is also a CONVERT() function.
Change the character set of the table to utf8
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8
My user account did not have the permissions to alter the database and table, as suggested in this solution.
If, like me, you don't care about the character collation (you are using the '=' operator), you can apply the reverse fix. Run this before your SELECT:
SET collation_connection = 'latin1_swedish_ci';
After making the corrections listed in the top answer, change the default settings of your server.
In your "/etc/my.cnf.d/server.cnf", or wherever it is located, add the defaults to the [mysqld] section so it looks like this:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
Source: https://dev.mysql.com/doc/refman/5.7/en/charset-applications.html

mysql unicode text incorrect string warning on insert, despite character set variables set utf8mb4

First, I know, yes, this is yet another mysql unicode question.
Problem: I am unable to insert unicode text into my mysql database
I want to execute the following query:
INSERT INTO usert SET username='田中'
When I do, I get this warning:
Incorrect string value: '\x93c\x92\x86' for column 'username' at row 1
A blank space is inserted into the table instead of the data
I have tried as many answers and forums as I could, and I believe that all appropriate variable, table, and column settings are set to the 'utf8mb4' character set, with collation 'utf8mb4_general_ci' or 'utf8mb4_unicode_ci'.
I will tell you why I believe that by giving you the details, and sql commands used to show them.
First, mysql version:
mysql:> SHOW VARIABLES LIKE 'version'
Confirms that the version is 5.6.23
To show the character set variables in mysql:
mysql:> SHOW VARIABLES LIKE '%char%'
That command shows (in slightly different format):
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
...
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8
Collation:
mysql:> SHOW VARIABLES LIKE '%collat%'
RESULTS:
collation_connection: utf8mb4_unicode_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci
So far so good?
Now, for the table character set and collation:
Look at table details command:
mysql:> SHOW TABLE STATUS
shows that the collation is utf8mb4_general_ci
Command for looking at column details:
mysql:> SHOW FULL COLUMNS IN usert
Confirms that the collation for column 'username' is utf8mb4_general_ci
In summary, from what I have studied, all relevant variables, database, table, and column settings seem to be set to the relevant utf8mb4 setting. Despite that, I am unable to insert the unicode Japanese text.
(By the way, I don't think the 4-byte Unicode setting utf8mb4 is necessary here, but it is what I am using because it seemed to fix many other Unicode MySQL problems.)
What other settings in mysql or the system are likely causing this problem?
What other settings can I/ should I change to allow inserting japanese text appropriately?
EDIT UPDATE: I am on a Japanese computer
The problem was the default system settings, which also affected the input settings at the command line.
It's a Japanese computer, which apparently uses Shift-JIS encoding, NOT Unicode, by default. The text I was typing was encoded this way, as were the similar input files I was trying to use.
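The bytes in the warning message match this explanation. As a minimal Python sketch (an illustration added here, not something the poster ran), encoding the same two characters as Shift-JIS yields exactly the bytes MySQL complained about, and those bytes are not valid UTF-8:

```python
# Illustrative only: the same two characters as bytes in Shift-JIS and in UTF-8.

name = "田中"

sjis_bytes = name.encode("shift_jis")
utf8_bytes = name.encode("utf-8")

print(sjis_bytes)                 # b'\x93c\x92\x86'
print(utf8_bytes.hex().upper())   # E794B0E4B8AD

# The Shift-JIS bytes are not valid UTF-8, so a connection that
# declares utf8mb4 cannot interpret them as text.
try:
    sjis_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(valid_utf8)  # False
```

An alternative to switching the column to sjis would be to keep utf8mb4 in the table and tell the server what the client really sends, so that MySQL converts on the way in.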
Therefore, I set the character set to 'sjis' in the server,
i.e. I set character-set-server=sjis in the my.ini initialization file, and made the mysql client character set match by adding skip-character-set-client-handshake to the same file.
The character set for the column of course must also be changed via
ALTER TABLE usert MODIFY username varchar(30) CHARACTER SET sjis COLLATE sjis_japanese_ci
Now, you can insert the japanese text from command line, and other japanese files which use shift-jis encoding.
Another option for inputting japanese text seems to be cp932, which is the windows version of shift-jis.
Incidentally, if you DO wish to use unicode via command line, apparently powershell has better support for it, rather than the normal cmd I was using, but I haven't tried it personally.
Try checking the character set of the database.
Check the character set of your database with the command below:
SELECT @@character_set_database, @@collation_database;
If the result is not UTF-8, then try the command below:
ALTER DATABASE yourDatabase CHARACTER SET utf8 COLLATE
utf8_unicode_ci;
Hope it works for you.

storing Arabic in to mysql

I have a problem related to Arabic encoding and storing Arabic in to mysql.
I applied all the following steps:
Set MySQL charset: UTF-8 Unicode (utf8)
Set MySQL connection collation: utf8_general_ci
Set database and table collations to utf8_general_ci or utf8_unicode_ci
mysql_query("SET NAMES 'utf8'");
mysql_query('SET CHARACTER SET utf8');
However, the problem still exists.
Arabic values appear like this in mysql table: أح&Ugr.
(Too many questions for a 'comment'.)
Are you using PHP? If so, use the mysqli_* interface, not mysql_*.
set_charset('utf8') should suffice, not most of the rest of the actions.
Is that supposed to look something like 'ɣɭ&Ugr'? If so, I think you have "double encoding". Please pick some cell in the table and do SELECT col, HEX(col) ... so I can further analyze it.
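Once a HEX(col) value is available, it can be checked for the double-encoding pattern offline. The following Python sketch is a heuristic added for illustration only; the function name looks_double_encoded is hypothetical, and it assumes Python's latin-1 codec approximates MySQL's latin1 (really cp1252), so bytes in the 0x80 to 0x9F range may behave differently.

```python
# Illustrative only: a heuristic check of a value copied from SELECT HEX(col).

def looks_double_encoded(hex_value: str) -> bool:
    """Return True if the bytes decode as UTF-8 and the result itself looks like
    UTF-8 bytes that were misread as latin-1 (the "double-encoding" pattern)."""
    try:
        once = bytes.fromhex(hex_value).decode("utf-8")
        twice = once.encode("latin-1").decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    # Pure ASCII survives both steps unchanged, so it is not evidence of anything.
    return twice != once

print(looks_double_encoded("D8A3D8AD"))          # False: plain UTF-8 for "أح"
print(looks_double_encoded("C398C2A3C398C2AD"))  # True: the same text encoded twice
```

Correctly stored Arabic letters take two bytes each in UTF-8 (D8xx or D9xx), so four bytes per letter with a C3/C2 pattern is the typical sign of double-encoding.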
(I am puzzled, because it looks more like some obscure latin encoding than Arabic.)
"Collation" is not relevant at this point; it may become important as we debug this.
