Chinese and Japanese characters not working with MySQL

I have a table named CHINESE which has only one column NAME.
The output of SHOW VARIABLES LIKE 'char%' is:
+--------------------------+--------------------------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql-5.1.73-osx10.6-x86_64/share/charsets/ |
+--------------------------+--------------------------------------------------------+
When I run this query: INSERT INTO CHINESE VALUES ('你好'), the values get inserted.
But, when I try to execute this query: SELECT * FROM CHINESE, the result is:
+------+
| NAME |
+------+
| ?? |
+------+
The result of SELECT HEX(NAME) FROM CHINESE is:
+-----------+
| HEX(NAME) |
+-----------+
| 3F3F |
+-----------+
Where am I making a mistake?

If your MySQL version is 5.5.3 or later, use utf8mb4.
Alter the original table:
ALTER TABLE $tablename
CONVERT TO CHARACTER SET utf8mb4
COLLATE utf8mb4_general_ci;
Create a new table:
CREATE TABLE $tablename (
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Modify a column:
ALTER TABLE $tablename
MODIFY $col1
VARCHAR(191)
CHARACTER SET utf8mb4;
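The answer does not say why the length is 191. A plausible reading, assuming the column is indexed and the table is subject to InnoDB's 767-byte index key prefix limit (an assumption, not stated in the answer), is that 191 is the longest length whose worst-case utf8mb4 size still fits. The helper name below is hypothetical:

```python
# Assumption: InnoDB's 767-byte index key prefix limit applies to the column.
INDEX_PREFIX_LIMIT_BYTES = 767


def max_indexed_chars(bytes_per_char: int, limit: int = INDEX_PREFIX_LIMIT_BYTES) -> int:
    """Largest VARCHAR length whose worst-case byte size fits within the limit."""
    return limit // bytes_per_char


print(max_indexed_chars(3))  # utf8, up to 3 bytes per character -> 255
print(max_indexed_chars(4))  # utf8mb4, up to 4 bytes per character -> 191
```

If the column is not indexed, or the server allows larger index prefixes, a longer VARCHAR may be fine.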
Reference: MySQL documentation, "Column Character Set Conversion".
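As background for the 3F3F in the question: that is the hex of two literal question marks, which is what you would expect if the characters were replaced during conversion into a latin1 column. A minimal sketch, using Python's latin-1 codec as a stand-in for MySQL's latin1 (an approximation, and the function name is hypothetical):

```python
def store_as_latin1(text: str) -> bytes:
    """Mimic a lossy conversion to latin1: unmappable characters become '?'."""
    return text.encode("latin-1", errors="replace")


stored = store_as_latin1("你好")
print(stored)                # b'??'
print(stored.hex().upper())  # 3F3F
```

Once the data has been stored as question marks, the original characters are gone; converting the table afterwards only helps for rows inserted later.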

Try the following to change the character set: SET NAMES 'big5';

Related

Saving a set of strings using stylized fonts to MariaDB with strange results

I currently have MariaDB version 10.4.18 on CentOS 8.0. When I'm trying to save a string with stylized fonts like below,
𝘈𝘴𝘵𝘳𝘪 𝘈𝘯𝘢𝘯𝘵𝘢
MariaDB saved them as "??? ????"
The statement
mysql> insert into testings(test) values ('𝘈𝘴𝘵𝘳𝘪 𝘈𝘯𝘢𝘯𝘵𝘢');
Here is my database's charset and collation
mysql> select @@collation_database;
+----------------------+
| @@collation_database |
+----------------------+
| utf8mb4_unicode_ci |
+----------------------+
1 row in set (0.00 sec)
mysql> SELECT @@character_set_database;
+--------------------------+
| @@character_set_database |
+--------------------------+
| utf8mb4 |
+--------------------------+
The table
mysql> SHOW FULL COLUMNS FROM testings;
+-------+------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+-------+------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| test | text | utf8mb4_unicode_ci | NO | | NULL | | select,insert,update,references | |
+-------+------+--------------------+------+-----+---------+-------+---------------------------------+---------+
Can anyone point me in the right direction?
As answered by @Akina, I edited my database config with the parameters below:
SET collation_connection = 'utf8mb4_unicode_ci';
SET character_set_client = 'utf8mb4';
SET character_set_results = 'utf8mb4';
SET character_set_system = 'utf8mb4';
Now it works!
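For context on why the connection character set matters here: the stylized letters are not ordinary Latin letters but characters outside the Basic Multilingual Plane, and each needs 4 bytes in UTF-8. A small check, with hypothetical helper names:

```python
from typing import List


def utf8_byte_lengths(text: str) -> List[int]:
    """Number of UTF-8 bytes needed for each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


def needs_utf8mb4(text: str) -> bool:
    """True if any character lies outside the Basic Multilingual Plane."""
    return any(ord(ch) > 0xFFFF for ch in text)


styled = "𝘈𝘴𝘵𝘳𝘪 𝘈𝘯𝘢𝘯𝘵𝘢"
print(utf8_byte_lengths(styled))
print(needs_utf8mb4(styled))          # True
print(needs_utf8mb4("Astri Ananta"))  # False
```

Text for which needs_utf8mb4 returns True needs utf8mb4 on the connection as well as on the column.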

Can't write special characters like "á" "é" to a MySQL server in a Docker container

I can't write special characters like 'á' or 'ñ' to a MySQL server running in a Docker container.
These are the character sets:
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
When I try to copy and paste "Amélie" in the terminal, the actual output is "Amlie"
Set the default character set for the table, or even for a single column:
CREATE TABLE t1 (
col1 VARCHAR(10),
col2 VARCHAR(10)
) DEFAULT CHARSET=utf8;
INSERT INTO t1
(`col1`, `col2`)
VALUES
('Amélie','Amélie');
Results in:
SELECT * FROM t1;
col1 col2
Amélie Amélie

How to configure my.cnf for databases with different CHARACTER SETs in one instance

In one instance I have two databases:
1st database -> my_db
2nd database -> sample_db
mysql> show global variables like 'char%';
+--------------------------+-------------------------------------------+
| Variable_name | Value |
+--------------------------+-------------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /rdsdbbin/mysql-5.6.27.R1/share/charsets/ |
+--------------------------+-------------------------------------------+
mysql> show global variables like 'coll%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
mysql> use my_db;
SHOW CREATE DATABASE my_db ;
+-------------------+-------------------------------------------------------------------------------------------+
| Database | Create Database |
+-------------------+-------------------------------------------------------------------------------------------+
| plum_production_1 | CREATE DATABASE `my_db` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */ |
+-------------------+-------------------------------------------------------------------------------------------+
1st database: my_db
mysql> show variables like '%coll%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_unicode_ci |
| collation_server | utf8_unicode_ci |
+----------------------+-------------------+
mysql> show variables like '%char%';
+--------------------------+-------------------------------------------+
| Variable_name | Value |
+--------------------------+-------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /rdsdbbin/mysql-5.6.27.R1/share/charsets/ |
+--------------------------+-------------------------------------------+
2nd database:
mysql> use sample_db;
mysql> show create database sample_db;
+-----------------+----------------------------------------------------------------------------+
| Database | Create Database |
+-----------------+----------------------------------------------------------------------------+
| plum_production | CREATE DATABASE `plum_production` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+-----------------+----------------------------------------------------------------------------+
mysql> show variables like '%char%';
+--------------------------+-------------------------------------------+
| Variable_name | Value |
+--------------------------+-------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /rdsdbbin/mysql-5.6.27.R1/share/charsets/ |
+--------------------------+-------------------------------------------+
mysql> show variables like '%coll%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
How do I configure my.cnf when the databases require different character sets and collations? I need:
my_db: character set utf8, collation utf8_unicode_ci.
sample_db: character set latin1, collation latin1_swedish_ci.
With the above configuration I am facing some issues: tables are locked when trying to insert records into multiple tables, except for the first table of the insert statement, and other queries are too slow. Temporarily I changed my_db to character set latin1 with collation latin1_swedish_ci, and now it is working fine.
But that is not what I require.
For the my_db tables and columns I need character set utf8 and collation utf8_unicode_ci. To get this done I altered:
Database: ALTER DATABASE my_db CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Tables (for all tables): ALTER TABLE table_names CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Columns (to convert all columns): ALTER TABLE table_names CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Am I on the right track? Is there anything to change other than my.cnf?
And in sample_db: character set latin1, collation latin1_swedish_ci.
We are using AWS RDS; my.cnf looks like this:
[mysqld]
character_set_client: utf8
character_set_database: utf8
character_set_results: utf8
character_set_connection: utf8
character_set_server: utf8
collation_connection: utf8_unicode_ci
collation_server: utf8_unicode_ci
And also, how do I configure my.cnf on a local instance (not in AWS)? For example:
[client]
[mysql]
[mysqld]
When connecting, how can I SET NAMES utf8mb4? Is it required to specify it every time I connect to that database? I asked many questions because I am confused and scared of data loss. Thanks in advance.
my.cnf is mostly defaults that can be overridden. If you have a mixture, don't worry about it; focus on the other settings.
Client
What client do you have? (All I see is the mysql commandline tool.) The client should probably always be utf8mb4 (the MySQL character set equivalent to what the outside world calls UTF-8).
When connecting, use the connection parameters to establish CHARACTER SET utf8mb4, possibly by doing SET NAMES utf8mb4;
Data in Columns
Each column can have a CHARACTER SET and COLLATION. If not specified, they default from the CREATE TABLE. If that does not specify, it defaults from the CREATE DATABASE. Etc.
So, be sure each column is declared the way it needs to be. Use SHOW CREATE TABLE to verify.
Client to/from Columns
MySQL transcodes data as it goes between the client and the server. So, it is OK to have the client using utf8mb4, but INSERTing/SELECTing a column that is declared latin1. (Some combinations won't work.)
Corollary: There is no problem if one DB is latin1 and another is utf8.
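To illustrate the transcoding idea and the "some combinations won't work" caveat, here is a rough sketch in Python. It only models the concept: Python codecs stand in for MySQL character sets, and the function names are hypothetical.

```python
def transcode(client_bytes: bytes, client_charset: str, column_charset: str) -> bytes:
    """Decode bytes from the client's charset, then re-encode for the column."""
    text = client_bytes.decode(client_charset)
    return text.encode(column_charset)


def can_store(text: str, column_charset: str) -> bool:
    """True if every character of text exists in the column's charset."""
    try:
        text.encode(column_charset)
        return True
    except UnicodeEncodeError:
        return False


# A UTF-8 client writing to a latin1 column: 'é' is 2 bytes on the wire, 1 byte stored.
print(transcode("é".encode("utf-8"), "utf-8", "latin-1"))  # b'\xe9'
print(can_store("Amélie", "latin-1"))  # True
print(can_store("你好", "latin-1"))     # False: no latin1 representation
```

The client and the column can disagree about the encoding as long as the connection declares the client's encoding correctly and the characters exist in the column's character set.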
Garbage
See "best practice" in http://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored . If you get gibberish, see that link for further debugging/cures.

Mysql 5.6 UTF-8 (utf8mb4) still displaying incorrect characters

I did a conversion of my database to utf8mb4, yet it still returns incorrect UTF8 characters:
For example, Café becomes CafÃ©
Here are my mysql collation variables:
mysql> SHOW VARIABLES LIKE 'char%'; SHOW VARIABLES LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8mb4_unicode_ci |
+----------------------+--------------------+
Also, my DB has slowed down at least 10x since switching to utf8.
Mojibake. This is the classic case, in which:
The bytes you have in the client are correctly encoded in utf8mb4 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8mb4.)
The column in the tables may or may not have been CHARACTER SET utf8mb4, but it should have been that.
If you need to fix the data, it takes a "2-step ALTER", something like:
ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET utf8mb4 ...;
where the lengths are big enough and the other "..." have whatever else (NOT NULL, etc) was already on the column.
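The mojibake itself can be reproduced outside MySQL. The sketch below models the mislabeled connection with Python codecs (latin-1 here as an approximation of MySQL's latin1; the function names are hypothetical) and shows that the damage is reversible as long as no bytes were lost:

```python
def mojibake(text: str) -> str:
    """UTF-8 bytes misread as latin1, one character per byte."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading: recover the original bytes, then read them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = mojibake("Café")
print(garbled)          # CafÃ©
print(repair(garbled))  # Café
```

The 2-step ALTER follows the same idea: going through VARBINARY keeps the bytes untouched while the character set label on the column is changed.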

Some confusing phenomena when inserting emoji characters into a MySQL table

When inserting emoji characters in the mysql interactive interface, I found some behavior very confusing. I hope someone can clear it up. See below:
mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
CREATE TABLE `t` (
`data` varchar(100) CHARACTER SET utf8mb4 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> insert into t select '\U+1F600';
ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x80' for column 'data' at row 1
mysql> set names utf8mb4;
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.00 sec)
mysql> select * from t;
+------+
| data |
+------+
| 😀 |
+------+
mysql> select data, hex(data) from t;
+------+-----------+
| data | hex(data) |
+------+-----------+
| 😀 | F09F9880 |
+------+-----------+
Why do I need to execute set names utf8mb4 explicitly? From the error message, it seems it resolved the data content to four bytes (f0 9f 98 80) successfully. Why does the insert still fail?
Below is another puzzle for me.
mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.01 sec)
mysql> select data,hex(data) from t;
+------+--------------------+
| data | hex(data) |
+------+--------------------+
| 😀 | C3B0C5B8CB9CE282AC |
+------+--------------------+
I have to say I am a little shocked by this. In my opinion only utf8mb4 supports emoji characters, but now latin1 seems to support them too.
Can anybody clear this up for me? Thanks!
You can insert UTF8 data into a latin1 table, but MySQL won't treat the byte stream as UTF8 characters, so you won't be able to query against it, for example. If your application understands the UTF8 byte stream, it will look like it's working OK. But the table charset really needs to be utf8 (or utf8mb4) if MySQL is to understand those bytes as Unicode characters.
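The hex value from the second experiment in the question is consistent with this: the four UTF-8 bytes of the emoji appear to have been read as four separate latin1 characters, each of which was then converted to utf8mb4 for the column. A sketch that reproduces the value, assuming Python's cp1252 codec is a close enough stand-in for MySQL's latin1 (the function name is hypothetical):

```python
def double_encode(text: str) -> bytes:
    """UTF-8 bytes misread as cp1252 characters, then stored as UTF-8 again."""
    return text.encode("utf-8").decode("cp1252").encode("utf-8")


emoji = "\U0001F600"  # the grinning face from the question
print(emoji.encode("utf-8").hex().upper())  # F09F9880
print(double_encode(emoji).hex().upper())   # C3B0C5B8CB9CE282AC
```

So the column does not hold one emoji but four unrelated characters (ð, Ÿ, ˜, €). A latin1 client turns them back into the original four bytes on the way out, which is why the terminal still shows the emoji.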