Update:
it turns out this is not directly related to the DB server itself, but the clients encoding. If the client uses encoding utf8, the german character is rendered incorrectly. But if the client uses encoding cp850, then the german character is rendered correctly. But I need to use utf8 since there might be other class of characters that the app needs to deal with. what should I do?
Original:
I have two database servers, viewing from the same mysql cient, server1 is rendering the german characters correctly, server2 is not. the following are the differences. But I'm baffled since server2 uses utf8 more. What could be the cause of this?
server1's encodings:
mysql> SHOW VARIABLES LIKE "character\_set\_database";
+------------------------+--------+
| Variable_name | Value |
+------------------------+--------+
| character_set_database | latin1 |
+------------------------+--------+
1 row in set (0.00 sec)
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name | Value |
+--------------------------+--------+
| character_set_client | cp850 |
| character_set_connection | cp850 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | cp850 |
| character_set_server | latin1 |
| character_set_system | utf8 |
+--------------------------+--------+
server2
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name | Value |
+--------------------------+--------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
+--------------------------+--------+
7 rows in set (0.01 sec)
mysql> SHOW VARIABLES LIKE "character\_set\_database";
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| character_set_database | utf8 |
+------------------------+-------+
1 row in set (0.00 sec)
Related
When i run SELECT CHAR(128,129,130,131,132,133,134,135,136,137);
I got 0x80818283848586878889 istead of Çüéâäàåçêë.
Does anybody know why?
I'm using charset utf8mb4.
When I run show variables like '%char%'; I got
+--------------------------------------+--------------------------------+
| Variable_name | Value |
+--------------------------------------+--------------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8mb3 |
| character_sets_dir | /usr/share/mysql-8.0/charsets/ |
| validate_password.special_char_count | 1 |
+--------------------------------------+--------------------------------+
I'm using MySQL version 8.0.26
The output charset is "DOS West European"
Use this query it will display the text in charset "DOS West European"
SELECT CHAR(128,129,130,131,132,133,134,135,136,137 using cp850)
I am trying to create a database using the utf8mb4 character set and utf8mb4_unicode_ci collation. However, I don't seem to be able to insert unicode characters into my tables.
What I have done:
SET NAMES utf8mb4;
CREATE DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
USE mydb;
CREATE TABLE test (val VARCHAR(16));
INSERT INTO test (val) VALUES ("á");
ERROR 1366 (22007): Incorrect string value: '\xA0' for column `mydb`.`test`.`val` at row 1
If I don't use SET NAMES utf8mb4;, then I can insert the "á" character without issue.
These are my default character set variables:
show variables like 'char%'; show variables like 'collation%';
+--------------------------+-----------------------------------------------+
| Variable_name | Value |
+--------------------------+-----------------------------------------------+
| character_set_client | cp850 |
| character_set_connection | cp850 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | cp850 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | C:\Program Files\MariaDB 10.5\share\charsets\ |
+--------------------------+-----------------------------------------------+
8 rows in set (0.000 sec)
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | cp850_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8_general_ci |
+----------------------+--------------------+
3 rows in set (0.000 sec)
And after using SET NAMES:
show variables like 'char%'; show variables like 'collation%';
+--------------------------+-----------------------------------------------+
| Variable_name | Value |
+--------------------------+-----------------------------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | C:\Program Files\MariaDB 10.5\share\charsets\ |
+--------------------------+-----------------------------------------------+
8 rows in set (0.000 sec)
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8_general_ci |
+----------------------+--------------------+
3 rows in set (0.000 sec)
How can I fix this issue so I can insert characters in the utf8mb4 character set?
Your text (or .sql) file itself is encoded in cp850 and not in utf-8.
You can see that encoded value is a single byte - UTF-8 encoding should be at least 2 bytes.
In order to use SET NAMES utf8mb4; command, your file needs to be converted to utf-8. Some advanced editors allow that, and even windows notepad can save a text file as utf-8 in modern versions.
If you are using Windows cmd, the command "chcp" controls the "code page". chcp 65001 provides utf8, but it needs a special charset installed, too.
To set the font in the console window: Right-click on the title of the window → Properties → Font → pick Lucida Console
i have a problem with inserting data with polish chars to Mysql DB. Im working on windows 8 and Ubuntu. At Windows there is no problem but on ubuntu i can not insert that kind of chars: "żąśźćłż" in place of them i get: "?????". I have checked with TRACE lvl of logging. My application put correct Strings to prepared query but in db i see "???????". I can insert that kind of chars via cmd and its ok, so problably there is some problem with connector? Or some other settings. I have tried change:
mysql> show variables like "collation%";;
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+--------------------+
to
utf8_general_ci
every where but after service(mysql) restart its come back with the same with
mysql> show variables like "character%";
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
I can not set utf8 for database and server.
Anyone have some ideas?
Adding the line
character_set_server = utf8
in the [mysqld] section of the MySQL configuration file (my.ini or my.cnf) should set the new value the next time the MySQL server is started.
When insert emoji character in mysql interactive interface, I found some phenomena very confusing. Hope someone could clear it. Now see below:
mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
CREATE TABLE `t` (
`data` varchar(100) CHARACTER SET utf8mb4 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> insert into t select '\U+1F600';
ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x80' for column 'data' at row 1
mysql> set names utf8mb4;
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.00 sec)
mysql> select * from t;
+------+
| data |
+------+
| 😀 |
+------+
mysql> select data, hex(data) from t;
+------+-----------+
| data | hex(data) |
+------+-----------+
| 😀 | F09F9880 |
+------+-----------+
Why do I need execute set names utf8mb4 explicitly? From error message, it seems it resolved the data content to four byte(f0 9f 98 80) successully? Why still can't insert successfully?
Below is another puzzle for me.
mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.01 sec)
mysql> select data,hex(data) from t;
+------+--------------------+
| data | hex(data) |
+------+--------------------+
| 😀 | C3B0C5B8CB9CE282AC |
+------+--------------------+
I have to say I feel a little shock about this. In my opinion only utf8mb4 support emoji character, but now latin1 support emoji character too.
Anybody can clear it for me. Thanks!
You can insert UTF8 data into a latin1 table, but MySQL won't treat the byte stream as a UTF8 character. So you won't be able to query against it for example. If your application understands the UTF8 byte stream then it will look like its working OK. But the table charset really needs to be utf8 (or utf8mb4) if MySQL is to understand those bytes as Unicode characters.
I've implemented simple web server to have proxy to MySQL using 'sqljocky' package.
And I have issue with character encoding, cyrillic glyphs displays incorrectly:
ÐавÑдов ÐиÑалий instead of Давыдов Витайлий
EDIT: Table collation is utf8_general_ci.
I've tried to query SET NAMES UTF8:
pool.query('set names utf8');
[UPDATED] Then I've created my.cnf in /etc/ directory with this content:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
Output of show variables like "%char%";
mysql> show variables like "%char%";
+--------------------------+--------------------------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql-5.6.12-osx10.7-x86_64/share/charsets/ |
+--------------------------+--------------------------------------------------------+
8 rows in set (0,00 sec)
Output of show variables like 'collation%'
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0,00 sec)
But still having incorrect displayed characters.
How to get it displayed correctly?
There was a bug in sqljocky which meant that unicode characters weren't being encoded properly. I have fixed some of the places where the bug was occurring in v0.5.5 which I have just published. When I have more time I will make sure that it is fixed everywhere.