My setup:
MySQL 5.1
SHOW VARIABLES:
| character_set_client | utf8
| character_set_connection | utf8
| character_set_database | utf8
| character_set_filesystem | binary
| character_set_results | utf8
| character_set_server | utf8
| character_set_system | utf8
| character_sets_dir | D:\Programme\MySQL\MySQL Server 5.1\share\charsets\
| collation_connection | utf8_general_ci
| collation_database | utf8_unicode_ci
| collation_server | utf8_general_ci
and even
| init_connect | SET collation_connection = utf8_general_ci; SET NAMES utf8;
The table *table* has character set utf8.
Tomcat 6.0
The JDBC connector uses characterEncoding="utf8" useUnicode="true".
Now when I try
stmt.execute("UPDATE *table* SET *value*=\"ÿ\" WHERE ...");
it works, but for
stmt.execute("UPDATE *table* SET *value*=\"Ā\" WHERE ...");
I get a
java.sql.SQLException: Incorrect string value: '\xC4\x80' for column 'value' at row 1
Furthermore, it works for all characters up to and including ÿ, which can be encoded with 1 byte, but as soon as 2 bytes are needed: bang!
Why is that so? And how can I get it to work?
After I added two more tables to check whether it was a MyISAM vs. InnoDB problem, it just worked on the new tables. Why?
In the new tables each column used the default charset, while in my existing tables the charset of each column was set to latin1. This was because I had copied the DB from a non-UTF-8 MySQL instance and manually changed the table charset to utf8. But while copying, HeidiSQL added a "CHARACTER SET latin1" to each column, which was not changed when I changed the table charset, and it is not easily visible in HeidiSQL that a column has an individual charset ...
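The failure can be reproduced outside MySQL. A minimal sketch in Python, assuming Python's latin-1 codec is a close enough stand-in for MySQL's latin1 (which is really closer to cp1252); the helper name fits_latin1 is made up for this illustration:

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character can be stored in a latin1 column."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# U+00FF is the last code point latin1 can represent.
print(fits_latin1("ÿ"))      # True
# U+0100 has no latin1 representation, so a latin1 column rejects it.
print(fits_latin1("Ā"))      # False
# The bytes in the error message are simply the UTF-8 encoding of "Ā".
print("Ā".encode("utf-8"))   # b'\xc4\x80'
```

So the limit is not really "1 byte vs. 2 bytes" on the wire; it is whether the character exists in the charset of the column that receives it.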
I'm about to change the encoding for a database from latin1 to utf8mb4.
Due to privacy restrictions, I don't know what the database to be converted contains. I'm worried that running the SQL below may change existing data.
ALTER TABLE table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
However, the connection string from the Grails application contains useUnicode=true&characterEncoding=UTF-8. Does this mean that even though latin1_swedish_ci is used for a column, the actual value that has been saved is UTF-8 encoded?
And if this value is UTF-8 encoded, is there no risk that the data will be affected by the change from latin1 to utf8mb4?
+--------------------------+-------------------+
| Variable_name | Value |
+--------------------------+-------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+--------------------------+-------------------+
That's Ώπα? That's the interpretation in UTF-8 (as the outside world calls it), utf8mb4 (MySQL's equivalent) or utf8 (MySQL's partial implementation of UTF-8).
It would not work well in latin1.
The encoding in the client and the encoding of a column in the database need not be the same. However, Greek in the client cannot be crammed into latin1 in the table, hence the error message.
What ALTER TABLE table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; does is to change all the text columns in that table to be utf8-encoded and convert from whatever encoding is currently used (presumably latin1). This is fine for Western European characters, all of which exist (with different encodings) in both latin1 and utf8.
To handle emoji and some Chinese characters, you may as well go for utf8mb4 (note that the collation name must belong to the same character set):
ALTER TABLE table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
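What the conversion does to a single value can be sketched in Python. This is only an illustration, with Python's latin-1 codec standing in for MySQL's latin1. The first part assumes the column really holds latin1 text; the second part shows the case the question worries about, where UTF-8 bytes were written into a latin1 column without transcoding:

```python
# A value stored correctly in a latin1 column: one byte per character.
stored = "é".encode("latin-1")
print(stored)                          # b'\xe9'

# The conversion reads the text as latin1 and rewrites it as UTF-8:
# same character, different bytes.
converted = stored.decode("latin-1").encode("utf-8")
print(converted)                       # b'\xc3\xa9'

# If UTF-8 bytes had been stored in the latin1 column as-is, the same
# conversion would encode them a second time ("double encoding").
mislabelled = "é".encode("utf-8")
double_encoded = mislabelled.decode("latin-1").encode("utf-8")
print(double_encoded)                  # b'\xc3\x83\xc2\xa9'
print(double_encoded.decode("utf-8"))  # Ã©
```

Which of the two cases applies depends on how the data was written, so it is worth checking a few known non-ASCII values (for example with HEX(column)) on a copy before converting.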
I am trying to save some data in a MySQL database. The input contains emoji characters like this: '\U0001f60a\U0001f48d', and I'm getting this error:
1366, "Incorrect string value: '\\xF0\\x9F\\x98\\x8A\\xF0\\x9F...' for column 'caption' at row 1"
I searched the net and read a lot of answers, including these:
MySQL utf8mb4, Errors when saving Emojis or https://mathiasbynens.be/notes/mysql-utf8mb4#character-sets or http://www.java2s.com/Tutorial/MySQL/0080__Table/charactersetsystem.htm but nothing worked!
I have different problems:
Here is my DB info:
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_general_ci |
| collation_server | utf8_general_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)
I tried to change the character_set_server value to utf8mb4 with
mysql> SET character_set_server = utf8mb4;
Query OK, 0 rows affected (0.00 sec)
But when I restart mysqld, everything reverts!
I also don't have an /etc/my.cnf file, so I edited /etc/mysql/my.cnf instead.
What should I do?
How can I save emoji in my database?
1st or 2nd line in source code (to have literals in the code utf8-encoded): # -*- coding: utf-8 -*-
Your columns/tables need to be CHARACTER SET utf8mb4
The python package "MySQL-python" version needs to be at least 1.2.5 in order to handle utf8mb4.
self.query('SET NAMES utf8mb4') may be necessary.
Django needs client_encoding: 'UTF8' -- I don't know if that should be 'utf8mb4'.
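To see why the column has to be utf8mb4, it helps to look at the bytes. A minimal sketch in Python 3; the helper name needs_utf8mb4 is made up for this illustration, and it relies on the fact that MySQL's utf8 stores at most 3 bytes per character:

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character takes 4 bytes in UTF-8."""
    return any(len(ch.encode("utf-8")) == 4 for ch in text)


caption = "\U0001f60a\U0001f48d"
# The first four bytes match the ones quoted in the error message.
print(caption.encode("utf-8"))   # b'\xf0\x9f\x98\x8a\xf0\x9f\x92\x8d'
print(needs_utf8mb4(caption))    # True
# Ordinary 3-byte characters fit into MySQL's utf8 as well.
print(needs_utf8mb4("美国"))      # False
```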
References:
https://code.djangoproject.com/ticket/18392
http://mysql.rjweb.org/doc.php/charcoll#python
I'm working with two mysql servers, trying to understand why they behave differently.
I've created identical tables on each:
| Field | Type | Collation |
+----------------+------------+-------------------+
| some_chars | char(45) | latin1_swedish_ci |
| some_text | text | latin1_swedish_ci |
and I've set identical character set variables:
| Variable_name | Value
+--------------------------+-------+
| character_set_client | utf8
| character_set_connection | utf8
| character_set_database | latin1
| character_set_filesystem | binary
| character_set_results | utf8
| character_set_server | latin1
| character_set_system | utf8
When I insert UTF-8 characters into the database on one server, I get an error:
DatabaseError: 1366 (HY000): Incorrect string value: '\xE7\xBE\x8E\xE5\x9B\xBD...'
The same insertion in the other server throws no error. The table just silently accepts the utf-8 insertion and renders a bunch of ? marks where the utf-8 characters should be.
Why is the behavior of the two servers different?
What command were you executing when you got the error?
Your data is obviously utf8 (good).
Your connection apparently is utf8 (good).
Your table/column is declared CHARACTER SET latin1? It should be utf8.
That is 美 - Chinese, correct? Some Chinese characters need 4-byte utf8. So you should use utf8mb4 instead of utf8 in all 3 cases listed above.
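The bytes quoted in the error message can be decoded to check which characters are involved. A small sketch in Python; the byte string is only the visible part of the message, and Python's latin-1 codec is an approximation of MySQL's latin1:

```python
raw = b"\xe7\xbe\x8e\xe5\x9b\xbd"
text = raw.decode("utf-8")
print(text)                                       # 美国

# Both characters take 3 bytes in UTF-8; other Chinese characters need 4.
print([len(ch.encode("utf-8")) for ch in text])   # [3, 3]

# Neither character exists in latin1. A lossy conversion substitutes "?",
# which is what the server that raises no error ends up storing.
print(text.encode("latin-1", errors="replace"))   # b'??'
```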
Other notes:
There is no substantive difference in this area in 5.6 versus 5.7.
@@SQL_MODE is not relevant.
VARCHAR is usually advisable over CHAR.
I am trying to fix a character encoding issue. Previously the collation for this column was set to utf8_general_ci, which caused issues because it is accent-insensitive.
I'm trying to find all the entries in the database that could have been affected.
set names utf8;
select * from table1 t1 join table2 t2 on (t1.pid=t2.pid and t1.id != t2.id) collate utf8_general_ci;
However, this generates the error:
ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'
The database is now defined with DEFAULT CHARACTER SET utf8
The table is defined with CHARSET=utf8
The "pid" column is defined with: CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
The server version is 5.5.37-MariaDB-0ubuntu0.14.04.1 (Ubuntu)
Question: Why am I getting an error about latin1 when latin1 doesn't seem to be present anywhere in the table / schema definition?
MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
MariaDB [(none)]> SHOW VARIABLES LIKE '%collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
First, run this query:
SHOW VARIABLES LIKE '%char%';
You have character_set_server='latin1' shown in your post ...
So, go into your my.cnf and add or uncomment these lines:
character-set-server = utf8
collation-server = utf8_unicode_ci
Restart the server.
The same error is produced in MariaDB (10.1.36-MariaDB) by combining parentheses with the COLLATE clause. My SQL was different, but the error was the same. I had:
SELECT *
FROM table1
WHERE (field = 'STRING') COLLATE utf8_bin;
Omitting the parentheses solved it for me:
SELECT *
FROM table1
WHERE field = 'STRING' COLLATE utf8_bin;
In my case I created a database and gave it the collation 'utf8_general_ci', but the required collation was a latin1 one. After changing my collation to latin1_bin the error was gone.
I have set every encoding set variable I can figure out to utf8.
In database.yml:
development: &development
adapter: mysql2
encoding: utf8
In my.cnf:
[client]
default-character-set = utf8
[mysqld]
default-character-set = utf8
skip-character-set-client-handshake
character-set-server = utf8
collation-server = utf8_general_ci
init-connect = SET NAMES utf8
And if I run mysql client in terminal:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
But it's all in vain. When I insert utf8 data from the Rails app, it ends up as ????????????.
What do I miss?
Don't check the global settings; check the settings while connected to the specific database your application uses. When you change the settings for MySQL, you also have to change the settings for your app database.
A simple way to check is to log in to MySQL with the app DB selected:
mysql app_db_production -u db_user -p
or rails command:
rails dbconsole production
For my app it looks like this:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | utf8_general_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
Command for changing database collation and charset:
mysql> alter database app_db_production CHARACTER SET utf8 COLLATE utf8_general_ci ;
Query OK, 1 row affected (0.00 sec)
And remember to change the charset and collation for all your tables:
ALTER TABLE tablename CHARACTER SET utf8 COLLATE utf8_general_ci; # changes the default for columns added later
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; # converts existing columns and their data
Now it should work.
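With many tables, the CONVERT TO statements can be generated rather than typed by hand. A minimal sketch in Python; the function name convert_statements and the table names are made up for illustration, and the output is meant to be reviewed before it is run against a backup-protected database:

```python
def convert_statements(tables, charset="utf8", collation="utf8_general_ci"):
    """Build one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for name in tables:
        # Backtick-quote the identifier; double any embedded backtick.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


# Hypothetical table names, for illustration only.
for sql in convert_statements(["users", "posts"]):
    print(sql)
```

The real list of table names could come from SHOW TABLES or information_schema.TABLES.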
I had the same problem. I added characterEncoding to the end of mysql connection string:
use this: jdbc:mysql://localhost/dbname?characterEncoding=utf8
instead of this: jdbc:mysql://localhost/dbname
Okay, for anybody else for whom @Ravbaker's answer does not cut it, some more tips:
MySQL has the encoding specified at multiple levels: server, database, connection, table and even field/column. My problem was that the field/column was forced to latin1 (which overrides all the other encodings). I set the field back to the table encoding (which was utf8) and the world was good again.
Most of these settings can be set at the usual places: my.cnf, alter queries and rails database.yml file.
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
was the query which did the trick for me.
For server / connection encodings use my.cnf and database.yml
For database / table / column encodings use queries
(You can also achieve these by other means)
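The way the levels fall back on each other can be pictured with a deliberately simplified model in Python. The function effective_charset is invented for this sketch and is not a MySQL API. In MySQL the defaults are copied down when an object is created, so changing a table default later does not touch columns that already carry their own charset:

```python
def effective_charset(server, database=None, table=None, column=None):
    """Return the most specific charset that was explicitly declared."""
    for declared in (column, table, database):
        if declared is not None:
            return declared
    return server


# A column-level declaration wins over everything above it.
print(effective_charset("utf8", database="utf8", table="utf8", column="latin1"))  # latin1
# Without one, the column follows the table default.
print(effective_charset("utf8", database="latin1", table="utf8"))                 # utf8
```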
Do you have this in the HTML?
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
or on HTML5 pages with <!doctype html>
<meta charset="utf-8">
You may need this to let the browser send strings in utf8.
I had the same problem today! I solved it by dropping my database and creating a new one, then running db:migrate, and everything works nicely!
WARNING: THIS WILL DELETE ALL YOUR DATA IN THIS DATABASE
So:
$ mysql -u USER -p
mysql > drop database YOURDB_NAME_development;
mysql > create database YOURDB_NAME_development CHARACTER SET utf8 COLLATE utf8_general_ci;
mysql > \q
$ rake db:migrate
Well done!