COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1' - mysql

I am trying to fix a character encoding issue - previously we had the collation set for this column utf8_general_ci which caused issues because it is accent insensitive..
I'm trying to find all the entries in the database that could have been affected.
set names utf8;
select * from table1 t1 join table2 t2 on (t1.pid=t2.pid and t1.id != t2.id) collate utf8_general_ci;
However, this generates the error:
ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'
The database is now defined with DEFAULT CHARACTER SET utf8
The table is defined with CHARSET=utf8
The "pid" column is defined with: CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
The server version is Server version: 5.5.37-MariaDB-0ubuntu0.14.04.1 (Ubuntu)
Question: Why am I getting an error about latin1 when latin1 doesn't seem to be present anywhere in the table / schema definition?
MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
MariaDB [(none)]> SHOW VARIABLES LIKE '%collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+

First, run this query:
SHOW VARIABLES LIKE '%char%';
You have character_set_server='latin1' shown in your post ...
So, go into your my.cnf and add or uncomment these lines:
character-set-server = utf8
collation-server = utf8_unicode_ci
Restart the server.

The same error is produced in MariaDB (10.1.36-MariaDB) by using the combination of parenthesis and the COLLATE statement. My SQL was different, the error was the same, I had:
SELECT *
FROM table1
WHERE (field = 'STRING') COLLATE utf8_bin;
Omitting the parenthesis was solving it for me.
SELECT *
FROM table1
WHERE field = 'STRING' COLLATE utf8_bin;

In my case I created a database and gave the collation 'utf8_general_ci' but the required collation was 'latin1'. After changing my collation type to latin1_bin the error was gone.

Related

Mysql 5.5 how to set up everything to utf8?

I just want everything default to utf8. I've checked this question but nothing help.
Currently, My /etc/my.cnf is
[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
But when I restart the server, create a new database, it is still latin1(character_set_database and character_set_server):
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
When I create a database, It is latin1:
mysql> create database d1;
Query OK, 1 row affected (0.00 sec)
mysql> use d1;
Database changed
mysql> show variables like "character_set_database";
+------------------------+--------+
| Variable_name | Value |
+------------------------+--------+
| character_set_database | latin1 |
+------------------------+--------+
1 row in set (0.00 sec)
When I create a table in this database, it can't recognize valid utf8 啊:
mysql> create table t1(name varchar(1) default '啊');
ERROR 1067 (42000): Invalid default value for 'name'
I know alter database d1 character set utf8; will fix this. But I just want everything default to utf8, is it possible?
This is tricky.
The character set and collation for the default database can be
determined from the values of the character_set_database and
collation_database system variables. The server sets these variables
whenever the default database changes. If there is no default
database, the variables have the same value as the corresponding
server-level system variables, character_set_server and
collation_server.
So one would assume the default for the collation-database is the same as the default for the collation-server variable.
Please check the following:
Is there any other config that would override your my.cnf, like /etc/mysql/mysql.cnf or ~/.my.cnf ?
The client (not server!) is setting its own collation upon startup, so you could set a client collation/encoding through [mysql] (not mysqld) or look if this is already set somewhere.
You do SHOW VARIABLES ... - this is querying SESSION based variables, try to query explicitly global settings through SHOW GLOBAL VARIABLES ...

Saving emoji characters in mysql database

I am trying to save some data on mysql database, input contains emoji characters like this : '\U0001f60a\U0001f48d' and I'm getting this error:
1366, "Incorrect string value: '\\xF0\\x9F\\x98\\x8A\\xF0\\x9F...' for column 'caption' at row 1"
I searched over net and read a lot of answers include these:
MySQL utf8mb4, Errors when saving Emojis or MySQL utf8mb4, Errors when saving Emojis or https://mathiasbynens.be/notes/mysql-utf8mb4#character-sets or http://www.java2s.com/Tutorial/MySQL/0080__Table/charactersetsystem.htm but nothing worked !!
I have different problems:
here is mydb info:
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_general_ci |
| collation_server | utf8_general_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)
I tried to change character_set_server value to utf8mb4 by
mysql>SET character_set_server = utf8mb4
Query OK, 0 rows affected (0.00 sec)
But when restart mysqld everything revert !
I don't have any /etc/my.cnf file in also, and I edited /etc/mysql/my.cnf file instead.
What should I do?
How can I save emoji file in my database?
1st or 2nd line in source code (to have literals in the code utf8-encoded: # -- coding: utf-8 --
Your columns/tables need to be CHARACTER SET utf8mb4
The python package "MySQL-python" version needs to be at least 1.2.5 in order to handle utf8mb4.
self.query('SET NAMES utf8mb4') may be necessary.
Django needs client_encoding: 'UTF8' -- I don't know if that should be 'utf8mb4`.
References:
https://code.djangoproject.com/ticket/18392
http://mysql.rjweb.org/doc.php/charcoll#python

mysql utf8 issues with french character

Trying to insert a simple 'é' into a table on mysql (v 5.6) terminal windows server 2008, I get Incorrect string value: '\x82' for column 'colum_name'
I've been searching on stack overflow for a day now. I think I am going crazy. All my collations are utf8mb4:
/*column*/
SHOW FULL COLUMNS FROM table_name;
utf8mb4_unicode_ci
/*database*/
show variables like "character_set_database";
utf8mb4
/*table*/
SHOW TABLE STATUS where name like 'table_name';
utf8mb4_unicode_ci
/*variables*/
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
Here's what I added to my my.ini
[client]
default-character-set=utf8mb4
[mysql]
default-character-set=utf8mb4
[mysqld]
collation-server = utf8mb4_unicode_ci
character-set-server = utf8mb4
init-connect='SET NAMES utf8'
I am stuck
I get Incorrect string value: '\x82' for column 'colum_name'
Please explain. Show us the query. And SHOW CREATE TABLE
x82, in the popular latin1 character set is a variant of comma: ‚. It is unrelated to e-acute (HEX: latin1: E9, utf8: C3A9).

Rails show question marks(????) for my input utf8 data

I have set every encoding set variable I can figure out to utf8.
In database.yml:
development: &development
adapter: mysql2
encoding: utf8
In my.cnf:
[client]
default-character-set = utf8
[mysqld]
default-character-set = utf8
skip-character-set-client-handshake
character-set-server = utf8
collation-server = utf8_general_ci
init-connect = SET NAMES utf8
And if I run mysql client in terminal:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
But it's to beat the air. When I insert utf8 data from Rails app, it finally becomes ????????????.
What do I miss?
Check not global settings but when you are connected to specific database for application. When you changed settings for mysql you have also change settings for your app database.
Simple way to check it is to log to mysql into app db:
mysql app_db_production -u db_user -p
or rails command:
rails dbconsole production
For my app it looks like this:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | utf8_general_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
Command for changing database collation and charset:
mysql> alter database app_db_production CHARACTER SET utf8 COLLATE utf8_general_ci ;
Query OK, 1 row affected (0.00 sec)
And remeber to change charset and collation for all your tables:
ALTER TABLE tablename CHARACTER SET utf8 COLLATE utf8_general_ci; # changes for new records
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; # migrates old records
Now it should work.
I had the same problem. I added characterEncoding to the end of mysql connection string:
use this: jdbc:mysql://localhost/dbname?characterEncoding=utf8
instead of this: jdbc:mysql://localhost/dbname
Okay for anybody else for whom the #Ravbaker answer does not cut it .. some more tips
MySQL has encoding specified in multiple levels : server, database, connection, table and even field/column. My problem was that the field/column was forced to latin (which over rides all the other encodings). I set the field back to the table encoding (which was utf-8) and the world was good again.
Most of these settings can be set at the usual places: my.cnf, alter queries and rails database.yml file.
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
was the query which did the trick for me.
For server / connection encodings use my.cnf and database.yml
For database / table / column encodings use queries
(You can also achieve these by other means)
Do you have this in the HTML?
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
or on HTML5 pages with <!doctype html>
<meta charset="utf-8">
You may need this to let the browser send strings in utf8.
I have some problem today! It's solved by drop my table and creating new, then db:migrate and all is pretty works!
WARNING: IT WILL DELETE ALL YOUR DATA IN THIS TABLE
So:
$ mysql -u USER -p
mysql > drop database YOURDB_NAME_development;
mysql > create database YOURDB_NAME_development CHARACTER SET utf8 COLLATE utf8_general_ci;
mysql > \q
$ rake db:migrate
Well done!

tomcat/jdbc/mysql: can insert ÿ(U+00FF) but not Ā (U+0100)

my setup:
mysql 5.1
show variables:
| character_set_client | utf8
| character_set_connection | utf8
| character_set_database | utf8
| character_set_filesystem | binary
| character_set_results | utf8
| character_set_server | utf8
| character_set_system | utf8
| character_sets_dir | D:\Programme\MySQL\MySQL Server 5.1\share
charsets\
| collation_connection | utf8_general_ci
| collation_database | utf8_unicode_ci
| collation_server | utf8_general_ci
and even
| init_connect | SET collation_connection = utf8_general_ci; SET NAMES utf8;
the table table has character set utf8
tomcat 6.0
the jdbc connector uses characterEncoding="utf8" useUnicode="true"
now when i try
stmt.execute("UPDATE *table* SET *value*=\"ÿ\" WHERE ...)
it works but for
stmt.execute("UPDATE *table* SET *value*=\"Ā\" WHERE ...)
i get an
java.sql.SQLException: Incorrect string value: '\xC4\x80' for column
'value' at row 1
furthermore it works for all characters below ÿ, which can be encoded with 1 byte but as soon as 2 bytes are needed: bang!
why is that so? and how can i get it to work?
after i added another two tables to check if it's an MyISAM vs. InnoDB problem it just worked on the new tables and why?
in the new tables each column used the default charset while in my existing tables the charsets of each column were set to latin1. this was because i copied the db from a non-utf8 mysql instance and manually changed the table charset to utf-8. BUT while copying, HeidiSQL added a "CHARACTER SET latin1" to each column which wasn't changed when changing the charset AND it is not very easily visible in HeidiSQL that a column has an individual charset ...