How could I do this SQL query?
Name Database: database_1
Column: Collation
Value: latin1_swedish_ci
I need the list of ALL "database_1" tables that have columns whose value contains the text %latin1%.
My final goal: I need to change these values in all the tables that contain this data, for example:
CHARACTER SET latin1 COLLATE latin1_swedish_ci
Change to:
CHARACTER SET utf8 COLLATE utf8_unicode_ci
I already have the consult ready for replacement:
ALTER TABLE `blog_sitemapconf` CHANGE `value` `value` VARCHAR(100) CHARACTER SET latin1 COLLATE latin1_english_ci NOT NULL DEFAULT '';
But I need to get the name of all tables that contain columns with
CHARACTER SET latin1 COLLATE latin1_swedish_ci
You can query this information from the information_schema.columns table:
SELECT table_name, column_name
FROM information_schema.columns
WHERE collation_name LIKE '%latin1%'
First, determine how you will be doing the charset conversion. If the table is correctly encoded in latin1, then this is the desired conversion:
ALTER TABLE tbl CONVERT TO CHARACTER SET utf8;
If, instead, you have utf8 bytes stored in latin1 columns, then the 'fix' is more complex.
However, that changes all CHAR/TEXT columns in the table. Is that OK?
If that is OK, then this will generate all the ALTERs.
SELECT DISTINCT
CONCAT("ALTER TABLE ", table_schema, ".", table_name,
" CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;")
FROM information_schema.columns
WHERE character_set = 'latin1'
AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
You can then copy/paste them into the mysql commandline tool.
If you need to change only individual columns, then the fix is difficult because the rest of the attributes of each column must be repeated. This is not easy to do in a query.
I have a utf8_general_ci database that I'm interested in converting to utf8_unicode_ci.
I've tried the following commands
ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci; (for every single table)
But that seems to change the charset for future data but doesn't convert the actual existing data from utf8_general_ci to utf8_unicode_ci.
Is there any way to convert the existing data to utf8_unicode_ci?
SHOW CREATE TABLE to see if it really set the CHARACTER SET and COLLATION on the columns, not just the defaults.
What was the CHARACTER SET before the ALTERs?
Do SELECT col, HEX(col) ... for some field that should have utf8 in it. This will help us determine if you really have utf8 in the table. The encoding for characters is different based on CHARACTER SET; the HEX helps discover such.
The ordering (WHERE, ORDER BY, etc) is controlled by COLLATION. The indexes probably had to be rebuilt based on your ALTER TABLE. Did big tables with indexes take a 'long' time to convert?
To actually see the difference between utf8_general_ci and utf8_unicode_ci, you need a "combining accent" or, more simply, the German ß versus ss:
mysql> SELECT 'ß' = 'ss' COLLATE utf8_general_ci,
'ß' = 'ss' COLLATE utf8_unicode_ci;
+-------------------------------------+-------------------------------------+
| 'ß' = 'ss' COLLATE utf8_general_ci | 'ß' = 'ss' COLLATE utf8_unicode_ci |
+-------------------------------------+-------------------------------------+
| 0 | 1 |
+-------------------------------------+-------------------------------------+
However, to test that in your tables, you would need to store those values and use WHERE or GROUP_CONCAT or something else to determine the equality.
What 'proof' do you have that the ALTERs failed to achieve the collation change?
(Addressing other comments: REPAIR should be irrelevant. CONVERT TO tells the ALTER to actually modify the data, so it should have done the desired action.)
You have to change the collation of every field in every table. As you say, the collation of the table is only the default value for fields created later, and the collation of the database is only the default value for tables created later.
As Lorenz Meyer said, the collation of the table is only the default value for fields created later and you need to set the defaults for the columns explicitly too.
Such a change looks like:
ALTER TABLE mytable CHANGE mycolumn mycolumn varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
Error message on MySql:
Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='
I have gone through several other posts and was not able to solve this problem.
The part affected is something similar to this:
CREATE TABLE users (
userID INT UNSIGNED NOT NULL AUTO_INCREMENT,
firstName VARCHAR(24) NOT NULL,
lastName VARCHAR(24) NOT NULL,
username VARCHAR(24) NOT NULL,
password VARCHAR(40) NOT NULL,
PRIMARY KEY (userid)
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE products (
productID INT UNSIGNED NOT NULL AUTO_INCREMENT,
title VARCHAR(104) NOT NULL,
picturePath VARCHAR(104) NULL,
pictureThumb VARCHAR(104) NULL,
creationDate DATE NOT NULL,
closeDate DATE NULL,
deleteDate DATE NULL,
varPath VARCHAR(104) NULL,
isPublic TINYINT(1) UNSIGNED NOT NULL DEFAULT '1',
PRIMARY KEY (productID)
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE productUsers (
productID INT UNSIGNED NOT NULL,
userID INT UNSIGNED NOT NULL,
permission VARCHAR(16) NOT NULL,
PRIMARY KEY (productID,userID),
FOREIGN KEY (productID) REFERENCES products (productID) ON DELETE RESTRICT ON UPDATE NO ACTION,
FOREIGN KEY (userID) REFERENCES users (userID) ON DELETE RESTRICT ON UPDATE NO ACTION
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
The stored procedure I'm using is this:
CREATE PROCEDURE updateProductUsers (IN rUsername VARCHAR(24),IN rProductID INT UNSIGNED,IN rPerm VARCHAR(16))
BEGIN
UPDATE productUsers
INNER JOIN users
ON productUsers.userID = users.userID
SET productUsers.permission = rPerm
WHERE users.username = rUsername
AND productUsers.productID = rProductID;
END
I was testing with php, but the same error is given with SQLyog.
I have also tested recreating the entire DB but to no good.
Any help will be much appreciated.
The default collation for stored procedure parameters is utf8_general_ci and you can't mix collations, so you have four options:
Option 1: add COLLATE to your input variable:
SET #rUsername = ‘aname’ COLLATE utf8_unicode_ci; -- COLLATE added
CALL updateProductUsers(#rUsername, #rProductID, #rPerm);
Option 2: add COLLATE to the WHERE clause:
CREATE PROCEDURE updateProductUsers(
IN rUsername VARCHAR(24),
IN rProductID INT UNSIGNED,
IN rPerm VARCHAR(16))
BEGIN
UPDATE productUsers
INNER JOIN users
ON productUsers.userID = users.userID
SET productUsers.permission = rPerm
WHERE users.username = rUsername COLLATE utf8_unicode_ci -- COLLATE added
AND productUsers.productID = rProductID;
END
Option 3: add it to the IN parameter definition (pre-MySQL 5.7):
CREATE PROCEDURE updateProductUsers(
IN rUsername VARCHAR(24) COLLATE utf8_unicode_ci, -- COLLATE added
IN rProductID INT UNSIGNED,
IN rPerm VARCHAR(16))
BEGIN
UPDATE productUsers
INNER JOIN users
ON productUsers.userID = users.userID
SET productUsers.permission = rPerm
WHERE users.username = rUsername
AND productUsers.productID = rProductID;
END
Option 4: alter the field itself:
ALTER TABLE users CHARACTER SET utf8 COLLATE utf8_general_ci;
Unless you need to sort data in Unicode order, I would suggest altering all your tables to use utf8_general_ci collation, as it requires no code changes, and will speed sorts up slightly.
UPDATE: utf8mb4/utf8mb4_unicode_ci is now the preferred character set/collation method. utf8_general_ci is advised against, as the performance improvement is negligible. See https://stackoverflow.com/a/766996/1432614
I spent half a day searching for answers to an identical "Illegal mix of collations" error with conflicts between utf8_unicode_ci and utf8_general_ci.
I found that some columns in my database were not specifically collated utf8_unicode_ci. It seems mysql implicitly collated these columns utf8_general_ci.
Specifically, running a 'SHOW CREATE TABLE table1' query outputted something like the following:
| table1 | CREATE TABLE `table1` (
`id` int(11) NOT NULL,
`col1` varchar(4) CHARACTER SET utf8 NOT NULL,
`col2` int(11) NOT NULL,
PRIMARY KEY (`col1`,`col2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
Note the line 'col1' varchar(4) CHARACTER SET utf8 NOT NULL does not have a collation specified. I then ran the following query:
ALTER TABLE table1 CHANGE col1 col1 VARCHAR(4) CHARACTER SET utf8
COLLATE utf8_unicode_ci NOT NULL;
This solved my "Illegal mix of collations" error. Hope this might help someone else out there.
I had a similar problem, but it occurred to me inside procedure, when my query param was set using variable e.g. SET #value='foo'.
What was causing this was mismatched collation_connection and Database collation. Changed collation_connection to match collation_database and problem went away. I think this is more elegant approach than adding COLLATE after param/value.
To sum up: all collations must match. Use SHOW VARIABLES and make sure collation_connection and collation_database match (also check table collation using SHOW TABLE STATUS [table_name]).
A bit similar to #bpile answer, my case was a my.cnf entry setting collation-server = utf8_general_ci. After I realized that (and after trying everything above), I forcefully switched my database to utf8_general_ci instead of utf8_unicode_ci and that was it:
ALTER DATABASE `db` CHARACTER SET utf8 COLLATE utf8_general_ci;
Answer is adding to #Sebas' answer - setting the collation of my local environment. Do not try this on production.
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Source of this solution
In my own case I have the following error
Illegal mix of collations (utf8_general_ci,IMPLICIT) and (utf8_unicode_ci,IMPLICIT) for operation '='
$this->db->select("users.username as matric_no, CONCAT(users.surname,
' ',
users.first_name, ' ', users.last_name) as fullname")
->join('users', 'users.username=classroom_students.matric_no', 'left')
->where('classroom_students.session_id', $session)
->where('classroom_students.level_id', $level)
->where('classroom_students.dept_id', $dept);
After weeks of google searching I noticed that the two fields I am comparing consists of different collation name. The first one i.e username is of utf8_general_ci while the second one is of utf8_unicode_ci so I went back to the structure of the second table and changed the second field (matric_no) to utf8_general_ci and it worked like a charm.
Despite finding an enormous number of question about the same problem (1, 2, 3, 4) I have never found an answer that took performance into consideration, even here.
Although multiple working solutions has been already given I would like to do a performance consideration.
EDIT: Thanks to Manatax for pointing out that option 1 does not suffer of performance issues.
Using Option 1 and 2, aka the COLLATE cast approach, can lead to potential bottleneck, cause any index defined on the column will not be used causing a full scan.
Even though I did not try out Option 3, my hunch is that it will suffer the same consequences of option 1 and 2.
Lastly, Option 4 is the best option for very large tables when it is viable. I mean there are no other usage that rely on the original collation.
Consider this simplified query:
SELECT
*
FROM
schema1.table1 AS T1
LEFT JOIN
schema2.table2 AS T2 ON T2.CUI = T1.CUI
WHERE
T1.cui IN ('C0271662' , 'C2919021')
;
In my original example, I had many more joins.
Of course, table1 and table2 have different collations.
Using the collate operator to cast, it will lead to indexes not being used.
See sql explanation in the picture below.
Visual Query Explanation when using the COLLATE cast
On the other hand, option 4 can take advantages of possible index and led to fast queries.
In the picture below, you can see the same query being run after applied Option 4, aka altering the schema/table/column collation.
Visual Query Explanation after the collation has been changed, and therefore without the collate cast
In conclusion, if performance are important and you can alter the collation of the table, go for Option 4.
If you have to act on a single column, you can use something like this:
ALTER TABLE schema1.table1 MODIFY `field` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
This happens where a column is explicitly set to a different collation or the default collation is different in the table queried.
if you have many tables you want to change collation on run this query:
select concat('ALTER TABLE ', t.table_name , ' CONVERT TO CHARACTER
SET utf8 COLLATE utf8_unicode_ci;') from (SELECT table_name FROM
information_schema.tables where table_schema='SCHRMA') t;
this will output the queries needed to convert all the tables to use the correct collation per column
I was also facing a problem while upload excels file in MySql Database form Laravel. In excel file some addresses contain characters like PeterÕs that Error was
SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_unicode_ci,COERCIBLE) for operation '=' (SQL: select count(*) as aggregate from `.....` where `e....` = kas#email.com and `first_name` = Gill and `surname` = Harries and `address` = 6 St.PeterÕs Close,,,Woodbridge,Suffolk,IP12 4EJ and `status` = 1 and `client`.`deleted_at` is null)
I try many solutions that is mentioned above but unfortunately, no one works for me. so I post the solution that works for me. so maybe it will not work in some situations. Exception can be different. so you have to find a solution based on that. please vote Up if some finds this is useful.
I try 2 types of solutions.
I putted my exceptional code into try{} Catch() block for handling. because it was giving an error while executing select() query. so that I use TRY{} for this. using that uploaded record stored in the database. But when charset issue comes on that place it is put ?. that sample image is given below.
you can detect the charset that gives an issue while operating and change the MySql configuration by this code
\Config::set('database.connections.mysql.charset', 'latin1');
\Config::set('database.connections.mysql.collation', 'latin1_bin');
\DB::purge('mysql');
Both solution works for me.
Hello everyone I am using MySQL 5.0, but when I fire my queries through my web application that is in Java they are case insensitive.
First query:
select * from market where company='"abc"'
Second query:
select 8 from market where company='"ABC"'
Both queries give me same results. I just want rows with company "abc" only and not ABC.
How I can solve this problem? Thanks.
You need to set the proper collation.
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
You can use a binary collation for the field, for instance utf8_bin
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
If you want to do always case sensitive searches on that field, you can use an alter table to set utf8_bin, or another binary collation, for example:
ALTER TABLE `market` CHANGE `company` `company` VARCHAR( 255 )
CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL
I created a table and set the collation to utf8 in order to be able to add a unique index to a field. Now I need to do case insensitive searches, but when I performed some queries with the collate keyword and I got:
mysql> select * from page where pageTitle="Something" Collate utf8_general_ci;
ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for
CHARACTER SET 'latin1'
mysql> select * from page where pageTitle="Something" Collate latin1_general_ci;
ERROR 1267 (HY000): Illegal mix of collations (utf8_bin,IMPLICIT) and
(latin1_general_ci,EXPLICIT) for operation '='
I am pretty new to SQL, so I was wondering if anyone could help.
A string in MySQL has a character set and a collation. Utf8 is the character set, and utf8_bin is one of its collations. To compare your string literal to an utf8 column, convert it to utf8 by prefixing it with the _charset notation:
_utf8 'Something'
Now a collation is only valid for some character sets. The case-sensitive collation for utf8 appears to be utf8_bin, which you can specify like:
_utf8 'Something' collate utf8_bin
With these conversions, the query should work:
select * from page where pageTitle = _utf8 'Something' collate utf8_bin
The _charset prefix works with string literals. To change the character set of a field, there is CONVERT ... USING. This is useful when you'd like to convert the pageTitle field to another character set, as in:
select * from page
where convert(pageTitle using latin1) collate latin1_general_cs = 'Something'
To see the character and collation for a column named 'col' in a table called 'TAB', try:
select distinct collation(col), charset(col) from TAB
A list of all character sets and collations can be found with:
show character set
show collation
And all valid collations for utf8 can be found with:
show collation where charset = 'utf8'
Also please note that in case of using "Collate utf8_general_ci" or "Collate latin1_general_ci", i.e. "force" collate - such a converting will prevent from usage of existing indexes! This could be a bottleneck in future for performance.
Try this, Its working for me
SELECT * FROM users WHERE UPPER(name) = UPPER('josé') COLLATE utf8_bin;
May I ask why you have a need to explicitly change the collation when you do a SELECT? Why not just collate in the way you want to retrieve the records when sorted?
The problem you are having with your searches being case sensitive is that you have a binary collation. Try instead to use the general collation. For more information about case sensitivity and collations, look here:
Case Sensitivity in String Searches