I am aware that MySQL is case-insensitive by default.
I also read about using the collation utf8_general_cs to enable case sensitivity, but I get an error saying the collation is not recognized. Also, when I query the collations for charset utf8, the result set shows only ci collations. So question number 1 would be: do we need to configure cs collations? If so, I would like some guidance on it. Or does it depend on a particular database engine?
I also read about using the utf8_bin collation to make MySQL searches case-sensitive. I did so and set the schema collation to utf8_bin, but it didn't work. I restarted the MySQL service as well to ensure that the collation had been updated. Yet when I run
Select * from table where name like 'el%';
It returns names starting with 'EL' as well.
Note: I am preferably looking for options to set the collation at the database level.
MySQL server version 5.6.x
Column collation has precedence over database and table collations. If you've been making changes, it's possible that your column is currently using the value that was the default when the table was created. You should be able to spot it with a proper SQL tool or by running:
SELECT table_schema, column_name, collation_name
FROM information_schema.columns
WHERE table_schema = 'your database name'
AND table_name = 'your table name'
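If the query shows that the column still carries a _ci collation, one way to change it is sketched below. The table name your_table, the column name, and the VARCHAR(255) NOT NULL definition are placeholders, not taken from the question; MODIFY needs the column's full existing definition repeated, otherwise attributes that are left out are reset.

```sql
-- Hypothetical table and column; repeat the column's real type and attributes.
ALTER TABLE your_table
  MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;

-- Alternatively, convert every character column of the table at once.
ALTER TABLE your_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;
```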
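If you aren't willing to change the column collation, you can set it at expression level. The query below uses utf8mb4_0900_as_cs, which exists only in MySQL 8.0 and later; on 5.6, substitute a binary collation of the column's character set, such as utf8_bin: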
SELECT *
FROM foo
WHERE bar LIKE 'el%' COLLATE utf8mb4_0900_as_cs;
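On MySQL 5.6, a variant of the query above might look like this. It is a sketch that assumes the column and the connection both use the utf8 character set (with utf8mb4, utf8mb4_bin would be the counterpart); foo and bar are the placeholder names from the query above.

```sql
-- Case-sensitive prefix match through a binary collation of the utf8 charset.
SELECT *
FROM foo
WHERE bar LIKE 'el%' COLLATE utf8_bin;

-- Same idea with the BINARY operator, which makes the comparison byte by byte.
SELECT *
FROM foo
WHERE bar LIKE BINARY 'el%';
```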
Collation affects sorting and character comparison so you'll have to read some docs to figure out which one suits your needs best (it isn't straightforward if you aren't a Unicode geek).
My projects are all in Spanish so I tend to use utf8mb4_spanish_ci a lot ;-)
I thought that normally worked?!
Anyway if this is a localised problem there are a few solutions:
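One option is a regular expression, e.g. SELECT * FROM users WHERE name REGEXP BINARY '^el'. Note that REGEXP is itself case-insensitive unless one of the operands is a binary string, hence the BINARY.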
The alternative
Surely you are accessing this through another language's wrapper, for automation or handling, rather than through the MySQL console, right? I would suggest doing the case-sensitive filtering in the language you're using for the wrapper, where it can easily be implemented. Still, query only for the rows beginning with 'el', as this reduces the number of rows to check.
Related
I am working on a Debian server with XAMPP 1.8.3-2 and MySQL server version 5.6.14 installed. In the old database, latin1_german2_ci was used as the collation for the database and the tables because of German characters like ä, ö, ü, ß, etc. Now I have to convert the collation of the tables to utf8_unicode_ci (the collation of the database is still latin1_german2_ci). After that, queries like this one no longer produce the correct results: it returns not only everything containing kö but everything containing ko as well. How can I fix this?
SELECT * FROM users Where lastname like "%kö%"
edit: Just found one solution which uses COLLATE:
SELECT * FROM users Where lastname like "%kö%" COLLATE utf8mb4_german2_ci
However, this has to be adjusted depending on which server connection collation is used, so if the server connection collation is utf8_unicode_ci, the query has to be changed to
SELECT * FROM users Where lastname like "%kö%" COLLATE utf8_german2_ci
So my question now: is there a better/more elegant way to solve my problem? Is there any option in the database to prevent this?
Try setting the table's charset to UTF-8 and the collation to one of the utf8_* collations, and make sure there is no _ci at the end (_ci stands for "case insensitive"); otherwise MySQL will perform both case-insensitive and accent-insensitive searches.
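A quick way to see how a candidate collation treats ö is to compare two literals directly. This is a sketch that assumes the connection character set is utf8 and that the client really sends UTF-8; with a different connection character set the collation names would need to change accordingly. The VARCHAR(100) type in the ALTER statement is a placeholder for the column's real definition.

```sql
SET NAMES utf8;

-- 1 means the strings compare as equal under that collation, 0 means they differ.
SELECT
  'kö' = 'ko' COLLATE utf8_unicode_ci AS unicode_ci,  -- general Unicode rules
  'kö' = 'ko' COLLATE utf8_german2_ci AS german2_ci,  -- phone-book rules: ö is compared as oe
  'kö' = 'ko' COLLATE utf8_bin        AS bin;         -- exact comparison of the characters

-- Whichever collation behaves as wanted can then be set on the column itself,
-- so the queries no longer need a COLLATE clause.
ALTER TABLE users
  MODIFY lastname VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_german2_ci;
```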
I have my MySQL collation set to utf8_general_ci, and although my searches are diacritic-insensitive, i.e. LIKE 'test' returns 'tést', some searches that I would like to work fail; in particular, LIKE 'host' will NOT return 'høst'.
Two questions: Is there a table that will show which characters are equivalent for a particular collation? Is there a way to set two characters as equivalents in MySQL as an override?
Thanks for the help.
To answer your first question, you can reference collation-charts.org. It's kind of a pain because you will need to look up each collation by hand, but it'll show you how the characters stack up.
The relevant section in the MySQL manual can also be found here.
As far as your second question, I'm not sure if you can do an explicit override for a particular character; however you can create your own custom character set if you wish.
You can read about creating custom collations from the MySQL manual.
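Besides the charts, equivalence under a given collation can be probed directly from a client. A minimal sketch, assuming a utf8 connection and a client that really sends UTF-8:

```sql
SET NAMES utf8;

-- List the collations the server offers for the utf8 character set.
SHOW COLLATION WHERE Charset = 'utf8';

-- 1 means the two strings compare as equal under that collation, 0 means they differ.
SELECT
  'tést' = 'test' COLLATE utf8_general_ci AS e_acute_general,
  'høst' = 'host' COLLATE utf8_general_ci AS o_stroke_general,
  'høst' = 'host' COLLATE utf8_unicode_ci AS o_stroke_unicode;
```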
I'm trying to execute this concat query in mysql
SELECT CONCAT(if(fName,fName,''),Name)
From Student
Error:
#1271 - Illegal mix of collations for operation 'concat'
This is due to a collation mismatch. You can solve it by converting the two strings or columns
to one character set, say utf8:
CONCAT(CAST(fName AS CHAR CHARACTER SET utf8),CAST('' AS CHAR CHARACTER SET utf8))
This will solve :)
You can read more about casting in MySQL under "MySQL Casting".
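Applied to the whole query, the cast might look like the sketch below. It assumes fName and Name are character columns of Student and that utf8 is an acceptable target character set; IFNULL stands in for the IF from the question, and the alias full_name is made up for illustration. CONVERT(... USING ...) is an equivalent spelling of the cast.

```sql
SELECT CONCAT(
         CAST(IFNULL(fName, '') AS CHAR CHARACTER SET utf8),
         CAST(Name AS CHAR CHARACTER SET utf8)
       ) AS full_name
FROM Student;

-- Equivalent form using CONVERT ... USING.
SELECT CONCAT(
         CONVERT(IFNULL(fName, '') USING utf8),
         CONVERT(Name USING utf8)
       ) AS full_name
FROM Student;
```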
The charsets and/or collations you use in your connection do not match the charset/collation in your table.
There are 4 solutions:
1- Change the charset in your connection:
-- find out the collation (and hence the charset) used in your table
SHOW TABLE STATUS LIKE 'student';
-- set the connection charset to match
SET NAMES 'charset_name' [COLLATE 'collation_name'];
2- Change the charset used in your table to match the server charset:
-- find out the charset used in the server
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
-- change the charset used in the table
ALTER TABLE student ......
3- Change the default charset settings and restart MySQL
Edit my.ini (my.cnf on Linux) and replace the character_set_* options so that they match your tables.
4- Change the charset settings for your connection
Your client can override the charset and collation settings.
If it does not, option 1 or 3 should fix your issue; but if the connection overrides these settings, you need to check the connection string and edit the charset/collation settings to match your database.
Some advice:
Pick one charset (I recommend utf8) and one collation (I recommend utf8_general_ci), and use them consistently everywhere.
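Options 1 and 2 above, combined with the utf8 / utf8_general_ci recommendation, could be sketched as follows. The ALTER TABLE form shown is one possible choice and not necessarily the one this answer had in mind; take a backup before converting real data.

```sql
-- Inspect what the table and the session currently use.
SHOW TABLE STATUS LIKE 'student';        -- see the Collation column
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

-- Option 1: make the connection match the table.
SET NAMES utf8 COLLATE utf8_general_ci;

-- Option 2: convert the table (and its character columns) instead.
ALTER TABLE student CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```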
A concatenation can only work if the collation of all used values matches OR you use a collation that all collations are a subset of (from a logical standpoint).
If you want to concatenate text, each text should be the same collation. Take a look at the collation the database uses, then take a look at the collation that your connection uses:
show variables like '%coll%'
The collation_connection should match the collation of the table you try to concatenate. If it doesn't, the error message will arise.
You can then change the connection collation to match the one of the table.
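To see which operand carries which collation, the COLLATION() and COERCIBILITY() functions can be applied to each argument of the CONCAT. A sketch, assuming the Student table and columns from the question; the column aliases are made up for readability.

```sql
-- Collations the session and server currently use.
SHOW VARIABLES LIKE '%coll%';

-- Collation and coercibility of each CONCAT argument; a lower coercibility
-- value takes precedence when MySQL decides which collation to use.
SELECT COLLATION(fName)    AS fname_collation,
       COERCIBILITY(fName) AS fname_coercibility,
       COLLATION(Name)     AS name_collation,
       COLLATION('')       AS literal_collation,
       COERCIBILITY('')    AS literal_coercibility
FROM Student
LIMIT 1;
```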
It looks like you are misusing IF there: IF(fName, fName, '') evaluates fName as a truth value rather than testing it for NULL, and the resulting data type is not well defined, so the CONCAT fails on the differing types. Try changing the query to use IFNULL instead.
Try this query instead:
SELECT concat(ifnull(fName,''),Name) From Student
See the demo here: http://www.sqlize.com/kfy85j8f1e
For another reference, see also http://forums.mysql.com/read.php?10,225982,225982#msg-225982
It can also be caused by your client library being too old for the MySQL server.
We had a similar problem with LIKE and the character "ő", using PHP MySQL library version 5.1.52 with MySQL server version 5.5.22.
The problem went away after upgrading the client library.
Edit: if you're here because you're confused by the Polish collation in MySQL, read this.
I'm trying to perform a full-text search on a table of Polish cities, and many of them contain accented characters. It's meant to be used in an ajax call for auto completion, so it would be nice if the search was accent-insensitive. I've set the collation of the column to utf8_polish_ci. Now, given the city "Zelów", I query the database like this
SELECT * FROM cities WHERE MATCH( city ) AGAINST ("zelow")
but to no avail. MySQL returns an empty result. I've tried different accents and tried adding different collations to the query, but nothing helped. I'm not sure how I should approach this because accent-sensitivity seems to be poorly documented. Any ideas?
EDIT
So I found out that the case-insensitive full-text searches are performed only IN BOOLEAN MODE, so the correct query would be
SELECT * FROM cities WHERE MATCH( city ) AGAINST ( "zelow" IN BOOLEAN MODE )
Previously I thought otherwise due to a misleading comment on dev.mysql.com. There might be more to it but I'm just really confused right now.
Anyway, as mentioned in the comments below, I have a UNIQUE index on the city column, so changing the collation of the table to the accent-insensitive utf8_general_ci is out of the question.
I realized however, that the following query works quite well on a table with utf8_polish_ci collation:
SELECT * FROM cities WHERE city LIKE 'zelow' COLLATE utf8_general_ci
It would seem now that the most reasonable solution would be to do a full-text search in a similar fashion:
SELECT * FROM cities WHERE MATCH( city ) AGAINST ( 'zelow' IN BOOLEAN MODE ) COLLATE utf8_general_ci
This however yields the following error:
#1253 - COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'
This is really starting to get on my nerves. Might as well abandon full-text search in favour of a simple where-like approach but it doesn't seem sensible in a table with almost 50k records which will be intensively queried...
LAST EDIT
Ok, the thing with boolean mode was partly bullshit. Only partly because it indeed works as I said, however, on a utf8_general_ci it works the other way around. I'm utterly perplexed and have no will to study this issue further. I decided to drop the UNIQUE index (no further cities will be added anyway so no need to make a big deal out of it) and stick with the utf8_general_ci table collation. I appreciate all the help, it steered me in the right direction.
Change your collation to utf8_general_ci. It ignores accents when searching and ordering but still stores them correctly.
MySQL is very flexible in the encoding/collation area, maybe too flexible. When changing your encoding/collation, make sure you are converting the table, not just changing the encoding/collation types.
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
You can also convert individual fields, so your table can have a collation setting of utf8_general_ci while one or more fields use some other collation. Based on the "binary" error you are seeing, it seems your text field might have a collation of utf8_bin (or be a BLOB). Can you post the result of SHOW CREATE TABLE?
Remember, the CHARACTER SET (encoding) is how the data is stored; the collation is how it is compared, sorted, and indexed. Not all combinations work.
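To check what each column actually uses, and to change a single field as described above, something like the following could be used. The cities table and city column come from the question; the VARCHAR(100) type is an assumption and must be replaced by the column's real definition.

```sql
-- Show the full definition, including per-column charset and collation.
SHOW CREATE TABLE cities;
SHOW FULL COLUMNS FROM cities;   -- the Collation column lists each field's collation

-- Change one column only; the table default stays as it is.
ALTER TABLE cities
  MODIFY city VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_general_ci;
```

If the column has a UNIQUE index, the ALTER can fail with a duplicate-key error when two stored values become equal under the new collation.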
My original problem, and question, might help a little:
Converting mysql tables from latin1 to utf8
If you try :
select * from cities where cityname like 'zelow'
Change your collation from binary to utf8_bin. utf8_bin should be compatible with utf8_general_ci, but will still allow you to store city names with differing accents.
I set up a MyISAM table to do FULLTEXT searching.
I do not want searches to be case-sensitive.
My searches are along the lines of:
SELECT * FROM search WHERE MATCH (keywords) AGAINST ('+diversity +kitten' IN BOOLEAN MODE);
Let's say the keywords field I'm looking for has the value "my Diversity kitten".
I noticed the searches were case-sensitive.
I double-checked my collation on the search table, it was set to utf8_bin. D'oh!
I changed it to utf8_general_ci.
But my query is still case-sensitive!
Why?
Is there a server setting I need to change, too?
Is there something I need to do besides change the collation?
I did a "REPAIR TABLE search QUICK" to rebuild the FULLTEXT index, but that didn't do it either...
My searches are still case-sensitive. =(
Aha, figured it out for reals this time.
I believe my issue was using NaviCat to update the collation. I have an older version of NaviCat, maybe it was a bug or something.
Doing:
ALTER TABLE search CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
fixed it correctly.
Always use command line, kids! =)
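One plausible explanation for the difference, offered as an assumption since the thread does not show what NaviCat executed: changing only the table's default collation leaves existing columns untouched, while CONVERT TO rewrites them. The two forms side by side:

```sql
-- Changes only the default used for columns added later;
-- existing columns keep their current collation.
ALTER TABLE search DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Converts the existing character columns as well.
ALTER TABLE search CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Verify the collation each column ended up with.
SHOW FULL COLUMNS FROM search;
```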
Hmm - that behavior doesn't match the manual:
By default, the search is performed in case-insensitive fashion. However, you can perform a case-sensitive full-text search by using a binary collation for the indexed columns. For example, a column that uses the latin1 character set can be assigned a collation of latin1_bin to make it case sensitive for full-text searches.
Which version of MySQL do you use? Can you provide some data that would allow replicating the problem on another machine?