How to search in MySQL in a case-insensitive way - mysql

How can I search for a user even if I don't know the exact spelling with regards to case, like
SELECT * FROM USER WHERE U.NAME = 'STEVEN';
Meaning: if a user has the name "Steven" or "sTEVEN" how can I search all persons who have this name?
I tried it, but it does not work.
WHERE email LIKE '%GMAIL%'
It did not not work when case was lowercase.

The case sensitivity is based on the character set and collation of the table.
The default character set and
collation are latin1 and
latin1_swedish_ci, so nonbinary string
comparisons are case insensitive by
default.
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
You may want to look at the collation of your table if the searches are case-sensitive.

Use UPPER
SELECT * FROM USER WHERE UPPER(U.NAME) = 'STEVEN'

Have a look at using
UPPER and LOWER

Related

Mysql does'nt distinguish between characters "c" and "ç" in UNIQUE index [duplicate]

These two querys gives me the exact same result:
select * from topics where name='Harligt';
select * from topics where name='Härligt';
How is this possible? Seems like mysql translates åäö to aao when it searches. Is there some way to turn this off?
I use utf-8 encoding everywhere as far as i know. The same problem occurs both from terminal and from php.
Yes, this is standard behaviour in the non-language-specific unicode collations.
9.1.13.1. Unicode Character Sets
To further illustrate, the following equalities hold in both utf8_general_ci and utf8_unicode_ci (for the effect this has in comparisons or when doing searches, see Section 9.1.7.7, “Examples of the Effect of Collation”):
Ä = A
Ö = O
Ü = U
See also Examples of the effect of collation
You need to either
use a collation that doesn't have this "feature" (namely utf8_bin, but that has other consequences)
use a different collation for the query only. This should work:
select * from topics where name='Harligt' COLLATE utf8_bin;
it becomes more difficult if you want to do a case insensitive LIKE but not have the Ä = A umlaut conversion. I know no mySQL collation that is case insensitive and does not do this kind of implicit umlaut conversion. If anybody knows one, I'd be interested to hear about it.
Related:
Looking for case insensitive MySQL collation where “a” != “ä”
MYSQL case sensitive search for utf8_bin field
Since you are in Sweden I'd recommend using the Swedish collation. Here's an example showing the difference it makes:
CREATE TABLE topics (name varchar(100) not null) CHARACTER SET utf8;
INSERT topics (name) VALUES ('Härligt');
select * from topics where name='Harligt';
'Härligt'
select * from topics where name='Härligt';
'Härligt'
ALTER TABLE topics MODIFY name VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_swedish_ci;
select * from topics where name='Harligt';
<no results>
select * from topics where name='Härligt';
'Härligt'
Note that in this example I only changed the one column to Swedish collation, but you should probably do it for your entire database, all tables, all varchar columns.
While collations are one way of solving this, the much more straightforward way seems to me to be the BINARY keyword:
SELECT 'a' = 'ä', BINARY 'a' = 'ä'
will return 1|0
In your case:
SELECT * FROM topics WHERE BINARY name='Härligt';
See also https://www.w3schools.com/sql/func_mysql_binary.asp
you want to check your collation settings, collation is the property that sets which characters are identical.
these 2 pages should help you
http://dev.mysql.com/doc/refman/5.1/en/charset-general.html
http://dev.mysql.com/doc/refman/5.1/en/charset-mysql.html
Here you can see some collation charts. http://collation-charts.org/mysql60/. I'm no sure which is the used utf8_general_ci though.
Here is the chart for utf8_swedish_ci. It shows which characters it interprets as the same. http://collation-charts.org/mysql60/mysql604.utf8_swedish_ci.html

mysql collation: case-preserving, case-insensitive but accent-sensitive [duplicate]

How can I perform accent-sensitive but case-insensitive utf8 search in mysql? Utf8_bin is case sensitive, and utf8_general_ci is accent insensitive.
If you want to differ "café" from "cafe"
You may use :
Select word from table_words WHERE Hex(word) LIKE Hex("café");
This way it will return 'café'.
Otherwise if you use :
Select word from table_words WHERE Hex(word) LIKE Hex("cafe");
it will return café.
I'm using the latin1_german2_ci Collation.
There doesn't seem to be one because case sensitivity is tough to do in Unicode.
There is a utf8_general_cs collation but it seems to be experimental, and according to this bug report, doesn't do what it's expected to when using LIKE.
If your data consists of western umlauts only (ie. umlauts that are included in ISO-8859-1), you might be able to collate your search operation to latin1_german2_ci or create a separate search column with it (that specific collation is accent sensitive according to this page; latin1_general_ci might be as well, I don't know and can't test right now).
You can use "hex" to make the search accent-sensitive. Then simply add lcase to make it case insensitive again. So that would give:
SELECT name FROM people WHERE HEX(LCASE(name)) = HEX(LCASE("René"))
You do throw all your indexes out of the window like that. If you want to avoid having to do a full table scan and you have an index on "name", also search for the same thing without the hex and lcase:
SELECT name FROM people WHERE name = "René" and HEX(LCASE(name)) = HEX(LCASE("René"))
This way the index on "name" will be used to find for example only the rows "René" and "Rene" and then the comparison with the "hex" needs to be done only on those two rows instead of on the complete table.

Case sensitive search in Django, but ignored in Mysql

I have a field in a Django Model for storing a unique (hash) value. Turns out that the database (MySQL/inno) doesn't do a case sensitive search on this type (VARCHAR), not even if I explicitly tell Django to do a case sensitive search Document.objects.get(hash__exact="abcd123"). So "abcd123" and "ABcd123" are both returned, which I don't want.
class document(models.Model):
filename = models.CharField(max_length=120)
hash = models.CharField(max_length=33 )
I can change the 'hash field' to a BinaryField , so in the DB it becomes a LONGBLOB , and it does do a case-sensitive search (and works). However, this doesn't seem very efficient to me.
Is there a better way (in Django) to do this, like adding 'utf8 COLLATE'? or what would be the correct Fieldtype in this situation?
(yes, I know I could use PostgreSQL instead..)
The default collation for character set for MySQL is latin1_swedish_ci, which is case insensitive. Not sure why that is. But you should create your database like so:
CREATE DATABASE database_name CHARACTER SET utf8;
As #dan-klasson mentioned, the default non-binary string comparison is case insensetive by default; notice the _ci at the end of latin1_swedish_ci, it stands for case-insensetive.
You can, as Dan mentioned, create the database with a case sensitive collation and character set.
You may be also interested to know that you can always create a single table or even set only a single column to use a different collation (for the same result). And you may also change these collations post creation, for instance per table:
ALTER TABLE documents__document CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Additionally, if you rather not change the database/table charset/collation, Django allows to run a custom query using the raw method. So you may be able to work around the change by using something like the following, though I have not tested this myself:
Document.objects.raw("SELECT * FROM documents__document LIKE '%s' COLLATE latin1_bin", ['abcd123'])

is possible to have accent sensitive and case insensitive utf8 collation in mysql?

How can I perform accent-sensitive but case-insensitive utf8 search in mysql? Utf8_bin is case sensitive, and utf8_general_ci is accent insensitive.
If you want to differ "café" from "cafe"
You may use :
Select word from table_words WHERE Hex(word) LIKE Hex("café");
This way it will return 'café'.
Otherwise if you use :
Select word from table_words WHERE Hex(word) LIKE Hex("cafe");
it will return café.
I'm using the latin1_german2_ci Collation.
There doesn't seem to be one because case sensitivity is tough to do in Unicode.
There is a utf8_general_cs collation but it seems to be experimental, and according to this bug report, doesn't do what it's expected to when using LIKE.
If your data consists of western umlauts only (ie. umlauts that are included in ISO-8859-1), you might be able to collate your search operation to latin1_german2_ci or create a separate search column with it (that specific collation is accent sensitive according to this page; latin1_general_ci might be as well, I don't know and can't test right now).
You can use "hex" to make the search accent-sensitive. Then simply add lcase to make it case insensitive again. So that would give:
SELECT name FROM people WHERE HEX(LCASE(name)) = HEX(LCASE("René"))
You do throw all your indexes out of the window like that. If you want to avoid having to do a full table scan and you have an index on "name", also search for the same thing without the hex and lcase:
SELECT name FROM people WHERE name = "René" and HEX(LCASE(name)) = HEX(LCASE("René"))
This way the index on "name" will be used to find for example only the rows "René" and "Rene" and then the comparison with the "hex" needs to be done only on those two rows instead of on the complete table.

How to perform a case sensitive search on a case insensitive MySQL database?

In my database which is utf8_general_ci, 99.99% of searches should be done case insensitive. Now there's a specific situation where I need to find some data in a case sensitive manner. The field is a varchar field, which I usually search case insensitive.
My question is: Can I perform a case sensitive search on a field that is usually case insensitive?
select * from page
where convert(pageTitle using latin1) collate utf8_ = 'Something'
i found this answer descriptive enough :)
You have to change the collation of operands which are used in the search query using COLLATE operator
OR
If you want a column always to be
treated in case-sensitive fashion,
declare it with a case sensitive or
binary collation.
See here http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
I would suggest changing the collation type so that it becomes case sensitive and then changing your code in the 99.99% of other cases. For information on changing the case of the field check: http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
You can use the LOWER() and UPPER() functions in SQL to standardise the case for insensitive searches once the field type is set to be case sensitive.
You can make a query case sensitive,
for example if you want to check for userid and password:
Select * from TableName Where WHERE LoginId=#userid COLLATE SQL_Latin1_General_CP1_CS_AS AND Password=#password COLLATE SQL_Latin1_General_CP1_CS_AS
it checks for userid and password and returns the values if they are true (case sensitive).