I have a MySQL query, where I filter by a json field:
SELECT id, username
FROM (SELECT id,
Json_extract(payload, '$.username') AS username
FROM table1) AS tmp
WHERE username = 'userName1';
It returns 1 row, which looks like:
1, "userName1" See the quotes that are not in the clause?
What I need is to make the WHERE clause case insensitive.
But when I do
WHERE username LIKE 'userName1';
it returns 0 rows. I don't understand why it works this way, the = clause works though it doesn't have those double quotes.
If I do
WHERE username LIKE '%userName1%';
now also returns the row, because %% takes quotes into consideration:
1, "userName1"
But when I do
WHERE username LIKE '%username1%';
it returns 0 rows, so unlike the usual MySQL LIKE it's somehow case sensitive.
What am I doing wrong and how to filter the json payload the case insensitive way?
EDIT=========================================
The guess is that COLLATE should be used here, but so far I don't understand how to make it work.
Default collation of MySQL is latin1_swedish_ci before 8.0 and utf8mb4_0900_ai_ci since 8.0. So non-binary string comparisons are case-insensitive by default in ordinary columns.
However, as mentioned in MySQL manual for JSON type
MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation.".
Therefore, your JSON value is in utf8mb4_bin collation and you need to apply a case insensitive collation to either operand to make the comparison case insensitive.
E.g.
WHERE username COLLATE XXX LIKE '...'
where XXX should be a utf8mb4 collation (such as the utf8mb4_general_ci you've mentioned.).
Or
WHERE username LIKE '...' COLLATE YYY
where YYY should be a collation that match the character set of you connection.
For equality comparison, you should unquote the JSON value with JSON_UNQUOTE() or the unquoting extraction operator ->>
E.g.
JSON_UNQUOTE(JSON_EXTRACT(payload, '$.username'))
Or simply
payload->>'$.username'
The JSON type and functions work way different from ordinary data types. It appears that you are new to it. So I would suggest you to read the manual carefully before putting it into a production environment.
Okay, I was able to solve the case insensitivity by adding COLLATE utf8mb4_general_ci after the LIKE clause.
So the point here is to find a working collation, which in its turn can be found by researching the db you work with.
Related
I have a MySQL query, where I filter by a json field:
SELECT id, username
FROM (SELECT id,
Json_extract(payload, '$.username') AS username
FROM table1) AS tmp
WHERE username = 'userName1';
It returns 1 row, which looks like:
1, "userName1" See the quotes that are not in the clause?
What I need is to make the WHERE clause case insensitive.
But when I do
WHERE username LIKE 'userName1';
it returns 0 rows. I don't understand why it works this way, the = clause works though it doesn't have those double quotes.
If I do
WHERE username LIKE '%userName1%';
now also returns the row, because %% takes quotes into consideration:
1, "userName1"
But when I do
WHERE username LIKE '%username1%';
it returns 0 rows, so unlike the usual MySQL LIKE it's somehow case sensitive.
What am I doing wrong and how to filter the json payload the case insensitive way?
EDIT=========================================
The guess is that COLLATE should be used here, but so far I don't understand how to make it work.
Default collation of MySQL is latin1_swedish_ci before 8.0 and utf8mb4_0900_ai_ci since 8.0. So non-binary string comparisons are case-insensitive by default in ordinary columns.
However, as mentioned in MySQL manual for JSON type
MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation.".
Therefore, your JSON value is in utf8mb4_bin collation and you need to apply a case insensitive collation to either operand to make the comparison case insensitive.
E.g.
WHERE username COLLATE XXX LIKE '...'
where XXX should be a utf8mb4 collation (such as the utf8mb4_general_ci you've mentioned.).
Or
WHERE username LIKE '...' COLLATE YYY
where YYY should be a collation that match the character set of you connection.
For equality comparison, you should unquote the JSON value with JSON_UNQUOTE() or the unquoting extraction operator ->>
E.g.
JSON_UNQUOTE(JSON_EXTRACT(payload, '$.username'))
Or simply
payload->>'$.username'
The JSON type and functions work way different from ordinary data types. It appears that you are new to it. So I would suggest you to read the manual carefully before putting it into a production environment.
Okay, I was able to solve the case insensitivity by adding COLLATE utf8mb4_general_ci after the LIKE clause.
So the point here is to find a working collation, which in its turn can be found by researching the db you work with.
I have a user in my DB with this value:
booking_id -> 25DgW
This field is marked as unique in my model
booking_id = models.CharField(null=False, unique=True, max_length=5, default=set_booking_id)
But now when I query for the user like this:
>>> User.objects.get(booking_id='25dgw') # I think this should throw a DoesNotExist exacption
<User: John Doe>
Even if I do:
>>> Partner.objects.get(booking_id__exact='25dgw')
<User: John Doe>
I need a query that return the user only if the code is written exactly in the same way than is saved in database: 25DgW instead of 25dgw.
How should I query?
By default, MySQL does not consider the case of the strings. There will be a character set and collation associated with the database/table or column.
The default collation for character set latin1, which is latin1_swedish_ci, which is case-insensitive. You will have to change the collation type to latin1_general_cs which supports case sensitive comparison.
You can to change character-set/collation settings at the database/table/column levels.
The following steps worked for me.
Go to MySQL shell
mysql -u user -p databasename
Change character set and collation type of the table/column
ALTER TABLE tablename CONVERT TO CHARACTER SET latin1 COLLATE latin1_general_cs;
Now try the query
Partner.objects.get(booking_id__exact='25dgw')
This throws an error, table matching query does not exist.
Reference - Character Sets and Collations in MySQL
User.objects.filter(booking_id='25dgw', booking_id__contains='25dgw').first()
seems to work for me (result = None). The booking_id parameter is to assert the correct letters and no more, the booking_id__contains to assert the case sensitiveness.
This seems to work:
qs.filter(booking_id__contains='25DgW').first()
I have a field in a Django Model for storing a unique (hash) value. Turns out that the database (MySQL/inno) doesn't do a case sensitive search on this type (VARCHAR), not even if I explicitly tell Django to do a case sensitive search Document.objects.get(hash__exact="abcd123"). So "abcd123" and "ABcd123" are both returned, which I don't want.
class document(models.Model):
filename = models.CharField(max_length=120)
hash = models.CharField(max_length=33 )
I can change the 'hash field' to a BinaryField , so in the DB it becomes a LONGBLOB , and it does do a case-sensitive search (and works). However, this doesn't seem very efficient to me.
Is there a better way (in Django) to do this, like adding 'utf8 COLLATE'? or what would be the correct Fieldtype in this situation?
(yes, I know I could use PostgreSQL instead..)
The default collation for character set for MySQL is latin1_swedish_ci, which is case insensitive. Not sure why that is. But you should create your database like so:
CREATE DATABASE database_name CHARACTER SET utf8;
As #dan-klasson mentioned, the default non-binary string comparison is case insensetive by default; notice the _ci at the end of latin1_swedish_ci, it stands for case-insensetive.
You can, as Dan mentioned, create the database with a case sensitive collation and character set.
You may be also interested to know that you can always create a single table or even set only a single column to use a different collation (for the same result). And you may also change these collations post creation, for instance per table:
ALTER TABLE documents__document CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Additionally, if you rather not change the database/table charset/collation, Django allows to run a custom query using the raw method. So you may be able to work around the change by using something like the following, though I have not tested this myself:
Document.objects.raw("SELECT * FROM documents__document LIKE '%s' COLLATE latin1_bin", ['abcd123'])
I need to interface with a database for which I cannot change the collation and charset.
However, I would like to pick some binary data from it, interpret it as if it were UTF8 and then do an UPPER on it (since just doing UPPER() on binary returns the raw value).
I would assume that this works:
SELECT UPPER(Filename.Name) COLLATE utf8_general_ci FROM Filename;
but it doesn't and complains that
COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'
which is fair enough, I need some incantation to cast the binary field as being utf-8. How do I do a select which gives me a computed column with the right character set assigned to it?
Ok figured it out. For modern MySQL versions you can use CAST, and for older ones CONVERT (which is actually standard SQL).
SELECT UPPER(CONVERT(BINARY(Filename.Name) USING utf8)) FROM Filename;
I think you're looking for:
SELECT UPPER(Filename.Name COLLATE utf8_general_ci) FROM Filename;
But I'm not sure because I don't have a broken database to test with.
In my database which is utf8_general_ci, 99.99% of searches should be done case insensitive. Now there's a specific situation where I need to find some data in a case sensitive manner. The field is a varchar field, which I usually search case insensitive.
My question is: Can I perform a case sensitive search on a field that is usually case insensitive?
select * from page
where convert(pageTitle using latin1) collate utf8_ = 'Something'
i found this answer descriptive enough :)
You have to change the collation of operands which are used in the search query using COLLATE operator
OR
If you want a column always to be
treated in case-sensitive fashion,
declare it with a case sensitive or
binary collation.
See here http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
I would suggest changing the collation type so that it becomes case sensitive and then changing your code in the 99.99% of other cases. For information on changing the case of the field check: http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
You can use the LOWER() and UPPER() functions in SQL to standardise the case for insensitive searches once the field type is set to be case sensitive.
You can make a query case sensitive,
for example if you want to check for userid and password:
Select * from TableName Where WHERE LoginId=#userid COLLATE SQL_Latin1_General_CP1_CS_AS AND Password=#password COLLATE SQL_Latin1_General_CP1_CS_AS
it checks for userid and password and returns the values if they are true (case sensitive).