Using the SQL LIKE operator with the UPPER() syntax - mysql

I am trying to use the LIKE operator in my query as follows:
mysql> select cat_title from category where cat_title like '%Indian%Restaurant%';
+--------------------+
| cat_title          |
+--------------------+
| Indian_Restaurants |
+--------------------+
1 row in set (2.59 sec)
However, since I want to do a case-insensitive search, I am trying:
mysql> select cat_title from category where UPPER(cat_title) like UPPER('%Indian%Restaurant%');
Empty set (2.83 sec)
Why is the second query not working?

Most likely the collation on the cat_title column is case insensitive. Use
... cat_title LIKE '%Indian%Restaurant%' COLLATE utf8_bin
See also
How can I make SQL case sensitive string comparison on MySQL?
Case Sensitive collation in MySQL

I'm not sure what your question is. The UPPER function won't work before a LIKE operator. If you want a case-insensitive search, use LIKE on its own, since it is case-insensitive under the default (_ci) collations.
On the other hand, if you want a case-sensitive search, you may try using
cat_title COLLATE Latin1_General_BIN LIKE '%Indian%Restaurant%'

Related

MySQL / MariaDB Case Insensitive Collation Still Case Sensitive?

Using MariaDB 10.0.36, I have a user table with the collation of utf8_turkish_ci with a user_login column that stores a user's username that is also using the collation of utf8_turkish_ci with a unique index.
My understanding is that a select statement should be case insensitive, but it doesn't appear to be that way with certain usernames.
For example, I have a user with the login of GoDoIt
This statement returns no records:
SELECT * FROM user WHERE user_login = 'godoit'
However, this works:
SELECT * FROM user WHERE user_login = 'GoDoIt'
I find this strange because the username of Eric works both ways.
SELECT * FROM user WHERE user_login = 'eric'
SELECT * FROM user WHERE user_login = 'Eric'
Both return the same result. So why would capitals in the middle of the string not work? I'm lowercasing the input username in PHP using tolower on the string before sending it to the database, and I guess this approach won't work with certain usernames.
Turkish dotless I and dotted i are two separate characters; those are not considered equal in the utf8_turkish_ci collation.
See the collation chart here: http://collation-charts.org/mysql60/mysql604.utf8_turkish_ci.html
Note the separate entries for the dotless I and dotted i.
Additional background here: https://en.wikipedia.org/wiki/Dotted_and_dotless_I
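To make the I/ı split concrete, here is a rough sketch in Python (not MySQL, and not the real utf8_turkish_ci weight table). The helper turkish_fold is hypothetical and only models the two Turkish case pairs, I/ı and İ/i. It shows why lowercasing the login in application code turns the capital I of GoDoIt into a dotted i, which the Turkish collation treats as a different letter.

```python
# Hypothetical helper: a rough stand-in for Turkish case folding.
# It is NOT MySQL's utf8_turkish_ci, it only models the I/ı and İ/i pairs.
def turkish_fold(text: str) -> str:
    """Fold case the Turkish way: I pairs with ı, and İ pairs with i."""
    return text.replace("I", "ı").replace("İ", "i").lower()


stored = "GoDoIt"
lowered_by_app = stored.lower()  # English-style lowercasing, as done in the application

# English-style lowercasing maps "I" to dotted "i" ...
print(lowered_by_app)  # godoit

# ... but Turkish folding maps "I" to dotless "ı", so the two no longer match.
print(turkish_fold(stored) == turkish_fold(lowered_by_app))  # False

# "Eric" has no capital I, so both spellings fold to the same string.
print(turkish_fold("Eric") == turkish_fold("eric"))  # True
```

If that is what is happening, sending the login exactly as stored, or comparing under a non-Turkish collation, should avoid the mismatch.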
(Too long for a Comment. Spencer's answer is good.)
This lists the letters and states which are equal or not, and which order they are in. Here is the excerpt showing that the dotless I's are equal to each other but considered less than the dotted I's:
utf8_turkish_ci I=ı Ħ=ħ i=Ì=Í=Î=Ï=ì=í=î=ï=Ĩ=ĩ=Ī=ī=Ĭ=ĭ=Į=į=İ ij=IJ=ij iz J=j=j́=Ĵ=ĵ jz
Some other things that are unusual about utf8_turkish_ci: Ö=ö -- treated as a "letter" that comes between O and P. Similarly for Ç=ç and Ğ=ğ and Ş=ş
Note: utf8mb4 and utf8 handle Turkish identically.
MySQL 6.0 died on the vine years ago; it looks like that link for the collation is out of date with respect to Ş:
mysql> SELECT 'Ş' = 'S' COLLATE utf8_turkish_ci;
+------------------------------------+
| 'Ş' = 'S' COLLATE utf8_turkish_ci |
+------------------------------------+
|                                  0 |
+------------------------------------+

MySQL REPLACE affects 0 rows but WHERE ... LIKE returns 90

For some reason, phpMyAdmin returns a count of 90 rows when I run:
SELECT COUNT(*)
FROM le_wp_posts
WHERE post_content LIKE '%Â%'
But the following updates 3 rows only:
UPDATE le_wp_posts
SET post_content = REPLACE(post_content, 'Â', '')
WHERE post_content LIKE '%Â%'
I have also tried it omitting the WHERE clause in the UPDATE statement. Is there any obvious reason I'm overlooking that's causing this issue? Or what steps can I take further to investigate the cause? My SQL is not the best.
I did the following test...
1) Create a table with some data:
create table test(col varchar(10));
insert into test values ('abc'), ('dbe');
2) Select number of rows using your same filter (but different character):
select count(*)
from test
where col like '%B%' -- note the uppercase
;
Got the following result:
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set
3) Tried your update:
update test
set col = replace(col, 'B', '') -- note the uppercase
where col like '%B%' -- note the uppercase
;
And got this result:
Query OK, 0 rows affected (0.01 sec)
Rows matched: 2 Changed: 0 Warnings: 0
In my case, the default character set and collation were used when the table was created. The default character set was 'latin1' and the collation 'latin1_swedish_ci'. Note the ci at the end of the collation: it means case insensitive. So the LIKE filter did a case-insensitive search and found 2 rows, but the REPLACE function, as the documentation states, is case sensitive. Probably, as in my case, your update matched the same number of rows as the select but changed fewer of them because of the case restriction on REPLACE.
If this is your problem, can't you just run two updates, one for the uppercase case and one for the lowercase? I'll try to develop a solution on one update...
The docs about the REPLACE(str, from_str, to_str) function:
Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.
The docs about the LIKE operator:
The following two statements illustrate that string comparisons are not case sensitive unless one of the operands is case sensitive (uses a case-sensitive collation or is a binary string):
The first example:
mysql> SELECT 'abc' LIKE 'ABC';
-> 1
The second example:
mysql> SELECT 'abc' LIKE _latin1 'ABC' COLLATE latin1_general_cs;
-> 0
Note the cs at the end of the collation. It means case sensitive.
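The matched-but-not-changed effect from the test above can be mimicked outside MySQL. This is only a rough Python analogy, not how MySQL implements either feature: the hypothetical helper like_ci stands in for LIKE '%...%' under a _ci collation, and Python's case-sensitive str.replace stands in for REPLACE().

```python
rows = ["abc", "dbe"]


def like_ci(value: str, needle: str) -> bool:
    """Case-insensitive containment, standing in for LIKE '%needle%' under a _ci collation."""
    return needle.lower() in value.lower()


# The filter ignores case, so both rows match the uppercase "B".
matched = [r for r in rows if like_ci(r, "B")]

# str.replace is case-sensitive (like REPLACE()), so no row actually changes.
changed = [r for r in matched if r.replace("B", "") != r]

print(len(matched), len(changed))  # 2 0
```

That mirrors the "Rows matched: 2 Changed: 0" line: the filter and the replacement apply different case rules.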
If you take a utf8-encoded £ (C2A3, treated as utf8) and store it into a latin1 column, when you read it back, you get Â£ (C2A3, treated as latin1). Removing the Â will work for about 32 characters, but will fail for many other characters. And it will make the table harder to repair!
Let's look at an example of what you tried to store, together with the HEX of what ended up in the table. Also, let's look at SHOW CREATE TABLE to confirm my suspicion that the target is latin1.
This discusses the HEX debugging technique. And it discusses "Best Practice", which includes declaring, during the connection, that you really have utf8, not latin1. And it talks about "Mojibake", with an example of where ñ turns into Ã±, making REPLACE a messy prospect.
Your symptom with LIKE is consistent with character set mismatches.
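The byte-level effect is independent of MySQL, so it can be reproduced in a few lines of Python. This is a minimal sketch: the helper name misread_as_latin1 is made up for illustration, and Python's latin-1 codec is assumed to be close enough to MySQL's latin1 for the bytes shown here.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the same bytes as latin1 (the usual Mojibake path)."""
    return text.encode("utf-8").decode("latin-1")


print("£".encode("utf-8").hex().upper())  # C2A3
print(misread_as_latin1("£"))             # Â£
print(misread_as_latin1("ñ"))             # Ã±

# Stripping "Â" happens to repair the pound sign, but it cannot repair ñ.
print(misread_as_latin1("£").replace("Â", ""))  # £
print(misread_as_latin1("ñ").replace("Â", ""))  # Ã±

# Reversing the misread restores the text without losing anything.
print(misread_as_latin1("ñ").encode("latin-1").decode("utf-8"))  # ñ
```

The last line is why repairing the encoding is usually preferable to deleting the stray characters: once the Â is removed, the original bytes can no longer be recovered.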
LIKE is case-insensitive but REPLACE is case-sensitive. To work around that, use the following query:
UPDATE le_wp_posts
SET post_content = REPLACE(LOWER(post_content), LOWER('Â'), '')
WHERE post_content LIKE '%Â%'
Or, if you don't want the final result to be lowercased:
UPDATE le_wp_posts
SET post_content = REPLACE(REPLACE(post_content, LOWER('Â'), ''), 'Â', '')
WHERE post_content LIKE '%Â%'
-- You just need to put N before the string pattern too (if you want to look for a Unicode character)
Update le_wp_posts
Set post_content=REPLACE(post_content,N'Â','')
where post_content like '%Â%'
Can you please try using JOIN as below:
UPDATE le_wp_posts l
INNER JOIN (SELECT t.post_content
FROM le_wp_posts t
WHERE t.post_content LIKE '%Â%') t ON l.post_content = t.post_content
SET l.post_content = REPLACE(l.post_content, 'Â', '')
If you have an "Id" you could try this way:
UPDATE le_wp_posts
SET post_content = REPLACE(post_content, 'Â', '')
WHERE Id IN ( SELECT *
FROM (
SELECT Id
FROM le_wp_posts
WHERE post_content LIKE '%Â%'
) as A
)
I guess the update didn't occur from within PhpMyAdmin but from a client?
If so it's just the differing locale settings.
-- The query first selects the original column together with its replacement string, then updates the original column.
-- MySQL puts the join before SET; it does not accept UPDATE ... FROM.
UPDATE le_wp_posts AS Tbl1
INNER JOIN
(
SELECT post_content, REPLACE(post_content, 'Â', '') AS Replacement
FROM le_wp_posts
WHERE post_content LIKE '%Â%'
) AS Tbl2
ON Tbl1.post_content = Tbl2.post_content
SET Tbl1.post_content = Tbl2.Replacement

MySQL magic column "name" and synthesized columns

So I got a SQL statement. The idea is that I want to do a case-insensitive LIKE.
I do it like this:
SELECT
FilenameId AS id,
LOWER(CONVERT(BINARY(Filename.Name) USING utf8)) AS name
FROM Filename
WHERE name LIKE '%something%'
COLLATE utf8_general_ci
This works fine, however my query also returns the case-transformed name. What I want to do
is synthesize the insensitive name and do a LIKE query on it, but also return the non case-transformed name.
SELECT
FilenameId AS id,
Filename.Name AS name,
LOWER(CONVERT(BINARY(Filename.Name) USING utf8)) AS iname
FROM Filename
WHERE iname LIKE '%something%'
COLLATE utf8_general_ci
...but then MySQL happily refuses:
Unknown column 'iname' in 'where clause'
What am I doing wrong? I am on MySQL 5.5 FWIW.
I don't know why you came up with this, but usually others have trouble getting LIKE case sensitive, not the other way round.
Write your query simply like this:
SELECT
FilenameId AS id,
Filename.Name AS name
FROM Filename
WHERE name LIKE '%something%'
And in general, you can't access aliases in a WHERE clause. Either put your query into a subquery like Dhinakaran suggested or use HAVING (if you are lazy).
The difference? WHERE filters row by row, before the select-list aliases exist; HAVING works on the result after the WHERE clause (and GROUP BY) have been applied.
From the manual:
The following two statements illustrate that string comparisons are not case sensitive unless one of the operands is a binary string:
mysql> SELECT 'abc' LIKE 'ABC';
-> 1
mysql> SELECT 'abc' LIKE BINARY 'ABC';
-> 0
Create a subquery, then put the WHERE on the subquery:
Select * from (SELECT
FilenameId AS id,
Filename.Name AS name,
LOWER(CONVERT(BINARY(Filename.Name) USING utf8)) AS iname
FROM Filename ) temp
WHERE temp.iname LIKE '%something%'
COLLATE utf8_general_ci
Or
SELECT
FilenameId AS id,
Filename.Name AS name
FROM Filename
WHERE LOWER(CONVERT(BINARY(Filename.Name) USING utf8)) LIKE '%something%'
COLLATE utf8_general_ci

MySQL query - force case-sensitive with a ORDER BY rand( )

Is it possible to force case sensitivity for a query?
Mine looks like this:
"SELECT g_path FROM glyphs WHERE g_glyph = :g_glyph ORDER BY rand()"
If g_glyph = r, the result can be R or r, which is not what I expect.
I'm looking for a case-sensitive result.
I googled my issue and I found this solution:
/*Case-sensitive sort in descending order.
In this query, ProductName is sorted in
case-sensitive descending order.
*/
SELECT ProductID, ProductName, UnitsInStock
FROM products
ORDER BY BINARY ProductName DESC;
But the following line doesn't work at all:
"SELECT g_path FROM glyphs WHERE g_glyph = :g_glyph ORDER BY BINARY rand()"
Any Suggestion?
Thank you very much for your help.
The order and equality of characters is defined by the collation. In most cases, a case-insensitive collation is used.
If you need to use a strict, case-sensitive comparison for a specific datum, use the BINARY operator:
mysql> SELECT 'a' = 'A';
-> 1
mysql> SELECT BINARY 'a' = 'A';
-> 0
mysql> SELECT 'a' = 'a ';
-> 1
mysql> SELECT BINARY 'a' = 'a ';
-> 0
So in your case:
SELECT g_path FROM glyphs WHERE BINARY g_glyph = :g_glyph ORDER BY rand()
This is covered in the manual page Case Sensitivity in String Searches.
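The four comparisons above show two separate effects: the nonbinary comparison ignores case, and it also ignores trailing spaces. A rough Python analogy follows. Both helper names are hypothetical, and the code only mimics the results shown, not MySQL's collation machinery.

```python
def equal_nonbinary(a: str, b: str) -> bool:
    """Mimic a case-insensitive comparison that also ignores trailing spaces."""
    return a.rstrip(" ").lower() == b.rstrip(" ").lower()


def equal_binary(a: str, b: str) -> bool:
    """Mimic BINARY: compare the encoded bytes exactly."""
    return a.encode("utf-8") == b.encode("utf-8")


print(equal_nonbinary("a", "A"), equal_binary("a", "A"))    # True False
print(equal_nonbinary("a", "a "), equal_binary("a", "a "))  # True False
```

So BINARY makes the match strict in both respects, which is worth knowing if stored values may carry trailing spaces.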
You need to specify a case sensitive or binary collation.
The default character set and collation are latin1 and latin1_swedish_ci, so nonbinary string comparisons are case insensitive by default. This means that if you search with col_name LIKE 'a%', you get all column values that start with A or a. To make this search case sensitive, make sure that one of the operands has a case sensitive or binary collation. For example, if you are comparing a column and a string that both have the latin1 character set, you can use the COLLATE operator to cause either operand to have the latin1_general_cs or latin1_bin collation:
col_name COLLATE latin1_general_cs LIKE 'a%'
col_name LIKE 'a%' COLLATE latin1_general_cs
col_name COLLATE latin1_bin LIKE 'a%'
col_name LIKE 'a%' COLLATE latin1_bin
If you want a column always to be treated in case-sensitive fashion, declare it with a case sensitive or binary collation. See Section 13.1.14, “CREATE TABLE Syntax”.
The _cs in the collation name stands for "case sensitive".

MySQL DB selects records with and without umlauts. e.g: '.. where something = FÖÖ'

My table collation is "utf8_general_ci". If I run a query like:
SELECT * FROM mytable WHERE myfield = "FÖÖ"
I get results where:
... myfield = "FÖÖ"
... myfield = "FOO"
Is this the default for "utf8_general_ci"?
What collation should I use to only get records where myfield = "FÖÖ"?
SELECT * FROM table WHERE some_field LIKE ('%ö%' COLLATE utf8_bin)
A list of the collations offered by MySQL for Unicode character sets can be found here:
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
If you want to go all-out and require strings to be absolutely identical in order to test as equal, you can use utf8_bin (the binary collation). Otherwise, you may need to do some experimentation with the different collations on offer.
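As an illustration of why FÖÖ and FOO compare equal under an accent- and case-insensitive collation but not under a binary one, here is a small Python sketch. It only approximates the idea: the real utf8_general_ci uses its own weight tables, and both helper names are hypothetical.

```python
import unicodedata


def loose_key(text: str) -> str:
    """Approximate an accent- and case-insensitive sort key (not MySQL's real weights)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks such as the diaeresis on Ö, then fold case.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def loosely_equal(a: str, b: str) -> bool:
    """Stand-in for comparing under a collation like utf8_general_ci."""
    return loose_key(a) == loose_key(b)


def strictly_equal(a: str, b: str) -> bool:
    """Stand-in for utf8_bin: the encoded bytes must be identical."""
    return a.encode("utf-8") == b.encode("utf-8")


print(loosely_equal("FÖÖ", "FOO"))   # True
print(strictly_equal("FÖÖ", "FOO"))  # False
print(strictly_equal("FÖÖ", "FÖÖ"))  # True
```

The collations in between (for example the language-specific ones) differ in which of these letters they group together, which is why some experimentation may be needed.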
For Scandinavian letters you can use utf8_swedish_ci, for example.
Here is the character grouping for utf8_swedish_ci. It shows which characters are interpreted as the same.
http://collation-charts.org/mysql60/mysql604.utf8_swedish_ci.html
Here's the directory listing for other collations. I'm not sure which one corresponds to utf8_general_ci, though. http://collation-charts.org/mysql60/