MySQL LIKE matching at the end of the string

I'm trying to figure out why these two LIKE statements are not evaluated equally. In the first, I'm doing a simple SELECT, which returns 22 rows. In the second, I expect my UPDATE / REPLACE to report 22 rows affected as well. Can anybody see what I'm doing wrong? These should match strings like "I got a knee mri".
SET @acro = 'mri';
SELECT title FROM mytable WHERE title LIKE CONCAT('% ', @acro);
-- returns 22 rows
UPDATE mytable
SET title = REPLACE(title, CONCAT(' ', @acro), CONCAT(' ', UPPER(@acro)))
WHERE title LIKE CONCAT('% ', @acro);
-- 0 rows affected
CREATE TABLE `mytable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`title` text,
`author` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=119232 DEFAULT CHARSET=utf8;

The "rows affected" count is the number of rows that were actually modified, not the number of rows that were matched.
One explanation is that the column title uses a case-insensitive collation, that is, a collation whose name ends in _ci.
It's possible that 22 rows were "matched" but none of them needed to be modified.
If the column is defined with character set latin1 and collation latin1_swedish_ci, you could try comparing the results with a query like this:
SET @acro = _latin1'mri';
SELECT title
FROM mytable
WHERE title COLLATE latin1_general_cs LIKE UPPER(CONCAT('% ', @acro));

Ah, here's the answer, from "MySQL Update query with LIKE in WHERE clause not affecting matching rows":
REPLACE() is case sensitive, but LIKE is not.
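Since REPLACE() compares case-sensitively while LIKE follows the column's collation, one way around the mismatch is to skip REPLACE() and rebuild the tail of the string instead. This is only a sketch, assuming title uses a case-insensitive collation and the acronym always sits at the very end of the string, which is what the pattern '% mri' requires:

```sql
SET @acro = 'mri';

-- LIKE '% mri' guarantees that the title ends with the acronym in some
-- letter case, so cut off the last CHAR_LENGTH(@acro) characters and
-- append the upper-case form instead of searching with REPLACE().
UPDATE mytable
SET title = CONCAT(LEFT(title, CHAR_LENGTH(title) - CHAR_LENGTH(@acro)),
                   UPPER(@acro))
WHERE title LIKE CONCAT('% ', @acro);
```

Rows that already end in " MRI" still match the WHERE clause, but they are not counted as affected because their value does not change.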

Related

How to make the default comparison on a text column binary (case sensitive and no trimming)

Sorry if this is a duplicate, but I don't know how to search for this question.
Hi, this is my table:
CREATE TABLE `log_Valor` (
`idLog_Valor` int(11) NOT NULL AUTO_INCREMENT,
`Valor` text binary NOT NULL,
PRIMARY KEY (`idLog_Valor`)
)
ENGINE=InnoDB;
INSERT INTO `log_Valor` (Valor) VALUES ('teste');
INSERT INTO `log_Valor` (Valor) VALUES ('teste ');
I have 2 rows:
1 | 'teste'
2 | 'teste '
When I run:
SELECT * FROM log_Valor where valor = 'teste'
It returns the two rows.
How do I make the default comparison case sensitive and stop it from trimming trailing spaces, without having to specify BINARY in the query?
Use LIKE instead of =.
SELECT * FROM log_Valor WHERE valor LIKE 'teste';
From the documentation
In particular, trailing spaces are significant, which is not true for CHAR or VARCHAR comparisons performed with the = operator
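The difference between the two operators can be checked directly on string literals. A small sketch, assuming the connection uses a PAD SPACE collation (the MySQL 5.x defaults are of this kind; the utf8mb4_0900_* collations that MySQL 8.0 uses by default are NO PAD and behave differently with =):

```sql
-- Under a PAD SPACE collation, = ignores trailing spaces while LIKE does not,
-- so the two result columns are expected to differ.
SELECT 'teste' = 'teste '    AS eq_result,
       'teste' LIKE 'teste ' AS like_result;

-- Applied to the table from the question: only the row stored without
-- a trailing space should match the pattern.
SELECT * FROM log_Valor WHERE Valor LIKE 'teste';
```

Keep in mind that LIKE treats % and _ in the pattern as wildcards, so literal values containing them need escaping.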

MySQL, Search for data containing hex character

I've got a table my_table with a varchar column col1 (utf8).
If I was looking for all rows containing the letter a in col1 (balloon, aardvark, etc) then I'd do:
select col1
from my_table
where col1 like "%a%" -- But how search for special hex character?
But what should I put instead of "%a%" if I'm looking for a special hex character, in this case 0xFFFC?
(This is the character: http://www.fileformat.info/info/unicode/char/fffc/index.htm)
Note that I am looking for a way to specify this character in the WHERE clause. I've seen https://dev.mysql.com/doc/refman/5.7/en/hexadecimal-literals.html as well as Stack Overflow questions and answers that use hex characters in the SELECT part, but I need it in the WHERE clause. I have also seen "How to find certain Hex values and Char() Values in a MySQL SELECT", but that uses char(128), and I don't have an equivalent char number in my case.
Use a hex literal, for example 0x61 for 'a':
select col1
from my_table
where col1 LIKE concat('%',0x61,'%');
Here is a sample:
CREATE TABLE `tmptable` (
`image` varchar(250) DEFAULT NULL,
UNIQUE KEY `d` (`image`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tmptable` (`image`)
VALUES
('äöüß');
> SELECT image,hex(image) FROM tmptable WHERE image LIKE concat('%',0xC39F,'%');
+---------+--------------------------+
| image   | hex(image)               |
+---------+--------------------------+
| äöüß`´' | C3A4C3B6C3BCC39F60C2B427 |
+---------+--------------------------+
1 row in set (0.00 sec)
You can write your SELECT like this:
select col1
from my_table
where col1 LIKE CONCAT('%',X'EFBFBC','%'); -- EF BF BC is the UTF-8 encoding of U+FFFC
You can read how to use hexadecimal literals, as you say, in the documentation: https://dev.mysql.com/doc/refman/5.7/en/hexadecimal-literals.html
CONCAT then builds the pattern around the resulting character.
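A hexadecimal literal is a string of bytes, so for a utf8 column the pattern has to contain the character's UTF-8 encoding rather than its code point (as with 0xC39F for ß in the sample above). U+FFFC encodes as EF BF BC in UTF-8. If you would rather start from the code point, the following sketch lets the server do the conversion; it assumes the ucs2 character set is available on your server:

```sql
-- Read the code point U+FFFC as a ucs2 string, convert it to utf8,
-- and display the bytes of the result.
SELECT HEX(CONVERT(_ucs2 X'FFFC' USING utf8)) AS utf8_bytes;

-- Use the converted character directly in the WHERE clause.
SELECT col1
FROM my_table
WHERE col1 LIKE CONCAT('%', CONVERT(_ucs2 X'FFFC' USING utf8), '%');
```

Depending on the column's collation, special characters may compare equal to other characters or be ignored, so a binary comparison (for example LIKE BINARY) may be needed to get exact matches.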

INSTR(str,substr) does not work when str contains 'é' or 'ë' and substr only 'e'

In another post on stackoverflow, I read that INSTR could be used to order results by relevance.
My understanding of col LIKE '%str%' and INSTR(col, 'str') is that they both behave the same. There seems to be a difference in how collations are handled.
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO users (name)
VALUES ('Joël'), ('René');
SELECT * FROM users WHERE name LIKE '%joel%'; -- 1 record returned
SELECT * FROM users WHERE name LIKE '%rene%'; -- 1 record returned
SELECT * FROM users WHERE INSTR(name, 'joel') > 0; -- 0 records returned
SELECT * FROM users WHERE INSTR(name, 'rene') > 0; -- 0 records returned
SELECT * FROM users WHERE INSTR(name, 'joël') > 0; -- 1 record returned
SELECT * FROM users WHERE INSTR(name, 'rené') > 0; -- 1 record returned
Although INSTR does some conversion, it finds ë in é.
SELECT INSTR('é', 'ë'), INSTR('é', 'e'), INSTR('e', 'ë');
-- returns 1, 0, 0
Am I missing something?
http://sqlfiddle.com/#!2/9bf21/6 (using mysql-version: 5.5.22)
This is due to bug 70767 on LOCATE() and INSTR(), which has been verified.
Though the INSTR() documentation states that it can be used for multi-byte strings, it doesn't seem to work, as you note, with collations like utf8_general_ci, which should be case and accent insensitive:
This function is multi-byte safe, and is case sensitive only if at least one argument is a binary string.
The bug report states that MySQL does respect the collation, but only when the number of bytes is also identical:
However, you can easily observe that they do not (completely) respect collations when looking for one string inside another one. It seems that what's happening is that MySQL looks for a substring which is collation-equal to the target which has exactly the same length in bytes as the target. This is only rarely true.
To adapt the report's example, if you create the following table:
create table t ( needle varchar(10), haystack varchar(10)
) COLLATE=utf8_general_ci;
insert into t values ("A", "a"), ("A", "XaX");
insert into t values ("A", "á"), ("A", "XáX");
insert into t values ("Á", "a"), ("Á", "XaX");
insert into t values ("Å", "á"), ("Å", "XáX");
then run this query, you can see the same behaviour demonstrated:
select needle
, haystack
, needle=haystack as `=`
, haystack LIKE CONCAT('%',needle,'%') as `like`
, instr(haystack, needle) as `instr`
from t;

String compare exact in query MySQL

I created a table like this in MySQL:
DROP TABLE IF EXISTS `barcode`;
CREATE TABLE `barcode` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`code` varchar(40) COLLATE utf8_bin DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
INSERT INTO `barcode` VALUES ('1', 'abc');
INSERT INTO `barcode` VALUES ('2', 'abc ');
Then I query data from table barcode:
SELECT * FROM barcode WHERE `code` = 'abc ';
The result is:
+----+------+
| id | code |
+----+------+
|  1 | abc  |
|  2 | abc  |
+----+------+
But I want the result set to contain only 1 record. My workaround is:
SELECT * FROM barcode WHERE `code` = binary 'abc ';
The result is 1 record. But I'm using NHibernate with MySQL, which generates the queries from the table mapping. So how can I resolve this case?
There is no other fix for it. Either you specify a single comparison as being binary or you set the whole database connection to binary. (doing SET NAMES binary, which may have other side effects!)
Basically, that 'lazy' comparison is a feature of MySQL which is hard coded. To disable it (on demand!), you can use a binary compare, what you apparently already do. This is not a 'workaround' but the real fix.
from the MySQL Manual:
All MySQL collations are of type PADSPACE. This means that all CHAR and VARCHAR values in MySQL are compared without regard to any trailing spaces
Of course there are plenty of other possibilities to achieve the same result from a user's perspective, e.g.:
WHERE field = 'abc ' AND CHAR_LENGTH(field) = CHAR_LENGTH('abc ')
WHERE field REGEXP '^abc[[:space:]]$'
The problem with these is that they effectively disable fast index lookups, so your query always results in a full table scan. With huge datasets that makes a big difference.
Again: PADSPACE is the default for MySQL's [VAR]CHAR comparison. You can (and should) disable it by using BINARY. This is the intended way of doing this.
You can try regular expression matching:
SELECT * FROM barcode WHERE `code` REGEXP '^abc[[:space:]]$';
I was just working on a case like this, where LIKE with a wildcard (%) gave an unexpected result. While searching I also found STRCMP(text1, text2) among MySQL's string comparison functions, which compares two strings. However, using BINARY with LIKE solved the problem for me.
SELECT * FROM barcode WHERE `code` LIKE BINARY 'abc ';
You could do this:
SELECT * FROM barcode WHERE `code` = 'abc '
AND CHAR_LENGTH(`code`)=CHAR_LENGTH('abc ');
I am assuming you only want one result, in which case you could use LIMIT:
SELECT * FROM barcode WHERE `code` = 'abc ' LIMIT 1;
To do exact string matching you could use Collation
SELECT *
FROM barcode
WHERE code COLLATE utf8_bin = 'abc';
The sentence right after the one quoted by Kaii basically says "use LIKE":
“Comparison” in this context does not include the LIKE pattern-matching operator, for which trailing spaces are significant
and the example below shows that 'Monty' = 'Monty ' is true, but not 'Monty' LIKE 'Monty '.
However, if you use LIKE, beware of literal strings containing the '%', '_' or '\' characters: '%' and '_' are wildcard characters, and '\' is the escape character.
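One way to use LIKE for exact matching without tripping over those characters is to escape them in the search value first. A sketch, assuming the default LIKE escape character '\' and that the NO_BACKSLASH_ESCAPES SQL mode is off; @needle is just an illustrative variable name:

```sql
SET @needle = 'abc ';

-- Escape the backslash first, then the two wildcards, so that every
-- character of @needle is matched literally by LIKE.
-- In a MySQL string literal '\\' is one backslash and '\\%' is \%.
SELECT *
FROM barcode
WHERE `code` LIKE REPLACE(REPLACE(REPLACE(@needle, '\\', '\\\\'),
                                  '%', '\\%'),
                          '_', '\\_');
```

The backslash has to be handled first; otherwise the backslashes added for % and _ would themselves be doubled.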

mysql group_concat one table to another

I would like to have a query that will solve my problem in native SQL.
I have a table named "synonym" which holds words and the words' synonyms.
id, word, synonym
1, abandon, forsaken
2, abandon, desolate
...
As you can see, words are repeated in this table many times, which makes the table unnecessarily big. I would like to have a table named "words" which doesn't have duplicate words, like:
id, word, synonyms
1, abandon, 234|90
...
Note: "234" and "90" here are the ids of forsaken and desolate in the newly created words table.
So I have already created a new "words" table with the unique words from the word field of the synonym table. What I need is an SQL query that looks up each word's synonyms in the synonym table, finds their ids in the words table, and updates the "synonyms" field with those ids separated by vertical lines. Then I will just drop the synonym table.
Just like:
UPDATE words SET synonyms = ( vertical-line-separated ids (from the words table) of the word's synonyms in the synonym table )
I know I must use GROUP_CONCAT, but I couldn't get it to work.
I hope this is clear enough. Thanks for the help!
Your proposed schema is plain horrible.
Why not use a many-to-many relationship?
Table words
id   word
1    abandon
234  forsaken
Table synonyms
wid  sid
1    234
1    90
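The many-to-many layout could be built from the existing data roughly as follows. This is only a sketch: it assumes the source table is synonym(id, word, synonym) as described in the question, and the column types, key names and the use of INSERT IGNORE are illustrative choices, not something the question specifies.

```sql
-- One row per distinct word, whether it appears as a word or as a synonym.
CREATE TABLE words (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  word VARCHAR(64)  NOT NULL,
  UNIQUE KEY uq_word (word)
);

-- Junction table: word id (wid) -> id of one of its synonyms (sid).
CREATE TABLE synonyms (
  wid INT UNSIGNED NOT NULL,
  sid INT UNSIGNED NOT NULL,
  PRIMARY KEY (wid, sid)
);

INSERT IGNORE INTO words (word)
SELECT word FROM synonym
UNION
SELECT synonym FROM synonym;

-- Translate every (word, synonym) pair into a pair of ids.
INSERT IGNORE INTO synonyms (wid, sid)
SELECT w.id, s.id
FROM synonym AS src
JOIN words AS w ON w.word = src.word
JOIN words AS s ON s.word = src.synonym;

-- The pipe-separated list can still be produced on demand.
SELECT w.id, w.word,
       GROUP_CONCAT(sy.sid ORDER BY sy.sid SEPARATOR '|') AS synonyms
FROM words AS w
LEFT JOIN synonyms AS sy ON sy.wid = w.id
GROUP BY w.id, w.word;
```

Keeping the ids in their own rows means they can be joined and indexed, while the last query still gives the "234|90" form whenever it is needed for display.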
You can avoid using update and do it using the queries below:
TRUNCATE TABLE words;
INSERT INTO words
SELECT (@rowNum := @rowNum+1),
a.word,
SUBSTRING(REPLACE(a.syns, a.id + '|', ''), 2) syns
FROM (
SELECT a.*,group_concat(id SEPARATOR '|') syns
FROM synonyms a
GROUP BY word
) a,
(SELECT @rowNum := 0) b;
Test Script:
CREATE TABLE `ts_synonyms` (
`id` INT(11) NULL DEFAULT NULL,
`word` VARCHAR(20) NULL DEFAULT NULL,
`synonym` VARCHAR(2000) NULL DEFAULT NULL
);
CREATE TABLE `ts_words` (
`id` INT(11) NULL DEFAULT NULL,
`word` VARCHAR(20) NULL DEFAULT NULL,
`synonym` VARCHAR(2000) NULL DEFAULT NULL
);
INSERT INTO ts_synonyms
VALUES ('1','abandon','forsaken'),
('2','abandon','desolate'),
('3','test','tester'),
('4','test','tester4'),
('5','ChadName','Chad'),
('6','Charles','Chuck'),
('8','abandon','something');
INSERT INTO ts_words
SELECT (@rowNum := @rowNum+1),
a.word,
SUBSTRING(REPLACE(a.syns, a.id + '|', ''), 2) syns
FROM (
SELECT a.*,
GROUP_CONCAT(id SEPARATOR '|') syns
FROM ts_synonyms a
GROUP BY word
) a,
(SELECT @rowNum := 0) b;
SELECT * FROM ts_synonyms;
SELECT * FROM ts_words;