MySQL LIKE Operator with Special Characters Confusion - mysql

First, let me apologize: I have not been successful in finding anything online about this specific scenario.
I have been using MySQL for quite some time, but I am hoping to get some clarification on a certain situation I have come across, which honestly bothers me quite a bit.
I'm trying to match a string in a MySQL column that contains both \ and % literal characters using the LIKE operator.
Inside the table I have two records:
id value
-----------------------
1 100\\%A
2 100\%A
They both contain literal special characters.
If I do a SELECT, in an attempt to only match the first record (id=1), I would expect to write the query as such:
SELECT * FROM table_name WHERE value LIKE '%0\\\\\%A'
(\\\\ to match two literal backslashes, plus a backslash before % to match a literal %)
However, it only matches row id=2, which makes no sense to me.
If I change the query a little to be:
SELECT * FROM table_name WHERE value LIKE '%0\\\\%A'
I would expect to match the id=1 row only, (\\\\ to match 2 literal backslashes, and the % is not literal and should represent a wildcard). But instead, it matches both rows?
Row id=2 has only a single backslash but still matches.
Is row id=2 matching because the first 2 backslashes are matching the \, the 3rd backslash is ignored for some reason, and the 4th backslash is allowing a literal match on the %?
If I do a:
SELECT * FROM table_name WHERE value LIKE '%0\\\\\\\%A'
For some reason I get row id=1, when I would expect to get no matches whatsoever.
I'm trying to find a solution in which I can do partial matches on any series of characters accurately, including those with consecutive special characters such as the scenario above. However, I'm having an impossible time trying to plan for situations such as these.
Any input would be greatly appreciated.

Maybe this will help you understand how escape characters are handled in MySQL:
https://stackoverflow.com/a/27061961/634698
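The three queries in the question can be traced by hand once the two unescaping passes are separated: first the SQL parser processes the string literal, then LIKE processes what is left. What follows is only a rough Python model of those passes, not MySQL's actual code. It assumes the default sql_mode (NO_BACKSLASH_ESCAPES off) and the default LIKE escape character; the function names and the <any*> / <any1> token labels are made up for illustration.

```python
SPECIAL = {"0": "\0", "b": "\b", "n": "\n", "r": "\r", "t": "\t", "Z": "\x1a"}


def parse_string_literal(body: str) -> str:
    """Pass 1: what the SQL parser keeps from the text between the quotes."""
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # \% and \_ keep their backslash so that LIKE can see it later;
            # \\ becomes one backslash, and unknown escapes such as \d become d.
            out.append("\\" + nxt if nxt in "%_" else SPECIAL.get(nxt, nxt))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def like_tokens(pattern: str) -> list:
    """Pass 2: how LIKE reads the parsed pattern, one token per matched unit."""
    tokens, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            tokens.append(pattern[i + 1])  # escaped character, matched literally
            i += 2
        else:
            # Unescaped % and _ are wildcards; everything else is literal.
            tokens.append({"%": "<any*>", "_": "<any1>"}.get(ch, ch))
            i += 1
    return tokens


def explain(backslashes: int) -> list:
    """Tokens for the literal '%0' + N backslashes + '%A', as typed in the query."""
    return like_tokens(parse_string_literal("%0" + "\\" * backslashes + "%A"))


# 5, 4 and 7 are the backslash counts used in the question's three queries.
for n in (5, 4, 7, 9):
    print(n, explain(n))
```

Under this model, five backslashes give one literal backslash plus a literal %, which is why only id=2 matches. Four give one literal backslash plus a wildcard (both rows), and seven give two literal backslashes plus a wildcard (id=1). Nine or ten backslashes before the % give two literal backslashes followed by a literal %, which is the pattern that targets id=1 alone.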

Related

Why isn't MySQL REGEXP filtering out these values?

So I'm trying to find what "special characters" have been used in my customer names. I'm updating this query to find them all one by one, but it's still showing all customers with a - despite my trying to exclude that in the query.
Here's the query I'm using:
SELECT * FROM customer WHERE name REGEXP "[^\da-zA-Z\ \.\&\-\(\)\,]+";
This customer (and many others with a dash) are still showing in the query results:
Test-able Software Ltd
What am I missing? Based on that regexp, shouldn't that one be excluded from the query results?
Testing it on https://regex101.com/r/AMOwaj/1 shows there is no match.
Edit - So I want to FIND any which have characters other than the ones in the regex character set. Not exclude any which do have these characters.
Your code checks if the string contains any character that does not belong to the character class, while you want to ensure that none does belong to it.
You can use ^ and $ to check the whole string at once:
SELECT * FROM customer WHERE name REGEXP '^[^\da-zA-Z .&\-(),]+$';
This would probably be simpler expressed with NOT, and without negating the character class:
SELECT * FROM customer WHERE name NOT REGEXP '[\da-zA-Z .&\-(),]';
Note that you don't need to escape all the characters within the character class, except probably for -.
Use [0-9] or [[:digit:]] to match digits irrespective of MySQL version.
Place the hyphen where it cannot form part of a range, such as at the very end of the character class.
Fix the expression as
SELECT * FROM customer WHERE name REGEXP "[^0-9a-zA-Z .&(),-]+";
If the entire text should match this pattern, enclose with ^ / $:
SELECT * FROM customer WHERE name REGEXP "^[^0-9a-zA-Z .&(),-]+$";
A hyphen (-) implies a range unless it comes first in the class (or directly after the negating ^).
So use
"[^-0-9a-zA-Z .&(),]"
I removed the + at the end because you don't really care how many; this way it will stop after finding one.
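The effect of the hyphen's position can be checked outside the database. The sketch below uses Python's re module as a stand-in for MySQL's regex engine, on the assumption that bracket ranges over these ASCII characters behave the same way. It also assumes that the SQL parser has already dropped the backslashes from the double-quoted literal (so \d arrives as d and \- as -); the helper name offending is made up.

```python
import re

# What is assumed to reach the regex engine once the SQL parser has removed
# the backslashes from "[^\da-zA-Z\ \.\&\-\(\)\,]+".
# Here "&-(" is a range covering & ' ( and the hyphen itself is NOT in the class.
stripped_original = r"[^da-zA-Z .&-(),]+"

# Hyphen moved to the end and digits spelled out, as in the answers above.
fixed = r"[^0-9a-zA-Z .&(),-]+"


def offending(pattern: str, text: str) -> list:
    """Return the runs of characters that fall outside the allowed set."""
    return re.findall(pattern, text)


name = "Test-able Software Ltd"
print(offending(stripped_original, name))  # the hyphen is reported as offending
print(offending(fixed, name))              # nothing is reported
```

With the original class the hyphen is flagged because it only acts as a range operator; with the hyphen last it is an ordinary member of the class and the name passes.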

Mysql query returns no data with escaped \

I'm attempting to query our MySQL database, but I'm getting no data when there clearly is data there.
First I query
SELECT id, instruction_link FROM work_instructions WHERE instruction_link LIKE "%\\\\cots-sbs%";
Which returns 100+ lines.
http://tinypic.com/r/ief8td/8
(sorry couldn't post as actual picture, don't have enough rep :(
However if I query
SELECT id, instruction_link FROM work_instructions WHERE instruction_link LIKE "%\\\\cots-sbs\\%";
http://tinypic.com/r/33ksw3q/8
I get no results with the 2nd query. I have no idea what I'm doing wrong here. Seems pretty simple but I can't make any sense of it..
Thanks in advance.
As documented under LIKE:
Note
Because MySQL uses C escape syntax in strings (for example, “\n” to represent a newline character), you must double any “\” that you use in LIKE strings. For example, to search for “\n”, specify it as “\\n”. To search for “\”, specify it as “\\\\”; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.
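In your second query, \\% is parsed as a string containing a literal backslash followed by a percent character, which LIKE then interprets as a literal percent sign rather than a wildcard, so the pattern only matches values that end in "cots-sbs%". To match a literal backslash followed by the wildcard, write \\\\% instead.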
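Going the other way, from the text you want to find to the literal you have to type, the doubling rule in the note can be applied mechanically in two steps. This is a minimal sketch, assuming the default escape character and backslash escapes enabled; the function names are hypothetical and this is not a substitute for your driver's own quoting.

```python
def escape_like(text: str, escape: str = "\\") -> str:
    """Level 1: make every character of text literal for LIKE."""
    out = text.replace(escape, escape + escape)  # must run before the others
    return out.replace("%", escape + "%").replace("_", escape + "_")


def quote_sql_string(text: str) -> str:
    """Level 2: write text as a single-quoted MySQL string literal."""
    return "'" + text.replace("\\", "\\\\").replace("'", "''") + "'"


def contains_literal(text: str) -> str:
    """SQL literal for a LIKE pattern matching text anywhere in the column."""
    return quote_sql_string("%" + escape_like(text) + "%")


# A backslash, the host name, and a trailing backslash, as in the question above.
print(contains_literal("\\cots-sbs\\"))
# Digit 0, two backslashes, a percent sign and A.
print(contains_literal("0\\\\%A"))
```

Each backslash in the search text ends up as four in the query, matching the rule quoted from the manual. If the pattern is passed as a bound parameter instead of being pasted into the SQL text, only the first level (escape_like) should be needed, since the parser never sees the string.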

MySQL \ regexp \ Search only the records which contains specific characters

I would like to find all records from 'mytable' which in the field 'name' used only characters listed below:
Ø-*³`!/-;$€"“- „”\ø,Ø:’.#*+_/? !œ³¥Φ?+#=–()<>ąĄćĆęĘłŁńŃóÓśŚż ŻźŹàáâåéÉêéíıñçãėÊèÈçßœŒæğîïİşúūýōòÒô
regular letters from a to z (and A to Z)
number 0,1,3,4,5,6,7,8,9
spaces and some 'tab' signs
This query does not work:
SELECT name
FROM mytable
WHERE name not regexp '[^a-zA-Z0-9Ø-*³`!/-;$€"“- „”\ø,Ø:’.#*+_/? !œ³¥Φ?+#=–()<>ąĄćĆęĘłŁńŃóÓśŚż ŻźŹàáâåéÉêéíıñçãėÊèÈçßœŒæğîïİşúūýōòÒô]'
I know that this solution is far from good :) but I've tried different methods - this one returns the result closest to the required. Can you please give me some hint?
You can take advantage of character classes.
For example, instead of [ąĄóÓōòÒô...] use [[=A-Za-z=]].
This will match any letter from a through z, ignoring case and ignoring whether the letter has an accent.
Check the documentation for additional character classes that will match your missing characters.

Using REGEX to alter field data in a mysql query

I have two databases, both containing phone numbers. I need to find all instances of duplicate phone numbers, but the formats of database 1 vary wildly from the format of database 2.
I'd like to strip out all non-digit characters and just compare the two 10-digit strings to determine if it's a duplicate, something like:
SELECT b.phone as barPhone, sp.phone as SPPhone FROM bars b JOIN single_platform_bars sp ON sp.phone.REGEX = b.phone.REGEX
Is such a thing even possible in a mysql query? If so, how do I go about accomplishing this?
EDIT: Looks like it is, in fact, a thing you can do! Hooray! The following query returned exactly what I needed:
SELECT b.phone, b.id, sp.phone, sp.id
FROM bars b JOIN single_platform_bars sp ON REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(b.phone,' ',''),'-',''),'(',''),')',''),'.','') = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')',''),'.','')
Before version 8.0, MySQL doesn't support returning the "match" of a regular expression. The REGEXP operator returns 1 or 0, depending on whether the string matched the pattern or not. (MySQL 8.0 and later add REGEXP_REPLACE() and REGEXP_SUBSTR().)
You can use the REPLACE function to replace a specific character, and you can nest those. But it would be unwieldy for all "non-digit" characters. If you want to remove spaces, dashes, open and close parens e.g.
REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')','')
One approach is to create user defined function to return just the digits from a string. But if you don't want to create a user defined function...
This can be done in native MySQL. This approach is a bit unwieldy, but it is workable for strings of "reasonable" length.
SELECT CONCAT(IF(SUBSTR(sp.phone,1,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,1,1),'')
,IF(SUBSTR(sp.phone,2,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,2,1),'')
,IF(SUBSTR(sp.phone,3,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,3,1),'')
,IF(SUBSTR(sp.phone,4,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,4,1),'')
,IF(SUBSTR(sp.phone,5,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,5,1),'')
) AS phone_digits
FROM single_platform_bars sp
To unpack that a bit... we extract a single character from the first position in the string, check if it's a digit, if it is a digit, we return the character, otherwise we return an empty string. We repeat this for the second, third, etc. characters in the string. We concatenate all of the returned characters and empty strings back into a single string.
Obviously, the expression above is checking only the first five characters of the string, you would need to extend this, basically adding a line for each position you want to check...
And unwieldy expressions like this can be included in a predicate (in a WHERE clause). (I've just shown it in the SELECT list for convenience.)
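If doing the comparison in application code is an option, the same character-by-character filter is much shorter there. A rough sketch in Python, with made-up sample rows and function names, assuming both tables are small enough to load as (id, phone) pairs:

```python
def digits_only(phone: str) -> str:
    """Keep only the characters 0-9, dropping spaces, dashes, parentheses, dots."""
    return "".join(ch for ch in phone if "0" <= ch <= "9")


def duplicate_phones(bars, sp_bars):
    """Pair up ids from both lists whose phone numbers have the same digits."""
    by_digits = {}
    for bar_id, phone in bars:
        by_digits.setdefault(digits_only(phone), []).append(bar_id)
    return [
        (bar_id, sp_id, digits_only(phone))
        for sp_id, phone in sp_bars
        for bar_id in by_digits.get(digits_only(phone), [])
    ]


# Hypothetical sample rows: (id, phone)
bars = [(1, "(555) 123-4567"), (2, "555.987.6543")]
sp_bars = [(10, "555-123-4567"), (11, "555 000 1111")]
print(duplicate_phones(bars, sp_bars))
```

Unlike the fixed-length CONCAT expression, this handles strings of any length, but it moves the join out of the database.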
MySQL doesn't support such string operations natively. You will either need to use a UDF like this, or else create a stored function that iterates over a string parameter concatenating to its return value every digit that it encounters.

MySQL query - select postcode matches

I need to make a selection based on the first 2 characters of a field, so for example
SELECT * from table WHERE postcode LIKE 'rh%'
But this would select any record that contains those 2 characters at any point in the "postcode" field, right? I need a query that matches on just the first 2 characters. Any pointers?
Thanks
Your query is correct. It searches for postcodes starting with "rh".
In contrast, if you wanted to search for postcodes containing the string "rh" anywhere in the field, you would write:
SELECT * from table WHERE postcode LIKE '%rh%'
Edit:
To answer your comment, you can use either or both % and _ for relatively simple searches. As you have noticed already, % matches any number of characters whereas _ matches a single character.
So, in order to match postcodes starting with "RHx " (where x is any character) your query would be:
SELECT * from table WHERE postcode LIKE 'RH_ %'
(mind the space after _). For more complex search patterns, you need to read about regular expressions.
Further reading:
http://dev.mysql.com/doc/refman/5.1/en/pattern-matching.html
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
LIKE '%rh%' will return all rows with 'rh' anywhere
LIKE 'rh%' will return all rows with 'rh' at the beginning
LIKE '%rh' will return all rows with 'rh' at the end.
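The three placements above, plus the _ wildcard, can be tried out with a small model that translates a LIKE pattern into an anchored regular expression. This is only a sketch: it ignores escape characters, assumes a case-insensitive collation (the usual MySQL default), and the function names and sample postcodes are made up.

```python
import re


def like_to_regex(pattern: str) -> "re.Pattern":
    """Translate % and _ into regex wildcards; everything else is literal."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any number of characters, including none
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """LIKE must match the whole value, hence fullmatch rather than search."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("RH10 1AB", "rh%"))     # starts with rh
print(like("SW1A 2RH", "rh%"))     # rh only at the end, so no match
print(like("SW1A 2RH", "%rh"))     # ends with rh
print(like("RH1 2AB", "RH_ %"))    # RH, one character, then a space
```

Because the pattern has to cover the whole value, 'rh%' can only match at the start; a match elsewhere requires a leading %.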
If you want to compare only the first two characters against 'rh', use MySQL's SUBSTR() function:
http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_substr
Dave, your way seems correct to me (and works on my test data). Using a leading % as well will match anywhere in the string which obviously isn't desirable when dealing with postcodes.