Special chars in SQL regex - match word boundary with special chars - mysql

I've got a search function which creates a query. My goal is to search for an exact word, so if the phrase is 'hello' it should return only results with 'hello' (not with 'xhello', 'helloxx' etc). My code looks like:
SELECT (...) WHERE x RLIKE '[[:<:]]word[[:>:]]'
It works in most cases, but the problem starts when the phrase is, for example, '$hello' or 'helloĊ': the special characters break the match.
Is there a way to handle this?

Try matching on whitespace or the ends of the string instead of word boundaries:
SELECT * FROM table WHERE x RLIKE '(^|[[:space:]])Hello([[:space:]]|$)'
or
SELECT * FROM table WHERE x RLIKE '(^| )Hello( |$)'
or, with REGEXP (RLIKE is a synonym for it):
SELECT * FROM table WHERE x REGEXP '(^|[[:space:]])Hello([[:space:]]|$)'
or
SELECT * FROM test WHERE name REGEXP '(^| )Hello( |$)'
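Because the pattern is built by a search function, the phrase also needs its regex metacharacters escaped, otherwise the $ in '$hello' is read as an end-of-string anchor. A minimal sketch in Python, assuming the application builds the pattern and passes it to MySQL as a bound parameter; the helper names escape_regex and build_exact_phrase_pattern are hypothetical:

```python
# Characters that have a special meaning in MySQL regular expressions.
REGEX_SPECIALS = set("\\.^$*+?()[]{}|")


def escape_regex(phrase):
    """Prefix every regex metacharacter in `phrase` with a backslash."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in phrase)


def build_exact_phrase_pattern(phrase):
    """Match `phrase` only when delimited by whitespace or the string ends."""
    return "(^|[[:space:]])" + escape_regex(phrase) + "([[:space:]]|$)"


pattern = build_exact_phrase_pattern("$hello")
print(pattern)  # (^|[[:space:]])\$hello([[:space:]]|$)

# Hypothetical use with a DB-API cursor (the driver quotes the parameter):
# cursor.execute("SELECT * FROM test WHERE name REGEXP %s", (pattern,))
```

If the pattern is pasted into a SQL string literal instead of being bound as a parameter, each backslash has to be doubled, because the MySQL parser consumes one level of escaping.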

Related

MySQL: Limit the number of characters in LIKE clause?

I'm using this query in my autocomplete feature:
SELECT description FROM my_table WHERE description LIKE "%keyword%"
But this query returns the entire content of the field which is sometimes too long.
Is it possible to limit the number of characters before and after "keyword"?
I suggest using MySQL's REGEXP operator here. For example, to accept a maximum of 10 characters before and after keyword, you could use:
SELECT description
FROM my_table
WHERE description REGEXP '^.{0,10}keyword.{0,10}$';
Note that if you intend to match keyword as a standalone word, you may want to surround it with word boundaries in the regex pattern (\b is available from MySQL 8.0; older versions use [[:<:]] and [[:>:]] instead):
SELECT description
FROM my_table
WHERE description REGEXP '^.{0,10}\\bkeyword\\b.{0,10}$';
To show, for example, 5 characters before and after your word, you can use RIGHT, LEFT and SUBSTRING_INDEX:
select description, concat(RIGHT(SUBSTRING_INDEX(description, 'keyword', 1),5), 'keyword', LEFT(SUBSTRING_INDEX(description, 'keyword', -1),5) ) as snippet
from my_table
where description like "%keyword%";
Check it here: https://dbfiddle.uk/MZcVJgEL

MySQL regex for word boundary containing '#'

I'm trying to search for an example phrase: '#test123' using regex like:
SELECT (...) WHERE x RLIKE '[[:<:]]#test123[[:>:]]'
With no luck. Probably the word boundary selector '[[:<:]]' does not count '#' as a word character.
How can I achieve this? Is there a way to use the MySQL regex word boundary selector, but with exceptions?
MySQL 5.7 Reference Manual / ... / Regular Expressions:
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the alnum class or an
underscore (_).
So # is not a word character, and [[:<:]] cannot match directly in front of it. We need to expand the "word characters" class to include # too. The simplest way is to enumerate the custom word characters directly, a-z0-9_#:
SELECT * FROM
(
SELECT '#test123' AS x UNION ALL
SELECT 'and #test123 too' UNION ALL
SELECT 'not#test123not' UNION ALL
SELECT 'not#test123' UNION ALL
SELECT '#test123not' UNION ALL
SELECT 'not # test123' UNION ALL
SELECT 'test123' UNION ALL
SELECT '#west123'
) t
WHERE x RLIKE '([^a-z0-9_#]|^)#test123([^a-z0-9_#]|$)';
Result:
x
----------------
#test123
and #test123 too
I think you can use the expression below instead:
'[.#.][[:<:]]test123[[:>:]]'
Note: don't put non-word literals between [[:<:]] and [[:>:]], and use the [.x.] form for such characters.
Or (with thanks to @Y.B.)
'(^|.*[^a-zA-Z0-9_])[.#.][[:<:]]test123[[:>:]]'
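The enumerated-class pattern from the first answer uses only constructs that Python's re module shares with MySQL, so its behaviour on the sample rows can be checked outside the database. A small sketch, assuming a case-insensitive collation on the column (emulated here with re.IGNORECASE):

```python
import re

# The same sample rows as in the UNION ALL query from the first answer.
rows = [
    "#test123",
    "and #test123 too",
    "not#test123not",
    "not#test123",
    "#test123not",
    "not # test123",
    "test123",
    "#west123",
]

# Same pattern as in the query; IGNORECASE stands in for a
# case-insensitive MySQL collation (an assumption about the column).
pattern = re.compile(r"([^a-z0-9_#]|^)#test123([^a-z0-9_#]|$)", re.IGNORECASE)

# Keep only the rows where the phrase is delimited on both sides.
matches = [row for row in rows if pattern.search(row)]
print(matches)  # ['#test123', 'and #test123 too']
```

This only mirrors the matching logic; the POSIX-style markers such as [[:<:]] and [.#.] from the other answers have no direct equivalent in Python's re.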

Show/convert only alphanumeric data in sql query [duplicate]

I'm trying to select all rows that contain only alphanumeric characters in MySQL using:
SELECT * FROM table WHERE column REGEXP '[A-Za-z0-9]';
However, it's returning all rows, regardless of the fact that they contain non-alphanumeric characters.
Try this code:
SELECT * FROM table WHERE column REGEXP '^[A-Za-z0-9]+$'
This makes sure that all characters match.
Your statement matches any string that contains a letter or digit anywhere, even if it contains other non-alphanumeric characters. Try this:
SELECT * FROM table WHERE column REGEXP '^[A-Za-z0-9]+$';
^ and $ require the entire string to match rather than just any portion of it, and + looks for 1 or more alphanumeric characters.
You could also use a named character class if you prefer:
SELECT * FROM table WHERE column REGEXP '^[[:alnum:]]+$';
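The difference between the unanchored and the anchored pattern can be seen on a few sample values. This is a sketch using Python's re module, which treats these particular constructs the same way MySQL does; the sample values are made up for illustration:

```python
import re

# Made-up sample values: one clean, one with a dash, one without any
# alphanumeric character, and one empty string.
values = ["abc123", "abc-123", "!!!", ""]

unanchored = re.compile(r"[A-Za-z0-9]")   # one alphanumeric anywhere
anchored = re.compile(r"^[A-Za-z0-9]+$")  # alphanumerics from start to end

loose_hits = [v for v in values if unanchored.search(v)]
strict_hits = [v for v in values if anchored.search(v)]

print(loose_hits)   # ['abc123', 'abc-123']
print(strict_hits)  # ['abc123']
```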
Try this, since REGEXP is not case sensitive except on binary strings:
REGEXP '^[a-z0-9]+$'
There is also this:
select m from table where not regexp_like(m, '^[0-9]\d+$')
which selects the rows where the column (m in the example; change it as needed) is anything other than a run of two or more digits.
Most of the combinations don't work properly on Oracle platforms, but this one does. Sharing for future reference.
Try this
select count(*) from table where cast(col as double) is null;
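In SQL Server, LIKE accepts bracket character sets, so rows containing a non-alphanumeric character can be found with:
SELECT * FROM table_name WHERE column_name like '%[^a-zA-Z0-9]%'
MySQL's LIKE treats the brackets as literal characters, so this form does not carry over to MySQL.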

SELECT to match special characters and case

Field X in a table may contain special characters, e.g. hello!World, and I would like to know if there is a way to match that with HelloWorld (ignoring case and special characters).
SELECT * FROM table WHERE X='Helloworld'
http://sqlfiddle.com/#!9/2afa1/1
If you need an exact match of the string:
SELECT *
FROM table1
WHERE x REGEXP '^hello[[:punct:][:space:]]world$';
And if hello world could be a part of a larger string:
SELECT *
FROM table1
WHERE x REGEXP 'hello[[:punct:][:space:]]world';
What you can do is to replace all special characters like this:
SELECT * FROM table WHERE LOWER(REPLACE(X, '!', '')) = LOWER('HelloWorld');
Chain those replacements if you have to replace more:
SELECT * FROM table WHERE LOWER(REPLACE(REPLACE(X, '!', ''), '?', '')) = LOWER('HelloWorld');
If I understood your question right, you need to filter out non-ASCII characters? Please confirm whether this is true. In order to do that, have a look at REGEXP matching as in the comment link and this question.
Try something like:
SELECT * FROM `table` WHERE `X` REGEXP 'Helloworld';
REGEXP 'hello[^[:alpha:]]*world'
Notes:
This finds the string in the middle of other stuff; add ^ and $ to anchor to ends.
This assumes the non-alpha character(s) are between hello and world, not some other spot in the string.
This relies on the relevant collation to do (or not do) case folding.
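If the special characters can sit anywhere in the value rather than only between hello and world, one option is to generate the pattern with an optional gap around every character of the phrase. A rough sketch in Python, assuming the pattern is built in application code and that only letters and digits of the phrase matter; the helper name build_loose_pattern is hypothetical, and it uses [:alnum:] rather than [:alpha:] so that digits in the phrase are kept:

```python
def build_loose_pattern(phrase):
    """Allow any run of non-alphanumeric characters around each letter or digit."""
    gap = "[^[:alnum:]]*"
    # Drop anything in the phrase that is not a letter or digit.
    chars = [ch for ch in phrase if ch.isalnum()]
    # Anchor to both ends so the whole value must match.
    return "^" + gap + gap.join(chars) + gap + "$"


loose = build_loose_pattern("Hi")
print(loose)  # ^[^[:alnum:]]*H[^[:alnum:]]*i[^[:alnum:]]*$
```

Case folding is still left to the column's collation, as in the notes above.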

MySQL regex query case insensitive

In my table I have first name and last name. Some names are upper case (ABRAHAM), some are lower case (abraham), and some are capitalized (Abraham).
So when I use the where condition REGEXP '^[abc]', I am not getting the proper records. How can I change the names to lower case and use them in a SELECT query?
SELECT * FROM `test_tbl` WHERE cus_name REGEXP '^[abc]';
This is my query. It works fine if the records are lower case, but my records are mixed: not all customer names are lower case, and most are capitalized.
So the above query does not return the proper records.
I think you should query your database making sure that the names are lowercased. Suppose that name is the name you wish to find, and in your application you've lowercased it to 'abraham'; your query should then look like this:
SELECT * FROM `test_tbl` WHERE LOWER(cus_name) = name
Since I don't know what language you use, I've just written name, but make sure that it is lowercased and you should retrieve Abraham, ABRAHAM or any variation of the name!
Hope it helps!
Have you tried:
SELECT * FROM `test_tbl` WHERE LOWER(cus_name) REGEXP '^[abc]';
I don't know since when, but nowadays MySQL REGEXP is case insensitive, except when used with binary strings.
https://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html
You don't need regexp to search for names starting with a specific string or character.
SELECT * FROM `test_tbl` WHERE cus_name LIKE 'abc%' ;
% is the wildcard character. The search is case insensitive unless you set the binary attribute for column cus_name or you use the BINARY operator:
SELECT * FROM `test_tbl` WHERE BINARY cus_name LIKE 'abc%' ;
A few valid options already presented, but here's one more with just regex:
SELECT * FROM `test_tbl` WHERE cus_name REGEXP '^[abcABC]';