Field X in table may contain special characters e.g hello!World and I would like to know if there is a way to match that with HelloWorld (Ignore case and special characters).
SELECT * FROM table WHERE X='Helloworld'
http://sqlfiddle.com/#!9/2afa1/1
if you need exaclty match of string:
SELECT *
FROM table1
WHERE x REGEXP '^hello[[:punct:],[:space:]]world$';
And if hello world could be a part of larger string:
SELECT *
FROM table1
WHERE x REGEXP 'hello[[:punct:],[:space:]]world';
What you can do is to replace all special characters like this:
SELECT * FROM table WHERE LOWER(REPLACE(X, '!', '')) = LOWER('HelloWorld');
Chain those replacements if you have to replace more:
SELECT * FROM table WHERE LOWER(REPLACE(REPLACE(X, '!', ''), '?', '')) = LOWER('HelloWorld');
If I understood your question right, you need to filter out non-ASCII characters? Please confirm whether this is true. In order to do that, have a look at REGEXP matching as in the comment link and this question.
Try something like
SELECT * FROM `table ` WHERE `X` REGEXP 'Helloworld';
REGEXP 'hello[^[:alpha:]]*world'
Notes:
This finds the string in the middle of other stuff; add ^ and $ to anchor to ends.
This assumes the non-alpha character(s) are between hello and world, not some other spot in the string.
This relies on the relevant collation to do (or not do) case folding.
Related
I have a search module with SQL query like this:
SELECT FROM trilers WHERE title '%something%'
And when I search for keyword for example like "spiderman" it returns not found, but when I search for "spider-man" it returns my content (original row in MySQL is "spider-man").
How can I ignore all symbols like -, #, !, : and return content with "spiderman" and "spider-man" keywords at the same time?
What you can do is replace the characters you don't care about before the search takes place.
First iteration would look like this:
SELECT * FROM trilers WHERE REPLACE(title, '-', '') LIKE '%spiderman%'
This would ignore any '-'.
Next you would rap that with another REPLACE to include '#' like this:
SELECT * FROM trilers WHERE REPLACE(REPLACE(title, '-', ''), '#', '') LIKE '%spiderman%'
For all 3 ('!','-','#') you would just increase the Replace with another Replace like this:
SELECT * FROM trilers WHERE REPLACE(REPLACE(REPLACE(title, '-', ''), '#', ''),'!','') LIKE '%spiderman%'
You could try something like
SELECT * FROM trilers WHERE replace(title, '-', '') LIKE '%spiderman%'
The other answers involving using REPLACE are great, but if you don't care what characters appear between "spider" and "man" or how many characters there are between the two strings, you can use an additional wildcard in your expression:
SELECT * FROM Superheroes WHERE HeroName LIKE '%spider%man%';
If you want to match only one character, but allow any character, you can use the _ wildcard, which matches only one character:
SELECT * FROM Superheroes WHERE HeroName LIKE '%spider_man%';
This will match "spideryman" and "spideryman in la la land" but not "spiderysupereliteuberheroman".
If you have a limited number of possible symbols, a way to do it without REPLACE is to use a disjunctive expression:
SELECT * FROM Superheroes WHERE
HeroName LIKE '%spiderman%'
OR
HeroName LIKE '%spider-man%'
OR
HeroName LIKE '%spider#man%'
OR
HeroName LIKE '%spider!man%';
WHERE trilers REGEXP '[[:<:]]spider[-#!:]?man[[:>:]]'
Some discussion:
[[:<:]] -- word boundary
[-#!:] -- character set, matches any of them. ('-' must be first)
[-#!:]? -- optional -- so that 'spiderman' will still match
This, unlike the rest of the answers, will avoid matching
spidermaniac
Also, consider using FULLTEXT.
You should be able to use your search with a small update. You should be able to do something like: SELECT FROM trilers WHERE title LIKE '%spider%'. This should search for anything where spider is before or after something else like the hyphen (-)
I've got a search function which creates query. My goal is to search for exact word, so if the phrase is 'hello' it should return only results with 'hello' (not with 'xhello', 'helloxx' etc). My code looks like:
SELECT (...) WHERE x RLIKE '[[:<:]]word[[:>:]]'
And it works for most of the cases, BUT
the problem starts when the phrase is f.e. '$hello', or 'helloĊ' etc - the special chars ruin the functionality.
Is there a way to handle it ?
Try
SELECT * FROM table WHERE x RLIKE '(^|[[:space:]])Hello([[:space:]]|$)'
or
SELECT * FROM table WHERE x RLIKE '(^| )Hello( |$)'
or
SELECT * FROM table WHERE x REGEXP '(^|[[:space:]])Hello([[:space:]]|$)'
or
SELECT * FROM test WHERE name REGEXP '(^| )Hello( |$)'
I'm trying to select all rows that contain only alphanumeric characters in MySQL using:
SELECT * FROM table WHERE column REGEXP '[A-Za-z0-9]';
However, it's returning all rows, regardless of the fact that they contain non-alphanumeric characters.
Try this code:
SELECT * FROM table WHERE column REGEXP '^[A-Za-z0-9]+$'
This makes sure that all characters match.
Your statement matches any string that contains a letter or digit anywhere, even if it contains other non-alphanumeric characters. Try this:
SELECT * FROM table WHERE column REGEXP '^[A-Za-z0-9]+$';
^ and $ require the entire string to match rather than just any portion of it, and + looks for 1 or more alphanumberic characters.
You could also use a named character class if you prefer:
SELECT * FROM table WHERE column REGEXP '^[[:alnum:]]+$';
Try this:
REGEXP '^[a-z0-9]+$'
As regexp is not case sensitive except for binary fields.
There is also this:
select m from table where not regexp_like(m, '^[0-9]\d+$')
which selects the rows that contains characters from the column you want (which is m in the example but you can change).
Most of the combinations don't work properly in Oracle platforms but this does. Sharing for future reference.
Try this
select count(*) from table where cast(col as double) is null;
Change the REGEXP to Like
SELECT * FROM table_name WHERE column_name like '%[^a-zA-Z0-9]%'
this one works fine
I've been to the regexp page on the MySQL website and am having trouble getting the query right. I have a list of links and I want to find invalid links that do not contain a period. Here's my code that doesn't work:
select * from `links` where (url REGEXP '[^\\.]')
It's returning all rows in the entire database. I just want it to show me the rows where 'url' doesn't contain a period. Thanks for your help!
SELECT c1 FROM t1 WHERE c1 NOT LIKE '%.%'
Your regexp matches anything that contains a character that isn't a period. So if it contains foo.bar, the regexp matches the f and succeeds. You can do:
WHERE url REGEXP '^[^.]*$'
The anchors and repetition operator make this check that every character is not a period. Or you can do:
WHERE LOCATE(url, '.') = 0
BTW, you don't need to escape . when it's inside [] in a regexp.
Using regexp seems like an overkill here. A simple like operator would do the trick:
SELECT * FROM `links` WHERE url NOT LIKE '%.%
EDIT:
Having said that, if you really want to negate regexp, just use not regexp:
SELECT * FROM `links` WHERE url NOT REGEXP '[\\.]';
I have the following strings in the following pattern in a table in my db:
this_is_my_string_tester1
this_is_my_string_mystring2
this_is_my_string_greatstring
I am trying to match all strings that start with a specific pattern split by underscores i.e.this_is_my_string_ and then a wildcard final section
Unfortunately there is an added complication where some strings like the following:
this_is_my_string_tester1_yet_more_text
this_is_my_string_mystring2_more_text
this_is_my_string_greatstring_more
Therefore taking the following as examples:
this_is_my_string_tester1
this_is_my_string_mystring2
this_is_my_string_greatstring
this_is_my_string_tester1_yet_more_text
this_is_my_string_mystring2_more_text
this_is_my_string_greatstring_more
I am trying to have returned:
this_is_my_string_tester1
this_is_my_string_mystring2
this_is_my_string_greatstring
I have no idea how to do this with a like statement. Is this possible if so how?
EDIT
There is one final complication:
this_is_my_string
needs to be supplied as a list i.e in
(this_is_my_string, this_is_my_amazing_string, this_is_another_amazing_string)
SELECT * FROM atable WHERE afield REGEXP 'this_is_my_string_[a-z]+'
It might be faster if you have an index on afield and do
SELECT * FROM atable WHERE afield REGEXP 'this_is_my_string_[a-z]+'
AND afield LIKE 'this_is_my_string_%'
After edit of question:
Either
SELECT * FROM atable
WHERE afield REGEXP '(this_is_my_string|this_is_my_amazing_string)_[a-z]+'
or maybe you want something like having a table with the prefixes:
SELECT *
FROM atable AS t,
prefixes AS p
WHERE afield REGEXP CONCAT(p.prefix, '_[a-z]+')
As by the reference documentation this should not be possible, as a pattern (string literal) is required. Give it a try nevertheless.
There the answer of #KayNelson with LIKE (?) and INSTR might do instead of REGEXP.
try this
SELECT * FROM db.table WHERE strings LIKE 'this_is_my_string_%' AND instr(Replace(strings,"this_is_my_string_",""),"_") = 0;
It checks if more _ occurs after replacing the standard "this_is_my_string_"