MySQL SELECT with REGEXP: matching a literal full stop

I have this small SQL select:
select * from import_daten WHERE lastname REGEXP 'dipl\.|dr\.';
I just want to filter the rows with ing. and dipl., but with that statement I also get people with e.g. "Abendroth" in lastname, because of the "dr" in the name.
The same happens with
select * from import_daten WHERE lastname REGEXP 'dipl.|dr.';
How can I include the full stop correctly in the regexp?

REGEXP '(dipl|dr)[.]'
Be careful of start/end of word:
mysql> SELECT 'dr.' REGEXP 'dr[.][[:>:]]', 'dr.' REGEXP 'dr[.]';
+-----------------------------+----------------------+
| 'dr.' REGEXP 'dr[.][[:>:]]' | 'dr.' REGEXP 'dr[.]' |
+-----------------------------+----------------------+
|                           0 |                    1 |
+-----------------------------+----------------------+
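Notice how the first one fails? That is because . is not a character that can exist in a 'word', so there is no end-of-word boundary after it.
Also, I used [.] instead of \. because of the problem of escaping the escape character: inside a MySQL string literal, '\.' is reduced to '.' before the regex engine ever sees it, so in some situations you need \\.; in others you might need \\\\.. Too confusing.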
If necessary you can use 'word start': REGEXP '[[:<:]](dipl|dr)[.]'
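To see the difference between an unescaped dot and a bracketed one outside the database, here is a minimal sketch using Python's re module. It is only a stand-in for MySQL's regex engine, and the sample names and the helper `matching` are made up for illustration. re.IGNORECASE is used on the assumption that the column has a case-insensitive collation.

```python
import re

# Hypothetical sample values for the lastname column.
names = ["Abendroth", "Dr. Meier", "Dipl. Ing. Huber", "Andrews"]

def matching(pattern, values):
    """Return the values in which the pattern matches anywhere, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [v for v in values if rx.search(v)]

# Unescaped dot: "dr." means "dr" plus any character, so "dro" and "dre" match too.
loose = matching(r"dipl.|dr.", names)

# Bracketed dot: only a literal full stop after "dipl" or "dr" matches.
strict = matching(r"(dipl|dr)[.]", names)

print(loose)
print(strict)
```

The bracket form has the advantage that it needs no backslash at all, so it reads the same in the SQL string and in the regex.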

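I think you want this:
select * from import_daten WHERE lastname REGEXP '(dipl\\.)|(ing\\.)';
The backslash has to be doubled, because MySQL's string parser removes one level of escaping before the regex is evaluated.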

You probably want to make sure the pattern starts at a "word boundary." MySQL's regular expression syntax has special character sequences for that:
select * from import_daten WHERE lastname REGEXP '[[:<:]](dipl|dr)[.]';
Only the start-of-word marker [[:<:]] is used here. An end-of-word marker [[:>:]] placed after the full stop would never match, because . is not a word character.
See http://dev.mysql.com/doc/refman/5.7/en/regexp.html. It's nearly the last item on the page before that page's user comments.

Related

Regex to match exact word without spaces

I have a query such as the below:
SELECT * from table_name where lastname regexp "[[:<:]]Smith[[:>:]]"
This returns
De Smith
Smith
I only need to retrieve Smith
I even tried the below
SELECT * from surnames where last_name regexp "[[:<:]][^\s]Smith[[:>:]]"
I may be mistaking your requirement, but if you want to exactly match a last name then you can just use the equals operator:
SELECT * from table_name where TRIM(lastname) = 'Smith'
I used TRIM() on the lastname field just in case there might be leading or trailing whitespace.
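The difference between a word-boundary search and an exact comparison can be sketched in Python. This is an illustration only: Python's \b stands in for MySQL's [[:<:]] and [[:>:]], the sample values are invented, and the comparison here is case-sensitive, which may differ from your column's collation.

```python
import re

# Hypothetical sample values for the lastname column.
last_names = ["De Smith", "Smith", " Smith ", "Smithson"]

# Word-boundary search: finds "Smith" as a whole word anywhere in the value.
as_word = [n for n in last_names if re.search(r"\bSmith\b", n)]

# Exact comparison after removing leading and trailing spaces,
# mirroring TRIM(lastname) = 'Smith'.
exact = [n for n in last_names if n.strip(" ") == "Smith"]

print(as_word)
print(exact)
```

The word-boundary version keeps "De Smith" because "Smith" is still a complete word inside it; only the exact comparison rejects it.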

Select Statement in Talend using REGEXP (MYSQL) to find spaces in a first name field

I am writing a select statement to view first names with spaces in them, e.g. JO ANN or TERRY LYNN.
My statement looks like:
SELECT FirstName FROM `DB`.`TABLE` where FirstName REGEXP ' '
I know the names exist because I can see them in the preview; I just need a select statement that returns only the names with spaces.
It's better to use the POSIX class that matches all whitespace characters, because yours won't match names that contain tabs.
SELECT FirstName FROM `DB`.`TABLE` where FirstName REGEXP '[[:space:]]'
This would match a name only if there is a space or tab between two letters.
SELECT FirstName FROM `DB`.`TABLE` where FirstName REGEXP '[[:alpha:]][[:blank:]]+[[:alpha:]]'
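As a rough illustration of how the two patterns differ, the following Python sketch applies approximate equivalents to a few invented names. Python's \s is used in place of [[:space:]], and [ \t] in place of [[:blank:]]; these are close but not guaranteed to be identical to MySQL's classes.

```python
import re

# Hypothetical sample values; one contains a tab, one has a trailing space.
first_names = ["JO ANN", "TERRY\tLYNN", "MARY", "BOB "]

# Any whitespace character anywhere, like REGEXP '[[:space:]]'.
any_space = [n for n in first_names if re.search(r"\s", n)]

# A space or tab with a letter on both sides, like
# REGEXP '[[:alpha:]][[:blank:]]+[[:alpha:]]'.
between_letters = [
    n for n in first_names if re.search(r"[A-Za-z][ \t]+[A-Za-z]", n)
]

print(any_space)
print(between_letters)
```

The second pattern drops "BOB " because its only space is trailing, which is usually what you want when looking for two-part first names.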

Query to check if a certain row has 2 words

How to return rows where a column has 2 words (that is, strings separated by a space) in it?
It must be purely using SQL.
SELECT * FROM table WHERE name (has 2 strings in it);
I don't know the names when querying. It's a big dataset. I basically only have to check whether the name contains a space (from a comment).
If you want to distinguish names that have two parts from one-part and three-plus-part names, you can use a regular expression:
SELECT * FROM my_table WHERE name REGEXP '^[^ ]+[ ]+[^ ]+$'
This regular expression matches when the entire string consists of two non-empty parts containing no spaces, with one or more spaces separating them.
This works perfectly for me.
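The same pattern can be tried out in Python to check which inputs it accepts. This is only a sketch; `has_two_parts` is a hypothetical helper name, and Python's re module stands in for MySQL's engine.

```python
import re

# Same pattern as in the query: non-spaces, one or more spaces, non-spaces.
TWO_PARTS = re.compile(r"^[^ ]+[ ]+[^ ]+$")

def has_two_parts(name):
    """True when the whole string is exactly two space-separated parts."""
    return TWO_PARTS.search(name) is not None

for sample in ["first last", "single", "a b c", "first   last", " lead"]:
    print(repr(sample), has_two_parts(sample))
```

Because the pattern is anchored at both ends, values with leading or trailing spaces are rejected as well as values with three or more parts.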
You can use the AND condition and the LIKE operator with wildcards (%).
SELECT * FROM table_name WHERE name LIKE '%Word1%' AND name LIKE '%Word2%'
How about simply:
...
WHERE [name] LIKE '%Word1%'
AND [name] LIKE '%Word2%'
SELECT * FROM table WHERE concat(' ',name,' ') like '% str1 %'
AND concat(' ',name,' ') like '% str2 %'
The extra blanks are there to separate words.
You can use the following technique
mysql> select length('first name')-length(replace('first name',' ','')) as diff;
+------+
| diff |
+------+
| 1 |
+------+
I.e. take the difference between the length of the name and the length of the name with the spaces removed; if it is 1, the value has the form firstname lastname.
So the query may look like
select * from table
where
length(col_name)-length(replace(col_name,' ','')) = 1
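The length arithmetic can be sketched in Python to see what it counts. The function names are made up for illustration; the logic is the same subtraction as in the query.

```python
def space_count(value):
    """Number of spaces, computed as LENGTH(col) - LENGTH(REPLACE(col, ' ', ''))."""
    return len(value) - len(value.replace(" ", ""))

def has_exactly_one_space(value):
    return space_count(value) == 1

for sample in ["first name", "single", "a b c", " lead"]:
    print(repr(sample), space_count(sample), has_exactly_one_space(sample))
```

Note that this counts spaces rather than words, so a value such as ' lead' with one leading space also passes. Trimming the column first may be worth considering if that matters for your data.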
Use % between words.
Example:
SELECT * FROM Table WHERE Col LIKE '%word1%word2%'
SELECT * FROM table_name WHERE name LIKE '% %'

Case sensitive RLIKE

Consider a table datatbl like this:
+----------+
| strfield |
+----------+
| abcde |
| fgHIJ |
| KLmno |
+----------+
I want to write a query something like this:
select * from datatbl where strfield rlike '[a-z]*';
As in a non-SQL regex, I'd like to return the lowercase row with abcde, but not the rows with capitals. I cannot seem to find an easy way to do this. Am I missing something stupid?
The MySQL REGEXP/RLIKE sucks for this - you need to cast the data as BINARY for case-sensitive searching:
SELECT *
FROM datatbl
WHERE CAST(strfield AS BINARY) rlike '^[a-z]+$';
The pattern also has to be anchored: an unanchored '[a-z]*' matches every row, because it is satisfied by an empty match.
You'll find this raised in the comments for the REGEXP/RLIKE documentation.
Edit: I misread the OP; this is a solution for the opposite case, where MySQL uses a case-sensitive collation and you need to compare strings case-insensitively.
MySQL 5.x
You can work around it using the LOWER() function, too.
SELECT *
FROM datatbl
WHERE LOWER(strfield) RLIKE '[a-z]*';
MySQL 8+
If you are running MySQL 8+, you can also use the case-insensitive switch in the REGEXP_LIKE() function.
SELECT *
FROM datatbl
WHERE REGEXP_LIKE(strfield, '[a-z]*', 'i');
For a case-sensitive regex you can use REGEXP_LIKE() with match type c, like this:
SELECT * FROM `table` WHERE REGEXP_LIKE(`column`, 'value', 'c');
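The effect of anchoring and of the case switch can be checked with a small Python sketch on the three sample rows from the question. Python's re module is case-sensitive by default, so it plays the role of the 'c' match type here, and re.IGNORECASE plays the role of 'i'. It is an approximation, not MySQL's engine.

```python
import re

# The three rows of datatbl from the question.
rows = ["abcde", "fgHIJ", "KLmno"]

# Unanchored '[a-z]*' can match the empty string, so every row qualifies.
unanchored = [r for r in rows if re.search(r"[a-z]*", r)]

# Anchored and case-sensitive: only all-lowercase rows qualify.
lower_only = [r for r in rows if re.search(r"^[a-z]+$", r)]

# Anchored and case-insensitive: all three rows qualify again.
any_case = [r for r in rows if re.search(r"^[a-z]+$", r, re.IGNORECASE)]

print(unanchored)
print(lower_only)
print(any_case)
```

The first list shows why case sensitivity alone is not enough for the question as asked: without ^ and $, the pattern accepts any string.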

MySQL REGEXP: matching blank entries

I have this SQL condition that is supposed to retrieve all rows that satisfy the given regexp condition:
country REGEXP ('^(USA|Italy|France)$')
However, I need to add a pattern for retrieving all blank country values. Currently I am using this condition
country REGEXP ('^(USA|Italy|France)$') OR country = ""
How can I achieve the same effect without having to include the OR clause?
Thanks,
Erwin
This should work:
country REGEXP ('^(USA|Italy|France|)$')
However, from a performance point of view, you may want to use the IN syntax:
country IN ('USA','Italy','France', '')
The latter should be faster, as REGEXP can be quite slow.
There's no reason you can't use the $ (match end of string) to fill in your "empty subexpression" issue...
It looks a little weird, but country REGEXP ('^(USA|Italy|France|$)$') will actually work.
You could try:
country REGEXP ('^(USA|Italy|France|)$')
I just added another | after France, which should basically tell it to also match ^$, which is the same as country = ''.
Update: since this method doesn't work, I would recommend you use this regex:
country REGEXP ('^(USA|Italy|France)$|^$')
Note that you can't use the regex: ^(USA|Italy|France|.{0})$ because it will complain that there is an empty sub expression. Although ^(USA|Italy|France)$|^.{0}$ would work.
Here are some examples of the return value of this regex:
select '' regexp '^(USA|Italy|France)$|^$'
> 1
select 'abc' regexp '^(USA|Italy|France)$|^$'
> 0
select 'France' regexp '^(USA|Italy|France)$|^$'
> 1
select ' ' regexp '^(USA|Italy|France)$|^$'
> 0
As you can see, it returns exactly what you want.
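If you want to treat all blank values the same (e.g. 0 spaces and 5 spaces both count as blank), you should use the regex:
country REGEXP ('^(USA|Italy|France|[[:space:]]*)$')
A plain \s is not reliable here: inside a MySQL string literal, '\s' is reduced to 's' before the regex is evaluated, so the POSIX class is used instead.
This will cause the last row in the previous example to behave differently, i.e.:
select ' ' regexp '^(USA|Italy|France|[[:space:]]*)$'
> 1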
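For a quick check of both variants outside the database, here is a small Python sketch. Python's re module is an assumption standing in for MySQL's engine (it accepts \s directly), and the helper `is_allowed` is hypothetical.

```python
import re

# Listed countries or a completely empty string.
ALLOWED_OR_EMPTY = re.compile(r"^(USA|Italy|France)$|^$")

# Listed countries or any run of whitespace, including none.
ALLOWED_OR_BLANK = re.compile(r"^(USA|Italy|France|\s*)$")

def is_allowed(value, pattern=ALLOWED_OR_EMPTY):
    """True when the whole value is a listed country or counts as blank."""
    return pattern.search(value) is not None

for sample in ["", "abc", "France", " "]:
    print(repr(sample), is_allowed(sample), is_allowed(sample, ALLOWED_OR_BLANK))
```

The two patterns disagree only on whitespace-only values, which is the design decision to make: whether a value of a few spaces should count as blank.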