How to match any letter in SQL? - mysql

I want to return rows where certain fields follow a particular pattern such as whether a particular character in a string is a letter or number. To test it out, I want to return fields where the first letter is any letter. I used this code.
SELECT * FROM sales WHERE `customerfirstname` like '[a-z]%';
It returns nothing. So I would think that the criteria is the first character is a letter and then any following characters do not matter.
The following code works, but limits rows where the first character is an a.
SELECT * FROM sales WHERE `customerfirstname` like 'a%';
Am I not understanding pattern matching? Isn't it [a-z] or [A-Z] or [0-9] for any letter or number?
Also if I wanted to run this test on the second character in a string, wouldn't I use
SELECT * FROM `sales` WHERE `customerfirstname` like '_[a-z]%'
This is for SQL and MySQL. I am doing this in phpmyadmin.

You want to use regular expressions:
SELECT s.*
FROM sales s
WHERE s.customerfirstname REGEXP '^[a-zA-Z]';

This can be achieved with a regular expression.
SELECT * FROM sales WHERE REGEXP_LIKE(customerfirstname, '^[[:alpha:]]');
^ denotes the start of the string, while the [:alpha] character class matches any alphabetic character.
Just in case, here are a few others character classes that you may find useful :
alnum : dlphanumeric characters
digit: digit characters
lower : lowercase alphabetic characters
upper: uppercase alphabetic characters
See the mysql regexp docs for many more...

Related

Using REGEXP within MySQL to find a certain number within a comma separated list

I have a list of numbers in some fields in a table, for example something like this:
2033,1869,1914,1913,19120,1911,1910,1909,1908,1907,1866,1921,1922,1923
Now, I'm trying to do a query to check if a number is found in the row, however, I can't use LIKE as then it may return false positives as if I did a search for 1912 in the above field I would get a result returned because of the number 19120, obviously we don't want that - we can't append or prepend a comma as the start/end numbers don't have them.
So, onto using REGEXP I go... I tried this, but it doesn't work (it returns a result):
SELECT * FROM cat_listing WHERE cats REGEXP '[^0-9]*1912[^0-9]*';
I imagine why it still finds something is because of the * quantifier; it found [^0-9] 0 times AFTER 1912 so it considers it a match.
I'm not sure how to modify it to do what I want.
In your case, it seems word boundaries are necessary:
SELECT * FROM cat_listing WHERE cats REGEXP '[[:<:]]1912[[:>:]]';
[[:<:]] is the beginning of a word and [[:>:]] is the end. See reference:
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the beginning and end of >words, respectively. A word is a sequence of word characters that is not >preceded by or followed by word characters. A word character is an alphanumeric >character in the alnum class or an underscore (_).
You have another option called find_in_set()
SELECT * FROM cat_listing WHERE find_in_set('1912', cats) <> 0;
Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (“,”) character.
No need to use a regex just because the column value has no comma at either end:
SELECT
cats
FROM cat_listing
WHERE INSTR(CONCAT(',', cats, ','), ',1912,')
;
See it in action: SQL Fiddle.
Please comment if adjustment / further detail is required.

mysql REGEXP to match a string ending in 2 digits that are preceded only by non-digits

I have struggled quite a bit with this.
In MySQL I would like to query a column for strings that end with a double-digit number. I don't care what else comes before, as long as the last characters are two digits preceded by at least one non-digit
For instance, these strings should match:
"Nov. 12th, 60"
"34 Bar 56"
"Foo-BAR-01"
"33-44-55"
"-------88"
"99"
When I do this:
SELECT <string> REGEXP <pattern>
Now, the <pattern> bit is what I need help on.
Thanks.
SELECT * FROM mytable WHERE mycolumn REGEXP "^.*[^0-9][0-9]{2}$";
Regex Explanation:
^.*[^0-9][0-9]{2}$
Assert position at the beginning of the string «^»
Match any single character that is NOT a line break character «.*»
Between zero and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives «*»
Match any single character that is NOT present in the list below and that is NOT a line break character «[^0-9]»
A character in the range between “0” and “9” «0-9»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the very end of the string «$»
There is no need to describe all the string, the end should suffice:
SELECT 'Nov. 12th, 60' REGEXP '(^|[^[:digit:]])[[:digit:]]{2}$';
SELECT '99' REGEXP '(^|[^[:digit:]])[[:digit:]]{2}$'
So in short the two digits at the end [[:digit:]]{2}$ are preceded by a non digit character [^[:digit:]], OR | the begining of the string ^.
Note: you can write it in a less posix style, it doesn't change anything (only a little shorter):
SELECT 'Nov. 12th, 60' REGEXP '(^|[^0-9])[0-9]{2}$';

Escape percent sign in SQL query regex?

This query works:
SELECT * FROM table WHERE column REGEXP "[[:<:]]100[[:>:]]"
But this doesn't. It returns nothing, but I have a value "100%" on my table.
SELECT * FROM table WHERE column REGEXP "[[:<:]]100%[[:>:]]"
How can I make the query with the percent signal?
It depends. You've got to change the character class [[::>::]] to something else that matches your need, because the percent sign is no word character, see REGEXP documentation:
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the alnum class or an
underscore (_).
If it could be followed by i.e whitespace or punctuation or the end of the string then you could use
SELECT * FROM example WHERE content REGEXP '[[:<:]]100%([[:blank:]]|[[:punct:]]|$)';
You can see in this demo that you don't have to escape the percent sign (even you could do it).
The percent sign makes the use of the word end boundary not applicable anymore. You have to find a combination of other markers that will do for you.
As far as I know % is not a special character in regular expressions.
You should be able to just fire off:
SELECT * FROM table WHERE column REGEXP '[[:<:]]100%[[:>:]]'

Regex : string must contain a-z and A-Z and 0-9

I'm using a stored procedure to validate the input parameter. The input parameter must contain a-z and A-Z and 0-9.
for Example:
aS78fhE0 -> Correct
76AfbRZt -> Correct
76afbrzt -> Incorrect(doesn't contain Upper Case A-Z)
asAfbRZt -> Incorrect(doesn't contain Numeric 0-9)
4QA53RZJ -> Incorrect(doesn't contain Lower Case a-z)
what Regular Expression that can validate the input parameter like above example,.?
Many Thanks,Praditha
UPDATEOthers character except Alphanumeric are not allowedI'm Using MySQL version 5
Further from Johns Post and subsequent comments:
The MySql you require would be
SELECT * FROM mytable WHERE mycolumn REGEXP BINARY '[a-z]'
AND mycolumn REGEXP BINARY '[A-Z]'
AND mycolumn REGEXP BINARY '[0-9]'
Add additional
AND mycolum REGEXP BINARY '^[a-zA-Z0-9]+$'
If you only want Alphanumerics in the string
With look-ahead assertion you could do like this:
/^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).*$/
update: It seems mysql doesn't support look around assertions.
You could split it up into 3 separate regex to test for each case.
[a-z], [A-Z], and [0-9]
and the results of those matches together, and you can achieve the result you're looking for.
EDIT:
if you're only looking to match alphanumerics, you should do ^[a-zA-Z0-9]+$ as suggested by Ed Head in the comments
My solution is leads to a long expression becuase i will permutate over all 6 possibilities the found capital letter, small letter and the needed number can be arranged in the string:
^(.*[a-z].*[A-Z].*[0-9].*|
.*[a-z].*[0-9].*[A-Z].*|
.*[A-Z].*[a-z].*[0-9].*|
.*[A-Z].*[0-9].*[a-z].*|
.*[0-9].*[a-z].*[A-Z].*|
.*[0-9].*[A-Z].*[a-z].*)$
Edit: Forgot the .* at the end and at the beginning.
Unfortunately, MySQL does not support lookaround assertions, therefore you'll have to spell it out for the regex engine (assuming that only those characters are legal):
^(
[A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|
[A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|
[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|
[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*|
[A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|
[A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*
)$
or, in MySQL:
SELECT * FROM mytable WHERE mycolumn REGEXP BINARY "^([A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|[A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*|[A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|[A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*)$";
[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*

Match beginning of words in Mysql for UTF8 strings

I m trying to match beignning of words in a mysql column that stores strings as varchar. Unfortunately, REGEXP does not seem to work for UTF-8 strings as mentioned here
So,
select * from names where name REGEXP '[[:<:]]Aandre';
does not work if I have name like Foobar Aándreas
However,
select * from names where name like '%andre%'
matches the row I need but does not guarantee beginning of words matches.
Is it better to do the like and filter it out on the application side ? Any other solutions?
A citation from tha page you mentioned:
Warning
The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.
select * from names where name like 'andre%'
select * from names where name like 'andre%' is not solution for eg:
name = 'richard andrew', because the string begining with richa... and not with andre...
for the moment, the temporaly solution, for search words (words != string) starting with a string
select * from names where name REGEXP '[[:<:]]andre';
But it no matching with accented words, eg: ándrew.
Any other solution, with regular expressions (mysql) to search in accented words?