RegEx SQL, combine known pattern with variable numbers - mysql

I want to check whether there is a known pattern with variable numbers.
This column 'shortcut' has values like this
|shortcuts|
-----------
|ab1
|ab2
|ab23
|abc123
The only thing I've got for my SQL-statement is the alphabetical pattern e.g. 'ab'
So I started with
SELECT * FROM mytable WHERE shortcut LIKE 'ab%'
I only need ab1, ab2 and ab23 and NOT abc123.
Is there a way to modify my statement? There is at least one number, numbers always follow the known pattern and the pattern is the only known value.

You can use regular expressions:
where shortcut regexp '^ab[0-9]+'
This says that shortcut starts with "ab" and is followed by at least one digit.

You can use
SELECT * FROM mytable WHERE shortcut REGEXP '^ab[0-9]+$'
The ^ab[0-9]+$ regex matches:
^ - start of string
ab - an ab string (case-insensitively; use BINARY after REGEXP to make it case-sensitive)
[0-9]+ - one or more digits
$ - end of string.
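To see what the trailing $ changes, the two patterns can be tried against the sample values outside the database. This is a minimal sketch that uses Python's re module as a stand-in for MySQL's REGEXP. That is an assumption: the engines differ in details, for example REGEXP is case-insensitive by default and re is not. The value 'ab1x' and the variable names are made up for illustration.

```python
import re

# Sample values from the question, plus 'ab1x' as a hypothetical extra row.
shortcuts = ["ab1", "ab2", "ab23", "abc123", "ab1x"]

unanchored = re.compile(r"^ab[0-9]+")   # only checks the start of the value
anchored = re.compile(r"^ab[0-9]+$")    # whole value must be 'ab' plus digits

prefix_matches = [s for s in shortcuts if unanchored.search(s)]
full_matches = [s for s in shortcuts if anchored.search(s)]

print(prefix_matches)
print(full_matches)
```

Both patterns reject abc123; only the anchored one also rejects a value that has extra characters after the digits.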

Related

I don't know how to find .exe with numbers in MYSQL

For example, I would like to find an 'exe' with a fixed path: \Temp\3038.exe
The number before '.exe' always has 4 digits, but its value is random (0001 to 9999).
Only the number changes, to 4 random digits.
I'd really appreciate it if you could tell me what to do.
I tried two ways:
SELECT * FROM myTable WHERE path REGEXP '\\Temp\\\d{4}.exe';
SELECT * FROM myTable WHERE path like '\\Temp\\\d{4}.exe';
my data:
C:\\Users\\AppData\\Local\\Temp\\1536.exe
C:\\Users\\AppData\\Local\\Temp\\6247.exe
C:\\Users\\AppData\\Local\\Temp\\2508.exe
.......(skip)
For these I need to get 1536, 6247, 2508
DBFIDDLE
A regular expression will bite, especially if it has a '\' in it.
When you say the data is: C:\\Users\\AppData\\Local\\Temp\\2508.exe, will it have one '\' or two '\\'? To be honest, I got a bit lost too 😉
The DBFIDDLE uses:
select *
from mytable
where path REGEXP '.*\\\\Temp\\\\\\d{4}.exe';
The REGEXP looks complicated, but let me try to explain. It matches:
.* - a dot is any character, and the asterisk says it can be present 0 or more times.
\\\\ - a literal backslash (with MySQL's default backslash escaping, the string literal turns \\\\ into the regex \\, which matches one backslash)
Temp - the four letters 'T', 'e', 'm' and 'p', in this order.
\\\\ - another literal backslash
\\d{4} - the regex \d represents a digit (0 to 9), and the '{4}' matches exactly four of those.
. - any single character
exe - the three letters in 'exe'.
A bug in this is that this will also match: C:\\Users\\AppData\\Local\\Temp\\2508Xexe
But, I like to keep room for improvements 😉😁
P.S. Replacing the '.' with '\\.' makes it match a literal dot instead of any character.
EDIT: re-reading the question, I see that you only want the numbers.
From the results you can take LEFT(RIGHT(path,8),4)
This first takes the last 8 characters (RIGHT(path,8)), and from those the first 4 characters. This should work because every matching path ends in 4 digits followed by '.exe'.
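The matching and the extraction can also be checked outside MySQL. The following is a rough sketch in Python, under the assumption that the stored paths contain single backslashes; Python's re is only a stand-in for REGEXP, and the names pattern and extract_number are hypothetical. It escapes the dot and anchors the end, so the 2508Xexe false positive mentioned above no longer matches.

```python
import re

# Paths as they might be stored, with single backslashes (an assumption).
paths = [
    r"C:\Users\AppData\Local\Temp\1536.exe",
    r"C:\Users\AppData\Local\Temp\6247.exe",
    r"C:\Users\AppData\Local\Temp\2508.exe",
    r"C:\Users\AppData\Local\Temp\2508Xexe",  # the false positive from the answer
]

# \\ matches one literal backslash, \. a literal dot, $ anchors the end.
pattern = re.compile(r"\\Temp\\([0-9]{4})\.exe$")

def extract_number(path):
    """Return the 4 digits before '.exe', or None if the path does not match."""
    m = pattern.search(path)
    return m.group(1) if m else None

numbers = [extract_number(p) for p in paths]

# Same idea as LEFT(RIGHT(path, 8), 4): last 8 characters, then the first 4.
sliced = paths[0][-8:][:4]
```

The slicing variant relies on the path ending in exactly four digits plus '.exe', which is why it is only safe on rows that already passed the pattern.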

Regex pattern equivalent of %word% in mysql

I need 2 case-insensitive regex patterns. One of them should be the equivalent of SQL's % wildcard, i.e. %word%. My attempt at this was '^[a-zA-Z]*word[a-zA-Z]*$'.
Question 1: This seems to work, but I am wondering if this is the equivalent of %word%.
The second pattern should be similar to %word%, but must not match when there are 3 or more characters before or after the word. So for example if the target word was word:
words = matched because it doesn't have 3 or more characters either before or after it.
swordfish = not matched because it has 3 or more characters after word
theword = not matched because it has 3 or more characters before it
mywordly = matched because it doesn't contain 3 or more characters before or after word.
miswordeds = not matched because it has 3 characters before it (it also has 3 characters after it, but it had already met the criteria).
Question 2: For the second regex, I am not very sure how to start this. I will be using the regex in a MySQL query using the REGEXP function for example:
SELECT 1
WHERE 'SWORDFISH' REGEXP '^[a-zA-Z]*word[a-zA-Z]*$'
First Question:
According to https://dev.mysql.com/doc/refman/8.0/en/string-comparison-functions.html#operator_like
With LIKE you can use the following two wildcard characters in the pattern:
% matches any number of characters, even zero characters.
_ matches exactly one character.
It means the regex '^[a-zA-Z]*word[a-zA-Z]*$' is not equivalent to %word%: % matches any characters, while [a-zA-Z]* matches only letters.
Second Question:
Change * to {0,2} to indicate you want to match at maximum 2 characters either before or after it:
SELECT 1
WHERE 'SWORDFISH' REGEXP '^[a-zA-Z]{0,2}word[a-zA-Z]{0,2}$'
And to make case insensitive:
SELECT 1 WHERE LOWER('SWORDFISH') REGEXP '^[a-z]{0,2}word[a-z]{0,2}$'
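The {0,2} version can be checked against the examples from the question. This is a small sketch in Python, used here only as a stand-in for MySQL's REGEXP; re.IGNORECASE plays the role of the LOWER() call, and the names candidates and results are made up.

```python
import re

# At most 2 letters on each side of 'word'; IGNORECASE replaces LOWER().
pattern = re.compile(r"^[a-z]{0,2}word[a-z]{0,2}$", re.IGNORECASE)

candidates = ["words", "swordfish", "theword", "mywordly", "miswordeds", "SWORDFISH"]
results = {c: bool(pattern.search(c)) for c in candidates}

for value, matched in results.items():
    print(value, matched)
```

Only words and mywordly pass, which agrees with the expected outcomes listed in the question.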
Assuming
The test string (or column) has only letters. (Hence, I can use . instead of [a-z]).
Case folding and accents are not an issue (presumably handled by a suitable COLLATION).
Either way:
WHERE x LIKE '%word%' -- found the word
AND x NOT LIKE '%___word%' -- but fail if too many leading chars
AND x NOT LIKE '%word___%' -- or trailing
or:
WHERE x RLIKE '^.{0,2}word.{0,2}$'
I vote for RLIKE being slightly faster than LIKE -- only because there are fewer ANDs.
(MySQL 8.0 introduced incompatible regexp syntax; I think the syntax above works in all versions of MySQL/MariaDB. Note that I avoided word boundaries and character class shortcuts like \\w.)
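The three-LIKE variant can be tried without a MySQL server. The sketch below runs it in an in-memory SQLite database through Python's sqlite3 module. That is an assumption made for convenience: SQLite's LIKE uses the same % and _ wildcards and is case-insensitive for ASCII by default, but it is not MySQL, and collation behavior may differ. The table name t and the sample rows are hypothetical.

```python
import sqlite3

words = ["words", "swordfish", "theword", "mywordly", "miswordeds"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(w,) for w in words])

rows = conn.execute(
    """
    SELECT x FROM t
    WHERE x LIKE '%word%'          -- found the word
      AND x NOT LIKE '%___word%'   -- fewer than 3 leading characters
      AND x NOT LIKE '%word___%'   -- fewer than 3 trailing characters
    ORDER BY rowid
    """
).fetchall()

matched = [r[0] for r in rows]
conn.close()
```

Each underscore stands for exactly one character, so '%___word%' only matches when at least three characters precede the word.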

MySQL query to find matching string using REGEXP not working

I am using MySQL 5.5.
I have a table named nutritions, having a column serving_data with text datatype.
Some of the values in serving_data column are like:
[{"label":"1 3\/4 cups","unit":"3\/4 cups"},{"label":"1 cups","unit":"3\/4 cups"},{"label":"1 container (7 cups ea.)","unit":"3\/4 cups"}]
Now, I want to find records containing serving_data like 1 3\/4 cups .
For that I've made a query,
SELECT id,`name`,`nutrition_data`,`serving_data`
FROM `nutritions` WHERE serving_data REGEXP '(\d\s\\\D\d\scup)+';
But it does not seem to work.
Also I've tried
SELECT id,`name`,`nutrition_data`,`serving_data`
FROM `nutritions` WHERE serving_data REGEXP '/(\d\s\\\D\d\scup)+/g';
If I use the same pattern in http://regexr.com/ then it seems matching.
Can anyone help me?
Note that in the MySQL 5.5 regex flavor you cannot use shorthand classes like \d, \D or \s; replace them with [0-9], [^0-9] and [[:space:]] respectively.
You may use
REGEXP '[0-9]+[[:space:]][0-9]+\\\\/[0-9]+[[:space:]]+cup'
See the regex demo (note that in general, regex101.com does not support MySQL regex flavor, but the PCRE option supports the POSIX character classes like [:digit:], [:space:], so it is only used for a demo here, not as a proof it works with MySQL REGEXP).
Pattern details:
[0-9]+ - 1 or more digits
[[:space:]] - a whitespace
[0-9]+ - 1 or more digits
\\\\/ - a literal \/ char sequence
[0-9]+[[:space:]]+cup - 1 or more digits, 1 or more whitespaces, cup.
Note that you can make the match on cup more precise with a word boundary: add a [[:>:]] pattern after it to match cup as a whole word.
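Part of the confusion comes from the two layers of escaping: the SQL string literal is unescaped first, and only the result reaches the regex engine. The sketch below is a simplified model of that first step in Python, assuming MySQL's default sql_mode (NO_BACKSLASH_ESCAPES not set). The function mysql_unescape is a hypothetical helper written for illustration, not part of any MySQL client library.

```python
# Escape sequences that MySQL string literals translate to a special character.
SPECIAL = {
    "0": "\0", "'": "'", '"': '"', "b": "\b", "n": "\n",
    "r": "\r", "t": "\t", "Z": "\x1a", "\\": "\\",
}

def mysql_unescape(body):
    """Approximate what MySQL keeps of a string literal body after backslash processing."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in SPECIAL:
                out.append(SPECIAL[nxt])
            elif nxt in "%_":
                out.append("\\" + nxt)   # \% and \_ keep their backslash
            else:
                out.append(nxt)          # any other backslash is dropped
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

question_regex = mysql_unescape(r"(\d\s\\\D\d\scup)+")
answer_regex = mysql_unescape(r"[0-9]+[[:space:]][0-9]+\\\\/[0-9]+[[:space:]]+cup")

print(question_regex)
print(answer_regex)
```

Under this model the pattern from the question loses its \d and \s before the regex engine sees it, while the four backslashes in the answer arrive as \\, which matches one literal backslash in front of the slash.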

MySQL substring match using regular expression; substring contain 'man' not 'woman'

I have an issue while fetching data from the database using a regular expression. When I search for 'man' in tags, it also returns tags that contain 'woman', because 'man' is a substring of 'woman'.
SELECT '#hellowomanclothing' REGEXP '^(.)*[^wo]man(.)*$'; # returns 0 correct, it contains 'woman'
SELECT '#helloowmanclothing' REGEXP '^(.)*[^wo]man(.)*$'; # returns 0 incorrect, it can contain anything other than 'woman'
SELECT '#stylemanclothing' REGEXP '^(.)*[^wo]man(.)*$'; # returns 1 correct
How can I update the regular expression, when I search for 'man' it should return only the tag contains 'man' not 'woman'?
You can use two expressions. I think LIKE is sufficient:
SELECT ('#stylemanclothing' like '%man%' and '#stylemanclothing' not like '%woman%')
Although you can express this in a regular expression, this is probably the easier solution.
Use this:
SELECT '#helloowmanclothing' REGEXP '^(.)*([^o]|[^w]o)man(.)*$'
In your pattern [^wo] stands for "one character except for w and o", while you need to exclude two consecutive characters - w and then o.
Therefore the above pattern allows o before man only if the o is preceded by a character other than w.
A variant of n-dru's pattern, since you don't need to describe the whole string:
SELECT '#hellowomanclothing' REGEXP '(^#.|[^o]|[^w]o)man';
Note: if a tag contains both 'man' and 'woman', this pattern will return 1. If you don't want that, Gordon Linoff's solution is what you are looking for.
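The difference between the regex approach and the two-LIKE approach shows up on a tag that contains both words. Here is a short sketch in Python, again only a stand-in for MySQL's REGEXP and LIKE; the tag '#manandwoman' and the helper names are made up for illustration.

```python
import re

# Pattern from the answer above: 'man' preceded by a character other than 'o',
# or by an 'o' that itself does not follow a 'w'.
regex = re.compile(r"^(.)*([^o]|[^w]o)man(.)*$")

def regex_check(tag):
    return bool(regex.search(tag))

def like_check(tag):
    # Mirrors: tag LIKE '%man%' AND tag NOT LIKE '%woman%'
    return "man" in tag and "woman" not in tag

tags = [
    "#hellowomanclothing",
    "#helloowmanclothing",
    "#stylemanclothing",
    "#manandwoman",  # hypothetical tag containing both words
]
comparison = {t: (regex_check(t), like_check(t)) for t in tags}

for tag, (by_regex, by_like) in comparison.items():
    print(tag, by_regex, by_like)
```

The two checks agree on the three tags from the question and disagree only on the tag that contains both 'man' and 'woman'.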

Regex : string must contain a-z and A-Z and 0-9

I'm using a stored procedure to validate the input parameter. The input parameter must contain a-z and A-Z and 0-9.
for Example:
aS78fhE0 -> Correct
76AfbRZt -> Correct
76afbrzt -> Incorrect(doesn't contain Upper Case A-Z)
asAfbRZt -> Incorrect(doesn't contain Numeric 0-9)
4QA53RZJ -> Incorrect(doesn't contain Lower Case a-z)
What regular expression can validate the input parameter as in the examples above?
Many thanks, Praditha
UPDATE: Characters other than alphanumerics are not allowed. I'm using MySQL version 5.
Following on from John's post and the subsequent comments:
The MySQL you require would be
SELECT * FROM mytable WHERE mycolumn REGEXP BINARY '[a-z]'
AND mycolumn REGEXP BINARY '[A-Z]'
AND mycolumn REGEXP BINARY '[0-9]'
Add an additional
AND mycolumn REGEXP BINARY '^[a-zA-Z0-9]+$'
if you only want alphanumerics in the string.
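The same idea, three independent checks plus an alphanumeric-only check, can be sketched in Python to test it against the examples from the question. Python's re is case-sensitive by default, which corresponds to REGEXP BINARY here; the function name is_valid is hypothetical.

```python
import re

def is_valid(value):
    """True if value is purely alphanumeric and contains a lowercase letter,
    an uppercase letter and a digit."""
    required = [r"[a-z]", r"[A-Z]", r"[0-9]"]
    has_all = all(re.search(p, value) for p in required)
    only_alnum = re.fullmatch(r"[a-zA-Z0-9]+", value) is not None
    return has_all and only_alnum

examples = ["aS78fhE0", "76AfbRZt", "76afbrzt", "asAfbRZt", "4QA53RZJ"]
verdicts = {e: is_valid(e) for e in examples}

for value, ok in verdicts.items():
    print(value, ok)
```

Keeping the checks separate makes each rule easy to read and to change, at the cost of scanning the string several times.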
With look-ahead assertion you could do like this:
/^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).*$/
Update: It seems MySQL doesn't support lookaround assertions.
You could split it up into 3 separate regexes to test for each case:
[a-z], [A-Z], and [0-9]
AND the results of those matches together, and you can achieve the result you're looking for.
EDIT:
if you're only looking to match alphanumerics, you should do ^[a-zA-Z0-9]+$ as suggested by Ed Head in the comments
My solution leads to a long expression because it enumerates all 6 orders in which the required lowercase letter, uppercase letter and digit can appear in the string:
^(.*[a-z].*[A-Z].*[0-9].*|
.*[a-z].*[0-9].*[A-Z].*|
.*[A-Z].*[a-z].*[0-9].*|
.*[A-Z].*[0-9].*[a-z].*|
.*[0-9].*[a-z].*[A-Z].*|
.*[0-9].*[A-Z].*[a-z].*)$
Edit: Forgot the .* at the end and at the beginning.
Unfortunately, MySQL does not support lookaround assertions, therefore you'll have to spell it out for the regex engine (assuming that only those characters are legal):
^(
[A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|
[A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|
[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|
[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*|
[A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|
[A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*
)$
or, in MySQL:
SELECT * FROM mytable WHERE mycolumn REGEXP BINARY "^([A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|[A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*[0-9][A-Za-z0-9]*|[A-Za-z0-9]*[A-Z][A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*|[A-Za-z0-9]*[0-9][A-Za-z0-9]*[a-z][A-Za-z0-9]*[A-Z][A-Za-z0-9]*|[A-Za-z0-9]*[0-9][A-Za-z0-9]*[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*)$";
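Writing the six alternatives by hand is error-prone, so one option is to generate the pattern. The sketch below does that in Python with itertools.permutations and then tries the result on the examples from the question. It assumes the generated string would be pasted into the REGEXP BINARY query; the names ANY, REQUIRED and build_pattern are made up for illustration.

```python
import re
from itertools import permutations

ANY = "[A-Za-z0-9]*"                     # filler: any run of allowed characters
REQUIRED = ["[a-z]", "[A-Z]", "[0-9]"]   # classes that must each appear once

def build_pattern():
    """Build one alternative per ordering of the three required classes."""
    alternatives = [ANY + ANY.join(order) + ANY for order in permutations(REQUIRED)]
    return "^(" + "|".join(alternatives) + ")$"

pattern = build_pattern()
compiled = re.compile(pattern)

examples = ["aS78fhE0", "76AfbRZt", "76afbrzt", "asAfbRZt", "4QA53RZJ"]
verdicts = {e: bool(compiled.search(e)) for e in examples}

print(pattern)
```

With three required classes there are 3! = 6 alternatives; the count grows factorially, so for more rules the separate AND-ed checks are likely easier to maintain.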
[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*|[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*[a-z]+[a-zA-Z0-9]*[A-Z]+[a-zA-Z0-9]*