MySQL RegEx to match two consecutive digits that are the same

I am using the following RegEx in MySQL to match two consecutive digits that are the same anywhere in a string:
^.*([[:digit:]])\1+.*$
It matches correctly the following strings:
8831
5011
9931
but it also matches
9318
and it doesn't match
3449
Is the problem around .* or is it something else?

There's no way to check for the same thing twice directly; instead you need to check for all possibilities. Luckily, since you are only looking at 10 digits, it's relatively easy:
(11|22|33|44|55|66|77|88|99|00)

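MySQL's POSIX-style regular expressions (the engine used before MySQL 8.0) do not support back references. On top of that, MySQL's string-literal parsing turns \1 into a plain 1, so your pattern effectively means "a digit followed by one or more 1s", which is why 9318 matches and 3449 does not. You can use the more verbose: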
where col regexp '00|11|22|33|44|55|66|77|88|99'
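As a rough way to sanity-check the alternation outside the database, the sketch below uses Python's re module as a stand-in for MySQL's REGEXP (an assumption; the two engines differ in other respects). It also tries the pattern you get if the backslash in \1 is dropped by MySQL's string-literal parsing, which is assumed here to be the cause of the odd matches in the question. The names DOUBLED, MISREAD and has_doubled_digit are hypothetical.

```python
import re

# Alternation of every doubled digit: 00|11|...|99 (same idea as the answers above).
DOUBLED = re.compile("|".join(str(d) * 2 for d in range(10)))

# Assumption: once the backslash is dropped, the question's pattern reads as
# "any digit followed by one or more literal 1s".
MISREAD = re.compile(r"^.*([0-9])1+.*$")


def has_doubled_digit(s: str) -> bool:
    """Return True if s contains the same digit twice in a row."""
    return DOUBLED.search(s) is not None


# Strings taken from the question.
for s in ["8831", "5011", "9931", "9318", "3449"]:
    print(s, has_doubled_digit(s), MISREAD.search(s) is not None)
```

In this sketch the misread pattern accepts 9318 and rejects 3449, mirroring the behaviour reported in the question, while the alternation does the opposite.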

Related

Combining two regexes

I'm trying to combine two regexes. One ensures that the input contains exactly 14 digits: ^\\d{14}$. I need another regex to check that the input does not consist of a single repeated digit.
Please suggest how to proceed. I want my regex to check that the input is 14 digits and that those digits [0-9] are not all the same.
Is there a way to add the "not all digits are the same" test to the regex that checks for exactly 14 digits? I need one regex that combines both. Thank you!
You can use negative lookahead with a back reference to the first digit:
^(?!(\d)\1{13})\d{14}$
NB: This is pure regex syntax. I did not escape backslashes for use in a programming language.
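A minimal sketch of the lookahead approach, written in Python only because its re module supports lookaheads and back references (the question does not name a language, so this choice is an assumption). The leading ^ is included so that inputs longer than 14 digits are rejected; PATTERN and is_valid are hypothetical names.

```python
import re

# ^               start of input
# (?!(\d)\1{13})  fail if the first digit is followed by 13 copies of itself
# \d{14}$         exactly 14 digits up to the end of input
PATTERN = re.compile(r"^(?!(\d)\1{13})\d{14}$")


def is_valid(s: str) -> bool:
    """True for exactly 14 digits that are not all the same digit."""
    return PATTERN.match(s) is not None


print(is_valid("12345678901234"))  # 14 mixed digits
print(is_valid("11111111111111"))  # 14 identical digits
```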
There is no regex operation for "match here for all-of-these except a back-reference". You have a two-step test here, not a single one.

MySQL Regex for matching exactly 3 same chars but not 4 same chars within a larger string

I am trying to write a regex for mysql in PHP to find (at least one occurrence of) exactly 3 of the same characters in a row, but not 4 (or more) of the same.
Eg for "000" I want to find:
0//////0/00/ LS///////000
000////0/00/ LS//////////
0//////0/00/ LS////000///
0//////000// LS//////000/
0//////000// LS//00000000
but not:
0//////0000/ LS//////////
0//////0000/ LS//////////
0/////00000/ LS//////////
I have tried the code below which I thought would match 3 zeros preceded and followed by zero or more chars which are not 0, but this resulted in some rows with single 0's and some 000000's
REGEXP '[^0]*[0{3}][^0]*'
Many thanks.
If you plan to use a regex in MySQL, you cannot use lookarounds. Thus, you can use alternation with negated character class and anchors:
(^|[^0])0{3}([^0]|$)
Explanation:
(^|[^0]) - a group matching either the start of string (^) or a character other than 0
0{3} - exactly 3 zeros
([^0]|$) - a group matching either a character other than 0 or the end of string ($).
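The pattern can be checked against the rows from the question with a short sketch. Python's re module is used here purely as a stand-in for MySQL's REGEXP (an assumption), and the names EXACTLY_THREE_ZEROS and has_exact_triple are hypothetical.

```python
import re

# A run of exactly three zeros, bounded on each side by a non-zero
# character or by the start/end of the string.
EXACTLY_THREE_ZEROS = re.compile(r"(^|[^0])0{3}([^0]|$)")

# Rows copied from the question.
wanted = [
    "0//////0/00/ LS///////000",
    "000////0/00/ LS//////////",
    "0//////0/00/ LS////000///",
    "0//////000// LS//////000/",
    "0//////000// LS//00000000",
]
unwanted = [
    "0//////0000/ LS//////////",
    "0/////00000/ LS//////////",
]


def has_exact_triple(s: str) -> bool:
    """True if s contains at least one run of exactly three zeros."""
    return EXACTLY_THREE_ZEROS.search(s) is not None


for row in wanted + unwanted:
    print(repr(row), has_exact_triple(row))
```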

MySQL Regular Expression [a-z]\.[a-z] but not a.m. or p.m

Evening,
I want to search some columns in a MySQL table for any instances of [a-z]\.[a-z], for example:
John.than, Ame.ica, Llan.antffraid etc.
but I don't want this to include the strings 'a.m.' OR 'p.m.'. I have tried using (?!a.m.|p.m.) but this does not work. It returns the error: "Got error 'repetition-operator operand invalid' from regexp".
I have the following regular expression:
REGEXP BINARY '[a-z]\\\.[a-z]'
N.B. If a column includes a.m. OR p.m. but also contains a string like bro.ken, it needs to be returned.
Build your regex step by step:
You want everything except a "standalone" a.m or p.m:
[b-oq-z]{1}\.[a-ln-z]{1} matches everything of the format x.y that is not a.# or p.# or #.m
However, this also misses a.a, a.b, a.c and so on, so add those cases:
a\.[^m] (and the same for the p cases: p\.[^m])
a.m is valid when there are characters in front of the a: kra.m, tra.m. The same applies to p.m: erp.m
[a-z]{1}[ap]\.m covers this condition.
Now we are missing strings where the second part is longer: a.mod, p.markt:
[ap]\.m[a-z]+ covers that one.
Finally, only the strings ending in .m with a different prefix are missing:
[b-oq-z]{1}\.m
This should now cover all possible use cases. Simply combine the patterns with OR (|) and you are done:
([b-oq-z]{1}\.[a-ln-z]{1}|a\.[^m]|p\.[^m]|[a-z]{1}[ap]\.m|[ap]\.m[a-z]+|[b-oq-z]{1}\.m)
Note: This will NOT give you the exact match groups. But since you use it in an SQL query, all that matters is whether there is a match. (ark.m will be matched by k.m, but that fulfills your specification.)
Keep in mind: when creating a regular expression, there is no single right solution, just working ones and non-working ones. a\.[^m]|p\.[^m] is equivalent to [ap]\.[^m], which reduces the pattern by one OR.
You have found the perfect regex pattern when two conditions are met:
It works!
You can understand it, when looking at it in 4 months!
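To see the combined alternation at work, here is a small sketch that runs it over strings taken from the question. Python's re module stands in for MySQL's REGEXP BINARY (an assumption; Python's matching is case-sensitive by default, which is what BINARY asks for), and the names COMBINED and should_return are hypothetical.

```python
import re

# The alternation built step by step above, copied unchanged.
COMBINED = re.compile(
    r"([b-oq-z]{1}\.[a-ln-z]{1}|a\.[^m]|p\.[^m]"
    r"|[a-z]{1}[ap]\.m|[ap]\.m[a-z]+|[b-oq-z]{1}\.m)"
)


def should_return(text: str) -> bool:
    """True if text contains a letter.letter pair other than a standalone a.m / p.m."""
    return COMBINED.search(text) is not None


for text in [
    "John.than",
    "Ame.ica wakes up at 8 a.m.",
    "America wakes up at 8 a.m.",
    "a.m. but bro.ken",
]:
    print(repr(text), should_return(text))
```

One caveat with this sketch: [^m] accepts any character, not only letters, so a\.[^m] would also hit an "a" followed by a period and a space. Narrowing it to [a-ln-z] is a possible tightening if that matters for your data.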
If you can use assertions (lookaheads), this might work, but I'm not sure about backtracking.
# (?=^.*(?:(?!a\.m|p\.m)[a-z]\.[a-z]|(?:a\.m|p\.m).*(?!a\.m|p\.m)[a-z]\.[a-z]))
(?=
^
.*
(?:
(?! a\.m | p\.m )
[a-z] \. [a-z]
|
(?: a\.m | p\.m )
.*
(?! a\.m | p\.m )
[a-z] \. [a-z]
)
)
I would do it like this:
SELECT 'Ame.ica wakes up at 8 a.m.' REGEXP
'[b-oq-z]\\.[a-ln-z]|[ap]\\.[^m]|[^ap]\\.m|[[:alpha:]][ap]\\.m|[ap]\\.m[[:alpha:]]' findme,
'America wakes up at 8 a.m.' REGEXP
'[b-oq-z]\\.[a-ln-z]|[ap]\\.[^m]|[^ap]\\.m|[[:alpha:]][ap]\\.m|[ap]\\.m[[:alpha:]]' dontfindme
It's a shorter and therefore slightly faster version of dognose's answer. It's also tailored to MySQL, which has the slightly odd [[:alpha:]] class.

Regex for start with three alpha and four digits

I have written an SQL statement to retrieve data from a MySQL database, and I want to select rows where myId starts with three letters followed by four digits, for example: ABC1234K1D2
myId REGEXP '^[A-Z]{3}/d{4}'
but it gives me an empty result (the data is available in the DB). Could someone point me to the correct way?
In most regex variants the answer would be: /d matches a / followed by a d; I think you want \d which matches a digit.
However MySQL has a somewhat limited regex implementation (see documentation).
There is no shortcut to character sets like \d for any digit.
You need to either use a named character set ([[:digit:]]), or just use [0-9].
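A quick sketch of why the original pattern finds nothing: /d{4} is read as a literal slash followed by four d characters. Python's re module is used here only as a stand-in for MySQL's REGEXP (an assumption; Python has no [[:digit:]] class, so only the [0-9] form is shown), and SLASH_D and BRACKET are hypothetical names.

```python
import re

# The pattern from the question: '/d{4}' means "/" followed by "dddd".
SLASH_D = re.compile(r"^[A-Z]{3}/d{4}")

# The portable form suggested above: a bracket expression for digits.
BRACKET = re.compile(r"^[A-Z]{3}[0-9]{4}")

my_id = "ABC1234K1D2"  # example value from the question
print(SLASH_D.search(my_id) is not None)       # the original pattern
print(BRACKET.search(my_id) is not None)       # the bracket form
print(SLASH_D.search("ABC/dddd") is not None)  # what the original actually matches
```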
Try this:
^[A-Z]{3}[0-9]{4}
If you want the match to be case-insensitive, try this:
^[a-zA-Z]{3}[0-9]{4}
First, in ordinary regular expressions, to match a digit you have to use \d instead of /d (which matches a / followed by a d).
Second, I had never noticed it before, but \d (and the others like \w, etc.) don't seem to be available in MySQL. The documentation lists the accepted special characters, and those generic classes don't appear. You could use [[:digit:]] instead, even though [0-9] is quite a bit shorter ;)
You are doing fine; just replace /d with \d (assuming your regex engine supports \d). Final regex: ^[A-Z]{3}\d{4}
You could use the following pattern :
^[a-zA-Z]{3}\d{4}

__* in Mysql regular expression

I am reading some open source code, and I found an SQL query with this kind of filter:
select sometext from table1,table2 where table1.sometext LIKE
CONCAT('% ',table2.test_keyword,' %') AND table2.test_keyword NOT
REGEXP '__*';
What is that __* in this sql?
__* matches one _ followed by zero or more _s.
__*
^^^
||\__ (zero or more)    ^
|\___ underscore        |
\____ underscore, then  |
_+ would have done the same job.
_+
^^
|\__ (one or more)  ^
\___ underscore     |
It's simply one or more underscore characters.
The pattern is best read as:
'_', exactly one underscore,
'_*', followed by zero or more underscores.
Keep in mind that, without a start marker, that will match the pattern at any location in the string, so it basically means any string with an underscore in it (or, more accurately, since you're using NOT, a string without an underscore).
It's also needlessly complex, since you could achieve the same effect with AND table2.test_keyword NOT REGEXP '_'.
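The equivalence claimed in these answers can be checked with a small sketch: for an unanchored search, __*, _+ and _ all succeed on exactly the same strings. Python's re module is used as a stand-in for MySQL's REGEXP (an assumption), and the sample keywords and the contains helper are made up for illustration.

```python
import re

PATTERNS = ["__*", "_+", "_"]

# Hypothetical keyword values, not taken from the original code base.
SAMPLES = ["test_keyword", "a__b", "_", "plain", ""]


def contains(pattern: str, s: str) -> bool:
    """Unanchored match, like REGEXP: True if pattern occurs anywhere in s."""
    return re.search(pattern, s) is not None


for s in SAMPLES:
    # Each row prints the same answer three times, one per pattern.
    print(repr(s), [contains(p, s) for p in PATTERNS])
```

With NOT REGEXP in front, the filter therefore keeps only keywords that contain no underscore at all.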
See here for the latest MySQL documentation on regexes (5.6 at the time of this answer).