Regex for email in MySQL in between any number of characters - mysql

I am trying to create a regex that finds an email address inside a longer string.
The regex below works fine when the string is only the email address:
^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
But I need a regex for which the following returns true:
SELECT 'hfdjj abc@enmail.com jkdfk' REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$';
I want to allow any number of characters before and after the email.
Thanks,
Aman

If you're happy with how the email itself is matched, simply removing the ^ and $ characters from the start and end of the pattern should suffice.
^ matches the beginning of the string.
$ matches the end of the string.
Together they force the pattern to cover the whole string, which is why MySQL refuses to match when there is anything on either side of the email address.
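A quick way to see the effect of the anchors is to run both variants against the sample string. The sketch below uses Python's re module as a stand-in for MySQL's REGEXP (an assumption: the two engines agree on this simple pattern, and re.IGNORECASE mimics MySQL's default case-insensitive matching). It also assumes the usual @ separator in the address.

```python
import re

# Email pattern from the question, without the ^ and $ anchors.
EMAIL = r"[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}"
text = "hfdjj abc@enmail.com jkdfk"

# Anchored: the whole string must be an email address, so this fails.
anchored = re.search("^" + EMAIL + "$", text, re.IGNORECASE)

# Unanchored: the email may appear anywhere inside the string.
unanchored = re.search(EMAIL, text, re.IGNORECASE)

print(anchored)             # None
print(unanchored.group(0))  # abc@enmail.com
```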

Related

Issue with Regexp with mySQL query

I'm trying to build a search query which searches for a word in a string and finds matches based on the following criteria:
The word is surrounded by any combination of a space, period or comma
The word is at the start of the string and is followed by a space, period or comma
The word is at the end of the string and is preceded by a space, period or comma
It's a full match, i.e. the entire string is just the word
For example, if the word is 'php' the following strings would be matches:
php
mysql, php, javascript
php.mysql
javascript php
But for instance it wouldn't match:
php5
I've tried the following query:
SELECT * FROM candidate WHERE skillset REGEXP '^|[., ]php[., ]|$'
However that doesn't work, it returns every record as a match which is wrong.
Without the ^| and |$ in there, i.e.
SELECT * FROM candidate WHERE skillset REGEXP '[., ]php[., ]'
It successfully finds matches where 'php' is somewhere in the string except the start and end of the string. So the problem must be with the ^| and |$ part of the regexp.
How can I add those conditions in to make it work as required?
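Try '\bphp\b'. \b is a word boundary, which may be exactly what you need because it matches php only as a whole word.
In MySQL versions before 8.0.4, word boundaries are written as [[:<:]] and [[:>:]] instead of \b, so use the pattern '[[:<:]]php[[:>:]]'. From 8.0.4 on, MySQL uses the ICU regex library, which supports \b (written '\\b' inside a string literal) and no longer accepts the [[:<:]] and [[:>:]] markers. More info on word boundaries here.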
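The original pattern matched every row because | has the lowest precedence in a regex: '^|[., ]php[., ]|$' means "start of string, OR the delimited word, OR end of string", and the bare ^ alternative succeeds on any string. Grouping the alternatives fixes that. The sketch below checks this with Python's re module standing in for MySQL (an assumption; the grouped pattern uses only syntax both engines share). The names samples, broken, grouped, boundary and matching are made up for illustration.

```python
import re

samples = ["php", "mysql, php, javascript", "php.mysql", "javascript php", "php5"]

broken = r"^|[., ]php[., ]|$"        # three separate alternatives
grouped = r"(^|[., ])php([., ]|$)"   # delimiter-or-edge on each side
boundary = r"\bphp\b"                # word-boundary version

def matching(pattern):
    """Return the samples in which the pattern is found."""
    return [s for s in samples if re.search(pattern, s)]

print(matching(broken))    # every sample, including 'php5'
print(matching(grouped))   # everything except 'php5'
print(matching(boundary))  # everything except 'php5'
```

In MySQL the grouped form would be written as skillset REGEXP '(^|[., ])php([., ]|$)'.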
Well, you can play around a bit with regex101.com
Something I found that works for you but doesn't exactly follow your rules is:
/(?=[" ".,]?php[" ".,]?)(?=php[\W])/
This uses the lookahead operator, ?=, to do AND
The first portion of the regex is
[" ".,]?php[" ".,]?
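This matches php with at most one space, period or comma (or double quote, since the quotes inside the brackets are literal characters) before it and at most one after it.
The second portion of the regex is
php[\W]
This matches php followed by a non-word character. In other words, it will NOT match php followed by a letter, digit, or underscore. Because [\W] has to consume an actual character, php at the very end of the input matches only if something such as a newline follows it.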
It's not the perfect answer for your set of rules, but it does work with your sample data set. Play around on regex101.com and try to make a perfect one.
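To see how the two lookaheads behave on the sample data, the pattern can be run against each sample as a separate string. This is a sketch using Python's re module (an assumption: Python and the regex101 flavour used in the answer agree on this pattern); the quotes inside the brackets are kept as in the answer.

```python
import re

pattern = re.compile(r'(?=[" ".,]?php[" ".,]?)(?=php[\W])')

samples = ["php", "mysql, php, javascript", "php.mysql", "javascript php", "php5"]

# Map each sample to whether the pattern is found somewhere in it.
results = {s: bool(pattern.search(s)) for s in samples}

for sample, found in results.items():
    print(f"{sample!r}: {found}")
```

Tested this way, php5 is rejected, but so are the samples where php is the last thing in the string, because php[\W] needs a following character. With a trailing newline after each sample they do match.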

How to select records from mysql database by regex

I have a regexp to validate user email address.
/^(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,})$/i"
With the help of active record, I want to fetch from a database all the users whose email address doesn't match this regexp. I tried the following scope to achieve the desired result, but all I get is ActiveRecord::Relation.
scope :not_match_email_regex, :conditions => ["NOT email REGEXP ?'", /^(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,})$/"]
This gives me the following query:
SELECT `users`.* FROM `users` WHERE (email REGEXP '--- !ruby/regexp /^(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\\-+)|([A-Za-z0-9]+\\.+)|([A-Za-z0-9]+\\++))*[A-Za-z0-9]+@((\\w+\\-+)|(\\w+\\.))*\\w{1,63}\\.[a-zA-Z]{2,})$/\n...\n')
I also tried to define this scope in the following way with the same result:
scope :not_match_email_regex, :conditions => ["email REGEXP '(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,})'"]
The query it generates is:
SELECT `users`.* FROM `users` WHERE (email REGEXP '(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+-+)|([A-Za-z0-9]+.+)|([A-Za-z0-9]+++))*[A-Za-z0-9]+@((w+-+)|(w+.))*w{1,63}.[a-zA-Z]{2,})')
How can I fetch all records that match or don't match the given regex?
EDIT 12-11-30: small corrections, partly following the comment by @innocent_rifle
The suggested Regexp here is trying to make the same matches as in the original question
1. When I first wrote my solution I forgot that you must escape \ in strings, because I was testing directly in MySQL. Regexps written inside strings are confusing to discuss, so I will use this form instead, e.g. /dot\./.source, which (in Ruby) gives "dot\\.".
2. REGEXP in MySQL (manual for 5.6, tested in 5.0.67) uses "C escape syntax in strings", so WHERE email REGEXP '\.' is still the same as WHERE email REGEXP '.'. To find the character "." you must use WHERE email REGEXP '\\.', and to achieve that you must use the code .where([ 'email REGEXP ?', "\\\\."]). It's more readable to use .where([ 'email REGEXP ?', /\\./.source ]) (MySQL needs 2 escapes). However, I prefer .where([ 'email REGEXP ?', /[.]/.source ]), because then I don't have to worry about how many escapes are needed.
3. You don't need to escape "-" in a Regexp, not even inside [], as long as it is the first or the last character there.
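The difference between an unescaped dot and a bracketed dot in item 2 can be checked outside MySQL. The sketch below uses Python's re module as a stand-in (an assumption; a raw string removes the string-escaping layers that the Ruby and MySQL examples have to deal with), and the sample addresses are invented.

```python
import re

good = "user@example.com"
bad = "user@exampleXcom"   # no dot at all

# An unescaped dot matches any character, so both strings pass.
loose = [bool(re.search(r"example.com", s)) for s in (good, bad)]

# [.] and \. both match only a literal dot.
bracket = [bool(re.search(r"example[.]com", s)) for s in (good, bad)]
escaped = [bool(re.search(r"example\.com", s)) for s in (good, bad)]

print(loose)    # [True, True]
print(bracket)  # [True, False]
print(escaped)  # [True, False]
```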
Some errors I found: the first regexp-or "|" in your expression (the one right after the opening parenthesis) should not be there, and the pattern should be passed as a String in the query, or via Regexp#source, which I prefer. I think there was also an extra quote at the end.
Apart from that, are you really sure the regexp works if you try it on a string in the console?
Also be aware that you won't catch emails that are NULL in the db; for that you must add (<your existing expr in parentheses>) OR email IS NULL
Regexp syntax in my MySQL version.
I also tested what @Olaf Dietsche wrote in his suggestion. It seems that it's not needed, but it's strongly recommended to follow the standard syntax anyway (NOT (expr REGEXP pat) or expr NOT REGEXP pat).
I have done some checking, and these things must be changed: use [A-Za-z0-9_] instead of \w, and \+ is not valid, so you must use \\+ ("\\\\+" if string), or more easily [+] (in both Regexp and string).
It leads to the following REGEXP in MySQL
'^(([A-Za-z0-9]+_+)|([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]+)|([A-Za-z0-9]+[+]+))*[A-Za-z0-9]+@(([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]))*[A-Za-z0-9]{1,63}[.][a-zA-Z]{2,}$'
Small change suggestions
I don't understand your regexp exactly, so this only rewrites your regexp without changing what it will find.
First: change the whole string as I described above
Then change
(([A-Za-z0-9]+_+)|([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]+)|([A-Za-z0-9]+[+]+))*
to
([A-Za-z0-9]+[-+_.]+)*
and
@(([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]))*
to
@([A-Za-z0-9]+[-.]+)*
Final code (change to the ..., :conditions => ... syntax if you prefer that). I tried to make this find the same strings as in the comment by @innocent_rifle, only adding "_" in the expressions to the right of @
.where([ 'NOT (email REGEXP ?)', /^([A-Za-z0-9]+[-+_.]+)*[A-Za-z0-9]+@([A-Za-z0-9]+[-._]+)*[A-Za-z0-9_]{1,63}[.][A-Za-z]{2,}$/.source ])
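As a sanity check, the final pattern can be tried on a few addresses before it goes into the query. This sketch uses Python's re module in place of MySQL's REGEXP (an assumption: the pattern only uses syntax common to both) and @ as the separator; the sample addresses are invented.

```python
import re

FINAL = re.compile(
    r"^([A-Za-z0-9]+[-+_.]+)*[A-Za-z0-9]+"
    r"@([A-Za-z0-9]+[-._]+)*[A-Za-z0-9_]{1,63}[.][A-Za-z]{2,}$"
)

samples = [
    "john.doe+tag@example.com",
    "a@b.co",
    ".leading@example.com",    # starts with a separator
    "no-at-sign.example.com",  # missing @
    "a@b.c",                   # top-level part too short
]

# Addresses the pattern rejects, i.e. what NOT (email REGEXP ...) selects.
rejected = [s for s in samples if not FINAL.search(s)]
print(rejected)
```

Only the last three samples end up in rejected; rows where email is NULL are a separate case that the regex never sees.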
For validating email addresses, you might want to consider How to Find or Validate an Email Address. At least, this regexp looks a bit simpler.
According to MySQL - Regular Expressions the proper syntax is
expr REGEXP pat
for a match, and
expr NOT REGEXP pat or NOT (expr REGEXP pat)
for the opposite. Don't forget the parentheses in the second version.

How to remove fake names using regular expression in mysql query?

I want to remove accounts that were probably registered under fake names.
The developer forgot to add validation to the registration form.
Now I want to remove the fake names.
To decide whether a name is fake, I check whether it contains any digits.
This is the query I have written, but it's not working:
SELECT registration.regi_id, student.first_name,
student.cont_no, student.email_id,
registration.college,
registration.event_name,
registration.accomodation
FROM student, registration
WHERE student.stud_id = registration.stud_id
AND student.first_name NOT RLIKE '%[0-9]%'
How can I fix this problem?
Sorry for my language issues.
P.S.
There are many names in the "first_name" field like "asdfasdf12323"; I don't want names like that to be shown in the list.
Your column may also contain alphanumeric values, so you need to filter out both purely numeric and alphanumeric names.
For names made up only of letters and digits, try REGEXP '^[A-Za-z0-9]+$'
For names containing a digit, try REGEXP '[0-9]'
As far as the regex goes, '[0-9]' only looks for a single digit, which is all you need to detect a name that contains one. RLIKE is a synonym for REGEXP, so your clause is already a regex match; the problem is that % is a LIKE wildcard, and in a regex it is just a literal % character that would have to appear in the name. Dropping the % signs gives a last clause like so: AND student.first_name NOT REGEXP '[0-9]'
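The difference between these patterns is easy to check on a few names. The sketch below uses Python's re module as a stand-in for MySQL's REGEXP (an assumption; all three patterns use syntax both engines share). The names other than "asdfasdf12323" and the helper kept are invented for illustration.

```python
import re

names = ["asdfasdf12323", "Aman", "john2"]

def kept(pattern):
    """Names that survive a NOT REGEXP <pattern> filter."""
    return [n for n in names if not re.search(pattern, n)]

# % is a literal character in a regex, so nothing is filtered out.
print(kept(r"%[0-9]%"))  # ['asdfasdf12323', 'Aman', 'john2']

# [0-9]* also matches the empty string, so every name is filtered out.
print(kept(r"[0-9]*"))   # []

# [0-9] matches any name containing a digit.
print(kept(r"[0-9]"))    # ['Aman']
```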

MySQL REGEXP not matching string

I have a table of messages. I am trying to find messages in the table that have an ID code which complies with a specific format. The regexp that I have below was written for matching these values in PHP, but I want to move it to a MySQL query.
It is looking for a specific format of an identifier code that looks like this:
[692370613-3CUWU]
The code has a consistent format:
starts and ends with hard brackets [ ]
two components inside,
first is an account number, minimum 9 digits, but could be longer
second component is an alphanumeric code, 5 characters, which can include 1-9 and capital letters excluding "O"
the complete code can occur anywhere in the message
I have a query that reads:
SELECT * FROM messages
WHERE
msgBody REGEXP '\\[(\d){9,}-([A-NP-Z1-9]){5}\\]'
OR
msgSubject REGEXP '\\[(\d){9,}-([A-NP-Z1-9]){5}\\]'
I created a test row in the table which has only the sample value above in the msgBody field for testing - but it does not return any results.
I am guessing that I am missing something in the conversion of PHP style regex vs. MySQL.
Help is greatly appreciated.
Thank you!
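Instead of \d, try [[:digit:]] or [0-9]. MySQL's regex library before 8.0.4 doesn't support \d, and inside a string literal a single backslash before d is dropped anyway, so the pattern ends up looking for a literal d.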
SELECT * FROM messages
WHERE
msgBody REGEXP '\\[([0-9]){9,}-([A-NP-Z1-9]){5}\\]'
OR
msgSubject REGEXP '\\[([0-9]){9,}-([A-NP-Z1-9]){5}\\]'
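The corrected pattern can be checked against the sample code and a few near misses. This is a sketch using Python's re module as a stand-in for MySQL's REGEXP (an assumption; the pattern is equivalent to the one in the query, minus the capture groups), and all messages except the sample code are invented.

```python
import re

# Same pattern as the corrected query, with [0-9] in place of \d.
CODE = re.compile(r"\[[0-9]{9,}-[A-NP-Z1-9]{5}\]")

messages = [
    "[692370613-3CUWU]",
    "Re: your ticket [692370613-3CUWU] was updated",
    "[12345678-3CUWU]",    # only 8 digits
    "[692370613-3COWU]",   # contains the excluded letter O
    "[692370613-3CUW0]",   # contains the excluded digit 0
]

found = [bool(CODE.search(m)) for m in messages]
print(found)  # [True, True, False, False, False]
```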

Mysql REGEXP for username

What MySQL REGEXP query would match usernames like these?
#text
#user_name
#4ll_r1ght
#last2
#_last1
#and1more_
SELECT * FROM users WHERE username REGEXP '^\#[0-9a-zA-Z_]+$'
This selects users whose usernames start with # followed by at least one alphanumeric character or underscore, and nothing else.
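The pattern can be checked against the usernames listed in the question. This is a sketch using Python's re module in place of MySQL's REGEXP (an assumption; the backslash before # is dropped because # needs no escaping here), and the entries in unwanted are invented counterexamples.

```python
import re

USERNAME = re.compile(r"^#[0-9a-zA-Z_]+$")

wanted = ["#text", "#user_name", "#4ll_r1ght", "#last2", "#_last1", "#and1more_"]
unwanted = ["text", "#", "#two words", "#dash-ed"]

all_wanted_match = all(USERNAME.search(u) for u in wanted)
any_unwanted_match = any(USERNAME.search(u) for u in unwanted)

print(all_wanted_match)    # True
print(any_unwanted_match)  # False
```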
I assume you are looking for a regular expression for a username made of specific characters.
Try the one below:
^[a-zA-Z0-9._-]+@
This part of the expression validates the ‘username’ section of the email address. The hat sign (^) at the beginning of the expression represents the start of the string.
If we didn’t include the hat sign (^), someone could key in anything they wanted before the email address and it would still validate.
Here, we are allowing the letters a-z, A-Z, the numbers 0-9, and the symbols underscore (_), period (.), and dash (-). You can add/remove them according to your needs.