Extracting phone numbers from a string using sql - mysql

Actually I want to catch the phone numbers in a string. On some websites its mentioned that use this Regexpression (ref)
^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$
But it's not working on the normal sql query. Here's the syntax:
REGEXP_LIKE (mystring, '^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$')
I am doing something wrong here ?

\d and \s don't exist in MySQL. Use [[:digit:]] or [0-9] and [[:space:]] or [[:blank:]] or simply .
Some other things are fixed here:
str REGEXP '^([+][0-9]{1,2})?([(][0-9]{3}[)]|[0-9]{3})[-. ][0-9]{4}$'

Related

SQL - match last two characters in a string

I have a small mysql database with a column which has format of a field as following:
x_1_1,
x_1_2,
x_1_2,
x_2_1,
x_2_12,
x_3_1,
x_3_2,
x_3_11,
I want to extra the data where it matches last '_1'. So if I run a query on above sample dataset, it would return
x_1_1,
x_2_1,
x_3_1,
This should not return x_2_12 or x_3_11.
I tried like '%_1' but it returns x_2_12 and x_3_11 as well.
Thank you!
A simple method is the right() function:
select t.*
from t
where right(field, 2) = '_1';
You can use like but you need to escape the _:
where field like '%$_1' escape '$'
Or use regular expressions:
where field regexp '_1$'
The underscore character has special significance in a LIKE clause. It acts as a wildcard and represent one single character. So you would have to escape it with a backslash:
LIKE '%\_1'
RIGHT does the job too, but it requires that you provide the proper length for the string being sought and is thus less flexible.
Duh, I found the answer.
Use RIGHT (col_name, 2) = '_1'
Thank you!

Types of Wildcards in MySql

My query:
Select * From tableName Where columnName Like "[PST]%"
is not giving the expected result.
Why does this wildcard not work in MySql?
If you want to filter on strings that contain any 'P', 'S', or 'T', then you can use a regex:
where col rlike '[PST]'
If you want strings that contain substring 'PST', then no need for square brackets - and like is enough:
where col like '%PST%'
If you want the matching character(s) at the start of the string, then the regex solution looks like:
where col rlike '^PST'
And the like option would be:
where col like 'PST%'
MySQL's LIKE syntax is documented here: https://dev.mysql.com/doc/refman/8.0/en/pattern-matching.html
Standard SQL from decades ago defined only two wildcards: % and _. These are the only wildcards an SQL product needs to support if they want to say they are SQL compliant and support the LIKE predicate.
% matches zero or more of any characters. It's analogous to .* in regular expressions.
_ matches exactly one of any character. It's analogous to . in regular expressions.
Also if you want to match a literal '%' or '_', you need to escape it, i.e. put a backslash before it:
WHERE title LIKE 'The 7\% Solution'
Microsoft SQL Server's LIKE syntax is documented here: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-ver15
They support % and _ wildcards, and the \ escape character, but they extend standard SQL with two other forms:
[a-z] matches one character, but only characters in the range inside the brackets. This is similar in regular expressions. The - is a range operator, unless it appears at the start or end of the string inside the brackets.
[^a-z] matches one character, which must not be one of the characters in the range inside the brackets. Also the same in regular expressions.
These are not standard forms of wildcards for the LIKE predicate, and other brands of SQL database don't support them.
Later versions of the SQL standard introduced a new predicate SIMILAR TO which supports much richer patterns and wildcards, since the right-side operand is a string which contains a regular expression. But since this predicate was introduced in a later edition of the SQL standard, some implementations had already developed their own solution that was almost the same.
MySQL called the operator REGEXP and RLIKE is a synonym (https://dev.mysql.com/doc/refman/8.0/en/regexp.html).
It was requested in https://bugs.mysql.com/bug.php?id=746 to support SIMILAR TO syntax to help MySQL comply with the SQL standard, but the request was turned down, because it had subtly different behavior to the existing REGEXP/RLIKE operator.
Microsoft SQL Server has partial support of regular expression wildcards in the LIKE operator, and also a dbo.RegexMatch() function.
SQLite has a GLOB operator, and so on.
Thanks everyone!
For specific this question, we need to use regexp
Select * From tableName Where ColumnName Regexp "^[PST]";
For more detail over Regular Expression i.e Regexp :
https://www.youtube.com/watch?v=KoltE-JUY0c

SQL Regex last character search not working

I'm using regex to find specific search but the last separator getting ignore.
Must search for |49213[A-Z]| but searches for |49213[A-Z]
SELECT * FROM table WHERE (data REGEXP '/\|49213[A-Z]+\|/')
Why are you using | in the pattern? Why the +?
SELECT * FROM table WHERE (data REGEXP '\|49213[A-Z]\|')
If you want multiple:
SELECT * FROM table WHERE (data REGEXP '\|49213[A-Z]+\|')
or:
SELECT * FROM table WHERE (data REGEXP '[|]49213[A-Z][|]')
Aha. That is rather subtle.
\ escapes certain characters that have special meaning.
But it does not seem to do so for | ("or") or . ("any byte"), etc.
So, \| is the same as |.
But the regexp parser does not like having either side of "or" being empty. (I suspect this is a "bug"). Hence the error message.
https://dev.mysql.com/doc/refman/5.7/en/regexp.html says
To use a literal instance of a special character in a regular expression, precede it by two backslash () characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:
The best fix seems to be [|] or \\| instead of \| when you want the pipe character.
Someday, the REGEXP parser in MySQL will be upgraded to PCRE as in MariaDB. Then a lot more features will come, and this 'bug' may go away.

MySQL regex matching at least 2 dots

Consider the following regex
#(.*\..*){2,}
Expected behaviour:
a#b doesnt match
a#b.c doesnt match
a#b.c.d matches
a#b.c.d.e matches
and so on
Testing in regexpal it works as expected.
Using it it in a mysql select doesn't work as expected. Query:
SELECT * FROM `users` where mail regexp '#(.*\..*){2,}'
is returning lines like
foo#example.com
that should not match the given regex. Why?
I think the answer to your question is here.
Because MySQL uses the C escape syntax in strings (for example, ā€œ\nā€
to represent the newline character), you must double any ā€œ\ā€ that you
use in your REGEXP strings.
MYSQL Reference
Because your middle dot wasn't properly escaped it was treated as just another wildcard and in the end your expression was effectively collapsed to #.{2,} or #..+
#anubhava's answer is probably a better substitute for what you tried to do though I would note #dasblinkenlight's comment about using the character class [.] which will make it easy to drop in a regex you've already tested in at RegexPal.
You can use:
SELECT * FROM `users` where mail REGEXP '([^.]*\\.){2}'
to enforce at least 2 dots in mail column.
I would match two dots in MySQL using like:
where col like '%#.%.%'
The problem with your code is that .* (match-everything dot) matches dot '.' character. Replacing it with [^.]* fixes the problem:
SELECT *
FROM `users`
where mail regexp '#([^.]*[.]){2,}'
Note the use of [.] in place of the equivalent \.. This syntax makes it easier to embed the regex into programming languages that use backslash as escape character in their string literals.
Demo.

Does a RegEx Pattern need to be modified to be used with SQL in MySql?

I'am trying to write a SELECT-Statement to retrieve a list of Usernames from my Database. My Pattern is: /placeholder\d+/ig and I already tested it and can confirm it is working properly. I'am trying to retrieve every Placeholder in the Table.
I also tried to escape the \ after placeholder.
My SQL-Statement is: SELECT * FROM table WHERE (name REGEX '/placeholder\d+/ig') ... I tried different variations with backticks, etc or LIKE instead of REGEXbut LIKEonly has % and _ as a Wildcard.
Does my RegEx pattern needs to be modified in order to work with MySQL?
Unlike most scripting languages, MySQL is not using the PREG library for regular expression matching.
So yes, you need to modify your regex to make it work properly in MySQL:
SELECT * FROM table WHERE name REGEXP 'placeholder[0-9]+'
OR
SELECT * FROM table WHERE name REGEXP 'placeholder[[:digit:]]+'
There are no short-hand character classes like \d in MySQL. Also, you do not use the regex-delimeter ("/../si" is just ".." in MySQL)
Read the documentation on regular expressions in MySQL for more information.