MySQL Regular expression with alternation group not working - mysql

I'm trying to match the string "محمد مصلح حسن القطان" in a column of a MySQL table, using a regular expression that allows for the different variants of the letter "ا". I have tried this:
SELECT caseTitle FROM cases where caseTitle REGEXP 'قط([ا|أ|آ|إ])ن';
For some reason it doesn't work. But when I try this:
SELECT caseTitle FROM cases where caseTitle REGEXP 'قط([ا|أ|آ|إ])';
it works and matches the string. I'm using Google Cloud SQL with MySQL 5.7, and unfortunately I can't define a custom collation for Arabic letters, which would have solved my problem, so I had to use regular expressions.
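A possible explanation, offered as a sketch rather than a confirmed diagnosis: the regex engine in MySQL 5.7 works byte by byte, so a bracket expression matches a single byte, while each of these Arabic letters takes two bytes in UTF-8. The bracket can then only consume the first byte of the alef, and the following ن never lines up. Inside brackets, | is also a literal character, not alternation. The Python snippet below only emulates byte-wise matching by running the patterns on UTF-8 bytes; it is not MySQL, and it assumes the column is stored as UTF-8. The letters are written as code points so the byte values are easy to check.

```python
import re

# Letters as code points: qaf, tah, nun and the four alef variants (ا أ آ إ).
QAF, TAH, NUN = "\u0642", "\u0637", "\u0646"
ALEFS = ["\u0627", "\u0623", "\u0622", "\u0625"]

# The last word of the title from the question (القطان), as UTF-8 bytes.
name = ("\u0627\u0644" + QAF + TAH + "\u0627" + NUN).encode("utf-8")

inside = "|".join(ALEFS)
bracket_with_nun = (QAF + TAH + "[" + inside + "]" + NUN).encode("utf-8")  # first query
bracket_only = (QAF + TAH + "[" + inside + "]").encode("utf-8")            # second query
group_with_nun = (QAF + TAH + "(" + inside + ")" + NUN).encode("utf-8")    # no brackets

# On bytes, the bracket expression is a set of single bytes, so it can
# consume only the first byte of the two-byte alef.
print(re.search(bracket_with_nun, name) is not None)  # False
print(re.search(bracket_only, name) is not None)      # True
print(re.search(group_with_nun, name) is not None)    # True
```

If this is what is happening, writing the alternatives as a plain group, (ا|أ|آ|إ), without the square brackets should behave the same way in the real query.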

Related

Search for “whole word match” in MySQL [duplicate]

I want to search for an exact word using a SELECT query in MySQL.
E.g., my table column contains:
"This is a sample mail to test Auto Decline Invitation."
Query:
SELECT * FROM `test` where text REGEXP '[[:<:]]Invitation.[[:>:]]'
In the above example I need to select all records that match 'Invitation.'
Instead of using REGEXP, you could also use the LIKE pattern matching operator.
A sample query could be:
SELECT * FROM `test` WHERE `text` LIKE '%Invitation.%';
Edit
Otherwise, if LIKE doesn't match your requirements, you can of course use REGEXP.
For a REGEXP (MySQL 5.7) expression, you'll want to use (mentioned by Wiktor):
SELECT * FROM `test` WHERE `text` REGEXP '[[:<:]]Invitation[.]';
For a REGEXP (MySQL 8.0) expression, you'll want to use:
SELECT * FROM `test` WHERE `text` REGEXP '\\bInvitation\\.';
The [[:<:]] and [[:>:]] markers and the \b operator offer similar functionality for word boundaries. The MySQL 5.7 markers are a little more explicit, since they distinguish the beginning of a word from the end, as described at the bottom of the MySQL 5.7 regular expression documentation page. MySQL 8.0 uses International Components for Unicode (ICU) for regular expressions, whereas 5.7 uses Henry Spencer's implementation.
From the MySQL 8.0 docs:
MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe. (Prior to MySQL 8.0.4, MySQL used Henry Spencer's implementation of regular expressions, which operates in byte-wise fashion and is not multibyte safe.)
If you search that documentation page for \b, you'll find a clarification of the difference between ICU and Spencer regular expression handling:
The Spencer library supports word-beginning and word-end boundary markers ([[:<:]] and [[:>:]] notation). ICU does not. For ICU, you can use \b to match word boundaries; double the backslash because MySQL interprets it as the escape character within strings.
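To make the difference between the LIKE and the word-boundary approach concrete, here is a small Python sketch. Python's re module also uses \b, so it is close to the MySQL 8.0 form, but it is only an emulation, and the second row is a made-up value added for contrast. It also shows why a boundary after the final dot (as in the question's pattern) cannot match: a dot followed by the end of the string is not a word boundary.

```python
import re

rows = [
    "This is a sample mail to test Auto Decline Invitation.",
    "Resend the ReInvitation. today",  # hypothetical row for contrast
]

def like_contains(text, needle):
    # Rough equivalent of: text LIKE '%needle%' (needle has no wildcards here)
    return needle in text

def whole_word(text):
    # Same idea as the MySQL 8.0 pattern '\\bInvitation\\.'; the SQL string
    # literal needs doubled backslashes, a Python raw string does not.
    return re.search(r"\bInvitation\.", text) is not None

print([like_contains(r, "Invitation.") for r in rows])  # [True, True]
print([whole_word(r) for r in rows])                    # [True, False]

# A boundary after the trailing dot never matches at the end of the string.
print(re.search(r"\bInvitation\.\b", rows[0]) is None)  # True
```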
Bit of a learning experience for me too, thanks Wiktor!

Types of Wildcards in MySql

My query:
Select * From tableName Where columnName Like "[PST]%"
is not giving the expected result.
Why does this wildcard not work in MySql?
If you want to filter on strings that contain any 'P', 'S', or 'T', then you can use a regex:
where col rlike '[PST]'
If you want strings that contain substring 'PST', then no need for square brackets - and like is enough:
where col like '%PST%'
If you want the matching character(s) at the start of the string, then the regex solution looks like:
where col rlike '^PST'
And the like option would be:
where col like 'PST%'
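The four variants above select different rows. The following Python sketch emulates them on a handful of made-up values; re.search looks anywhere in the string, as REGEXP/RLIKE does, and re.IGNORECASE is used on the assumption that the column has a case-insensitive collation.

```python
import re

# Hypothetical column values, for illustration only.
names = ["Peter", "Sam", "Tom", "PSTN", "UPSTART", "Bob"]

def rlike(pattern, values):
    # Keep the values in which the pattern matches somewhere.
    return [v for v in values if re.search(pattern, v, re.IGNORECASE)]

print(rlike("[PST]", names))   # any of P, S, T anywhere
# ['Peter', 'Sam', 'Tom', 'PSTN', 'UPSTART']
print(rlike("PST", names))     # substring PST, same rows as LIKE '%PST%'
# ['PSTN', 'UPSTART']
print(rlike("^PST", names))    # starts with PST, same rows as LIKE 'PST%'
# ['PSTN']
print(rlike("^[PST]", names))  # starts with P, S, or T
# ['Peter', 'Sam', 'Tom', 'PSTN']
```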
MySQL's LIKE syntax is documented here: https://dev.mysql.com/doc/refman/8.0/en/pattern-matching.html
Standard SQL from decades ago defined only two wildcards: % and _. These are the only wildcards an SQL product needs to support if they want to say they are SQL compliant and support the LIKE predicate.
% matches zero or more of any characters. It's analogous to .* in regular expressions.
_ matches exactly one of any character. It's analogous to . in regular expressions.
Also if you want to match a literal '%' or '_', you need to escape it, i.e. put a backslash before it:
WHERE title LIKE 'The 7\% Solution'
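As a rough illustration of these rules, a LIKE pattern can be translated into a regular expression. The Python sketch below is an emulation, not MySQL's implementation; the function names are made up for this example, and unlike MySQL with its default collations the comparison here is case-sensitive.

```python
import re

def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern into a regex body (hypothetical helper)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped character is literal
            i += 2
            continue
        if ch == "%":
            out.append(".*")           # zero or more of any character
        elif ch == "_":
            out.append(".")            # exactly one character
        else:
            out.append(re.escape(ch))  # everything else is literal, even [ and ]
        i += 1
    return "".join(out)

def like(text, pattern):
    # LIKE must match the whole value, hence fullmatch.
    return re.fullmatch(like_to_regex(pattern), text, re.DOTALL) is not None

print(like("PST-001", "PST%"))                     # True
print(like("Pabc", "[PST]%"))                      # False: brackets are literal
print(like("[PST]abc", "[PST]%"))                  # True
print(like("The 7% Solution", r"The 7\% Solution"))  # True
print(like("cart", "c_t"))                         # False: _ is one character
```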
Microsoft SQL Server's LIKE syntax is documented here: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-ver15
They support the % and _ wildcards, with an escape character that you declare using the ESCAPE clause, and they extend standard SQL with two other forms:
[a-z] matches one character, but only characters in the range inside the brackets. This is similar in regular expressions. The - is a range operator, unless it appears at the start or end of the string inside the brackets.
[^a-z] matches one character, which must not be one of the characters in the range inside the brackets. Also the same in regular expressions.
These are not standard forms of wildcards for the LIKE predicate, and other brands of SQL database don't support them.
Later versions of the SQL standard introduced a new predicate SIMILAR TO which supports much richer patterns and wildcards, since the right-side operand is a string which contains a regular expression. But since this predicate was introduced in a later edition of the SQL standard, some implementations had already developed their own solution that was almost the same.
MySQL called the operator REGEXP and RLIKE is a synonym (https://dev.mysql.com/doc/refman/8.0/en/regexp.html).
It was requested in https://bugs.mysql.com/bug.php?id=746 to support SIMILAR TO syntax to help MySQL comply with the SQL standard, but the request was turned down, because it had subtly different behavior to the existing REGEXP/RLIKE operator.
Microsoft SQL Server has partial support of regular expression wildcards in the LIKE operator, and also a dbo.RegexMatch() function.
SQLite has a GLOB operator, and so on.
Thanks everyone!
For this specific question, we need to use REGEXP:
Select * From tableName Where ColumnName Regexp "^[PST]";
For more detail on regular expressions (REGEXP):
https://www.youtube.com/watch?v=KoltE-JUY0c

How to make this REGEX below work for MySql?

I have written regex and tested it online, works fine. When I test in terminal, MySQL console, it doesn't match and I get an empty set. I believe MySQL regexp syntax is somehow different but I cannot find the right way.
This is data I use:
edu.ba;
medu.ba;
edu.ba;
med.edu.ba;
edu.com;
edu.ba
I should get only the edu.ba matches, including the trailing ; if there is one. It works fine everywhere except in the actual query.
(\;+|^)\bedu.ba\b(\;+|$|\n)
Is there anything I could change to get the same results?
You want to match edu.ba in between semi-colons or start/end of string. The word boundaries are redundant here (although if you want to experiment, the MySQL regex before MySQL v8 used [[:<:]] / [[:>:]] word boundaries, and in MySQL v8+, you need to use double backslashes with \b - '\\b').
Use
(;|^)edu[.]ba(;|$)
Details
(;|^) - ; or start of string
edu[.]ba - edu.ba literal string (dot inside brackets always matches a literal dot)
(;|$) - ; or end of string.
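The suggested pattern uses only syntax that Python's re module shares with MySQL, so it can be checked quickly against the sample data from the question. This is just a sanity check in Python, not a MySQL run; the last two values are made up to show a multi-value string and the literal dot.

```python
import re

# Sample rows from the question.
rows = ["edu.ba;", "medu.ba;", "edu.ba;", "med.edu.ba;", "edu.com;", "edu.ba"]

pattern = re.compile(r"(;|^)edu[.]ba(;|$)")

matches = [r for r in rows if pattern.search(r)]
print(matches)  # ['edu.ba;', 'edu.ba;', 'edu.ba']

# Hypothetical extra values: a semicolon-separated list and a non-dot character.
print(pattern.search("com.ba;edu.ba;org.ba") is not None)  # True
print(pattern.search("eduxba") is not None)                # False
```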

Extracting phone numbers from a string using sql

I want to extract the phone numbers from a string. Some websites suggest using this regular expression:
^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$
But it's not working on the normal sql query. Here's the syntax:
REGEXP_LIKE (mystring, '^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$')
Am I doing something wrong here?
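\d and \s don't exist in MySQL's older regex engine (before 8.0.4), and in 8.0 the backslashes must be doubled inside a string literal. Use [[:digit:]] or [0-9] instead of \d, and [[:space:]], [[:blank:]], or simply a literal space instead of \s.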
Some other things are fixed here:
str REGEXP '^([+][0-9]{1,2})?([(][0-9]{3}[)]|[0-9]{3})[-. ][0-9]{4}$'
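For comparison, here is a sketch of a direct translation of the pattern from the question that keeps all three digit groups, with \d written as [0-9] and \s as a literal space. It is checked in Python only, on made-up sample numbers, on the assumption that this bracket syntax behaves the same in MySQL.

```python
import re

# Question's pattern with \d -> [0-9] and \s -> a literal space.
PHONE = re.compile(
    r"^([+][0-9]{1,2} )?[(]?[0-9]{3}[)]?[-. ][0-9]{3}[-. ][0-9]{4}$"
)

# Hypothetical sample values.
samples = [
    "123-456-7890",
    "(123) 456-7890",
    "+1 123.456.7890",
    "1234567890",   # no separators, rejected
    "123-4567",     # only two groups, rejected
]

accepted = [s for s in samples if PHONE.search(s)]
print(accepted)  # ['123-456-7890', '(123) 456-7890', '+1 123.456.7890']
```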

Does a RegEx Pattern need to be modified to be used with SQL in MySql?

I'm trying to write a SELECT statement to retrieve a list of usernames from my database. My pattern is /placeholder\d+/ig; I have already tested it and can confirm it works properly. I'm trying to retrieve every placeholder in the table.
I also tried to escape the \ after placeholder.
My SQL statement is: SELECT * FROM table WHERE (name REGEX '/placeholder\d+/ig') ... I tried different variations with backticks etc., and LIKE instead of REGEX, but LIKE only has % and _ as wildcards.
Does my RegEx pattern need to be modified in order to work with MySQL?
Unlike most scripting languages, MySQL does not use the PCRE library (the one behind PHP's preg_* functions) for regular expression matching.
So yes, you need to modify your regex to make it work properly in MySQL:
SELECT * FROM table WHERE name REGEXP 'placeholder[0-9]+'
OR
SELECT * FROM table WHERE name REGEXP 'placeholder[[:digit:]]+'
There are no shorthand character classes like \d in MySQL (at least before 8.0.4). Also, you do not use regex delimiters or trailing flags ("/../si" is just ".." in MySQL).
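The two changes described here (drop the delimiters and flags, spell out \d) can be sketched as a small Python helper. The function names are made up for this example, and the \d replacement is deliberately naive: it does not handle an escaped backslash followed by d.

```python
import re

def strip_js_literal(literal):
    """Split a /pattern/flags literal into (pattern, flags). Hypothetical helper."""
    m = re.fullmatch(r"/(.*)/([a-z]*)", literal, re.DOTALL)
    if m is None:
        return literal, ""  # no delimiters found, return the input unchanged
    return m.group(1), m.group(2)

def to_mysql_pattern(literal):
    pattern, _flags = strip_js_literal(literal)
    # Replace the \d shorthand with an explicit character class.
    return pattern.replace(r"\d", "[0-9]")

print(strip_js_literal(r"/placeholder\d+/ig"))  # ('placeholder\\d+', 'ig')
print(to_mysql_pattern(r"/placeholder\d+/ig"))  # placeholder[0-9]+
```

The flags are simply dropped in this sketch; whether a REGEXP match is case-insensitive in MySQL depends on the collation of the column rather than on a flag in the pattern.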
Read the documentation on regular expressions in MySQL for more information.