Match optional end of line - mysql

Hey I want to use a regular expression in MySQL to match rows.
It needs to match rows where the pattern is followed by either a non-digit character or the end of the line.
This pattern works in Ruby /download:223(?:[\D]|$)/
In MySQL it doesn't match. I'm guessing it doesn't allow for optional matching of eol.
SELECT id FROM stories WHERE body REGEXP 'download:223(?:[\D]|$)'
I need to match the following (quotes just for clarity):
"download:223"
"download:223*"
"download:223 something"
"download:223 more text"
But NOT the following (again quotes just for clarity):
"download:2234"
"download:2234 more text"
"download:2234*"
"download:2234* even more"
Thanks!

This regex should work for you:
"download:223([^0-9]|$)"
The regex engine in older MySQL versions (before the 8.0 switch to ICU) doesn't support \D, \d, etc.

Non-capturing groups are not supported in MySQL regexes. The rest should be fine. It definitely supports $ matching the end of string. Also, \D is not supported, but you can use [^0-9]
Try this:
SELECT id FROM stories WHERE body REGEXP 'download:223([^0-9]|$)'
MySQL groups don't capture, so supporting non-capturing groups is unnecessary.
Reference source:
Using Non-Capturing Groups in MySQL REGEXP
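The suggested pattern can be sanity-checked against the sample strings from the question before it is run on real data. This is only a sketch in Python, on the assumption that Python's re module treats [^0-9], | and $ the same way MySQL does for a pattern this simple; the helper name matches is made up for illustration.

```python
import re

# Same pattern as in the MySQL query: a plain group replaces (?:...)
# and [^0-9] replaces \D.
PATTERN = re.compile(r"download:223([^0-9]|$)")

SHOULD_MATCH = [
    "download:223",
    "download:223*",
    "download:223 something",
    "download:223 more text",
]
SHOULD_NOT_MATCH = [
    "download:2234",
    "download:2234 more text",
    "download:2234*",
    "download:2234* even more",
]


def matches(body: str) -> bool:
    """Mimic `body REGEXP pattern`: true if the pattern occurs anywhere in body."""
    return PATTERN.search(body) is not None


for body in SHOULD_MATCH + SHOULD_NOT_MATCH:
    print(f"{body!r}: {matches(body)}")
```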

Related

Types of Wildcards in MySql

My query:
Select * From tableName Where columnName Like "[PST]%"
is not giving the expected result.
Why does this wildcard not work in MySql?
If you want to filter on strings that contain any 'P', 'S', or 'T', then you can use a regex:
where col rlike '[PST]'
If you want strings that contain substring 'PST', then no need for square brackets - and like is enough:
where col like '%PST%'
If you want the substring 'PST' at the start of the string, then the regex solution looks like:
where col rlike '^PST'
And the like option would be:
where col like 'PST%'
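To see how the regex variants differ, here is a rough sketch in Python. It assumes Python's re module is close enough to MySQL's engine for these simple patterns; the rlike helper and the sample values are invented for illustration. Note that MySQL's REGEXP is usually case-insensitive for non-binary strings, while this sketch is case-sensitive.

```python
import re

# Hypothetical column values.
samples = ["PST zone", "Sydney", "apple Tree", "xyz"]


def rlike(value: str, pattern: str) -> bool:
    """Mimic `value RLIKE pattern`: the pattern may match anywhere in value."""
    return re.search(pattern, value) is not None


# Contains any one of the characters P, S or T.
contains_any = [s for s in samples if rlike(s, "[PST]")]
# Starts with the literal substring 'PST'.
starts_with_pst = [s for s in samples if rlike(s, "^PST")]
# Starts with any one of the characters P, S or T.
starts_with_any = [s for s in samples if rlike(s, "^[PST]")]

print(contains_any)
print(starts_with_pst)
print(starts_with_any)
```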
MySQL's LIKE syntax is documented here: https://dev.mysql.com/doc/refman/8.0/en/pattern-matching.html
Standard SQL from decades ago defined only two wildcards: % and _. These are the only wildcards an SQL product needs to support if they want to say they are SQL compliant and support the LIKE predicate.
% matches zero or more of any characters. It's analogous to .* in regular expressions.
_ matches exactly one of any character. It's analogous to . in regular expressions.
Also if you want to match a literal '%' or '_', you need to escape it, i.e. put a backslash before it:
WHERE title LIKE 'The 7\% Solution'
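The rules above (two wildcards plus a backslash escape) are small enough to model in a few lines. The following Python sketch translates a LIKE pattern into a regular expression; like_to_regex and like are hypothetical helpers, not part of any database API. The comparison here is case-sensitive, which is an assumption that differs from MySQL's usual collations.

```python
import re


def like_to_regex(pattern: str, escape: str = "\\") -> "re.Pattern[str]":
    """Translate a LIKE pattern: % -> .*, _ -> ., everything else literal."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character is always taken literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            # Brackets and other regex metacharacters have no meaning in LIKE.
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """LIKE must match the whole value, hence fullmatch."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("PST123", "PST%"))
print(like("Pabc", "[PST]%"))
print(like("The 7% Solution", "The 7\\% Solution"))
```

Under this model "[PST]%" only matches values that literally begin with the five characters [PST], which is why the query in the question returned nothing useful.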
Microsoft SQL Server's LIKE syntax is documented here: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-ver15
They support the % and _ wildcards and an escape character (declared with the ESCAPE clause), but they extend standard SQL with two other forms:
[a-z] matches one character, but only characters in the range inside the brackets. This is similar in regular expressions. The - is a range operator, unless it appears at the start or end of the string inside the brackets.
[^a-z] matches one character, which must not be one of the characters in the range inside the brackets. Also the same in regular expressions.
These are not standard forms of wildcards for the LIKE predicate, and other brands of SQL database don't support them.
Later versions of the SQL standard introduced a new predicate SIMILAR TO which supports much richer patterns and wildcards, since the right-side operand is a string which contains a regular expression. But since this predicate was introduced in a later edition of the SQL standard, some implementations had already developed their own solution that was almost the same.
MySQL called the operator REGEXP and RLIKE is a synonym (https://dev.mysql.com/doc/refman/8.0/en/regexp.html).
It was requested in https://bugs.mysql.com/bug.php?id=746 to support SIMILAR TO syntax to help MySQL comply with the SQL standard, but the request was turned down, because it had subtly different behavior to the existing REGEXP/RLIKE operator.
Microsoft SQL Server has partial support of regular expression wildcards in the LIKE operator, and also a dbo.RegexMatch() function.
SQLite has a GLOB operator, and so on.
Thanks everyone!
For this specific question, we need to use REGEXP:
Select * From tableName Where ColumnName Regexp "^[PST]";
For more detail on regular expressions (REGEXP):
https://www.youtube.com/watch?v=KoltE-JUY0c

How to use regex flags in Mariadb's regexp_replace?

I have a table with records. A record has a field content that contains some html like <p><img src=\"/pictures/image.jpg\" vspace=\"6\" hspace=\"6\" align=\"left\" alt=\"Alt text\" title=\"Title Text\" width=\"260\"> Some text content...
I need to remove <a></a> tags that are now placed around <img>. There can be multiple <a><img></a> occurrences in the string. I kinda made a corresponding regexp and learnt about REGEXP_REPLACE function. Ideally I expect something like
UPDATE table_name SET content = REGEXP_REPLACE(content, '/<a\shref=\\?"\/pictures\/.+">(<img.+">)<\/a>/gmU', '\\1') WHERE id=1
to work out, but it doesn't. I don't understand where to put flags gmU. Also in the articles/docs I found on the internet I don't see flags like g (global) and U (ungreedy). Is it global and ungreedy by default? How to make it all work?
10.3.15-MariaDB.
In MariaDB you pass flags to REGEXP_REPLACE by in-lining them in the regex using (?x) notation, where x is the flag. REGEXP_REPLACE by default replaces all occurrences of pattern in the string, so you don't need the g flag; nor in your case do you need the multi-line flag m as you are not attempting to use beginning/end of line anchors. You can use U though in place of the ? modifier to make + non-greedy.
There's a couple of issues with your regex:
MariaDB does not require regexes to be delimited with /
\s represents a literal s and needs to be \\s
To match a literal \ you need to use \\\\, not \\
This regex should give you the results you want:
(?U)<a\\s.*href=\\\\?"/pictures.+(<img.+>)</a>
In a query:
SELECT REGEXP_REPLACE(content, '(?U)<a\\s.*href=\\\\?"/pictures.+(<img.+>)</a>', '\\1')
FROM test
Demo on dbfiddle
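The same replacement can be tried outside the database. The sketch below uses Python, whose re module has no (?U) inline flag, so lazy quantifiers (.*? and .+?) stand in for it; that substitution is an assumption of this example rather than what MariaDB does internally. The function name strip_picture_links and the sample HTML are invented.

```python
import re

# Lazy quantifiers replace the (?U) flag, which Python's re does not support.
# \\? in the raw string is an optional literal backslash before the quote.
LINKED_IMG = re.compile(r'<a\s.*?href=\\?"/pictures.+?(<img.+?>)</a>')


def strip_picture_links(content: str) -> str:
    """Replace every <a ...><img ...></a> with just the <img ...> tag."""
    return LINKED_IMG.sub(r"\1", content)


sample = (
    '<p><a href="/pictures/big.jpg"><img src="/pictures/image.jpg" width="260"></a>'
    ' Some text <a href="/pictures/big2.jpg"><img src="/pictures/image2.jpg"></a></p>'
)
print(strip_picture_links(sample))
```

As in the SQL version, every occurrence is replaced in one pass. This is pattern matching rather than HTML parsing, so unusual markup may need a stricter pattern.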

regexp mysql group

I am trying to get the name of the city from the string '{"travelzoo_hotel_name":"Graduate Minneapolis","travelzoo_hotel_id":"223","city":"Minneapolis","country":"USA","sales_manager":"Stephen Conti"}'
I tried this regexp:
SELECT REGEXP_SUBSTR('{\"travelzoo_hotel_name\":\"Graduate Minneapolis\",\"travelzoo_hotel_id\":\"223\",\"city\":\"Minneapolis\",\"country\":\"USA\",\"sales_manager\":\"Stephen Conti\"}'
,'(?:.city...)([[:alnum:]]+)');
I get: '"city":"Minneapolis'
I need only the name of the city: Minneapolis.
How do I use groups in queries?
My example in regex101
Please help me.
I assume you are using MySQL 8.x that uses ICU regex expressions.
It looks like the string you want to process is JSON. You may use JSON_EXTRACT with JSON_UNQUOTE and a '$.city' as JSON path then:
JSON_UNQUOTE(JSON_EXTRACT('{"travelzoo_hotel_name":"Graduate Minneapolis","travelzoo_hotel_id":"223","city":"Minneapolis","country":"USA","sales_manager":"Stephen Conti"}', '$.city'))
will return Minneapolis.
In your regex, the non-capturing group pattern is still matched and appended to the match value. "Non-capturing" only means no separate memory buffer is allotted to the text captured with a grouping construct. So, you may fix it with the '(?<="city":")[^"]+' pattern, where (?<="city":") is a positive lookbehind that matches "city":" but does not put it into the match value. The only text you will have in the output is the one matched with [^"]+, one or more characters other than ".
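The difference between the whole match, a captured group and a lookbehind can be seen in a short Python sketch. Python's re is used here only as a stand-in for the ICU engine, and [A-Za-z0-9] replaces [[:alnum:]] because Python does not support POSIX character classes; the variable names are made up.

```python
import json
import re

raw = (
    '{"travelzoo_hotel_name":"Graduate Minneapolis","travelzoo_hotel_id":"223",'
    '"city":"Minneapolis","country":"USA","sales_manager":"Stephen Conti"}'
)

# Option 1: treat the string as JSON and read the key directly.
city_from_json = json.loads(raw)["city"]

# Option 2: the original pattern. The non-capturing group is still part of
# the overall match; only group 1 holds the city name.
non_capturing = re.search(r"(?:.city...)([A-Za-z0-9]+)", raw)
whole_match = non_capturing.group(0)
group_one = non_capturing.group(1)

# Option 3: a lookbehind checks the prefix without adding it to the match.
lookbehind = re.search(r'(?<="city":")[^"]+', raw)
city_from_regex = lookbehind.group(0)

print(city_from_json, whole_match, group_one, city_from_regex)
```

A function that returns only the whole match behaves like whole_match above, which is why the prefix showed up in the question's result.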

MySQL regex matching at least 2 dots

Consider the following regex
#(.*\..*){2,}
Expected behaviour:
a#b doesn't match
a#b.c doesn't match
a#b.c.d matches
a#b.c.d.e matches
and so on
Testing in regexpal it works as expected.
Using it in a MySQL select doesn't work as expected. Query:
SELECT * FROM `users` where mail regexp '#(.*\..*){2,}'
is returning lines like
foo#example.com
that should not match the given regex. Why?
I think the answer to your question is here.
Because MySQL uses the C escape syntax in strings (for example, “\n”
to represent the newline character), you must double any “\” that you
use in your REGEXP strings.
MYSQL Reference
Because your middle dot wasn't properly escaped it was treated as just another wildcard and in the end your expression was effectively collapsed to #.{2,} or #..+
#anubhava's answer is probably a better substitute for what you tried to do, though I would note #dasblinkenlight's comment about using the character class [.], which makes it easy to drop in a regex you've already tested at RegexPal.
You can use:
SELECT * FROM `users` where mail REGEXP '([^.]*\\.){2}'
to enforce at least 2 dots in mail column.
I would match two dots in MySQL using LIKE:
where col like '%#%.%.%'
The problem with your code is that .* (match-everything dot) matches dot '.' character. Replacing it with [^.]* fixes the problem:
SELECT *
FROM `users`
where mail regexp '#([^.]*[.]){2,}'
Note the use of [.] in place of the equivalent \.. This syntax makes it easier to embed the regex into programming languages that use backslash as escape character in their string literals.
Demo.
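The effect of the lost backslash can be reproduced in a small Python sketch. It assumes Python's re behaves like MySQL's engine for these patterns; as_seen_by_mysql is the regex that remains after MySQL's string parser has removed the single backslash from '#(.*\..*){2,}' (with default SQL mode). The helper name matching is made up.

```python
import re

samples = ["a#b", "a#b.c", "a#b.c.d", "a#b.c.d.e", "foo#example.com"]

# The single backslash is consumed by the string literal, leaving a bare dot.
as_seen_by_mysql = r"#(.*..*){2,}"
# What the engine receives when the backslash is doubled in the SQL string.
intended = r"#(.*\..*){2,}"
# Backslash-free alternative using a character class.
bracket_form = r"#([^.]*[.]){2,}"


def matching(pattern: str) -> list:
    """Return the samples in which the pattern matches somewhere."""
    return [s for s in samples if re.search(pattern, s)]


print(matching(as_seen_by_mysql))
print(matching(intended))
print(matching(bracket_form))
```

With the unescaped dot, any address with at least two characters after the # matches, which explains the foo#example.com result.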

Regex Search in phpMyAdmin

Attempting to change the "files" folder location in a Drupal site from /files to /sites/default/files.
In order to avoid changing anything else such as
http://www.google.com/profiles/
I'm trying to use a basic regular expression with a word boundary.
\bfiles/
A quick check in regexpal works as expected, but when I enter the above in the phpMyAdmin search, checking the "as regular expression" checkbox, I don't get the expected result.
Two questions:
How should I write my expression with a word boundary so that it works in phpMyAdmin?
I'm really a newbie at SQL statements! Would it be possible to write a SQL query that would simply look for every occurrence of "files/" & replace it with "sites/default/files/"?
According to the MySql docs, the regex flavour used is POSIX 1003.2. For this flavour of regex, word boundaries are as follows:
[[:<:]] (beginning) [[:>:]] (end)
so your regex would be:
[[:<:]]files/
If you want to use sql to search and replace all instances of [[:<:]]files/ from a specific field in a table, you could use a UDF such as the one found here
Also, you should be aware of the following while using regex with MySql:
Because MySQL uses the C escape syntax in strings (for example, “\n”
to represent the newline character), you must double any “\” that you
use in your REGEXP strings.
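To preview what a word-boundary replacement would do before touching the database, a sketch like the following can help. It is written in Python, where the word boundary is spelled \b rather than the POSIX-style [[:<:]] from the answer; the function name relocate and the sample strings are invented.

```python
import re

# \b plays the role of [[:<:]] here: "files/" must start at a word boundary,
# so the "files/" inside "profiles/" is left alone.
FILES_DIR = re.compile(r"\bfiles/")


def relocate(text: str) -> str:
    """Rewrite every standalone 'files/' path segment to 'sites/default/files/'."""
    return FILES_DIR.sub("sites/default/files/", text)


print(relocate("/files/logo.png"))
print(relocate("http://www.google.com/profiles/"))
```

Running it twice on the same text would add the prefix again, because sites/default/files/ itself contains a standalone files/ segment, so the replacement should be applied only once.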