If I try to use Regex inside locate it fails
Select Locate(FieldA regexp '[a-z][A-Z][a-z]',Binary FieldA) from PatternTester
as per http://sqlfiddle.com/#!9/403c36/2.
If I search for the explicit letter pattern it locates it correctly:
Select Locate('lC',Binary FieldA) from PatternTester
as per http://sqlfiddle.com/#!9/403c36/6
Is there something I need to do to make locate 'obey' Regex or will it simply not?
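As the MySQL documentation says, LOCATE() returns the position of the first occurrence of substring substr in string str, so it will not take a regex as an input argument. In your query the expression FieldA regexp '[a-z][A-Z][a-z]' is evaluated first and yields 0 or 1, and LOCATE() then searches FieldA for that number as a literal string.
Also, from checking your fiddle reference, you don't have REGEXP_INSTR() in this version! It was only added in MySQL 8.0.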
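To illustrate the difference between a literal search and a pattern search, here is a minimal sketch in Python, used only as a stand-in for the SQL. The sample string and both helper names are made up for illustration. It mimics the 1-based result LOCATE() gives for a literal, and shows the kind of result a regex-aware position lookup (such as REGEXP_INSTR() on a server that has it) would give.

```python
import re

def locate_literal(substr: str, text: str) -> int:
    """1-based position of a literal substring, or 0 if absent (LOCATE-style)."""
    return text.find(substr) + 1

def locate_pattern(pattern: str, text: str) -> int:
    """1-based position of the first regex match, or 0 if there is none."""
    match = re.search(pattern, text)
    return match.start() + 1 if match else 0

sample = "ABClCdEF"  # hypothetical value of FieldA

print(locate_literal("lC", sample))               # 4
print(locate_pattern("[a-z][A-Z][a-z]", sample))  # 4

# What the failing query effectively does: the REGEXP comparison is evaluated
# first and yields 1 (or 0), and that number is then searched for as text.
flag = str(int(re.search("[a-z][A-Z][a-z]", sample) is not None))
print(locate_literal(flag, sample))               # 0, "1" does not occur in sample
```

Python's re is case-sensitive by default, which plays the role of the Binary cast in the original query.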
Related
I have a table with records. A record has a field content that contains some html like <p><img src=\"/pictures/image.jpg\" vspace=\"6\" hspace=\"6\" align=\"left\" alt=\"Alt text\" title=\"Title Text\" width=\"260\"> Some text content...
I need to remove <a></a> tags that are now placed around <img>. There can be multiple <a><img></a> occurrences in the string. I kinda made a corresponding regexp and learnt about REGEXP_REPLACE function. Ideally I expect something like
UPDATE table_name SET content = REGEXP_REPLACE(content, '/<a\shref=\\?"\/pictures\/.+">(<img.+">)<\/a>/gmU', '\\1') WHERE id=1
to work out, but it doesn't. I don't understand where to put flags gmU. Also in the articles/docs I found on the internet I don't see flags like g (global) and U (ungreedy). Is it global and ungreedy by default? How to make it all work?
10.3.15-MariaDB.
In MariaDB you pass flags to REGEXP_REPLACE by in-lining them in the regex using (?x) notation, where x is the flag. REGEXP_REPLACE by default replaces all occurrences of pattern in the string, so you don't need the g flag; nor in your case do you need the multi-line flag m as you are not attempting to use beginning/end of line anchors. You can use U though in place of the ? modifier to make + non-greedy.
There's a couple of issues with your regex:
MariaDB does not require regexes to be delimited with /
Inside a string literal \s is read as a literal s, so it needs to be written as \\s
To match a literal \ you need to use \\\\, not \\
This regex should give you the results you want:
(?U)<a\\s.*href=\\\\?"/pictures.+(<img.+>)</a>
In a query:
SELECT REGEXP_REPLACE(content, '(?U)<a\\s.*href=\\\\?"/pictures.+(<img.+>)</a>', '\\1')
FROM test
Demo on dbfiddle
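The same replacement can be tried outside the database. The sketch below uses Python's re module as a stand-in, so it is not MariaDB syntax: Python has no (?U) flag, so the lazy quantifiers are written out as .*? and .+?, and the backslashes are not doubled because a raw string is used. The content value is a made-up sample in the style of the question.

```python
import re

# Hypothetical stored value: quotes are backslash-escaped, as in the question.
content = (
    r'<p><a href=\"/pictures/big1.jpg\"><img src=\"/pictures/image1.jpg\" alt=\"A\"></a> text '
    r'<a href=\"/pictures/big2.jpg\"><img src=\"/pictures/image2.jpg\" alt=\"B\"></a></p>'
)

# Same shape as the answer's pattern; *? and +? replace the (?U) flag.
pattern = re.compile(r'<a\s.*?href=\\?"/pictures.+?(<img.+?>)</a>')

# re.sub replaces every occurrence, keeping only the captured <img> tag.
unwrapped = pattern.sub(r'\1', content)
print(unwrapped)
```

Both links are unwrapped in one pass, which mirrors REGEXP_REPLACE replacing all occurrences by default.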
The given string is a comma separated value, for example the string below
29,30,31,32,33,34,35,36,192,225,228,233,239,240,144,145
I want to find an exact whole match of any number in the string, and I wrote a regex to find it. When I checked it on regex101, it worked well. But when I used the same regex in a MySQL query, it did not work
SELECT * FROM table_name WHERE value REGEXP '.*\b22\b,{0,1}.*'
I did find another query that works, it used concat() function.
SELECT * FROM table_name WHERE CONCAT(',',value,',') like '%,22,%'
Edit: The answer from #anubhava also works (which can be found on the very first comment on this question). Since I am able to pick only one answer as Solved, I just wanted to let others know.
Thanks again
On your version of MySQL, most likely word boundaries are denoted by [[:<:]] and [[:>:]]:
SELECT *
FROM table_name
WHERE value REGEXP '[[:<:]]22[[:>:]]';
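I believe that starting with MySQL 8+, it is possible to use the more standard \b to represent a word boundary (written as '\\b' inside a string literal, since the string parser consumes one backslash). Check the demo link below to see the query above working on MySQL 5.7.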
Demo
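As a rough cross-check of the matching logic, here is a small Python sketch (a stand-in for the SQL, with made-up helper names). It compares a word-boundary regex with the comma-wrapping trick from the question on the sample string.

```python
import re

value = "29,30,31,32,33,34,35,36,192,225,228,233,239,240,144,145"

def has_number_regex(csv: str, number: str) -> bool:
    """Whole-number match using word boundaries on both sides."""
    return re.search(rf"\b{re.escape(number)}\b", csv) is not None

def has_number_concat(csv: str, number: str) -> bool:
    """Same idea as CONCAT(',', value, ',') LIKE '%,22,%'."""
    return f",{number}," in f",{csv},"

for number in ("22", "225", "145"):
    print(number, has_number_regex(value, number), has_number_concat(value, number))
# 22 False False
# 225 True True
# 145 True True
```

22 only occurs inside 225 and 228, so neither approach reports it as a whole value.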
There is a function just for this task:
WHERE FIND_IN_SET('22', '29,30,31,32,33,34,35,36,192,225,228,233,239,240,144,145')
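For intuition, the lookup FIND_IN_SET performs can be sketched in Python as below. This is an approximation written for illustration, not the server's implementation: the function name simply mirrors the SQL one, and edge cases such as an empty list are not handled the way MySQL handles them.

```python
def find_in_set(needle: str, csv: str) -> int:
    """1-based index of needle among the comma-separated items, or 0 if absent."""
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0

value = "29,30,31,32,33,34,35,36,192,225,228,233,239,240,144,145"

print(find_in_set("22", value))   # 0, because 22 is not an item on its own
print(find_in_set("192", value))  # 9
print(find_in_set("29", value))   # 1
```

Because a non-zero result is treated as true, the function can be used directly in a WHERE clause as shown above.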
I have written regex and tested it online, works fine. When I test in terminal, MySQL console, it doesn't match and I get an empty set. I believe MySQL regexp syntax is somehow different but I cannot find the right way.
This is data I use:
edu.ba;
medu.ba;
edu.ba;
med.edu.ba;
edu.com;
edu.ba
I should get only the edu.ba matches, including the trailing ; if there is one. The regex works fine everywhere except in the actual query.
(\;+|^)\bedu.ba\b(\;+|$|\n)
Is there anything I could change to get the same results?
You want to match edu.ba in between semi-colons or start/end of string. The word boundaries are redundant here (although if you want to experiment, the MySQL regex before MySQL v8 used [[:<:]] / [[:>:]] word boundaries, and in MySQL v8+, you need to use double backslashes with \b - '\\b').
Use
(;|^)edu[.]ba(;|$)
Details
(;|^) - ; or start of string
edu[.]ba - edu.ba literal string (dot inside brackets always matches a literal dot)
(;|$) - ; or end of string.
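The pattern can be checked against the sample rows from the question with a short Python sketch. Python's re is only a stand-in here; the pattern itself uses nothing beyond groups, anchors and a bracket expression.

```python
import re

pattern = re.compile(r"(;|^)edu[.]ba(;|$)")

# Rows taken from the question's sample data.
rows = ["edu.ba;", "medu.ba;", "med.edu.ba;", "edu.com;", "edu.ba"]

matches = [row for row in rows if pattern.search(row)]
print(matches)  # ['edu.ba;', 'edu.ba']
```

medu.ba; and med.edu.ba; are rejected because the character before edu is neither a semi-colon nor the start of the string.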
I am trying to get the name of the city from the string '{"travelzoo_hotel_name":"Graduate Minneapolis","travelzoo_hotel_id":"223","city":"Minneapolis","country":"USA","sales_manager":"Stephen Conti"}'
I tried this regexp:
SELECT REGEXP_SUBSTR('{\"travelzoo_hotel_name\":\"Graduate Minneapolis\",\"travelzoo_hotel_id\":\"223\",\"city\":\"Minneapolis\",\"country\":\"USA\",\"sales_manager\":\"Stephen Conti\"}'
,'(?:.city...)([[:alnum:]]+)');
I get: '"city":"Minneapolis'
I need only the name of the city: Minneapolis.
How do I use groups in queries?
My example in regex101
Help me Please
I assume you are using MySQL 8.x that uses ICU regex expressions.
It looks like the string you want to process is JSON. You may use JSON_EXTRACT with JSON_UNQUOTE and a '$.city' as JSON path then:
JSON_UNQUOTE(JSON_EXTRACT('{"travelzoo_hotel_name":"Graduate Minneapolis","travelzoo_hotel_id":"223","city":"Minneapolis","country":"USA","sales_manager":"Stephen Conti"}', '$.city'))
will return Minneapolis.
In your regex, the non-capturing group pattern is still matched and appended to the match value. "Non-capturing" only means no separate memory buffer is allotted to the text captured with a grouping construct. So, you may fix it with the '(?<="city":")[^"]+' pattern, where (?<="city":") is a positive lookbehind that matches "city":" but does not put it into the match value. The only text you will have in the output is the one matched with [^"]+, one or more chars other than ".
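The three behaviours described above (JSON lookup, lookbehind, and the non-capturing group that still ends up in the match) can be reproduced with a small Python sketch. Python's json and re modules are stand-ins for the MySQL functions, and the variable names are made up for illustration.

```python
import json
import re

raw = ('{"travelzoo_hotel_name":"Graduate Minneapolis","travelzoo_hotel_id":"223",'
       '"city":"Minneapolis","country":"USA","sales_manager":"Stephen Conti"}')

# 1. Treat the value as JSON, in the spirit of JSON_UNQUOTE(JSON_EXTRACT(..., '$.city')).
city_from_json = json.loads(raw)["city"]

# 2. Lookbehind: "city":" must precede the match but is not part of it.
city_from_lookbehind = re.search(r'(?<="city":")[^"]+', raw).group(0)

# 3. The original pattern: the non-capturing group is still part of the whole match.
whole_match = re.search(r'(?:.city...)([A-Za-z0-9]+)', raw).group(0)

print(city_from_json)        # Minneapolis
print(city_from_lookbehind)  # Minneapolis
print(whole_match)           # "city":"Minneapolis
```

The third result is the same unwanted value the question reports, which is why the lookbehind or the JSON route is needed when only the whole match is returned.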
I'm trying to build a search query which searches for a word in a string and finds matches based on the following criteria:
The word is surrounded by a combination of either a space, period or comma
The word is at the start of the string and is followed by a space, period or comma
The word is at the end of the string and is preceded by a space, period or comma
It's a full match, i.e. the entire string is just the word
For example, if the word is 'php' the following strings would be matches:
php
mysql, php, javascript
php.mysql
javascript php
But for instance it wouldn't match:
php5
I've tried the following query:
SELECT * FROM candidate WHERE skillset REGEXP '^|[., ]php[., ]|$'
However that doesn't work, it returns every record as a match which is wrong.
Without the ^| and |$ in there, i.e.
SELECT * FROM candidate WHERE skillset REGEXP '[., ]php[., ]'
It successfully finds matches where 'php' is somewhere in the string except the start and end of the string. So the problem must be with the ^| and |$ part of the regexp.
How can I add those conditions in to make it work as required?
Try '\bphp\b', \b is a word boundary and might just be exactly what you need because it looks for the whole word php.
For MySQL, word boundaries are represented with [[:<:]] and [[:>:]] instead of \b, so use the query '[[:<:]]php[[:>:]]'. More info on word boundaries here.
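As for why the original pattern matched every record: | has the lowest precedence, so '^|[., ]php[., ]|$' reads as "start of string, OR separator-php-separator, OR end of string", and a bare ^ matches any string. Grouping the alternatives is another way to express the four rules, alongside the word-boundary form above. The sketch below checks this in Python as a stand-in; whether the grouped pattern behaves identically in a given MySQL version is an assumption to verify there.

```python
import re

broken = re.compile(r"^|[., ]php[., ]|$")        # three separate alternatives
grouped = re.compile(r"(^|[., ])php([., ]|$)")   # anchors scoped by the groups

samples = ["php", "mysql, php, javascript", "php.mysql", "javascript php", "php5"]

broken_hits = [s for s in samples if broken.search(s)]
grouped_hits = [s for s in samples if grouped.search(s)]

print(broken_hits)   # every sample, including 'php5'
print(grouped_hits)  # ['php', 'mysql, php, javascript', 'php.mysql', 'javascript php']
```

Unlike a word boundary, the grouped form only accepts the listed separators, so a value such as php-fpm would match with word boundaries but not with this pattern.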
Well, you can play around a bit with regex101.com
Something I found that works for you but doesn't exactly follow your rules is:
/(?=[" ".,]?php[" ".,]?)(?=php[\W])/
This uses the lookahead operator, ?=, to do AND
The first portion of the regex is
[" ".,]?php[" ".,]?
This will match anything that has a space, period, or comma before or after the php, but at most only one.
The second portion of the regex is
php[\W]
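This will match anything that is php, followed by a non-word character. In other words, it will NOT match php followed by a letter, digit, or underscore.
It's not the perfect answer for your set of rules, but it does work with your sample data set when each sample sits on its own line in regex101. Note that php[\W] needs some character after php, so a string that ends in php with nothing after it will not match. Play around on regex101.com and try to make a perfect one.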