RegExp in mysql for field - mysql

I have the following query:
SELECT item from table
Which gives me:
<title>Titanic</title>
How would I extract the name "Titanic" from this? Something like:
SELECT re.find('\>(.+)\>, item) FROM table
What would be the correct syntax for this?

By default, MySQL does not provide functionality for extracting text using regular expressions. You can use REGEXP to find rows that match something like >.+<, but there is no straightforward way of extracting the captured group without some additional effort, such as:
using a library like lib_mysqludf_preg
writing your own MySQL function to extract matched text
performing regular string manipulation
using the regex functionality of whatever environment you're using MySQL from (e.g. PHP's preg_match)
reconsidering your need for regular expressions entirely. If you know that all your rows contain a <title> tag, for instance, it may be a better idea to simply use "normal" string functions such as SUBSTRING

As pointed out in the informative answer by George Bahij MySQL lacks this functionality so the options would be to either extend the functionality using udfs etc, or use the available string functions, in which case you could do:
SELECT
SUBSTR(
SUBSTRING_INDEX(
SUBSTRING_INDEX(item,'<title>',2)
,'</title>',1)
FROM 8
)
from table
Or if the string you need to extract from always is on the format <title>item</title> then you could simple use replace: replace(replace(item, '<title>', ''), '</title>','')

This regex: <\w+>.+</\w+> will match content in tags.
Your query should be something like:
SELECT * FROM `table` WHERE `field` REGEXP '<\w+>.+</\w+>';
Then if you're using PHP or something similar you could use a function like strip_tags to extract the content between the tags.

XML shouldn't be parsed with regexes, and at any rate MySQL only supports matching, not replacement.
But MySQL supports XPath 1.0. You should be able to simply do this:
SELECT ExtractValue(item,'/title') AS item_title FROM table;
https://dev.mysql.com/doc/refman/5.6/en/xml-functions.html

Related

Does a RegEx Pattern need to be modified to be used with SQL in MySql?

I'am trying to write a SELECT-Statement to retrieve a list of Usernames from my Database. My Pattern is: /placeholder\d+/ig and I already tested it and can confirm it is working properly. I'am trying to retrieve every Placeholder in the Table.
I also tried to escape the \ after placeholder.
My SQL-Statement is: SELECT * FROM table WHERE (name REGEX '/placeholder\d+/ig') ... I tried different variations with backticks, etc or LIKE instead of REGEXbut LIKEonly has % and _ as a Wildcard.
Does my RegEx pattern needs to be modified in order to work with MySQL?
Unlike most scripting languages, MySQL is not using the PREG library for regular expression matching.
So yes, you need to modify your regex to make it work properly in MySQL:
SELECT * FROM table WHERE name REGEXP 'placeholder[0-9]+'
OR
SELECT * FROM table WHERE name REGEXP 'placeholder[[:digit:]]+'
There are no short-hand character classes like \d in MySQL. Also, you do not use the regex-delimeter ("/../si" is just ".." in MySQL)
Read the documentation on regular expressions in MySQL for more information.

MySQL find/replace with a unique string inside

not sure how far I'm going to get with this, but I'm going through a database removing certain bits and pieces in preparation for a conversion to different software.
I'm struggling with the image tags as on the site they currently look like
[img:<string>]<image url>[/img:<string>]
those strings are in another field called bbcode_uid
The query I'm running to make the changes so far is
UPDATE phpbb_posts SET post_text = REPLACE(post_text, '[img:]', '');
So my actual question, is there any way of pulling in each string from bbcode_uid inside of that SQL query so that I don't have to run the same command 10,000+ times, changing the unique string every time.
Alternatively could I include something inside [img:] to also include the next 8 characters, whatever they may be, as that is the length of the string that is used.
Hoping to save time with this, otherwise I might have to think of another way of doing it.
As requested.
The text I wish to replace would be
[img:1nynnywx]http://i.imgur.com/Tgfrd3x.jpg[/img:1nynnywx]
I want to end up with just
http://i.imgur.com/Tgfrd3x.jpg
Just removing the code around the URL, however each post_text has a different string which is contained inside bbcode_uid.
Method 1
LIB_MYSQLUDF_PREG
If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you'll need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regex support in any way. It merely makes the following additional functions available:
PREG_CAPTURE extracts a regex match from a string. PREG_POSITION returns the position at which a regular expression matches a string. PREG_REPLACE performs a search-and-replace on a string. PREG_RLIKE tests whether a regex matches a string.
All these functions take a regular expression as their first parameter. This regular expression must be formatted like a Perl regular expression operator. E.g. to test if regex matches the subject case insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string
you can refer this link :github.com/hholzgra/mysql-udf-regexp
Method 2
Use php program, fetch records one by one , use php preg_replace
refer : www.php.net/preg_replace
reference:http://www.online-ebooks.info/article/MySql_Regular_Expression_Replace.html
You might be able to do this with substring_index().
The following will work on your example:
select substring_index(substring_index(post_text, '[/img:', 1), ']', -1)

Regexp to validate URL in MySQL

I have tried several regex patterns (designed for use with PHP because I couldn't find any for MySQL) for URL validation, but none of them are working. Probably MySQL has a slightly different syntax.
I've also tried to come up with one, but no success.
So does anyone know a fairly good regex to use with MySQL for URL validation?
According to article 11.5.2. Regular Expressions in MySQL's documentation, you can perform selections with a regular expression with the following syntax
SELECT field FROM table WHERE field REGEX pattern
In order to match simple URLS, you may use
SELECT field FROM table
WHERE field REGEXP "^(https?://|www\\.)[\.A-Za-z0-9\-]+\\.[a-zA-Z]{2,4}"
This will match most urls like
www.google.il
http://google.com/
http://ww.google.net/
www.google.com/index.php?test=data
https://yahoo.dk/as
http://goo.gle.com/
http://wt.a.x24-s.org/ye/
www.website.info
But not
htp://google.com
ww.google.com/
www-google.com
http://google.c
http://goo#.com
httpf://google.com
Although the answer KBA posted works, there are some inconstancies with the escaping.
The proper syntax should be, this syntax works in MySQL as well as in PHP for example.
SELECT field FROM table
WHERE field REGEXP "^(https?:\/\/|www\.)[\.A-Za-z0-9\-]+\.[a-zA-Z]{2,4}"
The above code will only match if the content of 'field' starts with a URL. If you would like to match all rows where the field contains a url (so for example surrounded by other text / content) just simply use:
SELECT field FROM table
WHERE field REGEXP "(https?:\/\/|www\.)[\.A-Za-z0-9\-]+\.[a-zA-Z]{2,4}"

MySQL: Find and Replace Between Certain Characters

In field post_content I have a string like this in nearly 800 rows:
http://somesite.com/">This is some site</a>
I need to remove everything from "> onwards so that it leaves just the URL. I can't do a straight find and replace because the text is unique.
Any clues? This is really my first foray into MySQL database modifications but I did do an extensive search before posting here.
Thanks,
~Kyle~
From this site: http://www.regular-expressions.info/mysql.html
LIB_MYSQLUDF_PREG
If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you'll need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regex support in any way. It merely makes the following additional functions available:
Here it comes...
PREG_CAPTURE extracts a regex match from a string. PREG_POSITION returns the position at which a regular expression matches a string. PREG_REPLACE performs a search-and-replace on a string. PREG_RLIKE tests whether a regex matches a string.
Sounds exactly what you're looking for.
All these functions take a regular expression as their first parameter. This regular expression must be formatted like a Perl regular expression operator. E.g. to test if regex matches the subject case insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string.
See this post: How to do a regular expression replace in MySQL?
Either that or you could just write a script in any lanugage which goes through each record, does a regex replacement and then updates the field. For more info on regex, see here: http://www.regular-expressions.info/reference.html
There's a number of options. One might be to use SUBSTRING_INDEX():
UPDATE
table
SET field = SUBSTRING_INDEX( field, '">', 1 )
It's possible - there is a syntax for User Defined Functions which would let you pass in a regular expression pattern that matches the link and strips everything else.
However, this is quite complicated for somebody new to MySQL, and from your question, this sounds like a one-off. In which case - why not just use Excel and then reimport the data?
Great stuff!
All seems doable with a little bit of time and self education.
In the end, I exported that table as a CSV in Sequel Pro and did some nifty find and replace work in Coda. Not as sophisticated as your suggestions, but it worked.
Thanks again,
~Kyle~

extract value using regular expression in mysql

I am storing values in database like this
www/content/lessons/40/Digital Library/document1.doc
I need to extract the file document.doc.
How to retrieve this value from mysql using regular expression.
you not need to use regexp (low performence)
use substring_index instead
SELECT SUBSTRING_INDEX('bbb/bbbbbb/bbbbbbbbb/bbbb', '/', -1);
link :
http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_substring-index
Mysql does not provide regex based extraction from strings, but you can extend mysql to do this:
https://github.com/mysqludf/lib_mysqludf_preg
It's not a good solution for a lot of people though. eg you couldn't use it in a shared hosting environment.