Mysql split string on punctuation - mysql

I have a database where office users have created a "poor man's categorization" by prefixing the administrative title field with a category. For instance, you have records like
Applications - When to Apply
Applications- Fees
Admission: GPA requirements
Admissions: Bursar
We are adding a category column, and I want to get (as close as possible) all the unique user-created categories in the title field. From the examples above, Applications, Admission, and Admissions are good enough.
How can I write a query to return the first part of a field, split on the first non-alphanumeric character?

AFAIK, this isn't possible with any of the built-in MySQL functions. There's no function for searching a string for a character outside a set, e.g. the first non-alphanumeric character.
You can write a stored function that does it, by looping over the string and calling SUBSTR(). But you're probably better off searching the net for a user-defined function that can split a string using a regular expression.
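If pulling the titles into application code is an option, the split itself is short. The following is only a sketch in Python rather than MySQL; the function name title_category and the sample list are made up for illustration, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

def title_category(title):
    """Return the leading run of ASCII letters and digits in a title."""
    match = re.match(r"[A-Za-z0-9]+", title.strip())
    return match.group(0) if match else ""

# Sample titles taken from the question above.
titles = [
    "Applications - When to Apply",
    "Applications- Fees",
    "Admission: GPA requirements",
    "Admissions: Bursar",
]

# Collect the distinct prefixes, sorted for stable output.
categories = sorted({title_category(t) for t in titles})
print(categories)  # ['Admission', 'Admissions', 'Applications']
```

Titles that start with punctuation yield an empty string here, so those rows would still need to be reviewed by hand.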

Related

MySQL find/replace with a unique string inside

not sure how far I'm going to get with this, but I'm going through a database removing certain bits and pieces in preparation for a conversion to different software.
I'm struggling with the image tags as on the site they currently look like
[img:<string>]<image url>[/img:<string>]
Those strings are stored in another field called bbcode_uid.
The query I'm running to make the changes so far is
UPDATE phpbb_posts SET post_text = REPLACE(post_text, '[img:]', '');
So my actual question: is there any way of pulling in each string from bbcode_uid inside of that SQL query, so that I don't have to run the same command 10,000+ times, changing the unique string every time?
Alternatively, could I include something inside [img:] to also include the next 8 characters, whatever they may be, as that is the length of the string that is used?
Hoping to save time with this, otherwise I might have to think of another way of doing it.
As requested.
The text I wish to replace would be
[img:1nynnywx]http://i.imgur.com/Tgfrd3x.jpg[/img:1nynnywx]
I want to end up with just
http://i.imgur.com/Tgfrd3x.jpg
Just removing the code around the URL, however each post_text has a different string which is contained inside bbcode_uid.
Method 1
LIB_MYSQLUDF_PREG
If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you'll need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regex support in any way. It merely makes the following additional functions available:
PREG_CAPTURE extracts a regex match from a string.
PREG_POSITION returns the position at which a regular expression matches a string.
PREG_REPLACE performs a search-and-replace on a string.
PREG_RLIKE tests whether a regex matches a string.
All these functions take a regular expression as their first parameter. This regular expression must be formatted like a Perl regular expression operator. E.g. to test whether a regex matches the subject case insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string.
You can refer to this link: github.com/hholzgra/mysql-udf-regexp
Method 2
Use a PHP program: fetch the records one by one and rewrite each post_text with PHP's preg_replace.
Refer to: www.php.net/preg_replace
Reference: http://www.online-ebooks.info/article/MySql_Regular_Expression_Replace.html
You might be able to do this with substring_index().
The following will work on your example:
select substring_index(substring_index(post_text, '[/img:', 1), ']', -1) from phpbb_posts
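Method 2 above (fetch each row and rewrite it in application code) could look roughly like the following. This is a sketch in Python rather than PHP, it covers only the string rewrite and not the database round trip, and the function name strip_img_tags is hypothetical.

```python
import re

def strip_img_tags(post_text, bbcode_uid):
    """Remove [img:<uid>] and [/img:<uid>] for this row's bbcode_uid."""
    # re.escape guards against uids that contain regex metacharacters.
    pattern = r"\[/?img:" + re.escape(bbcode_uid) + r"\]"
    return re.sub(pattern, "", post_text)

post_text = "[img:1nynnywx]http://i.imgur.com/Tgfrd3x.jpg[/img:1nynnywx]"
print(strip_img_tags(post_text, "1nynnywx"))
# http://i.imgur.com/Tgfrd3x.jpg
```

Passing the row's own bbcode_uid keeps the replacement from touching tags that merely look similar; text around the tags is left unchanged.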

How do I assign a variable to each letter of a string in MySQL?

I am trying to figure out a way of doing an "anagram" function as a stored procedure on MySQL. Let's say I have a database containing all the words in the dictionary - I want to enter a parameter of some letters as a VARCHAR and get back a list of words which make up an anagram of those letters.
I guess what I'm sort of saying is, how do I run an SQL command to say "Select all words which are the same length as the parameter AND contain each of the letters in the parameter".
I have explored the string functions available (http://www.hscripts.com/tutorials/mysql/string-function.php). I'm sure these can be used in conjunction in some way but can't quite get the syntax right when it gets complicated.
I am new to SQL, and it just seems like the String functions available are very limited. Any help would be greatly appreciated :)
You don't; it's not a sensible thing to ask a relational database to do.
However, if someone was forcing me at gunpoint to implement anagram finding using a relational database, I would denormalize it like this:
word | sorted
-----|-------
bar  | abr
bra  | abr
keel | eekl
leek | eekl
Where "sorted" consists of all of the letters in "word", sorted using any rule you like as long as it's a total order. You would use something other than SQL to compute that part.
Then you could find anagrams with something like this:
SELECT w2.word AS anagram
FROM words w1
JOIN words w2 ON w1.sorted=w2.sorted
WHERE w1.word = 'leek'
AND w2.word <> w1.word
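The "sorted" column described above has to be computed outside SQL. A minimal sketch of that step in Python follows; the helper names sorted_key, build_index and anagrams_of are invented for illustration, and the in-memory dictionary merely stands in for the words table.

```python
from collections import defaultdict

def sorted_key(word):
    """Letters of the word in sorted order, e.g. 'leek' -> 'eekl'."""
    return "".join(sorted(word.lower()))

def build_index(words):
    """Group words by their sorted key, mirroring the 'sorted' column."""
    index = defaultdict(list)
    for word in words:
        index[sorted_key(word)].append(word)
    return index

def anagrams_of(word, index):
    """Other words sharing the same sorted key (same idea as the self-join)."""
    return [w for w in index.get(sorted_key(word), []) if w != word]

index = build_index(["bar", "bra", "keel", "leek"])
print(sorted_key("leek"))          # eekl
print(anagrams_of("leek", index))  # ['keel']
```

The same sorted_key function could be used to fill the column once when loading the dictionary, after which the SQL join needs no string processing at all.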
SQL is probably not the right place to do this; you should do it on the front end.
First of all, consider the properties of an anagram: it has the same length as the letters you were given, so you can start by retrieving only the dictionary words of that length.
Instead of creating a variable per letter, consider using an array.
Each letter maps to an index (a=0, b=1, c=2, d=3, etc.). Each time you run into that letter, increase the value for that bucket, so for the word "dad" you'll end up with a structure that looks like this:
arr[0]=1, arr[1]=0, arr[2]=0, arr[3]=2, arr[4]=0 and so on...
Now you can just see if your words match each item in the array.
This is not impossible in SQL either: you can represent that kind of logic in the database, for example with another table that references the dictionary word and holds the array as a tuple; then you can just retrieve all the items with the same values.
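The bucket idea above can be sketched in a few lines of front-end code. This is Python and assumes plain a-z words; the names letter_counts and is_anagram are illustrative only.

```python
def letter_counts(word):
    """Return 26 buckets: index 0 counts 'a', index 3 counts 'd', and so on."""
    counts = [0] * 26
    for ch in word.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1
    return counts

def is_anagram(candidate, letters):
    """Same length and same bucket values means the words are anagrams."""
    return (len(candidate) == len(letters)
            and letter_counts(candidate) == letter_counts(letters))

print(letter_counts("dad")[:5])    # [1, 0, 0, 2, 0]
print(is_anagram("keel", "leek"))  # True
print(is_anagram("keel", "leak"))  # False
```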

Using REGEX to alter field data in a mysql query

I have two databases, both containing phone numbers. I need to find all instances of duplicate phone numbers, but the formats of database 1 vary wildly from the format of database 2.
I'd like to strip out all non-digit characters and just compare the two 10-digit strings to determine if it's a duplicate, something like:
SELECT b.phone as barPhone, sp.phone as SPPhone FROM bars b JOIN single_platform_bars sp ON sp.phone.REGEX = b.phone.REGEX
Is such a thing even possible in a mysql query? If so, how do I go about accomplishing this?
EDIT: Looks like it is, in fact, a thing you can do! Hooray! The following query returned exactly what I needed:
SELECT b.phone, b.id, sp.phone, sp.id
FROM bars b JOIN single_platform_bars sp ON REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(b.phone,' ',''),'-',''),'(',''),')',''),'.','') = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')',''),'.','')
MySQL doesn't support returning the "match" of a regular expression. The MySQL REGEXP function returns a 1 or 0, depending on whether an expression matched a regular expression test or not.
You can use the REPLACE function to replace a specific character, and you can nest those calls, but that would be unwieldy for all "non-digit" characters. For example, to remove spaces, dashes, and open and close parens:
REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')','')
One approach is to create a user-defined function to return just the digits from a string. But if you don't want to create a user-defined function...
This can be done in native MySQL. This approach is a bit unwieldy, but it is workable for strings of "reasonable" length.
SELECT CONCAT(IF(SUBSTR(sp.phone,1,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,1,1),'')
,IF(SUBSTR(sp.phone,2,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,2,1),'')
,IF(SUBSTR(sp.phone,3,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,3,1),'')
,IF(SUBSTR(sp.phone,4,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,4,1),'')
,IF(SUBSTR(sp.phone,5,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,5,1),'')
) AS phone_digits
FROM single_platform_bars sp
To unpack that a bit: we extract a single character from the first position in the string and check whether it's a digit. If it is, we return the character; otherwise we return an empty string. We repeat this for the second, third, etc. characters in the string, then concatenate all of the returned characters and empty strings back into a single string.
Obviously, the expression above checks only the first five characters of the string. You would need to extend it by adding a line for each position you want to check...
And unwieldy expressions like this can be included in a predicate (in a WHERE clause). (I've just shown it in the SELECT list for convenience.)
MySQL doesn't support such string operations natively. You will either need to use a UDF like this, or else create a stored function that iterates over a string parameter concatenating to its return value every digit that it encounters.
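The loop such a stored function would perform is simple. Here it is sketched in Python rather than MySQL's stored-routine language, purely to show the logic; the name digits_only and the sample phone numbers are made up.

```python
def digits_only(phone):
    """Walk the string and keep every character that is an ASCII digit."""
    result = ""
    for ch in phone:
        if "0" <= ch <= "9":
            result += ch
    return result

# Two differently formatted versions of the same (made-up) number.
a = digits_only("(555) 123-4567")
b = digits_only("555.123.4567")
print(a)       # 5551234567
print(a == b)  # True
```

Unlike the nested REPLACE calls, this does not need to know in advance which punctuation characters appear in the data.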

extracting strings from mysql field

Total slow-moment day. I need to extract different parts of a field in a MySQL database, based on which language is selected.
ex:
<!--:en-->Overview<!--:--><!--:es-->Overview<!--:--><!--:fr-->Présentation<!--:--><!--:ar-->نظرة عامة<!--:-->
So if my language is French, for example, I want the part between <!--:fr--> and <!--:-->.
Any ideas?
String processing is not the strongest part of MySQL. But here is one idea:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(column_name, '<!--:fr-->', -1), '<!--:-->', 1) FROM table_name
The easier way would be to use a substring. First find the index of the language marker in the string. After that, find the index of the end marker (<!--:-->) and extract what's in the middle, which is the value you want.
A more elaborate way would be to use regular expressions. The implementation depends on the language you are coding in.
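The index-based approach described above might look like this in application code. It is a sketch in Python; the function name extract_language is hypothetical, and returning None when the language marker is missing is a design choice made here, not something the original answers specify.

```python
def extract_language(text, lang, end_marker="<!--:-->"):
    """Return the text between <!--:lang--> and the next end marker."""
    start_marker = "<!--:" + lang + "-->"
    start = text.find(start_marker)
    if start == -1:
        return None  # language not present in this field
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return text[start:]  # tolerate a missing end marker
    return text[start:end]

field = ("<!--:en-->Overview<!--:--><!--:es-->Overview<!--:-->"
         "<!--:fr-->Présentation<!--:--><!--:ar-->نظرة عامة<!--:-->")
print(extract_language(field, "fr"))  # Présentation
```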

MySQL select statement similar to entered value

I've been searching all over and trying out different approaches but I'm just not getting what I need.
Is there a possibility in MySQL to select a db entry similar to a string value entered in a form?
for example: I have a db with vendor names in it and a customer can enter a vendor name into a search field. Let's say he's looking for adobe but accidentally types 'adope' in the search field. I would now like to select all entries, that are similar to 'adope'. How can I do that?
I've tried ... LIKE '%$vendor%' and all kinds of regexp, but it seems I'm on the wrong track...
Thanks for your help in advance :-)
Cheers
Fred
You can do this with SOUNDEX; check out this tutorial:
Mysql function to soundex match a word in a multi word string
Also check out the official docs:
Returns a soundex string from str. Two strings that sound almost the same should have identical soundex strings. A standard soundex string is four characters long, but the SOUNDEX() function returns an arbitrarily long string. You can use SUBSTRING() on the result to get a standard soundex string.
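To see why 'adope' and 'adobe' end up with the same code, here is a sketch of the standard four-character Soundex in Python. It follows the commonly published rules and is not guaranteed to match the output of MySQL's SOUNDEX() exactly, which (per the documentation quoted above) can return longer strings; the function name soundex is just for illustration.

```python
# Map each consonant to its Soundex digit; vowels, h, w and y have no code.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    """Standard 4-character Soundex code for an ASCII word."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]

print(soundex("adobe"))  # A310
print(soundex("adope"))  # A310
```

Because b and p share the digit 1, the misspelling maps to the same code as the intended vendor name, which is the property the SOUNDEX suggestion relies on.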