extracting strings from mysql field

Having a slow day. I need to extract different parts of a field in a MySQL database, depending on which language is selected.
Example:
<!--:en-->Overview<!--:--><!--:es-->Overview<!--:--><!--:fr-->Présentation<!--:--><!--:ar-->نظرة عامة<!--:-->
So if my language is French, for example, I want the part between <!--:fr--> and <!--:-->.
Any ideas?

String processing is not MySQL's strongest suit, but here is one idea:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(column_name, '<!--:fr-->', -1), '<!--:-->', 1) FROM table_name

The easier way would be to use a substring. First find the index of the language marker in the string. Then find the index of the end marker (<!--:-->) that follows it and extract what's in between, which is the value you want.
A more elaborate way would be to use regular expressions. The implementation depends on the language you are coding in.
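
The two approaches above can be sketched in Python as follows. This is only an illustration, not the answerer's code: the function names extract_by_index and extract_by_regex are made up for this example, and it assumes every language block is closed by the <!--:--> marker.

```python
import re

END_MARKER = "<!--:-->"


def extract_by_index(text, lang):
    """Return the text between <!--:lang--> and the next <!--:-->, or None."""
    start_marker = f"<!--:{lang}-->"
    start = text.find(start_marker)
    if start == -1:
        return None  # language marker not present
    start += len(start_marker)
    end = text.find(END_MARKER, start)
    if end == -1:
        return None  # block was never closed
    return text[start:end]


def extract_by_regex(text, lang):
    """Same result, using a non-greedy regular expression."""
    pattern = re.escape(f"<!--:{lang}-->") + r"(.*?)" + re.escape(END_MARKER)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


# Sample value taken from the question.
field = (
    "<!--:en-->Overview<!--:--><!--:es-->Overview<!--:-->"
    "<!--:fr-->Présentation<!--:--><!--:ar-->نظرة عامة<!--:-->"
)
print(extract_by_index(field, "fr"))
print(extract_by_regex(field, "fr"))
```

Both calls return the French block for the sample value; a language that is missing from the field yields None.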

Related

MYSQL REGEXP with JSON array

I have a JSON string stored in the database and I need to run a SQL COUNT with a WHERE condition on a value inside that JSON string. It has to work on MySQL 5.5.
The only solution I have found that could work is to use the REGEXP function in the SQL query.
Here is my JSON string stored in the custom_data column:
{"language_display":["1","2","3"],"quantity":1500,"meta_display:":["1","2","3"]}
https://regex101.com/r/G8gfzj/1
I now need to write the SQL statement:
SELECT COUNT(..) WHERE custom_data REGEXP '[HELP_HERE]'
The condition I am looking for is that language_display contains 1, 2 or 3, or whatever value I specify when I build the SQL statement.
So far I have come up with this regex, but it does not work:
(?:\"language_display\":\[(?:"1")\])
Here 1 is replaced with the value I am looking for. I could also just search for "1" (with quotes), but that would also match inside the meta_display array, which will hold different values.
I am not good with regex! Any suggestions?

I used the following regex to get matches on your test string
\"language_display\":\[(:?\"[0-9]\"\,)*?\"3\"(:?\,\"[0-9]\")*?\]
https://regex101.com/ is a free online regex tester, and it seems to work great. Start small and build up.
Sorry it doesn't work for you. It is probably failing on the non-greedy '*?'; perhaps try it without the '?'.
Have a look at how to serialize this data, with an eye to serializing the language display fields. See:
How to store a list in a column of a database table
Even if you get your idea working, it will be very slow. You are better off processing each row once and generating something that is more easily searched via SQL. Even a field containing a comma-separated list would be better.
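
The "process each row once" suggestion could look something like this in application code. This is a rough Python sketch, assuming the custom_data values have already been fetched as strings; count_rows_with_language is a hypothetical helper name, and only the first sample row comes from the question (the other two are made up).

```python
import json


def count_rows_with_language(rows, wanted):
    """Count rows whose language_display array contains `wanted`."""
    count = 0
    for raw in rows:
        try:
            data = json.loads(raw)
        except (TypeError, ValueError):
            continue  # skip NULLs and malformed JSON
        if not isinstance(data, dict):
            continue  # skip JSON that is not an object
        # Only language_display is checked, so meta_display never matches.
        if wanted in data.get("language_display", []):
            count += 1
    return count


rows = [
    '{"language_display":["1","2","3"],"quantity":1500,"meta_display:":["1","2","3"]}',
    '{"language_display":["2"],"quantity":10,"meta_display:":["1"]}',  # made-up row
    "not json",  # made-up malformed row
]
print(count_rows_with_language(rows, "1"))
```

The same loop could instead write the decoded list into a separate, more searchable column, as the answer recommends.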

Mysql split string on punctuation

I have a database where office users have created a "poor man's categorization" by prefixing the administrative title field with a category. For instance, you have records like
Applications - When to Apply
Applications- Fees
Admission: GPA requirements
Admissions: Bursar
We are adding a category column, and I want to get (as close as possible) all the unique user-created categories in the title field. From the examples above, Applications, Admission, and Admissions are good enough.
How can I write a query to return the first part of a field, split on the first non-alphanumeric character?

AFAIK, this isn't possible with any of the built-in MySQL functions. There's no function for searching a string for a character outside a set, e.g. the first non-alphanumeric character.
You can write a stored function that does it, by looping over the string and calling SUBSTR(). But you're probably better off searching the net for a user-defined function that can split a string using a regular expression.
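
Outside MySQL, the split is a short regular expression. Here is a minimal Python sketch, assuming the titles have already been selected into a list; title_category is a name invented for this example, and it only treats ASCII letters and digits as alphanumeric.

```python
import re


def title_category(title):
    """Return the leading run of ASCII letters and digits in a title."""
    match = re.match(r"[A-Za-z0-9]+", title.strip())
    return match.group(0) if match else ""


# Sample titles taken from the question.
titles = [
    "Applications - When to Apply",
    "Applications- Fees",
    "Admission: GPA requirements",
    "Admissions: Bursar",
]
categories = sorted({title_category(t) for t in titles})
print(categories)
```

For the sample titles this gives Admission, Admissions and Applications, which matches what the question calls good enough.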

Select area code from phone number entries

I want to select only the area code from a list of column entries populated by phone numbers. This is what I have:
SELECT LEFT(phone, 3) AS areacode, COUNT(phone) AS count
FROM registration
GROUP BY areacode;
The problem is, the entries aren't consistent. So some phone numbers start as +123-456-7899, and others with (123)-456-7899, and others with no symbol at the beginning.
So my question is: is there a way that I can ensure the SELECT LEFT starts at the first integer?
Thanks!

There are some things that SQL is just not meant for, and this is one of them. I would select the phone number into a string and do some pattern matching in your programming language of choice to find the area code.
-OR-
Change your table such that area code is a different column.

Two options (neither of which is SQL):
Select all phone numbers and use a programming language of your choice to programmatically strip out the unnecessary characters.
Clean the input to strip out all unnecessary characters before inserting it into the database.
SQL alone is not the best way to do this; SQL plus programming is.
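
As a rough illustration of the "SQL plus programming" route, the following Python sketch strips every non-digit character and takes the first three digits as the area code. It assumes, as the question's examples suggest, that the area code is always the first three digits of the value (so no country code in front); area_code is a hypothetical helper, and the last sample number is made up.

```python
import re
from collections import Counter


def area_code(phone):
    """Return the first three digits of a phone string, or None if too short."""
    digits = re.sub(r"\D", "", phone)  # drop +, (, ), -, spaces, etc.
    return digits[:3] if len(digits) >= 3 else None


phones = ["+123-456-7899", "(123)-456-7899", "123-456-7899", "987 654 3210"]

# Equivalent of GROUP BY areacode with COUNT(phone).
counts = Counter(area_code(p) for p in phones)
print(counts)
```

The same area_code function could be applied once on insert to fill a dedicated column, which is the other option suggested above.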

There actually is a way to do this in SQL that was intentionally designed for this exact purpose.
SELECT SUBSTRING(office_phone_number, 1, 3) FROM contact;
Of course, this depends on how the number is stored in the table. If parentheses or other leading symbols are present, your starting position would be off.
Here is more information:
MySQL substring function

How to order text that contains double colons (::)

To order by name I'm using 'order by name'
But the names contain double colons : '::'
How can I order by the text that occurs subsequent to the double colons ?
So :
aaaa::bbbb
aaaa::aaaa
aaaa::1234
aaaa::a1234
Will be ordered :
aaaa::1234
aaaa::aaaa
aaaa::a1234
aaaa::bbbb

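Order by the substring and use locate to find where it starts (locate returns the position of the first colon, so the text after '::' begins two characters later):
order by substring(name, locate('::', name) + 2, 30)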
It'll decrease performance since no index will be used.
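
For comparison, if the rows are sorted in application code instead, the same "text after the double colon" key can be expressed like this. This is a small Python sketch; after_double_colon is a made-up helper, and it uses plain code-point string comparison, which may not match a particular MySQL collation.

```python
def after_double_colon(name):
    """Return the part of name after the first '::' (empty if there is none)."""
    return name.partition("::")[2]


# Sample names taken from the question.
names = ["aaaa::bbbb", "aaaa::aaaa", "aaaa::1234", "aaaa::a1234"]
ordered = sorted(names, key=after_double_colon)
print(ordered)
```

Note that under this comparison a1234 sorts before aaaa, because the digit 1 compares lower than the letter a.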

You would have to create a new field in MySQL and insert the second part of your text into it. Sorting relies on indexes and on algorithms such as divide and conquer.
As such, it will not work well for sorting on a specific portion of a string, and even if you did manage to 'fake' a way of doing it, performance would be terrible due to the lack of an index.
Sorry, I realise this probably isn't the answer you're looking for, but I'm afraid the best way is the slightly longer way. At least you can then do it at lightning-fast speeds if you add an index to the new field :)

You must split the text into two columns and order by the latter one. You can either split and join the columns in application code or use views and stored procedures to make it look like one column to a database client.

About your sorting: according to ASCII values, digits come before letters, so aaaa::1234 should come first.
You can retrieve the values and sort them in PHP with natsort:
<?php
$arr = array("aaaa::bbbb","aaaa::aaaa","aaaa::1234","aaaa::a1234");
$sec=$arr;
natsort($sec);
print_r ($sec);
?>

You could try the following approach:
Get all records where the data after :: is entirely alphabetic
UNION
Get all records where the data after :: is entirely numeric

MySQL: Find and Replace Between Certain Characters

In field post_content I have a string like this in nearly 800 rows:
http://somesite.com/">This is some site</a>
I need to remove everything from "> onwards so that it leaves just the URL. I can't do a straight find and replace because the text is unique.
Any clues? This is really my first foray into MySQL database modifications but I did do an extensive search before posting here.
Thanks,
~Kyle~

From this site: http://www.regular-expressions.info/mysql.html
LIB_MYSQLUDF_PREG
If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you'll need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regex support in any way. It merely makes the following additional functions available:
Here it comes...
PREG_CAPTURE extracts a regex match from a string. PREG_POSITION returns the position at which a regular expression matches a string. PREG_REPLACE performs a search-and-replace on a string. PREG_RLIKE tests whether a regex matches a string.
Sounds like exactly what you're looking for.
All these functions take a regular expression as their first parameter. This regular expression must be formatted like a Perl regular expression operator. E.g. to test if regex matches the subject case insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string.

See this post: How to do a regular expression replace in MySQL?
Either that, or you could just write a script in any language that goes through each record, does a regex replacement and then updates the field. For more info on regex, see here: http://www.regular-expressions.info/reference.html
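
A script along those lines might look like the following Python sketch. The database read and write are left out, so it only shows the replacement step. It assumes the first "> in each value marks the end of the URL; strip_after_url is a hypothetical name.

```python
import re


def strip_after_url(content):
    """Drop everything from the first '">' onward, leaving just the URL."""
    return re.sub(r'">.*', "", content, count=1, flags=re.DOTALL)


# Sample value taken from the question.
row = 'http://somesite.com/">This is some site</a>'
print(strip_after_url(row))
```

Values that do not contain "> are returned unchanged, so running it over all rows should be safe under that assumption.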

There are a number of options. One might be to use SUBSTRING_INDEX():
UPDATE
table
SET field = SUBSTRING_INDEX( field, '">', 1 )

It's possible - there is a syntax for User Defined Functions which would let you pass in a regular expression pattern that matches the link and strips everything else.
However, this is quite complicated for somebody new to MySQL, and from your question, this sounds like a one-off. In which case - why not just use Excel and then reimport the data?

Great stuff!
All seems doable with a little bit of time and self education.
In the end, I exported that table as a CSV in Sequel Pro and did some nifty find and replace work in Coda. Not as sophisticated as your suggestions, but it worked.
Thanks again,
~Kyle~

We Keep Coding
