Getting the center entry of substring_index - mysql

I have a SQL query where I'm calling SUBSTRING_INDEX on a comma delimited string. In all cases the string has two commas. Is there any way to get at the center element only? That is to say, besides doing something like this:
SUBSTRING_INDEX(SUBSTRING_INDEX(foo, ',', 2), ',', -1)

MySQL versions before 8.0 have no built-in regular expression substitution, but you can take a look at this third-party library for MySQL:
https://launchpad.net/mysql-udf-regexp
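For reference, the nested SUBSTRING_INDEX form from the question can be traced on a literal value. This is only a small sketch; the sample string 'a,b,c' is an assumption standing in for foo:

```sql
-- Inner call keeps everything before the 2nd comma: 'a,b'
-- Outer call keeps everything after the last comma of that result: 'b'
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c', ',', 2), ',', -1) AS middle;
-- middle: b
```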

Related

Migrate comma separated string to json array

I have an old database where some columns store comma-separated strings like this: technician,director,website designer
I want to convert them to a JSON array, so I can use the MySQL JSON array type and the methods associated with it.
So basically I am looking for a method to convert technician,director,website designer to ["technician","director","website designer"] in SQL.
The length of the list is arbitrary.
The biggest struggle I am having is how to apply a SQL function to each element of the comma-separated string (so that I can, for example, run JSON_QUOTE() on each element); adding the brackets is just a simple CONCAT.
The solution should be for MySQL 5.7.
You can use REPLACE to get the expected string:
SELECT CONCAT('["', REPLACE('technician,director,website designer', ',', '","'), '"]')
-- ["technician","director","website designer"]
Using JSON_VALID you can check if the result of the conversion is a valid JSON value:
SELECT JSON_VALID(CONCAT('["', REPLACE('technician,director,website designer', ',', '","'), '"]'))
-- 1
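Before migrating a whole column, it may help to list the rows for which this simple REPLACE conversion does not produce valid JSON, for example values that contain a double quote. The following is a sketch only; the table name staff and the columns id and roles are hypothetical:

```sql
-- Hypothetical table `staff` with a comma-separated column `roles`.
-- Lists rows whose converted value is NOT valid JSON, so they can be fixed by hand.
-- Rows where roles IS NULL are not listed, because JSON_VALID(NULL) is NULL.
SELECT id,
       roles,
       CONCAT('["', REPLACE(roles, ',', '","'), '"]') AS roles_json
FROM staff
WHERE JSON_VALID(CONCAT('["', REPLACE(roles, ',', '","'), '"]')) = 0;
```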

How to pass multiple delimiters in substring_index

I want to query the string between https:// or http:// and the first delimiter character that comes after it. For example, if the field contains:
https://google.com/en/
https://www.yahoo.com?en/
I want to get:
google.com
www.yahoo.com
My initial query that will capture the / only contains two substring_index as follows:
SELECT substring_index(substring_index(mycol,'/',3),'://',-1)
FROM mytable;
Now I have found that the URLs may contain multiple delimiters. I want my statement to handle all the possible delimiters, which are (each one is a separate character):
:/?#[]#!$&'()*+,;=
How can I do this in my statement? I tried this solution, but the command failed with a syntax error even though I am sure I followed it correctly. Can anyone help me construct the query so that it handles all the delimiter characters listed above?
I use MySQL Workbench 6.3 on Ubuntu 18.04.
EDIT:
Some corrections made in the first example of URLs.
First, note that https://www.yahoo.com?en/ seems like an unlikely URL, because it has a path separator contained inside the query string. In any case, if you are using MySQL 8+, then consider using its regex functionality. The REGEXP_REPLACE function can be helpful here, using the following pattern:
https?://([A-Za-z_0-9.-]+).*
Sample query:
WITH yourTable AS (
SELECT 'https://www.yahoo.com?en/' AS url UNION ALL
SELECT 'no match'
)
SELECT
REGEXP_REPLACE(url, 'https?://([A-Za-z_0-9.-]+).*', '$1') AS url
FROM yourTable
WHERE url REGEXP 'https?://[^/]+';
The term $1 refers to the first capture group in the regex pattern. An explicit capture group is denoted by a quantity in parentheses. In this case, the capture group is marked below:
https?://([A-Za-z_0-9.-]+).*
         ^^^^^^^^^^^^^^^^^
That is, the capture group is the host portion of the URL (the domain plus any subdomains).
In MySQL 8+, this should work:
SELECT regexp_replace(regexp_substr(mycol, '://[a-zA-Z0-9_.]+[/:?]'), '[^a-zA-Z0-9_.]', '')
FROM (SELECT 'https://google.com/en' as mycol union all
SELECT 'https://www.yahoo.com?en'
) x
In older versions, this is much more challenging because there is no way to search for a character class.
One brute force method is:
select (case when substring_index(mycol, '://', -1) like '%/%'
then substring_index(substring_index(mycol, '://', -1), '/', 1)
when substring_index(mycol, '://', -1) like '%?%'
then substring_index(substring_index(mycol, '://', -1), '?', 1)
. . . -- and so on for each character
else substring_index(mycol, '://', -1)
end) as what_you_want
The [a-zA-Z0-9_.] is intended to be something like the valid character class for your domain names.
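For versions without the regex functions, another option is to nest one SUBSTRING_INDEX(..., 1) call per delimiter, so the string is cut at whichever delimiter comes first. The sketch below handles only four delimiters (/ ? # :); extending it to the full list from the question is an assumption left to the reader, and the derived table x just supplies sample rows:

```sql
-- Strip the scheme, then cut at the first '/', '?', '#' or ':' (one nested call each).
SELECT SUBSTRING_INDEX(
         SUBSTRING_INDEX(
           SUBSTRING_INDEX(
             SUBSTRING_INDEX(
               SUBSTRING_INDEX(mycol, '://', -1),
             '/', 1),
           '?', 1),
         '#', 1),
       ':', 1) AS host
FROM (SELECT 'https://google.com/en/' AS mycol UNION ALL
      SELECT 'https://www.yahoo.com?en/'
     ) x;
-- host: google.com
-- host: www.yahoo.com
```

Because ':' is treated as a delimiter, a port number such as :8080 is dropped as well.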

MySQL substring between two strings

I need a hand to solve a problem with my column field.
I need to extract the string in between these two different "patterns" of strings, for example:
[...string] contract= 1234567890123350566076070666 issued= [string...]
I want to extract the string in between 'contract=' and 'issued='
At the present moment I'm using
SELECT substring(substring_index(licence_key,'contract=',-1),1,40) FROM table
The problem is that the string in between is not always 40 characters long, so it has no fixed length, and neither does the data that comes before and after it. The data is volatile.
Do you know how I can handle that?
Just use substring_index() twice:
SELECT substring_index(substring_index(licence_key, 'contract=', -1),
'issued=', 1)
FROM table;
If a marker string is not found, SUBSTRING_INDEX returns its whole input unchanged.
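As a quick check, the two nested calls can be run against a literal. This is a sketch; the sample text around the markers is made up, and TRIM is added here only to drop the spaces next to the value:

```sql
-- Inner call: everything after 'contract='; outer call: everything before 'issued='.
SELECT TRIM(
         SUBSTRING_INDEX(
           SUBSTRING_INDEX(s, 'contract=', -1),
         'issued=', 1)
       ) AS contract_no
FROM (SELECT 'foo contract= 1234567890123350566076070666 issued= bar' AS s) x;
-- contract_no: 1234567890123350566076070666
```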
If you want to replace the text instead, you can do it like this:
UPDATE questions set question= REPLACE(question, '<xml></xml>', '') WHERE question like '%<xml>%';
UPDATE questions set question= REPLACE(question, substring_index(substring_index(question, '<xml>', -1), '</xml>', 1), '') WHERE question like '%<xml>%';

Can not use substr function in MySQL to get left part of the string using last occurrence of delimiter

I have a question to warm up your minds :)
I need to extract a string A from string B where B is stored in MySQL table.
This is string B:
#/schema#/CT[Items]#/sequence[0]#/element[item]#/CT[]#/sequence[0]#/element[productImage]
And this is A:
#/schema#/CT[Items]#/sequence[0]#/element[item]#/CT[]#/sequence[0]
The delimiter in my case is '#' and I need to remove it along with the following characters '/element[productImage]'.
I tried different functions like SUBSTR(str, pos, len), POSITION(substr IN str), and REVERSE(str) but cannot solve the problem.
Note that the index of the last occurrence of '#' is unknown, and I cannot find a way to locate it (like the lastIndexOf() method in Java).
I believe that there is a way to do it by reversing the whole string first, cutting the unnecessary part then reversing again to get the desired result.
Can anyone help, please?
Try this:
LEFT(str, CHAR_LENGTH(str) - LOCATE('#', REVERSE(str)))
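Applied to string B from the question, the expression can be tried as below. This is a sketch that uses a user variable @b (an assumption for illustration) instead of a table column:

```sql
SET @b = '#/schema#/CT[Items]#/sequence[0]#/element[item]#/CT[]#/sequence[0]#/element[productImage]';

-- LOCATE on the reversed string gives the distance of the last '#' from the end;
-- LEFT then keeps everything before that '#'.
SELECT LEFT(@b, CHAR_LENGTH(@b) - LOCATE('#', REVERSE(@b))) AS a;
-- a: #/schema#/CT[Items]#/sequence[0]#/element[item]#/CT[]#/sequence[0]
```

If the string contains no '#', LOCATE returns 0 and the whole string comes back unchanged.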
You could count the number of occurrences with:
CHAR_LENGTH(stringB)-CHAR_LENGTH(REPLACE(stringB, '#', ''))
and then you can just use SUBSTRING_INDEX:
SELECT
SUBSTRING_INDEX(
stringB,
'#',
CHAR_LENGTH(stringB)-CHAR_LENGTH(REPLACE(stringB, '#', '')))
This looks like it works
SELECT TRIM(TRAILING CONCAT("#", SUBSTRING_INDEX(str, '#', -1)) FROM str)
This part :
SUBSTRING_INDEX(str, '#', -1)
grabs everything after the last token occurrence without the token
this bit :
TRIM(TRAILING 'x' FROM str)
returns the string with any trailing occurrences of 'x' (or whatever string is provided) removed from the target string.
But since SUBSTRING_INDEX returns the substring without the delimiter token, you need to trim off the trailing token as well, for which we use the CONCAT to add it back to the front of the "bad" sub-string.

Replacing using wildcards or regex in MySQL?

I have values like this in my column: /1/0/101.00_1234.jpg
Now I want to replace the /1/0 with something else. Problem is, it can differ from row to row. It can be /h/a as well. How could I do that without any additional tools?
Thanks
I'd try the SUBSTRING_INDEX function:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substring-index
for example:
SUBSTRING_INDEX(myfield, '/', -3)
Most likely you just need to replace the first 4 characters, so use the SUBSTRING and CONCAT functions:
CONCAT('/a/b', SUBSTRING(column_name, 5))
Write a stored function which searches for the rightmost "/" in the field content and deletes all characters before that position. Then use this stored function to update the field.
Or is this already an "additional tool"? In that case, use an inline expression like this (not tested):
RIGHT(fieldname, LOCATE('/', REVERSE(fieldname)))
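If the goal is to keep only the file name and put a new prefix in front of it, one possible sketch combines SUBSTRING_INDEX with CONCAT. The new prefix '/a/b/' and the second sample row are assumptions made up for illustration:

```sql
-- SUBSTRING_INDEX(p, '/', -1) keeps everything after the last '/', i.e. the file name,
-- regardless of what the leading directories are.
SELECT CONCAT('/a/b/', SUBSTRING_INDEX(p, '/', -1)) AS new_path
FROM (SELECT '/1/0/101.00_1234.jpg' AS p UNION ALL
      SELECT '/h/a/202.50_9876.jpg'
     ) x;
-- new_path: /a/b/101.00_1234.jpg
-- new_path: /a/b/202.50_9876.jpg
```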