Split values in SQL after a specific character [duplicate] - mysql

This question already has answers here:
SQL split values to multiple rows
I have a table with one column :
Val A
Val B
Val C,Val B,Val D
Val A,Val F,Val A
My question is: how can I split the values after a specific character, in this case ",", so that I have only one value per row, like this:
Val A
Val B
Val C
Val B
Val D
Val A
Val F
Val A
I don't know if it's important, but I'm using MySQL Workbench.
Thanks in advance.

You can use substring_index(). One method is:
select substring_index(col, ',', 1)
from t
union all
select substring_index(substring_index(col, ',', 2), ',', -1)
from t
where col like '%,%'
union all
select substring_index(substring_index(col, ',', 3), ',', -1)
from t
where col like '%,%,%';
You need to add a separate subquery up to the maximum number of elements in any row.
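To see why each branch nests two calls, here is a rough Python model of how SUBSTRING_INDEX() behaves for a non-empty delimiter. It is only a sketch: the helper names substring_index and nth_element are made up for illustration, and returning an empty string for a zero count is an assumption of this sketch.

```python
def substring_index(s, delim, count):
    """Rough model of MySQL's SUBSTRING_INDEX for a non-empty delimiter."""
    if count == 0:
        return ""  # assumption of this sketch: a zero count yields an empty string
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Everything to the right of the |count|-th delimiter, counted from the end.
    return delim.join(parts[count:])


def nth_element(s, delim, n):
    """Element n (1-based): keep the first n elements, then take the last of those."""
    return substring_index(substring_index(s, delim, n), delim, -1)


row = "Val C,Val B,Val D"
first_three = [nth_element(row, ",", n) for n in (1, 2, 3)]
print(first_three)

# A row with fewer elements just repeats its last element, which is why
# each UNION ALL branch filters on the number of delimiters with LIKE.
repeated = nth_element("Val A", ",", 3)
print(repeated)
```

In this model, asking for element 3 of a one-element row returns that single element again, so without the LIKE filters the query would emit duplicates.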
EDIT:
I don't really like the answers in the duplicate. I would recommend a recursive CTE:
with recursive cte as (
select col as part, concat(col, ',') as rest, 0 as lev
from t
union all
select substring_index(rest, ',', 1),
substr(rest, instr(rest, ',') + 1),
lev + 1
from cte
where rest <> '' and lev < 5
)
select part
from cte
where lev > 0;
Here is a db<>fiddle.
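If you want to try the recursive idea without a MySQL server, the following sketch runs the same recursion in SQLite through Python's sqlite3 module. This is an adaptation, not the query above: SQLite has no SUBSTRING_INDEX, so the head of rest is cut with substr/instr instead, and the table and sample rows are simply the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t (col) VALUES (?)",
    [("Val A",), ("Val B",), ("Val C,Val B,Val D",), ("Val A,Val F,Val A",)],
)

# Each step peels the text before the first comma off `rest`.
# `lev < 5` caps the recursion at five elements per row, as in the answer.
query = """
WITH RECURSIVE cte(part, rest, lev) AS (
    SELECT col, col || ',', 0 FROM t
    UNION ALL
    SELECT substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1),
           lev + 1
    FROM cte
    WHERE rest <> '' AND lev < 5
)
SELECT part FROM cte WHERE lev > 0
"""

# The row order of a recursive CTE is not guaranteed, so sort for display.
parts = sorted(row[0] for row in conn.execute(query))
print(parts)
```

Rows with more than five elements would be cut off by the lev limit, so raise it if your data needs more.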

Related

How to select from database using explode

I want to export data from my SQL database.
I simply use:
SELECT `id`,`tags` FROM `posts`
This query gives me these results:
(1, 'handshake,ssl,windows'),
(2, 'office,word,windows'),
(3, 'site')
I want results in this form:
(1, 'handshake'),
(1, 'ssl'),
(1, 'windows'),
(2, 'office'),
(2, 'word'),
(2, 'windows'),
(3, 'site')
How can I write a query that gives me these results?
Thank you and sorry for my poor English.
If you are using SQL Server (2016 or later), you can apply the STRING_SPLIT function:
SELECT id, value
FROM posts
CROSS APPLY STRING_SPLIT(tags, ',')
Check this out:
SQL Fiddle example
After a lot of searching and trying, I finally found the solution:
SELECT
DISTINCT id , SUBSTRING_INDEX(SUBSTRING_INDEX(tags, ',', n.digit+1), ',', -1) val
FROM
posts
INNER JOIN
(SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6) n
ON LENGTH(REPLACE(tags, ',' , '')) <= LENGTH(tags)-n.digit;
For a max of three words, the code below can be used. If you want more words then you just add more lines. The method may not be fully automated, but it works.
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(tags, ',', 1), ',', -1) FROM tabela
UNION
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(tags, ',', 2), ',', -1) FROM tabela
UNION
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(tags, ',', 3), ',', -1) FROM tabela
ORDER BY id;
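If the split can happen in the application instead of in SQL (which is what explode would do in PHP), it is only a few lines. A minimal Python sketch, assuming the rows have already been fetched as (id, tags) tuples; the function name explode_tags is made up for illustration:

```python
# Rows as returned by: SELECT `id`,`tags` FROM `posts`
rows = [
    (1, "handshake,ssl,windows"),
    (2, "office,word,windows"),
    (3, "site"),
]


def explode_tags(rows, delimiter=","):
    """Yield one (id, tag) pair per delimited tag, skipping empty pieces."""
    for post_id, tags in rows:
        for tag in tags.split(delimiter):
            if tag:
                yield post_id, tag


pairs = list(explode_tags(rows))
for pair in pairs:
    print(pair)
```

Unlike the UNION version, this keeps repeated tags within one post and has no limit on the number of tags per row.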

How to fix ORDER BY with item ids?

I have a table containing item IDs.
Some examples are:
1
1:3
2:1
2:2
3
3:1
12:2
21:2
I want them to be sorted in the order listed above.
MySQL sorts them in the following order:
1
1:3
12:2
2:1
2:2
21:2
3
3:1
Does anyone have any idea how to fix this problem?
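Using SUBSTRING_INDEX() it is possible:
SELECT *
FROM TestTable
ORDER BY CAST(SUBSTRING_INDEX(ColumnVal, ':', 1) AS UNSIGNED),
-- number after the colon, or 0 when there is no colon
CASE WHEN ColumnVal LIKE '%:%'
THEN CAST(SUBSTRING_INDEX(ColumnVal, ':', -1) AS UNSIGNED)
ELSE 0 END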
Demo on db<>fiddle
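The ordering being asked for compares the number before the colon first and the number after it second, both as integers rather than as text. As a quick way to check what the expected result looks like, here is a small Python sketch of that sort key; the name item_key is made up for illustration, and it assumes every part of an id is a plain non-negative integer.

```python
def item_key(item_id):
    """Turn '12:2' into (12, 2) and '3' into (3,) so parts compare numerically."""
    return tuple(int(part) for part in item_id.split(":"))


# The order MySQL produced with a plain string sort.
ids = ["1", "1:3", "12:2", "2:1", "2:2", "21:2", "3", "3:1"]

ordered = sorted(ids, key=item_key)
print(ordered)
```

An id without a colon sorts before the same number with a suffix because a shorter tuple compares lower than a longer one with the same prefix.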
Another way, using POSITION():
SELECT *
FROM TestTable
ORDER BY CAST(SUBSTRING_INDEX(ColumnVal, ':', 2) AS UNSIGNED),
POSITION(":" IN ColumnVal),
SUBSTRING(ColumnVal, POSITION(":" IN ColumnVal) + 1, LENGTH(ColumnVal))
Demo on db<>fiddle
SELECT _table.*
# , RPAD(SUBSTRING_INDEX(_table._col, ':', 1), 3, 0)
FROM
(
SELECT
CAST('1' AS CHAR) AS _col
UNION
SELECT
'1:3'
UNION
SELECT
'2:1'
UNION
SELECT
'2:2'
UNION
SELECT
'3'
UNION
SELECT
'3:1'
UNION
SELECT
'12:2'
UNION
SELECT
'21:2') _table
ORDER BY LPAD(SUBSTRING_INDEX(_table._col, ':', 1), 3, '0'),
LPAD(SUBSTRING_INDEX(_table._col, ':', 2), 5, '0')
;
You may also use ABS() or CAST() if sorting by the leading number alone is enough for you:
SELECT * FROM table ORDER BY ABS(column);
SELECT * FROM table ORDER BY CAST(column as DECIMAL);

SQL query with subquery

My data is like this:
data1_qqq_no_abc_ccc
data1_qqq_abc_ccc
data2_qqq_no_abc_ccc
data2_qqq_abc_ccc
data3_qqq_no_abc_ccc
data4_qqq_no_abc_ccc
data4_qqq_abc_ccc
...
Now I want to get the fields where the data has the substring _no_abc_ccc but has no matching row with _abc_ccc. In the above example, that is data3.
I am trying to create a query for it.
A rough attempt is:
select SUBSTRING_INDEX(name, 'abc', 1)
from table1
where SUBSTRING_INDEX(name, 'abc', 1) not LIKE "%no"
and NOT IN (select SUBSTRING_INDEX(name, '_no_abc', 1)
from table
where name LIKE "%no_abc");
Something like this?
create table t (
col text
);
insert into t
values
('data1_qqq_no_abc_ccc'),
('data1_qqq_abc_ccc'),
('data2_qqq_no_abc_ccc'),
('data2_qqq_abc_ccc'),
('data3_qqq_no_abc_ccc'),
('data4_qqq_no_abc_ccc'),
('data4_qqq_abc_ccc');
select f from (
select SUBSTRING_INDEX(col, '_', 1) as f, SUBSTRING_INDEX(col, '_', -3) as s from t
) tt
group by f
having
count(case when s = 'no_abc_ccc' then 1 end) > 0
and
count(case when s like '%qqq_abc%' then 1 end) = 0
demo
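The grouping logic can be checked outside the database too. Here is a rough Python equivalent of the GROUP BY / HAVING idea, using the sample values from the question; the variable names are made up for illustration and the suffix test is simplified to an exact match on qqq_abc_ccc.

```python
from collections import defaultdict

names = [
    "data1_qqq_no_abc_ccc",
    "data1_qqq_abc_ccc",
    "data2_qqq_no_abc_ccc",
    "data2_qqq_abc_ccc",
    "data3_qqq_no_abc_ccc",
    "data4_qqq_no_abc_ccc",
    "data4_qqq_abc_ccc",
]

# Group the last three '_'-separated pieces by the first piece,
# like SUBSTRING_INDEX(col, '_', -3) grouped by SUBSTRING_INDEX(col, '_', 1).
suffixes = defaultdict(set)
for name in names:
    pieces = name.split("_")
    suffixes[pieces[0]].add("_".join(pieces[-3:]))

# Keep prefixes that have the "no" variant and lack the plain variant.
only_no = sorted(
    prefix
    for prefix, seen in suffixes.items()
    if "no_abc_ccc" in seen and "qqq_abc_ccc" not in seen
)
print(only_no)
```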

Extracting matches of REGEXP in SQL [duplicate]

I would like to have a mysql query like this:
select <second word in text> word, count(*) from table group by word;
All the regex examples in mysql are used to query if the text matches the expression, but not to extract text out of an expression. Is there such a syntax?
The following is a proposed solution for the OP's specific problem (extracting the 2nd word of a string), but it should be noted that, as mc0e's answer states, actually extracting regex matches is not supported out-of-the-box in MySQL. If you really need this, then your choices are basically to 1) do it in post-processing on the client, or 2) install a MySQL extension to support it.
BenWells has it almost correct. Working from his code, here's a slightly adjusted version:
SUBSTRING(
  sentence,
  LOCATE(' ', sentence) + CHAR_LENGTH(' '),
  LOCATE(' ', sentence, LOCATE(' ', sentence) + 1) - ( LOCATE(' ', sentence) + CHAR_LENGTH(' ') )
)
As a working example, I used:
SELECT SUBSTRING(
  sentence,
  LOCATE(' ', sentence) + CHAR_LENGTH(' '),
  LOCATE(' ', sentence, LOCATE(' ', sentence) + 1) - ( LOCATE(' ', sentence) + CHAR_LENGTH(' ') )
) as string
FROM (SELECT 'THIS IS A TEST' AS sentence) temp
This successfully extracts the word IS
Shorter option to extract the second word in a sentence:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('THIS IS A TEST', ' ', 2), ' ', -1) as FoundText
MySQL docs for SUBSTRING_INDEX
According to http://dev.mysql.com/ the SUBSTRING function uses start position then the length so surely the function for the second word would be:
SUBSTRING(sentence,LOCATE(' ',sentence),(LOCATE(' ',LOCATE(' ',sentence))-LOCATE(' ',sentence)))
No, there isn't a syntax for extracting text using regular expressions. You have to use the ordinary string manipulation functions.
Alternatively select the entire value from the database (or the first n characters if you are worried about too much data transfer) and then use a regular expression on the client.
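For the client-side route, a minimal Python sketch might look like the following. It assumes the text values have already been selected into a list; apart from 'THIS IS A TEST', the sample rows are hypothetical, and "word" is taken to mean any run of non-whitespace characters.

```python
import re
from collections import Counter


def second_word(text):
    """Return the second whitespace-separated word, or None if there is none."""
    words = re.findall(r"\S+", text)
    return words[1] if len(words) > 1 else None


# Hypothetical values as they might come back from: SELECT text FROM table
rows = ["THIS IS A TEST", "THAT IS FINE", "ONE TWO", "SINGLE"]

# Equivalent of: SELECT <second word> word, COUNT(*) ... GROUP BY word
counts = Counter(w for w in map(second_word, rows) if w is not None)
for word, n in counts.most_common():
    print(word, n)
```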
As others have said, mysql does not provide regex tools for extracting sub-strings. That's not to say you can't have them though if you're prepared to extend mysql with user-defined functions:
https://github.com/mysqludf/lib_mysqludf_preg
That may not be much help if you want to distribute your software, being an impediment to installing your software, but for an in-house solution it may be appropriate.
I used Brendan Bullen's answer as a starting point for a similar issue I had, which was to retrieve the value of a specific field in a JSON string. However, as I commented on his answer, it is not entirely accurate. If your left boundary isn't just a space like in the original question, then the discrepancy increases.
Corrected solution:
SUBSTRING(
sentence,
LOCATE(' ', sentence) + 1,
LOCATE(' ', sentence, (LOCATE(' ', sentence) + 1)) - LOCATE(' ', sentence) - 1
)
The two differences are the +1 in the SUBSTRING index parameter and the -1 in the length parameter.
For a more general solution to "find the first occurrence of a string between two provided boundaries":
SUBSTRING(
haystack,
LOCATE('<leftBoundary>', haystack) + CHAR_LENGTH('<leftBoundary>'),
LOCATE(
'<rightBoundary>',
haystack,
LOCATE('<leftBoundary>', haystack) + CHAR_LENGTH('<leftBoundary>')
)
- (LOCATE('<leftBoundary>', haystack) + CHAR_LENGTH('<leftBoundary>'))
)
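To sanity-check the boundary arithmetic, here is the same idea as a small Python sketch. The function name between is made up for illustration, and returning an empty string when either boundary is missing is a choice made in this sketch, not a claim about what the SQL expression returns in that case.

```python
def between(haystack, left, right):
    """First substring found after `left` and before the next `right`."""
    start = haystack.find(left)
    if start == -1:
        return ""  # left boundary missing
    start += len(left)  # skip past the whole left boundary, not just one character
    end = haystack.find(right, start)
    if end == -1:
        return ""  # right boundary missing
    return haystack[start:end]


print(between("THIS IS A TEST", " ", " "))
print(between('{"a": "x", "b": "y"}', '"b": "', '"'))
```

The second call shows the JSON-style case from the answer, where the left boundary is longer than one character.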
I don't think such a thing is possible. You can use SUBSTRING function to extract the part you want.
My home-grown regular expression replace function can be used for this.
Demo
See this DB-Fiddle demo, which returns the second word ("I") from a famous sonnet and the number of occurrences of it (1).
SQL
Assuming MySQL 8 or later is being used (to allow use of a Common Table Expression), the following will return the second word and the number of occurrences of it:
WITH cte AS (
SELECT digits.idx,
SUBSTRING_INDEX(SUBSTRING_INDEX(words, '~', digits.idx + 1), '~', -1) word
FROM
(SELECT reg_replace(UPPER(txt),
'[^''’a-zA-Z-]+',
'~',
TRUE,
1,
0) AS words
FROM tbl) delimited
INNER JOIN
(SELECT @row := @row + 1 as idx FROM
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t1,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t2,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t3,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t4,
(SELECT @row := -1) t5) digits
ON LENGTH(REPLACE(words, '~' , '')) <= LENGTH(words) - digits.idx)
SELECT c.word,
subq.occurrences
FROM cte c
LEFT JOIN (
SELECT word,
COUNT(*) AS occurrences
FROM cte
GROUP BY word
) subq
ON c.word = subq.word
WHERE idx = 1; /* idx is zero-based so 1 here gets the second word */
Explanation
A few tricks are used in the SQL above and some attribution is needed. Firstly, the regular expression replacer is used to replace all continuous blocks of non-word characters, each being replaced by a single tilde (~) character. Note: a different character could be chosen instead if there is any possibility of a tilde appearing in the text.
The technique from this answer is then used for transforming a string with delimited values into separate row values. It's combined with the clever technique from this answer for generating a table consisting of a sequence of incrementing numbers: 0 to 9,999 in this case.
The field's value is:
"- DE-HEB 20% - DTopTen 1.2%"
SELECT ....
SUBSTRING_INDEX(SUBSTRING_INDEX(DesctosAplicados, 'DE-HEB ', -1), '-', 1) AS `DE-HEB`,
SUBSTRING_INDEX(SUBSTRING_INDEX(DesctosAplicados, 'DTopTen ', -1), '-', 1) AS DTopTen
FROM TABLA
Result is:
DE-HEB  DTopTen
20%     1.2%

mysql select substrings and group them by column

I am trying to split the data in one of the tables in my MySQL database.
The column contains data like this:
de:"Sweatjacke*";en:"jacket*";pl:"bluza*";
de:"*";en:"*";pl:"bluza*";
fr:"*";de:"*";en:"*";pl:"dres junior*";cz:"*";
pl:"bluza";
I am trying to divide the translations into separate columns. I already came up with a solution using this statement:
SELECT
SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', 1), ';', -1) as tr1,
SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', 2), ';', -1) as tr2,
SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', 3), ';', -1) as tr3,
SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', 4), ';', -1) as tr4,
SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', 5), ';', -1) as tr5
FROM product;
but that results in:
tr1               tr2               tr3           tr4           tr5
fr:"*"            de:"*"            en:"*"        pl:"bluza*"   cz:"*"
fr:"*"            de:"Sweatjacke*"  en:"jacket*"  pl:"bluza*"   cz:"*"
de:"Sweatjacke*"  en:"jacket*"      pl:"bluza*"
I want the results grouped by translation type (pl/de/en), so that each column holds only one type of translation. For example column1 = pl:, column2 = en:, etc.
Has anyone come across a similar problem and knows a way to solve it?
You need to unpivot the data, then select the first and second part of each value and then re-aggregate it.
However, a better form for the data is really to have language/translation. The following produces this:
select substring_index(tr, ':', 1) as l, substring_index(tr, ':', -1) as t, name
from (select SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', n.n), ';', -1) as tr, n, name
      from product p cross join
           (select 1 as n union all select 2 union all select 3 union all select 4 union all
            select 5
           ) n
     ) n
You would probably want an "id" column or "word" column to identify each row, rather than the name column.
You can now pivot this result to get what you want:
select max(case when l = 'en' then t end) as en,
       max(case when l = 'fr' then t end) as fr,
       max(case when l = 'de' then t end) as de,
       max(case when l = 'pl' then t end) as pl,
       max(case when l = 'cz' then t end) as cz
from (select substring_index(tr, ':', 1) as l, substring_index(tr, ':', -1) as t, name
      from (select SUBSTRING_INDEX(SUBSTRING_INDEX(name, ';', n.n), ';', -1) as tr, n, name
            from product p cross join
                 (select 1 as n union all select 2 union all select 3 union all select 4 union all
                  select 5
                 ) n
           ) n
     ) lt
group by name;
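The unpivot-then-pivot idea is easy to prototype outside SQL. Here is a small Python sketch that parses each value into a language-to-translation mapping and then lays it out in a fixed column order. The function name parse_translations and the column order are assumptions made for illustration, and stripping the surrounding double quotes is an extra step the SQL does not do.

```python
def parse_translations(name):
    """Split 'de:"x";en:"y";' into {'de': 'x', 'en': 'y'}."""
    result = {}
    for item in name.split(";"):
        if not item:
            continue  # the trailing ';' leaves an empty piece
        lang, _, text = item.partition(":")
        result[lang] = text.strip('"')
    return result


rows = [
    'de:"Sweatjacke*";en:"jacket*";pl:"bluza*";',
    'de:"*";en:"*";pl:"bluza*";',
    'fr:"*";de:"*";en:"*";pl:"dres junior*";cz:"*";',
    'pl:"bluza";',
]

# One output row per input row, one column per language; None where missing.
columns = ["pl", "en", "de", "fr", "cz"]
table = [[parse_translations(row).get(col) for col in columns] for row in rows]
for line in table:
    print(line)
```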
I managed to solve it by using some of the string-related functions:
SELECT
SUBSTRING_INDEX( SUBSTRING( name, LOCATE( "pl:", name ) , 150 ) , ';', 1 ) AS pl,
SUBSTRING_INDEX( SUBSTRING( name, LOCATE( "en:", name ) , 150 ) , ';', 1 ) AS en,
SUBSTRING_INDEX( SUBSTRING( name, LOCATE( "de:", name ) , 150 ) , ';', 1 ) AS de,
SUBSTRING_INDEX( SUBSTRING( name, LOCATE( "fr:", name ) , 150 ) , ';', 1 ) AS fr
FROM product
Thanks to everyone for help.
As far as I understand you want to UNPIVOT your data. There is no such function in MySQL, so you might want to export your data into MSSQL (you can use free MSSQL Express) and use UNPIVOT function: http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx