Split values to multiple rows in MySQL

Here is my data in mysql table:
+---------+-------------------+------------+
| ID | Name | Class |
+---------+-------------------+------------+
| 1, 2, 3 | Alex, Brow, Chris | Aa, Bb, Cc |
+---------+-------------------+------------+
I want to split the values into multiple rows to get the data in the format below.
1 Alex Aa
2 Brow Bb
3 Chris Cc
How can I do that?

One trick is to join to a Tally table with numbers.
Then use SUBSTRING_INDEX to get the parts.
If you don't already have a numbers table, here's one way.
drop table if exists Digits;
create table Digits (n int primary key not null);
insert into Digits (n) values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
drop table if exists Nums;
create table Nums (n int primary key not null);
insert into Nums (n)
select (n3.n*100+n2.n*10+n1.n) as n
from Digits n1
cross join Digits n2
cross join Digits n3;
Then it can be used to unfold those columns
Sample data:
drop table if exists YourTable;
create table YourTable (
ID varchar(30) not null,
Name varchar(30) not null,
Class varchar(30) not null
);
insert into YourTable
(ID, Name, Class) values
('1, 2, 3', 'Alex, Brow, Chris', 'Aa, Bb, Cc')
, ('4, 5, 6', 'Drake, Evy, Fiona', 'Dd, Ee, Ff')
;
The query:
SELECT
LTRIM(SUBSTRING_INDEX( SUBSTRING_INDEX( t.ID, ',', Nums.n), ',', -1)) AS Id,
LTRIM(SUBSTRING_INDEX( SUBSTRING_INDEX( t.Name, ',', Nums.n), ',', -1)) AS Name,
LTRIM(SUBSTRING_INDEX( SUBSTRING_INDEX( t.Class, ',', Nums.n), ',', -1)) AS Class
FROM YourTable t
LEFT JOIN Nums ON n BETWEEN 1 AND (LENGTH(ID)-LENGTH(REPLACE(ID, ',', ''))+1);
Result:
Id Name Class
1 Alex Aa
2 Brow Bb
3 Chris Cc
4 Drake Dd
5 Evy Ee
6 Fiona Ff
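The nested SUBSTRING_INDEX calls can be hard to follow, so here is a minimal Python sketch of the same logic. The helper `substring_index` is only a rough stand-in for the MySQL function (an assumption for illustration, not a full port), and `unfold` is a hypothetical name for what the query does per row.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough stand-in for MySQL's SUBSTRING_INDEX (illustration only)."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


def unfold(row):
    """Turn one row of comma-separated columns into one tuple per item."""
    # Same count as LENGTH(ID) - LENGTH(REPLACE(ID, ',', '')) + 1 in the query.
    n_items = row[0].count(",") + 1
    result = []
    for n in range(1, n_items + 1):  # plays the role of Nums.n
        result.append(tuple(
            # Inner call keeps the first n items, outer call keeps the last of those.
            substring_index(substring_index(col, ",", n), ",", -1).lstrip()
            for col in row
        ))
    return result


rows = unfold(("1, 2, 3", "Alex, Brow, Chris", "Aa, Bb, Cc"))
for r in rows:
    print(*r)
```

The inner call trims the list down to its first n items and the outer call picks the last one, which is why the numbers table has to start at 1.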

Storing multiple values in one field is not recommended, but if you still want a solution, you can split the string on commas and insert the parts into a table.
This blog post shows how to do the split: https://nisalfdo.blogspot.com/2019/02/mysql-how-to-insert-values-from-comma.html#more

Related

How to get related items by id inside of json integer array column in MySQL

My goal is to relate a column that holds references in a JSON array to the rows of another table. In simplified form, I have two tables:
table_a
| id | references |
|-----|------------|
| 1 | "[1,3]" |
| 2 | "[2,3]" |
Here references is a JSON array of integers. The second table:
table_b
| id | name |
|-----|----------|
| 1 | "item 1" |
| 2 | "item 2" |
| 3 | "item 3" |
So, I would like to get all items of table_b related to a given item of table_a (for example, id 1), i.e. those whose ids appear in the references integer array (stored as JSON).
Something like this:
|-----|----------|
| 1 | "item 1" |
| 3 | "item 3" |
I have been trying to achieve this with json_contains, json_extract, json_search, etc. from the docs, and I think the problem is in how values are matched inside a JSON array of integers.
For example:
SELECT JSON_SEARCH((select `references` from table_a where id=1), 'one', '3');
should return something but always returns NULL, and I don't understand why. I also tried 3 without quotes.
Any ideas?
My current version of MySQL is 5.7.25
Thanks in advance.
Minimal code to reproduce:
select version();
CREATE TABLE `table_a` (
`id` int(11) NOT NULL,
`references` json NULL
);
CREATE TABLE `table_b` (
`id` int(11) NOT NULL,
`name` text NULL
);
INSERT INTO `table_a` (`id`, `references`) VALUES
(1, '\"[1,3]\"'),
(2, '\"[2,3]\"');
INSERT INTO `table_b` (`id`, `name`) VALUES
(1, 'item_1'),
(2, 'item_2'),
(3, 'item_3');
SELECT * from table_a;
SELECT * from table_b;
select `references` from table_a where id=1;
SELECT JSON_SEARCH((select `references` from table_a where id=1), 'one', '3');
Sandbox to test: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=ac557666852fa94e77fdf87158c9abe0
The stored value is not a JSON array: it is a JSON string containing the text [1,3], which is still valid JSON and therefore passes the JSON_VALID check.
The solution is monstrous:
SELECT table_b.*
FROM table_a, table_b
WHERE table_a.id = 1
AND JSON_SEARCH(REPLACE(REPLACE(REPLACE(REPLACE(table_a.references, '"', ''), '[', '["'), ']', '"]'), ',', '","'), 'one', table_b.id) IS NOT NULL
fiddle with some additional queries which explains the problem.
You can use next query as solution:
select
table_a.*,
table_b.*
from table_a
join table_b on JSON_CONTAINS(
CAST(TRIM('"' FROM `references`) as JSON),
CAST(table_b.id as JSON)
)
where table_a.id=2;
Because your references field holds a quoted string rather than a JSON array, you need to strip the quotes and cast the result to JSON; after that, JSON_CONTAINS can be used.
Try it here
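To see why the stored value behaves like a string and not an array, here is a small Python sketch using the standard json module. The variable names and the `table_b` dictionary are illustrative stand-ins for the tables in the question, not part of any answer's SQL.

```python
import json

# What ends up in the `references` column for id = 1: a JSON *string*.
stored = '"[1,3]"'

first = json.loads(stored)   # -> the Python str "[1,3]", not a list
refs = json.loads(first)     # second parse yields the actual array [1, 3]

# Hypothetical in-memory stand-in for table_b.
table_b = {1: "item_1", 2: "item_2", 3: "item_3"}

related = [(i, table_b[i]) for i in refs if i in table_b]
print(type(first).__name__, refs, related)
```

The double parse corresponds to stripping the outer quotes and casting to JSON in the SQL above.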
You can first strip the quotes wrapping the array, then concatenate each resulting integer with the item_ prefix. An auxiliary subquery generates rows so that each member of the array is extracted in turn, such as
SELECT b.*
FROM
(
SELECT @i := @i + 1 AS rn,
CONCAT('item_',JSON_EXTRACT(JSON_UNQUOTE(`references`),
CONCAT('$[',@i-1,']'))) AS name
FROM information_schema.tables
CROSS JOIN `table_a` AS a
CROSS JOIN (SELECT @i := 0) r
WHERE @i < JSON_LENGTH(JSON_UNQUOTE(`references`)) ) AS a
JOIN `table_b` AS b
ON b.`name` = a.name
Demo

Inner joining on to temp table and insert to table in MySql

I have a list of strings. Each of them contains categories separated by a '/'.
For example:
animals/domestic/dog
animals/domestic/cat
What I want to do with these categories is to insert into a MySql categories table.
The table has 4 columns:
id (int auto increment), category_name (nvarchar), parent_id (int), is_active (bit)
The logic around inserting these should be as follows:
The main categories (animals) should have a parent_id of 0.
The child categories will have the id of their parent as parent_id.
There cannot be two active categories with the same category name.
I have tried to implement the following logic:
Get a distinct list of strings.
From these, get a distinct list of main categories.
Insert the distinct main categories to the categories table with a parent ID of 0.
Organise each of the categories in pairs and get distinct pairs:
(animals, domestic)
(domestic, dog)
(domestic, cat)
Get the matching id for each of the parent categories and insert in to the child's parent_id
SQL:
/*INSERT ALL THE FIRST PARENT CATEGORIES WITH A PARENT ID OF 0*/
INSERT INTO categories (category_name, parent_id, is_active)
VALUES ('animals', 0, 1);
/*INSERT ALL THE CATEGORIES IN PAIRS TO TEMP TABLE*/
CREATE TEMPORARY TABLE tempcat(parent nvarchar(256), child nvarchar(256));
INSERT INTO tempcat
VALUES ('animals', 'domestic'),('domestic', 'dog'),('domestic','cat');
/*INSERT INTO THE CATEGORIES TABLE*/
INSERT INTO categories(category_name, parent_id, is_active)
SELECT tempcat.child, categories.id, 1
FROM categories
INNER JOIN tempcat
ON categories.category_name = tempcat.parent
WHERE categories.is_active = 1;
/*DISPOSE THE TEMPORARY TABLE*/
DROP TEMPORARY TABLE tempcat;
Issue:
After the query is run I expect 4 entries in the categories table.
But I only get 2.
I can see that the temp table has correct entries before doing the last inner join.
I can't seem to figure out why the categories table wouldn't have the other two rows.
Any guidance in the right direction is highly appreciated.
Update #1
Suppose the specifications said 'There cannot be two active categories with the same category name that had the same parent IDs'.
For example, if there were two strings as (animals/domestic/cat), (animals/outdoor/cat) there should be two entries for cat with IDs of domestic and outdoor as parent_id's.
CREATE TABLE categories (id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
category_name VARCHAR(64),
parent_id INT UNSIGNED NOT NULL DEFAULT 0,
is_active CHAR(1) NULL,
UNIQUE INDEX idx_name_active (category_name));
CREATE TABLE source_data (path TEXT);
INSERT INTO source_data VALUES ('animals/domestic/dog'), ('animals/domestic/cat');
CREATE PROCEDURE update_categories_table()
BEGIN
DECLARE cnt INT DEFAULT 0;
INSERT IGNORE INTO categories (category_name, parent_id, is_active)
SELECT SUBSTRING_INDEX(path, '/', 1), 0, '1'
FROM source_data;
iteration: LOOP
SELECT COUNT(*) INTO cnt
FROM source_data
WHERE LOCATE('/', path);
IF NOT cnt THEN
LEAVE iteration;
END IF;
INSERT IGNORE INTO categories (category_name, parent_id, is_active)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(source_data.path, '/', 2), '/', -1),
categories.id,
'1'
FROM source_data, categories
WHERE SUBSTRING_INDEX(source_data.path, '/', 1) = categories.category_name;
UPDATE source_data
SET path = SUBSTRING(path FROM 1 + LOCATE('/', path));
END LOOP iteration;
TRUNCATE source_data;
END
call update_categories_table;
SELECT * FROM categories;
id | category_name | parent_id | is_active
-: | :------------ | --------: | :--------
1 | animals | 0 | 1
4 | domestic | 1 | 1
7 | dog | 4 | 1
8 | cat | 4 | 1
db<>fiddle here
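The key point is that a child can only be inserted once its parent already has an id. A minimal Python sketch of that idea follows; `build_categories` is a hypothetical helper, it walks each path left to right instead of looping level by level as the procedure does, and it assumes category names are globally unique (as the UNIQUE index above enforces). Its ids are consecutive, unlike the AUTO_INCREMENT gaps in the MySQL output.

```python
def build_categories(paths):
    """Return rows of (id, category_name, parent_id, is_active)."""
    ids = {}    # category_name -> id; mimics the UNIQUE index on category_name
    rows = []
    for path in paths:
        parent_id = 0                      # top-level categories get parent 0
        for name in path.split("/"):
            if name not in ids:            # same effect as INSERT IGNORE
                ids[name] = len(rows) + 1
                rows.append((ids[name], name, parent_id, 1))
            parent_id = ids[name]          # this node is the next one's parent
    return rows


categories = build_categories(["animals/domestic/dog", "animals/domestic/cat"])
for row in categories:
    print(row)
```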
In MySQL 8, you can do this with a single query:
with splits as (
select 1 as n, substring_index(cats, '/', 1) as cat, cats
from strings union all
select 2 as n, substring_index(substring_index(cats, '/', 2), '/', -1) as cat, cats
from strings
where cats like '%/%' union all
select 3 as n, substring_index(substring_index(cats, '/', 3), '/', -1) as cat, cats
from strings
where cats like '%/%/%'
),
splits_n as (
select s.*, dense_rank() over (order by n, cat) as new_id
from splits s
),
splits_np as (
select s.*, sp.new_id as parent_id
from splits_n s left join
splits_n sp
on sp.cats = s.cats and sp.n = s.n - 1
)
select distinct new_id as id, cat, parent_id, 1 as is_active
from splits_np s;
Here is a db<>fiddle.
Unfortunately, this is much more painful in earlier versions.

Sql Query to find duplicates in 2 columns where the values in first column are same

I have a table where the first column contains states and the second column contains zip codes. I want to find duplicate zip codes within the same state. The first column can repeat, but I need to find values in the second column that are duplicated among rows sharing the same value in the first column.
Table :
+---+----+------+
| Z | A | B |
+---+----+------+
| 1 | GA | 1234 |
| 2 | GA | 321 |
| 3 | GA | 234 |
| 4 | GA | 9890 |
| 5 | GA | 1234 |
+---+----+------+
The query should return the zip code that has a duplicate, i.e. 1234. I have 10,000+ records.
Thank You.
Try using a GROUP BY query and retain zip codes appearing in duplicate.
SELECT A, B
FROM yourTable
GROUP BY A, B
HAVING COUNT(*) > 1
Grouping by both state and zip code produces one group per (state, zip code) pair, so any group with more than one row is a zip code duplicated within its state.
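As a quick way to try the GROUP BY / HAVING approach without a server, here is a sketch that runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption for convenience), and the table name yourTable is the placeholder used in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Z INTEGER, A TEXT, B INTEGER)")
conn.executemany(
    "INSERT INTO yourTable (Z, A, B) VALUES (?, ?, ?)",
    [(1, "GA", 1234), (2, "GA", 321), (3, "GA", 234), (4, "GA", 9890), (5, "GA", 1234)],
)

# One group per (state, zip) pair; keep only pairs that occur more than once.
duplicates = conn.execute(
    "SELECT A, B, COUNT(*) FROM yourTable GROUP BY A, B HAVING COUNT(*) > 1"
).fetchall()
conn.close()

print(duplicates)
```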
Please try the following...
SELECT Z AS RecordNumber,
tblTable.A AS State,
tblTable.B AS ZipCode
FROM tblTable
JOIN ( SELECT A,
B
FROM tblTable
GROUP BY A,
B
HAVING COUNT( * ) > 1
) AS duplicatesFinder ON tblTable.A = duplicatesFinder.A
AND tblTable.B = duplicatesFinder.B
ORDER BY tblTable.A,
tblTable.B,
Z;
This statement starts with a subquery that selects every unique combination of State and Zip Code that occurs more than once in the source table (which I have called tblTable in the absence of the table's name).
The results of this subquery are then joined to the source table based on shared values of State and Zip Code. This JOIN effectively eliminates all records from the source table that have a unique State / Zip Code combination from our results dataset.
The list of duplicated States / Zip Codes is then returned along with the values of Z associated with each pairing.
If you have any questions or comments, then please feel free to post a Comment accordingly.
Appendix
My code was tested against a database created using the following script...
CREATE TABLE tblTable
(
Z INT,
A CHAR( 2 ),
B INT
);
INSERT INTO tblTable ( Z,
A,
B )
VALUES ( 1, 'GA', 1234 ),
( 2, 'GA', 321 ),
( 3, 'GA', 234 ),
( 4, 'GA', 9890 ),
( 5, 'GA', 1234 );
try this:
select A,B, count(CONCAT_WS('',A,B)) as cnt from
(select * from yourtable) as a group by A,B having count(CONCAT_WS('',A,B))>1
Result, listing each state/zip code pair that occurs more than once:
GA 1234 2
It sounds like you want both rows returned where duplicates are found. This should work:
with cte1 as (
select
A
,B
,count(1) over (partition by A, B) as counter
from table_name
)
select
A
,B
from cte1
where 1=1
and counter > 1
order by A, B
;
If you want to know how many duplicate rows there are in total, you can select the "counter" field in the final select:
with cte1 as (
select
A
,B
,count(1) over (partition by A, B) as counter
from table_name
)
select
A
,B
,counter
from cte1
where 1=1
and counter > 1
order by A, B
;
You can use below query.
SELECT A, B, COUNT(*)
FROM TABLE_NAME
GROUP BY A, B
HAVING COUNT(*) > 1;

Retrieve Distinct concat values from MySQL table

I have an SQL table advert
id name cat
11 abc ab
12 acb ab, bc
13 abb bcd
14 abcd ad
15 acbd de
16 abbd ad
On using DISTINCT function I am getting an output like this
Query:
SELECT DISTINCT cat FROM advert;
Output:
ab
ab, bc
bcd
ad
de
What changes do I need to make to my query to get output like this?
ab
bc
bcd
ad
de
select distinct trim(substring_index(substring_index(cat,',',n),',',-1)) as cat
from advert join (select 1 as n union all select 2 union all select 3) r
on cat like concat('%',repeat(',%',n-1))
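For comparison, the intended result (split each cat value on commas, trim, keep distinct values) can be sketched in a few lines of Python. The list `cats` simply copies the cat column from the question.

```python
# The cat column of the advert table, in row order.
cats = ["ab", "ab, bc", "bcd", "ad", "de", "ad"]

# dict.fromkeys keeps first-seen order while dropping duplicates.
distinct_cats = list(dict.fromkeys(
    part.strip() for cat in cats for part in cat.split(",")
))

print(distinct_cats)
```

Unlike the SQL above, this sketch has no fixed upper limit on the number of values per row.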
I think you should change your table structure and make it like this.
tblName
id | name
11 abc
12 acb
13 abb
14 abcd
15 acbd
16 abbd
tblCat
id | name_id | cat
some ids* 11 ab
12 ab
12 bc
13 bcd
14 ad
15 de
16 ad
In this way you can easily query and manage your data in your tables.
You should fix your data structure so you are not storing comma-delimited lists in columns. That is the wrong way to store data in a relational database . . . as you can see by the problems for answering this simple question. What you want is a junction table.
Sometimes we are stuck with other people's bad designs. If each row holds at most three values, you can do:
select cat
from ((select substring_index(cat, ', ', 1) as cat
from advert
) union all
(select substring_index(substring_index(cat, ', ', 2), ', ', -1) as cat
from advert
where cat like '%, %'
) union all
(select substring_index(substring_index(cat, ', ', 3), ', ', -1) as cat
from advert
where cat like '%, %, %'
)
) c
group by cat;
First, I would create a statement (T-SQL, for SQL Server) that turns all the rows into one big comma-delimited list.
DECLARE @tmp VarChar(max)
SET @tmp = ''
SELECT @tmp = @tmp + ColumnA + ',' FROM TableA
Then use the table-valued UDF split described in this SO answer to turn that string back into a table, with a DISTINCT clause to ensure the values are unique.
https://stackoverflow.com/a/2837662/261997
SELECT DISTINCT * FROM dbo.Split(',', @tmp)
Full code example:
if object_id('dbo.Split') is not null
drop function dbo.Split
go
CREATE FUNCTION dbo.Split (@sep char(1), @s varchar(512))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
FROM Pieces
)
go
declare @t table (colA varchar(max))
insert @t select '111, 223'
union all select '333'
union all select '444'
union all select '777,999';
select ltrim(rtrim(s.s)) as colC
from @t t
cross apply
dbo.split(',', t.colA) s

Create an inline SQL table on the fly (for an excluding left join)

Let's assume the following:
Table A
id | value
----------
1 | red
2 | orange
5 | yellow
10 | green
11 | blue
12 | indigo
20 | violet
I have a list of id's (10, 11, 12, 13, 14) that can be used to look up id's in this table. This list of id's is generated in my frontend.
Using purely SQL, I need to select the id's from this list (10, 11, 12, 13, 14) that do not have entries in Table A (joining on the 'id' column). The result should be the resultset of id's 13 and 14.
How can I accomplish this using only SQL? (Also, I'd like to avoid using a stored procedure if possible)
The only approach I can think of is something that would create an inline SQL table on the fly to temporarily hold my list of id's. However, I have no idea how to do this. Is this possible? Is there a better way?
Thanks! :)
You can do this from SQL Server 2008 onwards using a table value constructor.
SELECT * FROM (
VALUES(1, 'red'),
(2, 'orange'),
(5, 'yellow'),
(10, 'green'),
(11, 'blue'),
(12, 'indigo'),
(20, 'violet'))
AS Colors(Id, Value)
More information here:
Table Value Constructor
You can create an "inline table" with a UNION subquery:
(
SELECT 10 AS id
UNION ALL SELECT 11 UNION ALL SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14
-- etc.
) AS inline_table
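The fragment above only defines the inline table, so here is a sketch of how it could be plugged into the excluding left join. It runs against an in-memory SQLite database from Python purely as a convenient stand-in (an assumption, since the question does not name the database engine), with Table A recreated under the hypothetical name `a`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO a (id, value) VALUES (?, ?)",
    [(1, "red"), (2, "orange"), (5, "yellow"), (10, "green"),
     (11, "blue"), (12, "indigo"), (20, "violet")],
)

query = """
SELECT inline_table.id
FROM (
    SELECT 10 AS id
    UNION ALL SELECT 11 UNION ALL SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14
) AS inline_table
LEFT JOIN a ON a.id = inline_table.id
WHERE a.id IS NULL          -- keep only ids with no match in a
ORDER BY inline_table.id
"""
missing = [row[0] for row in conn.execute(query)]
conn.close()

print(missing)
```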
CREATE TEMPORARY TABLE ids (id INT NOT NULL PRIMARY KEY);
INSERT
INTO ids
VALUES
(10),
(11),
(12),
(13),
(14);
SELECT *
FROM ids
WHERE id NOT IN
(
SELECT id
FROM a
);
Something like this will work too
SELECT * FROM (
SELECT 'ds' AS source
UNION ALL
SELECT 'cache' AS source
) as dataSource
----------
| source |
----------
| ds |
----------
| cache |
----------
create table B (id int)
insert into B values (10),(11),(12),(13),(14)
select *
from B
left join A
on A.id=B.id
where A.id is null
drop table B
http://sqlfiddle.com/#!6/6666c1/30