Delete duplicated tags MySQL

I have duplicated tags in my MySQL DB, as shown below:
+----+--------------------+
| id | tags               |
+----+--------------------+
| 3  | x,yz,z,x,x         |
| 5  | a,b,c d,a,b,c d, d |
+----+--------------------+
How can I execute a query that can remove the duplicated tags?
The result should be:
+----+------------+
| id | tags       |
+----+------------+
| 3  | x,yz,z     |
| 5  | a,b,c d, d |
+----+------------+

setup
create table overly_complex_tags
(
id integer primary key not null,
tags varchar(100) not null
);
insert into overly_complex_tags
( id, tags )
values
( 3 , 'x,yz,z,x,x' ),
( 5 , 'a,b,c d,a,b,c d,d' )
;
create view digits_v
as
SELECT 0 AS N
UNION ALL
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
UNION ALL
SELECT 4
UNION ALL
SELECT 5
UNION ALL
SELECT 6
UNION ALL
SELECT 7
UNION ALL
SELECT 8
UNION ALL
SELECT 9
;
query delete duplicate tags
update overly_complex_tags t
inner join
(
select id, group_concat(tag) as new_tags
from
(
select distinct t.id, substring_index(substring_index(t.tags, ',', n.n), ',', -1) tag
from overly_complex_tags t
cross join
(
select a.N + b.N * 10 + 1 n
from digits_v a
cross join digits_v b
order by n
) n
where n.n <= 1 + (length(t.tags) - length(replace(t.tags, ',', '')))
) cleaned_tags
group by id
) updated_tags
on t.id = updated_tags.id
set t.tags = updated_tags.new_tags
;
output
+----+-----------+
| id | tags |
+----+-----------+
| 3 | yz,z,x |
| 5 | c d,a,d,b |
+----+-----------+
sqlfiddle
note
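The complexity of the above solution comes from not having a properly
normalised structure. Note that the solution itself builds an intermediate
normalised structure (one row per tag) before concatenating the tags back
together. As the output shows, the original tag order is not preserved.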

In Oracle we accomplish this as
Update table set col=(select distinct regexp_substr(col, '[^,]+', 1, level) col
from table)
In MySQL this is not possible directly, and the only way is through a PHP array, as stated here. MariaDB has its own solution as well. Otherwise, use an intermediate approach.
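If deduplicating in application code is acceptable (the array route mentioned above), the logic is short. Below is a minimal sketch in Python rather than PHP. The function name dedupe_tags is made up for illustration, and it assumes tags are compared exactly as stored, so 'd' and ' d' count as different tags, which happens to match the expected result in the question.

```python
def dedupe_tags(tags: str) -> str:
    """Drop repeated tags from a comma-separated string, keeping first occurrences in order."""
    # dict.fromkeys keeps insertion order (Python 3.7+), so the first
    # occurrence of each tag survives and later repeats are dropped.
    unique = dict.fromkeys(tags.split(","))
    return ",".join(unique)


# Sample rows from the question: (id, tags)
for row_id, tags in [(3, "x,yz,z,x,x"), (5, "a,b,c d,a,b,c d, d")]:
    print(row_id, dedupe_tags(tags))
```

Unlike the group_concat query, this keeps the tags in their original order. Writing the cleaned value back would still need an UPDATE per row.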

Related

How to query duplicates results MySQL

While every question I found is talking about removing duplicates I need these duplicates.
Let's say my database is
+-------+-----------+
| ID | letter |
+-------+-----------+
| 1 | A |
| 2 | B |
| 4 | Z |
+-------+-----------+
I need to query a person's name, so let's say the name is "ABA". When I query like this:
select * from letters where letter = 'A' or letter = 'B' or letter = 'A'
My result will be
+-------+-----------+
| ID | letter |
+-------+-----------+
| 1 | A |
| 2 | B |
+-------+-----------+
I want the output to include the 3rd letter as a separate row:
+-------+-----------+
| ID | letter |
+-------+-----------+
| 1 | A |
| 2 | B |
| 3 | A |
+-------+-----------+
Maybe I don't know the right term, but I didn't find even one answer that gives me half a solution.
There is only one entry, but can I get the entry again? If I query for "nina", I want to get the full name and not just "nia".
Original answer (and recommended approach)
Use a recursive query to convert the name into rows with one letter each. Then join with your letters table.
with recursive word_letters (word, pos, letter) as
(
select @name, 1, substr(@name, 1, 1)
union all
select word, pos + 1, substr(word, pos + 1, 1)
from word_letters
where pos < length(word)
)
select letters.*
from word_letters
join letters on letters.letter = word_letters.letter
order by word_letters.pos;
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=6547c21d2dd9223270615047d46d9783
UPDATE: Workaround for old MySQL versions
Build a table of positions (numbers) large enough to cover the longest word. Then join the word in order to get the position for each letter in it. Then join your other table.
select letters.*
from
(
select hundreds.digit * 100 + tens.digit * 10 + units.digit + 1 as pos
from (select 0 as digit union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) units
cross join (select 0 as digit union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) tens
cross join (select 0 as digit union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) hundreds
) positions
join (select @name as word) w on length(w.word) >= positions.pos
join letters on letters.letter = substr(w.word, positions.pos, 1)
order by positions.pos;
Demo: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=db3762d5705ce5eb77e628c4d4058485
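To see why both queries return the repeated row, it may help to look at the same idea outside SQL: split the word into one row per position, then look each letter up in the table. A rough Python sketch follows. The names letters and rows_for_word are illustrative only, and the table content is taken from the question.

```python
# Rows of the letters table from the question: (ID, letter).
letters = [(1, "A"), (2, "B"), (4, "Z")]


def rows_for_word(word, table):
    """Return one matching table row per letter of word, in word order."""
    result = []
    for ch in word:  # one "row" per position, like word_letters / positions
        # join step: keep every table row whose letter equals this character
        result.extend(row for row in table if row[1] == ch)
    return result


print(rows_for_word("ABA", letters))
```

The repeated letter comes back as the same table row (ID 1), since the table holds only one A.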
You can do it by using any of the below queries. For your reference visit the link below to test the output.
http://sqlfiddle.com/#!9/c5e9f3/1
The table created and the populated data is as below
create table tbl(ID int,letter varchar(2));
insert into tbl(ID,letter) values(1,'A');
insert into tbl(ID,letter) values(2,'B');
insert into tbl(ID,letter) values(3,'A');
insert into tbl(ID,letter) values(4,'Z');
insert into tbl(ID,letter) values(5,'C');
and now the query
select * from tbl where letter in ('A','B');
or
select * from tbl where letter ='A' or letter='B';

How to emulate JSON_OVERLAPS function on MySQL 5.7?

I have two columns from different tables that hold JSON-formatted data. The data stored in both columns are arrays. Example:
users
+----+------------------+
| id | options |
+----+------------------+
| 1 | ["AB","CD","XY"] |
| 2 | ["CD","GH"] |
+----+------------------+
items
+----+-------------+
| id | options |
+----+-------------+
| 10 | ["CD","EF"] |
| 11 | ["GH","XY"] |
| 12 | ["GH"] |
+----+-------------+
I want to write a query that returns all the rows from users that match a given row from items, using the options columns to perform the match. The rule is: if any value in the array is present in both rows, they are a match. Example: user 1 would match items 10 (because of the CD option) and 11 (because of the XY option); user 2 would match items 10, 11 and 12 because all of them have CD or GH.
Looking at the MySQL docs I found that JSON_OVERLAPS does exactly that. However, I'm running MySQL 5.7 and the function is only available starting at 8.0.17. There is also not much discussion of this function on the web.
How could I emulate JSON_OVERLAPS behavior on MySQL 5.7 in a query?
Edit: Unfortunately, upgrading to MySQL 8 is not an option since we run MariaDB on production, which also doesn't have that function.
Be warned: as Strawberry already suggested, upgrading is the easier route.
Now that that is out of the way: you still asked for it, so let's have some fun.
I have posted some answers in the past that simulate MySQL 8's JSON_TABLE(). Why do I mention this? Because I use the same method here to emulate MySQL 8's JSON_OVERLAPS: each table is expanded with the emulated JSON_TABLE(), and the two result sets are then simply joined into a final result set.
Which gives the query below (forgive the formatting):
Query
SELECT
*
FROM (
SELECT
items.id
, JSON_UNQUOTE(
JSON_EXTRACT(items.options, CONCAT('$[', number_generator.number , ']'))
) AS json_options
FROM (
SELECT
@items_row := @items_row + 1 AS number
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row1
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row2
CROSS JOIN (
SELECT @items_row := -1
) init_user_params
) AS number_generator
CROSS JOIN (
SELECT
items.id
, items.options
, JSON_LENGTH(items.options) AS json_array_length
FROM
items
) AS items
WHERE
number BETWEEN 0 AND json_array_length - 1
) AS items
INNER JOIN (
SELECT
users.id
, JSON_UNQUOTE(
JSON_EXTRACT(users.options, CONCAT('$[', number_generator.number , ']'))
) AS json_options
FROM (
SELECT
@users_row := @users_row + 1 AS number
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row1
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row2
CROSS JOIN (
SELECT @users_row := -1
) init_user_params
) AS number_generator
CROSS JOIN (
SELECT
users.id
, users.options
, JSON_LENGTH(users.options) AS json_array_length
FROM
users
) AS users
WHERE
number BETWEEN 0 AND json_array_length - 1
) AS users
USING(json_options)
Result
| json_options | id | id |
| ------------ | --- | --- |
| CD | 10 | 2 |
| CD | 10 | 1 |
| GH | 11 | 2 |
| GH | 12 | 2 |
| XY | 11 | 1 |
see demo
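For readers unsure what the join is emulating: two JSON arrays overlap when they share at least one element. Below is a small Python sketch of that rule on the sample data, which can be handy for checking the query result by hand. json_overlaps here is a hypothetical helper, not a MySQL or Python library function, and it only handles arrays of scalars.

```python
import json

# Sample data from the question: id -> JSON array stored as text.
users = {1: '["AB","CD","XY"]', 2: '["CD","GH"]'}
items = {10: '["CD","EF"]', 11: '["GH","XY"]', 12: '["GH"]'}


def json_overlaps(a: str, b: str) -> bool:
    """True if the two JSON arrays (of scalars) share at least one element."""
    return not set(json.loads(a)).isdisjoint(json.loads(b))


# Every (user id, item id) pair whose option arrays overlap.
matches = sorted(
    (user_id, item_id)
    for user_id, user_opts in users.items()
    for item_id, item_opts in items.items()
    if json_overlaps(user_opts, item_opts)
)
print(matches)
```

This yields one pair per matching user and item, whereas the SQL join above yields one row per shared option.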

Create new columns for duplicate row values based on column ID duplicate in sql

I have a table with 2 columns; the first is called ID and the second TRACKING. The ID column has duplicates. I want to take all of those duplicates and consolidate them into one row per ID, where each TRACKING value from the duplicate rows is placed into a new column within that row, so that I no longer have duplicates.
I have tried a few suggested things where all of the values would be concatenated into one column but I want these TRACKING values for the duplicate IDs to be in separate columns. The code below did not do what I intended it to.
SELECT ID, TRACKING =
STUFF((SELECT DISTINCT ', ' + TRACKING
FROM #t b
WHERE b.ID = a.ID
FOR XML PATH('')), 1, 2, '')
FROM #t a
GROUP BY ID
I am looking to take this:
| ID | TRACKING |
-----------------
| 5 | 13t3in3i |
| 5 | g13g13gg |
| 3 | egqegqgq |
| 2 | 14y2y24y |
| 2 | 42yy44yy |
| 5 | 8i535i35 |
And turn it into this:
| ID | TRACKING | TRACKING1 | TRACKING2 |
-----------------
| 5 | 13t3in3i | g13g13gg | 8i535i35 |
| 3 | egqegqgq | | |
| 2 | 14y2y24y | 42yy44yy | |
One (relatively) painful way to do this in MySQL is to use correlated subqueries:
select i.id,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 0, 1
) as tracking_1,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 1, 1
) as tracking_2,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 2, 1
) as tracking_3
from (select distinct id from t
) i;
As bad as this looks, it will probably have surprisingly decent performance with an index on (id, tracking).
By the way, your original code with stuff() puts everything into one column; the rough MySQL equivalent is:
select id, group_concat(tracking)
from t
group by id;
with test_tbl as
(
select 5 id, 'goog' tracking,'goog' tracking1
union all
select 5 id, 'goog1','goo'
union all
select 2 , 'yahoo','yah'
union all
select 2, 'yahoo1','ya'
union all
select 3,'azure','azu'
), modified_tbl as
(
select id,array_agg(concat(tracking)) Tracking,array_agg(concat(tracking1)) Tracking1 from test_tbl group by 1
)
select id, tracking[safe_offset(0)] Tracking_1,tracking1[safe_offset(0)] Tracking_2, tracking[safe_offset(1)] Tracking_3,tracking1[safe_offset(1)] Tracking_4 from modified_tbl where array_length(Tracking) > 1
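The reshaping asked for in the question can also be described procedurally: group the TRACKING values by ID, then pad each group to the same width. Below is a minimal Python sketch on the question's sample data. pivot_tracking is an illustrative name, and unlike the correlated-subquery answer it keeps values in input order rather than sorting them.

```python
from collections import defaultdict

# Sample rows from the question: (ID, TRACKING)
rows = [
    (5, "13t3in3i"), (5, "g13g13gg"), (3, "egqegqgq"),
    (2, "14y2y24y"), (2, "42yy44yy"), (5, "8i535i35"),
]


def pivot_tracking(rows):
    """Map each ID to a list of its TRACKING values, padded with None to equal width."""
    groups = defaultdict(list)
    for row_id, tracking in rows:
        groups[row_id].append(tracking)
    # The widest group decides how many TRACKING columns are needed.
    width = max((len(v) for v in groups.values()), default=0)
    return {k: v + [None] * (width - len(v)) for k, v in groups.items()}


for row_id, values in pivot_tracking(rows).items():
    print(row_id, values)
```

The number of output columns depends on the data here, which is exactly what plain SQL cannot do without dynamic SQL; the queries above fix it at three.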

Add row number after splitting a string field

I have a table that contains 2 fields:
ID: text
Suggestions: string (comma separated values)
I would like to write a select query that returns new numbered rows, one per suggestion, each with its own number reflecting its position in the original string.
Example:
Note: this ranking must be guaranteed to be the same every time I run the query.
Thanks
If your DB version is 8.0+, a with recursive cte clause can be used, as in the following select statement (after the needed setup, i.e. the create table and insert statements):
mysql> create table tab( ID int, suggestions varchar(25));
mysql> insert into tab values(1,'A,B,C');
mysql> insert into tab values(2,'D,E,F,G,H');
mysql> select q2.*,
row_number()
over
(partition by q2.id order by q2.suggestion) as number
from
(
select distinct
id,
substring_index(
substring_index(suggestions, ',', q1.nr),
',',
-1
) as suggestion
from tab
cross join
(with recursive cte as
(
select 1 as nr
union all
select 1+nr from cte where nr<10
)
select * from cte) q1
) q2;
+------+------------+--------+
| id | suggestion | number |
+------+------------+--------+
| 1 | A | 1 |
| 1 | B | 2 |
| 1 | C | 3 |
| 2 | D | 1 |
| 2 | E | 2 |
| 2 | F | 3 |
| 2 | G | 4 |
| 2 | H | 5 |
+------+------------+--------+
The same problem is solved here:
https://gist.github.com/avoidwork/3749973
I would suggest a series of subqueries:
select id, substring_index(suggestions, ',', 1) as suggestion, 1
from example
where suggestions is not null
union all
select id, substring_index(substring_index(suggestions, ',', 2), ',', -1) as suggestion, 2
from example
where suggestions like '%,%'
union all
select id, substring_index(substring_index(suggestions, ',', 3), ',', -1) as suggestion, 3
from example
where suggestions like '%,%,%'
union all
select id, substring_index(substring_index(suggestions, ',', 4), ',', -1) as suggestion, 4
from example
where suggestions like '%,%,%,%'
union all
select id, substring_index(substring_index(suggestions, ',', 5), ',', -1) as suggestion, 5
from example
where suggestions like '%,%,%,%,%';
This can easily be extended if you have more than 5 options per id.
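What all of these queries aim to compute is each suggestion paired with its 1-based position in the original string, which is stable across runs because it depends only on the stored text. Below is a short Python sketch of that target output, using the sample rows from the first answer. number_suggestions is a made-up name.

```python
def number_suggestions(row_id, suggestions):
    """Split a comma-separated string into (id, suggestion, position) rows."""
    # enumerate(..., start=1) numbers items by their position in the string
    return [(row_id, s, n) for n, s in enumerate(suggestions.split(","), start=1)]


for row in number_suggestions(1, "A,B,C") + number_suggestions(2, "D,E,F,G,H"):
    print(row)
```

Numbering by position differs from row_number() ordered by the suggestion text: the two agree only when the suggestions already happen to be stored in sorted order, as in the sample data.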

Get value even if it doesn't exist in table from SQL SELECT statement

I have a MySQL table that looks something like this:
|---ID---|---COUNTER---|
| 1 | 2 |
| 2 | 6 |
| 3 | 1 |
| 5 | 9 |
| 6 | 10 |
I'm looking for a SELECT statement that returns IDs and their COUNTER. The table only has IDs such as 1, 2, 3, 5, 6. Is there a statement where you say: I want IDs 1 to 10 even if they don't exist in the table, and if an ID doesn't exist, return the ID anyway with the COUNTER value 0? For example:
|---ID---|---COUNTER---|
| 1 | 2 |
| 2 | 6 |
| 3 | 1 |
| 4 | 0 |
| 5 | 9 |
| 6 | 10 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0 |
Do I have to create a SELECT statement that contains NOT EXISTS conditions?
Thanks in advance, Steve-O
Without creating a temp table:
select t.num as id, coalesce(yt.counter, 0)
from your_table yt
right join (
select 1 as num union select 2 union select 3 union select 4 union select 5 union
select 6 union select 7 union select 8 union select 9 union select 10
) t on yt.id = t.num
order by t.num
and a bit more general:
select t.num as id, coalesce(yt.counter, 0)
from your_table yt
right join (
select t1.num + t2.num * 10 + t3.num * 100 as num
from (
select 1 as num union select 2 union select 3 union select 4 union select 5 union
select 6 union select 7 union select 8 union select 9 union select 0
) t1
cross join (
select 1 as num union select 2 union select 3 union select 4 union select 5 union
select 6 union select 7 union select 8 union select 9 union select 0
) t2
cross join (
select 1 as num union select 2 union select 3 union select 4 union select 5 union
select 6 union select 7 union select 8 union select 9 union select 0
) t3
) t on yt.id = t.num
where t.num between (select min(id) from your_table) and (select max(id) from your_table)
You can define the limits yourself; here I've used the min and max id values from your_table.
It's not very robust, but if you created a temporary table with the ID's you wanted in it, you could then left join to your table containing ID and Counter which would include all the values:
declare @tempidtable as table ( imaginaryid int )
insert into @tempidtable ( imaginaryid ) values ( 1 )
insert into @tempidtable ( imaginaryid ) values ( 2 )
insert into @tempidtable ( imaginaryid ) values ( 3 )
select
ids.imaginaryid,
ISNULL(yourothertable.counter, 0)
from @tempidtable ids
left join yourothertable
on ids.imaginaryid = yourothertable.id
As Tomek says, you could loop over the inserts to make it easier to maintain, or possibly store the ids you want as a base in another table, using this as the basis for the join rather than a temp table.
Create a table with all possible ID's:
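create table Numbers (nr int primary key);
delimiter //
create procedure fill_numbers() -- arbitrary name; declare/while only work inside a stored program
begin
declare i int default 1;
while i < 100000 do
insert into Numbers (nr) values (i);
set i = i + 1;
end while;
end //
delimiter ;
call fill_numbers();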
Then you can use left join to return all numbers:
select n.NR
, coalesce(c.Counter, 0) as Counter
from Numbers n
left join
Counters c
on c.ID = n.NR
You can use left join to solve your issue. Read more about left join here
I think you will have to create (generate in a loop) a temporary table with the complete sequence of numbers from 1 to N (where N is the MAX(Id) of the counted table). Then do a left join to that table and apply a GROUP BY clause.
You need the range of integers to do an outer join with your table based on ID. Generating a range of integers is dependent on the SQL vendor if you do not want to use a temporary table. See SQL SELECT to get the first N positive integers for hints on how to do this based on your SQL vendor.
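All of the answers share one idea: generate the full range of IDs first, then outer join the table against it and default missing counters to 0. Below is a small Python sketch of that logic with the question's data. fill_missing is an illustrative name.

```python
counters = {1: 2, 2: 6, 3: 1, 5: 9, 6: 10}  # ID -> COUNTER, from the question


def fill_missing(counters, first, last):
    """Return (id, counter) for every id in first..last, using 0 where the id is absent."""
    # range(...) plays the role of the numbers table; dict.get stands in for
    # the left join plus coalesce(counter, 0).
    return [(i, counters.get(i, 0)) for i in range(first, last + 1)]


for row in fill_missing(counters, 1, 10):
    print(row)
```

The only part that varies between SQL vendors is how the range itself is produced (union of literals, a numbers table, or a recursive query).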