Update duplicate rows - mysql

I have a table:
id name
1 a
2 a
3 a
4 b
5 b
6 c
I am looking for an update statement that will update name column to:
id name
1 a
2 a-2
3 a-3
4 b
5 b-2
6 c
In SQL Server I would use:
;with cte as(select *, row_number() over(partition by name order by id) rn from table)
update cte set name = name + '-' + cast(rn as varchar(10))
where rn <> 1
I am not strong in MySQL nonstandard queries.
Can I do something like this in MySQL?

You can do this:
UPDATE YourTable p
JOIN(SELECT t.id,t.name,count(*) as rnk
FROM YourTable t
INNER JOIN YourTable s on(t.name = s.name and t.id >= s.id)
GROUP BY t.id,t.name) f
ON(p.id = f.id)
SET p.name = concat(p.name,'-',f.rnk)
WHERE f.rnk > 1
This uses a self-join and COUNT(*) to get the same numbering as ROW_NUMBER(): each row is counted together with the rows of the same name that have a smaller or equal id. Only rows whose count is greater than 1 are updated (the second, third, and so on), so the first row of each name is left unchanged.

In MySQL you can use variables in order to simulate ROW_NUMBER window function:
SELECT id, CONCAT(name, IF(rn = 1, '', CONCAT('-', rn))) AS name
FROM (
SELECT id, name,
@rn := IF(name = @n, @rn + 1,
IF(@n := name, 1, 1)) AS rn
FROM mytable
CROSS JOIN (SELECT @rn := 0, @n := '') AS vars
ORDER BY name, id) AS t
To UPDATE you can use:
UPDATE mytable AS t1
SET name = (
SELECT CONCAT(name, IF(rn = 1, '', CONCAT('-', rn))) AS name
FROM (
SELECT id, name,
@rn := IF(name = @n, @rn + 1,
IF(@n := name, 1, 1)) AS rn
FROM mytable
CROSS JOIN (SELECT @rn := 0, @n := '') AS vars
ORDER BY name, id) AS t2
WHERE t1.id = t2.id)
You can also use UPDATE with JOIN syntax:
UPDATE mytable AS t1
JOIN (
SELECT id, rn, CONCAT(name, IF(rn = 1, '', CONCAT('-', rn))) AS name
FROM (
SELECT id, name,
@rn := IF(name = @n, @rn + 1,
IF(@n := name, 1, 1)) AS rn
FROM mytable
CROSS JOIN (SELECT @rn := 0, @n := '') AS vars
ORDER BY name, id) AS x
) AS t2 ON t2.rn <> 1 AND t1.id = t2.id
SET t1.name = t2.name;
The latter is probably faster than the former because it performs fewer UPDATE operations: only rows with rn <> 1 are touched.
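On MySQL 8.0 or later, window functions are available, so the variables are not needed. The following is a minimal sketch under that version assumption, using the same mytable(id, name) table as above; it is not tested against older servers, where ROW_NUMBER() does not exist.

```sql
-- Assumes MySQL 8.0+ (ROW_NUMBER() is not available in earlier versions).
-- The derived table numbers the rows within each name, ordered by id.
UPDATE mytable AS t1
JOIN (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
    FROM mytable
) AS t2 ON t1.id = t2.id
SET t1.name = CONCAT(t1.name, '-', t2.rn)
WHERE t2.rn > 1;  -- leave the first row of each name unchanged
```

With the sample data from the question, only ids 2, 3 and 5 are updated, to a-2, a-3 and b-2.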

The following query does the same with less work for the database, because the subquery returns only the rows that actually need to change:
UPDATE
tab AS tu
INNER JOIN
-- result set containing only the duplicate rows that must be updated
(
SELECT
t.id,
COUNT(*) AS cnt
FROM
tab AS t
-- join the same table on equal name and smaller id; rows without an earlier duplicate are excluded
INNER JOIN
tab AS tp
ON
tp.name = t.name
AND
tp.id < t.id
GROUP BY
t.id
) AS tc
ON
tu.id = tc.id
SET
tu.name = CONCAT(tu.name, '-', tc.cnt + 1)

Related

How to find median value with group by (MySQL)

I need to find the median time difference between the sent date and the click date (in seconds) for each type of email. I found a solution that only works for all the data at once:
SET @rowindex := -1;
SELECT g.type, g.time_diff
FROM
(SELECT @rowindex:=@rowindex + 1 AS rowindex,
TIMESTAMPDIFF(SECOND, emails_sent.date_sent, emails_clicks.date_click) AS time_diff,
emails_sent.id_type AS type
FROM emails_sent inner join emails_clicks on emails_sent.id = emails_clicks.id_email
ORDER BY time_diff) AS g
WHERE g.rowindex IN (FLOOR(@rowindex / 2) , CEIL(@rowindex / 2));
Is it possible to add a GROUP BY id_type to this?
Thanks!
First, you need to enumerate the rows for each type. Using variables, this code looks like:
select sc.*,
(@rn := if(@t = id_type, @rn + 1,
if(@t := id_type, 1, 1)
)
) as seqnum
from (select timestampdiff(second, s.date_sent, c.date_click) as time_diff,
s.id_type
from emails_sent s inner join
emails_clicks c
on s.id = c.id_email
order by s.id_type, time_diff
) sc cross join
(select @t := -1, @rn := 0) as params;
Then, you need to bring in the total number for each type and do the calculation for the median:
select sc.id_type, avg(time_diff)
from (select sc.*,
(@rn := if(@t = id_type, @rn + 1,
if(@t := id_type, 1, 1)
)
) as seqnum
from (select timestampdiff(second, s.date_sent, c.date_click) as time_diff,
s.id_type
from emails_sent s inner join
emails_clicks c
on s.id = c.id_email
order by s.id_type, time_diff
) sc cross join
(select @t := -1, @rn := 0) as params
) sc join
(select s.id_type, count(*) as cnt
from emails_sent s inner join
emails_clicks c
on s.id = c.id_email
group by s.id_type
) n
on n.id_type = sc.id_type
where 2 * sc.seqnum in (n.cnt, n.cnt + 1, n.cnt + 2)
group by sc.id_type;
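If you are on MySQL 8.0 or later, the same median logic can be written with window functions instead of variables. This is a sketch under that assumption; the aliases x, seqnum, cnt and median_time_diff are only illustrative names.

```sql
-- Assumes MySQL 8.0+ for ROW_NUMBER() and COUNT(*) OVER (...).
SELECT x.id_type, AVG(x.time_diff) AS median_time_diff
FROM (
    SELECT s.id_type,
           TIMESTAMPDIFF(SECOND, s.date_sent, c.date_click) AS time_diff,
           -- position of the row within its type, ordered by time difference
           ROW_NUMBER() OVER (
               PARTITION BY s.id_type
               ORDER BY TIMESTAMPDIFF(SECOND, s.date_sent, c.date_click)
           ) AS seqnum,
           -- number of rows of that type
           COUNT(*) OVER (PARTITION BY s.id_type) AS cnt
    FROM emails_sent AS s
    INNER JOIN emails_clicks AS c ON s.id = c.id_email
) AS x
-- odd cnt: keeps the single middle row; even cnt: keeps the two middle rows
WHERE 2 * x.seqnum IN (x.cnt, x.cnt + 1, x.cnt + 2)
GROUP BY x.id_type;
```

For example, with 5 rows of a type only seqnum 3 passes the filter (2 * 3 = cnt + 1), and with 4 rows seqnum 2 and 3 pass (2 * 2 = cnt, 2 * 3 = cnt + 2), so AVG returns the median in both cases.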

MySql join two tables sequentially

Table one
===================
id name
-------------------
1 m
2 m
3 a
4 u
5 g
Table two
===================
id name
-------------------
8 m
9 m
10 u
11 a
12 x
15 m
Expected result
===================
1 m 8
2 m 9
3 a 11
4 u 10
For each row in table 1, I need to find the id from table 2 that has the same name, but each id from table 2 may be used only once.
If I use a plain join, I get unwanted combinations:
select t1.id as i1, t1.name, t2.id as i2 from t1
join t2 on t1.name = t2.name
i1 name i2
--------------------
'1','m','8'
'2','m','8'
'1','m','9'
'2','m','9'
'4','u','10'
'3','a','11'
'1','m','15'
'2','m','15'
I need this to synchronize tables between different systems.
You can use the following query:
SELECT t1.id, t1.name, t2.id
FROM (
SELECT id, name,
@rn1 := IF(@n = name, @rn1 + 1,
IF(@n := name, 1, 1)) AS rn1
FROM Table1
CROSS JOIN (SELECT @rn1 := 0, @n := '') AS vars
ORDER BY name, id) AS t1
INNER JOIN (
SELECT id, name,
@rn2 := IF(@n = name, @rn2 + 1,
IF(@n := name, 1, 1)) AS rn2
FROM Table2
CROSS JOIN (SELECT @rn2 := 0, @n := '') AS vars
ORDER BY name, id
) AS t2 ON t1.name = t2.name AND t1.rn1 = t2.rn2
ORDER BY t1.id
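The query uses variables to simulate the ROW_NUMBER() window function, which is not available in MySQL before version 8.0. The variables @rn1 and @rn2 enumerate the records that belong to the same name partition, in the order determined by the id field. Joining on both name and row number pairs the first 'm' of Table1 with the first 'm' of Table2, the second with the second, and so on.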
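On MySQL 8.0 or later the same pairing can be sketched with ROW_NUMBER() directly. This assumes the tables are named Table1 and Table2 as in the query above; the aliases i1, i2 and rn are illustrative.

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT t1.id AS i1, t1.name, t2.id AS i2
FROM (
    -- number the rows of Table1 within each name, by id
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
    FROM Table1
) AS t1
INNER JOIN (
    -- number the rows of Table2 the same way
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
    FROM Table2
) AS t2 ON t1.name = t2.name AND t1.rn = t2.rn
ORDER BY t1.id;
```

With the sample data this pairs 1 with 8, 2 with 9, 3 with 11 and 4 with 10. Row 5 (g) has no match, and ids 12 and 15 from table 2 stay unused.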

how to port this postgresql lag statement to mysql?

Imagine a table that contains only IDs and created timestamps. How would I convert this query to MySQL?
SELECT created AS col_a , LAG (created) OVER ( ORDER by created ) AS col_b
FROM tester
You can use a correlated subquery:
SELECT t1.created AS col_a,
(SELECT created
FROM tester AS t2
WHERE t2.created < t1.created
ORDER BY created DESC LIMIT 1) AS col_b
FROM tester AS t1
or, use variables:
SELECT t1.created AS col_a, t2.created AS col_b
FROM (
SELECT created, @rn1 := @rn1 + 1 AS rn
FROM tester
CROSS JOIN (SELECT @rn1 := 0) AS var
ORDER BY created) AS t1
LEFT JOIN (
SELECT created, @rn2 := @rn2 + 1 AS rn
FROM tester
CROSS JOIN (SELECT @rn2 := 0) AS var
ORDER BY created
) AS t2 ON t1.rn = t2.rn + 1
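Note that MySQL 8.0 and later support LAG() as a window function, so on those versions the PostgreSQL statement should not need porting at all. A minimal sketch, assuming MySQL 8.0+ and the same tester table:

```sql
-- Assumes MySQL 8.0+; on earlier versions use one of the workarounds above.
SELECT created AS col_a,
       LAG(created) OVER (ORDER BY created) AS col_b  -- previous row's value, NULL for the first row
FROM tester;
```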

Getting the rank of a record using two where clauses

I am trying to get the rank of a specified record, and I have had some success with this code:
SELECT `rank`
FROM
(
select @rownum:=@rownum+1 `rank`, p.*
from TableName p, (SELECT @rownum:=0) r
order by point DESC
) s
WHERE names = 'house'
(See the schema here.)
This SQL query works, but if I want to get the result according to city_id and name, I must use two where clauses, and then the code doesn't work.
I want to get rank of house only for its city. How can I do this?
I want to get rank of "house" only for its city
You can do this by introducing another variable to keep track of the city:
SELECT `rank`
FROM (select p.*,
(@rn := if(@c = city_id, @rn + 1,
if(@c := city_id, 1, 1)
)
) as `rank`
from TableName p CROSS JOIN
(SELECT @rn := 0, @c := -1) params
order by city_id, point DESC
) s
WHERE names = 'house' ;
You can also use your original query, with a tweak, if you know that "house" only appears once in the data:
SELECT `rank`
FROM (select p.*, (@rn := @rn + 1) as `rank`
from TableName p CROSS JOIN
(SELECT @rn := 0) params
where city_id = (select city_id from TableName t2 where t2.names = 'house')
order by point DESC
) s
WHERE names = 'house' ;
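On MySQL 8.0 or later, a per-city rank can also be sketched with a window function. This assumes the columns names, city_id and point from the question. Because rank is a reserved word in 8.0, the alias is quoted with backticks.

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT s.`rank`
FROM (
    SELECT p.names,
           -- restart the numbering for every city, highest point first
           ROW_NUMBER() OVER (PARTITION BY p.city_id ORDER BY p.point DESC) AS `rank`
    FROM TableName AS p
) AS s
WHERE s.names = 'house';
```

ROW_NUMBER() matches the sequential numbering of the variable version. If rows with equal point should share a rank, RANK() or DENSE_RANK() can be used in its place.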

MySQL - Parallel merge two unrelated queries with same # of rows

I have two tables:
exam_outline_items:
jml_quiz_pool:
Of all the things I've tried, this got me the closest:
select t1.sequence, t1.title, t2.q_cat, t2.q_count
from student_pl.exam_outline_items t1
cross join pe_joomla.jml_quiz_pool t2
where t1.exam_outline_id = 5 and t1.chapter_num > 0
and t2.q_id = 1109 and t2.q_count > 0
group by title
Which produces this result:
I just need those q_cat values to be different, like they are in the 2nd query.
Thanks in advance for your help.
You have to have something to connect them with. If you don't have such a column, you can simulate one by creating a rownumber with variables.
select sequence, title, q_cat, q_count from (
select t1.sequence, t1.title, @r1 := @r1 + 1 as rownumber
from student_pl.exam_outline_items t1
, (select @r1 := 0) var_init
where t1.exam_outline_id = 5 and t1.chapter_num > 0
order by t1.sequence
) a
inner join
(
select t2.q_cat, t2.q_count, @r2 := @r2 + 1 as rownumber
from pe_joomla.jml_quiz_pool t2
, (select @r2 := 0) var_init
where t2.q_id = 1109 and t2.q_count > 0
order by t2.q_cat
) b on a.rownumber = b.rownumber;
Also note that I used ORDER BY in both subqueries. In a database, rows have no guaranteed order unless you specify one explicitly with ORDER BY, so without it the row numbers (and therefore the pairing) would be arbitrary.
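On MySQL 8.0 or later, the row numbers can be generated with ROW_NUMBER() instead of variables. This is a sketch under that version assumption, with the same tables and filters as above.

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT a.sequence, a.title, b.q_cat, b.q_count
FROM (
    SELECT t1.sequence, t1.title,
           ROW_NUMBER() OVER (ORDER BY t1.sequence) AS rownumber
    FROM student_pl.exam_outline_items AS t1
    WHERE t1.exam_outline_id = 5 AND t1.chapter_num > 0
) AS a
INNER JOIN (
    SELECT t2.q_cat, t2.q_count,
           ROW_NUMBER() OVER (ORDER BY t2.q_cat) AS rownumber
    FROM pe_joomla.jml_quiz_pool AS t2
    WHERE t2.q_id = 1109 AND t2.q_count > 0
) AS b ON a.rownumber = b.rownumber
ORDER BY a.rownumber;
```

Here the ordering is part of the window definition, so the numbering does not depend on the order in which the subquery happens to return its rows.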