Get Duplicated Count Without Removing - mysql

I am trying to get duplicate counts but without actually removing duplicates.
I tried using GROUP BY id and then COUNT(id) but it removes all duplicate entries.
Is there any way to not remove duplicates?
The table looks like this:
ID1 ID2 Value
1 2 someval
1 3 someval
1 4 someval
2 3 someval
2 1 someval
3 1 someval
4 1 someval
I am trying to get this:
ID1 ID2 Value COUNT
1 2 someval 3
1 3 someval 3
1 4 someval 3
2 3 someval 2
2 1 someval 2
3 1 someval 1
4 1 someval 1
I used this:
SELECT ID1, ID2, Value, COUNT(ID1) FROM table GROUP BY ID1;

One way of doing this is to compute the count in a separate query and join on it:
SELECT t.id1, t.id2, t.value, cnt
FROM my_table t
JOIN (SELECT id1, count(*) AS cnt
FROM my_table
GROUP BY id1) c ON t.id1 = c.id1

You can do this with a correlated subquery in MySQL:
select id1, id2, value,
(select count(*) from table t2 where t2.id1 = t.id1) as count
from table t;

If performance is an issue then an uncorrelated subquery will likely be orders of magnitude faster than a correlated one...
SELECT x.*
, cnt
FROM my_table x
JOIN
( SELECT id1,COUNT(*) cnt FROM my_table GROUP BY id1) y
ON y.id1 = x.id1;
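To check the join and correlated-subquery answers above against the question's sample rows, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module rather than MySQL (an assumption made only to keep the example self-contained), and the table name my_table and the helper run are illustrative.

```python
import sqlite3

# Sample rows from the question: (id1, id2, value).
ROWS = [
    (1, 2, "someval"), (1, 3, "someval"), (1, 4, "someval"),
    (2, 3, "someval"), (2, 1, "someval"),
    (3, 1, "someval"), (4, 1, "someval"),
]

# Count per id1 computed once in a derived table, then joined back.
JOIN_SQL = """
SELECT t.id1, t.id2, t.value, c.cnt
FROM my_table t
JOIN (SELECT id1, COUNT(*) AS cnt
      FROM my_table
      GROUP BY id1) c ON c.id1 = t.id1
ORDER BY t.id1, t.id2
"""

# Same result, but the count is recomputed for every outer row.
CORRELATED_SQL = """
SELECT t.id1, t.id2, t.value,
       (SELECT COUNT(*) FROM my_table t2 WHERE t2.id1 = t.id1) AS cnt
FROM my_table t
ORDER BY t.id1, t.id2
"""


def run(sql):
    """Load the sample rows into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id1 INTEGER, id2 INTEGER, value TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


join_rows = run(JOIN_SQL)
correlated_rows = run(CORRELATED_SQL)
for row in join_rows:
    print(row)
```

Both queries keep all seven rows and attach the per-id1 count to each of them; they differ only in how the count is computed.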

Try something like this:
SELECT YourColumn, COUNT(*) TotalCount
FROM YourTable
GROUP BY YourColumn
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

Related

SQL : Summing the values in a column till first non zero values appears in other column?

Suppose I have a table t1 that looks like this:
id value1 value2 wk_id
1 2 0 1
1 1 1 2
1 3 0 3
2 2 1 2
2 2 0 3
3 1 0 2
3 2 0 4
3 3 0 5
I want to sum value1 per id until a non-zero value appears in value2 for the first time.
The end result must look like this:
id value1
1 2
2 0
3 6
How can I do this in SQL?
If your MySQL version supports window functions, you can use SUM as a window function over a conditional expression to build a flag that captures your logic (the flag stays 0 until a non-zero value appears in value2 for the first time).
Then apply conditional aggregation again on that flag.
Query #1
SELECT id,
SUM(CASE WHEN flag = 0 THEN value1 ELSE 0 END) value1
FROM (
SELECT *,
SUM(CASE WHEN value2 = 1 THEN -1 ELSE 0 END) OVER(PARTITION BY ID ORDER BY wk_id) flag
FROM T
) t1
GROUP BY id;
id value1
1 2
2 0
3 6
WITH cte AS (
SELECT *,
CASE WHEN SUM(value2) OVER (partition by id ORDER BY wk_id) = 0
THEN SUM(value1) OVER (partition by id ORDER BY wk_id)
ELSE 0
END sum_value1
FROM test
ORDER BY id, wk_id
)
SELECT id, MAX(sum_value1) sum_value1
FROM cte
GROUP BY id;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=0fcdca008e4a821f952de4d608434bcf
One option is a NOT EXISTS clause, looking for the stop row (the first row with value2 = 1).
select id, sum(value1)
from mytable
where not exists
(
select null
from mytable stoprow
where stoprow.id = mytable.id
and stoprow.wk_id <= mytable.wk_id
and stoprow.value2 = 1
)
group by id
order by id;
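As a quick check of the NOT EXISTS query above, here is a sketch that runs it on the question's sample rows. It uses Python's sqlite3 module instead of MySQL (an assumption, for a self-contained example), and the table name mytable follows the answer.

```python
import sqlite3

# Sample rows from the question: (id, value1, value2, wk_id).
ROWS = [
    (1, 2, 0, 1), (1, 1, 1, 2), (1, 3, 0, 3),
    (2, 2, 1, 2), (2, 2, 0, 3),
    (3, 1, 0, 2), (3, 2, 0, 4), (3, 3, 0, 5),
]

# Keep only rows that come strictly before the first row with value2 = 1.
SQL = """
SELECT id, SUM(value1)
FROM mytable
WHERE NOT EXISTS (
    SELECT NULL
    FROM mytable stoprow
    WHERE stoprow.id = mytable.id
      AND stoprow.wk_id <= mytable.wk_id
      AND stoprow.value2 = 1
)
GROUP BY id
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, value1 INTEGER, value2 INTEGER, wk_id INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", ROWS)
totals = conn.execute(SQL).fetchall()
conn.close()

print(totals)  # [(1, 2), (3, 6)]
```

Note that id 2 does not appear at all here: its first row is already the stop row, so the WHERE clause removes every row for that id before grouping. If a row with a total of 0 is required for such ids, the filter has to move into a conditional sum or the result has to be outer-joined to the list of ids.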
Aggregating on a calculated rolling total of value2 can also be done via a self-join.
select id
, sum(if(roll_tot_value2=0,value1,0)) as total
from
(
select t1a.id, t1a.wk_id, t1a.value1
, sum(t1b.value2) as roll_tot_value2
from t1 as t1a
join t1 as t1b
on t1b.id = t1a.id
and t1b.wk_id <= t1a.wk_id
group by t1a.id, t1a.wk_id, t1a.value1
) q
group by id;
id total
1 2
2 0
3 6
You can use a subquery:
select t.id, coalesce(
(select sum(t2.value1) from t1 t2 where t2.value2 = 0 and t2.id = t.id
and (not exists (select 1 from t1 t3 where t3.value2 = 1 and t3.id = t.id)
or t2.wk_id < (select min(t4.wk_id) from t1 t4 where t4.id = t.id and t4.value2 = 1))), 0)
from t1 t group by t.id

mysql select distinct n row that has a certain value

Consider this table:
id name number Code
1 red 1 A
2 red 3 B
3 blue 3 C
4 blue 5 A
5 purple 2 D
6 yellow 3 D
7 yellow 4 C
Now I need a query that returns 2 random rows such that one row has name = red and one row has number = 3, kind of like this:
SELECT * FROM table WHERE name = "red" LIMIT 1 and number = 3 LIMIT 1
For example rows 1 + 3, rows 1 + 6, or row 2 plus any other row.
Here is my query:
SELECT * FROM table
group by name,number
having count(name="red") = 1
and count(number=3) = 1
ORDER BY RAND()
LIMIT 2;
However, it seems to just return random rows without satisfying my requirement. Can anyone show me what is wrong?
Thank you.
I think that this will do what you want:
select t1.*
from tablename t1
inner join (
select t1.id id1, t2.id id2
from tablename t1 inner join tablename t2
on t2.id > t1.id
and ('red' in (t1.name, t2.name)) + ('3' in (t1.number, t2.number)) = 2
order by rand() limit 1
) t2 on t1.id in (t2.id1, t2.id2)
Note that the row with the highest probability to be returned is id = 2, because it can be combined with any other row of the table.
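To see which pairs the self-join can pick from, and why id = 2 dominates, here is a sketch that lists every qualifying pair instead of sampling one. It runs on Python's sqlite3 module rather than MySQL (an assumption), and the join condition is rewritten with plain OR comparisons because SQLite compares the string '3' with an integer column differently from MySQL.

```python
import sqlite3

# Sample rows from the question: (id, name, number, code).
ROWS = [
    (1, "red", 1, "A"), (2, "red", 3, "B"), (3, "blue", 3, "C"),
    (4, "blue", 5, "A"), (5, "purple", 2, "D"),
    (6, "yellow", 3, "D"), (7, "yellow", 4, "C"),
]

# Every unordered pair (t1.id < t2.id) that contains at least one red row
# and at least one row with number = 3. The answer picks one of these at random.
PAIRS_SQL = """
SELECT t1.id, t2.id
FROM tablename t1
JOIN tablename t2 ON t2.id > t1.id
WHERE (t1.name = 'red' OR t2.name = 'red')
  AND (t1.number = 3 OR t2.number = 3)
ORDER BY t1.id, t2.id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER, name TEXT, number INTEGER, code TEXT)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", ROWS)
pairs = conn.execute(PAIRS_SQL).fetchall()
conn.close()

pairs_with_row_2 = sum(1 for pair in pairs if 2 in pair)
print(pairs)
print(pairs_with_row_2, "of", len(pairs), "pairs contain id = 2")
```

On the sample data there are 8 candidate pairs and 6 of them contain id = 2, which is why a uniform random pick returns that row most often.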
If you can live with odd formatting...
select x.id x_id
, x.name x_name
, x.number x_number
, x.code x_code
, y.id y_id
, y.name y_name
, y.number y_number
, y.code y_code
from my_table x
join my_table y
on y.id <> x.id
where x.name = 'red'
and y.number = 3
order by rand()
limit 1;
https://www.db-fiddle.com/f/6JmLKq1RwaPrSwS3zx4Qmt/0
Previously, I posted this solution, but it too has some flaws, I think. But TB liked it, so I'll keep it here...
select *
from my_table where name = 'red'
union distinct
select *
from my_table where number = 3
order by rand()
limit 2

Incrementing count ONLY for duplicates in MySQL

Here is my MySQL table. I updated the question by adding an 'id' column to it (as instructed in the comments by others).
id data_id
1 2355
2 2031
3 1232
4 9867
5 2355
6 4562
7 1232
8 2355
I want to add a new column called row_num to assign an incrementing number ONLY for duplicates, as shown below. Order of the results does not matter.
id data_id row_num
3 1232 1
7 1232 2
2 2031 null
1 2355 1
5 2355 2
8 2355 3
6 4562 null
4 9867 null
I followed this answer and came up with the code below. But it also assigns a count of '1' to non-duplicate values. How can I modify it to add a count only for duplicates?
select data_id,row_num
from (
select data_id,
@row:=if(@prev=data_id,@row,0) + 1 as row_num,
@prev:=data_id
from my_table
)t
If you are running MySQL 8.0, you can do this more efficiently with window functions only:
select
data_id,
case when count(*) over(partition by data_id) > 1
then row_number() over(partition by data_id order by data_id)
end as row_num
from mytable
When the window count returns more than 1, you know that the current data_id has duplicates, in which case you can use row_number() to assign the incrementing number.
Note that, in the absence of an ordering column that uniquely identifies each record within the group sharing the same data_id, it is undefined which record will actually get each number.
I am assuming that id is the column that defines the order on the rows.
In MySQL 8 you can use row_number() to get the number of each data_id and a CASE with EXISTS to exclude the rows which have no duplicate.
SELECT t1.data_id,
CASE
WHEN EXISTS (SELECT *
FROM my_table t2
WHERE t2.data_id = t1.data_id
AND t2.id <> t1.id) THEN
row_number() OVER (PARTITION BY t1.data_id
ORDER BY t1.id)
END row_num
FROM my_table t1;
In older versions you can use a subquery counting the rows with the same data_id but smaller id. With an EXISTS in a HAVING clause you can exclude the rows that have no duplicate.
SELECT t1.data_id,
(SELECT count(*)
FROM my_table t2
WHERE t2.data_id = t1.data_id
AND t2.id < t1.id
HAVING EXISTS (SELECT *
FROM my_table t2
WHERE t2.data_id = t1.data_id
AND t2.id <> t1.id)) + 1 row_num
FROM my_table t1;
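The subquery idea for older versions can be tried without window functions. Below is a minimal sketch on the question's data, run through Python's sqlite3 module rather than MySQL (an assumption). It is a simplified variant, not the exact query above: a CASE decides whether the data_id has duplicates, and the number is the count of rows with the same data_id and an id less than or equal to the current one.

```python
import sqlite3

# Sample rows from the question: (id, data_id).
ROWS = [
    (1, 2355), (2, 2031), (3, 1232), (4, 9867),
    (5, 2355), (6, 4562), (7, 1232), (8, 2355),
]

SQL = """
SELECT t1.id,
       t1.data_id,
       CASE
         -- only number the row when its data_id occurs more than once
         WHEN (SELECT COUNT(*) FROM my_table d
               WHERE d.data_id = t1.data_id) > 1
         THEN (SELECT COUNT(*) FROM my_table p
               WHERE p.data_id = t1.data_id
                 AND p.id <= t1.id)
       END AS row_num
FROM my_table t1
ORDER BY t1.data_id, t1.id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, data_id INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", ROWS)
numbered = conn.execute(SQL).fetchall()
conn.close()

for row in numbered:
    print(row)
```

A CASE without an ELSE branch yields NULL, which Python reports as None for the data_ids that occur only once.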
Join with a query that returns the number of duplicates.
select t1.data_id, IF(t2.dups > 1, row_num, '') AS row_num
from (
select data_id,
@row:=if(@prev=data_id,@row,0) + 1 as row_num,
@prev:=data_id
from my_table
order by data_id
) AS t1
join (
select data_id, COUNT(*) AS dups
FROM my_table
GROUP BY data_id
) AS t2 ON t1.data_id = t2.data_id
If you want to have the old "order" of the old table, you need much more code
SELECT
data_id, IF (row_num = 1 AND cntid = 1, NULL,row_num)
FROM
(SELECT
@row:=IF(@prev = t1.data_id, @row, 0) + 1 AS row_num,
cntid,
@prev:=t1.data_id data_id
FROM
(SELECT
*
FROM
my_table
ORDER BY data_id) t1
INNER JOIN (SELECT Count(*) cntid,data_id FROM my_table GROUP BY data_id)t2
ON t1.data_id = t2.data_id) t2
data_id | IF (row_num = 1 AND cntid = 1, NULL,row_num)
------: | -------------------------------------------:
1232 | 1
1232 | 2
2031 | null
2355 | 1
2355 | 2
2355 | 3
4562 | null
9867 | null

select rows with condition in other rows

I want to select, for each ticket, the row with the last status_Id, but only if that ticket also has a row with status_Id = 2.
ticketStatus_Id ticket_Id status_Id
======================================
1 1 1
2 1 2 -
3 1 3 *
4 2 1
5 3 1
6 3 2 - *
7 4 1
8 4 2 -
9 4 3
10 4 4 *
I want to select just rows 3, 6 and 10. Each of those ticket_Ids has a row with status_Id = 2 (rows 2, 6, 8).
In other words: how do I select rows 3, 6 and 10 (ticket_Id = 1, 3, 4), given that each of those ticket_Ids has a row with status_Id = 2 (rows 2, 6, 8)?
If you want the complete row, then I would view this as exists:
select t.*
from t
where exists (select 1
from t t2
where t2.ticket_id = t.ticket_id and t2.status_id = 2
) and
t.status_Id = (select max(t2.status_id)
from t t2
where t2.ticket_id = t.ticket_id
);
If you just want the ticket_id and status_id (and not the whole row), I would recommend aggregation:
select ticket_id, max(status_id)
from t
group by ticket_id
having sum(status_id = 2) > 0;
In your case, ticketStatus_Id seems to increase with status_id, so you can use:
select max(ticketStatus_Id) as ticketStatus_Id, ticket_id, max(status_id) as Status_Id
from t
group by ticket_id
having sum(status_id = 2) > 0;
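Here is a small sketch of the aggregation above on the question's rows, run through Python's sqlite3 module instead of MySQL (an assumption; both treat the comparison status_id = 2 as 0 or 1 inside SUM). The table name t follows the answer.

```python
import sqlite3

# Sample rows from the question: (ticketStatus_Id, ticket_Id, status_Id).
ROWS = [
    (1, 1, 1), (2, 1, 2), (3, 1, 3),
    (4, 2, 1),
    (5, 3, 1), (6, 3, 2),
    (7, 4, 1), (8, 4, 2), (9, 4, 3), (10, 4, 4),
]

# SUM(status_id = 2) counts the status-2 rows of each ticket, so the HAVING
# clause keeps only tickets that passed through status 2 at some point.
SQL = """
SELECT MAX(ticketStatus_Id) AS ticketStatus_Id,
       ticket_id,
       MAX(status_id) AS status_id
FROM t
GROUP BY ticket_id
HAVING SUM(status_id = 2) > 0
ORDER BY ticket_id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ticketStatus_Id INTEGER, ticket_id INTEGER, status_id INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)
last_rows = conn.execute(SQL).fetchall()
conn.close()

print(last_rows)  # [(3, 1, 3), (6, 3, 2), (10, 4, 4)]
```

Taking the two MAX values independently only reproduces a real row because ticketStatus_Id grows together with status_id in this data; if that does not hold, one of the whole-row approaches is safer.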
First, for each ticket we get the row with the highest status. We can do this with a self-join. Each row is joined with the row with the next highest status. We select the rows which have no higher status, those will be the highest. Here's a more detailed explanation.
select ts1.*
from ticket_statuses ts1
left outer join ticket_statuses ts2
on ts1.ticket_Id = ts2.ticket_Id
and ts1.status_Id < ts2.status_Id
where ts2.ticketStatus_Id is null
3 1 3
4 2 1
6 3 2
10 4 4
11 5 3
Note that I've added a curve-ball of 11, 5, 3 to ensure we only select tickets with a status of 2, not greater than 2.
Then we can use that as a CTE (or subquery if you're not using MySQL 8) and select only those tickets who have a status of 2.
with max_statuses as (
select ts1.*
from ticket_statuses ts1
left outer join ticket_statuses ts2
on ts1.ticket_Id = ts2.ticket_Id
and ts1.status_Id < ts2.status_Id
where ts2.ticketStatus_Id is null
)
select ms.*
from max_statuses ms
join ticket_statuses ts
on ms.ticket_id = ts.ticket_id
and ts.status_id = 2;
3 1 3
6 3 2
10 4 4
This approach ensures we select the complete rows with the highest statuses and any extra data they may contain.
This is basically a "last row per group" problem. You will find some solutions here. My preferred solution would be:
select t.*
from (
select max(ticketStatus_Id) as ticketStatus_Id
from mytable
group by ticket_Id
) tmax
join mytable t using(ticketStatus_Id)
The difference in your question is that you have a condition requiring a specific value within the group. This can be solved with a JOIN within the subquery:
select t.*
from (
select max(t1.ticketStatus_Id) as ticketStatus_Id
from mytable t2
join mytable t1 using(ticket_Id)
where t2.status_Id = 2
group by t2.ticket_Id
) tmax
join mytable t using(ticketStatus_Id)
Result:
| ticketStatus_Id | ticket_Id | status_Id |
| --------------- | --------- | --------- |
| 3 | 1 | 3 |
| 6 | 3 | 2 |
| 10 | 4 | 4 |
A solution using window functions could be:
select ticketStatus_Id, ticket_Id, status_Id
from (
select *
, row_number() over (partition by ticket_Id order by ticketStatus_Id desc) as rn
, bit_or(status_Id = 2) over (partition by ticket_Id) > 0 as has_status2
from mytable
) x
where has_status2 and rn = 1
A quite expressive way is to use EXISTS and NOT EXISTS subquery conditions:
select t.*
from mytable t
where exists (
select *
from mytable t1
where t1.ticket_Id = t.ticket_Id
and t1.status_Id = 2
)
and not exists (
select *
from mytable t1
where t1.ticket_Id = t.ticket_Id
and t1.ticketStatus_Id > t.ticketStatus_Id
)
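The EXISTS / NOT EXISTS version can be checked the same way. This sketch uses Python's sqlite3 module rather than MySQL (an assumption) and adds the extra row 11, 5, 3 mentioned in an earlier answer, so that a ticket whose status is above 2 but which never had status 2 is covered as well.

```python
import sqlite3

# Rows from the question: (ticketStatus_Id, ticket_Id, status_Id).
ROWS = [
    (1, 1, 1), (2, 1, 2), (3, 1, 3),
    (4, 2, 1),
    (5, 3, 1), (6, 3, 2),
    (7, 4, 1), (8, 4, 2), (9, 4, 3), (10, 4, 4),
    (11, 5, 3),  # extra test row: ticket 5 never had status 2
]

SQL = """
SELECT t.ticketStatus_Id, t.ticket_Id, t.status_Id
FROM mytable t
WHERE EXISTS (            -- the ticket has some row with status 2
    SELECT 1
    FROM mytable t1
    WHERE t1.ticket_Id = t.ticket_Id
      AND t1.status_Id = 2
)
AND NOT EXISTS (          -- and no later row exists for the same ticket
    SELECT 1
    FROM mytable t1
    WHERE t1.ticket_Id = t.ticket_Id
      AND t1.ticketStatus_Id > t.ticketStatus_Id
)
ORDER BY t.ticketStatus_Id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (ticketStatus_Id INTEGER, ticket_Id INTEGER, status_Id INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)
selected = conn.execute(SQL).fetchall()
conn.close()

print(selected)  # [(3, 1, 3), (6, 3, 2), (10, 4, 4)]
```

Tickets 2 and 5 drop out because neither has a status_Id = 2 row, and the full last row is returned for the remaining tickets.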
SELECT a.*
FROM t a
JOIN
(
SELECT ticket_id, MAX(status_id) max_status_id
FROM t
WHERE status_id >= 2
GROUP BY ticket_id
) b
ON a.ticket_id = b.ticket_id
AND a.status_id = b.max_status_id;
SELECT
MAX(m1.ticketstatus_Id) as ticket_status,
m1.ticket_Id as ticket,
MAX(m1.status_Id) as status
FROM mytable m1
WHERE
m1.ticket_Id in (select m2.ticket_Id from mytable m2 where m2.ticket_Id=m1.ticket_Id and m2.status_Id=2)
GROUP BY m1.ticket_Id

how to exclude first and last row if group it on particular id

I have sample table with data like this
id uniqueid values
1 6 0
2 6 1
3 6 2
4 6 0
5 6 1
I want result like this
id uniqueid values
2 6 1
3 6 2
4 6 0
I tried like this
select id,uniqueid,values
FROM t1
WHERE
id not in(SELECT concat(MAX(message_id_pk),',',min(message_id_pk)) FROM t1
where uniqueid=6)
and `uniqueid`=6
GROUP BY uniqueid
but it's not working
You can achieve the desired result with a self-join: the inner query gets the min and max ids per group, and the outer query filters out the rows that match minid and maxid.
select a.*
from demo a
join (
select `uniqueid`,min(id) minid, max(id) maxid
from demo
where uniqueid=6
group by `uniqueid`
) b using(`uniqueid`)
where a.id > b.minid and a.id < b.maxid /* a.id <> b.minid and a.id <> b.maxid */
Also you can do it by using 2 sub-queries with EXISTS to exclude the min and max id of each uniqueid.
Query
select `id`, `uniqueid`, `values`
from `your_table_name` t1
where exists (
select 1 from `your_table_name` t2
where t2.`uniqueid` = t1.`uniqueid`
and t2.`id` > t1.`id`
)
and exists(
select 1 from `your_table_name` t2
where t2.`uniqueid` = t1.`uniqueid`
and t2.`id` < t1.`id`
);
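A short sketch of the double EXISTS query on the question's rows, using Python's sqlite3 module instead of MySQL (an assumption; SQLite also accepts backtick-quoted identifiers). The table is called t1 here, and the three rows with uniqueid 7 are hypothetical additions to show that the first and last rows are dropped per group.

```python
import sqlite3

# (id, uniqueid, values). The uniqueid = 6 rows come from the question;
# the uniqueid = 7 rows are made up for illustration.
ROWS = [
    (1, 6, 0), (2, 6, 1), (3, 6, 2), (4, 6, 0), (5, 6, 1),
    (6, 7, 5), (7, 7, 6), (8, 7, 7),
]

# A row survives only if its group has both a larger id and a smaller id,
# which excludes the first and the last row of every uniqueid.
SQL = """
SELECT `id`, `uniqueid`, `values`
FROM t1
WHERE EXISTS (
    SELECT 1 FROM t1 t2
    WHERE t2.`uniqueid` = t1.`uniqueid`
      AND t2.`id` > t1.`id`
)
AND EXISTS (
    SELECT 1 FROM t1 t2
    WHERE t2.`uniqueid` = t1.`uniqueid`
      AND t2.`id` < t1.`id`
)
ORDER BY `id`
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (`id` INTEGER, `uniqueid` INTEGER, `values` INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", ROWS)
middle_rows = conn.execute(SQL).fetchall()
conn.close()

print(middle_rows)  # [(2, 6, 1), (3, 6, 2), (4, 6, 0), (7, 7, 6)]
```

The column name values is a reserved word, which is why it stays quoted in backticks throughout.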
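Try this:
-- aggregates cannot be used directly in WHERE, so MIN and MAX go into scalar subqueries
SELECT id, uniqueid, `values`
FROM YOUR_TABLE
WHERE id NOT IN ((SELECT MIN(id) FROM YOUR_TABLE), (SELECT MAX(id) FROM YOUR_TABLE));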