How to track previous row status count - mysql

I want to count how often orders move from one status to another.
My Orderstatus table:
| id |ordr_id| status |
|----|-------|------------|
| 1 | 1 | pending |
| 2 | 1 | processing |
| 3 | 1 | complete |
| 4 | 2 | pending |
| 5 | 2 | cancelled |
| 6 | 3 | processing |
| 7 | 3 | complete |
| 8 | 4 | pending |
| 9 | 4 | processing |
Output I want:
| state | count |
|----------------------|-------|
| pending->processing | 2 |
| processing->complete | 2 |
| pending->cancelled | 1 |
Currently I'm fetching the results with SELECT order_id,GROUP_CONCAT(status) as track FROM table group by order_id and then processing the data in PHP to get the output. Is it possible to do this in the query itself?

Use lag():
select prev_status, status, count(*)
from (select t.*,
lag(status) over (partition by order_id order by id) as prev_status
from t
) t
group by prev_status, status;
LAG() is available in MySQL starting with version 8.
Note that you can filter out the first status for each order by putting where prev_status is not null in the outer query.
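As a rough check of the lag() approach with the prev_status is not null filter, here is a minimal sketch that runs the query against the sample data. It makes several assumptions: an in-memory SQLite database (3.25 or newer, for window functions) stands in for MySQL 8, the table is called orderstatus, the order column is ordr_id as in the question's table, and SQLite's || operator replaces MySQL's CONCAT().

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8; LAG() needs SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderstatus (id INTEGER PRIMARY KEY, ordr_id INTEGER, status TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO orderstatus VALUES (?, ?, ?)",
    [
        (1, 1, "pending"), (2, 1, "processing"), (3, 1, "complete"),
        (4, 2, "pending"), (5, 2, "cancelled"),
        (6, 3, "processing"), (7, 3, "complete"),
        (8, 4, "pending"), (9, 4, "processing"),
    ],
)

# Pair every status with the previous status of the same order (by id),
# drop the first status of each order, then count each transition.
query = """
SELECT prev_status || '->' || status AS state, COUNT(*) AS cnt
FROM (SELECT status,
             LAG(status) OVER (PARTITION BY ordr_id ORDER BY id) AS prev_status
      FROM orderstatus) AS t
WHERE prev_status IS NOT NULL
GROUP BY prev_status, status
ORDER BY state
"""
transitions = conn.execute(query).fetchall()
for state, cnt in transitions:
    print(state, cnt)
```

The three rows it prints match the output asked for in the question, in alphabetical rather than the original order.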
Your version is not quite correct, because it does not enforce the ordering. It should be:
SELECT order_id,
GROUP_CONCAT(status ORDER BY id) as track
EDIT:
In earlier versions of MySQL, you can use a correlated subquery:
select prev_status, status, count(*)
from (select t.*,
(select t2.status
from t t2
where t2.order_id = t.order_id and t2.id < t.id
order by t2.id desc
limit 1
) as prev_status
from t
) t
group by prev_status, status;

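If the id column reflects the sequence of records and the ids are contiguous, you can use a self join as below. Note that MySQL needs CONCAT() here, because + is numeric addition, and that the join should also match on the order so transitions are not counted across orders:
SELECT CONCAT(A.Status, '->', B.Status), COUNT(*)
FROM OrderStatus A
INNER JOIN OrderStatus B
ON A.id = B.id - 1
AND A.ordr_id = B.ordr_id
WHERE B.Status IS NOT NULL
GROUP BY CONCAT(A.Status, '->', B.Status)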

With a join of the 3 status change types to the grouping of the table that you already did:
select c.changetype, count(*) counter
from (
select 'pending->processing' changetype union all
select 'processing->complete' union all
select 'pending->cancelled'
) c inner join (
select
group_concat(status order by id separator '->') changestatus
from tablename
group by ordr_id
) t on concat('->', t.changestatus, '->') like concat('%->', changetype, '->%')
group by c.changetype
Results:
> changetype | counter
> :------------------- | ------:
> pending->cancelled | 1
> pending->processing | 2
> processing->complete | 2

...or just a simple join...
SELECT CONCAT(a.status,'->',b.status) action
, COUNT(*) total
FROM my_table a
JOIN my_table b
ON b.ordr_id = a.ordr_id
AND b.id = a.id + 1
GROUP
BY action;
+----------------------+-------+
| action | total |
+----------------------+-------+
| pending->cancelled | 1 |
| pending->processing | 2 |
| processing->complete | 2 |
+----------------------+-------+
Note that this relies on the fact that ids are contiguous.
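To see what the contiguity caveat means in practice, here is a small sketch with hypothetical data in which two orders are updated alternately, so consecutive ids belong to different orders. It assumes an in-memory SQLite database in place of MySQL and uses || instead of CONCAT(). The id + 1 join finds nothing, while a correlated subquery that looks up the previous row of the same order still finds both transitions.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, ordr_id INTEGER, status TEXT)"
)
# Orders 1 and 2 interleave, so no order has two rows with adjacent ids.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 1, "pending"), (2, 2, "pending"), (3, 1, "processing"), (4, 2, "cancelled")],
)

# Self join on id + 1: only works when an order's rows have contiguous ids.
adjacent = conn.execute("""
    SELECT a.status || '->' || b.status AS action, COUNT(*) AS total
    FROM my_table a
    JOIN my_table b ON b.ordr_id = a.ordr_id AND b.id = a.id + 1
    GROUP BY action
""").fetchall()

# Correlated subquery: previous row of the same order, whatever its id is.
previous = conn.execute("""
    SELECT prev_status || '->' || status AS action, COUNT(*) AS total
    FROM (SELECT t.status,
                 (SELECT t2.status
                  FROM my_table t2
                  WHERE t2.ordr_id = t.ordr_id AND t2.id < t.id
                  ORDER BY t2.id DESC
                  LIMIT 1) AS prev_status
          FROM my_table t) AS x
    WHERE prev_status IS NOT NULL
    GROUP BY action
    ORDER BY action
""").fetchall()

print("id + 1 join:", adjacent)
print("correlated subquery:", previous)
```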

Related

Duplicate and get the last item from the mysql table

table 1 t1
+----+----------+
| id | name |
+----+----------+
| 1 | free |
| 2 | basic |
| 3 | advanced |
+----+----------+
table 2 t2
+----+-------+------+
| id | t1_fk | cost |
+----+-------+------+
| 1 | 2 | 1650 |
| 3 | 3 | 2000 |
| 4 | 2 | 550 |
+----+-------+------+
I want to get the rows of t2 without duplicate t1_fk values. I was able to get this using GROUP BY, but I also need the last row of each group of duplicates (this is where I got stuck).
Here's what I tried; it didn't work:
SELECT id cost FROM t2 GROUP BY t1_fk ORDER BY MAX(id) DESC
Any help would be appreciated.
On MySQL 8+, we can use ROW_NUMBER here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY t1_fk ORDER BY id DESC) rn
FROM t2
)
SELECT id, t1_fk, cost
FROM cte
WHERE rn = 1;
On earlier versions of MySQL, one canonical way to handle this would be to use a join to a subquery which finds the max id value for each t1_fk:
SELECT a.id, a.t1_fk, a.cost
FROM t2 a
INNER JOIN
(
SELECT t1_fk, MAX(id) AS max_id
FROM t2
GROUP BY t1_fk
) b
ON a.t1_fk = b.t1_fk AND a.id = b.max_id;
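Both variants can be tried on the sample t2 rows. The sketch below assumes an in-memory SQLite database (3.25 or newer for ROW_NUMBER) as a stand-in for MySQL; the ORDER BY id is added only to make the output order predictable.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows copied from the question's t2 table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (id INTEGER PRIMARY KEY, t1_fk INTEGER, cost INTEGER)")
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?, ?)", [(1, 2, 1650), (3, 3, 2000), (4, 2, 550)]
)

# Variant 1: join each row to the MAX(id) of its t1_fk group.
joined = conn.execute("""
    SELECT a.id, a.t1_fk, a.cost
    FROM t2 a
    INNER JOIN (SELECT t1_fk, MAX(id) AS max_id FROM t2 GROUP BY t1_fk) b
        ON a.t1_fk = b.t1_fk AND a.id = b.max_id
    ORDER BY a.id
""").fetchall()

# Variant 2: number the rows of each group from newest to oldest, keep the first.
windowed = conn.execute("""
    WITH cte AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY t1_fk ORDER BY id DESC) AS rn
        FROM t2
    )
    SELECT id, t1_fk, cost FROM cte WHERE rn = 1 ORDER BY id
""").fetchall()

print(joined)
print(windowed)
```

Both queries keep row 3 (the only row for t1_fk 3) and row 4 (the later of the two rows for t1_fk 2).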

Select top most non-duplicated entry after ordering by other columns [duplicate]

This question already has answers here:
SQL select only rows with max value on a column [duplicate]
(27 answers)
Closed 5 years ago.
I would like to select the "top most" entry for each row with a duplicated column value.
Performing the following query -
SELECT *
FROM shop
ORDER BY shop.start_date DESC, shop.created_date DESC;
I get the result set -
+--------+---------+------------+--------------+
| row_id | shop_id | start_date | created_date |
+--------+---------+------------+--------------+
| 1 | 1 | 2017-02-01 | 2017-01-01 |
| 2 | 1 | 2017-01-01 | 2017-02-01 |
| 3 | 2 | 2017-01-01 | 2017-07-01 |
| 4 | 2 | 2017-01-01 | 2017-01-01 |
+--------+---------+------------+--------------+
Can I modify the SELECT so that I only get back the "top rows" for each unique shop_id? In this case, that would be row_ids 1 and 3. There can be 1..n rows with the same shop_id.
Similarly, if my query above returned the following order, I'd want to SELECT only row_ids 1 and 4, since those would be the "top most" entries for each shop_id.
+--------+---------+------------+--------------+
| row_id | shop_id | start_date | created_date |
+--------+---------+------------+--------------+
| 1 | 1 | 2017-02-01 | 2017-01-01 |
| 2 | 1 | 2017-01-01 | 2017-02-01 |
| 4 | 2 | 2017-01-01 | 2017-07-01 |
| 3 | 2 | 2017-01-01 | 2017-01-01 |
+--------+---------+------------+--------------+
You can do this by using a subquery:
select s.*
from shop s
where s.row_id = (
select row_id
from shop
where shop_id = s.shop_id
order by start_date desc, created_date desc
limit 1
)
Note that this query assumes row_id is unique within each shop_id.
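The correlated subquery can be checked against both result sets from the question. This is only a sketch: it assumes an in-memory SQLite database in place of MySQL, dates stored as ISO text (which sorts chronologically), and a hypothetical helper named top_row_ids.

```python
import sqlite3

def top_row_ids(rows):
    """Return the row_id of the top row per shop_id (hypothetical helper)."""
    # Assumption: SQLite stands in for MySQL; ISO date strings sort chronologically.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shop (row_id INTEGER PRIMARY KEY, shop_id INTEGER,"
        " start_date TEXT, created_date TEXT)"
    )
    conn.executemany("INSERT INTO shop VALUES (?, ?, ?, ?)", rows)
    # For each row, keep it only if it is the first row of its shop
    # when ordered by start_date and then created_date, newest first.
    result = conn.execute("""
        SELECT s.row_id
        FROM shop s
        WHERE s.row_id = (SELECT row_id
                          FROM shop
                          WHERE shop_id = s.shop_id
                          ORDER BY start_date DESC, created_date DESC
                          LIMIT 1)
        ORDER BY s.row_id
    """).fetchall()
    conn.close()
    return [row_id for (row_id,) in result]

# First result set from the question.
first = [
    (1, 1, "2017-02-01", "2017-01-01"), (2, 1, "2017-01-01", "2017-02-01"),
    (3, 2, "2017-01-01", "2017-07-01"), (4, 2, "2017-01-01", "2017-01-01"),
]
# Second result set: rows 3 and 4 have swapped created_date values.
second = [
    (1, 1, "2017-02-01", "2017-01-01"), (2, 1, "2017-01-01", "2017-02-01"),
    (4, 2, "2017-01-01", "2017-07-01"), (3, 2, "2017-01-01", "2017-01-01"),
]
print(top_row_ids(first))
print(top_row_ids(second))
```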
Or like this:
select t.*
from shop t
join (
select t2.shop_id, t2.start_date, max(t2.created_date) as created_date
from shop t2
join (
select max(start_date) as start_date, shop_id
from shop
group by shop_id
) t3 on t3.shop_id = t2.shop_id and t3.start_date = t2.start_date
group by t2.shop_id, t2.start_date
) t1 on t1.shop_id = t.shop_id and t.start_date = t1.start_date and t.created_date = t1.created_date
Note that if several records can share the same start_date and created_date for the same shop_id, you would also need a group by t.shop_id, t.start_date, t.created_date in the outer query, selecting min(row_id) along with the grouped columns.
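Try joining to a subquery which finds the "top" rows for each shop_id. This version assumes the top row of each shop is the one with the smallest row_id, which holds for the first result set above but not for the second: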
SELECT t1.*
FROM shop t1
INNER JOIN
(
SELECT shop_id, MIN(row_id) AS min_id
FROM shop
GROUP BY shop_id
) t2
ON t1.shop_id = t2.shop_id AND
t1.row_id = t2.min_id
ORDER BY
t1.start_date DESC,
t1.created_date DESC;

Calculate sum on the records while joining two tables with a third table

I have three tables:
mysql> select * from a;
+----+---------+
| ID | Name |
+----+---------+
| 1 | John |
| 2 | Alice |
+----+---------+
mysql> select * from b;
+------+------------+----------+
| UID | date | received |
+------+------------+----------+
| 1 | 2017-10-02 | 5 |
| 1 | 2017-09-30 | 1 |
| 1 | 2017-09-29 | 4 |
+------+------------+----------+
mysql> select * from c;
+------+------------+------+
| UID | date | sent |
+------+------------+------+
| 1 | 2017-09-25 | 7 |
| 1 | 2017-09-30 | 2 |
| 1 | 2017-09-29 | 3 |
+------+------------+------+
If I try to calculate the total number of sent for John, it would be 12. And for received, it would be 10.
But if I try to join all three tables, the result is weird. Here is my query to join three tables:
mysql> select sum(sent), sum(received) from a
-> join c on c.UID = a.ID
-> join b on b.UID = a.ID
-> where a.ID = 1;
+-----------+---------------+
| sum(sent) | sum(received) |
+-----------+---------------+
| 36 | 30 |
+-----------+---------------+
But I need correct numbers (12 and 10, respectively). How can I have correct numbers?
You should join the aggregated results, not the raw tables (note that table a's key column is id, not uid):
select a.id, t1.received, t2.sent
from a
inner join (
select uid, sum(received) received
from b
group by uid
) t1 on t1.uid = a.id
inner join (
select uid, sum(sent) sent
from c
group by uid
) t2 on t2.uid = a.id
where a.id = 1
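The inflated totals come from the join itself: John has three rows in b and three in c, so joining both to a produces 3 x 3 = 9 rows, and every value is summed three times. A minimal sketch of both behaviours, assuming an in-memory SQLite database in place of MySQL:

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (uid INTEGER, date TEXT, received INTEGER);
    CREATE TABLE c (uid INTEGER, date TEXT, sent INTEGER);
    INSERT INTO a VALUES (1, 'John'), (2, 'Alice');
    INSERT INTO b VALUES (1, '2017-10-02', 5), (1, '2017-09-30', 1), (1, '2017-09-29', 4);
    INSERT INTO c VALUES (1, '2017-09-25', 7), (1, '2017-09-30', 2), (1, '2017-09-29', 3);
""")

# Joining the raw tables pairs every b row with every c row for the same user.
naive = conn.execute("""
    SELECT SUM(sent), SUM(received)
    FROM a
    JOIN c ON c.uid = a.id
    JOIN b ON b.uid = a.id
    WHERE a.id = 1
""").fetchone()

# Aggregating each table first leaves one row per user on each side of the join.
aggregated = conn.execute("""
    SELECT t2.sent, t1.received
    FROM a
    INNER JOIN (SELECT uid, SUM(received) AS received FROM b GROUP BY uid) t1
        ON t1.uid = a.id
    INNER JOIN (SELECT uid, SUM(sent) AS sent FROM c GROUP BY uid) t2
        ON t2.uid = a.id
    WHERE a.id = 1
""").fetchone()

print("raw join:", naive)
print("pre-aggregated:", aggregated)
```

Because these are inner joins, a user with no rows in b or c (Alice here) would drop out of the result; a LEFT JOIN would keep such users with NULL totals.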
You could try the query below:
select bx.id, received, sum(c.sent) sent from
(
SELECT a.id, sum(b.received) received
from a
INNER JOIN b
ON a.id=b.uid
group by a.id
) bx
INNER JOIN c
ON c.uid=bx.id
group by bx.id, bx.received;
This gets rid of the subquery, but introduces something else you might not want (one row per direction instead of one row per user):
( SELECT uid, 'Received' AS direction, SUM(received) AS HowMany
FROM b
WHERE uid = 1
GROUP BY uid )
UNION ALL
( SELECT uid, 'Sent' AS direction, SUM(sent) AS HowMany
FROM c
WHERE uid = 1
GROUP BY uid )

Select two items with maximum number of common values

I have the following table:
+----+-----------+-----------+
| id | teacherId | studentId |
+----+-----------+-----------+
| 1 | 1 | 4 |
| 2 | 1 | 2 |
| 3 | 1 | 1 |
| 4 | 1 | 3 |
| 5 | 2 | 2 |
| 6 | 2 | 1 |
| 7 | 2 | 3 |
| 8 | 3 | 9 |
| 9 | 3 | 6 |
| 10 | 1 | 6 |
+----+-----------+-----------+
I need a query to find two teacherId's with maximum number of common studentId's.
In this case teachers with teacherIds 1,2 have common students with studentIds 2, 1, 3, which is greater than 1,3 having common students 6.
Thanks in Advance!
[Edit]: After several hours I came up with the following solution:
SELECT * FROM (
SELECT r1tid, r2tid, COUNT(r2tid) AS cnt
FROM (
SELECT r1.teacherId AS r1tid, r2.teacherId AS r2tid
FROM table r1
INNER JOIN table r2 ON r1.studentId=r2.studentId AND r1.teacherId!=r2.teacherId
ORDER BY r1tid
) t
GROUP BY r1tid, r2tid
ORDER BY cnt DESC
) t GROUP BY cnt ORDER BY cnt DESC LIMIT 1;
I am sure there must be a shorter and more elegant solution, but I could not find it.
You would do this with a self-join. Assuming no duplicates in the table:
select t.teacherid, t2.teacherid, count(*) as NumStudentsInCommon
from table t join
table t2
on t.studentid = t2.studentid and
t.teacherid < t2.teacherid
group by t.teacherid, t2.teacherid
order by NumStudentsInCommon desc
limit 1;
If you had duplicates, you would just replace count(*) with count(distinct t.studentid), but count(distinct) requires a bit more work.
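Here is a small sketch of the self-join on the question's data, with one extra row added to show the difference between count(*) and count(distinct). The assumptions: an in-memory SQLite database in place of MySQL, a hypothetical table name teacher_student (table itself is a reserved word), and a made-up duplicate row with id 11.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; teacher_student is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teacher_student (id INTEGER PRIMARY KEY, teacherId INTEGER,"
    " studentId INTEGER)"
)
conn.executemany(
    "INSERT INTO teacher_student VALUES (?, ?, ?)",
    [
        (1, 1, 4), (2, 1, 2), (3, 1, 1), (4, 1, 3), (5, 2, 2),
        (6, 2, 1), (7, 2, 3), (8, 3, 9), (9, 3, 6), (10, 1, 6),
        (11, 2, 3),  # made-up duplicate of row 7, not in the question
    ],
)

# t.teacherId < t2.teacherId lists each pair of teachers once.
# COUNT(*) counts joined rows, so the duplicate inflates it;
# COUNT(DISTINCT ...) counts each shared student once.
best = conn.execute("""
    SELECT t.teacherId, t2.teacherId,
           COUNT(*) AS joined_rows,
           COUNT(DISTINCT t.studentId) AS students_in_common
    FROM teacher_student t
    JOIN teacher_student t2
      ON t.studentId = t2.studentId AND t.teacherId < t2.teacherId
    GROUP BY t.teacherId, t2.teacherId
    ORDER BY students_in_common DESC
    LIMIT 1
""").fetchone()

print(best)
```

Teachers 1 and 2 come out on top with three distinct shared students, while the raw row count is four because of the duplicate.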
select t.teacherId, t2.teacherId, count(*) as NumStudentsInCommon
from table1 t join
table1 t2
on t.studentId = t2.studentId and
t.teacherId < t2.teacherId
group by t.teacherId, t2.teacherId
order by NumStudentsInCommon desc

How to select only the latest rows for each user?

My table looks like this:
id | user_id | period_id | completed_on
----------------------------------------
1 | 1 | 1 | 2010-01-01
2 | 2 | 1 | 2010-01-10
3 | 3 | 1 | 2010-01-13
4 | 1 | 2 | 2011-01-01
5 | 2 | 2 | 2011-01-03
6 | 2 | 3 | 2012-01-13
... | ... | ... | ...
I want to select only each user's latest period entry, bearing in mind that not all users have the same period entries.
Essentially (assuming all I have is the above table) I want to get this:
id | user_id | period_id | completed_on
----------------------------------------
3 | 3 | 1 | 2010-01-13
4 | 1 | 2 | 2011-01-01
6 | 2 | 3 | 2012-01-13
Both of the queries below always returned the first user_id occurrence rather than the latest (as I understand it, because the ordering happens after the rows are selected):
SELECT
DISTINCT user_id,
period_id,
completed_on
FROM my_table
ORDER BY
user_id ASC,
period_id DESC
SELECT *
FROM my_table
GROUP BY user_id
ORDER BY
user_id ASC,
period_id DESC
Seems like this should work using MAX and a subquery:
SELECT t.Id, t.User_Id, t.Period_Id, t.Completed_On
FROM my_table t
JOIN (SELECT Max(completed_on) Max_Completed_On, User_Id
FROM my_table
GROUP BY User_Id
) t2 ON
t.User_Id = t2.User_Id AND t.Completed_On = t2.Max_Completed_On
However, if you potentially have multiple records where the completed_on date is the same per user, then this could return multiple records. Depending on your needs, potentially adding a MAX(Id) in your subquery and joining on that would work.
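The MAX-plus-join approach can be tried on the six sample rows. This sketch assumes an in-memory SQLite database in place of MySQL and dates stored as ISO text; the ORDER BY is added only so the rows come back in the same order as the desired output.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, user_id INTEGER,"
    " period_id INTEGER, completed_on TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "2010-01-01"), (2, 2, 1, "2010-01-10"), (3, 3, 1, "2010-01-13"),
        (4, 1, 2, "2011-01-01"), (5, 2, 2, "2011-01-03"), (6, 2, 3, "2012-01-13"),
    ],
)

# Find each user's latest completed_on, then join back to fetch the full row.
latest = conn.execute("""
    SELECT t.id, t.user_id, t.period_id, t.completed_on
    FROM my_table t
    JOIN (SELECT user_id, MAX(completed_on) AS max_completed_on
          FROM my_table
          GROUP BY user_id) t2
      ON t.user_id = t2.user_id AND t.completed_on = t2.max_completed_on
    ORDER BY t.id
""").fetchall()

for row in latest:
    print(row)
```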
Try this:
SELECT t.Id, t.User_Id, t.Period_Id, t.Completed_On
FROM table1 t
JOIN (SELECT Max(completed_on) Max_Completed_On, t.User_Id
FROM table1 t
GROUP BY t.User_ID) t2 ON t.User_Id = t2.User_Id AND t.Completed_On = t2.Max_Completed_On