How to select only the latest rows for each user? - mysql

My table looks like this:
id | user_id | period_id | completed_on
----------------------------------------
1 | 1 | 1 | 2010-01-01
2 | 2 | 1 | 2010-01-10
3 | 3 | 1 | 2010-01-13
4 | 1 | 2 | 2011-01-01
5 | 2 | 2 | 2011-01-03
6 | 2 | 3 | 2012-01-13
... | ... | ... | ...
I want to select only the latest period entry for each user, bearing in mind that users will not all have the same period entries.
Essentially (assuming all I have is the above table) I want to get this:
id | user_id | period_id | completed_on
----------------------------------------
3 | 3 | 1 | 2010-01-13
4 | 1 | 2 | 2011-01-01
6 | 2 | 3 | 2012-01-13
Both of the queries below always select the first user_id occurrence, not the latest (because, as I understand it, the ordering happens after the rows are selected):
SELECT
DISTINCT user_id,
period_id,
completed_on
FROM my_table
ORDER BY
user_id ASC,
period_id DESC;
SELECT *
FROM my_table
GROUP BY user_id
ORDER BY
user_id ASC,
period_id DESC;

Seems like this should work using MAX and a subquery:
SELECT t.Id, t.User_Id, t.Period_Id, t.Completed_On
FROM my_table t
JOIN (SELECT Max(completed_on) Max_Completed_On, User_Id
FROM my_table
GROUP BY User_Id
) t2 ON
t.User_Id = t2.User_Id AND t.Completed_On = t2.Max_Completed_On
However, if you potentially have multiple records where the completed_on date is the same per user, then this could return multiple records. Depending on your needs, potentially adding a MAX(Id) in your subquery and joining on that would work.
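One way to handle such ties is sketched below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so the example can be run as-is; the extra row with id 7 is a hypothetical tie added for illustration, and the aliases `latest` and `pick` are made up. It also assumes that, among rows sharing the same latest date, the one with the highest id is the one you want.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, period_id INTEGER, completed_on TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "2010-01-01"),
        (2, 2, 1, "2010-01-10"),
        (3, 3, 1, "2010-01-13"),
        (4, 1, 2, "2011-01-01"),
        (5, 2, 2, "2011-01-03"),
        (6, 2, 3, "2012-01-13"),
        (7, 2, 4, "2012-01-13"),  # hypothetical tie with id 6 for user 2
    ],
)

# Inner subquery: latest date per user.
# Middle subquery: highest id among the rows that carry that latest date.
# Outer query: fetch the full row for each chosen id.
query = """
SELECT t.id, t.user_id, t.period_id, t.completed_on
FROM my_table t
JOIN (
    SELECT m.user_id, MAX(m.id) AS max_id
    FROM my_table m
    JOIN (
        SELECT user_id, MAX(completed_on) AS max_completed_on
        FROM my_table
        GROUP BY user_id
    ) latest
      ON latest.user_id = m.user_id
     AND latest.max_completed_on = m.completed_on
    GROUP BY m.user_id
) pick ON pick.max_id = t.id
ORDER BY t.id
"""
latest_rows = conn.execute(query).fetchall()
for row in latest_rows:
    print(row)
```

Each user comes back exactly once, even though user 2 has two rows dated 2012-01-13.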

Try this:
SELECT t.Id, t.User_Id, t.Period_Id, t.Completed_On
FROM table1 t
JOIN (SELECT Max(completed_on) Max_Completed_On, t.User_Id
FROM table1 t
GROUP BY t.User_ID) t2 ON t.User_Id = t2.User_Id AND t.Completed_On = t2.Max_Completed_On

Related

How to track previous row status count

I want to calculate count of order status changes within different states.
My Orderstatus table:
| id |ordr_id| status |
|----|-------|------------|
| 1 | 1 | pending |
| 2 | 1 | processing |
| 3 | 1 | complete |
| 4 | 2 | pending |
| 5 | 2 | cancelled |
| 6 | 3 | processing |
| 7 | 3 | complete |
| 8 | 4 | pending |
| 9 | 4 | processing |
Output I want:
| state | count |
|----------------------|-------|
| pending->processing | 2 |
| processing->complete | 2 |
| pending->cancelled | 1 |
Currently I fetch the results with SELECT order_id, GROUP_CONCAT(status) AS track FROM table GROUP BY order_id and then process the data in PHP to get the output. Is it possible to do this in the query itself?
Use lag():
select prev_status, status, count(*)
from (select t.*,
lag(status) over (partition by order_id order by id) as prev_status
from t
) t
group by prev_status, status;
LAG() is available in MySQL starting with version 8.
Note that you can filter out the first status for each order by putting where prev_status is not null in the outer query.
Your version is not quite correct, because it does not enforce the ordering. It should be:
SELECT order_id,
GROUP_CONCAT(status ORDER BY id) as track
FROM t
GROUP BY order_id;
EDIT:
In earlier versions of MySQL, you can use a correlated subquery:
select prev_status, status, count(*)
from (select t.*,
(select t2.status
from t t2
where t2.order_id = t.order_id and t2.id < t.id
order by t2.id desc
limit 1
) as prev_status
from t
) t
group by prev_status, status;
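As a rough check of the correlated-subquery approach, the sketch below loads the question's sample rows into an in-memory SQLite database through Python's built-in sqlite3 module (a stand-in for MySQL, chosen only so the example runs anywhere). The table name `t` and column name `order_id` follow the query above; the alias `s` is made up. First statuses are filtered out with `prev_status IS NOT NULL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 1, "pending"), (2, 1, "processing"), (3, 1, "complete"),
        (4, 2, "pending"), (5, 2, "cancelled"),
        (6, 3, "processing"), (7, 3, "complete"),
        (8, 4, "pending"), (9, 4, "processing"),
    ],
)

# For every row, look up the status of the closest earlier row of the same
# order, then count each (previous status, status) pair.
query = """
SELECT prev_status, status, COUNT(*) AS cnt
FROM (
    SELECT t.status,
           (SELECT t2.status
            FROM t t2
            WHERE t2.order_id = t.order_id AND t2.id < t.id
            ORDER BY t2.id DESC
            LIMIT 1) AS prev_status
    FROM t
) s
WHERE prev_status IS NOT NULL
GROUP BY prev_status, status
ORDER BY prev_status, status
"""
transitions = conn.execute(query).fetchall()
for prev_status, status, cnt in transitions:
    print(f"{prev_status}->{status}: {cnt}")
```

With the sample data this gives one pending->cancelled, two pending->processing and two processing->complete transitions, matching the output requested in the question.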
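If the id column guarantees the sequence of records, you can use a self join, as below. In MySQL, strings are concatenated with CONCAT (the + operator performs numeric addition), and the join must also match on the order so that transitions are not counted across different orders:
SELECT CONCAT(A.Status, '->', B.Status) AS state, COUNT(*)
FROM OrderStatus A
INNER JOIN OrderStatus B
ON A.id = B.id - 1
AND A.ordr_id = B.ordr_id
WHERE B.Status IS NOT NULL
GROUP BY CONCAT(A.Status, '->', B.Status)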
With a join of the 3 status change types to the grouping of the table that you already did:
select c.changetype, count(*) counter
from (
select 'pending->processing' changetype union all
select 'processing->complete' union all
select 'pending->cancelled'
) c inner join (
select
group_concat(status order by id separator '->') changestatus
from tablename
group by ordr_id
) t on concat('->', t.changestatus, '->') like concat('%->', changetype, '->%')
group by c.changetype
Results:
changetype | counter
:------------------- | ------:
pending->cancelled | 1
pending->processing | 2
processing->complete | 2
...or just a simple join...
SELECT CONCAT(a.status,'->',b.status) action
, COUNT(*) total
FROM my_table a
JOIN my_table b
ON b.ordr_id = a.ordr_id
AND b.id = a.id + 1
GROUP
BY action;
+----------------------+-------+
| action | total |
+----------------------+-------+
| pending->cancelled | 1 |
| pending->processing | 2 |
| processing->complete | 2 |
+----------------------+-------+
Note that this relies on the fact that ids are contiguous.

Select top most non-duplicated entry after ordering by other columns [duplicate]

I would like to select the "top most" entry for each row with a duplicated column value.
Performing the following query -
SELECT *
FROM shop
ORDER BY shop.start_date DESC, shop.created_date DESC;
I get the result set -
+--------+---------+------------+--------------+
| row_id | shop_id | start_date | created_date |
+--------+---------+------------+--------------+
| 1 | 1 | 2017-02-01 | 2017-01-01 |
| 2 | 1 | 2017-01-01 | 2017-02-01 |
| 3 | 2 | 2017-01-01 | 2017-07-01 |
| 4 | 2 | 2017-01-01 | 2017-01-01 |
+--------+---------+------------+--------------+
Can I modify the SELECT so that I only get back the "top rows" for each unique shop_id (in this case, row_ids 1 and 3)? There can be any number of rows (1..n) with the same shop_id.
Similarly, if my query above returned the following order, I'd want to SELECT only row_ids 1 and 4, since those would be the "top most" entries for each shop_id.
+--------+---------+------------+--------------+
| row_id | shop_id | start_date | created_date |
+--------+---------+------------+--------------+
| 1 | 1 | 2017-02-01 | 2017-01-01 |
| 2 | 1 | 2017-01-01 | 2017-02-01 |
| 4 | 2 | 2017-01-01 | 2017-07-01 |
| 3 | 2 | 2017-01-01 | 2017-01-01 |
+--------+---------+------------+--------------+
You can do this by using a subquery:
select s.*
from shop s
where s.row_id = (
select row_id
from shop
where shop_id = s.shop_id
order by start_date desc, created_date desc
limit 1
)
Note that this query assumes row_id is unique for each shop_id.
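To see the correlated subquery in action, here is a small sketch that runs it against the first result set from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and stores the dates as ISO-formatted text (an assumption of this sketch) so that string ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shop ("
    "row_id INTEGER PRIMARY KEY, shop_id INTEGER, start_date TEXT, created_date TEXT)"
)
conn.executemany(
    "INSERT INTO shop VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2017-02-01", "2017-01-01"),
        (2, 1, "2017-01-01", "2017-02-01"),
        (3, 2, "2017-01-01", "2017-07-01"),
        (4, 2, "2017-01-01", "2017-01-01"),
    ],
)

# For each outer row, the subquery finds the row_id that sorts first within
# the same shop; the outer row is kept only if it is that row.
query = """
SELECT s.row_id
FROM shop s
WHERE s.row_id = (
    SELECT row_id
    FROM shop
    WHERE shop_id = s.shop_id
    ORDER BY start_date DESC, created_date DESC
    LIMIT 1
)
ORDER BY s.row_id
"""
top_row_ids = [row[0] for row in conn.execute(query)]
print(top_row_ids)  # [1, 3]
```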
Or like this:
select t.*
from shop t
join (
select t2.shop_id, t2.start_date, max(t2.created_date) as created_date
from shop t2
join (
select max(start_date) as start_date, shop_id
from shop
group by shop_id
) t3 on t3.shop_id = t2.shop_id and t3.start_date = t2.start_date
group by t2.shop_id, t2.start_date
) t1 on t1.shop_id = t.shop_id and t.start_date = t1.start_date and t.created_date = t1.created_date
Note that if there can be records with the same start_date and created_date for the same shop_id, you would need to add another group by t.shop_id, t.start_date, t.created_date in the outer query (selecting min(row_id) together with the columns listed in the group by).
Try joining to a subquery which finds the "top" rows for each shop_id:
SELECT t1.*
FROM shop t1
INNER JOIN
(
SELECT shop_id, MIN(row_id) AS min_id
FROM shop
GROUP BY shop_id
) t2
ON t1.shop_id = t2.shop_id AND
t1.row_id = t2.min_id
ORDER BY
t1.start_date DESC,
t1.created_date DESC;

Laravel / MySQL query raw

Not sure on how to query this, but let's say I've got two tables as such
Table 1
| id | userid | points |
|:-----------|------------:|:------------:|
| 1 | 1 | 30 |
| 2 | 3 | 40 |
| 3 | 1 | 30 |
| 4 | 3 | 40 |
| 5 | 1 | 30 |
| 6 | 3 | 40 |
Table 2
| id | userid | productid |
|:-----------|------------:|:------------:|
| 1 | 1 | 4 |
| 2 | 3 | 4 |
| 3 | 1 | 3 |
| 4 | 3 | 3 |
| 5 | 1 | 3 |
| 6 | 3 | 3 |
I need to get all rows from table 1 where points are above 30 and where table2 has a productid of 4.
At the moment I have a raw query like this:
SELECT userid, SUM(points) as points FROM table1 GROUP BY userid HAVING SUM(points) >= 30 ORDER BY SUM(points) DESC, userid
Through DB::select
How can I make sure that the results only include users who have a productid of 4 in table2, matched via the userid? Is this where a join is applicable? I also see left join and others, so I'm not sure how to go about this. Any suggestions appreciated.
EDIT:
I just got this working:
SELECT table1.userid, SUM(points) as points FROM table1 LEFT JOIN table2 on table1.userid = table2.userid WHERE table2.productid = '4' GROUP BY table1.userid HAVING SUM(points) >= 30 ORDER BY SUM(points) DESC, table1.userid
It is giving me back the correct results, but I'm not 100% sure about join vs. left join. Any feedback on whether that is OK?
If you use an inner join, you get only the rows that match productid = 4, and only those are summed:
SELECT table1.userid, SUM(points) as points
FROM table1
inner join table2 on table1.userid = table2.userid and productid=4
GROUP BY table1.userid
HAVING SUM(points) >= 30
ORDER BY SUM(points) DESC, table1.userid
Or, if you are looking for the users that have at least one row with productid = 4, you can use:
SELECT table1.userid, SUM(points) as points
FROM table1
inner join (
select distinct userid
from table2 where productid =4
) t on table1.userid = t.userid
GROUP BY table1.userid
HAVING SUM(points) >= 30
ORDER BY SUM(points) DESC, table1.userid
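The difference between joining table2 directly and joining a DISTINCT list of user ids only shows up when a user has more than one row with productid = 4. The sketch below illustrates this with Python's built-in sqlite3 module as a stand-in for MySQL; the table2 row with id 7 (a second product-4 row for user 1) is hypothetical and added only to expose the effect, and the alias `total_points` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, userid INTEGER, points INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, userid INTEGER, productid INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 1, 30), (2, 3, 40), (3, 1, 30), (4, 3, 40), (5, 1, 30), (6, 3, 40)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [
        (1, 1, 4), (2, 3, 4), (3, 1, 3), (4, 3, 3), (5, 1, 3), (6, 3, 3),
        (7, 1, 4),  # hypothetical second product-4 row for user 1
    ],
)

# Direct join: every points row of user 1 matches two table2 rows,
# so that user's points are counted twice.
direct = conn.execute("""
    SELECT table1.userid, SUM(points) AS total_points
    FROM table1
    INNER JOIN table2 ON table1.userid = table2.userid AND table2.productid = 4
    GROUP BY table1.userid
    HAVING SUM(points) >= 30
    ORDER BY SUM(points) DESC, table1.userid
""").fetchall()

# Join against distinct user ids: each points row matches at most once.
deduplicated = conn.execute("""
    SELECT table1.userid, SUM(points) AS total_points
    FROM table1
    INNER JOIN (
        SELECT DISTINCT userid FROM table2 WHERE productid = 4
    ) t ON table1.userid = t.userid
    GROUP BY table1.userid
    HAVING SUM(points) >= 30
    ORDER BY SUM(points) DESC, table1.userid
""").fetchall()

print(direct)        # [(1, 180), (3, 120)]
print(deduplicated)  # [(3, 120), (1, 90)]
```

In Laravel, either statement could still be passed to DB::select as a raw query, as in the question.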

Sorting connected records in mysql table

I have a Users table, where users can have multiple accounts.
The table can look like this:
u_id | u_parent_d | date_added
1 | 1 | 2017-01-01
2 | 2 | 2017-01-04
3 | 1 | 2017-01-05
4 | 4 | 2017-01-06
5 | 2 | 2017-01-07
How can I order these records by date added, while keeping connected accounts grouped together?
u_id | u_parent_d | date_added
5 | 2 | 2017-01-07
2 | 2 | 2017-01-04
4 | 4 | 2017-01-06
3 | 1 | 2017-01-05
1 | 1 | 2017-01-01
You can build your query in two steps. First, get the maximum date for each u_parent_d:
select u_parent_d, max(date_added) as max_date
from Users
group by u_parent_d
Then you can join this with the initial table and use max_date for sorting:
select t1.*
from Users t1
join (
select u_parent_d, max(date_added) as max_date
from Users
group by u_parent_d
) t2
on t1.u_parent_d = t2.u_parent_d
order by t2.max_date desc, t1.date_added desc
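The two-step query can be checked against the question's sample data with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL and stores the dates as ISO-formatted text (an assumption of this sketch), which sorts the same way as real dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (u_id INTEGER PRIMARY KEY, u_parent_d INTEGER, date_added TEXT)"
)
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [
        (1, 1, "2017-01-01"),
        (2, 2, "2017-01-04"),
        (3, 1, "2017-01-05"),
        (4, 4, "2017-01-06"),
        (5, 2, "2017-01-07"),
    ],
)

# Sort groups by their most recent date, then rows inside a group by date.
query = """
SELECT t1.u_id
FROM Users t1
JOIN (
    SELECT u_parent_d, MAX(date_added) AS max_date
    FROM Users
    GROUP BY u_parent_d
) t2 ON t1.u_parent_d = t2.u_parent_d
ORDER BY t2.max_date DESC, t1.date_added DESC
"""
ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)  # [5, 2, 4, 3, 1]
```

The resulting order matches the desired output in the question. If two groups could share the same max_date, adding t1.u_parent_d as a second sort key would keep their rows from interleaving.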
Order both by date and parent id:
SELECT * FROM users
ORDER BY u_parent_id, date_added DESC

Select two items with maximum number of common values

I have the following table:
+----+-----------+-----------+
| id | teacherId | studentId |
+----+-----------+-----------+
| 1 | 1 | 4 |
| 2 | 1 | 2 |
| 3 | 1 | 1 |
| 4 | 1 | 3 |
| 5 | 2 | 2 |
| 6 | 2 | 1 |
| 7 | 2 | 3 |
| 8 | 3 | 9 |
| 9 | 3 | 6 |
| 10 | 1 | 6 |
+----+-----------+-----------+
I need a query to find the two teacherIds with the maximum number of common studentIds.
In this case, the teachers with teacherIds 1 and 2 have three students in common (studentIds 2, 1, 3), which is more than teachers 1 and 3, who share only student 6.
Thanks in Advance!
[Edit]: After several hours I came up with the following solution:
SELECT * FROM (
SELECT r1tid, r2tid, COUNT(r2tid) AS cnt
FROM (
SELECT r1.teacherId AS r1tid, r2.teacherId AS r2tid
FROM table r1
INNER JOIN table r2 ON r1.studentId=r2.studentId AND r1.teacherId!=r2.teacherId
ORDER BY r1tid
) t
GROUP BY r1tid, r2tid
ORDER BY cnt DESC
) t GROUP BY cnt ORDER BY cnt DESC LIMIT 1;
I am sure there must be a shorter, more elegant solution, but I could not find it.
You would do this with a self-join. Assuming no duplicates in the table:
select t.teacherid, t2.teacherid, count(*) as NumStudentsInCommon
from table t join
table t2
on t.studentid = t2.studentid and
t.teacherid < t2.teacherid
group by t.teacherid, t2.teacherid
order by NumStudentsInCommon desc
limit 1;
If you had duplicates, you would replace count(*) with count(distinct t.studentid), though count(distinct) makes the database do a bit more work.
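A quick way to try the self-join on the question's data is sketched below, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `teacher_student` is hypothetical (the question only calls it "table", which is a reserved word).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teacher_student ("
    "id INTEGER PRIMARY KEY, teacherId INTEGER, studentId INTEGER)"
)
conn.executemany(
    "INSERT INTO teacher_student VALUES (?, ?, ?)",
    [
        (1, 1, 4), (2, 1, 2), (3, 1, 1), (4, 1, 3),
        (5, 2, 2), (6, 2, 1), (7, 2, 3),
        (8, 3, 9), (9, 3, 6), (10, 1, 6),
    ],
)

# Pair up rows that share a student; t.teacherId < t2.teacherId keeps each
# pair of teachers once and prevents a teacher from pairing with itself.
query = """
SELECT t.teacherId, t2.teacherId, COUNT(*) AS NumStudentsInCommon
FROM teacher_student t
JOIN teacher_student t2
  ON t.studentId = t2.studentId
 AND t.teacherId < t2.teacherId
GROUP BY t.teacherId, t2.teacherId
ORDER BY NumStudentsInCommon DESC
LIMIT 1
"""
best_pair = conn.execute(query).fetchone()
print(best_pair)  # (1, 2, 3)
```

Teachers 1 and 2 come out on top with three shared students, as expected from the question.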
select t.teacherId, t2.teacherId, count(*) as NumStudentsInCommon
from table1 t join
table1 t2
on t.studentId = t2.studentId and
t.teacherId < t2.teacherId
group by t.teacherId, t2.teacherId
order by NumStudentsInCommon desc
limit 1;