I'm trying to select the most recent rows for every unique userid where pid = 50 and active = 1. I haven't been able to figure it out.
Here is a sample table
+-----+----------+-------+-----------------------+---------+
| id | userid | pid | start_date | active |
+-----+----------+-------+-----------------------+---------+
| 1 | 4 | 50 | 2015-05-15 12:00:00 | 1 |
| 2 | 4 | 50 | 2015-05-16 12:00:00 | 1 |
| 3 | 4 | 50 | 2015-05-17 12:00:00 | 0 |
| 4 | 4 | 51 | 2015-06-29 12:00:00 | 1 |
| 5 | 4 | 51 | 2015-06-30 12:00:00 | 1 |
| 6 | 5 | 50 | 2015-07-05 12:00:00 | 1 |
| 7 | 5 | 50 | 2015-07-06 12:00:00 | 1 |
| 8 | 5 | 51 | 2015-07-08 12:00:00 | 1 |
+-----+----------+-------+-----------------------+---------+
Desired Result
+-----+----------+-------+-----------------------+---------+
| id | userid | pid | start_date | active |
+-----+----------+-------+-----------------------+---------+
| 2 | 4 | 50 | 2015-05-16 12:00:00 | 1 |
| 7 | 5 | 50 | 2015-07-06 12:00:00 | 1 |
+-----+----------+-------+-----------------------+---------+
I've tried a bunch of things, and this is the closest I got, but unfortunately it is not quite there.
SELECT *
FROM mytable t1
WHERE
(
SELECT COUNT(*)
FROM mytable t2
WHERE
t1.userid = t2.userid
AND t1.start_date < t2.start_date
) < 1
AND pid = 50
AND active = 1
ORDER BY start_date DESC
plan
get the last start_date for each userid where pid is 50 and the row is active
inner join to mytable to get the full record associated with that last start_date
query
select
my.*
from
(
select userid, pid, active, max(start_date) as lst
from mytable
where pid = 50
and active = 1
group by userid, pid, active
) maxd
inner join mytable my
on maxd.userid = my.userid
and maxd.pid = my.pid
and maxd.active = my.active
and maxd.lst = my.start_date
;
output
+----+--------+-----+------------------------+--------+
| id | userid | pid | start_date | active |
+----+--------+-----+------------------------+--------+
| 2 | 4 | 50 | May, 16 2015 12:00:00 | 1 |
| 7 | 5 | 50 | July, 06 2015 12:00:00 | 1 |
+----+--------+-----+------------------------+--------+
notes
As suggested by @Strawberry, the join now also matches on pid and active. This prevents a row that is inactive, or that has a pid other than 50, from being returned just because it shares the exact same start_date.
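To see why the extra join conditions matter, here is a minimal sketch that runs both variants of the join against the question's sample data. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL (an assumption), and it adds one hypothetical row (id 9) that is not in the question: same user and start_date as id 2, but pid 51. The helper name latest_ids is also made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, userid INTEGER, pid INTEGER,"
    " start_date TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 4, 50, "2015-05-15 12:00:00", 1),
        (2, 4, 50, "2015-05-16 12:00:00", 1),
        (3, 4, 50, "2015-05-17 12:00:00", 0),
        (4, 4, 51, "2015-06-29 12:00:00", 1),
        (5, 4, 51, "2015-06-30 12:00:00", 1),
        (6, 5, 50, "2015-07-05 12:00:00", 1),
        (7, 5, 50, "2015-07-06 12:00:00", 1),
        (8, 5, 51, "2015-07-08 12:00:00", 1),
        # Hypothetical row, NOT in the question: same user and date as id 2, other pid.
        (9, 4, 51, "2015-05-16 12:00:00", 1),
    ],
)

# Latest start_date per user among active rows with pid = 50.
LATEST = """
    SELECT userid, pid, active, MAX(start_date) AS lst
    FROM mytable
    WHERE pid = 50 AND active = 1
    GROUP BY userid, pid, active
"""


def latest_ids(join_condition):
    """Return the ids found when joining mytable back to the per-user maximum."""
    sql = (
        f"SELECT my.id FROM ({LATEST}) maxd "
        f"JOIN mytable my ON {join_condition} ORDER BY my.id"
    )
    return [row[0] for row in conn.execute(sql)]


# Joining on userid and date only lets the pid 51 row slip through.
loose_ids = latest_ids("maxd.userid = my.userid AND maxd.lst = my.start_date")

# Joining on pid and active as well keeps only the intended rows.
strict_ids = latest_ids(
    "maxd.userid = my.userid AND maxd.pid = my.pid"
    " AND maxd.active = my.active AND maxd.lst = my.start_date"
)

print(loose_ids)   # [2, 7, 9]
print(strict_ids)  # [2, 7]
```

With the question's original eight rows both joins return ids 2 and 7; the difference only shows up once a row like the hypothetical id 9 exists.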
I have a table that stores facial login data of employees based upon employee id. I need to get the earliest login for each employee on a day and all other logins to be ignored. I know how to get latest or earliest record for each employee but I am unable to figure out how to get earliest entry in each day by each employee.
+----+-----------+--------------------------------------+-------------+-----------------------+
| id | camera_id | image_name | employee_id | created_at |
+----+-----------+--------------------------------------+-------------+-----------------------+
| 10 | 2 | pjcc7vf142pec6li7k8kqxuqvnmhm0tyo8ib | 16 | 2020-07-11 10:40:20 |
| 11 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-11 10:40:22 |
| 14 | 2 | 3p74yrq35nfaazwdo8auguvn2h5hpugtfvvw | 2 | 2020-07-11 12:07:24 |
| 15 | 2 | hpa2am40ufke7o7q2y733hh83h7ykxxdgkof | 16 | 2020-07-11 12:09:35 |
| 16 | 2 | g7adgyzloab2t4z7xx2id0a9cjqx8ojfni99 | 2 | 2020-07-11 12:09:41 |
| 17 | 2 | tapufkiuj5toxfdoikjicbe3k7tl32yj5khp | 16 | 2020-07-12 12:09:47 |
| 18 | 2 | pjcc7vf142pec6li7k8kqxuqvnmhm0tyo8ib | 16 | 2020-07-12 14:40:20 |
| 19 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-12 15:40:22 |
| 20 | 2 | 3p74yrq35nfaazwdo8auguvn2h5hpugtfvvw | 2 | 2020-07-12 16:07:24 |
| 21 | 2 | hpa2am40ufke7o7q2y733hh83h7ykxxdgkof | 16 | 2020-07-12 17:09:35 |
| 22 | 2 | g7adgyzloab2t4z7xx2id0a9cjqx8ojfni99 | 2 | 2020-07-13 12:09:41 |
+----+-----------+--------------------------------------+-------------+-----------------------+
The result should look like this:
+----+-----------+--------------------------------------+-------------+-----------------------+
| id | camera_id | image_name | employee_id | created_at |
+----+-----------+--------------------------------------+-------------+-----------------------+
| 10 | 2 | pjcc7vf142pec6li7k8kqxuqvnmhm0tyo8ib | 16 | 2020-07-11 10:40:20 |
| 11 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-11 10:40:22 |
| 17 | 2 | tapufkiuj5toxfdoikjicbe3k7tl32yj5khp | 16 | 2020-07-12 12:09:47 |
| 19 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-12 15:40:22 |
| 22 | 2 | g7adgyzloab2t4z7xx2id0a9cjqx8ojfni99 | 2 | 2020-07-13 12:09:41 |
+----+-----------+--------------------------------------+-------------+-----------------------+
You can do:
select *
from t
where (employee_id, created_at) in (
select employee_id, min(created_at)
from t
group by employee_id, date(created_at)
)
how to get earliest entry in each day by each employee
You can filter with a correlated subquery:
select t.*
from mytable t
where t.created_at = (
select min(t1.created_at)
from mytable t1
where
t1.employee_id = t.employee_id
and t1.created_at >= date(t.created_at)
and t1.created_at < date(t.created_at) + interval 1 day
)
This query would take advantage of an index on (employee_id, created_at).
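As a rough check of the correlated-subquery approach, the sketch below loads the question's rows into an in-memory SQLite database (a stand-in for MySQL, which is an assumption) and creates the composite index mentioned above. The camera_id and image_name columns are left out for brevity, the index name idx_employee_created is made up, and MySQL's `+ interval 1 day` is replaced by SQLite's `date(..., '+1 day')`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, employee_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (10, 16, "2020-07-11 10:40:20"),
        (11, 2, "2020-07-11 10:40:22"),
        (14, 2, "2020-07-11 12:07:24"),
        (15, 16, "2020-07-11 12:09:35"),
        (16, 2, "2020-07-11 12:09:41"),
        (17, 16, "2020-07-12 12:09:47"),
        (18, 16, "2020-07-12 14:40:20"),
        (19, 2, "2020-07-12 15:40:22"),
        (20, 2, "2020-07-12 16:07:24"),
        (21, 16, "2020-07-12 17:09:35"),
        (22, 2, "2020-07-13 12:09:41"),
    ],
)

# Composite index as suggested in the answer; the name is hypothetical.
conn.execute("CREATE INDEX idx_employee_created ON mytable (employee_id, created_at)")

# SQLite spelling of the answer's query: the half-open range
# [start of day, start of next day) selects one employee's logins for one day.
QUERY = """
    SELECT t.id
    FROM mytable t
    WHERE t.created_at = (
        SELECT MIN(t1.created_at)
        FROM mytable t1
        WHERE t1.employee_id = t.employee_id
          AND t1.created_at >= date(t.created_at)
          AND t1.created_at < date(t.created_at, '+1 day')
    )
    ORDER BY t.id
"""

earliest_ids = [row[0] for row in conn.execute(QUERY)]
print(earliest_ids)  # [10, 11, 17, 19, 22]

# Print SQLite's query plan; the last column of each row is the description.
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(row[-1])
```

The printed plan is SQLite's, not MySQL's, so it only hints at how the index might be used; on MySQL you would run EXPLAIN on the original query instead.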
Or, if you are running MySQL 8.0, you can use window functions:
select *
from (
select
t.*,
row_number() over(
partition by employee_id, date(created_at)
order by created_at
) rn
from mytable t
) t
where rn = 1
I have the following table.
The timeStamp is the moment that the status began.
Some rows don't add new information because the status did not change (like the second row), so they can be ignored.
I would like to calculate (using MySQL 5.7) the total amount of time spent in each status.
| timeStamp | status |
|------------------------------|
| 2019-12-10 14:00:00 | 1 |
| 2019-12-10 14:10:00 | 1 | // this row could be ignored
| 2019-12-10 14:00:00 | 2 | // more 24 hours in status 1
| 2019-12-11 14:10:00 | 2 |
| 2019-12-12 14:00:00 | 1 | // more 24 hours in status 2
| 2019-12-14 14:00:00 | 2 | // more 48 hours in status 1
| 2019-12-16 14:10:00 | 2 |
| 2019-12-17 14:20:00 | 2 |
| 2019-12-18 14:00:00 | 3 | // more 96 hours in status 2
| 2019-12-19 14:00:00 | 1 | // more 24 hours in status 3
I would like to see a result table like the one below.
| status | amount_of_time |
|-------------------------|
| 1 | 72 hours |
| 2 | 120 hours |
| 3 | 24 hours |
What complicates this is that the statuses don't occur in order: the sequence is not 1, 2, 3.
In the example above it is 1, 2, 1, 2, 3, 1, so I can't use the MIN information.
Get the timestamp of the following row in a subquery and calculate the difference to the timestamp of the current row:
select t1.status, timestampdiff(second,
t1.timeStamp,
(
select min(t2.timeStamp)
from mytable t2
where t2.timeStamp > t1.timeStamp
)
) as diff
from mytable t1;
This will return:
| status | diff |
| ------ | ------ |
| 1 | 600 |
| 1 | 86400 |
| 2 | 600 |
| 2 | 85800 |
| 1 | 172800 |
| 2 | 173400 |
| 2 | 87000 |
| 2 | 85200 |
| 3 | 86400 |
| 1 | NULL |
From here it's just a matter of GROUP BY and SUM:
select status, sum(diff) as duration_in_seconds
from (
select t1.status, timestampdiff(second,
t1.timeStamp,
(
select min(t2.timeStamp)
from mytable t2
where t2.timeStamp > t1.timeStamp
)
) as diff
from mytable t1
) x
group by status;
Result:
| status | duration_in_seconds |
| ------ | ------------------- |
| 1 | 259800 |
| 2 | 432000 |
| 3 | 86400 |
If you want the time in hours, change the first line to
select status, round(sum(diff)/3600) as duration_in_hours
and you will get:
| status | duration_in_hours |
| ------ | ----------------- |
| 1 | 72 |
| 2 | 120 |
| 3 | 24 |
You might though want to use floor() instead of round(). That's not clear from your question.
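The next-row subquery and the GROUP BY / SUM step can be tried out with the sketch below. It loads the rows exactly as printed in the question into an in-memory SQLite database (a stand-in for MySQL 5.7, which is an assumption). SQLite has no timestampdiff(), so the difference in seconds is computed from `strftime('%s', ...)` instead; the variable names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 5.7 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (timeStamp TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        # Rows exactly as printed in the question.
        ("2019-12-10 14:00:00", 1),
        ("2019-12-10 14:10:00", 1),
        ("2019-12-10 14:00:00", 2),
        ("2019-12-11 14:10:00", 2),
        ("2019-12-12 14:00:00", 1),
        ("2019-12-14 14:00:00", 2),
        ("2019-12-16 14:10:00", 2),
        ("2019-12-17 14:20:00", 2),
        ("2019-12-18 14:00:00", 3),
        ("2019-12-19 14:00:00", 1),
    ],
)

# For each row, find the next later timestamp and subtract (in seconds).
# The last row has no successor, so its diff is NULL and SUM() skips it.
QUERY = """
    SELECT status, SUM(diff) AS duration_in_seconds
    FROM (
        SELECT t1.status,
               CAST(strftime('%s', (
                   SELECT MIN(t2.timeStamp)
                   FROM mytable t2
                   WHERE t2.timeStamp > t1.timeStamp
               )) AS INTEGER)
               - CAST(strftime('%s', t1.timeStamp) AS INTEGER) AS diff
        FROM mytable t1
    ) x
    GROUP BY status
    ORDER BY status
"""

seconds_by_status = dict(conn.execute(QUERY).fetchall())
print(seconds_by_status)  # {1: 259800, 2: 432000, 3: 86400}

# Same rounding as round(sum(diff)/3600) in the answer.
hours_by_status = {status: round(sec / 3600) for status, sec in seconds_by_status.items()}
print(hours_by_status)  # {1: 72, 2: 120, 3: 24}
```

Status 1 comes out at 259800 seconds, a little over 72 hours, which is why the choice between round() and floor() can matter on other data.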
In MySQL 8 you could use the LEAD() window function to get the timestamp of the next row:
select status, sum(diff) as duration_in_seconds
from (
select
status,
timestampdiff(second, timeStamp, lead(timeStamp) over (order by timeStamp)) as diff
from mytable
) x
group by status;
I have the following table:
+------------+--------+-----+
| reg_dat | status | id |
+------------+--------+-----+
| 2016-01-31 | 10 | 1 |
| 2017-06-31 | 12 | 1 |
| 2015-01-31 | 12 | 4 |
| 2017-01-25 | 5 | 4 |
| 2017-01-11 | 3 | 2 |
+------------+--------+-----+
I would like to write a MySQL query that groups the rows by id and keeps only the most recent date, so the output should be the following:
+------------+--------+-----+
| reg_dat | status | id |
+------------+--------+-----+
| 2017-06-31 | 12 | 1 |
| 2017-01-25 | 5 | 4 |
| 2017-01-11 | 3 | 2 |
+------------+--------+-----+
Unfortunately my code doesn't work...
select *
from table
group by id
order by id, reg_dat DESC
Do you have any suggestions?
You can do that using a JOIN and a subquery
SELECT t.reg_dat, t.status, t.id
FROM table t
JOIN (SELECT max(reg_dat) max_date, id FROM table GROUP BY id) t1
ON t.reg_dat = t1.max_date AND t.id = t1.id
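Here is a small sketch of the JOIN-plus-subquery pattern run against the question's rows, using Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL (an assumption). The table is called registrations here, a made-up name, because table is a reserved word. The dates are stored as text, so MAX() compares them as strings, which works for the YYYY-MM-DD format.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")

# "registrations" is a hypothetical table name; the question just says "table".
conn.execute("CREATE TABLE registrations (reg_dat TEXT, status INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?, ?)",
    [
        # Rows as printed in the question.
        ("2016-01-31", 10, 1),
        ("2017-06-31", 12, 1),
        ("2015-01-31", 12, 4),
        ("2017-01-25", 5, 4),
        ("2017-01-11", 3, 2),
    ],
)

# The derived table finds the latest date per id; the join pulls in
# the remaining columns of the row that carries that date.
QUERY = """
    SELECT t.reg_dat, t.status, t.id
    FROM registrations t
    JOIN (
        SELECT MAX(reg_dat) AS max_date, id
        FROM registrations
        GROUP BY id
    ) t1 ON t.reg_dat = t1.max_date AND t.id = t1.id
    ORDER BY t.id
"""

latest = conn.execute(QUERY).fetchall()
for row in latest:
    print(row)
```

If two rows for the same id share the same latest date, this pattern returns both of them, so a tie-breaker column would be needed in that case.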
Here's my current database:
TABLE profileData
+-----------+---------------+
| profileID | name          |
+-----------+---------------+
| 1         | Stackoverflow |
| 2         | Stackexchange |
| 3         | Askubuntu     |
+-----------+---------------+
TABLE stats
+----+-----------+--------+---------------------+
| id | profileID | sCount | ts                  |
+----+-----------+--------+---------------------+
| 1  | 1         | 1      | 2013-10-04 00:00:01 |
| 2  | 2         | 5      | 2013-10-04 00:00:01 |
| 3  | 3         | 8      | 2013-10-04 00:00:01 |
| 4  | 1         | 10     | 2013-10-05 00:00:01 |
| 5  | 2         | 50     | 2013-10-05 00:00:01 |
| 6  | 1         | 100    | 2013-10-06 00:00:01 |
| 7  | 2         | 500    | 2013-10-06 00:00:01 |
| 8  | 1         | 101    | 2013-10-06 13:00:01 |
| 9  | 2         | 501    | 2013-10-06 19:00:01 |
| 10 | 3         | 17     | 2013-10-06 05:00:01 |
| 11 | 1         | 100    | 2013-10-09 00:00:01 |
| 12 | 2         | 500    | 2013-10-09 00:00:01 |
+----+-----------+--------+---------------------+
TABLE users
+--------+-----------+
| userID | profileID |
+--------+-----------+
| 1337   | 1         |
| 1337   | 2         |
+--------+-----------+
What I need is the following:
Select all profiles from the table "users" and get their names, plus the last entry of every day from the table "stats" for these profiles for the last 7 days. So, the expected result is:
+---------------+--------+---------------------+
| name | sCount | ts |
+---------------+--------+---------------------+
| Stackoverflow | 1 | 2013-10-04 00:00:01 |
| Stackexchange | 5 | 2013-10-04 00:00:01 |
| Stackoverflow | 10 | 2013-10-05 00:00:01 |
| Stackexchange | 50 | 2013-10-05 00:00:01 |
| Stackoverflow | 101 | 2013-10-06 13:00:01 |
| Stackexchange | 501 | 2013-10-06 19:00:01 |
| Stackoverflow | 100 | 2013-10-09 00:00:01 |
| Stackexchange | 500 | 2013-10-09 00:00:01 |
+---------------+--------+---------------------+
I ended up with this statement:
SELECT
profileData.name,
stats.scount,
stats.ts
FROM
users
INNER JOIN profiles ON
users.profileID = profiles.profileID
INNER JOIN
(
SELECT t1.profileID, t1.sCount, t1.ts
FROM stats t1
INNER JOIN (
SELECT MAX(ts) maxi
FROM stats
GROUP BY DATE(ts)
) a2 ON t1.ts = a2.maxi) stats ON
users.profileID = stats.profileID
WHERE
users.userID = 1337 AND DATE(stats.ts) >= DATE(DATE_SUB(NOW(), INTERVAL 7 DAY))
ORDER BY users.userID, stats.ts
This worked partially. However, this statement seems to be overkill, and it's not working anymore.
I've also tried selecting MAX(ts). That worked, but the result didn't contain the corresponding sCount value.
So I'm looking for a solution to my problem, and I hope someone can help me with this.
Oh, and it has to be a pure SQL solution, if possible.
You can do something like this, which is not that far from what you did:
select p.name,
s.scount,
s.ts
from profileData p
inner join users u on u.profileID = p.profileID
inner join stats s on s.profileID = p.profileID
inner join (select max(ts) as maxTs, profileID
from stats
where DATE(stats.ts) >= DATE(DATE_SUB(NOW(), INTERVAL 7 DAY))
group by profileID, DATE(ts)) as mx
on s.profileID = mx.profileID and mx.maxTs = s.ts
where u.userID = 1337
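The query can be tried against the question's three tables with the sketch below, which uses an in-memory SQLite database as a stand-in for MySQL (an assumption). Because the sample data is from October 2013, NOW() is replaced by a fixed, hypothetical reference date (2013-10-10) passed in as a parameter, and DATE_SUB is written as SQLite's `date(..., '-7 days')`. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE profileData (profileID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, profileID INTEGER, sCount INTEGER, ts TEXT);
    CREATE TABLE users (userID INTEGER, profileID INTEGER);
    """
)
conn.executemany(
    "INSERT INTO profileData VALUES (?, ?)",
    [(1, "Stackoverflow"), (2, "Stackexchange"), (3, "Askubuntu")],
)
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "2013-10-04 00:00:01"),
        (2, 2, 5, "2013-10-04 00:00:01"),
        (3, 3, 8, "2013-10-04 00:00:01"),
        (4, 1, 10, "2013-10-05 00:00:01"),
        (5, 2, 50, "2013-10-05 00:00:01"),
        (6, 1, 100, "2013-10-06 00:00:01"),
        (7, 2, 500, "2013-10-06 00:00:01"),
        (8, 1, 101, "2013-10-06 13:00:01"),
        (9, 2, 501, "2013-10-06 19:00:01"),
        (10, 3, 17, "2013-10-06 05:00:01"),
        (11, 1, 100, "2013-10-09 00:00:01"),
        (12, 2, 500, "2013-10-09 00:00:01"),
    ],
)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1337, 1), (1337, 2)])

# mx holds the latest ts per profile and per calendar day within the window;
# joining stats back on (profileID, ts) recovers the matching sCount.
QUERY = """
    SELECT p.name, s.sCount, s.ts
    FROM profileData p
    JOIN users u ON u.profileID = p.profileID
    JOIN stats s ON s.profileID = p.profileID
    JOIN (
        SELECT MAX(ts) AS maxTs, profileID
        FROM stats
        WHERE date(ts) >= date(:today, '-7 days')
        GROUP BY profileID, date(ts)
    ) mx ON s.profileID = mx.profileID AND mx.maxTs = s.ts
    WHERE u.userID = 1337
    ORDER BY s.ts, p.profileID
"""

# Hypothetical "today" standing in for NOW(), chosen to cover the sample data.
last_per_day = conn.execute(QUERY, {"today": "2013-10-10"}).fetchall()
for row in last_per_day:
    print(row)
```

Grouping the inner query by both profileID and the day is what keeps one row per profile per day; grouping by the day alone would keep only the single latest timestamp across all profiles.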
What about this? Its EXPLAIN output is better because fewer full table scans are needed.
But you really need indexes.
SELECT
profileData.name
, stats.sCount
, MAX(ts) ts
FROM
stats
INNER JOIN
users
ON
stats.profileID = users.profileID
INNER JOIN
profileData
ON
users.profileID = profileData.profileID
WHERE
users.userID = 1337
AND
DATE(stats.ts) >= DATE(DATE_SUB(NOW(), INTERVAL 7 DAY))
GROUP BY
profileData.name
, stats.profileID ASC
, stats.sCount ASC
ORDER BY
ts ASC
;
see demo http://sqlfiddle.com/#!2/db8d3/51
In a table, I need to filter out adjacent duplicate rows that have the same status_id (but not all rows with that status_id) when the user_id is the same. GROUP BY and DISTINCT did not help in this situation. Here is an example:
---------------------------------------------------
| id | user_id | status_id | date |
---------------------------------------------------
| 1 | 10 | 1 | 2010-10-10 10:00:10|
| 2 | 10 | 1 | 2010-10-11 10:00:10|
| 3 | 10 | 1 | 2010-10-12 10:00:10|
| 4 | 10 | 2 | 2010-10-13 10:00:10|
| 5 | 10 | 4 | 2010-10-14 10:00:10|
| 6 | 10 | 4 | 2010-10-15 10:00:10|
| 7 | 10 | 2 | 2010-10-16 10:00:10|
| 8 | 10 | 2 | 2010-10-17 10:00:10|
| 9 | 10 | 1 | 2010-10-18 10:00:10|
| 10 | 10 | 1 | 2010-10-19 10:00:10|
The result has to look like this:
---------------------------------------------------
| id | user_id | status_id | date |
---------------------------------------------------
| 1 | 10 | 1 | 2010-10-10 10:00:10|
| 4 | 10 | 2 | 2010-10-13 10:00:10|
| 5 | 10 | 4 | 2010-10-14 10:00:10|
| 7 | 10 | 2 | 2010-10-16 10:00:10|
| 9 | 10 | 1 | 2010-10-18 10:00:10|
The oldest entry (by date) of each run should remain in the table.
You want to keep each row where the previous status is different, based on the id or date column.
If your ids are really sequential (as they are in the question), you can do this with a convenient join:
select t.*
from t left outer join
t tprev
on t.id = tprev.id+1 and t.user_id = tprev.user_id
where tprev.id is null or tprev.status_id <> t.status_id;
If the ids are not sequential, you can get the previous one using a correlated subquery:
select t.*
from (select t.*,
(select t2.status_id
from t t2
where t2.user_id = t.user_id and
t2.id < t.id
order by t2.id desc
limit 1
) as prevstatus
from t
) t
where prevstatus is null or prevstatus <> t.status_id;
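The correlated-subquery version can be checked against the question's ten rows with the sketch below. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption) and the column name status_id from the question's table; the variable names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, user_id INTEGER, status_id INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1, "2010-10-10 10:00:10"),
        (2, 10, 1, "2010-10-11 10:00:10"),
        (3, 10, 1, "2010-10-12 10:00:10"),
        (4, 10, 2, "2010-10-13 10:00:10"),
        (5, 10, 4, "2010-10-14 10:00:10"),
        (6, 10, 4, "2010-10-15 10:00:10"),
        (7, 10, 2, "2010-10-16 10:00:10"),
        (8, 10, 2, "2010-10-17 10:00:10"),
        (9, 10, 1, "2010-10-18 10:00:10"),
        (10, 10, 1, "2010-10-19 10:00:10"),
    ],
)

# For every row, look up the status of the same user's previous row (by id).
# A row is kept when it has no previous row or when the status changed.
QUERY = """
    SELECT x.id
    FROM (
        SELECT t.*,
               (SELECT t2.status_id
                FROM t t2
                WHERE t2.user_id = t.user_id AND t2.id < t.id
                ORDER BY t2.id DESC
                LIMIT 1) AS prevstatus
        FROM t
    ) x
    WHERE x.prevstatus IS NULL OR x.prevstatus <> x.status_id
    ORDER BY x.id
"""

kept_ids = [row[0] for row in conn.execute(QUERY)]
print(kept_ids)  # [1, 4, 5, 7, 9]
```

This only selects the rows to keep; removing the other rows from the table would need a separate DELETE built on the same condition.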