I have a table:
userID | date | time
===================
1 | 2015-02-08 | 06:32
1 | 2015-02-08 | 05:36
1 | 2015-02-08 | 17:43
1 | 2015-02-08 | 18:00
1 | 2015-02-09 | 06:36
1 | 2015-02-09 | 15:43
1 | 2015-02-09 | 19:00
1 | 2015-02-10 | 05:36
1 | 2015-02-10 | 17:43
1 | 2015-02-10 | 18:00
2 | 2015-02-08 | 06:32
2 | 2015-02-08 | 05:36
2 | 2015-02-08 | 17:43
2 | 2015-02-08 | 18:00
2 | 2015-02-09 | 06:36
2 | 2015-02-09 | 15:43
2 | 2015-02-09 | 19:00
2 | 2015-02-10 | 05:36
2 | 2015-02-10 | 17:43
2 | 2015-02-10 | 18:00
But I want the number of records returned to be exactly the same as the number of days in the current month, with the minimum time as the in time and the maximum time as the out time. If the current month has 28 days and only had two records, it should return:
userID | date | in | out
========================
1 | 2015-02-01 | |
1 | 2015-02-02 | |
1 | 2015-02-03 | |
1 | 2015-02-04 | |
1 | 2015-02-05 | |
1 | 2015-02-06 | |
1 | 2015-02-07 | |
1 | 2015-02-08 | 06:32 | 18:00
1 | 2015-02-09 | 06:36 | 19:00
1 | 2015-02-10 | 05:36 | 18:00
1 | 2015-02-11 | |
1 | 2015-02-12 | |
1 | 2015-02-13 | |
1 | 2015-02-14 | |
1 | 2015-02-15 | |
1 | 2015-02-16 | |
1 | 2015-02-17 | |
1 | 2015-02-18 | |
1 | 2015-02-19 | |
1 | 2015-02-20 | |
1 | 2015-02-21 | |
1 | 2015-02-22 | |
1 | 2015-02-23 | |
1 | 2015-02-24 | |
1 | 2015-02-25 | |
1 | 2015-02-26 | |
1 | 2015-02-27 | |
1 | 2015-02-28 | |
How can I modify my query to achieve the above result?
This is my query:
$sql = "SELECT
colUserID,
colDate,
if(min(colJam) < '12:00:00',min(colJam), '') as in,
if(max(colJam) > '12:00:00',max(colJam), '') as out
FROM tb_kehadiran
WHERE colDate > DATE_ADD(MAKEDATE($tahun, 31),
INTERVAL($bulan-2) MONTH)
AND
colDate < DATE_ADD(MAKEDATE($tahun, 1),
INTERVAL($bulan) MONTH)
AND
colUserID = $user_id
GROUP BY colUserID,colDate";
I had to think about this one, but this is probably the simplest answer so far:
WITH AllMonthDays as (
SELECT n = 1
UNION ALL
SELECT n + 1 FROM AllMonthDays WHERE n + 1 <= DAY(EOMONTH(GETDATE()))
)
SELECT
DISTINCT datefromparts(YEAR(GETDATE()), MONTH(GETDATE()), n) As dates
, MIN(d.time) as 'In'
, MAX(d.time) as 'Out'
FROM AllMonthDays as A
LEFT OUTER JOIN
table as d on
DAY(d.date) = A.n
GROUP BY n,(d.date);
--- Tried and tested in this environment: ---
use Example;
CREATE TABLE demo (
ID int identity(1,1)
,date date
,time time
);
INSERT INTO demo (date, time) VALUES
('2015-12-08', '06:32'),
('2015-12-08', '05:36'),
('2015-12-08', '17:43'),
('2015-12-08', '18:00'),
('2015-12-09', '06:36'),
('2015-12-09', '15:43'),
('2015-12-09', '19:00'),
('2015-12-10', '05:36'),
('2015-12-10', '17:43'),
('2015-12-10', '18:00')
;
WITH AllMonthDays as (
SELECT n = 1
UNION ALL
SELECT n + 1 FROM AllMonthDays WHERE n + 1 <= DAY(EOMONTH(GETDATE()))
)
SELECT
DISTINCT datefromparts(YEAR(GETDATE()), MONTH(GETDATE()), n) As dates
, MIN(d.time) as 'In'
, MAX(d.time) as 'Out'
FROM AllMonthDays as A
LEFT OUTER JOIN
demo as d on
DAY(d.date) = A.n
GROUP BY n,(d.date);
DROP table demo;
The way I've approached this problem in the past is to have a date table that is pre-populated for some years in the future.
You could create such a table, possibly defining columns for year, month and date, with indexes on year and month.
You can then use this table with a JOIN on your data to ensure that all dates are present in your results.
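The date-table idea above can be sketched as follows. This is a minimal illustration, not the answerer's implementation: it uses Python's built-in sqlite3 module instead of MySQL, and the names calendar_days, attendance, month_days and daily_in_out are hypothetical. The calendar table is filled for one month only, then LEFT JOINed to the data so that days without records still appear.

```python
import calendar
import sqlite3
from datetime import date


def month_days(year, month):
    """Return every day of the given month as 'YYYY-MM-DD' strings."""
    last_day = calendar.monthrange(year, month)[1]
    return [date(year, month, d).isoformat() for d in range(1, last_day + 1)]


def daily_in_out(rows, year, month, user_id):
    """rows are (user_id, 'YYYY-MM-DD', 'HH:MM') tuples (assumed layout)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table names, chosen for this sketch only.
    con.execute("CREATE TABLE calendar_days (day TEXT PRIMARY KEY)")
    con.execute("CREATE TABLE attendance (user_id INTEGER, day TEXT, time TEXT)")
    con.executemany(
        "INSERT INTO calendar_days VALUES (?)",
        [(d,) for d in month_days(year, month)],
    )
    con.executemany("INSERT INTO attendance VALUES (?, ?, ?)", rows)
    # The user filter lives in the ON clause so that days without
    # records for this user are kept (with NULL in and out times).
    result = con.execute(
        """
        SELECT c.day, MIN(a.time), MAX(a.time)
        FROM calendar_days AS c
        LEFT JOIN attendance AS a
               ON a.day = c.day AND a.user_id = ?
        GROUP BY c.day
        ORDER BY c.day
        """,
        (user_id,),
    ).fetchall()
    con.close()
    return result
```

Putting the user filter in a WHERE clause instead of the ON clause would turn the LEFT JOIN back into an inner join and drop the empty days.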
You need three things:
1. A list of dates.
2. A left join.
3. Aggregation.
So:
select d.dte, min(t.time), max(t.time)
from (select date('2015-02-01') as dte union all
select date('2015-02-02') union all
. .
select date('2015-02-28')
) d left join
t
on d.dte = t.date
group by d.dte
order by d.dte;
Try this:
set @is_first_date = 0;
set @temp_start_date = date('2015-02-01');
set @temp_end_date = date('2015-02-28');
select my_dates.date,your_table_name.user_id, MIN(your_table_name.time), MAX(your_table_name.time) from
( select if(@is_first_date , @temp_start_date := DATE_ADD(@temp_start_date, interval 1 day), @temp_start_date) as date,@is_first_date:=@is_first_date+1 as start_date from information_schema.COLUMNS
where @temp_start_date < @temp_end_date limit 0, 31
) my_dates left join your_table_name on
my_dates.date = your_table_name.date
group by my_dates.date
Try this query:
SELECT `date`, MIN(`time`) as `IN`, MAX(`time`) AS `OUT`
FROM `table_name` WHERE month(current_date) = month(`date`)
GROUP BY `date`;
I have a table in an old version of MySQL 5.x like this:
+---------+------------+------------+
| Task_ID | Start_Date | End_Date |
+---------+------------+------------+
| 1 | 2015-10-15 | 2015-10-16 |
| 2 | 2015-10-17 | 2015-10-18 |
| 3 | 2015-10-19 | 2015-10-20 |
| 4 | 2015-10-21 | 2015-10-22 |
| 5 | 2015-11-01 | 2015-11-02 |
| 6 | 2015-11-17 | 2015-11-18 |
| 7 | 2015-10-11 | 2015-10-12 |
| 8 | 2015-10-12 | 2015-10-13 |
| 9 | 2015-11-11 | 2015-11-12 |
| 10 | 2015-11-12 | 2015-11-13 |
| 11 | 2015-10-01 | 2015-10-02 |
| 12 | 2015-10-02 | 2015-10-03 |
| 13 | 2015-10-03 | 2015-10-04 |
| 14 | 2015-10-04 | 2015-10-05 |
| 15 | 2015-11-04 | 2015-11-05 |
| 16 | 2015-11-05 | 2015-11-06 |
| 17 | 2015-11-06 | 2015-11-07 |
| 18 | 2015-11-07 | 2015-11-08 |
| 19 | 2015-10-25 | 2015-10-26 |
| 20 | 2015-10-26 | 2015-10-27 |
| 21 | 2015-10-27 | 2015-10-28 |
| 22 | 2015-10-28 | 2015-10-29 |
| 23 | 2015-10-29 | 2015-10-30 |
| 24 | 2015-10-30 | 2015-10-31 |
+---------+------------+------------+
If the End_Date of the tasks are consecutive,
then they are part of the same project.
I am interested in finding the total number of different projects completed.
If there is more than one project that have the same number of completion days,
then order by the Start_Date of the project.
For this few sample records the expected output would be:
2015-10-15 2015-10-16
2015-10-17 2015-10-18
2015-10-19 2015-10-20
2015-10-21 2015-10-22
2015-11-01 2015-11-02
2015-11-17 2015-11-18
2015-10-11 2015-10-13
2015-11-11 2015-11-13
2015-10-01 2015-10-05
2015-11-04 2015-11-08
2015-10-25 2015-10-31
I am a bit jammed with this.
I would really appreciate any help. Thanks.
Following query should work:
select tmp.projectid, date_sub(max(tmp.ed2), interval max(tmp.projectdays) day) start_date,
max(tmp.ed2) end_date,
max(tmp.projectdays) No_Of_ProjectDays
from
(
select t1.task_id tid1, t1.start_date sd1, t1.end_date ed1,
t2.task_id tid2, t2.start_date sd2, t2.end_date ed2,
case when datediff(t2.start_date, ifnull(t1.start_date,'1000-01-01')) != 1
then (@pid := @pid + 1)
else (@pid := @pid)
end as ProjectId,
case when datediff(t2.start_date, ifnull(t1.start_date,'1000-01-01')) != 1
then (@pdays := 1)
else (@pdays := @pdays + 1)
end as ProjectDays
from tasks t1 right join tasks t2
on t2.task_id = t1.task_id + 1
cross join (select @pid :=1, @pdays := 1) vars
) tmp
group by tmp.projectid
order by max(tmp.projectdays), start_date
Please find the Demo here.
EDIT : I have made changes in the query and link according to new data sample. Please have a look.
This answers -- and answers correctly -- the original version of this question.
Hmmmm . . . I think you can use variables. The simplest way is to generate a sequential number and then subtract it (in days) from the date, which yields a constant value for adjacent rows:
select min(start_date), max(end_date)
from (select t.*, (@rn := @rn + 1) as rn
from (select t.* from tasks t order by end_date) t cross join
(select @rn := 0) params
) t
group by (end_date - interval rn day);
Here is a db<>fiddle.
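To see why subtracting a running row number groups consecutive rows, here is a minimal Python sketch of the same idea. It is an illustration only: the function name merge_consecutive is hypothetical, and it assumes each task is a (start_date, end_date) pair with unique end dates.

```python
from datetime import date, timedelta
from itertools import groupby


def merge_consecutive(tasks):
    """Merge tasks whose end dates are consecutive into projects.

    tasks: iterable of (start_date, end_date) pairs, end dates unique.
    """
    ordered = sorted(tasks, key=lambda t: t[1])
    # Row n gets key end_date - n days. Consecutive end dates grow by one
    # day per row, so the key stays constant inside one project.
    keyed = [
        (end - timedelta(days=rn), start, end)
        for rn, (start, end) in enumerate(ordered, start=1)
    ]
    projects = []
    for _, group in groupby(keyed, key=lambda k: k[0]):
        members = list(group)
        projects.append(
            (min(s for _, s, _ in members), max(e for _, _, e in members))
        )
    return projects
```

With unique end dates the key never decreases along the sorted rows, so equal keys are always adjacent and itertools.groupby is enough.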
It's a slightly tricky problem, but the query below works.
It builds two derived tables from the Projects table: one with every Start_Date
that does NOT appear as an End_Date, and one with every End_Date that does NOT appear as a Start_Date.
It then combines them WHERE Start_Date < End_Date, groups by Start_Date,
and takes MIN(End_Date) to get each complete project.
DATEDIFF(MIN(End_Date), Start_Date) gives project_duration, which is used for ordering.
SELECT Start_Date, MIN(End_Date) AS End_Date, DATEDIFF(MIN(End_Date), Start_Date) AS project_duration
FROM
(SELECT Start_Date FROM Projects WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) a,
(SELECT End_Date FROM Projects WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) b
WHERE Start_Date < End_Date
GROUP BY Start_Date
ORDER BY project_duration ASC, Start_Date ASC;
expected output
+------------+------------+---------------+
| Start_Date | End_Date | project_duration |
+------------+------------+---------------+
| 2015-10-15 | 2015-10-16 | 1 |
| 2015-10-17 | 2015-10-18 | 1 |
| 2015-10-19 | 2015-10-20 | 1 |
| 2015-10-21 | 2015-10-22 | 1 |
| 2015-11-01 | 2015-11-02 | 1 |
| 2015-11-17 | 2015-11-18 | 1 |
| 2015-10-11 | 2015-10-13 | 2 |
| 2015-11-11 | 2015-11-13 | 2 |
| 2015-10-01 | 2015-10-05 | 4 |
| 2015-11-04 | 2015-11-08 | 4 |
| 2015-10-25 | 2015-10-31 | 6 |
+------------+------------+---------------+
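The same start/end matching can be sketched with plain sets in Python. This is only an illustration of the logic described above; the function name find_projects is hypothetical, and tasks are assumed to be (start_date, end_date) pairs of datetime.date values.

```python
from datetime import date


def find_projects(tasks):
    """Return (start, end, duration_days) per project, ordered like the query."""
    starts = {s for s, _ in tasks}
    ends = {e for _, e in tasks}
    # A project begins on a start date that is nobody's end date,
    # and finishes on an end date that is nobody's start date.
    project_starts = starts - ends
    project_ends = ends - starts
    result = []
    for s in project_starts:
        # The project's end is the earliest project end after its start.
        e = min(x for x in project_ends if x > s)
        result.append((s, e, (e - s).days))
    # Shortest projects first, ties broken by start date.
    result.sort(key=lambda r: (r[2], r[0]))
    return result
```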
I have the following data:
mysql> select no,crt_date,tobilling_date,sent_to_client,dop_prov from assistfin limit 20;
+--------+---------------------+---------------------+----------------+------------+
| no | crt_date | tobilling_date | sent_to_client | dop_prov |
+--------+---------------------+---------------------+----------------+------------+
| 50.01 | 2014-02-05 10:28:10 | 2014-02-05 14:42:35 | 2014-04-16 | 2014-09-23 |
| 123.01 | 2014-02-05 19:17:36 | 2014-03-17 18:58:05 | 2014-04-10 | 2014-06-30 |
| 51.01 | 2014-02-06 00:09:32 | 2014-03-20 16:53:46 | 2014-04-10 | 2014-06-30 |
| 124.01 | 2014-02-06 15:29:08 | 2014-03-20 17:04:42 | 2014-04-10 | 2014-06-30 |
| 230.01 | 2014-02-07 22:01:11 | 2014-03-20 16:41:03 | 2014-04-10 | 2014-06-30 |
| 252.01 | 2014-02-08 02:52:33 | 2014-03-20 16:43:03 | 2014-04-10 | 2014-06-30 |
| 123.02 | 2014-02-08 03:00:52 | 2014-03-17 18:58:10 | 2014-04-10 | 2014-06-30 |
| 213.01 | 2014-02-08 04:01:35 | 2014-03-26 19:03:01 | 2014-04-10 | 2014-09-19 |
| 55.01 | 2014-02-08 21:04:45 | 2014-03-07 18:40:46 | NULL | 2014-06-26 |
| 126.01 | 2014-02-08 21:46:58 | 2014-09-02 18:39:36 | 2014-09-09 | 2014-09-26 |
| 284.01 | 2014-02-09 01:52:54 | 2014-06-11 19:11:06 | 2014-07-02 | 2014-07-21 |
| 261.01 | 2014-02-09 02:20:34 | 2014-03-17 20:57:39 | 2014-04-10 | 2014-06-30 |
| 318.01 | 2014-02-09 03:09:28 | 2014-03-17 20:44:25 | 2014-04-10 | 2014-06-30 |
| 225.01 | 2015-02-10 03:21:08 | 2014-03-20 16:57:56 | 2014-04-10 | 2014-06-30 |
| 248.01 | 2014-02-09 03:30:58 | 2014-03-18 18:02:21 | 2014-04-10 | 2014-06-30 |
| 178.01 | 2014-04-05 03:35:25 | 2014-03-21 17:10:12 | 2014-04-10 | 2014-06-30 |
| 184.01 | 2014-04-08 04:01:13 | 2015-03-20 16:38:02 | 2015-04-10 | 2015-06-30 |
| 320.01 | 2014-04-08 05:57:23 | 2015-03-17 20:49:19 | 2015-04-10 | 2015-06-30 |
| 230.02 | 2015-05-08 06:18:15 | 2016-03-20 16:41:08 | 2016-04-10 | 2016-06-06 |
| 325.01 | 2014-05-09 06:23:50 | 2015-03-17 20:42:04 | 2015-04-10 | 2015-06-30 |
+--------+---------------------+---------------------+----------------+------------+
I need to get the following data:
+---------+---------+--------+-----------+---------+
| year | Created | Passed | To client | To prov |
+---------+---------+--------+-----------+---------+
| 2016-01 | 1901 | 1879 | 1873 | 1743 |
| 2016-02 | 2192 | 2169 | 2114 | 1912 |
| 2016-03 | 2693 | 2639 | 2539 | 2309 |
| 2016-04 | 2634 | 2574 | 2273 | 1976 |
| 2016-05 | 2593 | 2497 | 1109 | 949 |
| 2016-06 | 471 | 449 | 2 | 78 |
+---------+---------+--------+-----------+---------+
Here the year column is formatted like DATE_FORMAT(curdate(), '%Y-%m'), and the next column is COUNT(assistfin.crt_date) AS Created.
The problem is that crt_date can fall in 2015 while sent_to_client or dop_prov fall in 2016.
How do I write the correct query?
OK, sorry this is so long and messy. I couldn't do it using unions, as I so arrogantly claimed in the comments. I also have to reference "MySQL: Is it possible to 'fill' a SELECT with values without a table?", which gave me the list of months. You could rewrite it so that you left join all the tables to crt_date, but then it won't show a month when nothing was created, hence the generated months table. The original query had a LIMIT 120 on the months, but I have replaced it with datetime > '2014', which you can change to your earliest date.
Try this and see how quickly it runs for you.
select Months.yearmonth, created, passed, to_client, to_prov
from
(SELECT date_format(datetime,'%Y-%m') as yearmonth
FROM (
select (curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) MONTH) as datetime
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) AS t
where datetime > '2014' -- enter your earliest year here
ORDER BY datetime ASC) Months left join
(select date_format(crt_date,'%Y-%m') as yearmonth, count(no) as "created" from assistfin group by yearmonth) created on Months.yearmonth=created.yearmonth
left join
(select date_format(tobilling_date,'%Y-%m') as yearmonth, count(no) as "passed" from assistfin group by yearmonth) passed on Months.yearmonth=passed.yearmonth
left join
(select date_format(sent_to_client,'%Y-%m') as yearmonth, count(no) as "to_client" from assistfin group by yearmonth) to_client on Months.yearmonth=to_client.yearmonth
left join
(select date_format(dop_prov,'%Y-%m') as yearmonth, count(no) as "to_prov" from assistfin group by yearmonth) to_prov on Months.yearmonth=to_prov.yearmonth
order by Months.yearmonth;
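The key point of the query above is that each date column is counted by its own month and the counts are only combined afterwards. A minimal Python sketch of that idea follows. The function name monthly_counts and the dict-based row layout are assumptions for illustration, and unlike the SQL it lists only months that occur in at least one column rather than a generated month list.

```python
from collections import Counter


def monthly_counts(rows, columns):
    """Count rows per 'YYYY-MM' independently for each date column.

    rows: list of dicts mapping column name to a 'YYYY-MM-DD...' string or None.
    """
    # One counter per column, keyed by the first 7 characters ('YYYY-MM').
    counters = {
        col: Counter(row[col][:7] for row in rows if row.get(col))
        for col in columns
    }
    # Every month seen in any column becomes one output row.
    months = sorted(set().union(*counters.values()))
    return [(m, *(counters[col][m] for col in columns)) for m in months]
```

A row created in one month and sent to the client in another therefore contributes to two different output rows, which is what the question asks for.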
Use GROUP BY, with date_format in the WHERE clause:
select date_format(crt_date, '%Y-%m') as year, count(sent_to_client ), count(dop_prov)
from assistfin
where date_format(crt_date, '%Y-%m') = date_format(now(), '%Y-%m')
group by year
For the whole year you can use:
select date_format(crt_date, '%Y-%m') as year, count(sent_to_client ), count(dop_prov)
from assistfin
where date_format(crt_date, '%Y') = date_format(now(), '%Y')
group by year
Or for a range of years you can use:
select date_format(crt_date, '%Y-%m') as year, count(sent_to_client ), count(dop_prov)
from assistfin
where date_format(crt_date, '%Y')
BETWEEN(date_format(now(),'%Y')-2) and date_format(now(), '%Y')
group by year
I am trying to show invoices for every single day, so for that purpose I used GROUP BY on the created date and SUM on subtotal. This is how I did it:
SELECT
`main_table`.*,
SUM(subtotal) AS `total_sales`
FROM
`sales_invoice` AS `main_table`
GROUP BY
DATE_FORMAT(created_at, "%m-%y")
It's working, but I also want to get the first and last invoice number (Invoice # from and Invoice # to) for every date. Is it possible to do this with a single query?
EDIT :
Table Structure :
------------------------------------------------
| id | inoice_no | created_at | subtotal
| 1 | 34 | 2015-03-17 05:55:27 | 5
| 2 | 35 | 2015-03-17 12:35:00 | 7
| 3 | 36 | 2015-03-20 01:40:00 | 3
| 4 | 37 | 2015-03-20 07:05:13 | 6
| 5 | 38 | 2015-03-20 10:25:23 | 1
| 6 | 39 | 2015-03-24 12:00:00 | 6
------------------------------------------------
Output
---------------------------------------------------------------
| id | inoice_no | created_at | subtotal | total_sales
| 2 | 35 | 2015-03-17 12:35:00 | 7 | 12
| 5 | 38 | 2015-03-20 10:25:23 | 1 | 10
| 6 | 39 | 2015-03-24 12:00:00 | 6 | 6
-----------------------------------------------------------------
What I Expect
---------------------------------------------------------------
| id | inoice_no | created_at | subtotal | total_sales | in_from | in_to
| 2 | 35 | 2015-03-17 12:35:00 | 7 | 12 | 34 | 35
| 5 | 38 | 2015-03-20 10:25:23 | 1 | 10 | 36 | 38
| 6 | 39 | 2015-03-24 12:00:00 | 6 | 6 | 39 | 39
-----------------------------------------------------------------
If your invoice number is an INTEGER, then the query below will give you the result you want:
SELECT DATE_FORMAT(A.created_at, "%m-%y") AS InvoiceDate,
MIN(A.inoice_no) AS FromInvoiceNo,
MAX(A.inoice_no) AS ToInvoiceNo,
SUM(A.subtotal) AS total_sales
FROM sales_invoice AS A
GROUP BY InvoiceDate;
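A quick way to check the MIN/MAX idea against the sample rows is the sketch below. It is an assumption-laden illustration: it runs on Python's built-in sqlite3 rather than MySQL, the function name invoice_ranges is hypothetical, and it groups by calendar day with date(created_at), since the question asks for one row per day.

```python
import sqlite3


def invoice_ranges(rows):
    """rows are (id, inoice_no, created_at, subtotal) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sales_invoice "
        "(id INTEGER, inoice_no INTEGER, created_at TEXT, subtotal INTEGER)"
    )
    con.executemany("INSERT INTO sales_invoice VALUES (?, ?, ?, ?)", rows)
    # One row per day: first and last invoice number plus the day's total.
    result = con.execute(
        """
        SELECT date(created_at) AS invoice_day,
               MIN(inoice_no)   AS in_from,
               MAX(inoice_no)   AS in_to,
               SUM(subtotal)    AS total_sales
        FROM sales_invoice
        GROUP BY invoice_day
        ORDER BY invoice_day
        """
    ).fetchall()
    con.close()
    return result
```

MIN and MAX only give the invoice range when invoice numbers increase over time within a day, which holds for the sample data.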
I guess salesid is primaryid in sales_invoice table.
select * from(
SELECT
`main_table`.*,
SUM(subtotal) AS `total_sales`
FROM
`sales_invoice` AS `main_table`
GROUP BY
DATE_FORMAT(created_at, "%m-%y")
order by main_table.salesid limit 1
union all
SELECT
`main_table`.*,
SUM(subtotal) AS `total_sales`
FROM
`sales_invoice` AS `main_table`
GROUP BY
DATE_FORMAT(created_at, "%m-%y")
order by main_table.salesid desc limit 1
)a
I'm trying to select the most recent rows for every unique userid where pid = 50 and active = 1. I haven't been able to figure it out.
Here is a sample table
+-----+----------+-------+-----------------------+---------+
| id | userid | pid | start_date | active |
+-----+----------+-------+-----------------------+---------+
| 1 | 4 | 50 | 2015-05-15 12:00:00 | 1 |
| 2 | 4 | 50 | 2015-05-16 12:00:00 | 1 |
| 3 | 4 | 50 | 2015-05-17 12:00:00 | 0 |
| 4 | 4 | 51 | 2015-06-29 12:00:00 | 1 |
| 5 | 4 | 51 | 2015-06-30 12:00:00 | 1 |
| 6 | 5 | 50 | 2015-07-05 12:00:00 | 1 |
| 7 | 5 | 50 | 2015-07-06 12:00:00 | 1 |
| 8 | 5 | 51 | 2015-07-08 12:00:00 | 1 |
+-----+----------+-------+-----------------------+---------+
Desired Result
+-----+----------+-------+-----------------------+---------+
| id | userid | pid | start_date | active |
+-----+----------+-------+-----------------------+---------+
| 2 | 4 | 50 | 2015-05-16 12:00:00 | 1 |
| 7 | 5 | 50 | 2015-07-06 12:00:00 | 1 |
+-----+----------+-------+-----------------------+---------+
I've tried a bunch of things and this is the closest I got, but unfortunately it is not quite there.
SELECT *
FROM mytable t1
WHERE
(
SELECT COUNT(*)
FROM mytable t2
WHERE
t1.userid = t2.userid
AND t1.start_date < t2.start_date
) < 1
AND pid = 50
AND active = 1
ORDER BY start_date DESC
plan
get last record grouping by userid where pid is 50 and is active
inner join to mytable to get the record info associated with last
query
select
my.*
from
(
select userid, pid, active, max(start_date) as lst
from mytable
where pid = 50
and active = 1
group by userid, pid, active
) maxd
inner join mytable my
on maxd.userid = my.userid
and maxd.pid = my.pid
and maxd.active = my.active
and maxd.lst = my.start_date
;
output
+----+--------+-----+------------------------+--------+
| id | userid | pid | start_date | active |
+----+--------+-----+------------------------+--------+
| 2 | 4 | 50 | May, 16 2015 12:00:00 | 1 |
| 7 | 5 | 50 | July, 06 2015 12:00:00 | 1 |
+----+--------+-----+------------------------+--------+
sqlfiddle
notes
As suggested by @Strawberry, updated to also join on pid and active. This avoids also returning a record that is not active or not pid 50 but happens to have exactly the same date.
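The plan above (filter first, then keep the latest row per user) can be sketched in a few lines of Python. This is a hedged illustration rather than a replacement for the join: the function name latest_per_user is hypothetical, and rows are assumed to be (id, userid, pid, start_date, active) tuples with start_date as a sortable 'YYYY-MM-DD HH:MM:SS' string.

```python
def latest_per_user(rows, pid=50):
    """Return the most recent active row with the given pid for each userid."""
    latest = {}
    for row in rows:
        _id, userid, row_pid, start_date, active = row
        # Filter before comparing dates, so an inactive or other-pid row
        # can never hide an earlier matching row.
        if row_pid != pid or active != 1:
            continue
        if userid not in latest or start_date > latest[userid][3]:
            latest[userid] = row
    return [latest[u] for u in sorted(latest)]
```

The asker's attempt compared dates against all rows of the user before filtering, which is why user 4's inactive row from 2015-05-17 and the later pid 51 rows suppressed the wanted record.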
Is there any way to count a given run of timestamps that are close to each other, but not necessarily in a fixed time frame?
I.e., not grouped by hour or minute, but rather grouped by how close the current row's timestamp is to the next row's timestamp. If the next row is within "x" seconds/minutes, then add that row to the group; otherwise start a new grouping.
Given this data:
+----+---------+---------------------+
| id | item_id | event_date |
+----+---------+---------------------+
| 1 | 1 | 2013-05-17 11:59:59 |
| 2 | 1 | 2013-05-17 12:00:00 |
| 3 | 1 | 2013-05-17 12:00:02 |
| 4 | 1 | 2013-05-17 12:00:03 |
| 5 | 3 | 2013-05-17 14:05:00 |
| 6 | 3 | 2013-05-17 14:05:01 |
| 7 | 3 | 2013-05-17 15:30:00 |
| 8 | 3 | 2013-05-17 15:30:01 |
| 9 | 3 | 2013-05-17 15:30:02 |
| 10 | 1 | 2013-05-18 09:12:00 |
| 11 | 1 | 2013-05-18 09:13:30 |
| 12 | 1 | 2013-05-18 09:13:45 |
| 13 | 1 | 2013-05-18 09:14:00 |
| 14 | 2 | 2013-05-20 15:45:00 |
| 15 | 2 | 2013-05-20 15:45:03 |
| 16 | 2 | 2013-05-20 15:45:10 |
| 17 | 2 | 2013-05-23 07:36:00 |
| 18 | 2 | 2013-05-23 07:36:10 |
| 19 | 2 | 2013-05-23 07:36:12 |
| 20 | 2 | 2013-05-23 07:36:15 |
| 21 | 1 | 2013-05-24 11:55:00 |
| 22 | 1 | 2013-05-24 11:55:02 |
+----+---------+---------------------+
Desired Results:
+---------+-------+---------------------+
| item_id | total | last_date_in_group |
+---------+-------+---------------------+
| 1 | 4 | 2013-05-17 12:00:03 |
| 3 | 2 | 2013-05-17 14:05:01 |
| 3 | 3 | 2013-05-17 15:30:02 |
| 1 | 4 | 2013-05-18 09:14:00 |
| 2 | 3 | 2013-05-20 15:45:10 |
| 2 | 4 | 2013-05-23 07:36:15 |
| 1 | 2 | 2013-05-24 11:55:02 |
+---------+-------+---------------------+
This is a little complicated. To start, you need the time of the next event for each record. The following subquery adds such a time (nexted), if it is within bounds:
select t.*,
(select event_date
from t t2
where t2.item_id = t.item_id and
t2.event_date > t.event_date and
<date comparison here>
order by event_date limit 1
) as nexted
from t
This uses a correlated subquery. The <date comparison here> is for whatever date comparison you want. When there is no record, the value will be NULL.
Now, with this information (nexted) there is a trick to get the grouping. For any record, it is the first event time afterwards where nexted is NULL. This will be the last event in the series. Unfortunately, this requires two levels of nested correlated subqueries (or joins with aggregations). The result looks a bit unwieldy:
select item_id, GROUPING, MIN(event_date) as start_date, MAX(event_date) as end_date,
COUNT(*) as num_dates
from (select t.*,
(select min(t2.event_date)
from (select t1.*,
(select event_date
from t t2
where t2.item_id = t1.item_id and
t2.event_date > t1.event_date and
<date comparison here>
order by event_date limit 1
) as nexted
from t t1
) t2
where t2.nexted is null and
t2.item_id = t.item_id and
t2.event_date >= t.event_date
) as grouping
from t
) s
group by item_id, grouping;
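Outside SQL, the same grouping rule (extend the run while the next event for the item is close enough, otherwise start a new run) is a single pass over the sorted events. The sketch below is an illustration under assumptions: the function name group_runs and the max_gap parameter are hypothetical, and events are (item_id, datetime) pairs.

```python
from datetime import datetime, timedelta


def group_runs(events, max_gap):
    """Group events per item into runs whose neighbours are <= max_gap apart.

    Returns (item_id, total, last_date_in_group) tuples in order of run start.
    """
    runs = []       # each run is [item_id, total, last_event_time]
    open_run = {}   # item_id -> index of that item's most recent run
    for item_id, ts in sorted(events, key=lambda e: e[1]):
        idx = open_run.get(item_id)
        if idx is not None and ts - runs[idx][2] <= max_gap:
            # Close enough to the previous event of this item: extend the run.
            runs[idx][1] += 1
            runs[idx][2] = ts
        else:
            # Too far away, or first event of this item: start a new run.
            runs.append([item_id, 1, ts])
            open_run[item_id] = len(runs) - 1
    return [tuple(r) for r in runs]
```

Calling group_runs(events, timedelta(minutes=2)) on the sample data would play the role of the <date comparison here> placeholder with a two-minute bound.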
What about approaching it by finding each individual record's local associations, and then grouping on the max event date from each record's discoveries? This is based on a static time interval (5 minutes in my example):
SELECT item_id, MAX(total), MAX(last_date_in_group) AS last_date_in_group FROM (
SELECT t1.item_id, COUNT(*) AS total, COALESCE(GREATEST(t1.event_date, MAX(t2.event_date)), t1.event_date) AS last_date_in_group
FROM table_name t1
LEFT JOIN table_name t2 ON t2.event_date BETWEEN t1.event_date AND t1.event_date + INTERVAL 5 MINUTE
GROUP BY t1.id
) t
GROUP BY last_date_in_group