MySQL get count of periods where date in row - mysql

I have a MySQL table, similar to this example:
c_id date value
66 2015-07-01 1
66 2015-07-02 777
66 2015-08-01 33
66 2015-08-20 200
66 2015-08-21 11
66 2015-09-14 202
66 2015-09-15 204
66 2015-09-16 23
66 2015-09-17 0
66 2015-09-18 231
What I need is the count of periods of consecutive dates. There is no fixed start or end date; the dates can be anything.
For example: 2015-07-01 to 2015-07-02 is one period, 2015-08-01 is the second period, 2015-08-20 to 2015-08-21 is the third period, and 2015-09-14 to 2015-09-18 is the fourth period. So in this example there are four periods.
SELECT
SUM(value) as value_sum,
... as period_count
FROM my_table
WHERE cid = 66
I have not been able to figure this out all day. Thanks.

I don't have enough reputation to comment to the above answer.
If all you need is the NUMBER of splits, then you can simply reword your question: "How many entries have a date D, such that the date D - 1 DAY does not have an entry?"
In which case, this is all you need:
SELECT
COUNT(*) as PeriodCount
FROM
`periods`
WHERE
DATE_ADD(`date`, INTERVAL - 1 DAY) NOT IN (SELECT `date` from `periods`);
In your PHP, just select the "PeriodCount" column from the first row.
You had me working on some crazy stored procedure approach until that clarification :P
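To sanity-check the result outside the database, the same rewording ("count the dates whose previous day has no entry") can be sketched in Python. This is only an illustration under the assumption that the dates have already been fetched into a list; `count_periods` and `sample` are made-up names, not part of the original answer.

```python
from datetime import date, timedelta

def count_periods(dates):
    """Count runs of consecutive calendar days in an iterable of dates."""
    present = set(dates)
    one_day = timedelta(days=1)
    # A date starts a period exactly when the previous day is absent.
    return sum(1 for d in present if d - one_day not in present)

# The dates from the question's example table.
sample = [
    date(2015, 7, 1), date(2015, 7, 2),
    date(2015, 8, 1),
    date(2015, 8, 20), date(2015, 8, 21),
    date(2015, 9, 14), date(2015, 9, 15), date(2015, 9, 16),
    date(2015, 9, 17), date(2015, 9, 18),
]
print(count_periods(sample))  # 4
```

The set lookup plays the role of the NOT IN subquery: each date is tested once against the full set of dates.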

I should get deservedly flamed for this, but anyway, consider the following...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(date DATE NOT NULL PRIMARY KEY
,value INT NOT NULL
);
INSERT INTO my_table VALUES
('2015-07-01',1),
('2015-07-02',777),
('2015-08-01',33),
('2015-08-20',200),
('2015-08-21',11),
('2015-09-14',202),
('2015-09-15',204),
('2015-09-16',23),
('2015-09-17',0),
('2015-09-18',231);
SELECT x.*
, SUM(y.value) total
FROM
( SELECT a.date start
, MIN(c.date) end
FROM my_table a
LEFT
JOIN my_table b
ON b.date = a.date - INTERVAL 1 DAY
LEFT
JOIN my_table c
ON c.date >= a.date
LEFT
JOIN my_table d
ON d.date = c.date + INTERVAL 1 DAY
WHERE b.date IS NULL
AND c.date IS NOT NULL
AND d.date IS NULL
GROUP
BY a.date
) x
JOIN my_table y
ON y.date BETWEEN x.start AND x.end
GROUP
BY x.start;
+------------+------------+-------+
| start | end | total |
+------------+------------+-------+
| 2015-07-01 | 2015-07-02 | 778 |
| 2015-08-01 | 2015-08-01 | 33 |
| 2015-08-20 | 2015-08-21 | 211 |
| 2015-09-14 | 2015-09-18 | 660 |
+------------+------------+-------+
4 rows in set (0.00 sec) -- <-- This is the number of periods

there is a simpler way of doing this, see here SQLfiddle:
SELECT min(date) start,max(date) end,sum(value) total FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl, (SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
GROUP BY diff
This select groups over the same difference between sequential number and date value.
Edit
As Strawberry remarked quite rightly, there was a flaw in my approach when a period spans a month change or indeed a change into the next year. The unix_timestamp() function can cure this though: it returns the seconds since 1970-1-1, so by dividing this number by 24*60*60 you get the days since that particular date. The rest is simple ...
If you only need the count, as your last comment stated, you can do it even simpler:
SELECT count(distinct diff) period_count FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl,(SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
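The grouping trick used here (day number minus row number stays constant inside a run of consecutive days) can be checked with a small Python sketch. It assumes one row per date and uses made-up names (`periods`, `sample_rows`); it is an illustration of the idea, not a replacement for the query.

```python
from datetime import date
from itertools import groupby

def periods(rows):
    """rows: iterable of (date, value) with unique dates.

    Returns a list of (start, end, total) tuples, one per run of consecutive days.
    """
    ordered = sorted(rows)
    result = []
    # Inside a run of consecutive days, "day number minus row index" is constant,
    # so grouping on that difference splits the rows into periods.
    for _, group in groupby(
        enumerate(ordered),
        key=lambda pair: pair[1][0].toordinal() - pair[0],
    ):
        members = [row for _, row in group]
        total = sum(value for _, value in members)
        result.append((members[0][0], members[-1][0], total))
    return result

# The rows from the question's example table.
sample_rows = [
    (date(2015, 7, 1), 1), (date(2015, 7, 2), 777),
    (date(2015, 8, 1), 33),
    (date(2015, 8, 20), 200), (date(2015, 8, 21), 11),
    (date(2015, 9, 14), 202), (date(2015, 9, 15), 204),
    (date(2015, 9, 16), 23), (date(2015, 9, 17), 0), (date(2015, 9, 18), 231),
]
for start, end, total in periods(sample_rows):
    print(start, end, total)
print(len(periods(sample_rows)))  # 4
```

`toordinal()` counts whole days, so month and year boundaries need no special handling.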

Thanks. @cars10's solution worked in MySQL, but I could not get the period count to echo in PHP; it returned 0. I got it working thanks to @jarkinstall. So my final select looks something like this:
SELECT
sum(coalesce(count_tmp,coalesce(count_reserved,0))) as sum
,(SELECT COUNT(*) FROM my_table WHERE cid='.$cid.' AND DATE_ADD(date, INTERVAL - 1 DAY) NOT IN (SELECT date from my_table WHERE cid='.$cid.' AND coalesce(count_tmp,coalesce(count_reserved,0))>0)) as periods
,count(*) as count
,(min(date)) as min_date
,(max(date)) as max_date
FROM my_table WHERE cid=66
AND coalesce(count_tmp,coalesce(count_reserved,0))>0
ORDER BY date;

Related

How to make time buckets with a start and end time column?

I have 3 columns, employee_id, start_time and end_time. I want to make buckets of 1 hour that show me how many employees were working in each hour. For example, employee A worked from 12 pm to 3 pm and employee B worked from 2 pm to 4 pm, so at 12 pm 1 employee was working, at 1 pm 1 employee, at 2 pm 2 employees, at 3 pm 2 employees, and at 4 pm 1 employee. How can I do this in SQL?
[Images: sample input (start and end time columns) and expected outcome]
I want to create a bucket in order to know how many people were working in each hour of the day.
SELECT
Employee_id,
TIME(shift_start_at,timezone) AS shift_start,
TIME(shift_end_at,timezone) AS shift_end,
FROM
`employee_shifts` AS shifts
WHERE
DATE(shifts.shift_start_at_local) >= "2022-05-01"
GROUP BY
1,
2,
3
Assuming you are on MySQL version 8 or above: generate all the buckets, left join them to the shifts so that each bucket matches the shifts whose start-to-end range covers it, then count per bucket. For example:
DROP TABLE IF EXISTS t;
create table t (id int, startts datetime, endts datetime);
insert into t values
(1,'2022-06-19 08:30:00','2022-06-19 10:00:00'),
(2,'2022-06-19 08:30:00','2022-06-19 08:45:00'),
(3,'2022-06-19 07:00:00','2022-06-19 07:59:00');
with cte as
(select 7 as bucket union select 8 union select 9 union select 10 union select 11),
cte1 as
(select bucket,t.*,
floor(hour(startts)) starthour, floor(hour(endts)) endhour
from cte
left join t on cte.bucket between floor(hour(startts)) and floor(hour(endts))
)
select bucket,count(id) nof from cte1 group by bucket
;
+--------+-----+
| bucket | nof |
+--------+-----+
| 7 | 1 |
| 8 | 2 |
| 9 | 1 |
| 10 | 1 |
| 11 | 0 |
+--------+-----+
5 rows in set (0.001 sec)
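The same bucket logic can be sketched in Python as a cross-check. The sketch assumes every shift starts and ends on the same day, and, like the BETWEEN in the join, it counts a shift that ends exactly on the hour in that hour's bucket. The name `staff_per_hour` is made up for illustration.

```python
from datetime import datetime

def staff_per_hour(shifts, buckets):
    """shifts: list of (start, end) datetimes on the same day.
    buckets: iterable of hour numbers to report on.

    Returns {hour: number of shifts touching that hour}.
    """
    counts = {hour: 0 for hour in buckets}
    for start, end in shifts:
        # Every hour from the start hour to the end hour (inclusive) is covered.
        for hour in range(start.hour, end.hour + 1):
            if hour in counts:
                counts[hour] += 1
    return counts

# The three rows inserted into table t above.
shifts = [
    (datetime(2022, 6, 19, 8, 30), datetime(2022, 6, 19, 10, 0)),
    (datetime(2022, 6, 19, 8, 30), datetime(2022, 6, 19, 8, 45)),
    (datetime(2022, 6, 19, 7, 0), datetime(2022, 6, 19, 7, 59)),
]
print(staff_per_hour(shifts, range(7, 12)))
# {7: 1, 8: 2, 9: 1, 10: 1, 11: 0}
```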
If you have a limited number of time buckets, maybe you can do it this way:
WITH CTE AS
(SELECT
COUNTRY,
MONTH,
TIMESTAMP_DIFF(time_b, time_a, MINUTE) dt,
METRIC_a,
METRIC_b
FROM
TABLE_NAME)
SELECT
CASE
WHEN dt BETWEEN 0 AND 10 THEN "0-10"
WHEN dt BETWEEN 11 AND 20 THEN "11-20"
WHEN dt BETWEEN 21 AND 30 THEN "21-30"
WHEN dt BETWEEN 31 AND 40 THEN "31-40"
WHEN dt > 40 THEN ">40"
END as time_bucket,
AVG(METRIC_a),
SUM(METRIC_b)
FROM CTE
GROUP BY time_bucket
I should emphasize, though, that this solution only works if you have a limited number of buckets. If you have a lot of buckets, you can create a base table with your buckets and then LEFT JOIN it to get your results.
Just use a subquery for each column mentioning the required timestamp in between, also make sure your start_time and end_time columns are timestamp types. For more information, please share the table structure, sample data, and expected output
If I understood well, this would be
SELECT HOUR, (SELECT COUNT(*)
FROM employee
WHERE start_time <= HOUR
AND end_time >= HOUR) AS working
FROM schedule HOUR
Where schedule is a table with employee schedules.

How to sum up records from starting month to current per month

I've searched for this topic but all I got was questions about grouping results by month. I need to retrieve rows grouped by month with summed up cost from start date to this whole month
Here is an example table
Date | Val
----------- | -----
2017-01-20 | 10
----------- | -----
2017-02-15 | 5
----------- | -----
2017-02-24 | 15
----------- | -----
2017-03-14 | 20
I need to get following output (date format is not the case):
2017-01-20 | 10
2017-02-24 | 30
2017-03-14 | 50
When I run
SELECT SUM(`val`) as `sum`, DATE(`date`) as `date` FROM table
WHERE `date` BETWEEN :startDate
AND :endDate GROUP BY year(`date`), month(`date`)
I got sum per month of course.
I can't think of a way to put this nicely into one query to achieve the desired effect. I will probably need some nested queries, but maybe you know a better solution.
Something like this should work (untested). You could also solve this with subqueries, but I guess that would be more costly. If you want to sort the result by the total value, the subquery variant might be faster.
SET @total:=0;
SELECT
(@total := @total + q.sum) AS total, q.date
FROM
(SELECT SUM(`val`) as `sum`, DATE(`date`) as `date` FROM table
WHERE `date` BETWEEN :startDate
AND :endDate GROUP BY year(`date`), month(`date`)) AS q
You can use the DATE_FORMAT function both to format the output and to group by month.
DATE_FORMAT(date,format)
Formats the date value according to the format string.
SELECT Date, @total := @total + val as total
FROM
(select @total := 0) x,
(select Sum(Val) as Val, DATE_FORMAT(Date, '%m-%Y') as Date
FROM st where Date >= '2017-01-01' and Date <= '2017-12-31'
GROUP BY DATE_FORMAT(Date, '%m-%Y')) y
;
+---------+-------+
| Date | total |
+---------+-------+
| 01-2017 | 10 |
+---------+-------+
| 02-2017 | 30 |
+---------+-------+
| 03-2017 | 50 |
+---------+-------+
Can check it here: http://rextester.com/FOQO81166
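The user-variable approach relies on the rows of the inner query arriving in month order. As an order-independent cross-check, the same running total can be sketched in Python; the function name `running_monthly_totals` is made up, and the rows are assumed to be already fetched as (date, value) pairs. On MySQL 8 or later, a window function such as SUM(...) OVER (ORDER BY ...) is another option.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

def running_monthly_totals(rows):
    """rows: iterable of (date, value).

    Returns [((year, month), running_total)] in chronological order.
    """
    monthly = defaultdict(int)
    for day, val in rows:
        monthly[(day.year, day.month)] += val   # step 1: sum per month
    months = sorted(monthly)                    # step 2: fix the order explicitly
    totals = accumulate(monthly[m] for m in months)  # step 3: cumulative sum
    return list(zip(months, totals))

# The example table from the question.
rows = [
    (date(2017, 1, 20), 10),
    (date(2017, 2, 15), 5),
    (date(2017, 2, 24), 15),
    (date(2017, 3, 14), 20),
]
print(running_monthly_totals(rows))
# [((2017, 1), 10), ((2017, 2), 30), ((2017, 3), 50)]
```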
Try this.
I use yearmonth as an integer (the year of the date multiplied by 100 plus the month of the date) . If you want to re-format, your call, but integers are always a bit faster.
It's the complete scenario, including input data.
CREATE TABLE tab (
dt DATE
, qty INT
);
INSERT INTO tab(dt,qty) VALUES( '2017-01-20',10);
INSERT INTO tab(dt,qty) VALUES( '2017-02-15', 5);
INSERT INTO tab(dt,qty) VALUES( '2017-02-24',15);
INSERT INTO tab(dt,qty) VALUES( '2017-03-14',20);
SELECT
yearmonths.yearmonth
, SUM(by_month.month_qty) AS running_qty
FROM (
SELECT DISTINCT
YEAR(dt) * 100 + MONTH(dt) AS yearmonth
FROM tab
) yearmonths
INNER JOIN (
SELECT
YEAR(dt) * 100 + MONTH(dt) AS yearmonth
, SUM(qty) AS month_qty
FROM tab
GROUP BY YEAR(dt) * 100 + MONTH(dt)
) by_month
ON yearmonths.yearmonth >= by_month.yearmonth
GROUP BY yearmonths.yearmonth
ORDER BY 1;
;
yearmonth|running_qty
201,701| 10.0
201,702| 30.0
201,703| 50.0
select succeeded; 3 rows fetched
Need explanations?
My solution has the advantage over the others that it will be re-usable without change when you move it to a more modern database - and you can convert it to using analytic functions when you have time.
Marco the Sane

MySQL concurrent select from rows

I have a table that has start_date and end_date timestamps, the rows contain data from a radius accounting table.
In short, once a user logs in, a row is inserted with the start_date timestamp, and once the user logs off, the end_date is populated with an UPDATE.
Here are a couple of rows:
ID | start_date | end_date
22 2013-11-19 12:00:22 2013-11-20 14:20:22 (*)
23 2013-11-19 12:02:22 2013-11-20 15:20:22 (*)
23 2013-11-19 17:02:22 2013-11-20 20:20:22
24 2013-11-20 12:06:22 2013-11-20 15:20:22 *
25 2013-11-20 12:40:22 2013-11-20 15:23:22 *
26 2013-11-20 12:50:22 2013-11-20 17:23:22 *
27 2013-11-20 16:56:22 2013-11-20 17:29:22
28 2013-11-20 17:58:22 2013-11-20 20:24:22
So in this case, for 2013-11-19 the maximum number of concurrent users is 2 (marked with (*)), because their start-to-end intervals overlap; for 2013-11-20 it is 3 (marked with *).
I am trying to write an SQL query to get the highest number of concurrent users in a day (based on the start and end date), so a short result would be that on 2013-08-12 the most users online at the same time was xx.
I could do this in PHP by analyzing row by row, but I would like to keep it as an SQL query.
Try
select d, count(*) as user_count
from
(
select start_date as d from your_table
union all
select end_date as d from your_table
) x
group by d
order by user_count desc
You could calculate count of concurrent users on each datapoint you have (start_date / end_date) and then calculate the max out of that:
select max(cnt)
from (
select q.t, count(*) as 'cnt'
from (
select start_date as 't'
from log
where start_date between YOUR_START_DATE and YOUR_END_DATE
union
select end_date
from log
where end_date between YOUR_START_DATE and YOUR_END_DATE
) as q
join log l on q.t between l.start_date and l.end_date
group by q.t
) a
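The idea of counting active sessions at each start or end point can also be written as a sweep over sorted events. The Python sketch below uses made-up session times (not the rows from the question) and a made-up function name, `max_concurrent`. It returns the overall peak for the sessions passed in, so for a per-day answer you would call it once per day.

```python
from datetime import datetime

def max_concurrent(sessions):
    """sessions: iterable of (start, end) datetimes. Returns the peak overlap."""
    events = []
    for start, end in sessions:
        events.append((start, 1))    # a user logs in
        events.append((end, -1))     # a user logs off
    # Sort by time; at equal times handle logins before logoffs, so sessions that
    # merely touch count as overlapping (the same convention as BETWEEN).
    events.sort(key=lambda event: (event[0], -event[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

# Hypothetical sessions on a single day.
sessions = [
    (datetime(2013, 11, 20, 12, 0), datetime(2013, 11, 20, 14, 0)),
    (datetime(2013, 11, 20, 12, 30), datetime(2013, 11, 20, 13, 0)),
    (datetime(2013, 11, 20, 12, 45), datetime(2013, 11, 20, 15, 0)),
    (datetime(2013, 11, 20, 16, 0), datetime(2013, 11, 20, 17, 0)),
]
print(max_concurrent(sessions))  # 3
```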

Find big enough gaps in booking table

A rental system uses a booking table to store all bookings and reservations:
booking | item | startdate | enddate
1 | 42 | 2013-10-25 16:00 | 2013-10-27 12:00
2 | 42 | 2013-10-27 14:00 | 2013-10-28 18:00
3 | 42 | 2013-10-30 09:00 | 2013-11-01 09:00
…
Let’s say a user wants to rent item 42 from 2013-10-27 12:00 until 2013-10-28 12:00 which is a period of one day. The system will tell him, that the item is not available in the given time frame, since booking no. 2 collides.
Now I want to suggest the earliest rental date and time when the selected item is available again. Of course considering the user’s requested period (1 day) beginning with the user’s desired date and time.
So in the case above, I’m looking for an SQL query that returns 2013-10-28 18:00, since the earliest date since 2013-10-27 12:00 at which item 42 will be available for 1 day, is from 2013-10-28 18:00 until 2013-10-29 18:00.
So I need to find a gap between bookings that is big enough to hold the user’s reservation and that is as close as possible to the desired start date.
Or in other words: I need to find the first booking for a given item, after which there’s enough free time to place the user’s booking.
Is this possible in plain SQL without having to iterate over every booking and its successor?
If you can't redesign your database to use something more efficient, this will get the answer. You'll obviously want to parameterize it. It says find either the desired date, or the earliest end date where the hire interval doesn't overlap an existing booking:
Select
min(startdate)
From (
select
cast('2013-10-27 12:00' as datetime) startdate
from
dual
union all
select
enddate
from
booking
where
enddate > cast('2013-10-27 12:00' as datetime) and
item = 42
) b1
Where
not exists (
select
'x'
from
booking b2
where
item = 42 and
b1.startdate < b2.enddate and
b2.startdate < date_add(b1.startdate, interval 24 hour)
);
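The logic of this query (candidate start times are the desired time plus every later end date; keep the earliest candidate whose interval overlaps no booking) can be sketched in Python to check it against the example from the question. `earliest_start` is a made-up name, and the bookings are assumed to be already filtered to one item.

```python
from datetime import datetime, timedelta

def earliest_start(bookings, desired, duration):
    """bookings: list of (start, end) for one item.

    Returns the earliest start >= desired such that [start, start + duration)
    overlaps no existing booking.
    """
    # Candidates: the desired time itself, or the moment some later booking ends.
    candidates = [desired] + [end for _, end in bookings if end > desired]
    for start in sorted(candidates):
        finish = start + duration
        # Two intervals overlap when each one starts before the other ends.
        if not any(start < b_end and b_start < finish for b_start, b_end in bookings):
            return start
    # Not reached in practice: the latest candidate always lies after every booking.

bookings = [
    (datetime(2013, 10, 25, 16, 0), datetime(2013, 10, 27, 12, 0)),
    (datetime(2013, 10, 27, 14, 0), datetime(2013, 10, 28, 18, 0)),
    (datetime(2013, 10, 30, 9, 0), datetime(2013, 11, 1, 9, 0)),
]
print(earliest_start(bookings, datetime(2013, 10, 27, 12, 0), timedelta(days=1)))
# 2013-10-28 18:00:00
```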
SELECT startfree,secondsfree FROM (
SELECT
@lastenddate AS startfree,
UNIX_TIMESTAMP(startdate)-UNIX_TIMESTAMP(@lastenddate) AS secondsfree,
@lastenddate:=enddate AS ignoreme
FROM
(SELECT startdate,enddate FROM bookings WHERE item=42) AS schedule,
(SELECT @lastenddate:=NOW()) AS init
ORDER BY startdate
) AS baseview
WHERE startfree>='2013-10-27 12:00:00'
AND secondsfree>=86400
ORDER BY startfree
LIMIT 1
;
Some explanation: The inner query uses a variable to move the iteration into SQL, the outer query finds the needed row.
That said, I would not do this in SQL if the DB structure is as given. You could reduce the iteration count by using a smart WHERE in the inner query to limit it to a sane timespan, but chances are this won't perform well.
EDIT
A caveat: I did not check, but I assume, this won't work, if there are no prior reservations in the list - this should not be a problem, as in this case your first reservation attempt (original time) will work.
Searching for overlapping date ranges generally yields poor performance in SQL. For that reason having a "Calendar" of available slots often makes things a lot more efficient.
For example, the booking 2013-10-25 16:00 => 2013-10-27 12:00 would actually be represented by 44 records, each one hour long.
The "gap" until the next booking at 2013-10-27 14:00 would then be represented by 2 records, each one hour long.
Then, each record could also have the duration (in time, or number of slots) until the next change.
slot_start_time | booking | item | remaining_duration
------------------+---------+------+--------------------
2013-10-27 10:00 | 1 | 42 | 2
2013-10-27 11:00 | 1 | 42 | 1
2013-10-27 12:00 | NULL | 42 | 2
2013-10-27 13:00 | NULL | 42 | 1
2013-10-27 14:00 | 2 | 42 | 28
2013-10-27 15:00 | 2 | 42 | 27
... | ... | ... | ...
2013-10-28 17:00 | 2 | 42 | 1
2013-10-28 18:00 | NULL | 42 | 39
2013-10-28 19:00 | NULL | 42 | 38
Then your query just becomes:
SELECT
*
FROM
slots
WHERE
slot_start_time >= '2013-10-27 12:00'
AND remaining_duration >= 24
AND booking IS NULL
ORDER BY
slot_start_time ASC
LIMIT
1
OK this isn't pretty in MySQL. That's because we have to fake rownum values in subqueries.
The basic approach is to join the appropriate subset of the booking table to itself offset by one.
Here's the basic list of reservations for item 42, ordered by reservation time. We can't order by booking_id, because those aren't guaranteed to be in order of reservation time. (You're trying to insert a new reservation between two existing ones, eh?) http://sqlfiddle.com/#!2/62383/9/0
SELECT @aserial := @aserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @aserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
Here is that subset joined to itself. The trick is the a.rownum+1 = b.rownum, which joins each row to the one that comes right after it in the booking table subset. http://sqlfiddle.com/#!2/62383/8/0
SELECT a.booking_id, a.startdate asta, a.enddate aend,
b.startdate bsta, b.enddate bend
FROM (
SELECT @aserial := @aserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @aserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
) AS a
JOIN (
SELECT @bserial := @bserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @bserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
) AS b ON a.rownum+1 = b.rownum
Here it is again, showing each reservation (except the last one) and the number of hours following it. http://sqlfiddle.com/#!2/62383/15/0
SELECT a.booking_id, a.startdate, a.enddate,
TIMESTAMPDIFF(HOUR, a.enddate, b.startdate) gaphours
FROM (
SELECT @aserial := @aserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @aserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
) AS a
JOIN (
SELECT @bserial := @bserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @bserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
) AS b ON a.rownum+1 = b.rownum
So, if you're looking for the starting time and ending time of the earliest twelve-hour slot you can use that result set to do this: http://sqlfiddle.com/#!2/62383/18/0
SELECT MIN(enddate) startdate, MIN(enddate) + INTERVAL 12 HOUR as enddate
FROM (
SELECT a.booking_id, a.startdate, a.enddate,
TIMESTAMPDIFF(HOUR, a.enddate, b.startdate) gaphours
FROM (
SELECT @aserial := @aserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @aserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
) AS a
JOIN (
SELECT @bserial := @bserial+1 AS rownum,
booking.*
FROM booking,
(SELECT @bserial:= 0) AS q
WHERE item = 42
ORDER BY startdate, enddate
) AS b ON a.rownum+1 = b.rownum
) AS gaps
WHERE gaphours >= 12
Here is the query; it will return the needed date. An obvious condition is that there should be some bookings in the table, but as I see from the question, you already do this check:
SELECT min(enddate)
FROM
(
select a.enddate from table4 as a
where
a.item=42
and
DATE_ADD(a.enddate, INTERVAL 1 day) <= ifnull(
(select min(b.startdate)
from table4 as b where b.startdate>=a.enddate and a.item=b.item),
a.enddate)
and
a.enddate>=now()
union all
select greatest(ifnull(max(enddate), now()),now()) from table4
) as q
You can change INTERVAL 1 day to INTERVAL ### hour.
If I have understood your requirements correctly, you could try self-JOINing book with itself, to get the "empty" spaces, and then fit. This is MySQL only (I believe it can be adapted to others - certainly PostgreSQL):
SELECT book.*, TIMESTAMPDIFF(MINUTE, book.enddate, book.best) AS width FROM
(
SELECT book.*, MIN(book1.startdate) AS best
FROM book
JOIN book AS book1 USING (item)
WHERE item = 42 AND book1.startdate >= book.enddate
GROUP BY book.booking
) AS book HAVING width > 110 ORDER BY startdate LIMIT 1;
In the above example, "110" is the looked-for minimum width in minutes.
Same thing, a bit less readable (for me), a SELECT removed (very fast SELECT, so little advantage):
SELECT book.*, MIN(book1.startdate) AS best
FROM book
JOIN book AS book1 ON (book.item = book1.item AND book.item = 42)
WHERE book1.startdate >= book.enddate
GROUP BY book.booking
HAVING TIMESTAMPDIFF(MINUTE, book.enddate, best) > 110
ORDER BY startdate LIMIT 1;
In your case, one day is 1440 minutes and
SELECT book.*, MIN(book1.startdate) AS best FROM book JOIN book AS book1 ON (book.item = book1.item AND book.item = 42) WHERE book1.startdate >= book.enddate GROUP BY book.booking HAVING TIMESTAMPDIFF(MINUTE, book.enddate, best) >= 1440 ORDER BY startdate LIMIT 1;
+---------+------+---------------------+---------------------+---------------------+
| booking | item | startdate | enddate | best |
+---------+------+---------------------+---------------------+---------------------+
| 2 | 42 | 2013-10-27 14:00:00 | 2013-10-28 18:00:00 | 2013-10-30 09:00:00 |
+---------+------+---------------------+---------------------+---------------------+
1 row in set (0.00 sec)
...the period returned is 2, i.e., at the end of booking 2, and until "best" which is booking 3, a period of at least 1440 minutes is available.
An issue could be that if no periods are available, the query returns nothing -- then you need another query to fetch the farthest enddate. You can do this with an UNION and LIMIT 1 of course, but I think it would be best to only run the 'recovery' query on demand, programmatically (i.e. if empty(query) then new_query...).
Also, in the inner WHERE you should add a check for NOW() to avoid dates in the past. If expired bookings are moved to inactive storage, this could be unnecessary.

Grouping into interval of 5 minutes within a time range

I am having some difficulty with a MySQL query that I want to write.
SELECT a.timestamp, name, count(b.name)
FROM time a, id b
WHERE a.user = b.user
AND a.id = b.id
AND b.name = 'John'
AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00'
GROUP BY a.timestamp
This is my current output statement.
timestamp name count(b.name)
------------------- ---- -------------
2010-11-16 10:32:22 John 2
2010-11-16 10:35:12 John 7
2010-11-16 10:36:34 John 1
2010-11-16 10:37:45 John 2
2010-11-16 10:48:26 John 8
2010-11-16 10:55:00 John 9
2010-11-16 10:58:08 John 2
How do I group them into 5 minutes interval results?
I want my output to be like
timestamp name count(b.name)
------------------- ---- -------------
2010-11-16 10:30:00 John 2
2010-11-16 10:35:00 John 10
2010-11-16 10:40:00 John 0
2010-11-16 10:45:00 John 8
2010-11-16 10:50:00 John 0
2010-11-16 10:55:00 John 11
This works with every interval.
PostgreSQL
SELECT
TIMESTAMP WITH TIME ZONE 'epoch' +
INTERVAL '1 second' * round(extract('epoch' from timestamp) / 300) * 300 as timestamp,
name,
count(b.name)
FROM time a, id
WHERE …
GROUP BY
round(extract('epoch' from timestamp) / 300), name
MySQL
SELECT
timestamp, -- not sure about that
name,
count(b.name)
FROM time a, id
WHERE …
GROUP BY
UNIX_TIMESTAMP(timestamp) DIV 300, name
I came across the same issue.
I found that it is easy to group by any minute interval: just divide the epoch by the interval length in seconds, then either round or use floor to get rid of the remainder. So if you want 5-minute intervals, you would use 300 seconds.
SELECT COUNT(*) cnt,
to_timestamp(floor((extract('epoch' from timestamp_column) / 300 )) * 300)
AT TIME ZONE 'UTC' as interval_alias
FROM TABLE_NAME GROUP BY interval_alias
interval_alias cnt
------------------- ----
2010-11-16 10:30:00 2
2010-11-16 10:35:00 10
2010-11-16 10:45:00 8
2010-11-16 10:55:00 11
This will return the data correctly group by the selected minutes interval; however, it will not return the intervals that don't contains any data. In order to get those empty intervals we can use the function generate_series.
SELECT generate_series(MIN(date_trunc('hour',timestamp_column)),
max(date_trunc('minute',timestamp_column)),'5m') as interval_alias FROM
TABLE_NAME
Result:
interval_alias
-------------------
2010-11-16 10:30:00
2010-11-16 10:35:00
2010-11-16 10:40:00
2010-11-16 10:45:00
2010-11-16 10:50:00
2010-11-16 10:55:00
Now to get the result with interval with zero occurrences we just outer join both result sets.
SELECT series.minute as interval, coalesce(cnt.amnt,0) as count from
(
SELECT count(*) amnt,
to_timestamp(floor((extract('epoch' from timestamp_column) / 300 )) * 300)
AT TIME ZONE 'UTC' as interval_alias
from TABLE_NAME group by interval_alias
) cnt
RIGHT JOIN
(
SELECT generate_series(min(date_trunc('hour',timestamp_column)),
max(date_trunc('minute',timestamp_column)),'5m') as minute from TABLE_NAME
) series
on series.minute = cnt.interval_alias
The end result will include the series with all 5 minute intervals even those that have no values.
interval count
------------------- ----
2010-11-16 10:30:00 2
2010-11-16 10:35:00 10
2010-11-16 10:40:00 0
2010-11-16 10:45:00 8
2010-11-16 10:50:00 0
2010-11-16 10:55:00 11
The interval can be easily changed by adjusting the last parameter of generate_series. In our case we use '5m' but it could be any interval we want.
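For engines without generate_series, or to check the expected output, the "count per bucket, then fill the empty buckets with zero" step can be sketched in Python. The names `floor_to` and `bucket_counts` are made up, and the sketch assumes the bucket width divides a day evenly (as 5 minutes does).

```python
from collections import Counter
from datetime import datetime, timedelta

def floor_to(ts, width):
    """Round ts down to a multiple of width, counted from midnight of the same day."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // width) * width

def bucket_counts(events, start, end, width=timedelta(minutes=5)):
    """events: iterable of (timestamp, count).

    Returns [(bucket_start, total)] for every bucket in [start, end),
    including buckets that received no events.
    """
    totals = Counter()
    for ts, n in events:
        totals[floor_to(ts, width)] += n
    result = []
    bucket = start
    while bucket < end:
        result.append((bucket, totals[bucket]))  # a missing bucket reads as 0
        bucket += width
    return result

# The rows from the question's current output.
events = [
    (datetime(2010, 11, 16, 10, 32, 22), 2),
    (datetime(2010, 11, 16, 10, 35, 12), 7),
    (datetime(2010, 11, 16, 10, 36, 34), 1),
    (datetime(2010, 11, 16, 10, 37, 45), 2),
    (datetime(2010, 11, 16, 10, 48, 26), 8),
    (datetime(2010, 11, 16, 10, 55, 0), 9),
    (datetime(2010, 11, 16, 10, 58, 8), 2),
]
for bucket, total in bucket_counts(
    events, datetime(2010, 11, 16, 10, 30), datetime(2010, 11, 16, 11, 0)
):
    print(bucket, total)
```

With the question's data this yields the totals 2, 10, 0, 8, 0, 11 for the six buckets from 10:30 to 10:55.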
You should rather use GROUP BY UNIX_TIMESTAMP(time_stamp) DIV 300 instead of round(../300); because of the rounding, I found that some records are counted into two grouped result sets.
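The difference between the two is easy to see on a single value: integer division (floor) keeps a timestamp in the bucket that starts at or before it, while rounding moves anything past the halfway point into the next bucket. A small Python sketch, assuming timezone-aware UTC timestamps and using the made-up names `bucket_floor` and `bucket_round`:

```python
from datetime import datetime, timezone

def bucket_floor(ts, width=300):
    """Bucket start using integer division, like UNIX_TIMESTAMP(ts) DIV 300."""
    seconds = int(ts.timestamp())
    return datetime.fromtimestamp(seconds - seconds % width, tz=timezone.utc)

def bucket_round(ts, width=300):
    """Bucket label using rounding, like ROUND(UNIX_TIMESTAMP(ts) / 300)."""
    seconds = ts.timestamp()
    return datetime.fromtimestamp(round(seconds / width) * width, tz=timezone.utc)

ts = datetime(2010, 11, 16, 10, 33, 0, tzinfo=timezone.utc)
print(bucket_floor(ts))  # 2010-11-16 10:30:00+00:00
print(bucket_round(ts))  # 2010-11-16 10:35:00+00:00
```

So with rounding, the bucket labelled 10:35 actually covers roughly 10:32:30 to 10:37:30, which is usually not what "5-minute intervals" is meant to describe.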
For postgres, I found it easier and more accurate to use the date_trunc function, like:
select name, sum(count), date_trunc('minute',timestamp) as timestamp
FROM table
WHERE xxx
GROUP BY name,date_trunc('minute',timestamp)
ORDER BY timestamp
You can provide various resolutions like 'minute','hour','day' etc... to date_trunc.
The query will be something like:
SELECT
DATE_FORMAT(
MIN(timestamp),
'%d/%m/%Y %H:%i:00'
) AS tmstamp,
name,
COUNT(id) AS cnt
FROM
table
GROUP BY ROUND(UNIX_TIMESTAMP(timestamp) / 300), name
Not sure if you still need it.
SELECT FROM_UNIXTIME(FLOOR((UNIX_TIMESTAMP(timestamp))/300)*300) AS t,timestamp,count(1) as c from users GROUP BY t ORDER BY t;
2016-10-29 19:35:00 | 2016-10-29 19:35:50 | 4 |
2016-10-29 19:40:00 | 2016-10-29 19:40:37 | 5 |
2016-10-29 19:45:00 | 2016-10-29 19:45:09 | 6 |
2016-10-29 19:50:00 | 2016-10-29 19:51:14 | 4 |
2016-10-29 19:55:00 | 2016-10-29 19:56:17 | 1 |
You're probably going to have to break up your timestamp into ymd:HM and use DIV 5 to split the minutes up into 5-minute bins -- something like
select year(a.timestamp),
month(a.timestamp),
day(a.timestamp),
hour(a.timestamp),
minute(a.timestamp) DIV 5,
name,
count(b.name)
FROM time a, id b
WHERE a.user = b.user AND a.id = b.id AND b.name = 'John'
AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00'
GROUP BY year(a.timestamp),
month(a.timestamp),
day(a.timestamp),
hour(a.timestamp),
minute(a.timestamp) DIV 5,
name
...and then futz the output in client code to appear the way you like it. Or, you can build up the whole date string using the SQL CONCAT function instead of getting separate columns, if you like.
select concat(year(a.timestamp), "-", month(a.timestamp), "-" ,day(a.timestamp),
" " , lpad(hour(a.timestamp),2,'0'), ":",
lpad((minute(a.timestamp) DIV 5) * 5, 2, '0'))
...and then group on that
How about this one:
select
from_unixtime(unix_timestamp(timestamp) - unix_timestamp(timestamp) mod 300) as ts,
sum(value)
from group_interval
group by ts
order by ts
;
I found out that with MySQL probably the correct query is the following:
SELECT SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,
'%Y-%m-%d %H:%i:%S' ) , 1, 19 ) AS ts_CEILING,
SUM(value)
FROM group_interval
GROUP BY SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,
'%Y-%m-%d %H:%i:%S' ) , 1, 19 )
ORDER BY SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,
'%Y-%m-%d %H:%i:%S' ) , 1, 19 ) DESC
Let me know what you think.
select
CONCAT(CAST(CREATEDATE AS DATE),' ',datepart(hour,createdate),':',ROUNd(CAST((CAST((CAST(DATEPART(MINUTE,CREATEDATE) AS DECIMAL (18,4)))/5 AS INT)) AS DECIMAL (18,4))/12*60,2)) AS '5MINDATE'
,count(something)
from TABLE
group by CONCAT(CAST(CREATEDATE AS DATE),' ',datepart(hour,createdate),':',ROUNd(CAST((CAST((CAST(DATEPART(MINUTE,CREATEDATE) AS DECIMAL (18,4)))/5 AS INT)) AS DECIMAL (18,4))/12*60,2))
This will do exactly what you want.
Replace
dt - your datetime
c - call field
astro_transit1 - your table
300 as seconds for each time gap increase
SELECT
FROM_UNIXTIME(300 * ROUND(UNIX_TIMESTAMP(r.dt) / 300)) AS 5datetime,
(SELECT
r.c
FROM
astro_transit1 ra
WHERE
ra.dt = r.dt
ORDER BY ra.dt DESC
LIMIT 1) AS first_val
FROM
astro_transit1 r
GROUP BY UNIX_TIMESTAMP(r.dt) DIV 300
LIMIT 0 , 30
Based on @boecko's answer for MySQL, I used a CTE (Common Table Expression) to speed up the query execution:
so this :
SELECT
`timestamp`,
`name`,
count(b.`name`)
FROM `time` a, `id` b
WHERE …
GROUP BY
UNIX_TIMESTAMP(`timestamp`) DIV 300, name
becomes :
WITH cte AS (
SELECT
`timestamp`,
b.`name`,
UNIX_TIMESTAMP(`timestamp`) DIV 300 AS `intervals`
FROM `time` a, `id` b
WHERE …
)
SELECT MIN(`timestamp`) AS `timestamp`, `name`, count(`name`), `intervals`
FROM cte
GROUP BY `intervals`, `name`
On a large amount of data, the query ran more than 10 times faster!
As timestamp and time are keywords in MySQL, quoting table and column names with `...` is a safe habit.
Hope it will help some of you.