Data sample:
dtime id
2021-01-01 06:00:00 1
2021-01-01 06:00:00 2
2021-01-01 06:00:00 3
... ...
2021-01-01 12:00:00 1
2021-01-01 12:00:00 2
2021-01-01 12:00:00 3
... ...
... ...
2021-01-12 20:00:00 1
2021-01-12 20:00:00 2
2021-01-12 20:00:00 3
In the real dataset, ids range from 1 to 9999 and there is a dtime value every 5 minutes, 24 hours a day. I'd like to sample only at certain times (e.g. 06, 12, 16, 20h).
The expected output is the average of the count(id) values, grouped by DATE(dtime), with these conditions:
1) Only certain TIME(dtime) values should be sampled (e.g. 06, 12, 16, 20h);
2) count(id) should ignore ids that are not between 10 and 500;
3) count(id) should be discarded (and not considered for the average) if it is < 3.
Output sample:
DATE(dtime) AVG(count(id))
2021-01-01 31
2021-01-02 29
So far I've got:
SELECT dtime,count(id)
FROM cron5min
WHERE (TIME(dtime) = '06:00:00' OR TIME(dtime) = '12:00:00' OR TIME(dtime) = '16:00:00' OR TIME(dtime) = '20:00:00') AND id BETWEEN 10 AND 500 AND estado = 1
GROUP BY dtime
and then I'm using PHP to do the average and discard data according to 3.
I'm now trying to do this with a MySQL statement only, no PHP.
You need 2 levels of aggregation:
SELECT DATE(dtime) date, AVG(counter) avg_count
FROM (
SELECT dtime, COUNT(id) counter
FROM cron5min
WHERE TIME(dtime) IN ('06:00:00', '12:00:00', '16:00:00', '20:00:00')
AND id BETWEEN 10 AND 500
AND estado = 1
GROUP BY dtime
HAVING counter >= 3
) t
GROUP BY date
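The two levels can be tried out on a few rows without a MySQL server. The sketch below is only an approximation: it uses Python's built-in sqlite3 module as a stand-in for MySQL, so the dialect differs slightly (SQLite's date() and time() replace DATE() and TIME(), and the HAVING clause repeats COUNT(id) instead of using the alias). The function name daily_average_counts and the sample rows in the usage note are made up for illustration.

```python
import sqlite3

def daily_average_counts(rows, sample_times=("06:00:00", "12:00:00", "16:00:00", "20:00:00")):
    """rows: iterable of (dtime, id, estado) tuples, dtime as 'YYYY-MM-DD HH:MM:SS'."""
    con = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    con.execute("CREATE TABLE cron5min (dtime TEXT, id INTEGER, estado INTEGER)")
    con.executemany("INSERT INTO cron5min VALUES (?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in sample_times)
    query = f"""
        SELECT date(dtime) AS day, AVG(counter) AS avg_count
        FROM (
            -- inner level: one count per sampled dtime, small counts dropped
            SELECT dtime, COUNT(id) AS counter
            FROM cron5min
            WHERE time(dtime) IN ({placeholders})
              AND id BETWEEN 10 AND 500
              AND estado = 1
            GROUP BY dtime
            HAVING COUNT(id) >= 3
        ) AS t
        -- outer level: average the surviving counts per day
        GROUP BY date(dtime)
        ORDER BY day
    """
    result = con.execute(query, tuple(sample_times)).fetchall()
    con.close()
    return result
```

For example, with four valid ids at 06:00, two at 12:00 and six at 16:00 on the same day, the 12:00 count is discarded and the day's average is (4 + 6) / 2 = 5.0.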
EDIT: I have added the primary key, following the comment by @Strawberry
The aim is to return the number of current members, and also the number of past memberships, on any particular date/time.
For example, suppose we have
msid id start cancelled
1 1 2020-01-01 09:00:00 null
2 2 2020-01-01 09:00:00 2020-12-31 09:00:00
3 2 2021-01-01 09:00:00 null
4 3 2020-01-01 09:00:00 2020-06-30 09:00:00
5 3 2020-02-01 09:00:00 2020-06-30 09:00:00
6 3 2020-07-01 09:00:00 null
and we want to calculate the number of members at various times, which should return as follows
Datetime Current Past <Notes - not to be returned by the query>
2020-01-01 12:00:00 3 0 -- all 3 IDs have joined earlier on this date
2020-02-01 12:00:00 3 0 -- new membership for existing member (ID 3) is not counted
2020-06-30 12:00:00 2 1 -- ID 3 has cancelled earlier on this day
2020-07-01 12:00:00 3 0 -- ID 3 has re-joined earlier on this day
2020-12-31 12:00:00 2 1 -- ID 2 has cancelled earlier on this day
2021-01-01 12:00:00 3 0 -- ID 2 has re-joined earlier on this day
An ID may either be current or past, but never both. That is, if a past member re-joins, as in the case of ID 2 and 3 above, they become current members, and are no longer past members.
Also, a member may have multiple current memberships, but they can only be counted as a current member once, as in the case of ID 3 above.
How can this be achieved in MySQL?
Here is a db<>fiddle with the above data
Test this:
WITH
cte1 AS ( SELECT start `timestamp` FROM dt
UNION
SELECT cancelled FROM dt WHERE cancelled IS NOT NULL ),
cte2 AS ( SELECT DISTINCT id
FROM dt )
SELECT cte1.`timestamp`, COUNT(DISTINCT dt.id) current, SUM(dt.id IS NULL) past
FROM cte1
CROSS JOIN cte2
LEFT JOIN dt ON cte1.`timestamp` >= dt.start
AND (cte1.`timestamp` < dt.cancelled OR dt.cancelled IS NULL)
AND cte2.id = dt.id
GROUP BY cte1.`timestamp`
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=942e4c97951ed0929e178134ef67ce69
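As a rough cross-check, the same query can be run against the sample data with Python's built-in sqlite3 module. This is a sketch under assumptions, not the fiddle itself: SQLite stands in for MySQL, the `timestamp`, current and past aliases are renamed to ts, current_members and past_members to stay clear of keyword clashes, and an ORDER BY is added so the output order is fixed.

```python
import sqlite3

# Sample rows from the question: (msid, id, start, cancelled)
MEMBERSHIPS = [
    (1, 1, "2020-01-01 09:00:00", None),
    (2, 2, "2020-01-01 09:00:00", "2020-12-31 09:00:00"),
    (3, 2, "2021-01-01 09:00:00", None),
    (4, 3, "2020-01-01 09:00:00", "2020-06-30 09:00:00"),
    (5, 3, "2020-02-01 09:00:00", "2020-06-30 09:00:00"),
    (6, 3, "2020-07-01 09:00:00", None),
]

QUERY = """
WITH
cte1 AS (SELECT start AS ts FROM dt
         UNION
         SELECT cancelled FROM dt WHERE cancelled IS NOT NULL),
cte2 AS (SELECT DISTINCT id FROM dt)
SELECT cte1.ts,
       COUNT(DISTINCT dt.id) AS current_members,
       SUM(dt.id IS NULL) AS past_members
FROM cte1
CROSS JOIN cte2
LEFT JOIN dt ON cte1.ts >= dt.start
            AND (cte1.ts < dt.cancelled OR dt.cancelled IS NULL)
            AND cte2.id = dt.id
GROUP BY cte1.ts
ORDER BY cte1.ts
"""

def member_counts(memberships):
    """Return (ts, current_members, past_members) for every start/cancel time."""
    con = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    con.execute(
        "CREATE TABLE dt (msid INTEGER PRIMARY KEY, id INTEGER, start TEXT, cancelled TEXT)"
    )
    con.executemany("INSERT INTO dt VALUES (?, ?, ?, ?)", memberships)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows
```

Note that the query counts as "past" every id with no active membership at that moment, which would include an id that has not joined yet. In the sample data all three ids join at the first timestamp, so the difference does not show up there.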
I have a table into which a record is inserted every 4 hours, every day. If no record has been inserted for 4 consecutive hours, I need to insert a log into another table. Below is the table schema.
Id DocPathid CreatedAt
1 1 2021-04-02 00:00:00
2 1 2021-04-02 04:00:00
3 1 2021-04-02 09:00:00
4 1 2021-04-02 12:00:00
5 1 2021-04-02 16:00:00
6 1 2021-04-02 20:00:00
7 1 2021-04-02 24:00:00
In the above case, no record was inserted within one 4-hour interval (i.e. between 2021-04-02 04:00:00 and 2021-04-02 09:00:00). The query should return the number of failures (in this case, 1).
Is there a way to achieve this in MySQL?
You can do something like this.
select count(1)
from (
select id, CreatedAt , timestampdiff(hour, CreatedAt
, lead(CreatedAt,1) over (partition by DocPathid order by CreatedAt) ) as hour
from Table1
) t
where hour >4
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9b0c631145422dbccd2ea23f0a7d2011
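The idea behind the query is to compare each row with the next one and count the pairs that are too far apart. A minimal Python sketch of that comparison, for a single DocPathid, is shown below. The function name count_gaps is hypothetical, and the last sample value is written as 2021-04-03 00:00:00 because Python's datetime does not accept hour 24. One difference worth knowing: TIMESTAMPDIFF(hour, ...) returns whole hours, so the SQL version does not count a gap of, say, 4 hours 30 minutes, whereas this sketch compares exact durations.

```python
from datetime import datetime, timedelta

def count_gaps(created_at, max_gap=timedelta(hours=4)):
    """Count consecutive pairs of timestamps that are more than max_gap apart."""
    ordered = sorted(created_at)
    # zip(ordered, ordered[1:]) pairs each timestamp with its successor,
    # which is what LEAD(CreatedAt, 1) does in the SQL answer
    return sum(1 for earlier, later in zip(ordered, ordered[1:]) if later - earlier > max_gap)

# Sample data from the question (one DocPathid)
created_at = [datetime(2021, 4, 2, hour) for hour in (0, 4, 9, 12, 16, 20)]
created_at.append(datetime(2021, 4, 3, 0))  # "2021-04-02 24:00:00" in the question

failures = count_gaps(created_at)
```

With the sample data only the pair 04:00 / 09:00 is more than 4 hours apart, so failures is 1.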
Hello, I want to count the rows grouped by month (using the month start date) and show each month as a datetime. Any idea?
table_a
timestamp
2020-11-28 04:00:00
2020-11-28 05:00:00
2020-12-29 01:00:00
2020-12-29 02:00:00
2020-12-29 03:00:00
expected result:
timestamp count
2020-11-01 00:00:00 2
2020-12-01 00:00:00 3
my query is:
SELECT STR_TO_DATE(DATE_FORMAT(timestamp, '%Y-%m-dd HH'), '%Y-%m-dd HH'), count(*) as count
from table_a
but my result was:
timestamp count
2020-11-00 2
2020-12-00 3
You can use:
select str_to_date(concat_ws('-', year(timestamp), month(timestamp), 1), '%Y-%m-%d') as yyyymm,
count(*)
from table_a
group by yyyymm;
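The grouping key here is simply each timestamp truncated to the first day of its month. A small Python sketch of that truncation, using the sample rows from the question (the function name count_by_month_start is made up for illustration):

```python
from collections import Counter
from datetime import datetime

def count_by_month_start(timestamps):
    """Count timestamps per month, keyed by the month's first day at 00:00:00."""
    counts = Counter(
        ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        for ts in timestamps
    )
    return sorted(counts.items())

table_a = [
    datetime(2020, 11, 28, 4),
    datetime(2020, 11, 28, 5),
    datetime(2020, 12, 29, 1),
    datetime(2020, 12, 29, 2),
    datetime(2020, 12, 29, 3),
]

result = count_by_month_start(table_a)
```

Because the key keeps the year, rows from the same month of different years stay in separate groups.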
I fixed the question by adding group by month(timestamp). I am ignoring the minutes; the important thing is grouping by month. Here is my full query:
SELECT timestamp, count(*) as count
from table_a
group by month(timestamp)
Then the output shows:
timestamp count
2020-11-01 04:00:00 2
2020-12-01 01:00:00 3
Suppose I have 5 records for a sales table.
ID Name datetime_col
1 ABC 2016-09-15 02:07:56
2 HSJ 2016-09-31 11:45:45
3 JSD 2016-11-26 07:09:56
4 JUH 2016-12-31 12:00:00
5 IGY 2017-01-13 14:00:07
I want to find how many records there are in the sales table for each hour between 2016-09-15 and 2017-01-13.
The result should look like this:
Hour sales_at_this_hour
2016-09-15 01:00:00 0
2016-09-15 02:00:00 1
2016-09-15 03:00:00 0
...
...
2017-01-13 01:00:00 0
2017-01-13 02:00:00 0
2017-01-13 03:00:00 0
....
2017-01-13 14:00:00 1
Then find the average of sales_at_this_hour using MySQL
EDIT: Sorry, I did not fully understand the question at first.
Use DATE_FORMAT
select
DATE_FORMAT(datetime_col, '%Y-%m-%d %H:00:00') as date,
count(id) as count
from table_name
group by date;
This returns only the hours that have at least one sale (not exactly what you asked for):
datetime_col count
2016-02-04 05:00:00 5
2016-02-04 07:00:00 1
2016-02-04 08:00:00 5
2016-02-04 10:00:00 10
2016-02-04 11:00:00 1
Provide start_date and end_date, and then use DATEDIFF to calculate total time interval for the average calculation.
set @start_date = '2016-01-01', @end_date = '2017-01-01';
select
DATE_FORMAT(group_by_date.datetime, '%H:00:00') as hour,
AVG(group_by_date.count) / DATEDIFF(@end_date, @start_date) as average
from (
select
DATE_FORMAT(created_dtm, '%Y-%m-%d %H:00:00') as datetime,
count(id) as count
from table_name
where created_dtm > @start_date
and created_dtm < @end_date
group by datetime
) group_by_date
group by hour;
For each hour,
average sale count per day = total sale count / total days
hour average
01:00:00 0.03841209
02:00:00 0.01653005
03:00:00 0.0306716
04:00:00 0.01147541
05:00:00 0.01179831
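The queries above only return hours that contain at least one row, while the question also expects hours with a count of 0. One way to see what the full result would look like is to build every hour bucket first and then fill in the counts. The Python sketch below does that. The names hourly_counts and average_per_hour are hypothetical, and the end of the range is treated as exclusive, which is an assumption.

```python
from datetime import datetime, timedelta

def hourly_counts(sale_times, start, end):
    """Return {hour_start: count} for every whole hour in [start, end), zeros included."""
    counts = {}
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour < end:
        counts[hour] = 0  # create the bucket even if no sale falls into it
        hour += timedelta(hours=1)
    for ts in sale_times:
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        if bucket in counts:  # ignore sales outside the requested range
            counts[bucket] += 1
    return counts

def average_per_hour(counts):
    """Average of sales_at_this_hour over all buckets, empty hours included."""
    return sum(counts.values()) / len(counts) if counts else 0.0
```

For instance, three sales at 02:07:56, 02:30:00 and 03:10:00 in a range covering 00:00 to 04:00 give the counts 0, 0, 2, 1 and an average of 0.75 per hour.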
Options (my table)
id datetime energy
1 2014-10-28 04:00:00 14
1 2014-10-28 04:05:00 16
1 2014-10-28 04:10:00 23
1 2014-10-28 04:15:00 45
1 2014-10-29 04:00:00 34
1 2014-10-29 04:05:00 33
1 2014-10-29 04:10:00 12
1 2014-10-29 04:15:00 67
output
id datetime avg(energy)
1 2014-10-28 04:15:00 28
1 2014-10-29 04:15:00 37.33
my query:
SELECT date(`datetime`) dateDay,id,
15*floor(date_format(`datetime`,'%i')/15) dateHour,
avg(energy) FROM `meter`
WHERE `datetime` >= '2014-10-28 00:00:01' AND `datetime` <= '2014-10-29 23:59:59'
GROUP BY id,day(datetime),month(datetime),dateHour
If I understand correctly, you want to group by 15-minute intervals. If so, you could use something like this:
SELECT
id,
avg(energy) as average_value,
date_format(datetime, '%Y-%m-%d') as date_day,
date_format(datetime, '%H') as date_hour,
IF(date_format(datetime, '%i') < 15, 0,
IF(date_format(datetime, '%i') < 30, 15,
IF(date_format(datetime, '%i') < 45, 30, 45))) as fifteen_minutes_slot
from deliverydestination
GROUP BY id, date_day, date_hour, fifteen_minutes_slot
1) compute the fifteen_minutes slots with a set of ifs: if the value for minutes is less than 15, give it the value 0, else if it is less than 30, give it the value 15, etc...
2) group by day, hour and 15 minutes slot value. And by id, as in your example, if needed.
It would be nice to have a fiddle with your data, to show the final result...
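In the absence of a fiddle, the slot computation can be illustrated in Python. The nested IFs are equivalent to rounding the minute down to a multiple of 15, i.e. 15 * (minute // 15). The sketch below applies that to the first day of the sample table. The function name average_by_quarter_hour is made up, and the result is keyed by the slot's start time rather than by separate day, hour and slot columns.

```python
from collections import defaultdict
from datetime import datetime

def average_by_quarter_hour(readings):
    """readings: iterable of (meter_id, datetime, energy) tuples.

    Returns {(meter_id, slot_start): average energy}, where slot_start is the
    reading's datetime rounded down to the previous quarter hour.
    """
    groups = defaultdict(list)
    for meter_id, ts, energy in readings:
        slot_start = ts.replace(minute=15 * (ts.minute // 15), second=0, microsecond=0)
        groups[(meter_id, slot_start)].append(energy)
    return {key: sum(values) / len(values) for key, values in groups.items()}

readings = [
    (1, datetime(2014, 10, 28, 4, 0), 14),
    (1, datetime(2014, 10, 28, 4, 5), 16),
    (1, datetime(2014, 10, 28, 4, 10), 23),
    (1, datetime(2014, 10, 28, 4, 15), 45),
]

averages = average_by_quarter_hour(readings)
```

Here the readings at 04:00, 04:05 and 04:10 share the 04:00 slot, and the reading at 04:15 starts the next slot.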