group by consecutive period - mysql

Hi, I have a table as below:
+--------+--------+
| day | amount |
+--------+--------+
| 2 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 2 |
| 3 | 3 |
| 4 | 3 |
| 5 | 6 |
| 6 | 6 |
+--------+--------+
Now I want a result like this: the sum for days 1-3 as row one, the sum for days 2-4 as row two, and so on.
+--------+--------+
| day | amount |
+--------+--------+
| 1-3 | 14 |
| 2-4 | 10 |
| 3-5 | 12 |
| 4-6 | 15 |
+--------+--------+
Could anyone offer some help? Thanks!

I would just use a correlated subquery:
select day, day + 2 as end_day,
       (select sum(amount)
        from t t2
        where t2.day in (t.day, t.day + 1, t.day + 2)
       ) as amount
from (select distinct day from t) t;
This returns a row for every distinct day, including days 5 and 6, whose windows run past the last day in the table. If you only want complete three-day windows (the four rows in your example), restrict the starting days:
select day, day + 2 as end_day,
       (select sum(amount)
        from t t2
        where t2.day in (t.day, t.day + 1, t.day + 2)
       ) as amount
from (select distinct day
      from t
      where day <= (select max(day) - 2 from t)
     ) t
order by day;
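For anyone who wants to try the correlated-subquery idea without a MySQL server, here is a minimal sketch in Python using the built-in sqlite3 module. That is an assumption made purely for runnability; the SQL is kept close to the answer, but SQLite is not MySQL. The function name `sliding_sums` and the `width` parameter are hypothetical.

```python
import sqlite3

# Sample data copied from the question; the table name `t` follows the answer.
ROWS = [(2, 2), (1, 3), (1, 4), (2, 2), (3, 3), (4, 3), (5, 6), (6, 6)]


def sliding_sums(rows, width=3):
    """Return (start_day, end_day, total) for every complete window of `width` days."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (day INTEGER, amount INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    query = """
        SELECT d.day,
               d.day + :w - 1 AS end_day,
               (SELECT SUM(t2.amount)
                  FROM t AS t2
                 WHERE t2.day BETWEEN d.day AND d.day + :w - 1) AS amount
          FROM (SELECT DISTINCT day FROM t) AS d
         WHERE d.day + :w - 1 <= (SELECT MAX(day) FROM t)  -- complete windows only
         ORDER BY d.day
    """
    result = con.execute(query, {"w": width}).fetchall()
    con.close()
    return result


for start, end, total in sliding_sums(ROWS):
    print(f"{start}-{end}: {total}")
```

With the question's data this prints the four rows 1-3: 14, 2-4: 10, 3-5: 12 and 4-6: 15.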

Way 1:
Simply use UNION ALL:
SELECT '1 - 3' AS `Day`, SUM(Amount) AS Amount FROM Your_Table WHERE Day BETWEEN 1 AND 3
UNION ALL
SELECT '2 - 4', SUM(Amount) FROM Your_Table WHERE Day BETWEEN 2 AND 4
UNION ALL
SELECT '3 - 5', SUM(Amount) FROM Your_Table WHERE Day BETWEEN 3 AND 5
UNION ALL
SELECT '4 - 6', SUM(Amount) FROM Your_Table WHERE Day BETWEEN 4 AND 6
Way 2:
Create a table that holds the day ranges and JOIN it to the data table.
CREATE TABLE Tab1 (Day INT, Amount INT);
INSERT INTO Tab1 VALUES (2, 2)
,(1, 3)
,(1, 4)
,(2, 2)
,(3, 3)
,(4, 3)
,(5, 6)
,(6, 6);
CREATE TABLE Tab2 (DateRange VARCHAR(10), StartDate INT, EndDate INT);
INSERT INTO Tab2 VALUES ('1 - 3',1,3)
,('2 - 4',2,4)
,('3 - 5',3,5)
,('4 - 6',4,6);
SELECT T2.DateRange, SUM(T1.Amount) AS Amount
FROM Tab1 T1
JOIN Tab2 T2 ON T1.Day BETWEEN T2.StartDate AND T2.EndDate
GROUP BY T2.DateRange;
Output:
Day Amount
1 - 3 14
2 - 4 10
3 - 5 12
4 - 6 15
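Typing the range rows by hand gets tedious once there are many days. As a rough sketch (plain Python, not MySQL; the helper names `make_ranges` and `sum_by_range` are hypothetical), the range table can be generated and then matched against the data with the same BETWEEN logic as the JOIN:

```python
# Sample data copied from the question: (day, amount).
ROWS = [(2, 2), (1, 3), (1, 4), (2, 2), (3, 3), (4, 3), (5, 6), (6, 6)]


def make_ranges(first_day, last_day, width=3):
    """Build (label, start, end) rows for every complete window of `width` days."""
    return [
        (f"{start} - {start + width - 1}", start, start + width - 1)
        for start in range(first_day, last_day - width + 2)
    ]


def sum_by_range(rows, ranges):
    """Mimic JOIN ... ON day BETWEEN start AND end, then GROUP BY the label."""
    return [
        (label, sum(amount for day, amount in rows if start <= day <= end))
        for label, start, end in ranges
    ]


for label, total in sum_by_range(ROWS, make_ranges(1, 6)):
    print(label, total)
```

The generated tuples could just as well be inserted into the range table and joined in SQL; the Python join is only there to keep the sketch self-contained.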

You can use integer division in order to calculate 'days buckets' and group by each bucket:
SELECT (day - 1) DIV 3 AS bucket, SUM(amount) AS total
FROM mytable
GROUP BY (day - 1) DIV 3;
Output:
bucket total
-------------
0 14
1 15
To get the bucket string you can use:
SELECT concat(3 * ((day - 1) DIV 3 + 1) - 2, ' - ',
3 * ((day - 1) DIV 3 + 1)) AS bucket,
SUM(amount) AS total
FROM mytable
GROUP BY (day - 1) DIV 3
ORDER BY MIN(day);
Output:
bucket total
--------------
1 - 3 14
4 - 6 15
Note: The query works only for consecutive non-overlapping intervals.
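To make the bucket arithmetic concrete, here is a small sketch in plain Python, where `//` plays the role of MySQL's DIV. The function name `bucket_totals` and the `size` parameter are hypothetical.

```python
from collections import defaultdict

# Sample data copied from the question: (day, amount).
ROWS = [(2, 2), (1, 3), (1, 4), (2, 2), (3, 3), (4, 3), (5, 6), (6, 6)]


def bucket_totals(rows, size=3):
    """Sum amounts per non-overlapping bucket of `size` consecutive days."""
    totals = defaultdict(int)
    for day, amount in rows:
        bucket = (day - 1) // size  # days 1..3 -> 0, days 4..6 -> 1, ...
        totals[bucket] += amount
    # Turn the bucket number back into a "first - last" label.
    return {
        f"{b * size + 1} - {(b + 1) * size}": total
        for b, total in sorted(totals.items())
    }


print(bucket_totals(ROWS))
```

Because every day maps to exactly one bucket, this cannot produce the overlapping windows (1-3, 2-4, ...) asked for in the question, which is the limitation the note above points out.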

Related

count by person by month between days in mysql

I have a table of absences with three columns: id, begin_dt, and end_dt. I need to count how many ids have at least one day of absence in each month. For example, given these rows:
id begin_dt end_dt
1 01/01/2020 02/02/2020
2 02/02/2020 02/02/2020
my result has to be:
month count
01-2020 1
02-2020 2
I thought of using a GROUP BY on DATE_FORMAT(SYSDATE(), '%Y-%m'), but I don't know how to account for the whole period from begin_dt to end_dt.
You can find a working table definition for this example here: https://www.db-fiddle.com/f/rYBsxQzTjjQ9nGBEmeAX6W/0
Schema (MySQL v5.7)
CREATE TABLE absence (
`id` VARCHAR(6),
`begin_dt` DATETIME,
`end_dt` DATETIME
);
INSERT INTO absence
(`id`, `begin_dt`, `end_dt`)
VALUES
('1', DATE('2019-01-01'), DATE('2019-02-02')),
('2', DATE('2019-02-02'), DATE('2019-02-02'));
Query #1
select * from absence;
| id | begin_dt | end_dt |
| --- | ------------------- | ------------------- |
| 1 | 2019-01-01 00:00:00 | 2019-02-02 00:00:00 |
| 2 | 2019-02-02 00:00:00 | 2019-02-02 00:00:00 |
SELECT DATE_FORMAT(startofmonth, '%Y-%m-01') year_and_month,
       COUNT(DISTINCT id) absent_person_count  -- count each id once per month
FROM absence
JOIN ( SELECT DATE_FORMAT(dt + INTERVAL n MONTH, '%Y-%m-01') startofmonth,
              DATE_FORMAT(dt + INTERVAL n MONTH, '%Y-%m-01') + INTERVAL 1 MONTH - INTERVAL 1 DAY endofmonth
       FROM ( SELECT MIN(begin_dt) dt
              FROM absence ) startdate,
            -- month offsets; extend this list if the data spans more than six months
            ( SELECT 0 n UNION ALL
              SELECT 1 UNION ALL
              SELECT 2 UNION ALL
              SELECT 3 UNION ALL
              SELECT 4 UNION ALL
              SELECT 5 ) numbers,
            ( SELECT DATE_FORMAT(MIN(begin_dt), '%Y-%m') mindate,
                     DATE_FORMAT(MAX(end_dt), '%Y-%m') maxdate
              FROM absence ) datesrange
       WHERE DATE_FORMAT(dt + INTERVAL n MONTH, '%Y-%m') BETWEEN mindate AND maxdate ) dateslist
  ON begin_dt <= endofmonth
 AND end_dt >= startofmonth
GROUP BY year_and_month;
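The core of the query is "expand each absence into the months it touches, then count ids per month". A minimal sketch of that idea in plain Python (not MySQL; the helper names are hypothetical, and the rows are the ones from the schema above):

```python
from collections import defaultdict
from datetime import date

# Rows from the schema in the question: (id, begin_dt, end_dt).
ABSENCES = [
    ("1", date(2019, 1, 1), date(2019, 2, 2)),
    ("2", date(2019, 2, 2), date(2019, 2, 2)),
]


def months_between(begin, end):
    """Yield (year, month) for every calendar month touched by [begin, end]."""
    year, month = begin.year, begin.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def absent_ids_per_month(absences):
    """Map 'YYYY-MM' to the number of distinct ids absent at least one day."""
    ids = defaultdict(set)
    for person_id, begin, end in absences:
        for year, month in months_between(begin, end):
            ids[f"{year:04d}-{month:02d}"].add(person_id)  # a set counts each id once
    return {month: len(people) for month, people in sorted(ids.items())}


print(absent_ids_per_month(ABSENCES))
```

Like the inner join in the query, this only reports months in which somebody was absent.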

Get record in range of multiples of 5

I have an existing table, week_table:
start_date end_date weekno
----------------------------------------------
1996-01-01 1996-01-05 1
1996-01-08 1996-01-12 2
1996-01-15 1996-01-19 3
1996-01-22 1996-01-26 4
... and so on, until:
1998-12-21 1998-12-26 156
I am trying to combine the records into groups of 5 weeks. I am looking for results like:
start_date end_date weekno_start weekno_end
----------------------------------------------
1996-01-01 1996-02-02 1 5
1996-02-05 1996-03-08 6 10
1996-03-11 1996-04-12 11 15
I do get results, but the week numbers keep running past the maximum weekno in the table; for groups beyond weekno 156 I get rows with NULL values.
How can I avoid the NULL rows and limit the output to the maximum weekno?
My current code is:
SELECT (t1.weekno * 5) - 4 AS start_id
,t3.start_date
,t4.end_date
,(t1.weekno * 5) AS end_id
FROM weekcon_table t1
LEFT JOIN weekcon_table t2 ON (t2.weekno = t1.weekno * 5)
LEFT JOIN weekcon_table t3 ON (t3.weekno = (t1.weekno * 5) - 4)
LEFT JOIN weekcon_table t4 ON (t4.weekno = (t1.weekno * 5))
Have you tried something like this?
select min(weekno) as `start_id`,
       min(start_date) as `start_date`,
       max(end_date) as `end_date`,
       min(weekno) as `weekno_start`,
       max(weekno) as `weekno_end`
from weekcon_table
group by ((weekno - 1) DIV 5)
order by ((weekno - 1) DIV 5) asc;
Here is the output:
start_id start_date end_date weekno_start weekno_end
1 01/01/1996 26/01/1996 1 5
6 04/03/1996 24/02/1996 6 10
11 01/04/1996 30/03/1996 11 15
16 06/05/1996 27/04/1996 16 20
21 03/06/1996 25/05/1996 21 23
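A quick way to sanity-check the grouping expression is to run the same arithmetic outside the database. The sketch below is plain Python, not MySQL; the week table is generated (Monday to Friday, 23 weeks) rather than taken from the real data, and the function names are hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby


def make_weeks(first_monday, count):
    """Build hypothetical (start_date, end_date, weekno) rows, Monday to Friday."""
    return [
        (first_monday + timedelta(weeks=i), first_monday + timedelta(weeks=i, days=4), i + 1)
        for i in range(count)
    ]


def group_weeks(weeks, size=5):
    """Collapse weeks into groups of `size`, keyed like (weekno - 1) DIV size."""
    ordered = sorted(weeks, key=lambda w: w[2])
    result = []
    for _bucket, rows in groupby(ordered, key=lambda w: (w[2] - 1) // size):
        rows = list(rows)
        first, last = rows[0], rows[-1]
        # start_date of the first week, end_date of the last week in the group
        result.append((first[0], last[1], first[2], last[2]))
    return result


for row in group_weeks(make_weeks(date(1996, 1, 1), 23)):
    print(*row)
```

Since the groups are built only from rows that exist, the last group simply stops at the highest weekno and no NULL rows can appear.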
I create two derived tables and assign a rank to each row.
The first one is for start_date: it holds every row where weekno % 5 = 1.
The second one is for end_date: it holds every row where weekno % 5 = 0, plus the last week of all.
Then I join them by rank.
Sql Fiddle Demo: in the demo you can change the select list to * if you want to see what is happening.
SELECT ini_range.start_date,
       end_range.end_date,
       ini_range.weekno,
       end_range.weekno
FROM
    (
      SELECT r.*,
             (SELECT count(distinct r2.weekno)
              FROM
                  (
                    SELECT *
                    FROM t_week
                    WHERE weekno % 5 = 1
                  ) r2
              WHERE r2.weekno <= r.weekno
             ) as rnk  -- named rnk because RANK is a reserved word in MySQL 8.0
      FROM
          (
            SELECT *
            FROM t_week
            WHERE weekno % 5 = 1
          ) r
    ) ini_range
JOIN
    (
      SELECT r.*,
             (SELECT count(distinct r2.weekno)
              FROM
                  (
                    SELECT *
                    FROM t_week
                    WHERE weekno % 5 = 0
                       or weekno = (SELECT max(weekno) FROM t_week)
                  ) r2
              WHERE r2.weekno <= r.weekno
             ) as rnk
      FROM
          (
            SELECT *
            FROM t_week
            WHERE weekno % 5 = 0
               or weekno = (SELECT max(weekno) FROM t_week)
          ) r
    ) end_range
  ON ini_range.rnk = end_range.rnk
OUTPUT
| start_date | end_date | weekno | weekno |
|------------|------------|--------|--------|
| 01/01/1996 | 03/02/1996 | 1 | 5 |
| 05/02/1996 | 09/03/1996 | 6 | 10 |
| 11/03/1996 | 13/04/1996 | 11 | 15 |
| 15/04/1996 | 18/05/1996 | 16 | 20 |
| 20/05/1996 | 08/06/1996 | 21 | 23 | <- 23 is the last week, so this group only has 3 weeks instead of 5
I found another solution
SELECT *
FROM t_week w_ini
JOIN t_week w_end
ON w_ini.weekno = w_end.weekno + 4
OR w_ini.weekno + 5 > w_end.weekno
WHERE
w_ini.weekno % 5 = 1
and w_ini.weekno < w_end.weekno
and(
w_end.weekno % 5 = 0 or
w_end.weekno = (SELECT max(weekno) FROM t_week)
)

Check for maximum amount of records in a period

I've got a table with a "date" column (timestamp). What I'm trying to achieve is to check if after inserting a row there will be no more than 3 records contained in a single 24 hours period, for example:
I have records with the following dates:
1. 2015-05-31 23:14:00
2. 2015-06-01 02:07:00
3. 2015-06-01 15:16:00
So now I shouldn't be able to insert a row with the date of (for example) 2015-06-01 16:01:00 or 2015-06-01 01:01:00, but I should be able to add records with the dates of (for example) 2015-06-01 23:50:00 or 2015-05-31 01:05:00.
How can I achieve this?
There is a little trick that lets you solve this in pure SQL:
SET @DATE = '2015-05-31 1:14:00';
INSERT INTO tbldate (inputdate)
SELECT @DATE FROM
(
  (
    SELECT
      COUNT(*) AS c
    FROM tbldate AS t1 INNER JOIN tbldate AS t2
    WHERE
      t1.inputdate <= t2.inputdate AND
      t2.inputdate <= t1.inputdate + INTERVAL 24 HOUR AND
      t1.inputdate BETWEEN @DATE - INTERVAL 24 HOUR AND @DATE
    GROUP BY
      t1.inputdate
  )
  UNION ALL
  (SELECT 0 AS c)
) AS r
HAVING MAX(r.c) < 2;
where @DATE is the date you want to insert.
So, essentially, you want to prevent the insertion of dates which fall within the following ranges, if there are already two dates within those ranges:
SELECT x.id, x.dt - INTERVAL 24 HOUR min_range, x.dt max_range FROM my_table x
UNION
SELECT x.id, x.dt, x.dt + INTERVAL 24 HOUR max_range FROM my_table x;
+----+---------------------+---------------------+
| id | min_range | max_range |
+----+---------------------+---------------------+
| 1 | 2015-05-30 23:14:00 | 2015-05-31 23:14:00 |
| 2 | 2015-05-31 02:07:00 | 2015-06-01 02:07:00 |
| 3 | 2015-05-31 15:16:00 | 2015-06-01 15:16:00 |
| 1 | 2015-05-31 23:14:00 | 2015-06-01 23:14:00 |
| 2 | 2015-06-01 02:07:00 | 2015-06-02 02:07:00 |
| 3 | 2015-06-01 15:16:00 | 2015-06-02 15:16:00 |
+----+---------------------+---------------------+
I'm not suggesting that this is the most efficient solution, but I think it works...
SET @dt = '2015-06-01 23:50:00';
INSERT INTO my_table (dt)
SELECT @dt
  FROM (SELECT 1) m
  LEFT
  JOIN
     ( SELECT a.*
         FROM
            ( SELECT x.id, x.dt - INTERVAL 24 HOUR min_range, x.dt max_range FROM my_table x
              UNION
              SELECT x.id, x.dt, x.dt + INTERVAL 24 HOUR FROM my_table x
            ) a
         JOIN my_table b
           ON b.dt BETWEEN a.min_range AND a.max_range
        GROUP
           BY a.id
            , a.min_range
            , a.max_range
       HAVING COUNT(*) >= 3
     ) n
    ON @dt BETWEEN n.min_range AND n.max_range
 WHERE n.id IS NULL LIMIT 1;
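If the check can live in application code instead, the rule is fairly easy to state directly. The sketch below is plain Python, not MySQL, and rests on two assumptions that are mine rather than the thread's: a "24 hours period" is treated as a closed interval of exactly 24 hours, and the existing rows already satisfy the rule. The function name `can_insert` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta

# The three existing records from the question.
EXISTING = [
    datetime(2015, 5, 31, 23, 14),
    datetime(2015, 6, 1, 2, 7),
    datetime(2015, 6, 1, 15, 16),
]


def can_insert(existing, candidate, limit=3, window=timedelta(hours=24)):
    """Return True if adding `candidate` keeps every 24h window at `limit` rows or fewer."""
    stamps = sorted([*existing, candidate])
    for i, start in enumerate(stamps):
        # Any overfull window can be slid so that it starts on one of the records,
        # so it is enough to test the windows that begin at each timestamp.
        in_window = sum(1 for stamp in stamps[i:] if stamp - start <= window)
        if in_window > limit:
            return False
    return True


for text in ("2015-06-01 16:01:00", "2015-06-01 01:01:00",
             "2015-06-01 23:50:00", "2015-05-31 01:05:00"):
    candidate = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    print(text, can_insert(EXISTING, candidate))
```

For the four dates in the question this reports False, False, True, True, which matches what the asker expects. A check in application code is not safe against concurrent inserts unless it runs inside a transaction with suitable locking.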

Cumulative Sum of group count in mysql query

I have a table that looks like this:
ID HID Date UID
1 1 2012-01-01 1002
2 1 2012-01-24 2005
3 1 2012-02-15 5152
4 2 2012-01-01 6252
5 2 2012-01-19 10356
6 3 2013-01-06 10989
7 3 2013-03-25 25001
8 3 2014-01-14 35798
How can I group by HID, year, and month, count the UIDs, and add a cumulative_sum (a running total of the UID count)? The final result should look like this:
HID Year Month Count cumulative_sum
1 2012 01 2 2
1 2012 02 1 3
2 2012 01 2 2
3 2013 01 1 1
3 2013 03 1 2
3 2014 01 1 3
What's the best way to accomplish this using query?
I made assumptions about the original data set. You should be able to adapt this to the revised dataset - although note that the solution using variables (instead of my self-join) is faster...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(ID INT NOT NULL
,Date DATE NOT NULL
,UID INT NOT NULL PRIMARY KEY
);
INSERT INTO my_table VALUES
(1 ,'2012-01-01', 1002),
(1 ,'2012-01-24', 2005),
(1 ,'2012-02-15', 5152),
(2 ,'2012-01-01', 6252),
(2 ,'2012-01-19', 10356),
(3 ,'2013-01-06', 10989),
(3 ,'2013-03-25', 25001),
(3 ,'2014-01-14', 35798);
SELECT a.*
, SUM(b.count) cumulative
FROM
(
SELECT x.id,YEAR(date) year,MONTH(date) month, COUNT(0) count FROM my_table x GROUP BY id,year,month
) a
JOIN
(
SELECT x.id,YEAR(date) year,MONTH(date) month, COUNT(0) count FROM my_table x GROUP BY id,year,month
) b
ON b.id = a.id AND (b.year < a.year OR (b.year = a.year AND b.month <= a.month)
)
GROUP
BY a.id, a.year,a.month;
+----+------+-------+-------+------------+
| id | year | month | count | cumulative |
+----+------+-------+-------+------------+
| 1 | 2012 | 1 | 2 | 2 |
| 1 | 2012 | 2 | 1 | 3 |
| 2 | 2012 | 1 | 2 | 2 |
| 3 | 2013 | 1 | 1 | 1 |
| 3 | 2013 | 3 | 1 | 2 |
| 3 | 2014 | 1 | 1 | 3 |
+----+------+-------+-------+------------+
If you don't mind an extra column in the result, you can simplify (and accelerate) the above, as follows:
SELECT x.*
     , @running := IF(@previous = x.id, @running, 0) + x.count cumulative
     , @previous := x.id
  FROM
     ( SELECT x.id,YEAR(date) year,MONTH(date) month, COUNT(0) count FROM my_table x GROUP BY id,year,month ) x
    ,( SELECT @previous := NULL, @running := 0 ) vals;
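The running-total idea (keep adding while the id stays the same, reset when it changes) can be sketched outside SQL as well. This is plain Python, not MySQL; the function name `cumulative_counts` is hypothetical and the rows are the ones from the question.

```python
from collections import Counter
from itertools import accumulate, groupby

# (HID, Date, UID) rows copied from the question.
ROWS = [
    (1, "2012-01-01", 1002), (1, "2012-01-24", 2005), (1, "2012-02-15", 5152),
    (2, "2012-01-01", 6252), (2, "2012-01-19", 10356),
    (3, "2013-01-06", 10989), (3, "2013-03-25", 25001), (3, "2014-01-14", 35798),
]


def cumulative_counts(rows):
    """Return (hid, year, month, count, cumulative_sum) tuples in group order."""
    # Step 1: count rows per (HID, year, month); ISO dates slice cleanly.
    counts = Counter((hid, day[:4], day[5:7]) for hid, day, _uid in rows)
    ordered = sorted(counts.items())
    result = []
    # Step 2: within each HID keep a running total; it restarts for the next HID.
    for _hid, group in groupby(ordered, key=lambda item: item[0][0]):
        group = list(group)
        totals = accumulate(count for _key, count in group)
        for (key, count), total in zip(group, totals):
            result.append((*key, count, total))
    return result


for row in cumulative_counts(ROWS):
    print(*row)
```

Note that the explicit sort in step 2 is what makes the running total well defined; any running-total approach depends on the rows being processed in id, year, month order.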
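Written as a single query, the code turns out kind of messy. Note that strftime is SQLite syntax; in MySQL, DATE_FORMAT(`Date`, '%Y') and DATE_FORMAT(`Date`, '%m') play the same role. It reads as follows: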
SELECT
HID,
strftime('%Y', `Date`) AS Year,
strftime('%m', `Date`) AS Month,
COUNT(UID) AS Count,
(SELECT
COUNT(UID)
FROM your_db A
WHERE
A.HID=B.HID
AND
(strftime('%Y', A.`Date`) < strftime('%Y', B.`Date`)
OR
(strftime('%Y', A.`Date`) = strftime('%Y', B.`Date`)
AND
strftime('%m', A.`Date`) <= strftime('%m', B.`Date`)))) AS cumulative_count
FROM your_db B
GROUP BY HID, YEAR, MONTH
Though by using views, it should become much clearer:
CREATE VIEW temp_data AS SELECT
HID,
strftime('%Y', `Date`) as Year,
strftime('%m', `Date`) as Month,
COUNT(UID) as Count
FROM your_db GROUP BY HID, YEAR, MONTH;
Then your statement will read as follows:
SELECT
HID,
Year,
Month,
`Count`,
(SELECT SUM(`Count`)
FROM temp_data A
WHERE
A.HID = B.HID
AND
(A.Year < B.Year
OR
(A.Year = B.Year
AND
A.Month <= B.Month))) AS cumulative_sum
FROM temp_data B;
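Since strftime is an SQLite function, the view-based version can be tried as-is with Python's built-in sqlite3 module. The sketch below does that with the question's data; the wrapper function `cumulative_via_view` is hypothetical, the table name `your_db` follows the answer, and an ORDER BY is added so the row order is deterministic.

```python
import sqlite3

SETUP = """
CREATE TABLE your_db (ID INTEGER, HID INTEGER, Date TEXT, UID INTEGER);
INSERT INTO your_db VALUES
  (1, 1, '2012-01-01', 1002), (2, 1, '2012-01-24', 2005), (3, 1, '2012-02-15', 5152),
  (4, 2, '2012-01-01', 6252), (5, 2, '2012-01-19', 10356),
  (6, 3, '2013-01-06', 10989), (7, 3, '2013-03-25', 25001), (8, 3, '2014-01-14', 35798);
CREATE VIEW temp_data AS
SELECT HID,
       strftime('%Y', Date) AS Year,
       strftime('%m', Date) AS Month,
       COUNT(UID) AS "Count"
  FROM your_db
 GROUP BY HID, Year, Month;
"""

QUERY = """
SELECT B.HID, B.Year, B.Month, B."Count",
       (SELECT SUM(A."Count")
          FROM temp_data A
         WHERE A.HID = B.HID
           AND (A.Year < B.Year
                OR (A.Year = B.Year AND A.Month <= B.Month))) AS cumulative_sum
  FROM temp_data B
 ORDER BY B.HID, B.Year, B.Month;
"""


def cumulative_via_view():
    """Run the view-based query against an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    con.executescript(SETUP)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


for row in cumulative_via_view():
    print(*row)
```

Year and Month come back as zero-padded strings, so comparing them with < and <= orders them correctly.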

count the number of rows between intervals

My table is like:
+---------+---------+------------+-----------------------+---------------------+
| visitId | userId | locationId | comments | time |
+---------+---------+------------+-----------------------+---------------------+
| 1 | 3 | 12 | It's a good day here! | 2012-12-12 11:50:12 |
+---------+---------+------------+-----------------------+---------------------+
| 2 | 3 | 23 | very beautiful | 2012-12-12 12:50:12 |
+---------+---------+------------+-----------------------+---------------------+
| 3 | 3 | 52 | nice | 2012-12-12 13:50:12 |
+---------+---------+------------+-----------------------+---------------------+
which records visitors' trajectory and some comments on the places visited.
I want to count the number of visitors that visit a specific place (say id=3227) from 0:00 to 23:59, per fixed interval (e.g. 30 minutes).
I was trying to do this with:
SELECT COUNT(*) FROM visits
WHERE locationId=3227
GROUP BY HOUR(time), SIGN( MINUTE(time) - 30 ) -- rows in the same interval yield the same result
The problem is that if no record falls in some interval, the query does NOT return that interval with count 0. For example, if no visitors visit the location from 02:00 to 03:00, I do not get the intervals 02:00-02:29 and 02:30-02:59.
I want a result with an exact size of 48 (one for every half hour), how can I do this?
You have to create a table with the 48 rows that you want and use a left outer join:
select n.hr, n.sign, coalesce(v.cnt, 0) as cnt
from (select 0 as hr, -1 as sign union all
      select 0, 1 union all
      select 1, -1 union all
      select 1, 1 union all
      . . .
      select 23, -1 union all
      select 23, 1
     ) n left outer join
     (SELECT HOUR(time) as hr, SIGN( MINUTE(time) - 30 ) as sign, COUNT(*) as cnt
      FROM visits
      WHERE locationId=3227
      GROUP BY HOUR(time), SIGN( MINUTE(time) - 30 )
     ) v
     on n.hr = v.hr and n.sign = v.sign
order by n.hr, n.sign
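One detail worth checking in the grouping key: SIGN(MINUTE(time) - 30) is 0 when the minute is exactly 30, so such rows fall into neither the -1 nor the 1 slot. Integer division by 30 gives exactly two values per hour. The sketch below shows the zero-filled 48-slot result in plain Python (not MySQL); the function name `half_hour_counts` is hypothetical, and the visit times are illustrative values assumed to be already filtered to one location.

```python
from collections import Counter
from datetime import datetime

# Hypothetical visit times for a single location; the last one sits exactly on :30.
VISITS = [
    datetime(2012, 12, 12, 11, 50, 12),
    datetime(2012, 12, 12, 12, 50, 12),
    datetime(2012, 12, 12, 13, 50, 12),
    datetime(2012, 12, 12, 12, 30, 0),
]


def half_hour_counts(visit_times):
    """Return 48 (slot_label, count) pairs, one per half hour, zero-filled."""
    # minute // 30 is 0 for :00-:29 and 1 for :30-:59, so minute 30 is not lost.
    counts = Counter((t.hour, t.minute // 30) for t in visit_times)
    return [
        (f"{hour:02d}:{half * 30:02d}", counts.get((hour, half), 0))
        for hour in range(24)
        for half in range(2)
    ]


for label, count in half_hour_counts(VISITS):
    if count:
        print(label, count)
```

Iterating over all 48 (hour, half) pairs plays the same role as the 48-row table on the left side of the outer join: every slot is listed, and missing ones get a count of 0.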