Show all data in a date range using MYSQL recursive function - mysql

I'm trying to get a list of sales for the past 6 months, with 0 for any month that has no data. I'm using a recursive CTE, all_dates, to generate the months in that range, which works great:
with recursive all_dates(dt) as (
-- anchor
select DATE_SUB(now(), INTERVAL 6 MONTH) dt
union all
-- recursion with stop condition
select dt + interval 1 month from all_dates where dt + interval 1 month <= DATE(now())
)
select DATE_FORMAT(dt, '%Y-%m') as ym from all_dates
This will return:
ym
------
2019-10
2019-11
2019-12
2020-01
2020-02
2020-03
2020-04
Now I want to left join this with my real data:
with recursive all_dates(dt) as (
-- anchor
select DATE_SUB(now(), INTERVAL 6 MONTH) dt
union all
-- recursion with stop condition
select dt + interval 1 month from all_dates where dt + interval 1 month <= now()
)
SELECT
DATE_FORMAT(ad.dt, '%Y-%m') as ym,
sum(profit) as profit
FROM
all_dates as ad
LEFT JOIN organisation_invoices as i
ON
DATE_FORMAT(ad.dt, '%Y-%m') = DATE_FORMAT(i.issue_date, '%Y-%m')
JOIN (
SELECT
invoice_id,
SUM(value) as profit
FROM organisation_invoice_services isrv
GROUP BY invoice_id
) isrv
ON i.id = isrv.invoice_id
WHERE
i.organisation_id = '4b166dbe-d99d-5091-abdd-95b83330ed3a' AND
i.issue_date >= DATE_SUB(NOW(), INTERVAL 6 MONTH)
GROUP BY `ym`
ORDER BY `ym` ASC
But I still only get the populated months:
ym profit
------------------
2019-12 8791
2020-02 302
2020-04 10452
The desired result:
ym profit
------------------
2019-10 0
2019-11 0
2019-12 8791
2020-01 0
2020-02 302
2020-03 0
2020-04 10452
What am I missing?
Edit: Sample data set and fiddle:
CREATE TABLE `organisation_invoices` (
`id` varchar(255) NOT NULL,
`organisation_id` varchar(255) NOT NULL,
`issue_date` date NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `organisation_invoice_services` (
`id` varchar(255) NOT NULL,
`organisation_id` varchar(255) NOT NULL,
`invoice_id` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`qty` float NOT NULL,
`value` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `organisation_invoices` (id, organisation_id, issue_date)
VALUES ('e11cec69-138f-4e20-88e5-5430b6c8d0a1', '4b166dbe-d99d-5091-abdd-95b83330ed3a', '2020-01-20');
INSERT INTO `organisation_invoice_services` (id, organisation_id, invoice_id, qty, `value`)
VALUES ('fe45dfd67-138f-4e20-88e5-5430b6c8d0a1', '4b166dbe-d99d-5091-abdd-95b83330ed3a', 'e11cec69-138f-4e20-88e5-5430b6c8d0a1', 1, 1000);
https://www.db-fiddle.com/f/dibyQi31CBtr2Cr8vjJA8i/0

You can use the following:
with recursive all_dates(dt) as (
-- anchor
select DATE_SUB(now(), INTERVAL 6 MONTH) dt
union all
-- recursion with stop condition
select dt + interval 1 month from all_dates where dt + interval 1 month <= now()
)
SELECT DATE_FORMAT(ad.dt, '%Y-%m') as ym, IFNULL(sum(profit),0) as profit
FROM all_dates as ad
LEFT JOIN organisation_invoices as i
ON DATE_FORMAT(ad.dt, '%Y-%m') = DATE_FORMAT(i.issue_date, '%Y-%m')
LEFT JOIN (
SELECT
invoice_id,
SUM(value) as profit
FROM organisation_invoice_services isrv
GROUP BY invoice_id
) isrv
ON i.id = isrv.invoice_id
WHERE
(i.organisation_id = '4b166dbe-d99d-5091-abdd-95b83330ed3a' AND
i.issue_date >= DATE_SUB(NOW(), INTERVAL 6 MONTH)) OR i.organisation_id IS NULL
GROUP BY `ym`
ORDER BY `ym` ASC
demo on dbfiddle.uk
Changes:
The conditions in the WHERE clause change the behaviour of your LEFT JOIN. Because you filter on a specific organisation_id, every row in which the joined invoice columns are NULL is discarded, so you only get the months that have matching data (the LEFT JOIN behaves like an INNER JOIN). You need the following WHERE clause instead:
WHERE (i.organisation_id = '4b166dbe-d99d-5091-abdd-95b83330ed3a' AND
i.issue_date >= DATE_SUB(NOW(), INTERVAL 6 MONTH)) OR i.organisation_id IS NULL
You also have to change the second JOIN to a LEFT JOIN; as an inner join it discards every month that has no invoice.
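A caveat worth checking against your own data: with the `OR ... IS NULL` form, a month whose only invoices belong to a different organisation is still filtered out, because the joined row is neither NULL nor a match. Putting the filter in the ON clause avoids that. The sketch below is only an illustration: it uses SQLite through Python's `sqlite3` as a stand-in for MySQL, and the `months` and `invoices` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE months (ym TEXT);
INSERT INTO months VALUES ('2020-01'), ('2020-02'), ('2020-03');
CREATE TABLE invoices (org TEXT, ym TEXT, profit INTEGER);
INSERT INTO invoices VALUES
  ('A', '2020-01', 1000),  -- the organisation we report on
  ('B', '2020-02', 50);    -- a different organisation
""")

# Filter in WHERE, with the "OR ... IS NULL" escape hatch.
where_or_null = conn.execute("""
SELECT m.ym, COALESCE(SUM(i.profit), 0)
FROM months m
LEFT JOIN invoices i ON i.ym = m.ym
WHERE i.org = 'A' OR i.org IS NULL
GROUP BY m.ym ORDER BY m.ym
""").fetchall()

# Filter in the ON clause: unmatched months survive as NULL rows.
on_clause = conn.execute("""
SELECT m.ym, COALESCE(SUM(i.profit), 0)
FROM months m
LEFT JOIN invoices i ON i.ym = m.ym AND i.org = 'A'
GROUP BY m.ym ORDER BY m.ym
""").fetchall()

print(where_or_null)  # 2020-02 is missing
print(on_clause)      # all three months, 2020-02 with 0
```

In the MySQL query this would mean moving the organisation_id and issue_date conditions from WHERE into the ON clause of the first LEFT JOIN.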

Related

Count Number of a Specific Day(s) Between Two Dates

I have a single row in a MySQL table named volunteers:
user_id | start_date | end_date
11122 | 2017-04-20 | 2018-02-17
How can I find how many times the 3rd day or 24th day of a month appears? (i.e. 2017-05-03, 2017-06-03, 2017-12-24, 2018-01-24) I'm trying to get to the following count:
Sample Output:
user_id | number_of_third_day | number_of_twenty_fourth_day
11122 | 10 | 10
I looked at the documentation online to see whether there is a way to say something like this (pseudocode):
SELECT
day, COUNT(*)
FROM volunteers
WHERE day(between(start_date, end_date)) in (3,24)
I tried to build a calendar table, to no avail. The idea was to generate every day in the range, GROUP BY day, and COUNT(*) how many times each day appears:
WITH calendar AS (
SELECT start_date AS date
FROM volunteers
UNION ALL
SELECT DATE_ADD(start_date, INTERVAL 1 DAY) as date
FROM volunteers
WHERE DATE_ADD(start_date, INTERVAL 1 DAY) <= end_date
)
SELECT date FROM calendar;
Thanks for any help!
This version is faster than the others because it steps through the date range one month at a time rather than one day at a time:
WITH RECURSIVE cte AS
(
SELECT user_id, DATE_FORMAT(start_date, '%Y-%m-03') as third_day,
DATE_FORMAT(start_date, '%Y-%m-24') as twenty_fourth_day,
start_date, end_date
FROM table1
UNION ALL
SELECT user_id,
DATE_FORMAT(third_day + INTERVAL 1 MONTH, '%Y-%m-03') as third_day,
DATE_FORMAT(twenty_fourth_day + INTERVAL 1 MONTH, '%Y-%m-24') as twenty_fourth_day,
start_date, end_date
FROM cte
WHERE third_day + INTERVAL 1 MONTH <= end_date
)
SELECT user_id,
SUM(CASE WHEN third_day BETWEEN start_date AND end_date THEN 1 ELSE 0 END) AS number_of_third_day,
SUM(CASE WHEN twenty_fourth_day BETWEEN start_date AND end_date THEN 1 ELSE 0 END) AS number_of_twenty_fourth_day
FROM cte
GROUP BY user_id;
Demo here
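A more dynamic approach is to generate every date in the range and count the matching days.
Generating date ranges on the fly takes a lot of time, though, so in practice you should keep a permanent date table and select the dates from it.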
CREATE TABLE table1
(`user_id` int, `start_date` varchar(10), `end_date` varchar(10))
;
INSERT INTO table1
(`user_id`, `start_date`, `end_date`)
VALUES
(11122, '2017-04-20', '2018-02-17')
,(11123, '2019-04-20', '2020-02-17')
;
Records: 2 Duplicates: 0 Warnings: 0
WITH RECURSIVE cte AS (
SELECT
user_id,
`start_date` as date_run ,
`end_date`
FROM table1
UNION ALL
SELECT
user_id,
DATE_ADD(cte.date_run, INTERVAL 1 DAY),
end_date
FROM cte
WHERE DATE_ADD(date_run, INTERVAL 1 DAY) <= end_date
)
SELECT user_id,
SUM(DAYOFMONTH(date_run) = 3) as day_3th,
SUM(DAYOFMONTH(date_run) = 24) as day_24th
FROM cte
GROUP BY user_id
user_id | day_3th | day_24th
11122 | 10 | 10
11123 | 10 | 10
fiddle
In MySQL 8.0 and later you can use a recursive CTE:
-- get list of all dates in interval
with recursive dates(d) as (
select '2017-04-20'
union all
select date_add(d, interval 1 day) from dates where d < '2018-02-17'
) select
-- count each target day (the question asks for the 3rd and the 24th)
sum(day(d) = 3) days_3,
sum(day(d) = 24) days_24
from dates
-- keep only the 3rd and 24th of each month
where day(d) = 3 or day(d) = 24;
https://sqlize.online/sql/mysql80/c00eb7de69d011a85502fa538d64d22c/
As long as you are looking for days that occur in every month (so not the 29th or beyond), this is just straightforward math. The number of whole calendar months between two dates (exclusive) is:
timestampdiff(month,start_date,end_date) - (day(start_date) <= day(end_date))
Then add one if the start month includes the target day and one if the end month includes it:
timestampdiff(month,start_date,end_date) - (day(start_date) <= day(end_date))
+ (day(start_date) <= 3) + (day(end_date) >= 3)
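The formula above can be sanity-checked outside the database. The sketch below re-implements it in Python and compares it with a day-by-day count. `month_diff` is a hypothetical helper that assumes TIMESTAMPDIFF(MONTH, ...) on plain DATE values counts completed months (calendar month difference, minus one if the end day is earlier than the start day); the target day must be 28 or lower.

```python
from datetime import date, timedelta


def month_diff(start: date, end: date) -> int:
    """Assumed equivalent of TIMESTAMPDIFF(MONTH, start, end) for dates, start <= end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return months - (1 if end.day < start.day else 0)


def count_day(start: date, end: date, target: int) -> int:
    """Count dates in [start, end] whose day-of-month equals target (1..28)."""
    # Whole calendar months strictly between the start month and the end month.
    whole = month_diff(start, end) - (1 if start.day <= end.day else 0)
    # Add the start month and the end month if they contain the target day.
    return whole + (1 if start.day <= target else 0) + (1 if end.day >= target else 0)


def brute_force(start: date, end: date, target: int) -> int:
    """Reference implementation: walk the range one day at a time."""
    n, d = 0, start
    while d <= end:
        n += d.day == target
        d += timedelta(days=1)
    return n


start, end = date(2017, 4, 20), date(2018, 2, 17)
third = count_day(start, end, 3)
twenty_fourth = count_day(start, end, 24)
print(third, twenty_fourth)  # 10 10, matching the sample output in the question
```

The same check also passes when both dates fall in the same month, where the "whole months" term becomes -1 and cancels the double count.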

Return Monthly Data From Query Even When Month Does Not Exist In DataSet

I am a MS SQL Server guy, but am having to write some MySQL Queries. I am attempting to write a query that will show monthly sales data for a selected employee and if the employee has no sales data for that month show the month and a 0.
This is the query I have but it's returning NULL?
CREATE TABLE `saleamountbyemployee` (
`month_year` varchar(9) DEFAULT NULL,
`total_sales` int(11) NOT NULL,
`employee` char(17) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `saleamountbyemployee` (`month_year`,`total_sales`,`employee`) VALUES ('Feb 18','34512','James Jones');
INSERT INTO `saleamountbyemployee` (`month_year`,`total_sales`,`employee`) VALUES ('Feb 18','223','Sally Smith');
INSERT INTO `saleamountbyemployee` (`month_year`,`total_sales`,`employee`) VALUES ('Feb 18','22','James Jones');
WITH RECURSIVE
cte_months_to_pull AS (
SELECT DATE_FORMAT(@start_date, '%Y-%m-01') - INTERVAL @number_of_months MONTH AS month_to_pull
UNION ALL
SELECT month_to_pull + INTERVAL 1 MONTH
FROM cte_months_to_pull
WHERE month_to_pull < @start_date + INTERVAL @number_of_months - 2 MONTH
)
SELECT YRS.months_to_pull,T.employee,COALESCE(T.IA, 0) IA
FROM (SELECT DATE_Format(month_to_pull, '%b-%Y') months_to_pull
FROM cte_months_to_pull
ORDER BY months_to_pull
) AS YRS
LEFT JOIN (SELECT Date_format(month_year, '%b-%Y') AS `Month`
,employee,Sum(total_sales) AS IA
FROM saleamountbyemployee
WHERE employee = 'James Jones'
GROUP BY Date_format(month_year, '%b-%Y'), employee) T
ON YRS.months_to_pull = T.`Month`
order by month(STR_TO_DATE(CONCAT('01-',months_to_pull), '%d-%b-%Y')),YEAR(STR_TO_DATE(CONCAT('01-',months_to_pull), '%d-%b-%Y'))
EDIT
If I alter syntax to this:
SET @start_date = 'Jan 18';
SET @number_of_months = 12;
WITH RECURSIVE
cte_months_to_pull AS (
SELECT str_to_date(CONCAT(@start_date,' 01'), '%b %y %d') - INTERVAL @number_of_months MONTH AS month_to_pull
UNION ALL
SELECT month_to_pull + INTERVAL 1 MONTH
FROM cte_months_to_pull
WHERE month_to_pull < @start_date + INTERVAL @number_of_months - 2 MONTH
)
SELECT YRS.months_to_pull,T.employee,COALESCE(T.IA, 0) IA
FROM (SELECT DATE_Format(month_to_pull, '%b-%Y') months_to_pull
FROM cte_months_to_pull
ORDER BY months_to_pull
) AS YRS
LEFT JOIN (SELECT Date_format(month_year, '%b-%Y') AS `Month`
,employee,Sum(total_sales) AS IA
FROM saleamountbyemployee
WHERE employee = 'James Jones'
GROUP BY Date_format(month_year, '%b-%Y'), employee) T
ON YRS.months_to_pull = T.`Month`
order by month(STR_TO_DATE(CONCAT('01-',months_to_pull), '%d-%b-%Y')),YEAR(STR_TO_DATE(CONCAT('01-',months_to_pull), '%d-%b-%Y'))
I now get this error message:
Error Code: 1292. Incorrect datetime value: 'Jan 18'
This line in the CTE is the problem:
SELECT DATE_FORMAT(@start_date, '%Y-%m-01') - INTERVAL @number_of_months MONTH AS month_to_pull
DATE_FORMAT formats a value that is already a date; it does not parse strings in the 'Feb 18' format used in the table. If you use 'Feb 18' as the value of the @start_date parameter, it will return NULL. The expression as written expects input that looks like this:
SELECT DATE_FORMAT('2018-01-01', '%Y-%m-01') - INTERVAL 1 MONTH AS month_to_pull
Instead, append ' 01' to the value and parse it with STR_TO_DATE, so that it is read as the first day of that month:
SELECT str_to_date(CONCAT(@start_date,' 01'), '%b %y %d') - INTERVAL @number_of_months MONTH AS month_to_pull
This will return the following if you use 'Feb 18' as the @start_date and 1 as the @number_of_months:
1/1/2018 12:00:00 AM
That VARCHAR data type and date format in the table is gonna bite you.
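For readers who want to see the parsing step in isolation, here is a rough Python analogue of the STR_TO_DATE fix. It is only a sketch: Python's `strptime` happens to use the same `%b %y %d` specifiers, the month subtraction is done by hand, and an English locale is assumed for the month abbreviation.

```python
from datetime import datetime

start_date = "Feb 18"      # same text format as the month_year column
number_of_months = 1

# Append " 01" so the string carries a day, then parse month abbreviation,
# two-digit year and day, like STR_TO_DATE(CONCAT(..., ' 01'), '%b %y %d').
parsed = datetime.strptime(start_date + " 01", "%b %y %d").date()

# Step back by whole months (safe here because the day is always 1).
index = parsed.year * 12 + (parsed.month - 1) - number_of_months
month_to_pull = parsed.replace(year=index // 12, month=index % 12 + 1)

print(parsed)         # 2018-02-01
print(month_to_pull)  # 2018-01-01
```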
I am attempting to write a query that will show monthly sales data for a selected employee and if the employee has no sales data for that month show the month and a 0.
Assuming that you have data for some employee in each month, you can use conditional aggregation:
select month_year,
sum(case when employee = 'James Jones' then total_sales else 0 end) as monthly_sales
from saleamountbyemployee sae
group by month_year;
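A small runnable sketch of the conditional aggregation, using SQLite through Python's `sqlite3` as a stand-in for MySQL. The three 'Feb 18' rows come from the question; the 'Mar 18' row is a hypothetical addition so that there is a month in which James Jones has no sales.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE saleamountbyemployee (month_year TEXT, total_sales INTEGER, employee TEXT);
INSERT INTO saleamountbyemployee VALUES
  ('Feb 18', 34512, 'James Jones'),
  ('Feb 18', 223,   'Sally Smith'),
  ('Feb 18', 22,    'James Jones'),
  ('Mar 18', 75,    'Sally Smith');  -- hypothetical row, not in the question
""")

monthly = conn.execute("""
SELECT month_year,
       SUM(CASE WHEN employee = 'James Jones' THEN total_sales ELSE 0 END) AS monthly_sales
FROM saleamountbyemployee
GROUP BY month_year
ORDER BY month_year
""").fetchall()

print(monthly)  # 'Mar 18' appears with 0 because another employee has a row there
```

A month in which nobody has a row still would not appear, which is why the assumption stated above matters.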

How to select data from 1 to 5 days with Group By

I have tried this but could not get it to work; I can only display the total.
I have a download table and a program table.
Every time a program is downloaded I record the date and time. I need to group the downloads by program and show five columns, one per day (HOJE = today, ONTEM = yesterday, then 2, 3 and 4 days ago). Here's an example:
PROGRAMA | HOJE | ONTEM| 2 DIAS | 3 DIAS | 4 DIAS
Programa 1 11 110 55 66 12
Programa 2 25 140 60 90 12
Programa 3 10 20 20 10 10
TOTAL 46 270 135 166 32
Below is my query
select `k`.`app_id` AS `app_id`,`b`.`aplicativo` AS `aplicativo`,count(0) AS `HOJE`,
(select count(0) AS `count(*)` from (`registration` `a` join `aplicativos` `b`) where `k`.`app_id`= `b`.`id` and created_at > (cast(now() as date) - interval 1 day) and (`a`.`created_at` < cast(now() as date)- interval 0 day) ) as ONTEM ,
(select count(0) AS `count(*)` from (`registration` `a` join `aplicativos` `b`) where `k`.`app_id` = `b`.`id`
and created_at > (cast(now() as date) - interval 2 day) and (`a`.`created_at` < cast(now() as date)- interval 1 day) ) as 2_DIAS_ANTES ,
(select count(0) AS `count(*)` from (`registration` `a` join `aplicativos` `b`) where `k`.`app_id` = `b`.`id`
and created_at > (cast(now() as date) - interval 3 day) and (`a`.`created_at` < cast(now() as date)- interval 2 day) ) as 3_DIAS_ANTES ,
(select count(0) AS `count(*)` from (`registration` `a` join `aplicativos` `b`) where `k`.`app_id` = `b`.`id`
and created_at > (cast(now() as date) - interval 4 day) and (`a`.`created_at` < cast(now() as date)- interval 3 day) ) as 4_DIAS_ANTES ,
(select count(0) AS `count(*)` from (`registration` `a` join `aplicativos` `b`) where `k`.`app_id` = `b`.`id`
and created_at > (cast(now() as date) - interval 5 day) and (`a`.`created_at` < cast(now() as date)- interval 4 day) ) as 5_DIAS_ANTES
from (`registration` `k` join `aplicativos` `b`) where ((`k`.`app_id` = `b`.`id`) and (`k`.`created_at` > (cast(now() as date) - interval 0 day)))
group by `b`.`aplicativo`
Table structure
Table aplicativos
CREATE TABLE IF NOT EXISTS `aplicativos` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_usuario` int(11) NOT NULL,
`aplicativo` varchar(200) NOT NULL,
`link` varchar(400) NOT NULL,
`quantidade_notificacoes` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=13 ;
Table registration
CREATE TABLE IF NOT EXISTS `registration` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`gcm_regid` varchar(300) NOT NULL,
`app_id` int(11) NOT NULL,
`email` varchar(200) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=73876 ;
Here is one approach for MySQL:
SELECT a.aplicativo as PROGRAMA,
sum(Date(r.created_at) = CURDATE()) AS HOJE,
sum(date(r.created_at) = DATE_SUB(CURDATE(), INTERVAL 1 DAY)) AS ONTEM,
...
FROM registration r INNER JOIN
aplicativos a
on r.app_id = a.id
GROUP BY r.app_id ;
Does this give you the expected result?
SELECT
a.aplicativo as PROGRAMA,
-- SUM, not COUNT: COUNT would also count the rows where IF() returns 0
SUM(IF(DATE(r.created_at) = CURDATE(), 1, 0)) AS HOJE,
SUM(IF(DATE(r.created_at) = DATE_SUB(CURDATE(), INTERVAL 1 DAY), 1, 0)) AS ONTEM,
SUM(IF(DATE(r.created_at) = DATE_SUB(CURDATE(), INTERVAL 2 DAY), 1, 0)) AS 2DIAS,
SUM(IF(DATE(r.created_at) = DATE_SUB(CURDATE(), INTERVAL 3 DAY), 1, 0)) AS 3DIAS,
SUM(IF(DATE(r.created_at) = DATE_SUB(CURDATE(), INTERVAL 4 DAY), 1, 0)) AS 4DIAS,
SUM(IF(DATE(r.created_at) = DATE_SUB(CURDATE(), INTERVAL 5 DAY), 1, 0)) AS 5DIAS
FROM registration r
INNER JOIN aplicativos a
ON r.app_id = a.id
-- one row per program; WITH ROLLUP appends the TOTAL row (PROGRAMA is NULL there)
GROUP BY a.aplicativo WITH ROLLUP;
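One detail to watch when counting conditionally: COUNT counts every non-NULL value, so wrapping a 1/0 expression in COUNT counts all rows in the group, while SUM counts only the matches. The sketch below shows the difference with SQLite through Python's `sqlite3` as a stand-in for MySQL; the simplified `registration` table, its `created_on` column and the fixed "today" value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registration (app_id INTEGER, created_on TEXT);
INSERT INTO registration VALUES
  (1, '2024-01-10'), (1, '2024-01-10'), (1, '2024-01-09'),
  (2, '2024-01-09');
""")

today = "2024-01-10"  # fixed so the result is reproducible
rows = conn.execute("""
SELECT app_id,
       COUNT(CASE WHEN created_on = ? THEN 1 ELSE 0 END) AS count_today,  -- counts every row
       SUM(CASE WHEN created_on = ? THEN 1 ELSE 0 END)   AS sum_today     -- counts matches only
FROM registration
GROUP BY app_id
ORDER BY app_id
""", (today, today)).fetchall()

print(rows)  # app 1: COUNT says 3, SUM says 2; app 2: COUNT says 1, SUM says 0
```

If COUNT is preferred, the ELSE branch has to produce NULL instead of 0.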

How I can count the number of times a value appears in a column grouped by day?

My table structure is:
CREATE TABLE `survey` (
`id` int(11) NOT NULL auto_increment,
`submitdate` datetime default NULL,
`answer` varchar(5) collate utf8_unicode_ci default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=499 ;
Here answer contains values such as a1, a2, a3.
I want to count the records for the last 10 days, per answer. If there is no record for an answer on a particular day, the count should be zero.
The output I want is
date count answer
19-11-2012 10 a1
19-11-2012 8 a2
19-11-2012 0 a3
18-11-2012 30 a1
18-11-2012 30 a2
18-11-2012 30 a3
I used a query like
SELECT days.day, count(survey.id)
FROM
(select curdate() as day
union select curdate() - interval 1 day
union select curdate() - interval 2 day
union select curdate() - interval 3 day
union select curdate() - interval 4 day
union select curdate() - interval 5 day
union select curdate() - interval 6 day
union select curdate() - interval 7 day
union select curdate() - interval 8 day
union select curdate() - interval 9 day) days
left join survey
on days.day = date(survey.submitdate)
group by
days.day
You can look back 10 days by using the SUBDATE function and compare dates using '>'. You'll also want to GROUP BY the answer as well as the day, since you're attempting to calculate the count of each individual answer per day.
SELECT DATE(submitdate) AS day,
answer,
COUNT(*) AS answer_count
FROM survey
WHERE DATE(submitdate) > DATE(SUBDATE(CURRENT_DATE, 10))
GROUP BY day, answer;
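Grouping by day and answer only returns combinations that exist in the table. To also get the zero rows the question asks for, one option is to cross join a list of days with a list of answers and left join the survey onto that. The sketch below illustrates the idea with SQLite through Python's `sqlite3` as a stand-in for MySQL; the `days` and `answers` helper tables and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE survey (id INTEGER PRIMARY KEY, submitdate TEXT, answer TEXT);
INSERT INTO survey (submitdate, answer) VALUES
  ('2012-11-19 10:00:00', 'a1'),
  ('2012-11-19 11:00:00', 'a1'),
  ('2012-11-19 12:00:00', 'a2'),
  ('2012-11-18 09:00:00', 'a3');
CREATE TABLE days (day TEXT);
INSERT INTO days VALUES ('2012-11-19'), ('2012-11-18');
CREATE TABLE answers (answer TEXT);
INSERT INTO answers VALUES ('a1'), ('a2'), ('a3');
""")

counts = conn.execute("""
SELECT d.day, a.answer, COUNT(s.id) AS answer_count
FROM days d
CROSS JOIN answers a                      -- every (day, answer) combination
LEFT JOIN survey s
       ON date(s.submitdate) = d.day
      AND s.answer = a.answer             -- unmatched combinations stay as NULL rows
GROUP BY d.day, a.answer
ORDER BY d.day DESC, a.answer
""").fetchall()

for row in counts:
    print(row)
```

COUNT(s.id) rather than COUNT(*) is what turns the unmatched combinations into 0.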

selecting between date range and forcing empty rows to be 0?

I have a table that looks like
expires | value
-------------------
2011-06-15 | 15
2011-06-15 | 15
2011-06-25 | 15
2011-07-15 | 15
2011-07-15 | 15
2011-07-25 | 15
2011-08-15 | 15
2011-08-15 | 15
2011-08-25 | 15
I want to run a query that will spit out
June | 45
July | 45
August | 45
So my query is
SELECT SUM(amount) AS `amount`,
DATE_FORMAT(expires , '%M') AS `month`
FROM dealDollars
WHERE DATE(expires) BETWEEN DATE(NOW())
AND LAST_DAY(DATE(NOW()+INTERVAL 3 MONTH))
GROUP BY MONTH(expires)
Which works fine. But with the result, if there were no rows in say July, July would not show up.
How can I force July to show up with 0 as its value?
There is no easy way to do this. One option is to create a table called months with 12 rows (January, February, ..., December).
You can then left join the months table with the query you have to get the desired output.
The general consensus is that you should just create a table of month names. What follows is a silly solution which can serve as a workaround.
You'll have to work on the specifics yourself, but have you looked at sub-queries in the from clause?
Basically, it would be something like this:
SELECT COALESCE(B.amount, 0) as `amount`, A.month as `month`
FROM (SELECT 'January' as `month`
UNION SELECT 'February' as `month`
UNION SELECT 'March' as `month`...
UNION SELECT 'December' as `month`) as A
LEFT JOIN
(SELECT SUM(amount) AS `amount`,
DATE_FORMAT(expires , '%M') AS `month`
FROM dealDollars
WHERE
DATE(expires) BETWEEN
DATE(NOW()) AND
LAST_DAY(DATE(NOW()+INTERVAL 3 MONTH))
GROUP BY MONTH(expires)) as B
ON (A.MONTH = B.MONTH)
Crazy, no?
Before version 8.0, MySQL had no recursive query support, so you're left with using the NUMBERS table trick:
Create a table that only holds incrementing numbers - easy to do using an auto_increment:
DROP TABLE IF EXISTS `example`.`numbers`;
CREATE TABLE `example`.`numbers` (
`id` int(10) unsigned NOT NULL auto_increment,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Populate the table using:
INSERT INTO NUMBERS
(id)
VALUES
(NULL)
...for as many values as you need. In this case, the INSERT statement needs to be run at least 3 times.
Use DATE_ADD to construct a list of dates one month apart, increasing based on the NUMBERS.id value:
SELECT x.dt
FROM (SELECT DATE(DATE_ADD(CURRENT_DATE(), INTERVAL (n.id - 1) MONTH)) AS dt
FROM numbers n
WHERE DATE(DATE_ADD(CURRENT_DATE(), INTERVAL (n.id - 1) MONTH)) BETWEEN CURRENT_DATE()
AND LAST_DAY(CURRENT_DATE() + INTERVAL 3 MONTH)) x
Use an OUTER JOIN to get your desired output:
SELECT x.dt,
-- count a column from the joined table; COUNT(*) would report 1 for empty months
COUNT(y.date) AS total
FROM (SELECT DATE(DATE_ADD(CURRENT_DATE(), INTERVAL (n.id - 1) MONTH)) AS dt
FROM numbers n
WHERE DATE(DATE_ADD(CURRENT_DATE(), INTERVAL (n.id - 1) MONTH)) BETWEEN CURRENT_DATE()
AND LAST_DAY(CURRENT_DATE() + INTERVAL 3 MONTH)) x
LEFT JOIN YOUR_TABLE y ON y.date = x.dt
GROUP BY x.dt
ORDER BY x.dt
Why Numbers, not Dates?
Simple - dates can be generated based on the number, like in the example I provided. It also means using a single table, vs say one per data type.
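To make the numbers-to-dates idea concrete, here is a minimal sketch that derives months from a numbers table and left joins the data onto them so that empty months show 0. It runs on SQLite through Python's `sqlite3` as a stand-in for MySQL, so the date functions differ (`date(..., '+N months')` and `strftime` instead of DATE_ADD and DATE_FORMAT); the `deals` table, its rows and the fixed start date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE numbers (id INTEGER PRIMARY KEY AUTOINCREMENT);
CREATE TABLE deals (expires TEXT, value INTEGER);
INSERT INTO deals VALUES
  ('2011-06-15', 15), ('2011-06-25', 15),  -- June
  ('2011-08-15', 15);                      -- August, nothing in July
""")

# One row per month we want to report on (ids 1, 2, 3).
for _ in range(3):
    conn.execute("INSERT INTO numbers DEFAULT VALUES")

totals = conn.execute("""
SELECT m.ym, COALESCE(SUM(d.value), 0) AS total
FROM (SELECT strftime('%Y-%m', date('2011-06-01', '+' || (id - 1) || ' months')) AS ym
      FROM numbers) AS m
LEFT JOIN deals d ON strftime('%Y-%m', d.expires) = m.ym
GROUP BY m.ym
ORDER BY m.ym
""").fetchall()

print(totals)  # July is present with a total of 0
```

Joining on the year-month rather than on the exact date is what lets every row of a month fall into the same bucket.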
select MONTHNAME(expires) as month_name,sum(`value`) from Table1
group by month_name order by null;
fiddle