Using count(*) .. Over(*) in mysql


My data looks like the following,
requestedDate Status
2020-04-21 APPROVED
2020-04-23 APPROVED
2020-04-27 PENDING
2020-05-21 PENDING
2020-06-01 APPROVED
I would like to extract a report that looks like the following, where the counts are by status and by month.
Status StatusCount Month MonthCount CountTotal
APPROVED 2 APR 3 5
PENDING 1 MAY 1 5
APPROVED 1 JUN 1 5
My sql looks like the following,
select distinct
status,
count(status) over (partition by status) as total_by_status,
CASE
WHEN Month(requestedDate) = 1 THEN 'JAN'
WHEN Month(requestedDate) = 2 THEN 'FEB'
WHEN Month(requestedDate) = 3 THEN 'MAR'
WHEN Month(requestedDate) = 4 THEN 'APR'
WHEN Month(requestedDate) = 5 THEN 'MAY'
WHEN Month(requestedDate) = 6 THEN 'JUN'
WHEN Month(requestedDate) = 7 THEN 'JUL'
WHEN Month(requestedDate) = 8 THEN 'AUG'
WHEN Month(requestedDate) = 9 THEN 'SEP'
WHEN Month(requestedDate) = 10 THEN 'OCT'
WHEN Month(requestedDate) = 11 THEN 'NOV'
WHEN Month(requestedDate) = 12 THEN 'DEC'
END AS myMONTH,
count(Month(requestedDate)) over (partition by Month(requestedDate)) as total_by_month,
count(*) over () as Totals
from Reports
where
requestedDate between DATE_SUB(CURDATE(), INTERVAL 120 DAY) and date(CURDATE())
order by 1;
The output for that looks like,
status total_by_status myMONTH total_by_month Totals
APPROVED 3 APR 3 5
APPROVED 3 JUN 1 5
PENDING 2 APR 3 5
PENDING 2 MAY 1 5

First you need a valid aggregation query. Then you can use window functions on top of it (here, you would typically compute window sums of the counts).
I would write this as:
select
status,
count(*) status_count,
date_format(requestedDate, '%b') requested_month,
sum(count(*)) over(partition by year(requestedDate), month(requestedDate)) month_count,
sum(count(*)) over() total_count
from reports
where requestedDate between current_date - interval 120 day and current_date
group by status, year(requestedDate), month(requestedDate), date_format(requestedDate, '%b')
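
The two-step idea above (aggregate first, then apply window functions to the aggregated rows) can be tried without a MySQL server. What follows is only a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in engine. It assumes an SQLite build with window-function support (3.25 or later) and loads a `reports` table with the five sample rows from the question. SQLite has no date_format(), so strftime('%Y-%m', ...) serves as the month key, and the aggregation sits in a CTE named `counts` (a name chosen here for illustration) instead of being nested inside the window function.

```python
import sqlite3

# In-memory stand-in for the question's table, loaded with its five sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (requestedDate TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?)",
    [
        ("2020-04-21", "APPROVED"),
        ("2020-04-23", "APPROVED"),
        ("2020-04-27", "PENDING"),
        ("2020-05-21", "PENDING"),
        ("2020-06-01", "APPROVED"),
    ],
)

QUERY = """
WITH counts AS (
    -- Step 1: a plain aggregation, one row per (status, month).
    SELECT status,
           strftime('%Y-%m', requestedDate) AS ym,
           COUNT(*) AS status_count
    FROM reports
    GROUP BY status, strftime('%Y-%m', requestedDate)
)
-- Step 2: window sums over the already aggregated counts.
SELECT status,
       status_count,
       ym,
       SUM(status_count) OVER (PARTITION BY ym) AS month_count,
       SUM(status_count) OVER () AS total_count
FROM counts
ORDER BY ym, status
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each (status, month) pair appears once, carrying its own count, the count for its month, and the grand total.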

Since the report covers only the last 120 days, the same month from a different year cannot appear, so we can also use distinct instead of group by, something like below:
select distinct status,
count(*) over (partition by status) as total_by_status,
date_format(requestedDate, '%b') mymonth,
count(Month(requestedDate)) over (partition by Month(requestedDate)) as total_by_month,
count(*) over () as totals
from reports
where requestedDate between current_date - interval 120 day and current_date
order by status, mymonth

Related

mysql query is returning incorrect values

I have the following data,
started ended theDate theYear themonth growth division teams status location
2 0 8/31/2019 2019 8 2 Unknown Team A ACTIVE Town A
1 0 5/31/1996 1996 5 1 Unknown Team B ACTIVE Town A
1 0 8/31/2014 2014 8 1 Unknown Team B ACTIVE Town B
1 0 1/31/1996 1996 1 1 Unknown Team B ACTIVE Town C
1 0 7/31/2004 1985 7 1 Unknown Team C ACTIVE Town E
1 0 7/31/1985 1985 7 1 Unknown Team B ACTIVE Town A
1 0 5/31/2019 2019 5 1 Unknown Team A ACTIVE Town F
The started column shows the employees that have joined on that particular date. The growth column is started - ended where ended is the number of employees that left.
I have the following query, which extracts the data correctly as long as I specify the variables.
set @theYear = 2019 ;
set @team = 'Team A' ;
SELECT
t1.growth,
SUM(t2.growth) AS Emp_Count,
CASE
WHEN t1.theYear IS NULL THEN t1.theYear
ELSE t1.theYear
END AS theYear,
t1.team,
t1.location,
t1.division,
t1.status,
CASE
WHEN t1.Month = 1 THEN 'JAN'
WHEN t1.Month = 2 THEN 'FEB'
WHEN t1.Month = 3 THEN 'MAR'
WHEN t1.Month = 4 THEN 'APR'
WHEN t1.Month = 5 THEN 'MAY'
WHEN t1.Month = 6 THEN 'JUN'
WHEN t1.Month = 7 THEN 'JUL'
WHEN t1.Month = 8 THEN 'AUG'
WHEN t1.Month = 9 THEN 'SEP'
WHEN t1.Month = 10 THEN 'OCT'
WHEN t1.Month = 11 THEN 'NOV'
WHEN t1.Month = 12 THEN 'DEC'
END AS myMONTH
FROM
(SELECT
CASE
WHEN r.growth IS NOT NULL THEN r.growth
WHEN r.growth IS NULL THEN 0
END AS growth,
r.theYear,
r.team,
r.division,
r.location,
t.mon_num AS Month,
r.status
FROM
Reports r
RIGHT JOIN (SELECT 1 mon_num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) t ON t.mon_num = r.themonth
AND r.theYear = (select @theYear)
AND r.team = (select @team)
GROUP BY r.growth , Month
ORDER BY t.mon_num ASC) AS t1
JOIN
(SELECT
CASE
WHEN r2.growth IS NOT NULL THEN r2.growth
WHEN r2.growth IS NULL THEN 0
END AS growth,
r2.theYear,
r2.team,
r2.division,
r2.location,
t.mon_num AS Month,
r2.status
FROM
Reports r2
RIGHT JOIN (SELECT 1 mon_num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) t ON t.mon_num = r2.themonth
AND r2.theYear = (select @theYear)
AND r2.team = (select @team)
GROUP BY r2.growth , Month
ORDER BY t.mon_num ASC) AS t2 ON t1.Month >= t2.Month
GROUP BY t1.Month;
The query lists the months (Jan, Feb, etc.) along with the selected data and the incremental count.
However, if I want to use the query without the variables, i.e. get all data without any conditions/filters, the result I get is incorrect. First off, it should start with the year 1985, as I do have the year 1985 in my dataset.

That's quite a complex query. A wise programmer troubleshoots that sort of thing a subquery at a time.
Your query contains this code:
AND r.theYear = (select @theYear)
What is this supposed to do if your @-variable is not defined?
Do you want something like this instead?
AND (@theYear IS NULL OR r.theYear=@theYear)
That will not filter by year if your @-variable is NULL (an undefined user variable evaluates to NULL).
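
The optional-filter condition suggested above can be checked in isolation. This is a rough sketch only, written in Python with the standard-library sqlite3 module standing in for MySQL. Bound parameters (`:year`, `:team`) play the role of the user variables, the helper name `growth_by_year` is hypothetical, and the table holds just three columns taken from the sample data in the question.

```python
import sqlite3

# Three columns from the question's sample rows: growth, theYear, team.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (growth INTEGER, theYear INTEGER, team TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [
        (2, 2019, "Team A"),
        (1, 1996, "Team B"),
        (1, 2014, "Team B"),
        (1, 1996, "Team B"),
        (1, 1985, "Team C"),
        (1, 1985, "Team B"),
        (1, 2019, "Team A"),
    ],
)

# Each filter is skipped when its parameter is NULL, applied otherwise.
QUERY = """
SELECT theYear, SUM(growth) AS growth
FROM reports
WHERE (:year IS NULL OR theYear = :year)
  AND (:team IS NULL OR team = :team)
GROUP BY theYear
ORDER BY theYear
"""


def growth_by_year(year=None, team=None):
    """Return (year, total growth) rows; a None argument disables that filter."""
    return conn.execute(QUERY, {"year": year, "team": team}).fetchall()


all_years = growth_by_year()
only_2019 = growth_by_year(year=2019)
team_b = growth_by_year(team="Team B")
print(all_years)
print(only_2019)
print(team_b)
```

With no arguments the result starts at 1985, which is the behaviour the question asks for when no filter is set.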

A way to have a rolling summation

I have the below dataset. The example below uses records for the year 1993. The Tgrowth column is started - ended, where started is the number of employees that joined in a specific month and ended is the number of employees that left in the same month.
SELECT
r.Tgrowth,
CASE
WHEN t.mon_num = 1 THEN 'JAN'
WHEN t.mon_num = 2 THEN 'FEB'
WHEN t.mon_num = 3 THEN 'MAR'
WHEN t.mon_num = 4 THEN 'APR'
WHEN t.mon_num = 5 THEN 'MAY'
WHEN t.mon_num = 6 THEN 'JUN'
WHEN t.mon_num = 7 THEN 'JUL'
WHEN t.mon_num = 8 THEN 'AUG'
WHEN t.mon_num = 9 THEN 'SEP'
WHEN t.mon_num = 10 THEN 'OCT'
WHEN t.mon_num = 11 THEN 'NOV'
WHEN t.mon_num = 12 THEN 'DEC'
END AS myMONTH
FROM
(SELECT 1 mon_num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) t
LEFT JOIN Reports r ON t.mon_num = r.theMONTH
AND r.Tyear = 1993
GROUP BY r.Tgrowth , myMONTH
ORDER BY t.mon_num ASC
The result set for the above is as follows,
Tgrowth Month
1 JAN
0 FEB
2 MAR
0 APR
0 MAY
0 JUN
0 JUL
0 AUG
0 SEP
0 OCT
0 NOV
0 DEC
Instead I would like the result to show a rolling sum i.e. add to the Tgrowth field. Something like the below,
growth Emp_Count myMONTH
1 1 JAN
0 1 FEB
2 3 MAR
0 3 APR
0 3 MAY
0 3 JUN
0 3 JUL
0 3 AUG
0 3 SEP
0 3 OCT
0 3 NOV
0 3 DEC

There are two options: use a join, or use variables.
The join method is as follows:
SELECT
t1.Tgrowth,
sum(t2.Tgrowth) as Emp_Count,
CASE
WHEN t1.Month = 1 THEN 'JAN'
WHEN t1.Month = 2 THEN 'FEB'
WHEN t1.Month = 3 THEN 'MAR'
WHEN t1.Month = 4 THEN 'APR'
WHEN t1.Month = 5 THEN 'MAY'
WHEN t1.Month = 6 THEN 'JUN'
WHEN t1.Month = 7 THEN 'JUL'
WHEN t1.Month = 8 THEN 'AUG'
WHEN t1.Month = 9 THEN 'SEP'
WHEN t1.Month = 10 THEN 'OCT'
WHEN t1.Month = 11 THEN 'NOV'
WHEN t1.Month = 12 THEN 'DEC'
END AS myMONTH
FROM (
SELECT
case
when r.growth is not null then r.growth
when r.growth is null then 0
END as Tgrowth,
t.mon_num AS Month
FROM
(SELECT 1 mon_num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8
UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) t
LEFT JOIN Reports r ON t.mon_num = r.themonth
AND r.theYear = 1993
GROUP BY r.growth , Month
ORDER BY t.mon_num ASC
) as t1 join (
SELECT
case
when r.growth is not null then r.growth
when r.growth is null then 0
END as Tgrowth,
t.mon_num AS Month
FROM
(SELECT 1 mon_num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8
UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) t
LEFT JOIN Reports r ON t.mon_num = r.themonth
AND r.theYear = 1993
GROUP BY r.growth , Month
ORDER BY t.mon_num ASC
) as t2 on t1.Month >= t2.Month group by t1.Month;
The variables solution is as follows:
SET @num := 0;
select
Tgrowth,
@num := @num + Tgrowth as Emp_Count,
CASE
WHEN t1.Month = 1 THEN 'JAN'
WHEN t1.Month = 2 THEN 'FEB'
WHEN t1.Month = 3 THEN 'MAR'
WHEN t1.Month = 4 THEN 'APR'
WHEN t1.Month = 5 THEN 'MAY'
WHEN t1.Month = 6 THEN 'JUN'
WHEN t1.Month = 7 THEN 'JUL'
WHEN t1.Month = 8 THEN 'AUG'
WHEN t1.Month = 9 THEN 'SEP'
WHEN t1.Month = 10 THEN 'OCT'
WHEN t1.Month = 11 THEN 'NOV'
WHEN t1.Month = 12 THEN 'DEC'
END AS myMONTH
from (
SELECT
case
when r.growth is not null then r.growth
when r.growth is null then 0
END as Tgrowth,
t.mon_num AS Month
FROM
(SELECT 1 mon_num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8
UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) t
LEFT JOIN Reports r ON t.mon_num = r.themonth
AND r.theYear = 1993
GROUP BY r.growth , Month
ORDER BY t.mon_num ASC ) t1;

Since you are running MySQL 8.0, I would recommend a recursive query to generate the dates, and then window functions and aggregation.
If you want the whole 1993 year:
with recursive dates as (
select '1993-01-01' dt
union all
select dt + interval 1 month from dates where dt < '1993-12-01'
)
select
date_format(d.dt, '%b') mymonth,
coalesce(sum(started), 0) - coalesce(sum(ended), 0) growth,
sum(coalesce(sum(started), 0) - coalesce(sum(ended), 0)) over(order by d.dt) emp_count
from dates d
left join reports r on r.theDate >= d.dt and r.theDate < d.dt + interval 1 month
group by d.dt
order by d.dt
This assumes that theDate is stored as a date datatype and not a string (else, you would need to convert it first, using str_to_date()).
This also takes into account the possibility that the table may contain several rows for a given month. If that's not the case, then there is no need for aggregation:
with recursive dates as (
select '1993-01-01' dt
union all
select dt + interval 1 month from dates where dt < '1993-12-01'
)
select
date_format(d.dt, '%b') mymonth,
coalesce(started, 0) - coalesce(ended, 0) growth,
sum(coalesce(started, 0) - coalesce(ended, 0)) over(order by d.dt) emp_count
from dates d
left join reports r on r.theDate >= d.dt and r.theDate < d.dt + interval 1 month
order by d.dt
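
The same combination (a recursive CTE that generates the months, a left join, and a windowed running sum) can be tried locally. The sketch below is an approximation in Python with the standard-library sqlite3 module, not the MySQL query itself. It assumes SQLite 3.25 or later, generates month numbers 1 to 12 instead of dates, and uses two made-up rows for 1993 whose started/ended values are chosen only to reproduce the growth figures shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports ("
    "theYear INTEGER, theMonth INTEGER, started INTEGER, ended INTEGER)"
)
# Illustrative rows only: growth of 1 in January and 2 in March 1993.
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?, ?)",
    [(1993, 1, 1, 0), (1993, 3, 2, 0)],
)

QUERY = """
WITH RECURSIVE months(m) AS (
    -- Generate the month numbers 1..12.
    SELECT 1
    UNION ALL
    SELECT m + 1 FROM months WHERE m < 12
),
monthly AS (
    -- One row per month; months without data get a growth of 0.
    SELECT months.m AS m,
           COALESCE(SUM(r.started), 0) - COALESCE(SUM(r.ended), 0) AS growth
    FROM months
    LEFT JOIN reports r ON r.theMonth = months.m AND r.theYear = 1993
    GROUP BY months.m
)
-- Running total of growth in month order.
SELECT m, growth, SUM(growth) OVER (ORDER BY m) AS emp_count
FROM monthly
ORDER BY m
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The running total stays at 3 from March onward, matching the desired output in the question.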

Select zero when no results in date range

I have a query that I am using to pull back the total cost per month for the previous 6 months of data. The issue I need to solve is that when there are no records for a specific month, nothing is returned for it and only 5 months are shown.
I need to modify this query to always show the 6 months, even when there is no data for a specific month, but I am unsure how to accomplish this.
select sum(cost),
CASE
WHEN MONTH(collection_date) = 1 THEN 'January'
WHEN MONTH(collection_date) = 2 THEN 'February'
WHEN MONTH(collection_date) = 3 THEN 'March'
WHEN MONTH(collection_date) = 4 THEN 'April'
WHEN MONTH(collection_date) = 5 THEN 'May'
WHEN MONTH(collection_date) = 6 THEN 'June'
WHEN MONTH(collection_date) = 7 THEN 'July'
WHEN MONTH(collection_date) = 8 THEN 'August'
WHEN MONTH(collection_date) = 9 THEN 'September'
WHEN MONTH(collection_date) = 10 THEN 'October'
WHEN MONTH(collection_date) = 11 THEN 'November'
WHEN MONTH(collection_date) = 12 THEN 'December'
ELSE 'NULL'
END AS datemodified
from invoices
WHERE collection_date >= DATE_SUB(now(), INTERVAL 5 MONTH)
GROUP BY MONTH(collection_date)
ORDER BY collection_date asc;
Sample of the results with an empty month
COST Datemodified
300 September
200 November
200 December
Desired output
COST Datemodified
0 August
300 September
0 October
200 November
200 December

You can create fake month data and join your invoices table to it. Try this:
SELECT COALESCE(SUM(cost), 0) AS cost, months.name AS datemodified
FROM (SELECT 1 AS num, 'January' AS name
UNION SELECT 2, 'February'
UNION SELECT 3, 'March'
UNION SELECT 4, 'April'
UNION SELECT 5, 'May'
UNION SELECT 6, 'June'
UNION SELECT 7, 'July'
UNION SELECT 8, 'August'
UNION SELECT 9, 'September'
UNION SELECT 10, 'October'
UNION SELECT 11, 'November'
UNION SELECT 12, 'December') months
LEFT JOIN invoices
ON MONTH(invoices.collection_date) = months.num
AND invoices.collection_date >= DATE_SUB(NOW(), INTERVAL 5 MONTH)
GROUP BY months.num, months.name
ORDER BY months.num ASC;
However, that gives you all 12 months. To get only the last 6 months, you need to dynamically generate your fake month data:
SELECT COALESCE(SUM(cost), 0) AS cost,
CASE num WHEN 1 THEN 'January'
WHEN 2 THEN 'February'
WHEN 3 THEN 'March'
WHEN 4 THEN 'April'
WHEN 5 THEN 'May'
WHEN 6 THEN 'June'
WHEN 7 THEN 'July'
WHEN 8 THEN 'August'
WHEN 9 THEN 'September'
WHEN 10 THEN 'October'
WHEN 11 THEN 'November'
WHEN 12 THEN 'December'
END AS datemodified
-- ago = how many months back; used only to sort the rows chronologically
FROM (SELECT 0 AS ago, MONTH(NOW()) AS num
UNION SELECT 1, MONTH(DATE_SUB(NOW(), INTERVAL 1 MONTH))
UNION SELECT 2, MONTH(DATE_SUB(NOW(), INTERVAL 2 MONTH))
UNION SELECT 3, MONTH(DATE_SUB(NOW(), INTERVAL 3 MONTH))
UNION SELECT 4, MONTH(DATE_SUB(NOW(), INTERVAL 4 MONTH))
UNION SELECT 5, MONTH(DATE_SUB(NOW(), INTERVAL 5 MONTH))) months
LEFT JOIN invoices
ON MONTH(invoices.collection_date) = months.num
AND invoices.collection_date >= DATE_SUB(NOW(), INTERVAL 5 MONTH)
GROUP BY months.ago, months.num
ORDER BY months.ago DESC;
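
Generating the last six months and left-joining the invoices onto them can also be sketched with a fixed reference date, which makes the result reproducible. The example below is a hedged illustration in Python with the standard-library sqlite3 module, not MySQL. The function name `cost_last_six_months`, the `today` parameter, and the invoice rows are all assumptions made here, and months are keyed as 'YYYY-MM' so that a range crossing a year boundary still sorts correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (collection_date TEXT, cost INTEGER)")
# Made-up invoices; the March row lies outside the six-month window.
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [
        ("2014-03-01", 999),
        ("2014-09-10", 300),
        ("2014-11-05", 200),
        ("2014-12-20", 150),
        ("2014-12-21", 50),
    ],
)

QUERY = """
WITH RECURSIVE months(k, month_start) AS (
    -- k = 0 is the current month; each step goes one month further back.
    SELECT 0, date(:today, 'start of month')
    UNION ALL
    SELECT k + 1, date(month_start, '-1 month') FROM months WHERE k < 5
)
SELECT strftime('%Y-%m', months.month_start) AS ym,
       COALESCE(SUM(i.cost), 0) AS cost
FROM months
LEFT JOIN invoices i
       ON strftime('%Y-%m', i.collection_date)
        = strftime('%Y-%m', months.month_start)
GROUP BY months.month_start
ORDER BY months.month_start
"""


def cost_last_six_months(today):
    """Return (YYYY-MM, total cost) for the six months ending at `today`."""
    return conn.execute(QUERY, {"today": today}).fetchall()


rows = cost_last_six_months("2015-01-15")
for row in rows:
    print(row)
```

Because the filter lives in the join condition and not in a WHERE clause, months without invoices survive the left join and show a cost of 0.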

get average value depending on month?

My table:
rating date
4 12/02/2013
3 12/02/2013
2.5 12/01/2013
3 12/01/2013
4.5 21/11/2012
5 10/11/2012
If I give 3 as input, the average rating for each of the last three months (02, 01, 12) should be returned.
I tried by using GROUP BY but I get this result:
rating month
3.5 02
2.75 01
There is no rating for the 12th month, so there is no output for it.
My desired result:
rating month
3.5 02
2.75 01
0 12

The problem is that you want to return months that do not exist. If you do not have a calendar table with dates, then you will want to use something like the following:
select d.mth Month,
coalesce(avg(t.rating), 0) Rating
from
(
select 1 mth union all
select 2 mth union all
select 3 mth union all
select 4 mth union all
select 5 mth union all
select 6 mth union all
select 7 mth union all
select 8 mth union all
select 9 mth union all
select 10 mth union all
select 11 mth union all
select 12 mth
) d
left join yourtable t
on d.mth = month(t.date)
where d.mth in (1, 2, 12)
group by d.mth

SELECT coalesce(avg(rating), 0.0) avg_rating, req_month
FROM yourTable
RIGHT JOIN
(SELECT month(now()) AS req_month
UNION
SELECT month(now() - INTERVAL 1 MONTH) AS req_month
UNION
SELECT month(now() - INTERVAL 2 MONTH) AS req_month) tmpView
ON month(yourTable.date) = tmpView.req_month
WHERE yourTable.date > ( (curdate() - INTERVAL day(curdate()) - 1 DAY) - INTERVAL 2 MONTH)
OR yourTable.date IS NULL
GROUP BY tmpView.req_month;

date : group by month from day 2 to day 1 next month

I have a query like this:
SELECT EXTRACT(MONTH FROM d.mydate) AS synmonth, SUM(apcp) AS apcptot
FROM t_synop_data2 d
WHERE d.mydate
BETWEEN '2011-01-01' AND '2011-12-31'
AND d.idx_synop = '06712'
GROUP BY synmonth
This query adds up all rain (apcp) in a month, like this:
1 32.8 => from 2011.01.01 to 2011.01.31
2 27.2 => from 2011.02.01 to 2011.02.28
3 21.0
4 21.8
5 88.5
6 131.4
7 118.6
8 57.1
9 80.9
10 84.6
11 1.1
12 143.5 => from 2011.12.01 to 2011.12.31
That's what I want, but with a little difference.
The difference is that I have to add up apcp from day 2 of the month to day 1 of the next month and then return a result like the one above.
1 132.8 => from 2011.01.02 to 2011.02.01
2 27.2 => from 2011.02.02 to 2011.03.01
3 21.0
4 21.8
5 88.5
6 131.4
7 118.6
8 57.1
9 80.9
10 84.6
11 1.1
12 143.5 => from 2011.12.02 to 2012.01.01
I tried something with add_date(), extract() or date_format() but without result.
Thank you for your answer
Vince

Here is the query :
SELECT EXTRACT(MONTH FROM ADDDATE(d.mydate,-1) ) AS synmonth
, SUM(apcp) AS apcptot
FROM t_synop_data2 AS d
WHERE ADDDATE(d.mydate,-1) BETWEEN '2011-01-01' AND '2011-12-31'
AND d.idx_synop = '06712'
GROUP BY synmonth
You can check the result by adding two columns like this:
SELECT EXTRACT(MONTH FROM ADDDATE(d.mydate,-1) ) AS synmonth
, SUM(apcp) AS apcptot
, MIN(d.mydate) AS date_min
, MAX(d.mydate) AS date_max
FROM t_synop_data2 AS d
WHERE ADDDATE(d.mydate,-1) BETWEEN '2011-01-01' AND '2011-12-31'
AND d.idx_synop = '06712'
GROUP BY synmonth

You can group by EXTRACT(MONTH FROM d.mydate - INTERVAL 1 DAY), shifting the date range by one day as well:
SELECT EXTRACT(MONTH FROM d.mydate - INTERVAL 1 DAY) AS synmonth, SUM(apcp) AS apcptot
FROM t_synop_data2 d
WHERE d.mydate
BETWEEN '2011-01-02' AND '2012-01-01'
AND d.idx_synop = '06712'
GROUP BY synmonth
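
The effect of shifting each date back by one day before taking its month can be seen on a handful of rows. This is a small sketch in Python with the standard-library sqlite3 module as a stand-in for MySQL: date(mydate, '-1 day') plays the role of mydate - INTERVAL 1 DAY, and the five rows are invented to sit exactly on the month boundaries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_synop_data2 (mydate TEXT, apcp REAL)")
# Invented boundary rows: the 1st of a month should count toward the previous month.
conn.executemany(
    "INSERT INTO t_synop_data2 VALUES (?, ?)",
    [
        ("2011-01-01", 5.0),  # shifts to 2010-12-31, outside the 2011 range
        ("2011-01-02", 1.0),  # shifts to 2011-01-01, month 1
        ("2011-02-01", 2.0),  # shifts to 2011-01-31, month 1
        ("2011-02-02", 4.0),  # shifts to 2011-02-01, month 2
        ("2012-01-01", 8.0),  # shifts to 2011-12-31, month 12
    ],
)

QUERY = """
SELECT CAST(strftime('%m', date(mydate, '-1 day')) AS INTEGER) AS synmonth,
       SUM(apcp) AS apcptot
FROM t_synop_data2
WHERE date(mydate, '-1 day') BETWEEN '2011-01-01' AND '2011-12-31'
GROUP BY synmonth
ORDER BY synmonth
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Both the grouping key and the range filter use the shifted date, so 2011-02-01 is counted in month 1 and 2012-01-01 in month 12.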
