MySQL - how do I force AVG function to include 0's - mysql

set #cumulativeSum := 0;
SELECT
z.hour+1 as time,
(#cumulativeSum := #cumulativeSum + z.enquiries) as target
FROM
( SELECT
y.hour as hour,
IFNULL(ROUND(AVG(y.enquiries), 0), 0) as enquiries
FROM
( SELECT DAY(o.date_created),
HOUR(o.date_created) as hour,
COUNT(DISTINCT o.phone) as enquiries
FROM orders o
WHERE phone IS NOT NULL
AND name NOT LIKE 'test%'
AND o.email NOT LIKE 'jawor%'
AND o.email NOT LIKE 'test%'
AND o.email NOT LIKE '%moneyshake%'
AND o.date_created < CURRENT_DATE()
AND o.date_created >= DATE(NOW()) + INTERVAL -7 DAY
GROUP BY HOUR(o.date_created), DAY(o.date_created) ) y
GROUP BY hour ) z
This query is meant to give an average number of enquiries by hour for the last 7 days. I've done it this way to exclude duplicates of o.phone only within each hour of each day, rather than across all days or all hours.
It outputs:
time
target
1
1
2
2
3
3
5
4
etc..
etc..
I want it to include a row for 4am, even if the value for target doesn't change (because the AVG for 4am is 0)
Please let me know if more info is needed!

Credit to Akina's comment for solving

Related

calculating variance with mysql week over week

I need to display week over week difference with mysql in Week Over Week Users column. My data looks like the following:
Date Users Week Over Week Users
06-01-2019 10 10
06-08-2019 15 15
06-15-2019 5 5
Currently, Week Over Week Users only reflects the data that I have in Users column. The desired output would be:
Date Users Week Over Week Users
06-01-2019 10 10
06-08-2019 15 5
06-15-2019 5 -10
Basically if on the second week the number of users grew up to 15 users, then I need to display 5 (as in +5 users since last week, so new week Users - last week Users would be the formula)
Here is my code:
(
SUM(
CASE
WHEN WEEK(`Date`) = WEEK(CURRENT_DATE()) THEN `Users`
ELSE 0
END
) - SUM(
CASE
WHEN WEEK(`Date`) = WEEK(CURRENT_DATE()) - 1 THEN `Users`
ELSE 0
END
)
)
But it doesn't work as it duplicates the Users column.
You want lag():
select t.*,
(users - lag(users, 1, 0) over (order by date)) as week_over_week
from t;
If you are running MySQL 5.x, where window functions such as lag() are not available, you can use a correlated subquery to get the "previous" value:
select
t.date,
t.users,
t.users - coalesce(
(
select t1.users
from mytable t1
where t1.date < t.date
order by t1.date desc
limit 1
),
0
) week_over_week_users
from mytable t

MySQL get number of records in each day between two given dates

I'm writing this query where it gets a row value and it will return the number of records for each day for that row between two given dates and returns 0 if there is no records for that day.
I've written a query which does this for the past week.
Current Query:
select d.day, count(e.event) as count
from (
select 0 day union all
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6
) d
left join event e
on e.timestamp >= current_date - interval d.day day
and e.timestamp < current_date - interval (d.day - 1) day
and e.event = ?
group by d.day
The problem is this returns only the results for a fixed number of days.. I want to be able to give it two dates (start and end dates) and get the record counts for each day where I don't know the number of dates in between.
You could use/create a bona-fide calendar table. Something like this:
SELECT
d.day,
COUNT(e.timestamp) AS cnt
FROM
(
SELECT '2020-01-01' AS day UNION ALL
SELECT '2020-01-02' UNION ALL
...
SELECT '2020-12-31'
) d
LEFT JOIN event e
ON e.timestamp >= d.day AND e.timestamp < DATE_ADD(d.day, INTERVAL 1 DAY)
WHERE
d.day BETWEEN <start_date> AND <end_date>
GROUP BY
d.day;
I have covered only the calendar year 2020, but you may extend to cover whatever range you want.

Aggregating table data in MySQL, is there an easier way to do this?

I'm trying to write a query that aggregates data from a table.
Essentially I have a long list of devices that have been inventoried and eventually installed over the last couple of years.
I want to find the average amount of time between when the device was received and when it was installed, and then have that data sorted by the month the device was installed. BUT in each month's row, I also want to include the data from the previous months.
So essentially what I want to see is: (sorry for terrible formatting)
MonthInstalled | TimeToInstall | Total#Devices
-----------------+---------------+----------------------------
Jan | 10 Days | 5
Feb(=Jan+Feb) | 15 Days | 18 (5 in Jan + 13 in Feb)
Mar(=Jan+Feb+Mar)| 13 Days | 25 (5 + 13 + 7)
...
The query I currently have written looks like this:
INSERT INTO DevicesInstall
SELECT ROUND(AVG(DATEDIFF(dvc.dt_install , dvc.dt_receive)), 1) AS 'Install',
COUNT(dvc.dvc_model) AS 'Total Devices',
MAX(dvc.dt_install) AS 'Date',
loc.loc_campus AS 'Campus'
FROM dvc_info dvc, location loc
WHERE dvc.dvc_loc_bin = loc.loc_bin
AND dvc.dt_install < '20160201'
;
Although this is functional, I have to iterate this for each month manually, so it is not scale-able. Is there a way to condense this at all?
We can return the dates using an inline view (derived table), and then join to the dvc_info table, so we can get the "cumulative" results.
To get the results for:
Jan
Jan+Feb
Jan+Feb+Mar
We need to return three copies of the rows for Jan, and two copies of the rows for Feb, and then collapse the those rows into an appropriate group.
The loc_campus is being included in the SELECT list... not clear why that is needed. If we want results "by campus", then we need to include that expression in the GROUP BY clause. Otherwise, the value returned for that non-aggregate is indeterminate... we will get a value for some row "in the group", but it could be any row.
Something like this:
SELECT d.dt AS `before_date`
, loc.loc_campus AS `Campus`
, ROUND(AVG(DATEDIFF(dvc.dt_install,dvc.dt_receive)),1) AS `Install`
, COUNT(dvc.dvc_model) AS `Total Devices`
, MAX(dvc.dt_install) AS `latest_dt_install`
FROM ( SELECT '2016-01-01' + INTERVAL 1 MONTH AS dt
UNION ALL SELECT '2016-01-01' + INTERVAL 2 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 3 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 4 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 5 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 6 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 7 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 8 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 9 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 10 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 11 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 12 MONTH
) d
CROSS
JOIN location loc
LEFT
JOIN dvc_info dvc
ON dvc.dvc_loc_bin = loc.loc_bin
AND dvc.dt_install < d.dt
GROUP
BY d.dt
, loc.loc_campus
ORDER
BY d.dt
, loc.loc_campus
Note that the value returned for d.dt will be the "up until" date. We're going to get '2016-02-01' returned for the January results. If we want to return a value of January date, we can use an expression in the SELECT list...
SELECT DATE_FORMAT(d.dt + INTERVAL -1 MONTH,'%Y-%m') AS `month`
Lots of options on query alternatives.
But it looks like the "big hump" is that to get cumulative results, we need to return multiple copies of the dvc_info rows, so the rows can be collapsed into each "grouping".
I recommend working on just the SELECT first. And get that tested working, before monkeying around to turn it into an INSERT ... SELECT.
FOLLOWUP
We can use any query as an inline view (derived table d) that returns a set of dates we want.
e.g.
FROM ( SELECT DATE_FORMAT(m.install_dt,'%Y-%m-01') + INTERVAL 1 MONTH AS dt
FROM dvc_install m
WHERE m.install_dt >= '2016-01-01'
GROUP BY DATE_FORMAT(m.install_dt,'%Y-%m-01') + INTERVAL 1 MONTH
) d
Note that with this approach, if there are no install_dt in February, we won't get back a row for February. Using the static UNION ALL SELECT approach allows us to get back "zero" counts, i.e. to return rows for months where there isn't an install_dt in that month. (But that's the answer to a different question... how do I get back a "zero" count for February when there aren't any rows for Februrary?)
Alternatively, if we have a calendar table e.g. cal that contains a list of the dates we want, we could just reference the table in place of the inline view, or the inline view query could get rows from that.
FROM ( SELECT cal.dt
FROM cal cal
WHERE cal.dt >= '2016-01-01'
AND cal.dt <= NOW()
AND DATE_FORMAT(cal.dt,'%d') = '01'
) d

Find number of "active" rows each month for multiple months in one query

I have a mySQL database with each row containing an activate and a deactivate date. This refers to the period of time when the object the row represents was active.
activate deactivate id
2015-03-01 2015-05-10 1
2013-02-04 2014-08-23 2
I want to find the number of rows that were active at any time during each month. Ex.
Jan: 4
Feb: 2
Mar: 1
etc...
I figured out how to do this for a single month, but I'm struggling with how to do it for all 12 months in a year in a single query. The reason I would like it in a single query is for performance, as information is used immediately and caching wouldn't make sense in this scenario. Here's the code I have for a month at a time. It checks if the activate date comes before the end of the month in question and that the deactivate date was not before the beginning of the period in question.
SELECT * from tblName WHERE activate <= DATE_SUB(NOW(), INTERVAL 1 MONTH)
AND deactivate >= DATE_SUB(NOW(), INTERVAL 2 MONTH)
If anybody has any idea how to change this and do grouping such that I can do this for an indefinite number of months I'd appreciate it. I'm at a loss as to how to group.
If you have a table of months that you care about, you can do:
select m.*,
(select count(*)
from table t
where t.activate_date <= m.month_end and
t.deactivate_date >= m.month_start
) as Actives
from months m;
If you don't have such a table handy, you can create one on the fly:
select m.*,
(select count(*)
from table t
where t.activate_date <= m.month_end and
t.deactivate_date >= m.month_start
) as Actives
from (select date('2015-01-01') as month_start, date('2015-01-31') as month_end union all
select date('2015-02-01') as month_start, date('2015-02-28') as month_end union all
select date('2015-03-01') as month_start, date('2015-03-31') as month_end union all
select date('2015-04-01') as month_start, date('2015-04-30') as month_end
) m;
EDIT:
A potentially faster way is to calculate a cumulative sum of activations and deactivations and then take the maximum per month:
select year(date), month(date), max(cumes)
from (select d, (#s := #s + inc) as cumes
from (select activate_date as d, 1 as inc from table t union all
select deactivate_date, -1 as inc from table t
) t cross join
(select #s := 0) param
order by d
) s
group by year(date), month(date);

MySQL group by week

I have a large number of records with a transaction datetime field going back several years. I would like to do a comparative analysis between the same timespan this year and last. How can I group by week over a 3 month range?
I'm running into problems using the YEARWEEK and WEEK functions because of the day the year 2012 starts of versus the day 2011 starts on.
Given that I have records with datetimes everyday from Jan 1st to the current day, and records with the same datetimes from the prior year, how can I group by week so the output is sums with dates like: 01/01/2011, 01/08/2011, 01/15/2011, etc., and 01/01/2012, 01/08/2012, 01/15/2012, etc.?
My query so far is as follows:
SELECT
DATE_FORMAT(A.transaction_date, '%Y-%m-%d') as date,
ROUND(sum(A.quantity), 3) AS quantity,
ROUND(sum(A.total_amount), 3) AS amount,
A.product_code,
D.fuel_type_code,
D.fuel_type_name,
C.customer_code,
C.customer_name
FROM
cl_transactions AS A
INNER JOIN
card AS B ON A.card_number=B.card_number
INNER JOIN
customer AS C ON B.customer_code=C.customer_code
INNER JOIN
fuel_type AS D ON A.fuel_type=D.fuel_type_code
WHERE
((A.transaction_date >= DATE_FORMAT(NOW() - INTERVAL 3 MONTH, '%Y-%m-01')) OR (A.transaction_date - INTERVAL 1 YEAR >= DATE_FORMAT(NOW() - INTERVAL 15 MONTH, '%Y-%m-01') AND A.transaction_date <= NOW() - INTERVAL 1 YEAR))
GROUP BY
A.transaction_date, fuel_type_code;
I would essentially like something that achieves the following pseudo-query:
GROUP BY
STARTING FROM THE OLDEST DATE (A.transaction_date + INTERVAL 6 DAY)
I started with an inner query using sqlvariables to build out from/to ranges for this year and last year of each respective start of year/month/day (ex: 2012-01-01 and 2011-01-01 respectively). From that, I'm also pre-formatting the date for final output so you have ONE master date basis for display reflecting that of whatever the "this year" week would be.
From that, I do a join to the transaction table where the transaction date is BETWEEN the respective start of current week and start of next week. Since date/time stamps include hour minute, 2012-01-01 by itself is implied as 12:00:00am (midnight) of the day. and between will go UP TO 7 days later 12:00:00 am. And that date will become the start date of the following week.
So, by joining on the date being between EITHER last yr or this yr time period, its the same group qualification. So the field selection does a ROUND( SUM( IF() )) per respective last year or this year. if the incoming transaction date is LESS than the current year's week start, then it must be a record from prior year, otherwise its for the current year. So, respectively, add the value itself, or zero as it applies.
So now, you have the group by. The week that it qualified for was already prepared from the inner query via "ThisYearWeekOf" formatted column, regardless of the otherwise computed "YEARWEEK()" or "WEEK()". The date ranges took care of that qualification for us.
Finally, I added the fuel-type as a join and included that as the group by. You have to group by all non-aggregate columns for proper SQL, although MySQL lets you get by by just grabbing the first entry for the given group if it is NOT so specified in group by.
To close, I DID include the information for the customer as you didn't have it in the group by and did not appear to be applicable... it would just arbitrarily grab one. However, I've added it to the group by, so now your records will show at the per customer level, per product and fuel type, how much sales and quantity between this year and last.
SELECT
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.Quantity, 0 )), 3) as LastYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.total_amount, 0 )), 3) as LastYrAmt,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.Quantity )), 3) as ThisYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.total_amount )), 3) as ThisYrAmt,
FROM
( SELECT
DATE_FORMAT(#ThisYearDate, '%Y-%m-%d') as ThisYearWeekOf,
#LastYearDate as LastYrWeekStart,
#ThisYearDate as ThisYrWeekStart,
#LastYearDate := date_add( #LastYearDate, interval 7 day ) LastYrStartOfNextWeek,
#ThisYearDate := date_add( #ThisYearDate, interval 7 day ) ThisYrStartOfNextWeek
FROM
(select #ThisYearDate := '2012-01-01',
#LastYearDate := '2011-01-01' ) sqlvars,
cl_transactions justForLimit
HAVING
ThisYrWeekStart < '2012-04-01'
LIMIT 15 ) JustWeekRange
JOIN cl_transactions AS CTrans
ON CTrans.transaction_date BETWEEN
JustWeekRange.LastYrWeekStart AND JustWeekRange.LastYrStartOfNextWeek
OR CTrans.transaction_date BETWEEN
JustWeekRange.ThisYrWeekStart AND JustWeekRange.ThisYrStartOfNextWeek
JOIN fuel_type FT
ON CTrans.fuel_type = FT.fuel_type_code
JOIN card
ON CTrans.card_number = card.card_number
JOIN customer AS C
ON card.customer_code = C.customer_code
GROUP BY
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name