How to add value for every date in MySQL query - mysql

I have the following SQL query:
SELECT DATE(time), ROUND(AVG(out_temp),2)
FROM data_table
WHERE id= 1 AND time BETWEEN '2012-08-18' AND '2012-08-30'
GROUP BY DATE(time)
ORDER BY time ASC
This returns:
date avg_temp
2012-08-18 11.41
2012-08-19 5.90
2012-08-28 11.22
2012-08-29 10.07
Everything works well so far... but I would like to add missing dates with constant value like this:
date avg_temp
2012-08-18 11.41
2012-08-19 5.90
2012-08-20 <value>
... ...
2012-08-27 <value>
2012-08-28 11.22
2012-08-29 10.07
How should I modify my query? Could somebody help me with this problem? I read some posts about creating a separate calendar table with prefilled date values, but I still didn't get it to work.

If your data table actually has at least one row (for any id) on every date, you can do this:
SELECT dates.thedate, COALESCE(ROUND(AVG(dt.out_temp),2), <value>) AS avg_temp
FROM (SELECT DISTINCT DATE(time) AS thedate
FROM data_table
) dates LEFT OUTER JOIN
data_table dt
ON DATE(dt.time) = dates.thedate AND dt.id = 1 -- filter in ON so unmatched dates survive
WHERE dates.thedate BETWEEN '2012-08-18' AND '2012-08-30'
GROUP BY dates.thedate
ORDER BY dates.thedate ASC
What you need is a driving table to generate the dates that you need. You can then left join to this table, to get the summaries you want. The COALESCE function lets you put in your default value.

Create a table with all dates you need, and then do a LEFT JOIN. E.g.
CREATE TABLE calendar ( day DATE PRIMARY KEY );
Then insert into the table, probably with a loop in your programming language (pseudocode):
for day in day_range( start_date, end_date ):
query( "INSERT INTO calendar VALUES ( '" + day + "' );" );
And then do a LEFT JOIN from the calendar, so every day is kept, and use COALESCE for the default value:
SELECT calendar.day, COALESCE(ROUND(AVG(out_temp),2), <value>)
FROM calendar LEFT JOIN data_table ON DATE(data_table.time) = calendar.day AND data_table.id = 1
WHERE calendar.day BETWEEN '2012-08-18' AND '2012-08-30'
GROUP BY calendar.day
ORDER BY calendar.day ASC
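The calendar-table approach can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs without a server; SQLite's DATE() behaves like MySQL's for this purpose). The function name daily_averages and the sample rows are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

def daily_averages(rows, start, end, default):
    """rows: iterable of (id, 'YYYY-MM-DD HH:MM:SS', out_temp) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data_table (id INTEGER, time TEXT, out_temp REAL)")
    conn.execute("CREATE TABLE calendar (day TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO data_table VALUES (?, ?, ?)", rows)

    # Fill the calendar with one row per day in the inclusive range.
    days = [(start + timedelta(days=i)).isoformat()
            for i in range((end - start).days + 1)]
    conn.executemany("INSERT INTO calendar VALUES (?)", [(d,) for d in days])

    # The calendar drives the join; the id filter sits in ON so that
    # days without a match are kept and receive the default value.
    query = """
        SELECT c.day, COALESCE(ROUND(AVG(d.out_temp), 2), ?)
        FROM calendar c
        LEFT JOIN data_table d ON DATE(d.time) = c.day AND d.id = 1
        GROUP BY c.day
        ORDER BY c.day
    """
    result = conn.execute(query, (default,)).fetchall()
    conn.close()
    return result

sample = [
    (1, "2012-08-18 08:00:00", 11.0),
    (1, "2012-08-18 20:00:00", 12.0),
    (1, "2012-08-20 12:00:00", 5.5),
    (2, "2012-08-19 12:00:00", 99.0),  # other id, must not count
]
for row in daily_averages(sample, date(2012, 8, 18), date(2012, 8, 21), -1.0):
    print(row)
```

Putting the id filter in WHERE instead of ON would discard the calendar rows that have no match, which is why the gaps disappear in the original queries.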

Related

How do I select SQL data in buckets when data doesn't exist for one bucket?

I'm trying to get a complete set of buckets for a given dataset, even if no records exist for some buckets.
For example, I want to display totals by day of week, with zero total for days with no records.
SELECT
WEEKDAY(transaction_date) AS day_of_week,
SUM(sales) AS total_sales
FROM table1
GROUP BY day_of_week
If I have sales every day, I'll get 7 rows in my result representing total sales on days 0-6.
If I don't have sales on Day 2, I get no result for Day 2.
What's the most efficient way to force a zero value for day 2?
Should I join to a temporary table or array of defined buckets? ['0','1','2','3','4','5','6']
Or is it better to insert zeros outside of MySQL, after I've done the query?
I am using MySQL, but this is a general SQL question.
In MySQL, you could simply use a derived table of the numbers 0 to 6 (the values WEEKDAY() returns, where 0 is Monday), left join it with the table, then aggregate. COALESCE turns the NULL sum of an empty bucket into 0:
select d.day_of_week, coalesce(sum(t1.sales), 0) AS total_sales
from (
select 0 day_of_week union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
) d
left join table1 t1 on weekday(t1.transaction_date) = d.day_of_week
group by d.day_of_week
MySQL 8.0.19 and later have the values row(...) syntax, which shortens the query:
select d.day_of_week, coalesce(sum(t1.sales), 0) AS total_sales
from (values row(0), row(1), row(2), row(3), row(4), row(5), row(6)) d(day_of_week)
left join table1 t1 on weekday(t1.transaction_date) = d.day_of_week
group by d.day_of_week
Basically you want the answer to be 0 when the sum for a bucket is NULL. Aggregate functions cannot be nested, and MAX would not turn a NULL into 0 anyway, but COALESCE does exactly that. Note that it only helps once a row for the empty bucket exists, for example through a left join from a table of buckets:
COALESCE(SUM(sales), 0)
as suggested by this answer
First off, you need a calendar table; something like this or this. Or create a calendar subset on the fly. I am not sure of the MySQL syntax, but here is what it would look like in SQL Server.
DECLARE
@FromDate DATE
, @ToDate DATE
-- set these variables to appropriate values
SET @FromDate = '2020-03-01';
SET @ToDate = '2020-03-31';
;WITH cteCalendar (MyDate) AS
(
SELECT CONVERT(DATE, @FromDate) AS MyDate
UNION ALL
SELECT DATEADD(DAY, 1, MyDate)
FROM cteCalendar
WHERE DATEADD(DAY, 1, MyDate) <= @ToDate
)
SELECT DATEPART(WEEKDAY, cte.MyDate) AS day_of_week, -- 1-7; numbering depends on SET DATEFIRST
COALESCE(SUM(t1.sales), 0) AS total_sales
FROM cteCalendar cte
LEFT JOIN table1 t1 ON cte.MyDate = t1.transaction_date
GROUP BY DATEPART(WEEKDAY, cte.MyDate)
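The question also asks whether the zeros could be inserted outside MySQL, after the query has run. A minimal sketch of that option, assuming the query returns (day_of_week, total_sales) pairs with WEEKDAY() numbering 0 to 6; the function name fill_weekday_buckets is hypothetical:

```python
def fill_weekday_buckets(rows):
    """Return one (day_of_week, total) pair for every day 0..6.

    rows: (day_of_week, total_sales) pairs as returned by the GROUP BY
    query; days that are missing or have a NULL total become 0.
    """
    totals = {day: 0 for day in range(7)}
    for day, total in rows:
        totals[day] = total if total is not None else 0
    return sorted(totals.items())

# Day 2 has no row at all, day 3 came back as NULL (None).
print(fill_weekday_buckets([(0, 10), (1, 5), (3, None)]))
```

This keeps the SQL simple, at the cost of every consumer of the query having to repeat the fill step.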

SUM subquery with condition depends on parent query columns returns NULL

Hi everyone!
I'm trying to calculate the sum of the prices of deals for each day. What I do:
SET @symbols_set = "A,B,C,D";
DROP TABLE IF EXISTS temp_deals;
CREATE TABLE temp_deals AS SELECT Deal, TimeMsc, Price, VolumeExt, Symbol FROM deals WHERE TimeMsc >= "2019-04-01" AND TimeMsc <= "2019-06-30" AND FIND_IN_SET(Symbol, @symbols_set) > 0;
SELECT
DATE_FORMAT(TimeMsc, "%d/%m/%Y") AS Date,
Symbol,
(SELECT SUM(Price) FROM temp_deals dap WHERE dap.TimeMsc BETWEEN Date AND Date + INTERVAL 1 DAY AND dap.Symbol = Symbol) AS AvgPrice
FROM temp_deals
ORDER BY Date;
DROP TABLE IF EXISTS temp_deals;
But in the result I get NULL in the AvgPrice column. I can't understand what I'm doing wrong.
It looks like I can't pass the parent query's column to the subquery, am I right?
Qualify your column names. But mostly, don't use a string for comparing dates:
SELECT DATE_FORMAT(d.TimeMsc, '%d/%m/%Y') AS Date,
d.Symbol,
(SELECT SUM(dap.Price)
FROM temp_deals dap
WHERE dap.TimeMsc >= d.TimeMsc AND
dap.TimeMsc < d.TimeMsc + INTERVAL 2 DAY AND -- not sure if you want 1 day or 2 day
dap.Symbol = d.Symbol
) AS AvgPrice
FROM temp_deals d
ORDER BY d.TimeMsc;
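To see the effect of qualifying every column and comparing real date values, here is a sketch of a per-day variant. It uses Python's sqlite3 module in place of MySQL (an assumption for runnability), truncates each row's timestamp to its calendar day, and sums over a half-open one-day window. The function name daily_price_sums, the total_price alias, and the sample rows are hypothetical.

```python
import sqlite3

def daily_price_sums(deals):
    """deals: iterable of ('YYYY-MM-DD HH:MM:SS', price, symbol) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp_deals (TimeMsc TEXT, Price REAL, Symbol TEXT)")
    conn.executemany("INSERT INTO temp_deals VALUES (?, ?, ?)", deals)

    # Outer alias d, inner alias dap: every column is qualified, so the
    # subquery compares against the outer row instead of against itself.
    query = """
        SELECT DISTINCT DATE(d.TimeMsc) AS day, d.Symbol,
               (SELECT SUM(dap.Price)
                FROM temp_deals dap
                WHERE dap.TimeMsc >= DATE(d.TimeMsc)
                  AND dap.TimeMsc <  DATE(d.TimeMsc, '+1 day')
                  AND dap.Symbol = d.Symbol) AS total_price
        FROM temp_deals d
        ORDER BY day, d.Symbol
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

sample_deals = [
    ("2019-04-01 10:00:00", 10.0, "A"),
    ("2019-04-01 15:30:00", 20.0, "A"),
    ("2019-04-02 09:00:00", 5.0, "A"),
    ("2019-04-01 11:00:00", 7.0, "B"),
]
for row in daily_price_sums(sample_deals):
    print(row)
```

In MySQL the window's upper bound would be written as DATE(d.TimeMsc) + INTERVAL 1 DAY; the SQLite modifier '+1 day' plays the same role here.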

SQL: Count/ sum columns within a specific date range

I have a table that roughly looks like this. There are thousands of rows.
booking_date checkin_date ...some other columns .... booking_value
22-mar-2016 29-mar-2016 ........................... $150
01-apr-2016 17-may-2016 ........................... $500
09-apr-2016 09-apr-2016 ........................... $222
17-apr-2016 23-apr-2016 ........................... $75
19-apr-2016 31-july-2016 ........................... $690
03-May-2016 07-May-2016 ............................. $301
.
.
.
.
I am trying to calculate the number of bookings per day and the value of bookings per day in April 2016. The second part is to calculate for how many bookings the booking_date and checkin_date were the same.
I am very new to SQL. I can formulate the logic on paper, but can't seem to figure out how to proceed with the code.
I recommend the following query:
SELECT CAST(booking_date AS DATE), COUNT(*) as Number_of_Booking,
SUM(CAST(booking_date AS DATE) = CAST(checkin_date AS DATE)) as count_with_same_date,
SUM(booking_value) as booking_value
FROM t
WHERE booking_date >= '2016-04-01' AND
booking_date < '2016-05-01'
GROUP BY CAST(booking_date AS DATE);
In particular, note the filtering on the dates. The direct comparisons allow MySQL to use an index.
The calculation of the number on the same date uses a nice feature of MySQL where boolean values are treated as numbers in a numeric context.
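The boolean-as-number trick and the half-open date range can be checked with a small sketch. It runs against Python's sqlite3 module rather than MySQL (an assumption; SQLite also evaluates a comparison to 0 or 1), stores the dates as ISO strings, and uses rows modelled on the question's sample data. The function name april_booking_stats is hypothetical.

```python
import sqlite3

def april_booking_stats(bookings):
    """bookings: iterable of (booking_date, checkin_date, booking_value),
    with both dates given as 'YYYY-MM-DD' strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (booking_date TEXT, checkin_date TEXT, booking_value REAL)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", bookings)

    # booking_date = checkin_date evaluates to 1 or 0, so SUM counts matches.
    # The range is half-open: from April 1 up to, but excluding, May 1.
    query = """
        SELECT booking_date,
               COUNT(*) AS number_of_bookings,
               SUM(booking_date = checkin_date) AS count_with_same_date,
               SUM(booking_value) AS booking_value
        FROM t
        WHERE booking_date >= '2016-04-01' AND booking_date < '2016-05-01'
        GROUP BY booking_date
        ORDER BY booking_date
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

sample_bookings = [
    ("2016-03-22", "2016-03-29", 150),
    ("2016-04-01", "2016-05-17", 500),
    ("2016-04-09", "2016-04-09", 222),
    ("2016-04-17", "2016-04-23", 75),
    ("2016-04-19", "2016-07-31", 690),
    ("2016-05-03", "2016-05-07", 301),
]
for row in april_booking_stats(sample_bookings):
    print(row)
```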
You can try the code below:
SELECT CAST(booking_date AS DATE),
COUNT(*) Number_of_Booking,
COUNT(
CASE
WHEN CAST(booking_date AS DATE)
= CAST(checkin_date AS DATE) THEN 1
ELSE NULL
END
) count_with_same_date,
SUM(booking_value) booking_value -- Booking value has to be Number field
FROM your_table
WHERE YEAR(booking_date ) = 2016
AND MONTH(booking_date ) = 4
GROUP BY CAST(booking_date AS DATE)
For the first question you can try
Select booking_date
,count(*) as Number_of_bookings
,Sum(booking_value) as value
From table_name
Where booking_date >= '2016-04-01' and booking_date < '2016-05-01'
group by booking_date;
Or you can use month() and year() function in filter.
For the second question try,
Select booking_date
,checkin_date
,count(*)
from table_name
where booking_date=checkin_date
group by booking_date, checkin_date

Selecting rows that are within a date range based on a date from another table. (MYSQL)

I have two tables, which share a key that link the two. Table A has a date column (of the format MM/DD/YYYY), and table B has a date field of the format (YYYY-MM-DD HH:MM:SS).
What I need to do is select all those in table B, that have a key matching table A AND a date field within 30 days of the date field found in table A.
Edit: Both variables are varchars, here is what I currently have (error from using alias formattedEffective in a join). I think the below would work, if I could use aliases in that way.
select *,
DATE_FORMAT(STR_TO_DATE(`Eff_date`, '%m/%d/%Y'), '%Y-%m-%d') as formattedEffective
from `customers`
right join `dispatch` on `customers`.`Member_no` = `dispatch`.`Member_no`
AND `dispatch`.`sortdate` > formattedEffective
AND `dispatch`.`sortdate` < DATE_ADD(formattedEffective,INTERVAL 30 DAY)
What the community is asking for is the ability to create a scenario to give a definitive answer to your question (create table statements, sample data, etc..). The approach below is speculation.
The assumption the query makes is eff_date is a string and sortdate is stored as a MySQL date (i.e., date, datetime, timestamp).
select d.*,
str_to_date(c.eff_date, '%m/%d/%Y') as formattedEffective
from customers c
join dispatch d on ( d.member_no = c.member_no
and d.sortdate between str_to_date(c.eff_date, '%m/%d/%Y')
and str_to_date(c.eff_date, '%m/%d/%Y') + interval 30 day );
In case the above query does not work, move the matching range to the WHERE condition.
select d.*,
str_to_date(c.eff_date, '%m/%d/%Y') as formattedEffective
from customers c
join dispatch d on ( d.member_no = c.member_no )
where d.sortdate between str_to_date(c.eff_date, '%m/%d/%Y')
and str_to_date(c.eff_date, '%m/%d/%Y') + interval 30 day;
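The date logic of the join condition can be checked outside the database. The sketch below mirrors it in Python, assuming eff_date is a MM/DD/YYYY string and sortdate a YYYY-MM-DD HH:MM:SS string as described in the question; the function name within_30_days is hypothetical. Like BETWEEN, the comparison includes both end points.

```python
from datetime import datetime, timedelta

def within_30_days(eff_date, sortdate):
    """True if sortdate falls in [eff_date, eff_date + 30 days], inclusive."""
    # Counterpart of STR_TO_DATE(eff_date, '%m/%d/%Y')
    start = datetime.strptime(eff_date, "%m/%d/%Y")
    when = datetime.strptime(sortdate, "%Y-%m-%d %H:%M:%S")
    return start <= when <= start + timedelta(days=30)

print(within_30_days("01/15/2020", "2020-02-01 12:00:00"))
print(within_30_days("01/15/2020", "2020-03-01 00:00:00"))
```

Comparing parsed values rather than the raw strings matters here, because MM/DD/YYYY strings do not sort in date order.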

MySQL query to count items by week for the current 52-weeks?

I have a query that I'd like to change so that it gives me the counts for the current 52 weeks. This query makes use of a calendar table I've made which contains a list of dates in a fixed range. The query as it stands is selecting max and min dates and not necessarily the last 52 weeks.
I'm wondering how to keep my calendar table current such that I can get the last 52-weeks (i.e, from right now to one year ago). Or is there another way to make the query independent of using a calendar table?
Here's the query:
SELECT calendar.datefield AS date, IFNULL(SUM(purchaseyesno),0) AS item_sales
FROM items_purchased join items on items_purchased.item_id=items.item_id
RIGHT JOIN calendar ON (DATE(items_purchased.purchase_date) = calendar.datefield)
WHERE (calendar.datefield BETWEEN (SELECT MIN(DATE(purchase_date))
FROM items_purchased) AND (SELECT MAX(DATE(purchase_date)) FROM items_purchased))
GROUP BY week(date)
thoughts?
Some people dislike this approach but I tend to use a dummy table that contains values from 0 - 1000 and then use a derived table to produce the ranges that are needed -
CREATE TABLE dummy (`num` INT NOT NULL);
INSERT INTO dummy VALUES (0), (1), (2), (3), (4), (5), .... (999), (1000);
If you have a table with an auto-incrementing id and plenty of rows you could generate it from that -
CREATE TABLE `dummy`
SELECT id AS `num` FROM `some_table` WHERE `id` <= 1000;
Just remember to insert the 0 value.
SELECT CURRENT_DATE - INTERVAL num DAY
FROM dummy
WHERE num < 365
So, applying this approach to your query you could do something like this -
SELECT WEEK(calendar.datefield) AS `week`, IFNULL(SUM(purchaseyesno),0) AS item_sales
FROM items_purchased join items on items_purchased.item_id=items.item_id
RIGHT JOIN (
SELECT (CURRENT_DATE - INTERVAL num DAY) AS datefield
FROM dummy
WHERE num < 365
) AS calendar ON (DATE(items_purchased.purchase_date) = calendar.datefield)
WHERE calendar.datefield >= (CURRENT_DATE - INTERVAL 1 YEAR)
GROUP BY week(datefield) -- shouldn't this be datefield instead of date?
I too typically "simulate" a table on the fly by using MySQL user variables (@variables) and just join to ANY table in your system that has AT least as many rows as the number of weeks you want. NOTE... when dealing with dates, I like to typically use the date part only, which implies 12:00:00 am. Also, by advancing the start date by 7 days for the "EndOfWeek", you can now apply a BETWEEN clause for records within a given time period... such as your weekly needs.
I've applied such a sample to coordinate the join based on date association to the per week basis... Since your
select
DynamicCalendar.StartOfWeek,
COALESCE( SUM( IP.PurchaseYesNo ), 0 ) as Item_Sales
from
( select
@weekNum := @weekNum +1 as WeekNum,
@startDate as StartOfWeek,
@startDate := date_add( @startDate, interval 1 week ) EndOfWeek
from
( select @weekNum := 0,
@startDate := date(date_sub(now(), interval 1 year ))) sqlv,
AnyTableThatHasAtLeast52Records
limit
52 ) DynamicCalendar
LEFT JOIN items_purchased IP
on IP.Purchase_Date between DynamicCalendar.StartOfWeek
AND DynamicCalendar.EndOfWeek
group by
DynamicCalendar.StartOfWeek
This is under the premise that your "PurchaseYesNo" value is in your purchased table directly. If so, no need to join to the ITEMS table. If the field IS in the items table, then I would just tack on a LEFT JOIN for your items table and get value from that.
However you could use the dynamicCalendar context in MANY conditions.
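The idea behind the dynamic calendar, 52 consecutive one-week windows starting from a chosen date, can also be sketched without a database. The helper names weekly_windows and count_by_week are hypothetical, and the windows here are half-open (the end date belongs to the next week), which is a deliberate difference from the inclusive BETWEEN above so that a boundary date is counted only once.

```python
from datetime import date, timedelta

def weekly_windows(start, weeks=52):
    """Return (start_of_week, end_of_week) pairs; the end is exclusive."""
    return [(start + timedelta(weeks=i), start + timedelta(weeks=i + 1))
            for i in range(weeks)]

def count_by_week(purchase_dates, start, weeks=52):
    """Count purchases per window; weeks with no purchases stay at 0."""
    counts = [0] * weeks
    for d in purchase_dates:
        idx = (d - start).days // 7  # which window the date falls into
        if 0 <= idx < weeks:
            counts[idx] += 1
    return [(w_start, c)
            for (w_start, _), c in zip(weekly_windows(start, weeks), counts)]

# Assumption: "the last 52 weeks" means 52 * 7 days back from today.
start = date.today() - timedelta(weeks=52)
report = count_by_week([date.today() - timedelta(days=3)], start)
print(report[-1])
```

Because every window is generated up front, weeks without purchases appear with a count of 0, which is the same job the calendar table does in the SQL versions.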