SQL "First relevant day" - mysql

I have a table with opening_hours for restaurants:
SELECT * FROM opening_hours;
+----+---------------+------------+----------+-----+
| id | restaurant_id | start_time | end_time | day |
+----+---------------+------------+----------+-----+
| 1 | 1 | 12:00:00 | 18:00:00 | 1 |
| 2 | 1 | 09:00:00 | 19:00:00 | 4 |
| 3 | 2 | 09:00:00 | 16:00:00 | 4 |
| 4 | 2 | 09:00:00 | 16:00:00 | 5 |
| 5 | 3 | 09:00:00 | 16:00:00 | 4 |
| 6 | 3 | 09:00:00 | 16:00:00 | 5 |
| 7 | 3 | 09:00:00 | 16:00:00 | 1 |
| 8 | 3 | 09:00:00 | 16:00:00 | 6 |
+----+---------------+------------+----------+-----+
http://www.sqlfiddle.com/#!2/eaea09/1
Now I want to fetch, for every restaurant, the "closest" day that is the same as or comes after the current day. For example, when the current day is 1 the result would be:
restaurant_id: 1 day: 1
restaurant_id: 2 day: 4
restaurant_id: 3 day: 1
In the case of day 1 I could do this:
SELECT day FROM opening_hours WHERE day >= 1 GROUP BY restaurant_id LIMIT 1
But if today were day 6, that would not work. The query would need to look at days up to the maximum (7) and, if no match is found there, start again from 1. So for day 6 the result in this case would be:
restaurant_id: 1 day: 1
restaurant_id: 2 day: 4
restaurant_id: 3 day: 6
How could I achieve this with a query?
I'd think it could be something like this in pseudo SQL:
SELECT `day` FROM opening_hours WHERE `day` >= 'today' IF NOT FOUND WHERE `day` >= 1 GROUP BY `restaurant_id` LIMIT 1
edit:
I could run 2 queries, and determine if a match for a restaurant was found in the first. If not, run a second query. But there must be a better way.

Tested, this one works :) you can keep it simple.
Query:
SELECT * FROM (
SELECT
oh.*
FROM
opening_hours oh
ORDER BY restaurant_id,
`day` + IF(`day` < $current_day, 7, 0)
) sq
GROUP BY restaurant_id;
Explanation:
Note, though, that this is a bit hacky. Selecting a column that is neither in the GROUP BY nor wrapped in an aggregate function usually isn't allowed, because in theory the database could return an arbitrary row from each group. That's why most database systems reject it. MySQL is the only one I know of that allows it (unless the sql-mode says otherwise; ONLY_FULL_GROUP_BY is enabled by default from MySQL 5.7.5 on). In practice, on the version I tested, an ORDER BY in the subquery makes MySQL return the first row of each group in that sort order. The MySQL documentation does not guarantee this, though, so don't rely on it across versions.
Tests:
Desired result with current day = 1:
root@VM:playground > SELECT * FROM (
-> SELECT
-> oh.*
-> FROM
-> opening_hours oh
-> ORDER BY restaurant_id,
-> `day` + IF(`day` < 1, 7, 0)
-> ) sq
-> GROUP BY restaurant_id;
+----+---------------+------------+----------+-----+
| id | restaurant_id | start_time | end_time | day |
+----+---------------+------------+----------+-----+
| 1 | 1 | 12:00:00 | 18:00:00 | 1 |
| 3 | 2 | 09:00:00 | 16:00:00 | 4 |
| 7 | 3 | 09:00:00 | 16:00:00 | 1 |
+----+---------------+------------+----------+-----+
3 rows in set (0.00 sec)
Desired result with current day = 6:
root@VM:playground > SELECT * FROM (
-> SELECT
-> oh.*
-> FROM
-> opening_hours oh
-> ORDER BY restaurant_id,
-> `day` + IF(`day` < 6, 7, 0)
-> ) sq
-> GROUP BY restaurant_id;
+----+---------------+------------+----------+-----+
| id | restaurant_id | start_time | end_time | day |
+----+---------------+------------+----------+-----+
| 1 | 1 | 12:00:00 | 18:00:00 | 1 |
| 3 | 2 | 09:00:00 | 16:00:00 | 4 |
| 8 | 3 | 09:00:00 | 16:00:00 | 6 |
+----+---------------+------------+----------+-----+
3 rows in set (0.00 sec)

This is a tricky one.
Best I can come up with is this, which seems to work but I might be missing some edge cases.
SELECT sub0.restaurant_id, MIN(sub1.day)
FROM
(
SELECT restaurant_id, MIN( LEAST(ABS(DAYOFWEEK(CURDATE()) - day), ABS(DAYOFWEEK(CURDATE()) - (day + 7)), ABS(DAYOFWEEK(CURDATE()) - (day - 7)))) AS difference
FROM opening_hours
GROUP BY restaurant_id
) sub0
INNER JOIN
(
SELECT restaurant_id, day, LEAST(ABS(DAYOFWEEK(CURDATE()) - day), ABS(DAYOFWEEK(CURDATE()) - (day + 7)), ABS(DAYOFWEEK(CURDATE()) - (day - 7))) AS difference
FROM opening_hours
) sub1
ON sub0.restaurant_id = sub1.restaurant_id
AND sub0.difference = sub1.difference
GROUP BY sub0.restaurant_id
The first subquery gets, per restaurant, the smallest absolute difference between today's day and each of the restaurant's days. It compares today against the day, the day plus 7 and the day minus 7, using ABS to get the distance in days (in either direction) and LEAST to pick the lowest of those distances. This way, if the current day is 1 and a restaurant has day 6, it is effectively comparing 1 + 7 with 6, 1 - 7 with 6 and 1 with 6, and taking the least of those (in this case 1 + 7, a distance of 2).
The second subquery gets that same difference together with the day of the week for every restaurant / day combination, and is joined to the first subquery on the restaurant and the difference.
The outer query uses MIN just to pick a single day when two days are equally close.

Something like that should do it:
SELECT oh1.*
-- set the start day
FROM (SELECT @start := 1) AS start,
-- calculate difference in days
(SELECT *, (CASE WHEN day-@start >= 0 THEN day-@start ELSE day-@start+7 END) AS diff
FROM opening_hours) AS oh1
-- find minimum difference
JOIN (SELECT restaurant_id, MIN(CASE WHEN day-@start >= 0 THEN day-@start ELSE day-@start+7 END) AS min_diff
FROM opening_hours
GROUP BY restaurant_id) AS oh2
ON oh1.restaurant_id = oh2.restaurant_id AND
oh1.diff = oh2.min_diff
Replace @start := 1 with your starting day or a call to DAYOFWEEK(CURDATE()), depending on how you want to do it!

First step is to simply add 7 to the result of your day difference calculation when the day is less than the day you are searching for:
SET @Day = 6;
SELECT ID, Restaurant_id, Day,
CASE WHEN Day < @Day THEN 7 ELSE 0 END + Day - @Day AS DaysFromNow
FROM opening_hours;
This will give:
ID RESTAURANT_ID DAY DAYSFROMNOW
1 1 1 2
2 1 4 5
3 2 4 5
4 2 5 6
5 3 4 5
6 3 5 6
7 3 1 2
8 3 6 0
Then, to get the next relevant day, you need the minimum DaysFromNow for each restaurant, joined back to your main table:
SET @Day = 6;
SELECT o.*
FROM opening_hours AS o
INNER JOIN
( SELECT Restaurant_id,
MIN(CASE WHEN Day < @Day THEN 7 ELSE 0 END + Day - @Day) AS DaysFromNow
FROM opening_hours
GROUP BY Restaurant_id
) AS mo
ON mo.Restaurant_id = o.Restaurant_id
AND mo.DaysFromNow = (CASE WHEN Day < @Day THEN 7 ELSE 0 END + Day - @Day);
Example on SQL Fiddle
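The wrap-around arithmetic used in the answers above can be checked outside the database. The following is only a minimal Python sketch, not part of any answer: the function name next_open_day and the in-memory row list are made up for illustration, and the rows are the (restaurant_id, day) pairs from the question's table.

```python
# Sample rows from the question: (restaurant_id, day).
OPENING_DAYS = [(1, 1), (1, 4), (2, 4), (2, 5), (3, 4), (3, 5), (3, 1), (3, 6)]


def next_open_day(rows, today):
    """Return {restaurant_id: closest open day on or after `today`, wrapping after day 7}."""
    best = {}
    for restaurant_id, day in rows:
        # Days to wait: 0 if open today, otherwise wrap around the week.
        # Python's % always returns a non-negative result for a positive modulus.
        days_from_now = (day - today) % 7
        if restaurant_id not in best or days_from_now < best[restaurant_id][0]:
            best[restaurant_id] = (days_from_now, day)
    return {rid: day for rid, (_, day) in best.items()}


print(next_open_day(OPENING_DAYS, 1))  # {1: 1, 2: 4, 3: 1}
print(next_open_day(OPENING_DAYS, 6))  # {1: 1, 2: 4, 3: 6}
```

The SQL versions spell the wrap-around out with CASE or IF instead of a modulo, which sidesteps the fact that MySQL's MOD keeps the sign of the dividend for negative values.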

Related

GROUP BY custom date intervals per year

Situation: I need to group by a custom date interval. When I GROUP BY the year, the totals I get are per calendar year. What I need instead is a custom interval per year, running from December 20th at 00:00:00 of the previous year to December 19th at 23:59:59 of the given year. Here is some of my data:
Table - History:
id | date | income | spent
--------------------------------------------
1 | 2019-12-21 17:15:00 | 600,00 | NULL
2 | 2019-12-23 12:55:00 | 183,00 | NULL
3 | 2019-12-30 20:05:00 | NULL | 25,00
4 | 2020-01-01 15:35:00 | NULL | 13,00
5 | 2020-01-01 20:25:00 | NULL | 500,50
6 | 2020-12-10 10:25:00 | NULL | 5,50
7 | 2021-05-22 12:45:00 | 1098,00 | NULL
8 | 2021-05-23 10:18:00 | NULL | 186,00
9 | 2021-11-25 12:32:00 | NULL | 10,00
10 | 2021-12-23 10:35:00 | NULL | 10,00
The expected result:
Year | Summary Income | Summary Spent | Difference
--------------------------------------------------
2020 | 783,00 | 544,00 | 239,00
2021 | 1098,00 | 196,00 | 902,00
2022 | 0,00 | 10,00 | -10,00
I have managed to get a result with the help of a loop within a procedure:
...
SET @Aa = (SELECT MIN(YEAR(date)) FROM History);
CREATE TEMPORARY TABLE Yr (Year VARCHAR(4), Income FLOAT(8,2), Spent FLOAT(8,2), differ FLOAT(8,2));
Yearly: LOOP
SET @Aa = @Aa + 1;
SET @From = CONCAT((@Aa - 1), '-12-20 00:00:00');
SET @To = CONCAT(@Aa, '-12-19 23:59:59');
SET @Count = (SELECT SUM(income) FROM History WHERE date >= @From AND date <= @To);
SET @diff = (SELECT SUM(spent) FROM History WHERE date >= @From AND date <= @To);
INSERT INTO Yr (Year, Income, Spent, differ) VALUES (@Aa, @Count, @diff, (@Count - @diff));
IF (@Aa = (SELECT MAX(YEAR(date)) FROM History)) THEN LEAVE Yearly; END IF;
END LOOP;
SELECT * FROM Yr;
...
Question: I wonder if it's possible to get a custom interval for an annual summary with a condensed SQL query, without using a loop?
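You can simply shift the date forward before applying the YEAR function to get this grouping. The shift has to be 12 days: December 20th plus 12 days lands on January 1st of the next year, and December 19th plus 12 days on December 31st of the same year. For example:
SELECT YEAR(DATE_ADD(date, INTERVAL 12 DAY)) AS Year,
SUM(income) AS income,
SUM(spent) AS Spent,
IFNULL(SUM(income),0) - IFNULL(SUM(spent),0) AS difference
FROM History
GROUP BY YEAR(DATE_ADD(date, INTERVAL 12 DAY));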
Example on db-fiddle
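To sanity-check the date-shift idea against the sample data, here is a rough Python sketch. The names custom_year and yearly_summary are made up for illustration, and None stands in for NULL. It uses a 12-day shift, because 20 December plus 12 days is 1 January, which puts the interval boundary exactly where the question asks for it.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (date, income, spent); None stands in for NULL.
HISTORY = [
    ("2019-12-21 17:15:00", 600.00, None),
    ("2019-12-23 12:55:00", 183.00, None),
    ("2019-12-30 20:05:00", None, 25.00),
    ("2020-01-01 15:35:00", None, 13.00),
    ("2020-01-01 20:25:00", None, 500.50),
    ("2020-12-10 10:25:00", None, 5.50),
    ("2021-05-22 12:45:00", 1098.00, None),
    ("2021-05-23 10:18:00", None, 186.00),
    ("2021-11-25 12:32:00", None, 10.00),
    ("2021-12-23 10:35:00", None, 10.00),
]


def custom_year(timestamp, shift_days=12):
    """Map a timestamp to the year of the 20 Dec .. 19 Dec interval it falls in."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return (parsed + timedelta(days=shift_days)).year


def yearly_summary(rows):
    """Return {year: (income, spent, difference)} grouped by the shifted year."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for timestamp, income, spent in rows:
        bucket = totals[custom_year(timestamp)]
        bucket[0] += income or 0.0  # treat NULL as 0
        bucket[1] += spent or 0.0
    return {year: (inc, sp, inc - sp) for year, (inc, sp) in sorted(totals.items())}


for year, (income, spent, difference) in yearly_summary(HISTORY).items():
    print(year, income, spent, difference)
```

The boundary cases are the ones worth testing by hand: 19 December 23:59:59 should stay in its own year and 20 December 00:00:00 should move to the next one.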

mysql sum group by month and date using a contract start and end date

I have a table full of monthly contracts. There is a monthly price, a start date, and an end date for each. I am trying to graph each month's total revenue and am wondering if it's possible to do this in one query (vs. a query for each month).
I know how to group by month and year in mysql, but this requires a more complex solution that "understands" whether to include in the sum for a given month/year based on the start and end date of the contract.
Shorthand example
| contract_id | price | start_date | end_date |
| 1 | 299 | 1546318800 (1/1/19) | 1554004800 (3/31/19) |
| 2 | 799 | 1551416400 (3/1/19) | 1559275200 (5/31/19) |
With this example, there's an overlap in March. Both contracts are running in March, so the sum returned for that month should be 1098.
I'd like to be able to produce a report that includes every month between two dates, so in this case I'd send 1/1/19 - 12/31/19, the full year of 2019 and would hope to see 0 results as well.
| month | year | price_sum |
| 1 | 2019 | 299 |
| 2 | 2019 | 299 |
| 3 | 2019 | 1098 |
| 4 | 2019 | 799 |
| 5 | 2019 | 799 |
| 6 | 2019 | 0 |
| 7 | 2019 | 0 |
| 8 | 2019 | 0 |
| 9 | 2019 | 0 |
| 10 | 2019 | 0 |
| 11 | 2019 | 0 |
| 12 | 2019 | 0 |
Here is a full working script for your problem, which uses a calendar table approach to represent every month in 2019. Specifically, we represent each month using the first of that month. Then, a given price from your table is applicable to that month if there is overlap with the start and end range.
WITH yourTable AS (
SELECT 1 AS contract_id, 299 AS price, '2019-01-01' AS start_date, '2019-03-31' AS end_date UNION ALL
SELECT 2, 799, '2019-03-01', '2019-05-31'
),
dates AS (
SELECT '2019-01-01' AS dt UNION ALL
SELECT '2019-02-01' UNION ALL
SELECT '2019-03-01' UNION ALL
SELECT '2019-04-01' UNION ALL
SELECT '2019-05-01' UNION ALL
SELECT '2019-06-01' UNION ALL
SELECT '2019-07-01' UNION ALL
SELECT '2019-08-01' UNION ALL
SELECT '2019-09-01' UNION ALL
SELECT '2019-10-01' UNION ALL
SELECT '2019-11-01' UNION ALL
SELECT '2019-12-01'
)
SELECT
d.dt,
SUM(t.price) AS price_sum
FROM dates d
LEFT JOIN yourTable t
ON d.dt < t.end_date
AND DATE_ADD(d.dt, INTERVAL 1 MONTH) > t.start_date
GROUP BY
d.dt;
Demo
Notes:
If your dates are actually stored as UNIX timestamps, then just call FROM_UNIXTIME(your_date) to convert them to dates, and use the same approach I gave above.
I had to use the overlapping date range formula here, because the criteria for overlap in a given month is that the range of that month intersects the range given by a start and end date. Have a look at this SO question for more information on that.
My code is for MySQL 8+, though in practice you may wish to create a bona fide calendar table (the CTE version of which I called dates above), which contains the range of months/years which you want to cover your data set.
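As a cross-check of the overlap test, here is a small Python sketch run against the two example contracts. It is only an illustration: the function names are made up, and it assumes end_date is inclusive (the contract still counts on its last day), which the question does not state explicitly.

```python
from datetime import date

# Contracts from the question: (contract_id, price, start_date, end_date).
CONTRACTS = [
    (1, 299, date(2019, 1, 1), date(2019, 3, 31)),
    (2, 799, date(2019, 3, 1), date(2019, 5, 31)),
]


def next_month(first_of_month):
    """Return the first day of the month after `first_of_month`."""
    if first_of_month.month == 12:
        return date(first_of_month.year + 1, 1, 1)
    return date(first_of_month.year, first_of_month.month + 1, 1)


def monthly_revenue(contracts, year):
    """Return {month: summed price of all contracts overlapping that month}."""
    revenue = {}
    for month in range(1, 13):
        month_start = date(year, month, 1)
        month_end_exclusive = next_month(month_start)
        # Two ranges overlap when each one starts before the other ends.
        # Assumption: end_date is inclusive.
        revenue[month] = sum(
            price
            for _, price, start, end in contracts
            if start < month_end_exclusive and end >= month_start
        )
    return revenue


print(monthly_revenue(CONTRACTS, 2019))
```

Because sum() over an empty sequence is 0, months without any contract come out as 0 rather than NULL; in the SQL version the same effect needs IFNULL or COALESCE around SUM(t.price).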
I understand that you will be given a range of dates for which you will need to report. My solution requires you to initialize a temporary table, such as date_table with the first day of each month for which you want to report on:
create temporary table date_table (
d date,
primary key(d)
);
set @start_date = '2019-01-01';
set @end_date = '2019-12-01';
set @months = -1;
insert into date_table(d)
select DATE_FORMAT(date_range,'%Y-%c-%d') AS result_date from (
select (date_add(@start_date, INTERVAL (@months := @months +1 ) month)) as date_range
from mysql.help_topic a limit 0,1000) a
where a.date_range between @start_date and last_day(@end_date);
Then this should do it:
select month(dt.d) as month, year(dt.d) as year, ifnull(sum(c.price), 0) as price_sum
from date_table dt left join contract c on
dt.d >= date(from_unixtime(c.start_date)) and dt.d < date(from_unixtime(c.end_date))
group by dt.d
order by dt.d
;
Resulting in:
+-------+------+-----------+
| month | year | price_sum |
+-------+------+-----------+
| 1 | 2019 | 299 |
| 2 | 2019 | 299 |
| 3 | 2019 | 1098 |
| 4 | 2019 | 799 |
| 5 | 2019 | 799 |
| 6 | 2019 | 0 |
| 7 | 2019 | 0 |
| 8 | 2019 | 0 |
| 9 | 2019 | 0 |
| 10 | 2019 | 0 |
| 11 | 2019 | 0 |
| 12 | 2019 | 0 |
+-------+------+-----------+
See demo
I am not sure about the semantics of the column end_date. Right now I am comparing the first of the month as follows: start_date <= first_of_month < end_date. Perhaps the test should be start_date <= first_of_month <= end_date, in which case:
dt.d >= date(from_unixtime(c.start_date)) and dt.d < date(from_unixtime(c.end_date))
becomes:
dt.d between date(from_unixtime(c.start_date)) and date(from_unixtime(c.end_date))
With end_date being the last day of the month, it would not matter either way.

Mysql sum multiple column values with date condition

I have a client table with below columns which have data of every day purchase of every client month wise.
ID|MONTH|DAY1|DAY2|DAY3|DAY4|..........|DAY31
1 | 4 | 10 | 20 | 0 | 15 |..........|10
2 | 4 | 20 | 30 | 23 | 7 |..........| 5
1 | 5 | 5 | 10 | 20 | 4 |..........| 20
1 | 6 | 12 | 0 | 10 | 5 |..........| 10
2 | 6 | 10 | 10 | 5 | 10 |..........| 5
Now I want to find the total quantity purchased by each client between 15/4/2015 and 15/6/2015.
I am new to MySQL, so I have no idea how to move forward.
Thanks in advance
It's not a good idea to use a column for every day of the month to store a value. But sometimes we are stuck with bad data formats, so here's how you can get the total you need:
SELECT ID, SUM(qty)
FROM (
SELECT
ID,
MAKEDATE(2015,1) + INTERVAL (month-1) MONTH AS dt,
DAY1 AS qty
FROM yourtable
UNION ALL
SELECT
ID,
MAKEDATE(2015,1) + INTERVAL (month-1) MONTH + INTERVAL 1 DAY AS dt,
DAY2 AS qty
FROM yourtable
UNION ALL
SELECT
ID,
MAKEDATE(2015,1) + INTERVAL (month-1) MONTH + INTERVAL 2 DAY AS dt,
DAY3 AS qty
FROM yourtable
UNION ALL
...
until day 31
...
) s
WHERE
dt>='2015-04-15' AND dt<='2015-06-15'
GROUP BY ID
I'm using a subquery to normalize the data structure into one row per client and date, then I'm doing the sums in the outer query; a simple WHERE clause and GROUP BY give the results that you need.
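The same unpivot-then-filter idea can be sketched in Python. Everything concrete here is hypothetical: the question elides most of the DAY columns, so the quantities below are invented, and make_row and total_per_client are names made up for illustration. Unlike the MAKEDATE arithmetic, which rolls a non-existent date such as 31 April over into the next month, this sketch simply skips impossible dates (an assumption about the desired behaviour).

```python
from datetime import date


def make_row(client_id, month, quantities_by_day):
    """Build (client_id, month, [DAY1..DAY31]) from a sparse {day: qty} mapping."""
    days = [0] * 31
    for day, qty in quantities_by_day.items():
        days[day - 1] = qty
    return (client_id, month, days)


# Hypothetical data in the shape of the question's table.
ROWS = [
    make_row(1, 4, {1: 10, 2: 20, 4: 15, 20: 7}),
    make_row(2, 4, {14: 3, 15: 5}),
    make_row(1, 5, {1: 5, 31: 20}),
    make_row(1, 6, {15: 12, 16: 9}),
    make_row(2, 6, {1: 10, 30: 4}),
]


def total_per_client(rows, year, start, end):
    """Unpivot the DAYn columns into dates and sum quantities within [start, end]."""
    totals = {}
    for client_id, month, days in rows:
        for day_number, qty in enumerate(days, start=1):
            try:
                purchase_date = date(year, month, day_number)
            except ValueError:
                continue  # e.g. 31 April does not exist
            if start <= purchase_date <= end:
                totals[client_id] = totals.get(client_id, 0) + qty
    return totals


print(total_per_client(ROWS, 2015, date(2015, 4, 15), date(2015, 6, 15)))
```

With the invented data, client 1 gets 7 (April) + 25 (May) + 12 (June) and client 2 gets 5 (April) + 10 (June).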

UPDATE + SET + WHERE - Dynamic minimum value

This is a follow-up to:
Dynamic minimum value for specfic range (mysql)
I do have the query to fetch the third column (lowest of the last 3 days) "Low_3_days" via SELECT command:
-----------------------------------------
| Date | Unit_ | Lowest_in_last_|
| | price | 3_days |
|----------------------------------------
| 2015-01-01 | 15 | 15 |
| 2015-01-02 | 17 | 15 |
| 2015-01-03 | 21 | 15 |
| 2015-01-04 | 18 | 17 |
| 2015-01-05 | 12 | 12 |
| 2015-01-06 | 14 | 12 |
| 2015-01-07 | 16 | 12 |
|----------------------------------------
select S.Date, Unit_price,
(SELECT min(s2.Unit_Price)
FROM table s2
WHERE s2.DATE BETWEEN s.DATE - interval 3 day and
s.DATE - interval 1 day
) as min_price_3_days
FROM table S;
My new challenge is - what is the best way to use UPDATE-SET-WHERE so I could add the ("Lowest_in_last_3_days") values to a new column in a table (instead of having temporary results displayed to me via SELECT).
By following the UPDATE-SET-WHERE syntax, the query would be:
UPDATE table
SET min_price_3_days =
(select S.Date, Unit_price,
(SELECT min(s2.Unit_Price)
FROM table s2
WHERE s2.DATE BETWEEN s.DATE - interval 3 day and
s.DATE - interval 1 day
) as min_price_3_days
but I have difficulties constructing the correct query.
What would be the correct approach to this? I do recognize this one is a tough one to solve.
Your UPDATE should look like:
update table set low_3_days=
(SELECT min(Unit_Price)
FROM (select unit_price, date as date2 from table) as s2
WHERE s2.date2 BETWEEN date - interval 3 day and date - interval 1 day
);
You can check it in SQLFiddle
In the Fiddle I used different names for the table and column, because I prefer not to use SQL keywords as names.
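For experimenting with the correlated UPDATE without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is not the answer's exact statement: the table name prices and the column price_date are made up, SQLite date modifiers replace MySQL's INTERVAL syntax, and SQLite lets the subquery read from the table being updated directly, so the derived-table wrapper from the MySQL version is not needed here. The window follows the query in the question (the three days before each row, excluding the row itself), so the first row has no earlier data and stays NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices ("
    "price_date TEXT PRIMARY KEY, unit_price INTEGER, low_3_days INTEGER)"
)
conn.executemany(
    "INSERT INTO prices (price_date, unit_price) VALUES (?, ?)",
    [
        ("2015-01-01", 15),
        ("2015-01-02", 17),
        ("2015-01-03", 21),
        ("2015-01-04", 18),
        ("2015-01-05", 12),
        ("2015-01-06", 14),
        ("2015-01-07", 16),
    ],
)

# For every row, store the lowest price seen in the three preceding days.
conn.execute(
    """
    UPDATE prices
    SET low_3_days = (
        SELECT MIN(s2.unit_price)
        FROM prices AS s2
        WHERE s2.price_date BETWEEN date(prices.price_date, '-3 days')
                                AND date(prices.price_date, '-1 days')
    )
    """
)

result = conn.execute(
    "SELECT price_date, low_3_days FROM prices ORDER BY price_date"
).fetchall()
for row in result:
    print(row)
```

If the current day should count towards the minimum as well, the window bounds would have to change to '-2 days' and the row's own date.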

sql query with current day comparison

I have a MySQL table with a due_day field, which is simply an integer value.
dealID | due_day
1 | 15
2 | 25
3 | 10
4 | 9
5 | 31
6 | 20
I would like to query this table to display only the rows that are within 14 days before their due_day. For example, if today is 01/05/13 and I query this table, it should only show me dealID 1, 3 and 4. How should I go about this condition?
You can do that simply by using DATE_SUB to subtract the number of days from the current date and then DAYOFMONTH to get the day.
You can create the query using the mentioned functions.
So based on user1951544's answer this is what I came up with.
SELECT due_day
FROM deals
WHERE (due_day - DAYOFMONTH( NOW( ) ) ) <=14
Query:
SQLFIDDLEExample
SELECT
dealID,
due_day
FROM Table1
WHERE due_day < 14 + DAYOFMONTH( NOW( ) )
Result:
| DEALID | DUE_DAY |
--------------------
| 1 | 15 |
| 3 | 10 |
| 4 | 9 |