MySQL select a row from a daterange excluding the year - mysql

I'm trying to create a MySQL query to select the daily price from a table that is between a date range from another. I only want to use 'starting-ending' months and days from the table "seasons" and I want to pass the year dynamically to the query.
This is my query: (I'm giving it the Year to exclude the one on the table)
SELECT a.season, b.base_price
FROM seasons a
JOIN pricebyseason b ON a.id=b.season_id
WHERE b.prop_id='6' AND '2015-11-29' BETWEEN DATE_FORMAT(a.starting,'2015-%m-%d') AND DATE_FORMAT(a.ending,'2016-%m-%d')
ORDER BY b.base_price DESC
It works but not with all dates.
These are the tables:
seasons (these are static date values)
+----+--------------+------------+------------+
| id | season | starting | ending |
+----+--------------+------------+------------+
| 1 | Peak Season | 2015-12-11 | 2016-01-09 |
| 2 | High Season | 2015-11-27 | 2016-04-15 |
| 3 | Mid Season | 2015-04-16 | 2015-09-01 |
| 4 | Low Season | 2015-09-02 | 2015-11-26 |
| 5 | Spring Break | 2015-03-05 | 2015-03-21 |
+----+--------------+------------+------------+
pricebyseason
+----+---------+-----------+------------+
| id | prop_id | season_id | base_price |
+----+---------+-----------+------------+
| 1 | 6 | 1 | 950 |
| 2 | 6 | 2 | 750 |
| 3 | 6 | 3 | 450 |
| 4 | 6 | 4 | 400 |
| 5 | 6 | 5 | 760 |
+----+---------+-----------+------------+
What I want to achive is query the dialy price between checkin, checkout selection
I create this sqlfiddle: http://sqlfiddle.com/#!9/4a6f4
This is a previuos query that is not working either:
SELECT a.base_price,b.season,b.starting,b.ending
FROM pricebyseason a JOIN seasons b ON a.season_id=b.id
WHERE a.prop_id='6' AND
(DATE_FORMAT(b.starting,'%m-%d') <= '12-27' OR DATE_FORMAT(b.starting,'2016-%m-%d') >= '2015-12-27')
AND
(DATE_FORMAT(b.ending,'%m-%d') >= '12-27' OR DATE_FORMAT(b.ending,'2016-%m-%d') <= '2015-12-27')
ORDER BY base_price DESC
And here are some sample dates for each season: '2016-01-08','2015-12-27','2016-04-14','2015-11-29','2016-04-15','2015-09-01','2016-09-02','2015-11-26','2016-10-10','2016-03-18','2016-06-22','2015-06-15'
Thank a lot

Related

How to get Total Overdraft amounts from a particular Date in SQL

I'm trying to get the total amount of overdraft accounts from an old Date, the goal is to get the total amount it was on the 31st of January.
I have the following tables Users and Transactions.
USERS (currently)
| user_id | name | account_balance |
|---------|---------|------------------|
| 1 | Wells | 1.00 |
| 2 | John | -10.00 |
| 3 | Sahar | -5.00 |
| 4 | Peter | 1.00 |
TRANSACTIONS (daily transition can go back in time)
| trans_id | user_id | amount_tendered | trans_datetime |
|------------|---------|-------------------|---------------------|
| 1 | 1 | 2 | 2021-02-16 |
| 2 | 2 | 3 | 2021-02-16 |
| 3 | 3 | 5 | 2021-02-16 |
| 4 | 4 | 2 | 2021-02-16 |
| 5 | 1 | 10 | 2021-02-15 |
so the current total overdraft amount is
SELECT sum(account_balance) AS O_D_Amount
FROM users
WHERE account_balance < 0;
| O_D_Amount |
|------------|
| -15 |
I need Help to reverse this amount to a date in history.
Assuming overdrafts are based on the sum of transactions up to a point, you can use a subquery:
select sum(total) as total_overdraft
from (select user_id, sum(amount_tendered) as total
from transactions t
where t.trans_datetime <= ?
group by user_id
) t
where total < 0;
The ? is a parameter placeholder for the date/time you care about.

Query with dynamic date intervals

Given a statuses table that holds information about products availability, how do I select the date that corresponds to the 1st day in the latest 20 days that the product has been active?
Yes I know the question is hard to follow. I think another way to put it would be: I want to know how many times each product has been sold in the last 20 days that it was active, meaning the product could have been active for years, but I'd only want the sales count from the latest 20 days that it had a status of "active".
It's something easily doable in the server-side (i.e. getting any collection of products from the DB, iterating them, performing n+1 queries on the statuses table, etc), but I have hundreds of thousands of items so it's imperative to do it in SQL for performance reasons.
table : products
+-------+-----------+
| id | name |
+-------+-----------+
| 1 | Apple |
| 2 | Banana |
| 3 | Grape |
+-------+-----------+
table : statuses
+-------+-------------+---------------+---------------+
| id | name | product_id | created_at |
+-------+-------------+---------------+---------------+
| 1 | active | 1 | 2018-01-01 |
| 2 | inactive | 1 | 2018-02-01 |
| 3 | active | 1 | 2018-03-01 |
| 4 | inactive | 1 | 2018-03-15 |
| 6 | active | 1 | 2018-04-25 |
| 7 | active | 2 | 2018-03-01 |
| 8 | active | 3 | 2018-03-10 |
| 9 | inactive | 3 | 2018-03-15 |
+-------+-------------+---------------+---------------+
table : items (ordered products)
+-------+---------------+-------------+
| id | product_id | order_id |
+-------+---------------+-------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 1 | 4 |
| 5 | 1 | 5 |
| 6 | 2 | 3 |
| 7 | 2 | 4 |
| 8 | 2 | 5 |
| 9 | 3 | 5 |
+-------+---------------+-------------+
table : orders
+-------+---------------+
| id | created_at |
+-------+---------------+
| 1 | 2018-01-02 |
| 2 | 2018-01-15 |
| 3 | 2018-03-02 |
| 4 | 2018-03-10 |
| 5 | 2018-03-13 |
+-------+---------------+
I want my final results to look like this:
+-------+-----------+----------------------+--------------------------------+
| id | name | recent_sales_count | date_to_start_counting_sales |
+-------+-----------+----------------------+--------------------------------+
| 1 | Apple | 3 | 2018-01-30 |
| 2 | Banana | 0 | 2018-04-09 |
| 3 | Grape | 1 | 2018-03-10 |
+-------+-----------+----------------------+--------------------------------+
So this is what I mean by latest 20 active days for e.g. Apple:
It was last activated at '2018-04-25'. That's 4 days ago.
Before that, it was inactive since '2018-03-15', so all these days until '2018-04-25' don't count.
Before that, it was active since '2018-03-01'. That's more 14 days until '2018-03-15'.
Before that, inactive since '2018-02-01'.
Finally, it was active since '2018-01-01', so it should only count the missing 2 days (4 + 14 + 2 = 20) backwards from '2018-02-01', resulting in date_to_start_counting_sales = '2018-01-30'.
With the '2018-01-30' date in hand, I'm then able to count Apple orders in the last 20 active days: 3.
Hope that makes sense.
Here is a fiddle with the data provided above.
I've got a standard SQL solution, that does not use any window function as you are on MySQL 5
My solution requires 3 stacked views.
It would have been better with a CTE but your version doesn't support it. Same goes for the stacked Views... I don't like to stack views and always try to avoid it, but sometimes you have no other choice, because MySQL doesn't accept subqueries in FROM clause for Views.
CREATE VIEW VIEW_product_dates AS
(
SELECT product_id, created_at AS active_date,
(
SELECT created_at
FROM statuses ti
WHERE name = 'inactive' AND ta.created_at < ti.created_at AND ti.product_id=ta.product_id
GROUP BY product_id
) AS inactive_date
FROM statuses ta
WHERE name = 'active'
);
CREATE VIEW VIEW_product_dates_days AS
(
SELECT product_id, active_date, inactive_date, datediff(IFNULL(inactive_date, SYSDATE()),active_date) AS nb_days
FROM VIEW_product_dates
);
CREATE VIEW VIEW_product_dates_days_cumul AS
(
SELECT product_id, active_date, ifnull(inactive_date,sysdate()) AS inactive_date, nb_days,
IFNULL((SELECT SUM(V2.nb_days) + V1.nb_days
FROM VIEW_product_dates_days V2
WHERE V2.active_date >= IFNULL(V1.inactive_date, SYSDATE()) AND V1.product_id=V2.product_id
),V1.nb_days) AS cumul_days
FROM VIEW_product_dates_days V1
);
The final view produce this :
| product_id | active_date | inactive_date | nb_days | cumul_days |
|------------|----------------------|----------------------|---------|------------|
| 1 | 2018-01-01T00:00:00Z | 2018-02-01T00:00:00Z | 31 | 49 |
| 1 | 2018-03-01T00:00:00Z | 2018-03-15T00:00:00Z | 14 | 18 |
| 1 | 2018-04-25T00:00:00Z | 2018-04-29T11:28:39Z | 4 | 4 |
| 2 | 2018-03-01T00:00:00Z | 2018-04-29T11:28:39Z | 59 | 59 |
| 3 | 2018-03-10T00:00:00Z | 2018-03-15T00:00:00Z | 5 | 5 |
So it aggregates all active periods of all products, it counts the number of days for each period, and the cumulative days of all past active periods since current date.
Then we can query this final view to get the desired date for each product. I set a variable for your 20 days, so you can change that number easily if you want.
SET #cap_days = 20 ;
SELECT PD.id, Pd.name,
SUM(CASE WHEN o.created_at > PD.date_to_start_counting_sales THEN 1 ELSE 0 END) AS recent_sales_count ,
PD.date_to_start_counting_sales
FROM
(
SELECT p.*,
(CASE WHEN LowerCap.max_cumul_days IS NULL
THEN ADDDATE(ifnull(HigherCap.min_inactive_date,sysdate()),(-#cap_days))
ELSE
CASE WHEN LowerCap.max_cumul_days < #cap_days AND HigherCap.min_inactive_date IS NULL
THEN ADDDATE(ifnull(LowerCap.max_inactive_date,sysdate()),(-LowerCap.max_cumul_days))
ELSE ADDDATE(ifnull(HigherCap.min_inactive_date,sysdate()),(LowerCap.max_cumul_days-#cap_days))
END
END) as date_to_start_counting_sales
FROM products P
LEFT JOIN
(
SELECT product_id, MAX(cumul_days) AS max_cumul_days, MAX(inactive_date) AS max_inactive_date
FROM VIEW_product_dates_days_cumul
WHERE cumul_days <= #cap_days
GROUP BY product_id
) LowerCap ON P.id=LowerCap.product_id
LEFT JOIN
(
SELECT product_id, MIN(cumul_days) AS min_cumul_days, MIN(inactive_date) AS min_inactive_date
FROM VIEW_product_dates_days_cumul
WHERE cumul_days > #cap_days
GROUP BY product_id
) HigherCap ON P.id=HigherCap.product_id
) PD
LEFT JOIN items i ON PD.id = i.product_id
LEFT JOIN orders o ON o.id = i.order_id
GROUP BY PD.id, Pd.name, PD.date_to_start_counting_sales
Returns
| id | name | recent_sales_count | date_to_start_counting_sales |
|----|--------|--------------------|------------------------------|
| 1 | Apple | 3 | 2018-01-30T00:00:00Z |
| 2 | Banana | 0 | 2018-04-09T20:43:23Z |
| 3 | Grape | 1 | 2018-03-10T00:00:00Z |
FIDDLE : http://sqlfiddle.com/#!9/804f52/24
Not sure which version of MySql you're working with, but if you can use 8.0, that version came out with a lot of functionality that makes things slightly more doable (CTE's, row_number(), partition, etc.).
My recommendation would be to create a view like in this DB-Fiddle Example, call the view on server side and iterate programatically. There are ways of doing it in SQL, but it'd be a bear to write, test and likely would be less efficient.
Assumptions:
Products cannot be sold during inactive date ranges
Statuses table will always alternate status active/inactive/active for each product. I.e. no date ranges where a certain product is both active and inactive.
View Results:
+------------+-------------+------------+-------------+
| product_id | active_date | end_date | days_active |
+------------+-------------+------------+-------------+
| 1 | 2018-01-01 | 2018-02-01 | 31 |
+------------+-------------+------------+-------------+
| 1 | 2018-03-01 | 2018-03-15 | 14 |
+------------+-------------+------------+-------------+
| 1 | 2018-04-25 | 2018-04-29 | 4 |
+------------+-------------+------------+-------------+
| 2 | 2018-03-01 | 2018-04-29 | 59 |
+------------+-------------+------------+-------------+
| 3 | 2018-03-10 | 2018-03-15 | 5 |
+------------+-------------+------------+-------------+
View:
CREATE OR REPLACE VIEW days_active AS (
WITH active_rn
AS (SELECT *, Row_number()
OVER ( partition BY NAME, product_id
ORDER BY created_at) AS rownum
FROM statuses
WHERE name = 'active'),
inactive_rn
AS (SELECT *, Row_number()
OVER ( partition BY NAME, product_id
ORDER BY created_at) AS rownum
FROM statuses
WHERE name = 'inactive')
SELECT x1.product_id,
x1.created_at AS active_date,
CASE WHEN x2.created_at IS NULL
THEN Curdate()
ELSE x2.created_at
END AS end_date,
CASE WHEN x2.created_at IS NULL
THEN Datediff(Curdate(), x1.created_at)
ELSE Datediff(x2.created_at,x1.created_at)
END AS days_active
FROM active_rn x1
LEFT OUTER JOIN inactive_rn x2
ON x1.rownum = x2.rownum
AND x1.product_id = x2.product_id ORDER BY
x1.product_id);

mysql return two minimum values with limit

I have a table named: workers and a table named: schedule with the following format:
workers:
| id | name | vacationA | vacationB | workhistory |
| 1 | Florin | 2017-05-05 | 2017-05-25 | 2010-01-01 |
| 2 | Andrei | 2017-06-05 | 2017-06-25 | 2010-01-01 |
| 3 | Alexandra | 2017-07-05 | 2017-07-25 | 2010-01-01 |
| 4 | Emilia | 2017-08-05 | 2017-08-25 | 2010-01-01 |
| 5 | Nicoleta | 2017-09-05 | 2017-09-25 | 2010-01-01 |
+----+-----------+------------+------------+-------------+
schedule:
| day | month | name | shifts |
+-----+-------+-----------+--------+
| 1 | 6 | Florin | 0 |
| 1 | 6 | Andrei | 1 |
| 1 | 6 | Alexandra | 2 |
| 1 | 6 | Emilia | 3 |
| 1 | 6 | Nicoleta | 4 |
+-----+-------+-----------+--------+
I need to interrogate table "workers" to give me 2 random names, with minimum shifts number, and workers should not be in vacation period. Also work history must be greater than 18 MONTHS.
In this case, the query i need should return Florin and Andrei.
This is what I've got so far, but it doesn't work as supposed:
SELECT name
FROM workers
WHERE (CURDATE() NOT BETWEEN vacationA AND vacationB) AND
workhistory > (DATE_SUB(CURDATE(), INTERVAL 18 MONTH)) AND
name IN (SELECT name FROM schedule ORDER BY shifts LIMIT 2)
ORDER BY RAND() LIMIT 2;
This query returns
1235 - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'.
Thank you!
Join to schedule instead and then use LIMIT 2:
SELECT w.name
FROM workers w
INNER JOIN schedule s
ON w.name = s.name
WHERE CURDATE() NOT BETWEEN w.vacationA AND w.vacationB AND
w.workhistory > DATE_SUB(CURDATE(), INTERVAL 18 MONTH)
ORDER BY s.shifts
LIMIT 2;
I don't understand the random ordering, because you are only returning two worker records, and those records are not chosen randomly, but rather belong to the smallest shift numbers.

MySQL - select average of column A for first N entries from column B

I have a ratings table, where each user can add one rating a day. But each user might miss several days between ratings.
I'd like to get the average rating for each user_id's first 7 entries of created_at.
My table:
mysql> desc entries;
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| rating | tinyint(4) | NO | | NULL | |
| user_id | int(10) unsigned | NO | MUL | NULL | |
| created_at | timestamp | YES | | NULL | |
+------------+------------------+------+-----+---------+----------------+
Ideally I'd just get something like:
+------------+------------------+
| day | average_rating |
+------------+------------------+
| 1 | 2.53 |
+------------+------------------+
| 2 | 4.30 |
+------------+------------------+
| 3 | 3.67 |
+------------+------------------+
| 4 | 5.50 |
+------------+------------------+
| 5 | 7.23 |
+------------+------------------+
| 6 | 6.98 |
+------------+------------------+
| 7 | 7.22 |
+------------+------------------+
The closest I've been able to get is:
SELECT rating, user_id, created_at FROM entries ORDER BY user_id asc, created at desc
Which isn't very close at all...
Is it even possible? Will the performance be terrible? It's something that would need to run every time a web page is loaded, so would it be better to just run this once a day and save the results? (to another table!?)
edit - second attempt
Working towards a solution, I think this would get the rating for each user's first day:
select rating from entries where user_id in
(select user_id from entries order by created_at limit 1);
But I get:
ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
So now I'm going to play around with JOIN to see if that helps.
edit - third attempt, getting closer
I found this stackoverflow post, which is closer to what I want.
select e1.* from entries e1 left join entries e2
on (e1.user_id = e2.user_id and e1.created_at > e2.created_at)
where e2.id is null;
It gets the rating for the first day for each user.
Next step is to work out how to get days 2 to 7. I can't use 1.created_at > e2.created_at for that, so I'm really confused now.
edit - fourth attempt
Okay, I think it's not possible. Once I worked out how to turn off 'full group by' mode, I realised I'll probably need to use a subquery with limit <user_id>, <day_num>, for which I get:
ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
My current method is to just get the entire table, and use PHP to calculate the average for each day.
If I understand correctly you want to take the last 7 ratings the user gave, ordered by the date they gave the rating. The last 7 ratings of one user may fall on different days to another user, however they will be averaged together regardless of date.
First we need to order the data by user and date and give each user their own incrementing row count. I do this by adding two variables, one for the last user id and one for the row number:
select e.created_at,
e.rating,
if(#lastUser=user_id,#row := #row+1, #row:=1) as row,
#lastUser:= e.user_id as user_id
from entries e,
( select #row := 0, #lastUser := 0 ) vars
order by e.user_id asc,
e.created_at desc;
If the previous user_id is different we reset the row counter to 1. The result from this is:
+---------------------+--------+------+---------+
| created_at | rating | row | user_id |
+---------------------+--------+------+---------+
| 2017-01-10 00:00:00 | 1 | 1 | 1 |
| 2017-01-09 00:00:00 | 1 | 2 | 1 |
| 2017-01-08 00:00:00 | 1 | 3 | 1 |
| 2017-01-07 00:00:00 | 1 | 4 | 1 |
| 2017-01-06 00:00:00 | 1 | 5 | 1 |
| 2017-01-05 00:00:00 | 1 | 6 | 1 |
| 2017-01-04 00:00:00 | 1 | 7 | 1 |
| 2017-01-03 00:00:00 | 1 | 8 | 1 |
| 2017-01-02 00:00:00 | 1 | 9 | 1 |
| 2017-01-01 00:00:00 | 1 | 10 | 1 |
| 2017-01-13 00:00:00 | 1 | 1 | 2 |
| 2017-01-11 00:00:00 | 1 | 2 | 2 |
| 2017-01-09 00:00:00 | 1 | 3 | 2 |
| 2017-01-07 00:00:00 | 1 | 4 | 2 |
| 2017-01-05 00:00:00 | 1 | 5 | 2 |
| 2017-01-03 00:00:00 | 1 | 6 | 2 |
| 2017-01-01 00:00:00 | 1 | 7 | 2 |
| 2017-01-13 00:00:00 | 1 | 1 | 3 |
| 2017-01-01 00:00:00 | 1 | 2 | 3 |
| 2017-01-03 00:00:00 | 1 | 1 | 4 |
| 2017-01-01 00:00:00 | 1 | 2 | 4 |
| 2017-01-02 00:00:00 | 1 | 1 | 5 |
+---------------------+--------+------+---------+
We now simply wrap this in another statement to select the avg where the row number is less than or equal to seven.
select e1.row day, avg(e1.rating) avg
from (
select e.created_at,
e.rating,
if(#lastUser=user_id,#row := #row+1, #row:=1) as row,
#lastUser:= e.user_id as user_id
from entries e,
( select #row := 0, #lastUser := 0 ) vars
order by e.user_id asc,
e.created_at desc) e1
where e1.row <=7
group by e1.row;
This outputs:
+------+--------+
| day | avg |
+------+--------+
| 1 | 1.0000 |
| 2 | 1.0000 |
| 3 | 1.0000 |
| 4 | 1.0000 |
| 5 | 1.0000 |
| 6 | 1.0000 |
| 7 | 1.0000 |
+------+--------+

Complex MySQL query between two tables

I've a real complex query here, at least for me.
Here's a table with car dates releases (where model_key = 320D):
+------------+-----------+
| date_key | model_key |
+------------+-----------+
| 2003-08-13 | 320D |
| 2005-11-12 | 320D |
| 2007-02-11 | 320D |
+------------+----------+
Then I have a table with daily purchases (where model_key = 320D):
+------------+-----------+-----------+
| date_key | model_key | sal_quant |
+------------+-----------+ ----------+
| 2003-08-13 | 320D | 0 |
| 2003-08-14 | 320D | 1 |
| 2003-08-15 | 320D | 2 |
| 2003-08-16 | 320D | 0 |
...
| 2005-11-12 | 320D | 2 |
| 2005-11-13 | 320D | 0 |
| 2005-11-14 | 320D | 4 |
| 2005-11-15 | 320D | 3 |
...
| 2007-02-11 | 320D | 1 |
| 2007-02-12 | 320D | 0 |
| 2007-02-13 | 320D | 0 |
| 2007-02-14 | 320D | 0 |
...
+------------+-----------+-----------|
I want to know the sum of car sales by day after each release during N days.
I want a table like (assuming 4 days analysis):
+-----------------+
| sum(sal_quant) |
+-----------------+
| 3 |
| 1 |
| 6 |
| 3 |
+-----------------+
Which means, in the first day of car release, 3 BMW 320D were sold, in the second day, just one,... an so on.
Now I have the following query:
SELECT SUM(sal_quant)
FROM daily_sales
WHERE model_key='320D'
AND date_key IN (
SELECT date_key FROM car_release_dates WHERE model_key='320D')
But this only gives me the sum for the first release day. How can I get the next days?
Thanks.
I suppose there's no way a car could be sold before its release date. So you could simply group on the date_key:
SELECT date_key, SUM(sal_quant)
FROM daily_sales
WHERE model_key='320D'
GROUP BY date_key
ORDER BY date_key DESC
LIMIT 4
If it is possible to sell cars before the release date, add a WHERE clause:
AND date_key >= (
SELECT date_key FROM car_release_dates WHERE model_key='320D')
EDIT: Per your comment, you can constrain the query to within four days of multiple release dates with an INNER JOIN. For example:
SELECT ds.date_key, SUM(ds.sal_quant)
FROM daily_sales ds
INNER JOIN car_release_dates crd
ON crd.model_key = ds.model_key
AND ds.date_key BETWEEN crd.date_key
AND DATE_ADD(crd.date_key, INTERVAL 4 DAY)
WHERE ds.model_key='320D'
GROUP BY ds.date_key
ORDER BY ds.date_key DESC