I have daily sales figures for an item covering the past 20 years.
The data looks like this:
date ; sales value
2001-01-01 ; 423
2001-01-02 ; 152
2001-01-03 ; 162
2001-01-04 ; 172
.
.
.
I have a behavioral problem to solve. For each of the past 20 years, I must find the five consecutive days whose total sales are the highest in that year. I will then use the result to analyse the spending pattern.
How can I get the 5 consecutive days whose sum is the maximum in a year?
I need this for every year, with the dates and the total sales over those 5 days. Can anyone help me with my assignment, please?
TIA
Well, in MySQL 8+, you can use window functions. In earlier versions, a correlated subquery. That looks like:
select year(t.date),
       max(next_5_days_sales),
       substring_index(group_concat(date order by next_5_days_sales desc), ',', 1) as date_at_max
from (select t.*,
             (select sum(t2.sales)
              from t t2
              where t2.date >= t.date and t2.date < t.date + interval 5 day
             ) as next_5_days_sales
      from t
     ) t
group by year(t.date);
Notes:
You will need to reset the group_concat_max_len, because 1024 is probably not long enough for the intermediate result.
This allows the periods to span year boundaries.
In MySQL 8, use window functions!
select t.*
from (select t.*,
             -- desc, so that seqnum = 1 is the start of the highest 5-day total in each year
             row_number() over (partition by year(date) order by next_5_days_sales desc) as seqnum
      from (select t.*,
                   sum(sales) over (order by date range between current row and interval 4 day following) as next_5_days_sales
            from t
           ) t
     ) t
where seqnum = 1;
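To sanity-check the SQL on a small extract, the same logic can be sketched outside the database. This is a minimal Python sketch, not the answer's method: the function name `best_window_per_year` and the sample figures are made up for illustration, and missing days are assumed to count as zero sales. As in the queries above, a window is attributed to the year of its first day and may run past December 31.

```python
from datetime import date, timedelta

def best_window_per_year(sales, days=5):
    """sales: dict mapping date -> sales value.
    Returns {year: (start_date, total)} for the best `days`-day window per year."""
    best = {}
    for start in sorted(sales):
        # Sum the start day plus the following days-1 calendar days;
        # days with no row are treated as 0 (an assumption).
        total = sum(sales.get(start + timedelta(days=k), 0) for k in range(days))
        year = start.year
        # Strict '>' keeps the earliest window when totals tie.
        if year not in best or total > best[year][1]:
            best[year] = (start, total)
    return best

# Hypothetical sample data: seven days in January 2001.
sample = {date(2001, 1, d): v
          for d, v in enumerate([423, 152, 162, 172, 500, 10, 20], start=1)}
result = best_window_per_year(sample)
```

Here `result[2001]` holds the start date and the 5-day total of the best window found in the sample.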
I need to run a report that finds the sum of records that have a value in a certain field (>50), but only when there are at least 2 consecutive timestamps. Once the timestamps stop being consecutive, I then need to ignore them until we find the next 2 consecutive ones.
1 2021-01-26 09:45:58 50
2 2021-01-26 09:47:23 20
3 2021-01-26 09:47:29 50
4 2021-01-26 09:48:23 50
In the example above:
The first record (ID 1) would fail (only 1 hit in the required timescale),
ID 2 would fail (value too low),
but records 3 and 4 would qualify for inclusion in the sum.
This is a type of gaps-and-islands problem. You can identify the groups by counting the number of rows below 50 up to each row. Then, aggregate the groups with the conditions you want:
select grp, sum(value)
from (select t.*,
             sum(value < 50) over (order by timestamp) as grp
      from t
     ) t
where value >= 50
group by grp
having count(*) >= 2;
This produces a separate sum for each group of adjacent qualifying values. If you want the total sum, then you can use a subquery or CTE based on this query.
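The grouping idea can be illustrated outside SQL as well. The following is a small Python sketch under the assumption that the readings are already sorted by timestamp; `island_sums` is a hypothetical helper name, and the threshold of 50 and minimum run length of 2 follow the query above.

```python
from itertools import groupby

def island_sums(values, threshold=50, min_len=2):
    """values: readings already sorted by timestamp.
    Returns one sum per run of adjacent readings >= threshold
    that is at least min_len long."""
    sums = []
    # groupby splits the sequence into runs of adjacent qualifying /
    # non-qualifying readings, which plays the role of the grp column.
    for qualifies, run in groupby(values, key=lambda v: v >= threshold):
        run = list(run)
        if qualifies and len(run) >= min_len:
            sums.append(sum(run))
    return sums

# The four sample rows from the question.
sample_sums = island_sums([50, 20, 50, 50])
```

For the four sample rows, only the run formed by records 3 and 4 is long enough to be summed.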
If you actually just want the overall sum, you can use lead() and lag():
select sum(value)
from (select t.*,
             lag(value) over (order by timestamp) as prev_value,
             lead(value) over (order by timestamp) as next_value
      from t
     ) t
where value >= 50 and
      (prev_value >= 50 or next_value >= 50);
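The neighbour test behind lag() and lead() can be sketched in a few lines of Python: a reading counts only if it qualifies and at least one adjacent reading qualifies too. The function name is hypothetical and the input is assumed to be sorted by timestamp.

```python
def sum_with_qualifying_neighbour(values, threshold=50):
    """Sum readings >= threshold that have a qualifying previous or next reading."""
    total = 0
    for i, v in enumerate(values):
        # Equivalent of lag(): the first row has no previous value.
        prev_ok = i > 0 and values[i - 1] >= threshold
        # Equivalent of lead(): the last row has no next value.
        next_ok = i + 1 < len(values) and values[i + 1] >= threshold
        if v >= threshold and (prev_ok or next_ok):
            total += v
    return total

# The four sample rows from the question.
overall = sum_with_qualifying_neighbour([50, 20, 50, 50])
```

On the sample rows this counts records 3 and 4 and skips the isolated first record.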
I have a table that looks like this:
What I need: for each day, show the accumulated (moving) number of new Droppers in the last 5 days, inclusive,
split into US vs. non-US geos.
Report columns:
DataTimstamp - upper boundary of a 5-day time frame
Total - number of new Droppers within the time frame
Region_US - number of new Droppers where country = 'US'
Region_rest - number of new Droppers where country <> 'US'
This is my code:
Create view new_droppers_per_date as
Select date, country, count(dropperid) as num_new_customers
From (
    Select dropperid, country, min(cast(LoginTimestamp as date)) as date
    From droppers
    Group by dropperid, country
) as t1
Group by date, country;

Select DataTimstamp, region_us, region_rest
From
    (Select date as DataTimstamp,
            sum(num_new_customers) over (order by date range between 5 preceding and 1 preceding) as total
     From new_droppers_per_date) as t1
inner join
    (Select date,
            sum(num_new_customers) over (order by date range between 5 preceding and 1 preceding) as region_us
     From new_droppers_per_date where country = 'us') as t2 on t1.DataTimstamp = t2.date
inner join
    (Select date,
            sum(num_new_customers) over (order by date range between 5 preceding and 1 preceding) as region_rest
     From new_droppers_per_date where country <> 'us') as t3 on t2.date = t3.date;
I was wondering if there is an easier or smarter way to do this without so many joins and a view.
Thank you for the help :)
Here is one way to do it using window functions. First assign a flag to the first login of each DropperId, then aggregate by day and count the number of new logins. Finally, make a window sum() with a range frame that spans over the last 5 days.
select
    LoginDay,
    sum(CntNewDroppers) over(
        order by LoginDay
        -- current day plus the 4 days before it = 5 days inclusive
        range between interval 4 day preceding and current row
    ) RunningCntNewDroppers
from (
    select
        date(LoginTimestamp) LoginDay,
        sum(rn = 1) CntNewDroppers
    from (
        select
            LoginTimestamp,
            row_number() over(partition by DropperId order by LoginTimestamp) rn
        from mytable
    ) t
    group by date(LoginTimestamp)
) t
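The three steps (flag first logins, count new droppers per day, sum over a 5-day window) can be mirrored in Python to check results on a small sample. This is only a sketch: the function name and the sample logins are invented, and the window is taken to mean the current day plus the 4 days before it.

```python
from collections import Counter
from datetime import datetime, timedelta

def rolling_new_droppers(logins, window_days=5):
    """logins: list of (dropper_id, login_timestamp).
    Returns {login_day: new droppers in the window ending on that day}."""
    # Step 1: first login day per dropper (the rn = 1 row in the SQL).
    first_day = {}
    for dropper_id, ts in logins:
        day = ts.date()
        if dropper_id not in first_day or day < first_day[dropper_id]:
            first_day[dropper_id] = day
    # Step 2: number of new droppers per day.
    new_per_day = Counter(first_day.values())
    # Step 3: window sum for every day that has at least one login.
    result = {}
    for day in sorted({ts.date() for _, ts in logins}):
        result[day] = sum(new_per_day.get(day - timedelta(days=k), 0)
                          for k in range(window_days))
    return result

# Hypothetical sample: dropper 1 logs in twice, droppers 2-4 once each.
sample_logins = [
    (1, datetime(2021, 1, 1, 9)), (2, datetime(2021, 1, 1, 10)),
    (1, datetime(2021, 1, 3, 8)), (3, datetime(2021, 1, 5, 12)),
    (4, datetime(2021, 1, 7, 12)),
]
rolling = rolling_new_droppers(sample_logins)
```

Splitting by region would amount to running the same function on the US and non-US logins separately.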
My problem: I have a table with price and date. I need the average price over the last 7 existing days. E.g.: I have prices from today, yesterday, 30 days ago, 43 days ago, etc. I need an average not over the last 7 calendar days, but over the last 7 days that actually have data.
My code:
SELECT AVG(price)
FROM table
GROUP BY date
ORDER BY date DESC LIMIT 7
But this gives me 7 average prices, one for each day.
Maybe someone has another idea?
Use a subquery to get the last 7 existing days, get the earliest of those dates, then join that with the table.
SELECT AVG(price)
FROM table AS t1
JOIN (SELECT MIN(dateday) AS mindate
      FROM
          (SELECT DATE(date) AS dateday
           FROM table
           GROUP BY dateday
           ORDER BY dateday DESC LIMIT 7
          ) AS x
     ) AS t2
ON t1.date >= t2.mindate
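The same two-step idea (find the 7 most recent distinct dates, then average every price on or after the earliest of them) can be sketched in Python. The function name and sample rows are illustrative only, and the input is assumed to be non-empty.

```python
from datetime import date

def avg_last_existing_days(rows, n=7):
    """rows: non-empty list of (date, price).
    Averages all prices that fall on the n most recent dates that have data."""
    # The n most recent distinct dates (the inner subquery with LIMIT).
    last_days = sorted({d for d, _ in rows}, reverse=True)[:n]
    # The earliest of those dates (the MIN(dateday) step).
    cutoff = min(last_days)
    prices = [p for d, p in rows if d >= cutoff]
    return sum(prices) / len(prices)

# Hypothetical sample with gaps between dates and two prices on one day.
sample_rows = [
    (date(2021, 1, 1), 10), (date(2021, 1, 5), 20),
    (date(2021, 1, 5), 40), (date(2021, 1, 9), 30),
]
avg_two_days = avg_last_existing_days(sample_rows, n=2)
```

With `n=2` the sample uses only the prices from January 5 and January 9, regardless of the gap between them.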
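Use the AVG function and a subquery. Note that this averages the 7 most recent rows, so it matches "the last 7 existing days" only if there is one row per date:
select avg(price)
from
    (SELECT date, price
     FROM table
     ORDER BY date desc limit 7
    ) as t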
I'm looking for a function to return the most predominant non numeric value from a table.
My database table records readings from a weather station. Many of these are numeric, but wind direction is recorded as one of 16 text values (N, NNE, NE, ENE, E, etc.) in a varchar field. Records are added every 15 minutes, so 96 rows represent a day's weather.
I'm trying to compute the predominant wind direction for the day. Manually you would add together the number of Ns, NNEs, NEs etc and see which there are most of.
Has MySQL got a neat way of doing this?
Thanks
It's difficult to answer your question without seeing your schema, but this should help you.
Assuming the wind directions are stored in the same column as the numeric values you want to ignore, you can use REGEXP to ignore the numeric values, like this:
select generic_string, count(*)
from your_table
where day = '2014-01-01'
and generic_string not regexp '^[0-9]*$'
group by generic_string
order by count(*) desc
limit 1
If wind direction is the only thing stored in the column then it's a little simpler:
select wind_direction, count(*)
from your_table
where day = '2014-01-01'
group by wind_direction
order by count(*) desc
limit 1
You can do this for multiple days using sub-queries. For example (assuming you don't have any data in the future) this query will give you the most common wind direction for each day in the current month:
select this_month.day,
    (
        select winddir
        from weatherdatanum
        where thedate >= this_month.day
        and thedate < this_month.day + interval 1 day
        group by winddir
        order by count(*) desc
        limit 1
    ) as daily_leader
from
(
    select distinct date(thedate) as day
    from weatherdatanum
    -- first day of the current month
    where thedate >= concat(left(current_date(),7),'-01')
) this_month
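For comparison, the per-day count-and-pick-the-largest step looks like this in Python. It is a sketch only: `daily_predominant` is a made-up name and the sample readings are invented; when two directions tie, the one seen first that day is returned.

```python
from collections import Counter, defaultdict
from datetime import datetime

def daily_predominant(readings):
    """readings: iterable of (timestamp, wind_dir).
    Returns {day: most frequent wind direction on that day}."""
    per_day = defaultdict(Counter)
    for ts, wind_dir in readings:
        # One counter per calendar day, like GROUP BY day, winddir.
        per_day[ts.date()][wind_dir] += 1
    # most_common(1) corresponds to ORDER BY count(*) DESC LIMIT 1.
    return {day: counts.most_common(1)[0][0] for day, counts in per_day.items()}

# Hypothetical sample: three readings on each of two days.
sample_readings = [
    (datetime(2014, 1, 1, 0, 0), "N"), (datetime(2014, 1, 1, 0, 15), "N"),
    (datetime(2014, 1, 1, 0, 30), "NE"), (datetime(2014, 1, 2, 0, 0), "S"),
    (datetime(2014, 1, 2, 0, 15), "SW"), (datetime(2014, 1, 2, 0, 30), "SW"),
]
leaders = daily_predominant(sample_readings)
```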
The following query should return a list of wind directions along with their counts, sorted by most occurrences first:
SELECT wind_dir, COUNT(wind_dir) AS count FROM `mytable` GROUP BY wind_dir ORDER BY count DESC
Hope that helps.
I need to do a query and join with all days of the year but in my db there isn't a calendar table.
After googling I found generate_series() in PostgreSQL. Does MySQL have anything similar?
My actual table has something like:
date qty
1-1-11 3
1-1-11 4
4-1-11 2
6-1-11 5
But my query has to return:
1-1-11 7
2-1-11 0
3-1-11 0
4-1-11 2
and so on ..
This is how I do it. It creates a range of dates from 2011-01-01 to 2011-12-31:
select
    date_format(
        adddate('2011-1-1', @num:=@num+1),
        '%Y-%m-%d'
    ) date
from
    any_table,
    (select @num:=-1) num
limit
    365
-- use limit 366 for leap years if you're putting this in production
The only requirement is that the number of rows in any_table should be greater or equal to the size of the needed range (>= 365 rows in this example). You will most likely use this as a subquery of your whole query, so in your case any_table can be one of the tables you use in that query.
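What the generated calendar is ultimately for, filling in days with no rows as zero, can be sketched in Python for a quick check of the expected output. The helper name is hypothetical, and the question's sample dates are assumed to be in day-month-year order (1-1-11 read as 2011-01-01).

```python
from datetime import date, timedelta

def fill_missing_days(qty_by_day, start, end):
    """Return [(day, qty)] for every day from start to end inclusive,
    using 0 for days that have no entry."""
    out = []
    day = start
    while day <= end:
        out.append((day, qty_by_day.get(day, 0)))
        day += timedelta(days=1)
    return out

# Totals per day from the question's sample (3 + 4 on the first day).
totals = {date(2011, 1, 1): 7, date(2011, 1, 4): 2, date(2011, 1, 6): 5}
filled = fill_missing_days(totals, date(2011, 1, 1), date(2011, 1, 6))
quantities = [qty for _, qty in filled]
```

In SQL the equivalent is a left join from the generated date list to the totals, with missing quantities replaced by 0.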
Enhanced version of the solution from @Karolis that ensures it works for any year (including leap years):
select date from (
    select
        date_format(
            adddate('2011-1-1', @num:=@num+1),
            '%Y-%m-%d'
        ) date
    from
        any_table,
        (select @num:=-1) num
    limit
        366
) as dt
where year(date)=2011
I was looking for this solution without the hardcoded date, and came up with this version, which is valid for the current year (with help from the answers above).
The start date is computed at runtime with MAKEDATE(year(now()), 1), so no year has to be hardcoded. The year filter is still needed, compared against year(now()) instead of 2011, because in a non-leap year the 366th row would be January 1 of the following year. As stated before, it does not matter which table is used as long as it has at least 366 rows.
select date from (
    select
        date_format(
            adddate(MAKEDATE(year(now()),1), @num:=@num+1),
            '%Y-%m-%d'
        ) date
    from
        your_table,
        (select @num:=-1) num
    limit
        366
) as dt
where year(date) = year(now())
Just in case someone is looking for generate_series() to generate a series of dates or ints as a temp table in MySQL.
With MySQL8 (MySQL version 8.0.27) you can do something like this to simulate:
WITH RECURSIVE nrows(date) AS (
SELECT MAKEDATE(2021,333) UNION ALL
SELECT DATE_ADD(date,INTERVAL 1 day) FROM nrows WHERE date<=CURRENT_DATE
)
SELECT date FROM nrows;
Result:
2021-11-29
2021-11-30
2021-12-01
2021-12-02
2021-12-03
2021-12-04
2021-12-05
2021-12-06
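For readers who want to see the recursion spelled out, here is a minimal Python sketch of the same idea: start from an anchor date and keep adding one day until a stop date is reached. The function name is hypothetical, and the stop date is fixed to the last date in the result above instead of the current date so that the output is reproducible.

```python
from datetime import date, timedelta

def date_series(start, stop):
    """Yield every date from start to stop, inclusive."""
    current = start
    while current <= stop:
        yield current
        # The recursive step: previous date plus one day.
        current += timedelta(days=1)

# Same range as the result listed above.
days = list(date_series(date(2021, 11, 29), date(2021, 12, 6)))
```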