From the table below:
newID  year  ID   newValore
1      2020  111  50
1      2020  111  60
1      2021  111  70
1      2021  112  20
1      2021  112  40
1      2022  113  30
1      2022  113  80
2      2020  222  20
2      2020  223  10
2      2021  223  40
2      2021  224  10
2      2021  224  90
2      2021  224  99
2      2022  225  10
2      2023  225  50
Given the example table above, I need a single MySQL query that creates a new table with three columns: newID, the distinct years present for each newID, and a value called diff_cum_year computed by these rules:
1. If the year is the smallest year for that newID, diff_cum_year is the sum of the maximum newValore of each distinct ID for that newID and year.
2. If a year has only one ID for that newID, and that ID was already present for the same newID in year - 1, diff_cum_year is the maximum newValore for that newID and year minus the diff_cum_year of year - 1 for the same newID.
3. If a year has only one ID for that newID, and that ID is not among the IDs of the same newID in year - 1, diff_cum_year is the maximum newValore for that newID and year.
4. If a year has multiple IDs for that newID, diff_cum_year is the sum of the maximum newValore of each distinct ID for that newID and year, minus the diff_cum_year of year - 1 for the same newID.
The output table should look like this:
newID  year  diff_cum_year
1      2020  60   [rule 1: max(50,60)]
1      2021  50   [rule 4: max(70) + max(20,40) - 60 (previous value of diff_cum_year)]
1      2022  80   [rule 3: max(30,80)]
2      2020  30   [rule 1: max(20) + max(10)]
2      2021  109  [rule 4: max(40) + max(10,90,99) - 30 (previous value of diff_cum_year)]
2      2022  10
2      2023  40
Here is one way to solve this problem (the year column is named year_ in the query). The solution follows these steps:
- generate the maximum "newValore" for each triple <newID, year_, ID>
- sum those maxima for each pair <newID, year_>
- for IDs present in consecutive years, subtract the previous year's total from the current one
- keep the smallest value for each pair <newID, year_> (subtraction was the last operation, so the smallest value is the one that had the previous year subtracted)
Each of these operations is done in a separate subquery:
WITH max_vals AS (
    SELECT DISTINCT newID,
                    year_,
                    ID,
                    MAX(newValore) OVER(PARTITION BY newID, year_, ID) AS max_value
    FROM tab
), sum_max_vals AS (
    SELECT *, SUM(max_value) OVER(PARTITION BY newID, year_) AS sum_max_value
    FROM max_vals
), sum_max_vals_with_subs AS (
    SELECT newID,
           year_,
           sum_max_value -
           CASE WHEN LAG(year_) OVER(PARTITION BY ID ORDER BY year_) = year_ - 1
                THEN LAG(sum_max_value) OVER(PARTITION BY ID ORDER BY year_)
                ELSE 0
           END AS diff_cum_year
    FROM sum_max_vals
)
SELECT newID,
       year_,
       MIN(diff_cum_year) AS diff_cum_year
FROM sum_max_vals_with_subs
GROUP BY newID, year_
Check the demo here.
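As a rough sanity check of the query above, the sketch below runs it against the question's sample rows and returns the result for comparison with the expected output. It is only a sketch under assumptions: SQLite (through Python's standard sqlite3 module, version 3.25 or later for window functions) stands in for MySQL, the table name tab and column name year_ are taken from the answer, and the helper name run_diff_cum_year is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (newID, year_, ID, newValore)
ROWS = [
    (1, 2020, 111, 50), (1, 2020, 111, 60), (1, 2021, 111, 70),
    (1, 2021, 112, 20), (1, 2021, 112, 40), (1, 2022, 113, 30),
    (1, 2022, 113, 80), (2, 2020, 222, 20), (2, 2020, 223, 10),
    (2, 2021, 223, 40), (2, 2021, 224, 10), (2, 2021, 224, 90),
    (2, 2021, 224, 99), (2, 2022, 225, 10), (2, 2023, 225, 50),
]

# Same query as in the answer, with an ORDER BY added so the result is deterministic.
QUERY = """
WITH max_vals AS (
    SELECT DISTINCT newID, year_, ID,
           MAX(newValore) OVER (PARTITION BY newID, year_, ID) AS max_value
    FROM tab
), sum_max_vals AS (
    SELECT *, SUM(max_value) OVER (PARTITION BY newID, year_) AS sum_max_value
    FROM max_vals
), sum_max_vals_with_subs AS (
    SELECT newID, year_,
           sum_max_value -
           CASE WHEN LAG(year_) OVER (PARTITION BY ID ORDER BY year_) = year_ - 1
                THEN LAG(sum_max_value) OVER (PARTITION BY ID ORDER BY year_)
                ELSE 0
           END AS diff_cum_year
    FROM sum_max_vals
)
SELECT newID, year_, MIN(diff_cum_year) AS diff_cum_year
FROM sum_max_vals_with_subs
GROUP BY newID, year_
ORDER BY newID, year_
"""


def run_diff_cum_year():
    """Load the sample rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (newID INT, year_ INT, ID INT, newValore INT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_diff_cum_year():
        print(row)
```

Note that the query subtracts the previous year's sum of maxima rather than the previous diff_cum_year; on this sample the two coincide, which may not hold for other data.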
Related
I have a table which looks like this:
start_date end_date id value
05.10.2010 07.10.2010 1 5
11.12.2010 15.12.2010 2 10
01.01.2023 3 6
I want to write an SQL query that multiplies the number of days from start_date to end_date for each id by its value. So the desired result is:
id sum_value
1 15
2 50
3 60
It's 15 because there are 3 days (from 05.10.2010 to 07.10.2010) for id 1 and the value is 5.
It's 50 because there are 5 days (from 11.12.2010 to 15.12.2010) for id 2 and the value is 10.
It's 60 because there are 10 days (from 01.01.2023 to the current date) for id 3 and the value is 6.
If end_date is empty, it means the current date.
How can I do that?
Use DATEDIFF() to subtract the dates. Add 1 to the result, because the difference alone does not count both end dates.
Use IFNULL() to replace the missing end_date with the current date.
SELECT id, value * (1 + DATEDIFF(IFNULL(end_date, CURDATE()), start_date)) AS sum_value
FROM yourtable
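The "+ 1" is easy to get wrong, so here is a small arithmetic check of the inclusive day count on the question's three rows. This is a plain Python sketch, not MySQL: the helper name inclusive_days is hypothetical, and the "current date" is pinned to 10.01.2023 as an assumption so that the third row reproduces the 10 days mentioned in the question.

```python
from datetime import date
from typing import Optional


def inclusive_days(start: date, end: Optional[date], today: date) -> int:
    """Days from start to end counting both end dates; a missing end means today."""
    effective_end = end if end is not None else today
    return (effective_end - start).days + 1


# Assumed "current date", chosen so that id 3 spans 10 days as in the question.
TODAY = date(2023, 1, 10)

# (id, start_date, end_date, value) from the question's table
ROWS = [
    (1, date(2010, 10, 5), date(2010, 10, 7), 5),
    (2, date(2010, 12, 11), date(2010, 12, 15), 10),
    (3, date(2023, 1, 1), None, 6),
]

sum_values = {
    row_id: value * inclusive_days(start, end, TODAY)
    for row_id, start, end, value in ROWS
}
print(sum_values)  # {1: 15, 2: 50, 3: 60}
```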
I want to fill in the missing months in the table. If the month in the current row is missing, the query should look at the month name in the previous row and put the following month in the current row.
For example, if the current month is null and the previous row has January, the null should be replaced with February.
Likewise, if the current month is null and the previous row has August, the null should be replaced with September.
Code for creating the table:
CREATE TABLE IF NOT EXISTS missing_months (
`Cust_id` INT,
`Month` VARCHAR(9) CHARACTER SET utf8,
`Sales_value` INT
);
INSERT INTO missing_months VALUES
(1,'Janurary',224),
(2,'February',224),
(3,NULL,239),
(4,'April',205),
(5,NULL,218),
(6,'June',201),
(7,NULL,205),
(8,'August',246),
(9,NULL,218),
(10,NULL,211),
(11,'November',223),
(12,'December',211);
output is:
Cust_id Month Sales_value
1 Janurary 224
2 February 224
3 null 239
4 April 205
5 null 218
6 June 201
7 null 205
8 August 246
9 null 218
10 null 211
11 November 223
12 December 211
But I want the output to look like this:
Cust_id Month Sales_value
1 Janurary 224
2 February 224
3 March 239
4 April 205
5 May 218
6 June 201
7 July 205
8 August 246
9 September 218
10 October 211
11 November 223
12 December 211
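update missing_months m
join missing_months prev on prev.Cust_id = m.Cust_id - 1
set m.Month = date_format(str_to_date(concat(prev.Month, '-1970-01'), '%M-%Y-%d') + interval 1 month, '%M')
where m.Month is null;
-- MySQL does not allow ORDER BY in a multi-table UPDATE, so it is omitted here.
-- With consecutive NULL months (Cust_id 9 and 10), one pass may leave a row unfilled; rerun until none remain.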
But relying on an identifier field for order is bad; if your data is ordered, you should have some other column indicating what the order is.
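To make the "look at the previous row" rule concrete, including the case of two NULL months in a row (Cust_id 9 and 10), here is a minimal Python sketch of the fill logic. It is not the MySQL statement itself: the function name fill_missing_months is hypothetical, the rows are assumed to be already sorted by Cust_id, and the month list is hard-coded in English.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def fill_missing_months(months):
    """Replace each None with the month that follows the previous (already filled) row."""
    filled = []
    for name in months:
        if name is None:
            if not filled or filled[-1] not in MONTHS:
                raise ValueError("no usable previous month to derive the missing one from")
            previous_index = MONTHS.index(filled[-1])
            name = MONTHS[(previous_index + 1) % 12]  # December wraps to January
        filled.append(name)
    return filled


# Month column from the question, in Cust_id order (spelling as in the source data).
source_months = [
    "Janurary", "February", None, "April", None, "June",
    None, "August", None, None, "November", "December",
]

print(fill_missing_months(source_months))
```

Because each filled value is appended before the next row is examined, the second of two consecutive gaps can be derived from the first, which a single-pass self-join may not manage.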
select Cust_id, monthname(STR_TO_DATE(rn, '%m')) as Month_Name,
       Sales_value
from
    (select Cust_id, Month, row_number() over (order by Cust_id) as rn,
            Sales_value
     from missing_months) x;
I have a data table as shown below:
Owner  Month  Year  Target  Achieved
A      April  2021  100     50
B      April  2021  100     80
A      May    2021  100     80
B      May    2021  100     130
A      June   2021  100     50
B      June   2021  100     60
The logic is if there is a shortfall with respect to Achieved then the shortfall amount should be added to next month target.
For Example A's April Target is 100 and Achieved is 50. The Shortfall would be 100-50=50. The 50 should be added to May Target
The required output is:
Owner  Month  Year  Target  Achieved  Shortfall (Target-Achieved)
A      April  2021  100     50        50
A      May    2021  150     80        70
A      June   2021  170     50        120
B      April  2021  100     80        20
B      May    2021  120     130       -10
B      June   2021  100     60        40
Is it possible to achieve this automation in SQL?
Thanks
You want a cumulative sum. Assuming the month column sorts chronologically, the final column is:
select t.*,
       sum(target - achieved) over (partition by owner, year
                                    order by month
                                   ) as shortfall
from t;
You can then use it to calculate the new target:
select t.*,
       sum(target - achieved) over (partition by owner, year
                                    order by month
                                   ) as shortfall,
       (achieved +
        sum(target - achieved) over (partition by owner, year
                                     order by month
                                    )
       ) as new_target
from t;
Order the months chronologically within each year. If the previous shortfall is negative, the current row's shortfall is target - achieved; otherwise it is target + previous shortfall - achieved.
-- MySQL (v8.0)
SELECT t.owner, t.month, t.year
     , t.target + (CASE WHEN t.row_num = 1 THEN 0
                        ELSE CASE WHEN LAG(short_fall) OVER (PARTITION BY t.owner ORDER BY t.row_num) < 0
                                  THEN 0
                                  ELSE LAG(short_fall) OVER (PARTITION BY t.owner ORDER BY t.row_num)
                             END
                   END) target
     , t.achieved
     , CASE WHEN LAG(short_fall) OVER (PARTITION BY t.owner ORDER BY t.row_num) < 0
            THEN t.target - t.achieved
            ELSE short_fall
       END short_fall
FROM (SELECT owner, month
           , year
           , target
           , achieved
           , SUM(target - achieved) OVER
               (PARTITION BY owner, year ORDER BY DATE_FORMAT(STR_TO_DATE(CONCAT(month, ' 1, ', year),'%M %d,%Y'), '%c')) short_fall
           , ROW_NUMBER() OVER
               (PARTITION BY owner, year ORDER BY DATE_FORMAT(STR_TO_DATE(CONCAT(month, ' 1, ', year),'%M %d,%Y'), '%c')) row_num
      FROM test) t;
See the demo at https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=3e114348c2d92015490f76fdbab1c46f
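The carry-forward rule described in the question (a positive shortfall is added to the next month's target, a negative one is not carried) can be sketched procedurally to check the expected table. This is a plain Python illustration under assumptions, not one of the SQL answers: the function name carry_forward is hypothetical, and the rows are assumed to arrive in chronological order for each owner.

```python
def carry_forward(rows):
    """rows: (owner, month, year, target, achieved), chronological per owner.

    Returns (owner, month, year, new_target, achieved, shortfall) tuples,
    where a positive shortfall from the previous month is added to the target.
    """
    previous_shortfall = {}
    result = []
    for owner, month, year, target, achieved in rows:
        # Only a positive shortfall is carried; an over-achievement resets to 0.
        carried = max(previous_shortfall.get(owner, 0), 0)
        new_target = target + carried
        shortfall = new_target - achieved
        previous_shortfall[owner] = shortfall
        result.append((owner, month, year, new_target, achieved, shortfall))
    return result


# Input rows from the question, in the order they appear there.
DATA = [
    ("A", "April", 2021, 100, 50),
    ("B", "April", 2021, 100, 80),
    ("A", "May", 2021, 100, 80),
    ("B", "May", 2021, 100, 130),
    ("A", "June", 2021, 100, 50),
    ("B", "June", 2021, 100, 60),
]

for row in carry_forward(DATA):
    print(row)
```

Each month depends on the adjusted value of the month before it, so the rule is recursive; a plain cumulative sum matches it only while no shortfall goes negative.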
I'm trying to do some analysis on the following data:
WeekDay Date Count
5 06/09/2018 20
6 07/09/2018 Null
7 08/09/2018 19
1 09/09/2018 16
2 10/09/2018 17
3 11/09/2018 24
4 12/09/2018 25
5 13/09/2018 24
6 14/09/2018 23
7 15/09/2018 23
1 16/09/2018 9
2 17/09/2018 23
3 18/09/2018 33
4 19/09/2018 22
5 20/09/2018 31
6 21/09/2018 17
7 22/09/2018 10
1 23/09/2018 12
2 24/09/2018 26
3 25/09/2018 29
4 26/09/2018 27
5 27/09/2018 24
6 28/09/2018 29
7 29/09/2018 27
1 30/09/2018 19
2 01/10/2018 26
3 02/10/2018 39
4 03/10/2018 32
5 04/10/2018 37
6 05/10/2018 Null
7 06/10/2018 26
1 07/10/2018 11
2 08/10/2018 32
3 09/10/2018 41
4 10/10/2018 37
5 11/10/2018 25
6 12/10/2018 20
The problem I want to solve: for each day, I want the average of the last 3 occurrences of the same weekday before it. When one of those values is NULL, I want to ignore it and average only the remaining numbers, not count the NULL as 0. Here is an example:
The dates in this table are day/month/year :)
Ex: For 12/10/2018, I need the average of the days 05/10/2018, 28/09/2018 and 21/09/2018. These are the last 3 days with the same weekday (6) as 12/10/2018. Their values are Null, 29 and 17, so the average must be 23, because the NULL is ignored, and not 15.333.
How can I do this?
The count() function ignores nulls (i.e. it does NOT increment when it encounters a null), so I suggest you simply count the values that may contain the nulls you wish to ignore.
dow datecol value
6 21/09/2018 17
6 28/09/2018 29
6 05/10/2018 Null
e.g. sum(value) above = 46, and the count(value) = 2 so the average is 23.0 (and avg(value) will also return 23.0 as it also ignores nulls)
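select
      weekday
    , `date`
    , `count`
    , (select (sum(`count`) * 1.0) / (count(`count`) * 1.0)
       from atable as t2
       where t2.weekday = t1.weekday
       and t2.`date` < t1.`date`
       -- note: ORDER BY and LIMIT run after the aggregate, so they do not
       -- restrict the sum and count to the 3 most recent rows
       order by t2.`date` DESC
       limit 3
      ) as average
from atable as t1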
You could just use avg(count) in the query above, and get the same result.
ps. I do hope you do NOT use count as a column name! I also would suggest you do NOT use date as a column name either. i.e. Avoid using SQL terms as names.
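One caveat with a correlated subquery of this shape is that ORDER BY ... LIMIT 3 is applied after the aggregate has been computed, so it does not limit the average to the three most recent rows. A different technique, swapped in here and not taken from the answers, is a window frame covering the three preceding rows of the same weekday. The sketch below tries it on the weekday 6 rows from the question, with SQLite (via Python's sqlite3, version 3.25 or later) standing in for the real database; the column names d and cnt and the function name are assumptions chosen to avoid reserved words.

```python
import sqlite3

# Weekday 6 rows from the question, with dates rewritten as ISO strings so they sort correctly.
WEEKDAY6_ROWS = [
    (6, "2018-09-07", None),
    (6, "2018-09-14", 23),
    (6, "2018-09-21", 17),
    (6, "2018-09-28", 29),
    (6, "2018-10-05", None),
    (6, "2018-10-12", 20),
]

# AVG skips NULLs, and the frame holds only the 3 rows before the current one.
QUERY = """
SELECT d,
       AVG(cnt) OVER (PARTITION BY weekday
                      ORDER BY d
                      ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS average
FROM atable
ORDER BY d
"""


def last_three_same_weekday_average():
    """Return {date: average of the previous 3 same-weekday counts, NULLs ignored}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE atable (weekday INT, d TEXT, cnt INT)")
    conn.executemany("INSERT INTO atable VALUES (?, ?, ?)", WEEKDAY6_ROWS)
    averages = dict(conn.execute(QUERY).fetchall())
    conn.close()
    return averages


if __name__ == "__main__":
    print(last_three_same_weekday_average()["2018-10-12"])  # 23.0
```

For 12/10/2018 the frame holds 17, 29 and NULL, so the average is 46 / 2 = 23, matching the example in the question.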
SELECT WeekDay, AVG(Count)
FROM myTable
WHERE Count IS NOT NULL
GROUP BY WeekDay
Use IsNULL(Count,0) in your Select
SELECT WeekDay, AVG(IsNULL(Count,0))
FROM myTable
GROUP BY WeekDay
First off, you need to get the number of instances of that weekday in the data since you just need the last 3 same week days
create table table2
as
select
row_number() over(partition by weekday order by date desc) as rn
,weekday
,date
,count
from table
From here, you can get what you want. Given your explanation, you don't need to filter out the NULL values for count: the avg() aggregation simply ignores them.
select
weekday
,avg(count)
from table2
where rn in (1,2,3)
group by weekday
I have table like this
Post_ID KEY Value
1 year 2014
1 month 09
2 year 2014
2 month 10
3 year 2014
3 month 09
In this table I have post_id, key (which indicates whether the row holds the year or the month of the post) and value (which holds the year or month value).
I want to return every post_id whose year is "2014" and whose month is "09", which means I should get the values 1 and 3.
I'd count how many properties match this specification and filter on that count with a HAVING clause:
SELECT post_id
FROM my_table
WHERE (`key` = 'year' AND value = '2014') OR
      (`key` = 'month' AND value = '09')
GROUP BY post_id
HAVING COUNT(*) = 2
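A quick way to convince yourself that the GROUP BY / HAVING filter returns posts 1 and 3 is to run it on the question's six rows. The sketch below does that with SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption); the table name my_table comes from the answer, and the function name posts_matching is hypothetical. The key column is quoted with backticks, which SQLite accepts for MySQL compatibility.

```python
import sqlite3

# (post_id, key, value) rows from the question
ROWS = [
    (1, "year", "2014"), (1, "month", "09"),
    (2, "year", "2014"), (2, "month", "10"),
    (3, "year", "2014"), (3, "month", "09"),
]

QUERY = """
SELECT post_id
FROM my_table
WHERE (`key` = 'year' AND value = ?) OR
      (`key` = 'month' AND value = ?)
GROUP BY post_id
HAVING COUNT(*) = 2
ORDER BY post_id
"""


def posts_matching(year, month):
    """Return the post_ids that have both the given year and the given month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (post_id INT, `key` TEXT, value TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", ROWS)
    post_ids = [row[0] for row in conn.execute(QUERY, (year, month))]
    conn.close()
    return post_ids


if __name__ == "__main__":
    print(posts_matching("2014", "09"))  # [1, 3]
```

The COUNT(*) = 2 test relies on each post having at most one row per key; if duplicates are possible, counting distinct keys would be a safer variant.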