Calculating moving average for different values in a column MySQL

I have a dataset like this:
team date score
A 2011-05-01 50
A 2011-05-02 54
A 2011-05-03 51
A 2011-05-04 49
A 2011-05-05 59
B 2011-05-03 30
B 2011-05-04 35
B 2011-05-05 39
B 2011-05-06 47
B 2011-05-07 50
I want to add another column called MA3 that holds the moving average of the scores over the last 3 days. The tricky part is that the MA has to be calculated separately for each team. The end result should look like this:
team date score MA3
A 2011-05-01 50 null
A 2011-05-02 54 null
A 2011-05-03 51 null
A 2011-05-04 49 51.66
A 2011-05-05 59 51.33
B 2011-05-03 30 null
B 2011-05-04 35 null
B 2011-05-05 39 null
B 2011-05-06 47 34.66
B 2011-05-07 50 40.33
If there were only a single team, I would write:
SELECT team,
  date,
  AVG(score) OVER (ORDER BY date ASC ROWS 3 PRECEDING) AS MA3
FROM table

You're missing the PARTITION BY clause:
SELECT team,
date,
AVG(score) OVER (
PARTITION BY team
ORDER BY date ASC ROWS 3 PRECEDING
) AS MA3
FROM table
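Note that AVG is always calculated, however few rows the frame actually contains. Also, ROWS 3 PRECEDING is shorthand for ROWS BETWEEN 3 PRECEDING AND CURRENT ROW, so that frame spans up to four rows including the current one. To get the values in your expected output (the average of the three previous rows, and null until three previous rows exist), you could do it like this:
SELECT team,
  date,
  CASE
    WHEN count(*) OVER w < 3 THEN null
    ELSE AVG(score) OVER w
  END AS MA3
FROM table
WINDOW w AS (PARTITION BY team ORDER BY date ASC ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING)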
dbfiddle
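To check the expected values from the question, here is a minimal sketch that runs a partitioned window query on the sample data. It uses Python's built-in sqlite3 module instead of MySQL purely so that the example is self-contained, and the table name scores is a stand-in for the real table; both are assumptions. The frame ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING covers the three rows before the current one, and it needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
# "scores" is a hypothetical table name; the question only calls it "table".
conn.execute("CREATE TABLE scores (team TEXT, date TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("A", "2011-05-01", 50), ("A", "2011-05-02", 54), ("A", "2011-05-03", 51),
        ("A", "2011-05-04", 49), ("A", "2011-05-05", 59),
        ("B", "2011-05-03", 30), ("B", "2011-05-04", 35), ("B", "2011-05-05", 39),
        ("B", "2011-05-06", 47), ("B", "2011-05-07", 50),
    ],
)

query = """
SELECT team, date, score,
       CASE WHEN COUNT(*) OVER w < 3 THEN NULL   -- fewer than 3 earlier rows
            ELSE AVG(score) OVER w
       END AS ma3
FROM scores
WINDOW w AS (PARTITION BY team ORDER BY date
             ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING)
ORDER BY team, date
"""

# Map (team, date) to the moving average so single rows are easy to inspect.
ma3_by_row = {(team, day): ma3 for team, day, _score, ma3 in conn.execute(query)}

for key, ma3 in ma3_by_row.items():
    print(key, ma3)
```

The first three rows of each team come back as NULL (None in Python), and the fourth row of team A is the average of 50, 54 and 51.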
Side note
Your next question might be about logical windowing, because often you don't actually want to calculate the average over 3 rows, but over some interval, e.g. 3 days. Luckily, MySQL implements this. You could then write:
WINDOW w AS (PARTITION BY team ORDER BY date ASC RANGE INTERVAL 3 DAY PRECEDING)

Related

Get original RANK() value based on row create date

I'm using MariaDB and trying to see if I can get the original ranking for each row of a table based on its create date.
For example, imagine a scores table that has different scores for different users and categories (a lower score is better in this case):
id  leaderboardId  userId  score  submittedAt          rankAtSubmit
9   15             555     50.5   2022-01-20 01:00:00  2
8   15             999     58.0   2022-01-19 01:00:00  3
7   15             999     59.1   2022-01-15 01:00:00  3
6   15             123     49.0   2022-01-12 01:00:00  1
5   15             222     51.0   2022-01-10 01:00:00  1
4   14             222     87.0   2022-01-09 01:00:00  1
5   15             555     51.0   2022-01-04 01:00:00  1
The "rankAtSubmit" column is what I'm trying to generate here if possible.
I want to take the best/smallest score of each user+leaderboard and determine what the rank of that score was when it was submitted.
My attempt failed because in MySQL you cannot reference outer-level columns more than one level deep in a subquery, which results in an error when referencing t.submittedAt in the following query:
SELECT *, (
SELECT ranking FROM (
SELECT id, RANK() OVER (PARTITION BY leaderboardId ORDER BY score ASC) ranking
FROM scores x
WHERE x.submittedAt <= t.submittedAt
GROUP BY userId, leaderboardId
) ranks
WHERE ranks.id = t.id
) rankAtSubmit
FROM scores t
Instead of using RANK(), I was able to accomplish this with a single subquery that counts the number of users that have a score that is lower than and submitted before the given score.
SELECT id, userId, score, leaderboardId, submittedAt,
(
SELECT COUNT(DISTINCT userId) + 1
FROM scores t2
WHERE t2.userId <> t.userId AND -- count other users, as described above
t2.leaderboardId = t.leaderboardId AND
t2.score < t.score AND
t2.submittedAt <= t.submittedAt
) AS rankAtSubmit
FROM scores t
What I understand from your question is that you want to know the minimum and maximum rank of each user.
Here is the code:
SELECT userId, leaderboardId, score, min(rankAtSubmit), max(rankAtSubmit)
FROM scores
GROUP BY userId,
  leaderboardId,
  score

Query runs too slowly and even stops because the time limit is exceeded with 17000 rows

I have table 1:
historial_id  timestamp   address  value  insertion_time
1             2022-01-29  1        84     2022-01-31
2             2022-01-29  2        40     2022-01-31
3             2022-01-30  1        84     2022-01-31
4             2022-01-30  2        41     2022-01-31
5             2022-01-30  2        41     2022-01-31
(sometimes it has repeated rows)
...
I need a Query to get:
timestamp   value(address 1)  value(address 2)
2022-01-29  84                40
2022-01-30  84                41
...
I tried with:
SELECT timestamp, ( SELECT value
FROM historical
WHERE register_type=11
AND address=2
AND timestamp=t1.timestamp
GROUP BY value
) AS CORRIENTE_mA,
( SELECT value
FROM historical
WHERE register_type=11
AND address=1
AND timestamp=t1.timestamp
GROUP BY value ) AS Q_M3pH
FROM historical AS t1
GROUP BY timestamp;
But it's too slow; it even stops because the time limit is exceeded.
I also tried DISTINCT instead of GROUP BY.
I think you need a pivot.
Please try to avoid MySQL keywords such as timestamp as column names.
The query below returns only the max value for addresses 1 and 2, grouped by timestamp.
This is a simplified version of your query:
select
`timestamp`
, max(case when address=1 then value end) as value_address1
, max(case when address=2 then value end) as value_address2
from historical
group by `timestamp`;
Result:
timestamp value_address1 value_address2
2022-01-29 84 40
2022-01-30 84 41
Demo
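As a quick way to try the conditional-aggregation pivot on the sample rows, here is a small sketch. It runs the same query shape through Python's built-in sqlite3 module rather than MySQL (an assumption made only to keep the example self-contained), so the column is quoted with double quotes instead of backticks, and only the three columns the query needs are created.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE historical ("timestamp" TEXT, address INTEGER, value INTEGER)'
)
conn.executemany(
    "INSERT INTO historical VALUES (?, ?, ?)",
    [
        ("2022-01-29", 1, 84),
        ("2022-01-29", 2, 40),
        ("2022-01-30", 1, 84),
        ("2022-01-30", 2, 41),
        ("2022-01-30", 2, 41),  # repeated row, as in the question
    ],
)

# One pass over the table: each CASE picks the value for one address,
# and MAX collapses the rows of a timestamp into a single output row.
pivot_rows = conn.execute(
    """
    SELECT "timestamp",
           MAX(CASE WHEN address = 1 THEN value END) AS value_address1,
           MAX(CASE WHEN address = 2 THEN value END) AS value_address2
    FROM historical
    GROUP BY "timestamp"
    ORDER BY "timestamp"
    """
).fetchall()

for row in pivot_rows:
    print(row)
```

Because the table is scanned once instead of running two subqueries per row, the repeated rows are also absorbed by the GROUP BY without any extra DISTINCT.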

mysql group by day and count then filter only the highest value for each day

I'm stuck on this query. I need to do a group by date, card_id and only show the highest hits. I have this data:
date card_name card_id hits
29/02/2016 Paul Stanley 1345 12
29/02/2016 Phil Anselmo 1347 16
25/02/2016 Dave Mustaine 1349 10
25/02/2016 Ozzy 1351 17
23/02/2016 Jhonny Cash 1353 13
23/02/2016 Elvis 1355 15
20/02/2016 James Hethfield 1357 9
20/02/2016 Max Cavalera 1359 12
My query at the moment
SELECT DATE(card.create_date) `day`, `name`,card_model_id, count(1) hits
FROM card
Join card_model ON card.card_model_id = card_model.id
WHERE DATE(card.create_date) >= DATE(DATE_SUB(NOW(), INTERVAL 1 MONTH)) AND card_model.preview = 0
GROUP BY `day`, card_model_id
;
I want to group by date and card_id, then keep only the row with the highest hits, so that there is one row per date. It is as if I should run max(hits) with a group by, but that won't work.
Like:
date card_name card_id hits
29/02/2016 Phil Anselmo 1347 16
25/02/2016 Ozzy 1351 17
23/02/2016 Elvis 1355 15
20/02/2016 Max Cavalera 1359 12
Any light on that will be appreciated. Thanks for reading.
Here is one way to do this. Based on your sample data (not the query):
select s.*
from sample s
where s.hits = (select max(s2.hits)
from sample s2
where date(s2.date) = date(s.date)
);
Your attempted query seems to have no relationship to the sample data, so it is unclear how to incorporate those tables (the attempted query has different columns and two tables).
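For illustration, the correlated-subquery filter can be tried on the sample data with the sketch below. It uses Python's built-in sqlite3 module instead of MySQL, and it keeps the dates as the dd/mm/yyyy text shown in the question and compares them directly rather than through date(); both are assumptions made for this example. The table name sample follows the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample (date TEXT, card_name TEXT, card_id INTEGER, hits INTEGER)"
)
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?)",
    [
        ("29/02/2016", "Paul Stanley", 1345, 12),
        ("29/02/2016", "Phil Anselmo", 1347, 16),
        ("25/02/2016", "Dave Mustaine", 1349, 10),
        ("25/02/2016", "Ozzy", 1351, 17),
        ("23/02/2016", "Jhonny Cash", 1353, 13),
        ("23/02/2016", "Elvis", 1355, 15),
        ("20/02/2016", "James Hethfield", 1357, 9),
        ("20/02/2016", "Max Cavalera", 1359, 12),
    ],
)

# Keep a row only if its hits equal the maximum hits recorded for the same date.
top_rows = conn.execute(
    """
    SELECT s.date, s.card_name, s.card_id, s.hits
    FROM sample s
    WHERE s.hits = (SELECT MAX(s2.hits)
                    FROM sample s2
                    WHERE s2.date = s.date)
    ORDER BY s.card_id
    """
).fetchall()

for row in top_rows:
    print(row)
```

If two cards tie for the highest hits on the same date, this filter returns both rows for that date.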

Summing data for last 7 day look back window

I want a query that gives the sum over a 7-day look-back window.
For each date, the output should contain the date and the sum of impressions over the last 7 days.
For example, I have a table tblFactImps with the data below:
dateFact impressions id
2015-07-01 4022 30
2015-07-02 4021 33
2015-07-03 4011 34
2015-07-04 4029 35
2015-07-05 1023 39
2015-07-06 3023 92
2015-07-07 8027 66
2015-07-08 2024 89
I need output with 2 columns:
dateFact impressions_last_7
The query I have so far:
select dateFact, sum(if(datediff(curdate(), dateFact)<=7, impressions,0)) impressions_last_7 from tblFactImps group by dateFact;
Thanks!
If your fact table is not too big, then a correlated subquery is a simple way to do what you want:
select i.dateFact,
  (select sum(i2.impressions)
   from tblFactImps i2
   where i2.dateFact >= i.dateFact - interval 6 day
     and i2.dateFact <= i.dateFact -- keep later dates out of the look-back window
  ) as impressions_last_7
from tblFactImps i;
You can achieve this by LEFT OUTER JOINing the table with itself on a date range, and summing the impressions grouped by date, as follows:
SELECT
t1.dateFact,
SUM(t2.impressions) AS impressions_last_7
FROM
tblFactImps t1
LEFT OUTER JOIN
tblFactImps t2
ON
t2.dateFact BETWEEN
DATE_SUB(t1.dateFact, INTERVAL 6 DAY)
AND t1.dateFact
GROUP BY
t1.dateFact;
This should give you a sliding 7-day sum for each date in your table.
Assuming your dateFact column is indexed, this query should also be relatively fast.
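The sliding 7-day sum can be checked against the sample data with a short sketch. It runs the self-join through Python's built-in sqlite3 module rather than MySQL, so DATE_SUB(t1.dateFact, INTERVAL 6 DAY) is written as SQLite's date(t1.dateFact, '-6 days'); that substitution is an assumption made only to keep the example runnable on its own.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblFactImps (dateFact TEXT, impressions INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO tblFactImps VALUES (?, ?, ?)",
    [
        ("2015-07-01", 4022, 30), ("2015-07-02", 4021, 33),
        ("2015-07-03", 4011, 34), ("2015-07-04", 4029, 35),
        ("2015-07-05", 1023, 39), ("2015-07-06", 3023, 92),
        ("2015-07-07", 8027, 66), ("2015-07-08", 2024, 89),
    ],
)

# Join every date to the rows that fall within the 7 days ending on it,
# then sum the joined impressions per date.
window_sums = dict(
    conn.execute(
        """
        SELECT t1.dateFact, SUM(t2.impressions) AS impressions_last_7
        FROM tblFactImps t1
        LEFT JOIN tblFactImps t2
          ON t2.dateFact BETWEEN date(t1.dateFact, '-6 days') AND t1.dateFact
        GROUP BY t1.dateFact
        """
    ).fetchall()
)

for day in sorted(window_sums):
    print(day, window_sums[day])
```

The first date only has itself in its window, while 2015-07-08 no longer includes 2015-07-01.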

MySQL query that gets all rows UNTIL the SUM(column) is bigger than X

I have the following data
user_id days date
88 2 2013-08-25
88 4 2013-08-23
88 18 2013-08-5
88 1 2013-08-4
88 2 2013-08-2
73 11 2013-08-2
299 4 2013-08-2
12 983 2013-08-2
I'm trying to get all recent rows (order by DATE desc) for a specific user_id , until the SUM of days column is bigger than X. For example in this case if X=7 I would get the three first rows with SUM(days)=24.
Try this. It uses a user-defined variable that accumulates the running sum in the subquery.
select
  user_id,
  days,
  date
from
  (
    select
      user_id,
      days,
      date,
      @sum_days := @sum_days + days as sum_days
    from
      myTable
    where
      user_id = 88 -- the user of interest (example value from the question)
    order by
      date desc
  ) t
cross join (select @sum_days := 0) const -- resetting your @sum_days var.
where
  sum_days - days <= X -- fill a number in for X here; keeps the row that pushes the sum past X.
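To make the stopping rule from the question concrete, here is a plain Python sketch of the same idea, not a MySQL implementation. The function name rows_until_sum_exceeds is hypothetical, and the dates are written as date objects because the sample uses unpadded day numbers. Rows are taken newest first, and the row that pushes the running total past X is still included, which matches the X=7 example giving three rows with a sum of 24.

```python
from datetime import date

# Sample data from the question as (user_id, days, date) tuples.
rows = [
    (88, 2, date(2013, 8, 25)),
    (88, 4, date(2013, 8, 23)),
    (88, 18, date(2013, 8, 5)),
    (88, 1, date(2013, 8, 4)),
    (88, 2, date(2013, 8, 2)),
    (73, 11, date(2013, 8, 2)),
    (299, 4, date(2013, 8, 2)),
    (12, 983, date(2013, 8, 2)),
]


def rows_until_sum_exceeds(all_rows, user_id, x):
    """Return one user's most recent rows until the sum of days exceeds x."""
    user_rows = sorted(
        (r for r in all_rows if r[0] == user_id),
        key=lambda r: r[2],
        reverse=True,  # newest first, like ORDER BY date DESC
    )
    selected = []
    running_total = 0
    for row in user_rows:
        if running_total > x:
            break  # the total is already bigger than x, stop collecting
        selected.append(row)
        running_total += row[1]
    return selected


picked = rows_until_sum_exceeds(rows, 88, 7)
for row in picked:
    print(row)
print("total days:", sum(r[1] for r in picked))
```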