Calculate difference between min and max for each group [duplicate] - mysql

I have a table of football odds. For each fixture and each of the selected markets, I need to calculate the difference between the odds in the row where updated is at its MAX and the odds in the row where updated is at its MIN. The output should contain that difference together with the odds from the row where updated is MAX.
I would like to accomplish this with a single query.
The table looks like this:
| fix_id| odds| odds_type | updated|
|:-----:| ---:| ---------:| ------:|
| 120 | 1.80| home | 160 |
| 120 | 1.40| home | 150 |
| 120 | 2.00| home | 140 |
| 188 | 1.00| u/o | 200 |
| 121 | 1.60| away | 160 |
| 121 | 1.40| away | 150 |
| 121 | 1.10| away | 140 |
What I'm expecting to get
| fix_id| odds| odds_type | updated| diff|
| -----:| ---:| ---------:| ------:|----:|
| 120 | 1.80| home | 160 | -0.2|
| 121 | 1.60| away | 160 | 0.5|
The query I tried seems to get the difference between the MIN and MAX right, but it returns arbitrary odds instead of the odds from the row with the MAX updated value, and I'm not sure whether it would be efficient to calculate differences for hundreds of fixtures.
SELECT a.*, MAX(a.odds) - MIN(a.odds) difference FROM odds_table a
where odds_type in ('home','away') group by odds_type,fix_id
I used to calculate the differences in PHP and then use them to update a different table inside a loop, but there are thousands of odds, so it takes ages to process.
P.S. I'm using MySQL 5.7.

If you were using a more recent version of MySQL (which I'd highly recommend), this could be achieved much more elegantly using ROW_NUMBER and CTEs. However, the following should achieve what you are after, and does not rely on MySQL's non-standard grouping behaviour:
-- Odds at the latest and earliest update per fixture and market.
-- Grouping by both fix_id and odds_type keeps the query valid under ONLY_FULL_GROUP_BY.
SELECT o.fix_id,
       o.odds_type,
       SUM(CASE WHEN max_values.fix_id IS NOT NULL THEN o.odds END) AS odds,
       MAX(max_values.max_updated) AS updated,
       CAST(SUM(CASE WHEN max_values.fix_id IS NOT NULL THEN o.odds END)
          - SUM(CASE WHEN min_values.fix_id IS NOT NULL THEN o.odds END) AS DECIMAL(10,2)) AS difference
FROM odds_table o
LEFT JOIN (
    SELECT fix_id, odds_type, MAX(updated) AS max_updated
    FROM odds_table
    GROUP BY fix_id, odds_type
) AS max_values ON o.fix_id = max_values.fix_id AND o.odds_type = max_values.odds_type AND o.updated = max_values.max_updated
LEFT JOIN (
    SELECT fix_id, odds_type, MIN(updated) AS min_updated
    FROM odds_table
    GROUP BY fix_id, odds_type
) AS min_values ON o.fix_id = min_values.fix_id AND o.odds_type = min_values.odds_type AND o.updated = min_values.min_updated
WHERE o.odds_type IN ('home', 'away')
GROUP BY o.fix_id, o.odds_type
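For comparison, the ROW_NUMBER/CTE variant mentioned above might look roughly like the sketch below. It is run here through Python's built-in sqlite3 module purely so that it is self-contained; SQLite stands in for MySQL 8.0+ (an assumption, and it needs SQLite 3.25 or newer for window functions), and the aliases rn_latest and rn_earliest are made-up names. The data is the sample table from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE odds_table (fix_id INTEGER, odds REAL, odds_type TEXT, updated INTEGER)"
)
conn.executemany(
    "INSERT INTO odds_table VALUES (?, ?, ?, ?)",
    [
        (120, 1.80, "home", 160),
        (120, 1.40, "home", 150),
        (120, 2.00, "home", 140),
        (188, 1.00, "u/o", 200),
        (121, 1.60, "away", 160),
        (121, 1.40, "away", 150),
        (121, 1.10, "away", 140),
    ],
)

# Number the rows of each (fixture, market) from both ends of the update order,
# then pick the odds of the latest and the earliest row in one aggregation.
query = """
WITH ranked AS (
    SELECT fix_id, odds_type, odds, updated,
           ROW_NUMBER() OVER (PARTITION BY fix_id, odds_type ORDER BY updated DESC) AS rn_latest,
           ROW_NUMBER() OVER (PARTITION BY fix_id, odds_type ORDER BY updated ASC)  AS rn_earliest
    FROM odds_table
    WHERE odds_type IN ('home', 'away')
)
SELECT fix_id,
       odds_type,
       MAX(CASE WHEN rn_latest = 1 THEN odds END) AS odds,
       MAX(updated) AS updated,
       ROUND(MAX(CASE WHEN rn_latest = 1 THEN odds END)
           - MAX(CASE WHEN rn_earliest = 1 THEN odds END), 2) AS diff
FROM ranked
GROUP BY fix_id, odds_type
ORDER BY fix_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this should yield one row per fixture and market, matching the expected output in the question. If two rows of the same fixture and market share the same updated value, ROW_NUMBER picks one of them arbitrarily, so a tie-breaking column in the ORDER BY may be needed.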

Related

laravel group by date in join query to find sum of values

I am looking for a Laravel developer to help with a simple issue. I have three tables that I join to get data, with the columns date, order number, and amount. I need to group by date and find the sum of amount. The joined data looks like this:
date | order number | amount
12/06/2022 | ask20 | 150
12/06/2022 | ask20 | 50
13/06/2022 | ask21 | 120
15/06/2022 | ask20 | 110
15/06/2022 | ask23 | 10
16/06/2022 | ask20 | 30
Now, I need to group by date to get the value like this:
date | order number | amount
12/06/2022 | ask20 | 200 (added value)
13/06/2022 | ask21 | 120
15/06/2022 | ask20 | 110 (not added as the order number is different)
15/06/2022 | ask23 | 10
16/06/2022 | ask20 | 30
Remember, I am getting this data by joining three tables. Can anyone help solve this?
This looks like a simple use of SUM with GROUP BY:
SELECT date, order_number, SUM(amount)
FROM <YOUR BIGGER QUERY..>
GROUP BY date, order_number
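As a quick check of the grouping above, here is a minimal sketch using Python's built-in sqlite3 module. The table joined_orders is a hypothetical stand-in for the result of the three-table join, loaded with the sample rows from the question; SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the output of the three-table join.
conn.execute("CREATE TABLE joined_orders (date TEXT, order_number TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO joined_orders VALUES (?, ?, ?)",
    [
        ("12/06/2022", "ask20", 150),
        ("12/06/2022", "ask20", 50),
        ("13/06/2022", "ask21", 120),
        ("15/06/2022", "ask20", 110),
        ("15/06/2022", "ask23", 10),
        ("16/06/2022", "ask20", 30),
    ],
)

# Rows are summed only when both the date and the order number match.
totals = conn.execute(
    """
    SELECT date, order_number, SUM(amount) AS total_amount
    FROM joined_orders
    GROUP BY date, order_number
    ORDER BY date, order_number
    """
).fetchall()
for row in totals:
    print(row)
```

The dates are stored as text here, so the ORDER BY sorts them as strings; that happens to work for this sample but a real date column would be safer.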

Calculating average based on distinct ID while preserving all the data in a table?

If I have data like so:
+------+----+-------+-------+
| year | id | value | group |
+------+----+-------+-------+
| 2019 | 1 | 10 | A |
| 2019 | 1 | 10 | B |
| 2019 | 2 | 20 | A |
| 2019 | 3 | 30 | A |
| 2019 | 2 | 20 | B |
| 2020 | 1 | 5 | A |
| 2020 | 1 | 5 | B |
| 2020 | 2 | 10 | A |
| 2020 | 3 | 15 | A |
| 2020 | 2 | 10 | B |
+------+----+-------+-------+
Is there a way to calculate the average value based on the distinct id while preserving all the data?
I need to do this because I will also have WHERE clause(s) to filter other columns in the table, but I also need to get an overall view of the data in the case the WHERE clause(s) are not added (these WHERE filters will be added by an automated software in the OUTERMOST query which I can't control).
The group column is an example.
For the above example, the results should be:
Overall --> 20 for 2019 and 10 for 2020
WHERE group = 'A' --> 20 for 2019 and 10 for 2020
WHERE group = 'B' --> 15 for 2019 and 7.5 for 2020
I tried to do the following:
SELECT
year,
AVG(IF(id = LAG(id) OVER (ORDER BY id), NULL, value)) AS avg
FROM table
WHERE group = 'A' -- this clause may or may not exist
GROUP BY year
Basically I was thinking that if I order by id and check the previous row to see if it has the same id, the value should be NULL and thus it would not be counted into the calculation, but unfortunately I can't put analytical functions inside aggregate functions.
While the data model is inappropriate and not normalized (you are storing values redundantly), the real problem is the late automated SQL injection (the optionally added where clause).
When a where clause gets added to your query, everything is fine, because the where clause properly restricts the rows to take into consideration (group A or B). When no where clause gets added, however, you would have to work on an aggregated data set (distinct year/id rows). The latter means an aggregation on an aggregation, which can be done with a subquery as was shown by DineshDB in an earlier answer. But here you have the problem that the where clause must work on the intermediate result (the subquery) and you say that your software adds the where clause to the main query instead.
The surprising solution is to use three aggregations. In the query below I am mixing MAX (first aggregation), AVG OVER (second aggregation), and DISTINCT (third aggregation), and the three can happily co-exist in one query. No subquery is needed.
SELECT DISTINCT
year,
AVG(MAX(value)) OVER (PARTITION BY year)
FROM yourtable
WHERE `group` = ... -- optional where clause
GROUP BY year, id
ORDER BY year;
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=973ae4f260597392c55f260d3c260084
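To see how the single query behaves with and without the optional filter, here is a small sketch using Python's built-in sqlite3 module on the sample data from the question. SQLite stands in for MySQL 8.0 (an assumption; it needs SQLite 3.25 or newer), and the helper yearly_average with its where_clause parameter is a made-up name that simply simulates the tool appending a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (year INTEGER, id INTEGER, value INTEGER, `group` TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        (2019, 1, 10, "A"), (2019, 1, 10, "B"), (2019, 2, 20, "A"),
        (2019, 3, 30, "A"), (2019, 2, 20, "B"),
        (2020, 1, 5, "A"), (2020, 1, 5, "B"), (2020, 2, 10, "A"),
        (2020, 3, 15, "A"), (2020, 2, 10, "B"),
    ],
)


def yearly_average(where_clause=""):
    """Run the single-query form; where_clause simulates the filter the tool may add.

    String formatting is used for illustration only, never with untrusted input.
    """
    sql = f"""
        SELECT DISTINCT year, AVG(MAX(value)) OVER (PARTITION BY year) AS avg_value
        FROM yourtable
        {where_clause}
        GROUP BY year, id
        ORDER BY year
    """
    return conn.execute(sql).fetchall()


overall = yearly_average()
group_a = yearly_average("WHERE `group` = 'A'")
group_b = yearly_average("WHERE `group` = 'B'")
print(overall, group_a, group_b)
```

GROUP BY year, id collapses the duplicated rows to one value per id, the window AVG then averages those per year, and DISTINCT removes the repeated year rows.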
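The following query will give you the expected output for the sample data. Note that AVG(DISTINCT ...) removes duplicate values, not duplicate ids, so it only works as long as two different ids in the same year never share the same value.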
SELECT
`Year`,
AVG(DISTINCT `value`*1.0) `value`
FROM table
WHERE `group` = 'B' -- this clause is optional
GROUP BY `Year`;
Without the optional WHERE clause (or with `group` = 'A'), the query returns the results below.
Year | Value
2019 | 20
2020 | 10

Ranges overlap - MySQL

Does anyone know how to find ranges that overlap, using MySQL? Essentially, as seen on table below (just for illustrating the problem as the actual table contains 1000+ ranges), I am trying to fetch all ranges that overlap inside of a table.
Thanks!
RANGES
| count | Begin | End | Comment |
| 1 | 1001 | 1095 | overlaps with ranges 2, 3 |
| 2 | 1005 | 1030 | overlaps with ranges 1, 3 |
| 3 | 1017 | 1020 | overlaps with ranges 1, 2 |
| 4 | 1110 | 1125 | no overlap |
One method is a self join and aggregation:
select r1.count, r1.begin, r1.end,
group_concat(r2.count order by r2.count) as overlaps
from ranges r1 left join
ranges r2
on r1.end >= r2.begin and
r1.begin <= r2.end and
r1.count <> r2.count
group by r1.count, r1.begin, r1.end;
On a table with 1000 rows, this will not be fast, but it should be doable. You may want to validate the logic on a smaller table.
This assumes that count is really a unique identifier for each row.
Note that count and end are poor choices for column names because they are SQL keywords.
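The overlap condition can be tried out on the four sample ranges with the sketch below, which uses Python's built-in sqlite3 module. Following the note on column names, the columns are renamed to range_id, range_begin and range_end (made-up names). Instead of group_concat, the matching pairs are collected into a dictionary in Python, which sidesteps differences in how database engines order concatenated values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are hypothetical replacements for count / begin / end.
conn.execute(
    "CREATE TABLE ranges (range_id INTEGER PRIMARY KEY, range_begin INTEGER, range_end INTEGER)"
)
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?)",
    [(1, 1001, 1095), (2, 1005, 1030), (3, 1017, 1020), (4, 1110, 1125)],
)

# Two ranges overlap when each one starts before the other one ends.
pairs = conn.execute(
    """
    SELECT r1.range_id, r2.range_id
    FROM ranges r1
    LEFT JOIN ranges r2
           ON r1.range_end >= r2.range_begin
          AND r1.range_begin <= r2.range_end
          AND r1.range_id <> r2.range_id
    ORDER BY r1.range_id, r2.range_id
    """
).fetchall()

# The LEFT JOIN yields (id, None) for a range without overlaps, so every id gets an entry.
overlaps = {}
for range_id, other_id in pairs:
    overlaps.setdefault(range_id, [])
    if other_id is not None:
        overlaps[range_id].append(other_id)
print(overlaps)
```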

MYSQL : Group by all weeks of a year with 0 included

I have a question about some MySQL code.
I have a table listing employees with their date of arrival and the project id. I want to count all arrivals in the company and group them by week.
At the moment, I get this result:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S03 | 1
2 | 2019-S01 | 1
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
This is good, but I would like all the weeks to be returned, even when a week has a count of 0:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S02 | 0
1 | 2019-S03 | 1
...
2 | 2019-S01 | 1
2 | 2019-S02 | 0
2 | 2019-S03 | 0
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
...
Here is my current code:
SELECT
AP.SECTION_ANALYTIQUE AS SECTION,
FS_GET_FORMAT_SEMAINE(AP.DATE_ARRIVEE_PROJET) AS SEMAINE,
Count(*) AS COMPTE
FROM
RT00_AFFECTATIONS_PREV AP
WHERE
(AP.DATE_ARRIVEE_PROJET <= CURDATE() AND Year(AP.DATE_ARRIVEE_PROJET) >= Year(CURDATE()))
GROUP BY
SECTION, SEMAINE
ORDER BY
SECTION
Does anybody have a solution?
I searched the internet but didn't find anything that fits. :(
Thank you in advance! :)
The classic way to meet this requirement is to create a reference table that stores all possible weeks.
create table all_weeks(week varchar(8) primary key);
insert into all_weeks values
('2019-S01'), ('2019-S02'), ('2019-S03'), ('2019-S04'), ('2019-S05'), ('2019-S06');
Once this is done, you can generate a cartesian product of all possible sections and weeks with a CROSS JOIN, and LEFT JOIN that with the original table.
Given your code snippet, this should look like:
SELECT
s.section_analytique AS section,
w.week AS semaine,
COUNT(ap.section_analytique) AS compte
FROM
(SELECT DISTINCT section_analytique from rt00_affectations_prev) s
CROSS JOIN all_weeks w
LEFT JOIN rt00_affectations_prev ap
ON s.section_analytique = ap.section_analytique AND w.week = FS_GET_FORMAT_SEMAINE(ap.date_arrivee_projet)
GROUP BY s.section_analytique, w.week
ORDER BY s.section_analytique
PS: be careful not to put conditions on the original table in the WHERE clause: this would defeat the purpose of the LEFT JOIN. If you need to do some filtering, filter on the reference table instead (you might need to add a few columns to it, such as the starting date of the week).
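The CROSS JOIN plus LEFT JOIN pattern can be checked on a tiny data set with the sketch below, run through Python's built-in sqlite3 module. The arrivals table is a simplified, hypothetical stand-in for rt00_affectations_prev: it stores the week label directly instead of calling FS_GET_FORMAT_SEMAINE, and only three weeks are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_weeks (week TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO all_weeks VALUES (?)",
    [("2019-S01",), ("2019-S02",), ("2019-S03",)],
)
# Hypothetical simplified source table: one row per arrival, week label precomputed.
conn.execute("CREATE TABLE arrivals (section INTEGER, week TEXT)")
conn.executemany(
    "INSERT INTO arrivals VALUES (?, ?)",
    [(1, "2019-S01"), (1, "2019-S01"), (1, "2019-S03"), (2, "2019-S01")],
)

# Every (section, week) pair is generated first; COUNT on a column of the
# left-joined table then yields 0 for pairs without any matching arrival.
counts = conn.execute(
    """
    SELECT s.section, w.week, COUNT(a.section) AS compte
    FROM (SELECT DISTINCT section FROM arrivals) s
    CROSS JOIN all_weeks w
    LEFT JOIN arrivals a
           ON a.section = s.section AND a.week = w.week
    GROUP BY s.section, w.week
    ORDER BY s.section, w.week
    """
).fetchall()
for row in counts:
    print(row)
```

Counting a.section rather than using COUNT(*) is what makes the empty weeks come out as 0, because COUNT(*) would count the single NULL-extended row produced by the LEFT JOIN.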

mysql update with subquery 2 level deep

Thanks for taking a look at this question. I'm a bit lost and hope someone can help me. Below is an update query I would like to run.
The query currently returns an error:
1054 - Unknown column 'spi.et_cross_rank' in 'where clause'
Some background:
From the table tmp_ranking_tbl I would like to get the nth (spi.et_return_rank) record for the group with value x (spi.et_cross_rank).
SET @rownum=0;
UPDATE STRToer_Poule_indeling spi
SET spi.team_id = (SELECT R.team_poule_id
                   FROM (SELECT @rownum:=@rownum+1 AS rownum, trt.team_poule_id
                         FROM tmp_ranking_tbl trt
                         WHERE trt.overal_rank = spi.et_cross_rank
                         ORDER BY trt.punten DESC, (trt.goals_voor - trt.goals_tegen) DESC, trt.goals_voor DESC) R
                   WHERE R.rownum = spi.et_return_rank)
WHERE spi.et_ronde = v_et_ronde
AND spi.poule_id IN (SELECT row_id FROM STRToer_Poules WHERE toernooi_onderdeel_id=v_onderdeel_id);
Data in tmp_ranking_tbl looks like:
team_poule_id | punten | goals_voor | goals_tegen | overal_rank
65 | 6 | 10 | 10 | 2
69 | 6 | 9 | 10 | 2
75 | 7 | 11 | 4 | 2
84 | 6 | 6 | 8 | 2
112 | 5 | 7 | 7 | 2
Thanks in advance for the help!
Update, after a question in the comments about the goal. I'll try to keep it short. :-)
This query is used on a website that keeps the scores of a tournament. Sometimes an odd number of teams go through to the next round. In that case I want to select the best number 3 (spi.et_cross_rank) team across poules. This setting is saved in STRToer_Poule_indeling: which rank per poule to look at, and whether to take the 1st, 2nd or nth team (spi.et_return_rank). The table tmp_ranking_tbl is filled with all rank 3 teams across the poules. Once it is filled, I would like the 1st or 2nd record, depending on the setting in STRToer_Poule_indeling, to be returned.
Subset of the structure of the STRToer_Poule_indeling table:
row_id | team_id | et_ronde | et_cross_rank | et_return_rank
1 | null | 1 | 3 | 1
First check that your table STRToer_Poule_indeling really has a column named et_cross_rank.
The problem seems to be that MySQL can't find that column in your table.
Hope it helps.