Grouping top N rows, but also needing to sum column values - mysql

I am using the following query to group the top N rows in my data set:
SELECT mgap_ska_id,mgap_ska_id_name, account_manager_id,
mgap_growth AS growth,mgap_recovery,
(mgap_growth+mgap_recovery) total
FROM
(SELECT mgap_ska_id,mgap_ska_id_name, account_manager_id, mgap_growth,
mgap_recovery,(mgap_growth+mgap_recovery) total,
@acid_rank := IF(@current_acid = account_manager_id, @acid_rank + 1, 1)
AS acid_rank,
@current_acid := account_manager_id
FROM mgap_orders
ORDER BY account_manager_id, mgap_growth DESC
) ranked
WHERE acid_rank <= 5
and the result is VERY close to what I need, but I am having an aggregate issue that I need help with. I have attached a screenshot of my query results (I had to block out the customer names and IDs for privacy). The mgap_ska_id and account_manager_id are INT columns, and mgap_ska_id_name is a VARCHAR.
In theory I need to SUM (I know it's an aggregate; that is the issue) multiple mgap_growth values while keeping the ranking intact.
If I GROUP BY, then I lose the top 5 ranking. Currently, the growth column shows only a single row's mgap_growth value per mgap_ska_id; I need it to be the SUM of all mgap_growth values per mgap_ska_id while keeping the top-five ranking as shown.
Thanks!

You can add the following line in your select fields:
(SELECT SUM(t.mgap_growth) FROM mgap_orders t WHERE t.mgap_ska_id = ranked.mgap_ska_id ) AS total_mgap_growth
So your code will be:
SELECT mgap_ska_id,mgap_ska_id_name, account_manager_id,
mgap_growth AS growth,mgap_recovery,
(mgap_growth+mgap_recovery) total,
(SELECT SUM(t.mgap_growth) FROM mgap_orders t WHERE t.mgap_ska_id = ranked.mgap_ska_id ) AS total_mgap_growth
FROM
(SELECT mgap_ska_id,mgap_ska_id_name, account_manager_id, mgap_growth,
mgap_recovery,(mgap_growth+mgap_recovery) total,
@acid_rank := IF(@current_acid = account_manager_id, @acid_rank + 1, 1)
AS acid_rank,
@current_acid := account_manager_id
FROM mgap_orders
ORDER BY account_manager_id, mgap_growth DESC
) ranked
WHERE acid_rank <= 5
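If the ranking itself should be based on the summed growth, one option is to aggregate per mgap_ska_id first and rank afterwards. The following is only a minimal sketch of that idea, not the answer's method: it uses Python's built-in sqlite3 module as a stand-in for MySQL 8.0+ (both support ROW_NUMBER() OVER), and the sample rows, the reduced column set, and the function name top_n_by_summed_growth are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (mgap_ska_id, account_manager_id, mgap_growth)
sample_orders = [
    (1, 10, 5), (1, 10, 7), (2, 10, 9), (3, 10, 1),
    (4, 20, 3), (4, 20, 4), (5, 20, 6),
]

def top_n_by_summed_growth(rows, n):
    """Sum growth per customer, then keep the top n customers per manager."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mgap_orders ("
        "mgap_ska_id INT, account_manager_id INT, mgap_growth INT)"
    )
    con.executemany("INSERT INTO mgap_orders VALUES (?, ?, ?)", rows)
    sql = """
        SELECT account_manager_id, mgap_ska_id, total_growth
        FROM (
            SELECT account_manager_id, mgap_ska_id, total_growth,
                   ROW_NUMBER() OVER (
                       PARTITION BY account_manager_id
                       ORDER BY total_growth DESC
                   ) AS acid_rank
            FROM (
                -- aggregate first, so each customer appears once per manager
                SELECT account_manager_id, mgap_ska_id,
                       SUM(mgap_growth) AS total_growth
                FROM mgap_orders
                GROUP BY account_manager_id, mgap_ska_id
            ) summed
        ) ranked
        WHERE acid_rank <= ?
        ORDER BY account_manager_id, acid_rank
    """
    result = con.execute(sql, (n,)).fetchall()
    con.close()
    return result
```

Calling top_n_by_summed_growth(sample_orders, 5) would correspond to the top-five case in the question; the SQL string itself should carry over to MySQL 8.0+ largely unchanged, though that has not been checked against the asker's schema.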

Related

Order by highest value alternating with lowest value

I currently need to order data by highest value down, and then lowest value up, in between.
My Query is close, but doesn't quite order by largest down, though it is inserting the lowest in between:
DEMO Fiddle
select users.*
from users CROSS JOIN (select @even := 0, @odd := 0) param
order by
IF(score > 1, 2*(@odd := @odd + 1), 2*(@even := @even + 1) + 1),
score DESC;
Current Results
Email            Score
---------------  -----
foo1@gmail.com   42
foo5@gmail.com   1
foo2@gmail.com   49
foo6@gmail.com   0
foo3@gmail.com   37
foo4@gmail.com   7
foo@gmail.com    22
Desired Results
Email            Score
---------------  -----
foo2@gmail.com   49
foo6@gmail.com   0
foo1@gmail.com   42
foo5@gmail.com   1
foo3@gmail.com   37
foo4@gmail.com   7
foo@gmail.com    22

You can achieve this in MySQL, but consider avoiding such complex SQL statements, since the same result is very easy to produce in application code.
SET @totalRows := (CASE WHEN (SELECT COUNT(*) FROM users) IS NULL THEN 0 ELSE (SELECT COUNT(*) FROM users) END);
PREPARE stmt1 FROM '(SELECT t.* FROM (SELECT ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num, email, score FROM users UNION ALL SELECT ROW_NUMBER() OVER (ORDER BY score ASC) AS row_num, email, score FROM users) AS t ORDER BY t.row_num, t.score DESC LIMIT 0,?)';
EXECUTE stmt1 USING @totalRows;
Here is how it works:
The variable @totalRows holds the number of rows in the users table, because the final result set should contain exactly that many rows.
ROW_NUMBER() numbers the rows by the score field in descending order for one result set and in ascending order for the other.
UNION ALL combines the two result sets.
LIMIT ensures that the final result does not exceed @totalRows rows.
MySQL does not accept a user variable directly in a LIMIT clause, so the query is run as a prepared statement with a placeholder.
The row_num column is only used for ordering; you can ignore it in the final result set.
Hope my solution helps.

For MySQL 8.0+ you can use the ROW_NUMBER() window function:
SELECT email, score
FROM (
SELECT *,
ROW_NUMBER() OVER (ORDER BY score DESC) rn1,
ROW_NUMBER() OVER (ORDER BY score ASC) rn2
FROM users
) t
ORDER BY LEAST(rn1, rn2), rn1;
For previous versions you can simulate ROW_NUMBER() with correlated subqueries (at the cost of poor performance on large datasets):
SELECT email, score
FROM (
SELECT u1.*,
(SELECT COUNT(*) FROM users u2 WHERE u2.score > u1.score) rn1,
(SELECT COUNT(*) FROM users u2 WHERE u2.score < u1.score) rn2
FROM users u1
) t
ORDER BY LEAST(rn1, rn2), rn1;
See the demo.
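To check the ordering logic against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL 8.0+. Two things are assumptions of the sketch: SQLite's two-argument MIN() plays the role of MySQL's LEAST(), and the function name alternating_order is made up.

```python
import sqlite3

# Sample data from the question: (email, score)
sample_users = [
    ("foo1@gmail.com", 42), ("foo5@gmail.com", 1), ("foo2@gmail.com", 49),
    ("foo6@gmail.com", 0), ("foo3@gmail.com", 37), ("foo4@gmail.com", 7),
    ("foo@gmail.com", 22),
]

def alternating_order(users):
    """Return rows ordered highest, lowest, second highest, second lowest, ..."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (email TEXT, score INT)")
    con.executemany("INSERT INTO users VALUES (?, ?)", users)
    sql = """
        SELECT email, score
        FROM (
            SELECT email, score,
                   ROW_NUMBER() OVER (ORDER BY score DESC) AS rn1,
                   ROW_NUMBER() OVER (ORDER BY score ASC)  AS rn2
            FROM users
        ) t
        -- MIN(a, b) with two arguments is SQLite's scalar minimum (LEAST in MySQL)
        ORDER BY MIN(rn1, rn2), rn1
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Each row's sort key is its distance from the nearer end of the score ranking, so the k-th highest and k-th lowest rows share a key; the rn1 tie-breaker then puts the high score first within each pair.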

MySQL - Percentile calculation and update it in other column in the same table

I have a table in MySQL (phpMyAdmin) with the following columns
I am trying to determine the percentile for each row and update that value in the G1Ptile column. G1Ptile column is the percentile calculation based on G1%. I am using the following based on John Woo's answer given here
SELECT `G1%`,
(1-ranks/totals)*100 Percentile FROM (
SELECT distinct `G1%`,
@rank:=@rank + 1 ranks,
(SELECT COUNT(*) FROM PCount) totals
FROM PCount a,
(SELECT @rank:=0) s
ORDER BY `G1%` DESC ) s;
and get the following output
The output is in a select statement, I want to be able to update it to the G1Ptile column in my table, however I am unable to update it using
UPDATE `PCount` SET `G1Ptile`= --(All of the select query mentioned above)
Can you please help with modifying the query/suggest an alternative so that I can use the percentile values obtained using the above query and update it into G1Ptile in the same table. One more problem I have is that there are two 20% values in G1%, however the percentile assigned to one is 20 and other is 30. I want both of them to be 20 and the next row in the series to be 30.

I would write your calculation as:
SELECT `G1%`,
(1 - ranks / totals) * 100 as Percentile
FROM (SELECT `G1%`,
(@rank := @rank + 1) ranks,
(SELECT COUNT(*) FROM PCount) as totals
FROM (SELECT DISTINCT `G1%`
FROM PCount
ORDER BY `G1%` DESC
) p CROSS JOIN
(SELECT COUNT(*) as totals, @rank := 0
FROM PCount
) params
) p;
I made certain changes more consistent with how MySQL processes variables. In particular, the SELECT DISTINCT and ORDER BY are in a subquery. This is necessary in more recent versions of MySQL (although in the most recent you can use window functions).
This can now be incorporated into an update using JOIN:
UPDATE PCount p JOIN
(SELECT `G1%`,
(1 - ranks / totals) * 100 as Percentile
FROM (SELECT `G1%`,
(@rank := @rank + 1) ranks,
(SELECT COUNT(*) FROM PCount) as totals
FROM (SELECT DISTINCT `G1%`
FROM PCount
ORDER BY `G1%` DESC
) p CROSS JOIN
(SELECT COUNT(*) as totals, @rank := 0
FROM PCount
) params
) pp
) pp
ON pp.`G1%` = p.`G1%`
SET p.G1Ptile = pp.percentile;
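To see why ranking the distinct values gives tied rows the same percentile, the arithmetic of the query can be sketched in plain Python. This is only an illustration of the formula (1 - ranks / totals) * 100; the function name percentiles_with_ties and the sample values are made up, and the expression is rearranged to 100 * (total - rank) / total to keep the floating-point results exact.

```python
def percentiles_with_ties(values):
    """Map each distinct value to (1 - rank / total) * 100.

    rank counts distinct values from the highest down, starting at 1,
    and total is the number of rows including duplicates, mirroring
    the ranks and totals columns of the query.
    """
    total = len(values)
    distinct_desc = sorted(set(values), reverse=True)
    return {
        value: 100 * (total - rank) / total
        for rank, value in enumerate(distinct_desc, start=1)
    }

# Hypothetical G1% values with one tie (20 appears twice)
sample_g1 = [50, 40, 20, 20, 10]
```

Because both rows holding 20 look up the same dictionary entry, they receive the same percentile, which is what the join on `G1%` does in the UPDATE.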

SQL - Percentiles

I have one table:
country(ID, city, freg, counts, date)
I want to calculate the 90th percentile of counts in a specific interval of dates ($min and $max).
I've already done the same for the average (code below):
SELECT
AVG(counts)
FROM country
WHERE date >= @min AND date < @max
;
How can I calculate the 90th percentile instead of the average?

Finally, something GROUP_CONCAT is good for...
SELECT SUBSTRING_INDEX(
SUBSTRING_INDEX(
GROUP_CONCAT(ct.ctdivol ORDER BY ct.ctdivol SEPARATOR ','),',',90/100 * COUNT(*) + 1
),',',-1
) `90th Percentile`
FROM ct
JOIN exam e
ON e.examid = ct.examid
AND e.date BETWEEN @min AND @max
WHERE e.modality = 'ct';

It appears doing it with a single query is not possible. At least not in MySQL.
You can do it in multiple queries:
1) Select how many rows satisfy your condition.
SELECT
COUNT(*)
FROM exam
INNER JOIN ct on exam.examID = ct.examID AND ct.ctdivol_mGy > 0
WHERE exam.modality = 'CT'
AND exam.date >= @min AND exam.date < @max
2) Check the percentile threshold by multiplying the number of rows by percentile/100. For example:
Number of rows in previous count: 200
Percentile: 90%
Number of rows to threshold: 200 * (90/100) = 180
3) Repeat the query, order by the value you want the percentile from and LIMIT the result to the only row number you found in the 2nd point. Like so:
SELECT
ct.ctdivol_mGy
FROM exam
INNER JOIN ct on exam.examID = ct.examID AND ct.ctdivol_mGy > 0
WHERE exam.modality = 'CT'
AND exam.date >= @min AND exam.date < @max
ORDER BY ct.ctdivol_mGy
LIMIT 1 OFFSET 179 -- skip 179 rows and take the next one, i.e. the 180th row
You'll get the 180th value of the selected rows, so the 90th percentile you need.
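The three steps above can be sketched end to end as follows. This is a rough illustration using Python's built-in sqlite3 module in place of MySQL; the samples table, the function name percentile_by_offset, and the choice to round the threshold up when it is not a whole number are assumptions of the sketch, not part of the answer.

```python
import sqlite3

def percentile_by_offset(values, pct):
    """Return the pct-th percentile of values via COUNT plus LIMIT/OFFSET."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE samples (v INTEGER)")
    con.executemany("INSERT INTO samples VALUES (?)", [(v,) for v in values])

    # Step 1: count the rows that satisfy the condition (here: all rows).
    count = con.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
    if count == 0:
        con.close()
        return None

    # Step 2: threshold row number, count * pct / 100, rounded up
    # (assumption: ceiling division for non-integer thresholds).
    rank = max(1, -(-count * pct // 100))

    # Step 3: order by the value, skip rank - 1 rows, take the next one.
    row = con.execute(
        "SELECT v FROM samples ORDER BY v LIMIT 1 OFFSET ?", (rank - 1,)
    ).fetchone()
    con.close()
    return row[0]
```

With 200 rows and pct = 90 the threshold is row 180, so the query runs with OFFSET 179, matching the worked numbers in the answer.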
Hope this helps!

MySQL position in recordset

I'm ordering a recordset like this:
SELECT * FROM leaderboards ORDER BY time ASC, percent DESC
Say I have the id of the record which relates to you, how can I find out what position it is in the recordset, as ordered above?
I understand if it was just ordered by say 'time' I could
SELECT count from table where time < your_id
But having 2 ORDER BYs has confused me.

You can use a variable to assign a counter:
SELECT *, @ctr := @ctr + 1 AS RowNumber
FROM leaderboards, (SELECT @ctr := 0) c
ORDER BY time ASC, percent DESC

Does this do what you want?
SELECT count(*)
FROM leaderboards lb cross join
(select * from leaderboards where id = MYID) theone
WHERE lb.time < theone.time or
(lb.time = theone.time and lb.percent >= theone.percent);
This assumes that there are no duplicates for time, percent.
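A quick way to sanity-check the counting query is to run it on a few made-up rows. The sketch below uses Python's built-in sqlite3 module rather than MySQL; the sample rows and the function name position_of are hypothetical, and the record id is passed as a bound parameter instead of MYID.

```python
import sqlite3

# Hypothetical rows: (id, time, percent)
sample_board = [(1, 30, 90), (2, 25, 80), (3, 30, 95), (4, 40, 99)]

def position_of(rows, record_id):
    """1-based position of record_id under ORDER BY time ASC, percent DESC."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE leaderboards (id INT, time INT, percent INT)")
    con.executemany("INSERT INTO leaderboards VALUES (?, ?, ?)", rows)
    sql = """
        SELECT COUNT(*)
        FROM leaderboards lb
        CROSS JOIN (
            SELECT time, percent FROM leaderboards WHERE id = ?
        ) theone
        -- rows with a smaller time come first; on equal time, a higher
        -- (or equal) percent comes first, and >= also counts the row itself
        WHERE lb.time < theone.time
           OR (lb.time = theone.time AND lb.percent >= theone.percent)
    """
    position = con.execute(sql, (record_id,)).fetchone()[0]
    con.close()
    return position
```

Since the row itself satisfies the second condition, the count is directly the 1-based position, provided no two rows share the same time and percent.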

Possible to add auto incrementing variable to row results of GROUP BY in mysql?

Basically, I'm using this query to group a bunch of users based on the sum of numbers associated with them. I need to somehow assign an index to each result. I am blanking on how to do this. I'm thinking I need to alias something with AS but I'm not sure quite how. Any ideas?
This is the current query where I switch out the page and per dynamically:
SELECT COUNT(*) as count, user_id, SUM(earnings) as sum FROM ci_league_result
GROUP BY user_id ORDER BY sum desc LIMIT ".$page.', '.$per;
I'm looking for it to work something like this:
SELECT COUNT(*) as count, user_id, SUM(earnings) as sum, *NEW-RESULTS-OVERALL-INDEX* AS newindex FROM ci_league_result
GROUP BY user_id ORDER BY sum desc LIMIT ".$page.', '.$per;
Notice the AS newindex in the second query.
Thanks for your advice!

Try this:
SELECT user_id, count, sum, @row := @row + 1 AS newindex FROM
(SELECT
COUNT(*) as count,
user_id,
SUM(earnings) as sum
FROM ci_league_result
GROUP BY user_id ORDER BY sum desc LIMIT ".$page.', '.$per) r
CROSS JOIN (SELECT @row := 0) rr;

You should be able to accomplish this with SQL variables
SELECT
COUNT(*) as count,
user_id,
SUM(earnings) as sum,
@rownum := @rownum+1 AS newindex
FROM ci_league_result,
(SELECT @rownum:=0) r
GROUP BY user_id
ORDER BY sum DESC
LIMIT ".$page.', '.$per;
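On versions with window functions (MySQL 8.0+), the overall index can also be produced without user variables by numbering the grouped rows before paginating. The following is a hedged sketch of that alternative, run through Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows, the cnt and total aliases, the user_id tie-breaker, and the function name ranked_page are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical rows: (user_id, earnings)
sample_results = [(1, 10), (1, 5), (2, 30), (3, 7), (3, 1)]

def ranked_page(rows, offset, per):
    """Return one page of (cnt, user_id, total, newindex) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE ci_league_result (user_id INT, earnings INT)")
    con.executemany("INSERT INTO ci_league_result VALUES (?, ?)", rows)
    sql = """
        SELECT cnt, user_id, total,
               ROW_NUMBER() OVER (ORDER BY total DESC, user_id) AS newindex
        FROM (
            SELECT COUNT(*) AS cnt, user_id, SUM(earnings) AS total
            FROM ci_league_result
            GROUP BY user_id
        ) grouped
        ORDER BY newindex
        LIMIT ? OFFSET ?
    """
    # LIMIT/OFFSET are bound parameters here instead of concatenated strings
    page = con.execute(sql, (per, offset)).fetchall()
    con.close()
    return page
```

Because the numbering happens before LIMIT is applied, newindex stays an overall position: the second page continues counting where the first page stopped instead of restarting at 1.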
