Ranking a Group By result set - mysql

I would like to have a column showing the rank (highest amount being #1) of this result set. Can this be done somehow?
Here is the query to produce this result:
SELECT user_names.user_name,city.city,state.state,SUM(events_full.amount) AS total
FROM user_names,city,state,events_full
WHERE user_names.user_id=events_full.user_id
AND city.city_id=events_full.city_id
AND state.state_id=events_full.state_id
AND events_full.season_id=13
AND amount > 0
Group By user_names.user_name

I've got a hunch that you actually use MariaDB.
(based on your comment in a deleted answer)
Then you could try to add a DENSE_RANK over the SUM to your SELECT
DENSE_RANK() OVER (ORDER BY SUM(events_full.amount) DESC) AS Ranking
A simplified example:
create table test
(
col1 int,
col2 int
);
insert into test values
(1,1),(1,2),(1,3),
(2,1),(2,2),(2,3),(2,4),
(3,1),(3,5),
(4,1),(4,2);
select col1
, sum(col2) tot
, dense_rank() over (order by sum(col2) desc) rnk
from test
group by col1
order by rnk
col1 | tot | rnk
---: | --: | --:
2 | 10 | 1
1 | 6 | 2
3 | 6 | 2
4 | 3 | 3
db<>fiddle here
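As a rough sketch of how the DENSE_RANK line could slot into the original query, the snippet below runs a cut-down version against an in-memory SQLite database (3.25 or later, for window functions). SQLite stands in for MariaDB here, the city and state joins are left out for brevity, and all sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite is only a stand-in for MariaDB / MySQL 8 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_names (user_id INT, user_name TEXT)")
conn.execute("CREATE TABLE events_full (user_id INT, season_id INT, amount INT)")

# Hypothetical sample data, not taken from the question.
conn.executemany("INSERT INTO user_names VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO events_full VALUES (?, ?, ?)",
                 [(1, 13, 50), (1, 13, 30),   # alice: 80 in season 13
                  (2, 13, 80), (2, 13, 0),    # bob: 80 (the 0 row is filtered out)
                  (3, 13, 20), (3, 12, 500)]) # carol: 20 (season 12 is ignored)

# The window function runs after GROUP BY, so it can order by the SUM.
ranked = conn.execute("""
    SELECT u.user_name,
           SUM(e.amount) AS total,
           DENSE_RANK() OVER (ORDER BY SUM(e.amount) DESC) AS ranking
    FROM events_full AS e
    JOIN user_names AS u ON u.user_id = e.user_id
    WHERE e.season_id = 13
      AND e.amount > 0
    GROUP BY u.user_name
    ORDER BY ranking, u.user_name
""").fetchall()

for row in ranked:
    print(row)
```

Tied totals share a rank and the next rank follows without a gap, which is the difference between DENSE_RANK and RANK.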
In MySQL 5.7, it can be emulated with user variables.
For example:
select *
from
(
select col1, total
, case
when total = @prev_tot
and @prev_tot := total
then @rnk
when @prev_tot := total
then @rnk := @rnk + 1
end as rnk
from
(
select col1
, sum(col2) as total
from test
group by col1
order by total desc
) q1
cross join (select @rnk:=0, @prev_tot:=0) v
) q2
order by rnk;
col1 | total | rnk
---: | ----: | :--
2 | 10 | 1
1 | 6 | 2
3 | 6 | 2
4 | 3 | 3
db<>fiddle here
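To make the bookkeeping behind the variable-based emulation easier to follow, here is a small Python sketch of the same idea: walk the totals from highest to lowest and bump the rank only when the total changes. The function name and the input shape are made up for this illustration.

```python
def dense_rank(totals):
    """Assign dense ranks to (key, total) pairs, highest total first."""
    ranked = []
    rank = 0            # plays the role of @rnk
    prev_total = None   # plays the role of @prev_tot
    # Equivalent of the inner "order by total desc" subquery.
    for key, total in sorted(totals, key=lambda kt: kt[1], reverse=True):
        if total != prev_total:
            rank += 1   # a new total starts the next rank
            prev_total = total
        ranked.append((key, total, rank))
    return ranked


# Per-group sums from the example table: col1 -> sum(col2).
rows = dense_rank([(1, 6), (2, 10), (3, 6), (4, 3)])
for row in rows:
    print(row)
```

The SQL version depends on the rows reaching the outer query in sorted order, which is why the inner subquery carries its own ORDER BY.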

Related

Show TOP 4 rows into 4 columns in SQL SELECT Query

I need to show values from one column in a SQL table in 4 columns, according to its top 4 values.
Example current table:
----------------------------------
ID | Name | Amount
1 | Test | 80
1 | Test2| 70
1 | Test3| 40
1 | Any | 25
1 | Any1 | 15
1 | Any2 | 12
1 | Any3 | 5
2 | TS1 | 70
2 | TS2 | 55
2 | TS3 | 30
2 | TS4 | 19
2 | Any | 11
--------------------------
Example expected SELECT Query result:
----------------------------------
ID | Col1 | Col2 | Col3 | Col4
1 | 80 | 70 | 40 | 25
2 | 70 | 55 | 30 | 19
----------------------------------
The issue here is to group the top 4 amount in 4 columns not considering names, just the numbers.
Is there some way to reach this result in table like that?
Please try this query. The window has no ORDER BY, so rows are numbered in whatever order the database returns them (for example, the order of the sample input or the stored position), which is not guaranteed.
SELECT t.id
, MAX(CASE WHEN row_num = 1 THEN t.amount END) col1
, MAX(CASE WHEN row_num = 2 THEN t.amount END) col2
, MAX(CASE WHEN row_num = 3 THEN t.amount END) col3
, MAX(CASE WHEN row_num = 4 THEN t.amount END) col4
FROM (SELECT id
, amount
, ROW_NUMBER() OVER (PARTITION BY id) row_num
FROM test) t
WHERE t.row_num <= 4
GROUP BY t.id;
See the demo at https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=8d3b9a3da33f9177f2bb6957ef08e21b
To match the expected result, order the window by amount in descending order:
SELECT t.id
, MAX(CASE WHEN row_num = 1 THEN t.amount END) col1
, MAX(CASE WHEN row_num = 2 THEN t.amount END) col2
, MAX(CASE WHEN row_num = 3 THEN t.amount END) col3
, MAX(CASE WHEN row_num = 4 THEN t.amount END) col4
FROM (SELECT id
, amount
, ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount DESC) row_num
FROM test) t
WHERE t.row_num <= 4
GROUP BY t.id;
First step: create a ranking (1 to 4) ordered by amount descending, since the top 4 run from highest to lowest.
Second step: use MAX() over each ranking to collapse the values into one row per ID, rather than leaving them spread diagonally across rows.
WITH data as (
SELECT
ID,
AMOUNT,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY AMOUNT DESC) RANKING
FROM [TABLE NAME]
)
SELECT
ID,
MAX(CASE WHEN RANKING = 1 THEN AMOUNT ELSE NULL END) COL_1,
MAX(CASE WHEN RANKING = 2 THEN AMOUNT ELSE NULL END) COL_2,
MAX(CASE WHEN RANKING = 3 THEN AMOUNT ELSE NULL END) COL_3,
MAX(CASE WHEN RANKING = 4 THEN AMOUNT ELSE NULL END) COL_4
FROM data
GROUP BY 1
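The two steps can be checked end to end with a short script. This is only a sketch: it uses an in-memory SQLite database (3.25 or later) instead of MySQL, loads the sample rows from the question, and adds a hypothetical ID 3 with only two rows to show what happens when a group has fewer than four values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INT, name TEXT, amount INT)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", [
    (1, "Test", 80), (1, "Test2", 70), (1, "Test3", 40), (1, "Any", 25),
    (1, "Any1", 15), (1, "Any2", 12), (1, "Any3", 5),
    (2, "TS1", 70), (2, "TS2", 55), (2, "TS3", 30), (2, "TS4", 19),
    (2, "Any", 11),
    (3, "X1", 9), (3, "X2", 4),   # hypothetical group with only two rows
])

pivoted = conn.execute("""
    WITH ranked AS (
        -- Step 1: number the rows of each id from highest to lowest amount.
        SELECT id, amount,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount DESC) AS rn
        FROM test
    )
    -- Step 2: one conditional MAX per rank folds each id into a single row.
    SELECT id,
           MAX(CASE WHEN rn = 1 THEN amount END) AS col1,
           MAX(CASE WHEN rn = 2 THEN amount END) AS col2,
           MAX(CASE WHEN rn = 3 THEN amount END) AS col3,
           MAX(CASE WHEN rn = 4 THEN amount END) AS col4
    FROM ranked
    WHERE rn <= 4
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in pivoted:
    print(row)
```

Groups with fewer than four rows simply get NULL (None in Python) in the unused columns.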
Alternatively, you can use the LEAD window function to achieve this result.
Below is the query written in PostgreSQL:
select
id,col1,col2,col3,col4
from
(
SELECT
tbl.id
,tbl.amount AS col1 -- 1st
,LEAD(tbl.amount, 1) OVER (PARTITION BY tbl.id ORDER BY tbl.amount DESC) as col2 -- 2nd
,LEAD(tbl.amount, 2) OVER (PARTITION BY tbl.id ORDER BY tbl.amount DESC) as col3 -- 3rd
,LEAD(tbl.amount, 3) OVER (PARTITION BY tbl.id ORDER BY tbl.amount DESC) as col4 -- 4th
,ROW_NUMBER() OVER (PARTITION BY id order by amount desc) rn
FROM
(
SELECT
id
, amount
, ROW_NUMBER() OVER (PARTITION BY id order by amount desc) rn
FROM <TABLE_NAME>
) tbl
WHERE 1=1 and tbl.rn <= 4 -- 4: variable
)tbl2
where 1=1 and tbl2.rn =1 -- 1: fixed
;
LEAD(column, n) returns column's value at the row n rows after the current row.
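A tiny illustration of that behaviour, again using in-memory SQLite (3.25 or later) as a stand-in for PostgreSQL. The table name amounts and its three rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE amounts (id INT, amount INT)")
conn.executemany("INSERT INTO amounts VALUES (?, ?)",
                 [(1, 80), (1, 70), (1, 40)])

# For each row, look 1 and 2 rows ahead within the same id,
# walking the amounts from highest to lowest.
rows = conn.execute("""
    SELECT amount,
           LEAD(amount, 1) OVER (PARTITION BY id ORDER BY amount DESC) AS next_1,
           LEAD(amount, 2) OVER (PARTITION BY id ORDER BY amount DESC) AS next_2
    FROM amounts
    ORDER BY amount DESC
""").fetchall()

for row in rows:
    print(row)
```

When there is no row that far ahead in the partition, LEAD returns NULL, so only the first row of each partition carries the complete set of following values.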

Get column after group by latest date

Please help me to get table 2 from table 1
A simple summarisation can be achieved using GROUP BY
select
code1
, max(code2) as code2
, count(*) as times
, max(`date`) as max_dt
from table1
group by
code1
However the result is different to the image:
+-------+-------+-------+------------+
| code1 | code2 | times | max_dt |
+-------+-------+-------+------------+
| 1 | D | 4 | 2020-02-21 |
| 2 | NNNN | 2 | 2021-01-21 |
+-------+-------+-------+------------+
Note:
The maximum of code2 may not be what you expect: "D" sorts after "BBBBB" because strings are compared alphabetically, not by length.
The maximum date for code1 isn't 16/2.
I would not recommend naming any column "date", as it is often a reserved word or keyword and can cause difficulties when developing queries.
For MySQL prior to version 8, a technique to get "the most recent" row may be used as follows:
SELECT code1, code2, `date`
, (select count(*) from mytable t2 where d.code1 = t2.code1) as times
FROM (
SELECT
@row_num :=IF(@prev_value = t.code1, @row_num + 1, 1) AS rn
, t.code1
, t.code2
, t.`date`
, @prev_value := t.code1
FROM mytable t
CROSS JOIN (SELECT @row_num :=1, @prev_value :='') vars
ORDER BY
t.code1
, t.`date` DESC
) as d
WHERE rn = 1
;
Or, for MySQL version 8 or later it is possible to use a much simpler query as there are windowing functions available through the over() clause:
SELECT code1, code2, `date`, times
FROM (
SELECT
row_number() over(partition by t.code1
order by t.`date` DESC) AS rn
, t.code1
, t.code2
, t.`date`
, count(*) over(partition by t.code1) as times
FROM mytable t
) as d
WHERE rn = 1
;
The results of the 2nd and 3rd queries are the same:
+-------+-------+------------+-------+
| code1 | code2 | date | times |
+-------+-------+------------+-------+
| 1 | D | 2020-02-21 | 4 |
| 2 | NNNN | 2021-01-21 | 2 |
+-------+-------+------------+-------+
solution demonstrated at db<>fiddle here
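The difference between MAX(code2) per group and the code2 of the most recent row can be seen in a small sketch. The original data was only available as an image, so the rows below are hypothetical, the date column is named event_date (in line with the naming advice above), and in-memory SQLite (3.25 or later) stands in for MySQL 8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (code1 INT, code2 TEXT, event_date TEXT)")
# Hypothetical rows; ISO-formatted dates sort correctly as text.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, "A", "2020-01-10"), (1, "D", "2020-01-25"),
    (1, "C", "2020-02-01"), (1, "BBBBB", "2020-02-21"),
    (2, "NNNN", "2020-12-01"), (2, "MM", "2021-01-21"),
])

# GROUP BY with MAX picks the alphabetically largest code2 per group.
group_max = conn.execute("""
    SELECT code1, MAX(code2) FROM mytable GROUP BY code1 ORDER BY code1
""").fetchall()

# ROW_NUMBER picks the whole most recent row per group instead.
latest = conn.execute("""
    SELECT code1, code2, event_date, times
    FROM (
        SELECT code1, code2, event_date,
               ROW_NUMBER() OVER (PARTITION BY code1
                                  ORDER BY event_date DESC) AS rn,
               COUNT(*) OVER (PARTITION BY code1) AS times
        FROM mytable
    ) AS d
    WHERE rn = 1
    ORDER BY code1
""").fetchall()

print(group_max)
print(latest)
```

With this data the two approaches disagree on code2 for both groups, which is why the row-number form is the one to use when the other columns must come from the latest row.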

Get top n counted rows in table within group [MySQL]

I'm trying to get just the top 3 selling products within each category (the top 3 products by number of transactions, i.e. count(id), per category). I searched a lot for a possible solution but found nothing. It looks a bit tricky in MySQL since one can't simply use a top() function and so on. Sample data structure below:
+--------+------------+-----------+
| id |category_id | product_id|
+--------+------------+-----------+
| 1 | 10 | 32 |
| 2 | 10 | 34 |
| 3 | 10 | 32 |
| 4 | 10 | 21 |
| 5 | 10 | 100 |
| 6 | 7 | 101 |
| 7 | 7 | 39 |
| 8 | 7 | 41 |
| 9 | 7 | 39 |
+--------+------------+-----------+
In earlier versions of MySQL, I would recommend using variables:
select cp.*
from (select cp.*,
(@rn := if(@c = category_id, @rn + 1,
if(@c := category_id, 1, 1)
)
) as rn
from (select category_id, product_id, count(*) as cnt
from mytable
group by category_id, product_id
order by category_id, count(*) desc
) cp cross join
(select @c := -1, @rn := 0) params
) cp
where rn <= 3;
If you are running MySQL 8.0, you can use window function rank() for this:
select *
from (
select
category_id,
product_id,
count(*) cnt,
rank() over(partition by category_id order by count(*) desc) rn
from mytable
group by category_id, product_id
) t
where rn <= 3
In earlier versions, one option is to filter with a correlated subquery:
select
category_id,
product_id,
count(*) cnt
from mytable t
group by category_id, product_id
having count(*) >= coalesce((
-- third-highest product count in this category (0 if fewer than three products)
select count(*)
from mytable t1
where t1.category_id = t.category_id
group by t1.product_id
order by count(*) desc
limit 2, 1
), 0)
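One thing worth checking with the sample data is how ties behave: rank() gives tied products the same number, so "rn <= 3" can return more than three rows per category, while row_number() always cuts at three. The sketch below compares the two on the question's rows, using in-memory SQLite (3.25 or later) in place of MySQL 8. The product_id tie-breaker is an assumption added for determinism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT, category_id INT, product_id INT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, 10, 32), (2, 10, 34), (3, 10, 32), (4, 10, 21), (5, 10, 100),
    (6, 7, 101), (7, 7, 39), (8, 7, 41), (9, 7, 39),
])

rows = conn.execute("""
    WITH counts AS (
        SELECT category_id, product_id, COUNT(*) AS cnt
        FROM mytable
        GROUP BY category_id, product_id
    )
    SELECT category_id, product_id, cnt,
           RANK() OVER (PARTITION BY category_id
                        ORDER BY cnt DESC) AS rnk,
           ROW_NUMBER() OVER (PARTITION BY category_id
                              ORDER BY cnt DESC, product_id) AS rn
    FROM counts
    ORDER BY category_id, rn
""").fetchall()

# Columns: (category_id, product_id, cnt, rnk, rn)
top_by_rank = [r for r in rows if r[3] <= 3]
top_by_row_number = [r for r in rows if r[4] <= 3]

print(len(top_by_rank), "rows with rank() <= 3")
print(len(top_by_row_number), "rows with row_number() <= 3")
```

Which one is right depends on whether tied products should all be reported or the result must be capped at exactly three per category.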

MySQL calculate score by rank in percentage

I am using MySQL to create a rating system for my database. I want to score each attribute according to its rank, using a formula. Here is the example data:
| ID | VALUE1 | VALUE2|
-----------------------
| 2 | 5 | 20 |
| 4 | 5 | 30 |
| 1 | 3 | 5 |
| 3 | 2 | 8 |
Here is the ideal output I need:
| ID | VALUE1 | RANK1 | Score1 | VALUE2 | RANK2 | Score2 |
---------------------------------------------------------
| 2 | 5 | 1 | 10 | 20 | 2| 8.3|
| 4 | 5 | 1 | 10 | 30 | 1| 10|
| 1 | 3 | 2 | 7.5| 5 | 4| 5|
| 3 | 2 | 3 | 5 | 8 | 3| 6.6|
The formula for score calculation is
5+5*(MaxRank-rank)/(MaxRank-MinRank)
How can I generate multiple rankings like those in the table? I have tried:
SELECT
@min_rank := 1 AS min_rank
, @max_rank1 := (SELECT COUNT(DISTINCT value1) FROM table) AS max_rank1
, @max_rank2 := (SELECT COUNT(DISTINCT value2) FROM table) AS max_rank2
;
SELECT
ID
, R1
, TRUNCATE(5.0+5.0 * (@max_rank1 - R1) / (@max_rank1 - @min_rank), 2) AS Score1
, R2
, TRUNCATE(5.0+5.0 * (@max_rank2 - R2) / (@max_rank2 - @min_rank), 2) AS Score2
FROM (
SELECT
ID
, value1
, FIND_IN_SET( `value1`, (SELECT GROUP_CONCAT(DISTINCT `value1` ORDER BY `value1` DESC) FROM table)) AS R1
, value2
, FIND_IN_SET( `value2`, (SELECT GROUP_CONCAT(DISTINCT `value2` ORDER BY `value2` DESC) FROM table)) AS R2
FROM table
) ranked_table;
It works fine for ranks below 170. My database has approximately 200+ ranks for some values, and ranks larger than 170 come back as 0. In that case, the scores for ranks > 170 are miscalculated. Thank you.
That looks nasty to calculate.
Something like this might do it
SELECT a.ID, a.VALUE1, Sub1.Rank1, (5.0+5.0 * (Sub3.MaxRank1 - Sub1.Rank1) / (Sub3.MaxRank1 - 1)) AS Score1, a.VALUE2, Sub2.Rank2, (5.0+5.0 * (Sub4.MaxRank2 - Sub2.Rank2) / (Sub4.MaxRank2 - 1)) AS Score2
FROM TestTable a
INNER JOIN (SELECT DISTINCT z.VALUE1, (SELECT ((COUNT(DISTINCT VALUE1) + 1)) FROM TestTable y WHERE z.VALUE1 < y.VALUE1) AS RANK1
FROM TestTable z
) Sub1 ON a.VALUE1 = Sub1.VALUE1
INNER JOIN (SELECT DISTINCT z.VALUE2, (SELECT ((COUNT(DISTINCT VALUE2) + 1)) FROM TestTable y WHERE z.VALUE2 < y.VALUE2) AS RANK2
FROM TestTable z
) Sub2 ON a.VALUE2 = Sub2.VALUE2
CROSS JOIN (SELECT COUNT(DISTINCT VALUE1) AS MaxRank1 FROM TestTable) Sub3
CROSS JOIN (SELECT COUNT(DISTINCT VALUE2) AS MaxRank2 FROM TestTable) Sub4
Note I am not sure on your score calculation. The equation you give doesn't appear to me to give the results in your example. But I might just be misreading it.
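A likely cause of the cutoff near rank 170 in the FIND_IN_SET attempt is that GROUP_CONCAT output is truncated at group_concat_max_len (1024 bytes by default), so values past that point are never found. Where window functions are available (MySQL 8 or MariaDB 10.2 and later), the ranks and scores can be computed without building a string at all. The sketch below is one way to do it, run on in-memory SQLite (3.25 or later) as a stand-in. The table name scores is an assumption, MinRank is fixed at 1 as in the attempt above, and results are rounded rather than truncated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INT, value1 INT, value2 INT)")
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)",
                 [(2, 5, 20), (4, 5, 30), (1, 3, 5), (3, 2, 8)])

rows = conn.execute("""
    WITH ranked AS (
        SELECT id, value1, value2,
               DENSE_RANK() OVER (ORDER BY value1 DESC) AS r1,
               DENSE_RANK() OVER (ORDER BY value2 DESC) AS r2
        FROM scores
    )
    -- score = 5 + 5 * (MaxRank - rank) / (MaxRank - MinRank), with MinRank = 1
    SELECT id,
           r1,
           ROUND(5.0 + 5.0 * (MAX(r1) OVER () - r1)
                           / (MAX(r1) OVER () - 1), 2) AS score1,
           r2,
           ROUND(5.0 + 5.0 * (MAX(r2) OVER () - r2)
                           / (MAX(r2) OVER () - 1), 2) AS score2
    FROM ranked
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

On the sample data this gives the ranks from the desired output, and the formula does give the listed scores once they are rounded, for example 5 + 5 * (4 - 2) / (4 - 1) ≈ 8.33 for ID 2. If every row has the same value, MaxRank equals MinRank and the division yields NULL, so that case needs separate handling.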

How to combine adjacent identical items in SQL?

I have some data in database:
id user
1 zhangsan
2 zhangsan
3 zhangsan
4 lisi
5 lisi
6 lisi
7 zhangsan
8 zhangsan
I want to keep the order and combine adjacent rows that have the same user. How can I do it?
In a shell script (with the data in a file named test), I would run:
cat test|cut -d " " -f2|uniq -c
This gives:
3 zhangsan
3 lisi
2 zhangsan
But how can I do this in SQL?
If you try:
SET @name:='',@num:=0;
SELECT id,
@num:= if(@name = user, @num, @num + 1) as number,
@name := user as user
FROM foo
ORDER BY id ASC;
This gives:
+------+--------+------+
| id | number | user |
+------+--------+------+
| 1 | 1 | a |
| 2 | 1 | a |
| 3 | 1 | a |
| 4 | 2 | b |
| 5 | 2 | b |
| 6 | 2 | b |
| 7 | 3 | a |
| 8 | 3 | a |
+------+--------+------+
So then you can try:
SET @name:='',@num:=0;
SELECT COUNT(*) as count, user
FROM (
SELECT @num:= if(@name = user, @num, @num + 1) as number,
@name := user as user
FROM foo
ORDER BY id ASC
) x
GROUP BY number;
Which gives
+-------+------+
| count | user |
+-------+------+
| 3 | a |
| 3 | b |
| 2 | a |
+-------+------+
(I called my table foo and also just used names a and b because I was too lazy to write zhangsan and lisi over and over).
In Oracle, you can do it as follows:
SELECT NAME,
num - lagnum
FROM (SELECT lagname,
NAME,
num,
nvl(lag(num) over(ORDER BY num), 0) lagnum
FROM (SELECT id,
lag(NAME) over(ORDER BY ID) lagname,
NAME,
lead(NAME) over(ORDER BY ID) leadname,
ROWNUM num
FROM (SELECT * FROM test ORDER BY ID))
WHERE (lagname = NAME AND (NAME <> leadname OR leadname IS NULL))
OR (lagname IS NULL AND NAME <> leadname)
OR (lagname <> NAME AND leadname IS NULL)
ORDER BY ID);
In SQL Server (the same approach should carry over to Oracle or DB2 with their own identifier quoting and alias syntax):
with x as(
select c.*, rn = row_number() over (order by c.id)
from test c
left join test n
on c.[user] = n.[user]
and c.[id] + 1 = n.[id]
where n.id is null
)
select a.[user], a.id - coalesce(b.id, 0)
from x a
left join x b
on a.rn = b.rn + 1
I think what you are looking for is to COUNT(ID):
SELECT COUNT(ID) FROM table GROUP BY user
You cannot do this in SQL without some sort of sequential (iterative) analysis. Remember, SQL is a set-based language.
A little improvement to the selected answer would be not to have to define those variables. So this query can be solved in just a single statement:
SELECT COUNT(*) cnt, user
FROM (
SELECT @num := @num + (@name != user) as number,
@name := user as user
FROM t, (select @num := 0, @name := '') as s
ORDER BY id
) x
GROUP BY number
Output:
| CNT | USER |
|-----|----------|
| 3 | zhangsan |
| 3 | lisi |
| 2 | zhangsan |
Fiddle here
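On engines with window functions (MySQL 8, for instance), the same run-length grouping can be written without user variables: flag each row that starts a new run with LAG, then take a running sum of the flags to number the runs. The following is a sketch on in-memory SQLite (3.25 or later). The table name t follows the last answer, and the CTE and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, user TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    (1, "zhangsan"), (2, "zhangsan"), (3, "zhangsan"),
    (4, "lisi"), (5, "lisi"), (6, "lisi"),
    (7, "zhangsan"), (8, "zhangsan"),
])

runs = conn.execute("""
    WITH flagged AS (
        -- 1 when the user differs from the previous row (or there is none).
        SELECT id, user,
               CASE WHEN user = LAG(user) OVER (ORDER BY id)
                    THEN 0 ELSE 1 END AS is_start
        FROM t
    ),
    grouped AS (
        -- The running total of start flags numbers each consecutive run.
        SELECT id, user,
               SUM(is_start) OVER (ORDER BY id) AS grp
        FROM flagged
    )
    SELECT COUNT(*) AS cnt, user
    FROM grouped
    GROUP BY grp, user
    ORDER BY grp
""").fetchall()

for row in runs:
    print(row)
```

Unlike the variable-based queries, this form does not rely on the order in which the engine evaluates expressions within a SELECT list.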