SQL partition by with original order - mysql

Here's the original MySQL table:
+----+-----+
| Id | Num |
+----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
+----+-----+
When I use select Id, Num, row_number() over(partition by Num) from t, MySQL reorders the rows by Num and numbers each Num value as a single group. However, I want to keep the original row order and restart the count whenever Num changes.
Specifically, the output should look like this:
+----+-----+-----+
| Id | Num | row |
+----+-----+-----+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 1 | 1 |
| 6 | 2 | 1 |
| 7 | 2 | 2 |
+----+-----+-----+
How do I write this query in MySQL?

This is a gaps-and-islands problem. I would recommend using the difference between row numbers to identify the groups.
If id is always incrementing without gaps:
select id, num,
row_number() over(partition by num, id - rn order by id) rn
from (
select t.*, row_number() over(partition by num order by id) rn
from mytable t
) t
order by id
Otherwise, we can generate our own incrementing id with another row_number():
select id, num,
row_number() over(partition by num, rn1 - rn2 order by id) rn
from (
select t.*,
row_number() over(order by id) rn1,
row_number() over(partition by num order by id) rn2
from mytable t
) t
order by id
Demo on DB Fiddle - for your sample data, both queries yield:
id | num | rn
-: | --: | -:
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 1 | 1
6 | 2 | 1
7 | 2 | 2
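To see why the difference of row numbers marks the islands, here is a minimal Python sketch that mirrors the second query on the sample data. It is only an illustration of the logic, not MySQL code; the function name and the dictionaries are hypothetical helpers. Each printed tuple is (id, num, rn1, rn2, rn1 - rn2, rn).

```python
from collections import defaultdict

# Sample rows from the question: (id, num).
rows = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)]

def island_row_numbers(rows):
    """Mirror the query: rn1 - rn2 stays constant inside each run of equal num."""
    seen_per_num = defaultdict(int)     # feeds rn2 (partition by num order by id)
    seen_per_island = defaultdict(int)  # feeds the final rn
    result = []
    # rn1 is the global position when ordered by id.
    for rn1, (row_id, num) in enumerate(sorted(rows), start=1):
        seen_per_num[num] += 1
        rn2 = seen_per_num[num]
        island = (num, rn1 - rn2)       # partition by num, rn1 - rn2
        seen_per_island[island] += 1
        result.append((row_id, num, rn1, rn2, rn1 - rn2, seen_per_island[island]))
    return result

steps = island_row_numbers(rows)
for step in steps:
    print(step)
```

The difference column comes out as 0, 0, 0, 3, 1, 4, 4, so each consecutive run of the same num gets its own key and the final counter restarts there.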

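You can do this by writing your own row_number with user variables, which gives you more control over when the counter resets. Note that assigning user variables inside a select is deprecated as of MySQL 8.0, where the window-function approach is the better fit.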
set @prev_num = null;
set @row_number = 0;
select
id,
-- Reset row_number to 1 whenever num changes, else increment it.
@row_number := case
when @prev_num = num then
@row_number + 1
else
1
end as `row_number`,
-- Emulate lag(). This must come after the row_number.
@prev_num := num as num
from foo
order by id;
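The same running-counter idea, written as a small Python sketch to make the order dependence explicit. This is an illustration of the logic only, assuming the rows are visited in id order; the function name is hypothetical.

```python
# Sample rows from the question: (id, num).
rows = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)]

def running_row_numbers(rows):
    prev_num = None   # plays the role of the "previous num" variable
    row_number = 0    # plays the role of the counter variable
    out = []
    for row_id, num in sorted(rows):  # order by id
        # Compare against the previous row first, then remember the current num.
        row_number = row_number + 1 if num == prev_num else 1
        prev_num = num
        out.append((row_id, num, row_number))
    return out

numbered = running_row_numbers(rows)
for row in numbered:
    print(row)
```

Swapping the two statements inside the loop would compare each row with itself, which is the reason the comparison has to happen before the previous value is updated.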

Same idea as the solution proposed by Schwern, just in another MySQL syntax style that I find very simple and easy to use.
Select
id
, num
, value
from
(select
T.id,
T.num,
if( @lastnum = T.num, @Value := @Value + 1,@Value := 1) as Value,
@lastnum := T.num as num2
from
mytable T,
( select @lastnum := 0,
@Value := 1 ) SQLVars
order by
T.id) T;
DB fiddle link - https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=e04692841d091ccd54ee3435a409c67a

Related

How do I use row_number() with partitioning and without ordering?

My table looks like this:
table_1
| Id | Num |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
I want a row_number next to the 'num' column, but as soon as num changes its value, the row_number should reset.
I want my table to look like:
| Id | Num | row_num |
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 1 | 1 |
| 6 | 2 | 1 |
| 7 | 2 | 2 |
One way to get your desired output is to use lag and a conditional sum to flag when the number changes, then you can use row_number and partition by this flag:
with lagNum as (
select id, num, Lag(num) over(order by id) as v
from t
), changed as (
select id, num,
Sum(case when num = v then 0 else 1 end) over(order by id rows unbounded preceding) as v
from lagNum
)
select id, num, row_number() over(partition by v order by id) as row_num
from changed
Example Fiddle
This requires MySQL 8.0 or later, which added support for window functions.
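The lag-and-running-sum steps can be traced with a short Python sketch on the sample data. It is an illustration of the logic only; the list names are hypothetical stand-ins for the two CTE columns.

```python
from collections import Counter
from itertools import accumulate

# Sample rows from the question: (id, num), in id order.
rows = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)]

nums = [num for _, num in rows]
# lag(num): the previous row's num, with no value for the first row.
lagged = [None] + nums[:-1]
# Flag 1 whenever num differs from the previous row (the first row counts as a change).
changed = [0 if num == prev else 1 for num, prev in zip(nums, lagged)]
# The running sum of the flags gives every island its own number.
island = list(accumulate(changed))

# row_number() restarted inside each island.
seen = Counter()
row_num = []
for key in island:
    seen[key] += 1
    row_num.append(seen[key])

for (row_id, num), v, rn in zip(rows, island, row_num):
    print(row_id, num, v, rn)
```

Here the island column is 1, 1, 1, 2, 3, 4, 4, which is the value the final query partitions by.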
This is a type of gaps-and-islands problem. For this version, the simplest solution is probably to identify the islands using the difference of row numbers:
select t.*,
row_number() over (partition by num, seqnum - seqnum_2 order by id) as row_num
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by num order by id) as seqnum_2
from table_1 t
) t;
If you run the subquery, you will see how the difference identifies the "adjacent" values of num.
Note: If (as in your example) the ids are sequential with no gaps, you can simplify this to:
select t.*,
row_number() over (partition by num, id - seqnum_2 order by id) as row_num
from (select t.*,
row_number() over (partition by num order by id) as seqnum_2
from table_1 t
) t;
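One caveat worth checking: the difference of row numbers is only guaranteed to be unique within a single value of num, so partitioning by num together with the difference is the safer island key. A minimal Python sketch on a hypothetical sequence 1, 2, 1 (not the sample data) shows the collision; the function and its include_num flag are illustrative assumptions.

```python
from collections import defaultdict

def row_numbers(nums, include_num):
    """Number rows per island, keyed by the difference alone or by (num, difference)."""
    per_num = defaultdict(int)     # seqnum_2: position within the same num
    per_island = defaultdict(int)  # counter per island key
    out = []
    for seqnum, num in enumerate(nums, start=1):
        per_num[num] += 1
        diff = seqnum - per_num[num]
        key = (num, diff) if include_num else diff
        per_island[key] += 1
        out.append(per_island[key])
    return out

nums = [1, 2, 1]  # hypothetical values of num for ids 1, 2, 3
diff_only = row_numbers(nums, include_num=False)
with_num = row_numbers(nums, include_num=True)
print(diff_only)  # [1, 1, 2]
print(with_num)   # [1, 1, 1]
```

With the difference alone, ids 2 and 3 both get the key 1 even though their num differs, so the third row is numbered 2 instead of 1.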

Calculate rank of the users based on their max score using mysql

I have a table (called users) and I need to rank the users by score, using each user's maximum score.
+-----------+------------+
| User_id | Score |
+-----------+------------+
| 1 | 12258 |
| 1 | 112 |
| 2 | 9678 |
| 5 | 9678 |
| 3 | 689206 |
| 3 | 1868 |
Expected result
+-----------+------------+---------+
| User_id | Score | Rank |
+-----------+------------+---------+
| 3 | 689206 | 1 |
| 1 | 12258 | 2 |
| 2 | 9678 | 3 |
| 5 | 9678 | 3 |
You are looking for DENSE_RANK, but it is only available in MySQL 8.0 and later. On older versions you can emulate it:
use a correlated subquery to get the max Score for each User_id;
use two variables, one to store the rank and another to store the previous value, to produce the DENSE_RANK number.
It looks like this:
CREATE TABLE T(
User_id int,
Score int
);
insert into t values (1,12258);
insert into t values (1,112);
insert into t values (2,9678);
insert into t values (5,9678);
insert into t values (3,689206);
insert into t values (3,1868);
Query 1:
SELECT User_id,Score,Rank
FROM (
SELECT User_id,
Score,
@rank :=IF(@previous = t1.score, @rank, @rank + 1) Rank,
@previous := t1.Score
FROM T t1 CROSS JOIN (SELECT @rank := 0,@previous := 0) r
WHERE t1.Score =
(
SELECT MAX(Score)
FROM T tt
WHERE t1.User_id = tt.User_id
)
ORDER BY Score desc
) t1
Results:
| User_id | Score | Rank |
|---------|--------|------|
| 3 | 689206 | 1 |
| 1 | 12258 | 2 |
| 2 | 9678 | 3 |
| 5 | 9678 | 3 |
Another trick for calculating a DENSE_RANK in MySQL 5.7 (as MySQL 8 does natively) is to use a CASE WHEN with the variable assignments in it.
SELECT User_id, MaxScore AS Score,
CASE
WHEN MaxScore = @prevScore THEN @rnk
WHEN @prevScore := MaxScore THEN @rnk := @rnk+1
ELSE @rnk := @rnk+1
END AS Rank
FROM
(
SELECT User_id, MAX(Score) AS MaxScore
FROM YourTable
GROUP BY User_id
ORDER BY MaxScore DESC, User_id
) AS q
CROSS JOIN (SELECT @rnk := 0, @prevScore := null) AS vars
You can test it here on rextester.
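The two steps that both answers rely on (take each user's max score, then assign a dense rank in descending order) can be sketched in Python on the question's data. This only illustrates the logic; the variable names are hypothetical.

```python
# Rows from the question: (User_id, Score).
scores = [(1, 12258), (1, 112), (2, 9678), (5, 9678), (3, 689206), (3, 1868)]

# Step 1: max score per user (the GROUP BY / correlated subquery).
best = {}
for user_id, score in scores:
    best[user_id] = max(score, best.get(user_id, score))

# Step 2: dense rank, highest score first; ties share a rank and leave no gap.
ordered = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
ranked = []
rank, prev_score = 0, None
for user_id, score in ordered:
    if score != prev_score:  # only a new score value moves the rank forward
        rank += 1
        prev_score = score
    ranked.append((user_id, score, rank))

for row in ranked:
    print(row)
```

Users 2 and 5 share the score 9678, so both end up with rank 3.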

cumulative product over a big MySQL table

I have a big MySQL table on which I'd like to calculate a cumulative product. The product has to be calculated separately for each group, where a group is defined by the value of the first column.
For example :
name | number | cumul | order
-----------------------------
a | 1 | 1 | 1
a | 2 | 2 | 2
a | 1 | 2 | 3
a | 4 | 8 | 4
b | 1 | 1 | 1
b | 1 | 1 | 2
b | 2 | 2 | 3
b | 1 | 2 | 4
I've seen this solution but don't think it would be efficient to join or subselect in my case.
I've seen this solution which is what I want except it does not partition by name.
This is similar to a cumulative sum:
select t.*,
(@p := if(@n = name, @p * number,
if(@n := name, number, number)
)
) as cumul
from t cross join
(select @n := '', @p := 1) params
order by name, `order`;
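As a cross-check of what the query should produce, here is a minimal Python sketch of a per-group cumulative product on the example rows. It illustrates the logic only and is not a statement about performance on a big table; the tuple layout (name, number, order) is an assumption made for the sketch.

```python
from itertools import accumulate, groupby
from operator import itemgetter, mul

# Example rows from the question: (name, number, order).
rows = [
    ("a", 1, 1), ("a", 2, 2), ("a", 1, 3), ("a", 4, 4),
    ("b", 1, 1), ("b", 1, 2), ("b", 2, 3), ("b", 1, 4),
]

result = []
# Sort by name, then order, and restart the product for every new name.
ordered = sorted(rows, key=itemgetter(0, 2))
for name, group in groupby(ordered, key=itemgetter(0)):
    group = list(group)
    products = accumulate((number for _, number, _ in group), mul)
    for (_, number, order), cumul in zip(group, products):
        result.append((name, number, cumul, order))

for row in result:
    print(row)
```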

MySQL get ids of the range between two conditions

Let's say I've a table
+----+------------+
| id | condition |
+----+------------+
| 1 | open |
+----+------------+
| 2 | content |
+----+------------+
| 3 | content |
+----+------------+
| 4 | close |
+----+------------+
| 5 | nocontentx |
+----+------------+
| 6 | nocontenty |
+----+------------+
| 7 | open |
+----+------------+
| 8 | content |
+----+------------+
| 9 | close |
+----+------------+
| 10 | nocontentz |
+----+------------+
| 11 | open |
+----+------------+
| 12 | content |
+----+------------+
and want to get a new table with the IDs (the first and the last) of the rows between "close" and "open". Note that the values between these two conditions are dynamic (I can't search by "nocontent" or anything similar),
so that I get this table:
+----+----------+--------+
| id | start_id | end_id |
+----+----------+--------+
| 1 | 5 | 6 |
+----+----------+--------+
| 2 | 10 | 10 |
+----+----------+--------+
Thanks in advance!
http://sqlfiddle.com/#!2/c255c8/2
You can do this using a correlated subquery:
select (@rn := @rn + 1) as id,
id as startid,
(select id
from atable a2
where a2.id > a.id and
a2.condition = 'close'
order by a2.id asc
limit 1
) as end_id
from atable a cross join
(select @rn := 0) vars
where a.condition = 'open';
The working SQL Fiddle is here.
Note this returns the third open as well. If you don't want it, then add having end_id is not null to the end of the query.
EDIT:
If you know the ids are sequential, you can just add and subtract 1 from the above query:
select (@rn := @rn + 1) as id,
id+1 as startid,
(select id
from atable a2
where a2.id > a.id and
a2.condition = 'open'
order by a2.id asc
limit 1
) - 1 as end_id
from atable a cross join
(select @rn := 0) vars
where a.condition = 'close';
You can also do this in a different way: count the number of opens and closes up to and including any given row and use that count as a group identifier. The way your data is structured, every other group is what you are looking for:
select grp, min(id), max(id)
from (select t.*,
(select sum(t2.condition in ('open', 'close'))
from t t2
where t2.id <= t.id
) as grp
from t
) t
where t.condition not in ('open', 'close') and
grp % 2 = 0
group by grp;
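The counting approach can be traced with a small Python sketch on the table from the question. It illustrates the logic only; the names markers and ranges are hypothetical.

```python
# Rows from the question: (id, condition).
rows = [
    (1, "open"), (2, "content"), (3, "content"), (4, "close"),
    (5, "nocontentx"), (6, "nocontenty"), (7, "open"), (8, "content"),
    (9, "close"), (10, "nocontentz"), (11, "open"), (12, "content"),
]

markers = {"open", "close"}
grp = 0       # number of open/close rows seen so far
ranges = {}   # grp -> (min id, max id)
for row_id, condition in sorted(rows):
    if condition in markers:
        grp += 1   # every marker starts a new group
        continue   # marker rows themselves are filtered out
    if grp % 2 == 0:  # even count: the row sits after a close and before the next open
        low, high = ranges.get(grp, (row_id, row_id))
        ranges[grp] = (min(low, row_id), max(high, row_id))

for key, (start_id, end_id) in sorted(ranges.items()):
    print(key, start_id, end_id)
```

Rows 5 and 6 fall in group 2 and row 10 in group 4, which matches the ranges 5 to 6 and 10 to 10 that the question asks for. This relies on the markers strictly alternating, starting with an open.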

Limiting the output of specific column sql hana

I have a table structure as given below and what I'd like to be able to do is retrieve the top three records with the highest value for each Company code.
I've googled and I couldn't find a better way so hopefully you guys can help me out.
By the way, I'm attempting this in MySQL and SAP HANA, but I'm hoping that if I get help for MySQL alone I can carry the structure of the query over to HANA.
Thanks much!
Here's the table:
http://pastebin.com/xgzCgpKL
In MySQL you can emulate the analytic functions with user variables.
To get exactly three records per group (company) regardless of ties, emulate the ROW_NUMBER() analytic function. Ties in value are broken by plant:
SELECT company, plant, value
FROM
(
SELECT company, plant, value, @n := IF(@g = company, @n + 1, 1) rnum, @g := company
FROM table1 CROSS JOIN (SELECT @n := 0, @g := NULL) i
ORDER BY company, value DESC, plant
) q
WHERE rnum <= 3;
Output:
| COMPANY | PLANT | VALUE |
|---------|-------|-------|
| 1 | C | 5 |
| 1 | B | 4 |
| 1 | A | 3 |
| 2 | G | 6 |
| 2 | C | 5 |
| 2 | D | 3 |
| 3 | E | 8 |
| 3 | A | 7 |
| 3 | B | 3 |
To get all records per group that have a rank from 1 to 3, emulate the DENSE_RANK() analytic function. Records with the same value get the same rank:
SELECT company, plant, value
FROM
(
SELECT company, plant, value, @n := IF(@g = company, IF(@v = value, @n, @n + 1), 1) rnum, @g := company, @v := value
FROM table1 CROSS JOIN (SELECT @n := 0, @g := NULL, @v := NULL) i
ORDER BY company, value DESC, plant
) q
WHERE rnum <= 3;
Output:
| COMPANY | PLANT | VALUE |
|---------|-------|-------|
| 1 | C | 5 |
| 1 | B | 4 |
| 1 | A | 3 |
| 1 | E | 3 |
| 1 | G | 3 |
| 2 | G | 6 |
| 2 | C | 5 |
| 2 | D | 3 |
| 3 | E | 8 |
| 3 | A | 7 |
| 3 | B | 3 |
| 3 | G | 3 |
Here is a SQLFiddle demo.
UPDATE: Now it looks like HANA supports analytic functions so the queries will look like
SELECT company, plant, value
FROM
(
SELECT company, plant, value,
ROW_NUMBER() OVER (PARTITION BY company ORDER BY value DESC) rnum
FROM table1
)
WHERE rnum <= 3;
SELECT company, plant, value
FROM
(
SELECT company, plant, value,
DENSE_RANK() OVER (PARTITION BY company ORDER BY value DESC) rank
FROM table1
)
WHERE rank <= 3;
Here is a SQLFiddle demo. It's for Oracle, but I believe it will work for HANA too.
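The practical difference between the ROW_NUMBER() and DENSE_RANK() variants shows up only when values tie. A minimal Python sketch follows, using hypothetical rows for one company that were chosen to resemble company 1 in the outputs above (the actual table is only available via the pastebin link, so this data is an assumption). The function name and the dense flag are illustrative.

```python
from itertools import groupby

# Hypothetical rows: (company, plant, value).
rows = [(1, "A", 3), (1, "B", 4), (1, "C", 5), (1, "E", 3), (1, "G", 3), (1, "D", 1)]

def top_n(rows, n, dense):
    """Keep rows numbered 1..n per company; dense=True lets ties share a number."""
    # order by company, value desc, plant
    ordered = sorted(rows, key=lambda r: (r[0], -r[2], r[1]))
    kept = []
    for company, group in groupby(ordered, key=lambda r: r[0]):
        rnum, prev_value = 0, None
        for _, plant, value in group:
            # ROW_NUMBER always advances; DENSE_RANK advances only on a new value.
            if not dense or value != prev_value:
                rnum += 1
            prev_value = value
            if rnum <= n:
                kept.append((company, plant, value))
    return kept

by_row_number = top_n(rows, 3, dense=False)
by_dense_rank = top_n(rows, 3, dense=True)
print(by_row_number)
print(by_dense_rank)
```

With these rows, the row-number variant keeps exactly three plants (C, B, A), while the dense-rank variant also keeps E and G because they tie with A on value 3.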