How to use RANK() OVER (PARTITION BY ...) in MySQL

Suppose there are three material types ('COTTON', 'LEATHER', 'SILK') and I want to fetch the dress_ids that have all three of these material types. I want to rank them as well.
Can someone explain step by step how to do this?
I came across a few examples, but none of them was clear to me.
The output should look something like this:
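| DRESS_ID | MATERIAL | LAST_UPDATED_DATE | RANK |
| -------- | -------- | ----------------- | ---- |
| 111 | COTTON | 2019-08-29 | 1 |
| 111 | SILK | 2019-08-30 | 2 |
| 111 | LEATHER | 2019-08-31 | 3 |
| 222 | COTTON | 2019-08-29 | 1 |
| 222 | SILK | 2019-08-30 | 2 |
| 222 | LEATHER | 2019-08-31 | 3 |
| 222 | LEATHER | 2019-09-02 | 4 |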
I get the following error in MySQL Workbench when executing the query below:
Error Code: 1305. FUNCTION rank does not exist.
SELECT dress_id,
rank() over(PARTITION BY dress_id, material ORDER by LAST_UPDATED_DATE asc) as rank
FROM dress_types;

In versions of MySQL before 8.0, which lack window functions, you can use either variables or a correlated subquery.
Because you have only a handful of materials for each dress, a correlated subquery is reasonable, particularly with the right index. The code looks like:
SELECT d.dress_id, d.material,
(SELECT COUNT(*)
FROM dress_types d2
WHERE d2.dress_id = d.dress_id AND
d2.last_updated_date <= d.last_updated_date
) as rank
FROM dress_types d;
Note that this implements the logic suggested by your sample data, not by your query. The window-function query that corresponds to it would be:
SELECT dress_id,
rank() over (PARTITION BY dress_id ORDER by LAST_UPDATED_DATE asc) as `rank`
FROM dress_types;
The index that you want is on dress_types(dress_id, last_updated_date).
The two give the same result as long as no dress has duplicate dates. With duplicates they differ: RANK() gives tied rows the lowest position in the tie, while the COUNT(*) subquery gives them the highest.
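To see where the two approaches diverge, here is a small Python sketch that mimics both calculations for the rows of a single dress. It is not part of the original answer: the function names and the sample dates are made up for illustration.

```python
from datetime import date

def count_rank(dates):
    """Mimics the correlated subquery: count rows with date <= this row's date."""
    return [sum(1 for other in dates if other <= d) for d in dates]

def sql_rank(dates):
    """Mimics RANK(): 1 + number of rows with a strictly earlier date."""
    return [1 + sum(1 for other in dates if other < d) for d in dates]

# Hypothetical dates for one dress_id, without and with a duplicate date.
unique_dates = [date(2019, 8, 29), date(2019, 8, 30), date(2019, 8, 31)]
tied_dates = [date(2019, 8, 29), date(2019, 8, 30), date(2019, 8, 30)]

print(count_rank(unique_dates), sql_rank(unique_dates))  # same values
print(count_rank(tied_dates), sql_rank(tied_dates))      # differ on the tie
```

With unique dates both functions return 1, 2, 3. With the duplicated date the subquery-style count yields 1, 3, 3 while the RANK()-style value yields 1, 2, 2.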

For versions of MySQL before 8.0 you can use variables to simulate the ranking:
SET @rownum := 0;
SET @group_number := 0;
SELECT dress_id, material, last_updated_date, rank FROM (
SELECT @rownum := case
when @group_number = dress_id then @rownum + 1
else 1
end AS rank, dress_id, material, last_updated_date,
@group_number := dress_id
FROM dress_types
ORDER BY
dress_id,
FIELD(material, 'COTTON', 'SILK', 'LEATHER'),
last_updated_date
) t;
Results:
| dress_id | material | last_updated_date | rank |
| -------- | -------- | ------------------- | ---- |
| 111 | COTTON | 2019-08-29 00:00:00 | 1 |
| 111 | SILK | 2019-08-30 00:00:00 | 2 |
| 111 | LEATHER | 2019-08-31 00:00:00 | 3 |
| 222 | COTTON | 2019-08-29 00:00:00 | 1 |
| 222 | SILK | 2019-08-30 00:00:00 | 2 |
| 222 | LEATHER | 2019-08-31 00:00:00 | 3 |
| 222 | LEATHER | 2019-09-02 00:00:00 | 4 |

SELECT T.*,
CASE WHEN @prev_dress_id != T.dress_id THEN @rank:=1
ELSE @rank:=@rank+1
END as rank,
@prev_dress_id := T.dress_id as set_prev_dress_id
FROM
(SELECT dress_id,material,last_updated_date
FROM dress_types T1
WHERE EXISTS (SELECT 1 FROM dress_types E1 WHERE E1.dress_id = T1.dress_ID AND E1.material = 'COTTON')
AND EXISTS (SELECT 1 FROM dress_types E2 WHERE E2.dress_id = T1.dress_ID AND E2.material = 'SILK')
AND EXISTS (SELECT 1 FROM dress_types E3 WHERE E3.dress_id = T1.dress_ID AND E3.material = 'LEATHER')
ORDER BY dress_id asc,last_updated_date asc
)T,(SELECT @prev_dress_id:=-1)V
The inner select picks the dresses that have all 3 materials, ordered by dress_id and last_updated_date.
The outer query joins it with a @prev_dress_id variable that is updated at the end of each row. The CASE expression then computes the rank: it resets to 1 when @prev_dress_id != T.dress_id and increments otherwise.

SELECT dress_id
, material
, LAST_UPDATED_DATE
, rank() over(PARTITION BY dress_id ORDER by LAST_UPDATED_DATE asc) as `rank`
FROM dress_types

Alternative to "ntile" for MySQL version lower than 8?

I am trying the below code, which analyses and scores customers based on recency, frequency and monetary value of transactions.
select customer_id, rfm_recency, rfm_frequency, rfm_monetary
from
(
select customer_id,
ntile(4) over (order by last_order_date) as rfm_recency,
ntile(4) over (order by count_order) as rfm_frequency,
ntile(4) over (order by sum_amount) as rfm_monetary
from
(
select customer_id,
max(local_date) as last_order_date,
count(*) as count_order,
sum(amount) as sum_amount
from transaction
group by customer_id) as T
) as P
However ntile is not available in my MySQL version (v5) as apparently it's a "window function" which works on v8+ only.
I can't find a working alternative to this function. I am very new to SQL so I'm having a hard time figuring it out myself.
Is there an ntile alternative that I can use? The code works fine if I remove the ntile segment.
You should really upgrade to MySQL 8.0 if you need features in MySQL 8.0. They are bound to be easier and more optimized.
I found a way to simulate the ntile query shown in the documentation:
SELECT
val,
ROW_NUMBER() OVER w AS 'row_number',
NTILE(2) OVER w AS 'ntile2',
NTILE(4) OVER w AS 'ntile4'
FROM numbers
WINDOW w AS (ORDER BY val);
Here's a solution:
SELECT val, @r:=@r+1 AS rownum,
FLOOR((@r-1)*2/9)+1 AS ntile2,
FLOOR((@r-1)*4/9)+1 AS ntile4
FROM (SELECT @r:=0,@n:=0) AS _init, numbers;
The 2 and 4 factors are for the ntile(2) and ntile(4) respectively. The 9 value is because there are 9 rows in this example table. You must know the count of the table before you can run this query. The solution also requires user defined variables, which are always kind of tricky.
Result:
+------+--------+--------+--------+
| val | rownum | ntile2 | ntile4 |
+------+--------+--------+--------+
| 1 | 1 | 1 | 1 |
| 1 | 2 | 1 | 1 |
| 2 | 3 | 1 | 1 |
| 3 | 4 | 1 | 2 |
| 3 | 5 | 1 | 2 |
| 3 | 6 | 2 | 3 |
| 4 | 7 | 2 | 3 |
| 4 | 8 | 2 | 4 |
| 5 | 9 | 2 | 4 |
+------+--------+--------+--------+
I'll leave it as an exercise for you to adapt this technique to your query and your table, or to decide that it's time to upgrade to MySQL 8.0.
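The arithmetic behind those columns can be checked outside the database. The following Python sketch (the function name is hypothetical, chosen only for this illustration) applies FLOOR((r - 1) * buckets / row_count) + 1 to each row number and reproduces the ntile2 and ntile4 columns of the 9-row result above:

```python
def ntile_by_formula(row_count, buckets):
    """Apply FLOOR((r - 1) * buckets / row_count) + 1 for r = 1..row_count."""
    # Integer floor division matches FLOOR() here because all values are >= 0.
    return [(r - 1) * buckets // row_count + 1 for r in range(1, row_count + 1)]

print(ntile_by_formula(9, 2))  # [1, 1, 1, 1, 1, 2, 2, 2, 2]
print(ntile_by_formula(9, 4))  # [1, 1, 1, 2, 2, 3, 3, 4, 4]
```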
You can enumerate rows and use arithmetic. Unfortunately, you'll need to do this three times, once per metric:
select floor((seqnum - 1) * 4 / @rn) + 1 as ntile_recency, t.*
from (select (@rn := @rn + 1) as seqnum, t.*
from (select customer_id, max(local_date) as last_order_date, count(*) as count_order,
sum(amount) as sum_amount
from transaction
group by customer_id
order by last_order_date
) t cross join
(select @rn := 0) params
) t;

Get user's highest score from a table

I have a feeling this is a very simple question, but maybe I'm having a brain fart right now and just can't figure out how to go about it.
I have a MySQL table structure like below
+---------------------------------------------------+
| id | date | score | speed | user_id |
+---------------------------------------------------+
| 1 | 2016-11-17 | 2 | 133291 | 17 |
| 2 | 2016-11-17 | 6 | 82247 | 17 |
| 3 | 2016-11-17 | 6 | 21852 | 17 |
| 4 | 2016-11-17 | 1 | 109338 | 17 |
| 5 | 2016-11-17 | 7 | 64762 | 61 |
| 6 | 2016-11-17 | 8 | 49434 | 61 |
Now I can get a particular user's best performance by doing this:
SELECT *
FROM performance
WHERE user_id = 17 AND date = '2016-11-17'
ORDER BY score desc,speed asc LIMIT 1
This should return the row with ID = 3. What I want is a single query that returns one such row for each unique user_id in the table. The result would be something like this:
+---------------------------------------------------+
| id | date | score | speed | user_id |
+---------------------------------------------------+
| 3 | 2016-11-17 | 6 | 21852 | 17 |
| 6 | 2016-11-17 | 8 | 49434 | 61 |
Furthermore, can the same query also sort the resulting table by the same criteria (score desc, speed asc)? Thanks
A simple method uses a correlated subquery:
select p.*
from performance p
where p.date = '2016-11-17' and
p.id = (select p2.id
from performance p2
where p2.user_id = p.user_id and p2.date = p.date
order by score desc, speed asc
limit 1
);
This should be able to take advantage of an index on performance(date, user_id, score, speed).
This is easy if you use variables to emulate ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...).
Explanation:
First, create two variables in the subquery.
Order by user_id, so that @rn resets to 1 when the user changes.
Then order by score desc, speed asc, so that each row gets a row number and the row you want always has rn = 1.
@rn := ... changes @rn for each row:
if the row has a new user_id, @rn is set to 1;
otherwise @rn is set to @rn + 1.
SELECT `id`, `date`, `score`, `speed`, `user_id`
FROM (
SELECT *,
@rn := if(@user_id = `user_id`,
@rn + 1 ,
if(@user_id := `user_id`,1,1)
) as rn
FROM Table1
CROSS JOIN (SELECT @user_id := 0, @rn := 0) as param
WHERE date = '2016-11-17'
ORDER BY `user_id`, `score` desc, `speed` asc
) T
where T.rn =1
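The same sort-then-number logic can be traced in plain Python. This is only an illustrative sketch: the rows are copied from the question, and the function name `best_per_user` is an assumption made for this example, not something from the answer.

```python
# (id, date, score, speed, user_id), copied from the question's table
rows = [
    (1, "2016-11-17", 2, 133291, 17),
    (2, "2016-11-17", 6, 82247, 17),
    (3, "2016-11-17", 6, 21852, 17),
    (4, "2016-11-17", 1, 109338, 17),
    (5, "2016-11-17", 7, 64762, 61),
    (6, "2016-11-17", 8, 49434, 61),
]

def best_per_user(rows):
    """Emulate ROW_NUMBER() per user and keep the rows numbered 1."""
    # ORDER BY user_id, score DESC, speed ASC
    ordered = sorted(rows, key=lambda r: (r[4], -r[2], r[3]))
    best = []
    prev_user, rn = None, 0
    for row in ordered:
        # Reset the counter on a new user, otherwise increment it.
        rn = rn + 1 if row[4] == prev_user else 1
        prev_user = row[4]
        if rn == 1:
            best.append(row)
    return best

for row in best_per_user(rows):
    print(row)
```

For the sample data this keeps the rows with id 3 and id 6, matching the result the question asks for.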
For MySQL, you can try a nested IN subquery with GROUP BY:
select * from performance
where (user_id, score, speed) in (
SELECT user_id, score, min(speed)
FROM performance
WHERE (user_id, score) in (select user_id, max(score)
from performance
group by user_id)
group by user_id, score
);

Sql to find timediff between two rows based on ID

The title of the question is not very descriptive, sorry about that.
Here is the question:
I have a table structured as below, where pk is the primary key and id
is a value shared by several rows.
+------+------+---------------------+
| pk | id | value |
+------+------+---------------------+
| 99 | 1 | 2013-08-06 11:10:00 |
| 100 | 1 | 2013-08-06 11:15:00 |
| 101 | 1 | 2013-08-06 11:20:00 |
| 102 | 1 | 2013-08-06 11:25:00 |
| 103 | 2 | 2013-08-06 15:10:00 |
| 104 | 2 | 2013-08-06 15:15:00 |
| 105 | 2 | 2013-08-06 15:20:00 |
+------+------+---------------------+
What I really need is the difference in value between the first two rows (ordered by value) of each
group (where rows are grouped by id). So for the structure above I need
timediff(value100, value99) [for the id 1 group]
and timediff(value104, value103) [for the id 2 group],
i.e. the time difference between the first two rows, ordered by value, in each group.
One way I can think of is 3 self joins (or 3 subqueries): two to find the
first two rows and a third to subtract them. Any suggestions?
Try this. CTEs are pretty powerful!
WITH CTE AS (
SELECT
value, pk, id,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS rnk
, ROW_NUMBER() OVER (ORDER BY id, value) AS rownum
FROM test
)
SELECT
curr.rnk, prev.rnk, curr.rownum, prev.rownum, curr.pk, prev.pk, curr.id, prev.id, curr.value, prev.value, curr.value - prev.value
FROM CTE curr
INNER JOIN CTE prev on curr.rownum = prev.rownum -1 and curr.id = prev.id
and curr.rnk <=1
It looks a bit weird, but you can try it this way:
SET @previous = 0;
SET @temp = 0;
SET @previousID = 0;
The step above may not be needed, but it makes sure nothing goes wrong.
SELECT pk, id, diff, `value` FROM (
SELECT IF(@previousID = id, @temp := @temp + 1, @temp := 1) occ, @previousID := id AS set_previous_id,
TIMEDIFF(`value`, @previous) diff, pk, id, `value`, @previous := `value` AS set_previous
FROM testtable
ORDER BY id, `value`) a WHERE occ = 2;
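As a cross-check of what the query is expected to return, the following Python sketch computes the difference between the first two values of each id group. The rows are copied from the question; the function name `first_two_diff` is hypothetical and only used for this illustration.

```python
from datetime import datetime

# (pk, id, value), copied from the question's table
rows = [
    (99, 1, "2013-08-06 11:10:00"),
    (100, 1, "2013-08-06 11:15:00"),
    (101, 1, "2013-08-06 11:20:00"),
    (102, 1, "2013-08-06 11:25:00"),
    (103, 2, "2013-08-06 15:10:00"),
    (104, 2, "2013-08-06 15:15:00"),
    (105, 2, "2013-08-06 15:20:00"),
]

def first_two_diff(rows):
    """Return {id: second value - first value}, with values ordered per id."""
    by_id = {}
    for _pk, group_id, value in rows:
        parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        by_id.setdefault(group_id, []).append(parsed)
    diffs = {}
    for group_id, values in by_id.items():
        values.sort()
        if len(values) >= 2:  # groups with a single row have no difference
            diffs[group_id] = values[1] - values[0]
    return diffs

for group_id, diff in first_two_diff(rows).items():
    print(group_id, diff)
```

For the sample data both groups come out at five minutes.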

How to get the nth Highest mark in all branch?

I have a table Student with fields: Student_id, Student_Name, Mark, Branch.
I want to get the nth highest mark and the name for each branch within a single query. Is it possible?
For example, if the data is
S1 | Amir | EC | 121
S2 | Ewe | EC | 123
S3 | Haye | EC | 45
S4 | Mark | EC | 145
S5 | Tom | CS | 152
S6 | Hudd | CS | 218
S7 | Ken | CS | 48
S8 | Ben | CS | 15
S9 | Wode | CS | 123
S10 | Kayle | IT | 125
S11 | Den | IT | 120
S12 | Noy | IT | 126
and I choose to display the third highest mark in each branch, the output should be:
S1 | Amir | EC | 121
S9 | Wode | CS | 123
S11 | Den | IT | 120
This would be much easier if your MySQL version had window functions (they are available from MySQL 8.0), as several of the other answers show. Without them you can use something like the following:
select student_id,
student_name,
branch,
mark
from
(
select student_id,
student_name,
branch,
mark,
@num := if(@branch = `branch`, @num + 1, 1) as group_row_number,
@branch := `branch` as dummy,
overall_row_num
from
(
select student_id,
student_name,
branch,
mark,
@rn:=@rn+1 overall_row_num
from student, (SELECT @rn:=0) r
order by convert(replace(student_id, 'S', ''), signed)
) src
order by branch, mark desc
) grp
where group_row_number = 3
order by overall_row_num
The result would be:
| STUDENT_ID | STUDENT_NAME | BRANCH | MARK |
---------------------------------------------
| S1 | Amir | EC | 121 |
| S9 | Wode | CS | 123 |
| S11 | Den | IT | 120 |
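The grouping logic can also be sketched in Python to confirm the expected rows. The data is copied from the question; the function name `nth_highest_per_branch` is an assumption for this example. Like the query above, it numbers rows rather than ranking them, so tied marks would occupy separate positions.

```python
from collections import defaultdict

# (student_id, student_name, branch, mark), copied from the question
students = [
    ("S1", "Amir", "EC", 121), ("S2", "Ewe", "EC", 123),
    ("S3", "Haye", "EC", 45), ("S4", "Mark", "EC", 145),
    ("S5", "Tom", "CS", 152), ("S6", "Hudd", "CS", 218),
    ("S7", "Ken", "CS", 48), ("S8", "Ben", "CS", 15),
    ("S9", "Wode", "CS", 123), ("S10", "Kayle", "IT", 125),
    ("S11", "Den", "IT", 120), ("S12", "Noy", "IT", 126),
]

def nth_highest_per_branch(students, n):
    """Return the row holding the nth highest mark in each branch."""
    by_branch = defaultdict(list)
    for student in students:
        by_branch[student[2]].append(student)
    result = []
    for branch_rows in by_branch.values():
        ranked = sorted(branch_rows, key=lambda s: s[3], reverse=True)
        if len(ranked) >= n:  # skip branches with fewer than n students
            result.append(ranked[n - 1])
    return result

for row in nth_highest_per_branch(students, 3):
    print(row)
```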
MAX()
select branch, MAX(mark) as 'highest mark' from Student group by branch;
MySQL before 8.0 does not support CTEs or ROW_NUMBER(), so you have to do something like this:
create a temporary table, insert the data into it, select from it, and finally drop it.
CREATE TEMPORARY TABLE temp (branch VARCHAR(30), mark INT, row_numbers INT);
INSERT INTO temp
SELECT a.branch, a.mark, COUNT(*) AS row_numbers FROM student a
JOIN student b ON a.branch = b.branch AND a.mark <= b.mark
GROUP BY a.mark, a.branch
ORDER BY a.mark DESC;
SELECT branch, mark FROM temp WHERE row_numbers = 2;
DROP TEMPORARY TABLE temp;
This is for SQL SERVER :-
;WITH CTE AS
(
SELECT Branch,mark, ROW_NUMBER() OVER (PARTITION BY Branch ORDER BY mark DESC) AS RowNum
FROM
student
)
SELECT Branch,mark from cte
WHERE RowNum = 2
This gives, for example, the 2nd highest mark per branch; change the RowNum filter to pick any Nth level.
Hope it helps.
This one works in oracle
select Student_id,Student_Name,Mark,Branch from (
select Student_id,Student_Name,Mark,Branch,dense_rank() over (partition by Branch order by Mark desc) rnk
from Student ) where rnk=nth_value;
nth_value:=nth highest mark needed.
please check this sqlfiddle: http://sqlfiddle.com/#!4/7b559/3
You can use LIMIT to get the nth highest mark (LIMIT 1,1 gives you the 2nd highest; set the offset as required):
SELECT mark, name
FROM student ORDER BY mark
DESC LIMIT 1,1

How to combine near same item by SQL?

I have some data in database:
id user
1 zhangsan
2 zhangsan
3 zhangsan
4 lisi
5 lisi
6 lisi
7 zhangsan
8 zhangsan
I want to keep the order and combine adjacent items that have the same user. How can I do it?
With a shell script (the data is in a file named test), I would run:
cat test|cut -d " " -f2|uniq -c
This gives the result:
3 zhangsan
3 lisi
2 zhangsan
But how can I do it in SQL?
If you try:
SET @name:='',@num:=0;
SELECT id,
@num:= if(@name = user, @num, @num + 1) as number,
@name := user as user
FROM foo
ORDER BY id ASC;
This gives:
+------+--------+------+
| id | number | user |
+------+--------+------+
| 1 | 1 | a |
| 2 | 1 | a |
| 3 | 1 | a |
| 4 | 2 | b |
| 5 | 2 | b |
| 6 | 2 | b |
| 7 | 3 | a |
| 8 | 3 | a |
+------+--------+------+
So then you can try:
SET @name:='',@num:=0;
SELECT COUNT(*) as count, user
FROM (
SELECT @num:= if(@name = user, @num, @num + 1) as number,
@name := user as user
FROM foo
ORDER BY id ASC
) x
GROUP BY number;
Which gives
+-------+------+
| count | user |
+-------+------+
| 3 | a |
| 3 | b |
| 2 | a |
+-------+------+
(I called my table foo and also just used names a and b because I was too lazy to write zhangsan and lisi over and over).
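The group numbering done with @num is a run-length grouping, the same thing `uniq -c` does. As an illustration only (not part of the original answer, and the function name `run_lengths` is made up), here is the equivalent in Python using the names from the question:

```python
from itertools import groupby

# The user column in id order, as in the question's sample data
users = ["zhangsan", "zhangsan", "zhangsan",
         "lisi", "lisi", "lisi",
         "zhangsan", "zhangsan"]

def run_lengths(users):
    """Count consecutive identical users, preserving the original order."""
    # groupby starts a new group whenever the value changes, which is
    # exactly when the SQL version increments @num.
    return [(len(list(group)), user) for user, group in groupby(users)]

for count, user in run_lengths(users):
    print(count, user)
```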
In Oracle, you can do it as below:
SELECT NAME,
num - lagnum
FROM (SELECT lagname,
NAME,
num,
nvl(lag(num) over(ORDER BY num), 0) lagnum
FROM (SELECT id,
lag(NAME) over(ORDER BY ID) lagname,
NAME,
lead(NAME) over(ORDER BY ID) leadname,
ROWNUM num
FROM (SELECT * FROM test ORDER BY ID))
WHERE (lagname = NAME AND (NAME <> leadname OR leadname IS NULL))
OR (lagname IS NULL AND NAME <> leadname)
OR (lagname <> NAME AND leadname IS NULL)
ORDER BY ID);
In SQL Server (the same idea carries over to Oracle, DB2, ...):
with x as(
select c.*, rn = row_number() over (order by c.id)
from test c
left join test n
on c.[user] = n.[user]
and c.[id] + 1 = n.[id]
where n.id is null
)
select a.[user], a.id - coalesce(b.id, 0)
from x a
left join x b
on a.rn = b.rn + 1
I think what you are looking for is to COUNT(ID):
SELECT COUNT(ID) FROM table GROUP BY user
You cannot do this in sql without doing some sort of sequential (iterative) analysis. Remember sql is set operation language.
A little improvement to the selected answer would be not to have to define those variables. So this query can be solved in just a single statement:
SELECT COUNT(*) cnt, user
FROM (
SELECT @num := @num + (@name != user) as number,
@name := user as user
FROM t, (select @num := 0, @name := '') as s
ORDER BY id
) x
GROUP BY number
Output:
| CNT | USER |
|-----|----------|
| 3 | zhangsan |
| 3 | lisi |
| 2 | zhangsan |