How do I use row_number() with partitioning and without ordering? - mysql

My table looks like :
table_1
| Id | Num |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
I want a row_number next to the 'num' column, but as soon as num changes its value, the row_number resets.
I want my table to look like:
| Id | Num | row_num |
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 1 | 1 |
| 6 | 2 | 1 |
| 7 | 2 | 2 |

One way to get your desired output is to use lag and a conditional sum to flag when the number changes, then you can use row_number and partition by this flag:
with lagNum as (
select id, num, Lag(num) over(order by id) as v
from t
), changed as (
select id, num,
Sum(case when num = v then 0 else 1 end) over(order by id rows unbounded preceding) as v
from lagNum
)
select id, num, row_number() over(partition by v order by id) as row_num
from changed
This requires MySQL 8.0 or later, which added support for window functions.
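To see what the running-sum flag looks like, here is a small sketch that runs the same query shape and also returns the intermediate group number. It assumes SQLite (3.25 or later, via Python's sqlite3) as a stand-in for MySQL 8.0, and the column names prev_num and grp are my own, not from the answer:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8.0; table name t as in the answer.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, num integer)")
conn.executemany("insert into t values (?, ?)",
                 [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)])

query = """
with lagNum as (
  select id, num, lag(num) over (order by id) as prev_num
  from t
), changed as (
  select id, num,
         -- adds 1 on the first row and whenever num differs from the previous row
         sum(case when num = prev_num then 0 else 1 end)
           over (order by id rows unbounded preceding) as grp
  from lagNum
)
select id, num, grp,
       row_number() over (partition by grp order by id) as row_num
from changed
order by id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (id, num, grp, row_num)
```

Each run of equal num values gets its own grp, and row_number() restarts inside each grp.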

This is a type of gaps-and-islands problem. For this version, the simplest solution is probably to identify the islands using the difference of row numbers:
select t.*,
row_number() over (partition by num, seqnum - seqnum_2 order by id) as row_num
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by num order by id) as seqnum_2
from table_1 t
) t;
If you run the subquery, you will see how the difference identifies the "adjacent" values of num.
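As a rough illustration, this sketch runs just the subquery on the sample data and computes the difference per row. It assumes SQLite (3.25 or later, via Python's sqlite3) in place of MySQL 8.0:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8.0.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_1 (id integer primary key, num integer)")
conn.executemany("insert into table_1 values (?, ?)",
                 [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)])

subquery = """
select t.*,
       row_number() over (order by id) as seqnum,
       row_number() over (partition by num order by id) as seqnum_2
from table_1 t
order by id
"""
rows = conn.execute(subquery).fetchall()
# The difference stays constant while num repeats on adjacent rows.
diffs = [seqnum - seqnum_2 for _, _, seqnum, seqnum_2 in rows]
for (id_, num, _, _), diff in zip(rows, diffs):
    print(id_, num, diff)
```

The difference is only guaranteed to be unique within a single num value (for num values 1, 2, 1 the last two rows both get a difference of 1), so it is safest to treat the pair (num, difference) as the island key.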
Note: If (as in your example) the ids are sequential with no gaps, you can simplify this to:
select t.*,
row_number() over (partition by num, id - seqnum_2 order by id) as row_num
from (select t.*,
row_number() over (partition by num order by id) as seqnum_2
from table_1 t
) t;

Related

MySql group by and order by different column

I have a table that I want to group by one column, keeping for each group the most recent row by the date and time column.
**My table**
-------------------------------
id | name | created_at
===+======+===========
1 | a | 2020-11-18 04:33:55
2 | b | 2020-11-14 10:17:28
3 | c | 2020-11-12 20:26:00
4 | a | 2020-11-11 18:35:24
5 | c | 2020-11-10 10:55:04
**Result**
-------------------------------
id | name | created_at
===+======+===========
1 | a | 2020-11-18 04:33:55
2 | b | 2020-11-14 10:17:28
3 | c | 2020-11-12 20:26:00
In the older version of MySQL (5.7.32), the query below works fine:
SELECT * FROM `my_table` GROUP BY name ORDER by created_at DESC
But in the newer version of MySQL (8.0.22), the same query no longer works.
Does anyone have a solution?
WITH
cte AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY name ORDER BY created_at DESC) rn
FROM my_table )
SELECT *
FROM cte
WHERE rn = 1;

SQL partition by with original order

Here's the original MySQL table:
+----+-----+
| Id | Num |
+----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
+----+-----+
When I use select Id, Num, row_number() over(partition by Num) from t, MySQL automatically disrupts the order of the Num column. However, I want to keep Num column order unchanged.
Specifically, the ideal output should be like:
+----+-----+-----+
| Id | Num | row |
+----+-----+-----+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 1 | 1 |
| 6 | 2 | 1 |
| 7 | 2 | 2 |
+----+-----+-----+
How to write this MySQL query?
This is a gaps-and-islands problem. I would recommend using the difference between row numbers to identify the groups.
If id is always incrementing without gaps:
select id, num,
row_number() over(partition by num, id - rn order by id) rn
from (
select t.*, row_number() over(partition by num order by id) rn
from mytable t
) t
order by id
Otherwise, we can generate our own incrementing id with another row_number():
select id, num,
row_number() over(partition by num, rn1 - rn2 order by id) rn
from (
select t.*,
row_number() over(order by id) rn1,
row_number() over(partition by num order by id) rn2
from mytable t
) t
order by id
Demo on DB Fiddle - for your sample data, both queries yield:
id | num | rn
-: | --: | -:
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 1 | 1
6 | 2 | 1
7 | 2 | 2
You can do this by writing your own row_number to have greater control over its partitioning.
set @prev_num = null;
set @row_number = 0;
select
  id,
  -- Reset row_number to 1 whenever num changes, else increment it.
  @row_number := case
    when @prev_num = num then
      @row_number + 1
    else
      1
  end as `row_number`,
  -- Emulate lag(). This must come after the row_number.
  @prev_num := num as num
from foo
order by id;
Same idea as the solution proposed by Schwern, just in another MySQL syntax style that I find simple and easy to use.
Select
  id
  , num
  , value
from
  (select
    T.id,
    T.num,
    if( @lastnum = T.num, @Value := @Value + 1, @Value := 1) as Value,
    @lastnum := T.num as num2
  from
    mytable T,
    ( select @lastnum := 0,
             @Value := 1 ) SQLVars
  order by
    T.id) T;
DB fiddle link - https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=e04692841d091ccd54ee3435a409c67a

Select only last record of distinct values [duplicate]

I have a table like that
| Symbol | Value | created_at |
|:-----------|------------:|:------------:|
| A | A1 | 01/01/1970 |
| A | A2 | 01/01/2020 |
| B | B1 | 01/01/1970 |
| B | B2 | 01/01/2020 |
| C | C1 | 01/01/1970 |
| C | C2 | 01/01/2020 |
I need to query only the last record ( sorted by created_at ) of each symbol in the table
Expected output is this :
| Symbol | Value | created_at |
|:-----------|------------:|:------------:|
| A | A2 | 01/01/2020 |
| B | B2 | 01/01/2020 |
| C | C2 | 01/01/2020 |
I have no idea how to achieve this. Do you have any suggestions? Thank you!
One option is to filter with a subquery:
select t.*
from mytable t
where t.created_at = (
select max(t1.created_at) from mytable t1 where t1.symbol = t.symbol
)
This query would take advantage of an index on (symbol, created_at).
If you are running MySQL 8.0, you can also use row_number():
select t.*
from (
select t.*, row_number() over(partition by symbol order by created_at desc) rn
from mytable t
) t
where rn = 1
with t as
(
select *, row_number() over(PARTITION BY Symbol ORDER BY created_at DESC) as rn
from your_table
)
select * from t
where rn = 1
You can use windowing functions (something like the following should work although I haven't tested it):
select * from (select t.*, row_number() over(partition by symbol order by created_at desc) as rownum from mytable t) t where rownum = 1

How to increment count of occurences of column value in MySQL

I have the following column names:
customer_email
increment_id
other_id (pseudo name)
created_at
increment_id and other_id will be unique; customer_email will have duplicates. As the results are returned, I want to know which occurrence of the email each row is.
For each row, I want to know how many times the customer_email value has shown up so far. There will be an order by clause at the end on the created_at field, and I also plan to add a where clause of occurrences < 2.
I am querying > 5 million rows but performance isn't too important because I'll be running this as a report on a read-replica database from production. In my use case, I will sacrifice performance for robustness.
| customer_email | increment_id | other_id | created_at          | occurrences <- I want this |
|----------------|--------------|----------|---------------------|----------------------------|
| joe@test.com   | 1            | 81       | 2019-11-00 00:00:00 | 1                          |
| sue@test.com   | 2            | 82       | 2019-11-00 00:01:00 | 1                          |
| bill@test.com  | 3            | 83       | 2019-11-00 00:02:00 | 1                          |
| joe@test.com   | 4            | 84       | 2019-11-00 00:03:00 | 2                          |
| mike@test.com  | 5            | 85       | 2019-11-00 00:04:00 | 1                          |
| sue@test.com   | 6            | 86       | 2019-11-00 00:05:00 | 2                          |
| joe@test.com   | 7            | 87       | 2019-11-00 00:06:00 | 3                          |
You can use variables in earlier versions of MySQL:
select t.*,
(@rn := if(@ce = customer_email, @rn + 1,
if(@ce := customer_email, 1, 1)
)
) as occurrences
from (select t.*
from t
order by customer_email, created_at
) t cross join
(select @ce := '', @rn := 0) params;
In MySQL 8+, I would recommend row_number():
select t.*,
row_number() over (partition by customer_email order by created_at) as occurrences
from t;
If you are running MySQL 8.0, you can just do a window count:
select
t.*,
count(*) over(partition by customer_email order by created_at) occurences
from mytable t
You don't need an order by clause at the end of the query for this to work (but you need one if you want to order the results).
If you need to filter on the results of the window count, an additional level is needed, since window functions cannot be used in the where clause of a query:
select *
from (
select
t.*,
count(*) over(partition by customer_email order by created_at) occurences
from mytable t
) t
where occurences < 2
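One thing worth checking before picking between count(*) and row_number() here is how they treat ties in created_at. The sketch below uses made-up rows (the data is hypothetical, with two joe@test.com rows sharing a timestamp) and assumes SQLite 3.25 or later as a stand-in for MySQL 8.0. With an order by and no explicit frame, the window count includes all peer rows, so tied rows get the same value:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8.0; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (customer_email text, increment_id integer, created_at text)"
)
conn.executemany("insert into mytable values (?, ?, ?)", [
    ("joe@test.com", 1, "2019-11-01 00:00:00"),
    ("joe@test.com", 2, "2019-11-01 00:03:00"),
    ("joe@test.com", 3, "2019-11-01 00:03:00"),  # same timestamp as increment_id 2
    ("sue@test.com", 4, "2019-11-01 00:01:00"),
])

query = """
select increment_id,
       count(*) over (partition by customer_email order by created_at) as cnt,
       row_number() over (partition by customer_email
                          order by created_at, increment_id) as rn
from mytable
order by increment_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (increment_id, cnt, rn)
```

If created_at can repeat within one email, row_number() with a tiebreaker column such as increment_id gives a strictly increasing counter, while the window count jumps to the size of the tied group.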

Sql Multi-selecting one line from subgroups inside the same table

Given the following table:
id | group_s | name
_____________________
1 | 1 | pollo
2 | 1 | cordero
3 | 1 | cerdo
4 | 2 | tomates
5 | 2 | naranjas
6 | 2 | manzanas
I would like to randomly select one line from every group.
Example of possible outputs (since it is random):
id | group_s | name
_____________________
3 | 1 | cerdo
5 | 2 | naranjas
or
id | group_s | name
_____________________
1 | 1 | pollo
6 | 2 | manzanas
and so on..
I don't have a clue how to do it. I suppose I should multiselect the table.
I did try the following without success:
SELECT T2.* FROM (
SELECT group_s
FROM mytable
GROUP BY group_s ORDER BY RAND() LIMIT 1) AS T1
JOIN mytable AS T2
ON T1.group_s = T2.group_s;
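Use the window function ROW_NUMBER() OVER(PARTITION BY group_s) with ORDER BY NEWID() (NEWID() is SQL Server; in MySQL 8.0+ use RAND() instead) to randomize the order within each group, then keep the first row of each group, something like this: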
WITH CTE
AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY group_s
ORDER BY newid()) AS RN
FROM yourTable
)
SELECT id , group_s , name
FROM CTE
WHERE RN = 1;
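Here is a quick sketch of the same pattern on the question's data. It assumes SQLite 3.25 or later through Python's sqlite3, where the random function is random(); on MySQL 8.0+ the equivalent would be RAND(), and on SQL Server NEWID() as in the answer:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database; random() is SQLite's function.
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer primary key, group_s integer, name text)")
conn.executemany("insert into mytable values (?, ?, ?)", [
    (1, 1, "pollo"), (2, 1, "cordero"), (3, 1, "cerdo"),
    (4, 2, "tomates"), (5, 2, "naranjas"), (6, 2, "manzanas"),
])

query = """
with cte as (
  select *,
         row_number() over (partition by group_s order by random()) as rn
  from mytable
)
select id, group_s, name
from cte
where rn = 1
order by group_s
"""
picked = conn.execute(query).fetchall()
for row in picked:
    print(row)  # one randomly chosen row per group_s
```

Every run returns exactly one row per group_s, but which row it is changes from run to run.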