MySQL - Get distinct value on one field ordered by another field

Let's assume the following MySQL table structure:
+----+-------+-------+
| id | group | count |
+----+-------+-------+
| 1 | cat | 11 |
| 2 | cat | 12 |
| 3 | dog | 4 |
| 4 | dog | 6 |
| 5 | cow | 16 |
| 6 | cow | 12 |
+----+-------+-------+
What I want to do is take one row per animal group, namely the one that comes first when the group is ordered by the count field ascending (the row with the lowest count). In the example above, the output should be:
| 1 | cat | 11 |
| 3 | dog | 4 |
| 6 | cow | 12 |
But it's more complex than it looks. What is the most optimized query to get these results? (Of course, making a subquery for each group is not an option.)

As per the manual...
SELECT x.*
FROM my_table x
JOIN
( SELECT `group`, MIN(count) count FROM my_table GROUP BY `group`) y
ON y.group = x.group
AND y.count = x.count;
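To check the join above against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL (an assumption made so the example runs anywhere), which is why the reserved word group is quoted with double quotes instead of backticks; the alias min_count is also mine.

```python
import sqlite3

# SQLite is only a stand-in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (id INTEGER PRIMARY KEY, "group" TEXT, count INTEGER)')
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "cat", 11), (2, "cat", 12), (3, "dog", 4),
     (4, "dog", 6), (5, "cow", 16), (6, "cow", 12)],
)

# Join every row to its group's minimum count; only the minimum rows survive the join.
rows = conn.execute(
    """
    SELECT x.id, x."group", x.count
    FROM my_table x
    JOIN (SELECT "group", MIN(count) AS min_count
          FROM my_table
          GROUP BY "group") y
      ON y."group" = x."group" AND y.min_count = x.count
    ORDER BY x.id
    """
).fetchall()
print(rows)
```

If two rows in a group share the minimum count, this join returns both of them.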

From MySQL 8.0+ you could use ROW_NUMBER():
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY `group` ORDER BY `count`) AS rn
FROM table_name) sub
WHERE rn = 1;
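The ROW_NUMBER() variant can be tried the same way. This is a sketch under two assumptions: SQLite 3.25 or newer (needed for window functions) stands in for MySQL 8.0, and id is added as a tiebreaker so the pick is deterministic if two rows share the lowest count. The outer ORDER BY is also my addition, to return the picked rows by ascending count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (id INTEGER PRIMARY KEY, "group" TEXT, count INTEGER)')
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "cat", 11), (2, "cat", 12), (3, "dog", 4),
     (4, "dog", 6), (5, "cow", 16), (6, "cow", 12)],
)

# Number the rows of each group by ascending count and keep only the first one.
rows = conn.execute(
    """
    SELECT id, "group", count
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY count, id) AS rn
          FROM my_table) sub
    WHERE rn = 1
    ORDER BY count
    """
).fetchall()
print(rows)
```

Unlike the join on MIN(count), this always returns exactly one row per group, even when there are ties.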

Related

How to select other columns of a table when grouping?

Please assume this table:
// mytable
+--------+-------------+---------+
| num | business_id | user_id |
+--------+-------------+---------+
| 3 | 503 | 12 |
| 7 | 33 | 12 |
| 1 | 771 | 13 |
| 2 | 86 | 13 |
| 1 | 772 | 13 |
| 4 | 652 | 14 |
| 4 | 567 | 14 |
+--------+-------------+---------+
I need to group it based on user_id. Here is my query:
select max(num), user_id from mytable
group by user_id
Here is the result:
// res
+--------+---------+
| num | user_id |
+--------+---------+
| 7 | 12 |
| 2 | 13 |
| 4 | 14 |
+--------+---------+
Now I need to also get the business_id of those rows. Here is the expected result:
// mytable
+--------+-------------+---------+
| num | business_id | user_id |
+--------+-------------+---------+
| 7 | 33 | 12 |
| 2 | 86 | 13 |
| 4 | 567 | 14 | -- This is selected randomly, because of the equality of values
+--------+-------------+---------+
Any idea how I can do that?
You don't group. You filter. One method uses window functions such as row_number():
select t.*
from (select t.*,
row_number() over (partition by user_id order by num desc) as seqnum
from mytable t
) t
where seqnum = 1;
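Another method, which can perform slightly better with an index on (user_id, num), is a correlated subquery. Note that, unlike the row_number() version, it returns every row tied for the maximum (both num = 4 rows for user_id 14):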
select t.*
from mytable t
where t.num = (select max(t2.num)
from mytable t2
where t2.user_id = t.user_id
);
You should think "group by" when you want to summarize rows. You should think "where" when you want to choose rows with particular characteristics.
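The two filtering approaches differ on ties, which a quick run on the question's data makes visible. This is a sketch using Python's sqlite3 (SQLite 3.25+ assumed, as a stand-in for MySQL 8.0); the business_id tiebreaker in the window ordering is my assumption, added only to make the pick for user 14 deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INTEGER, business_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(3, 503, 12), (7, 33, 12), (1, 771, 13), (2, 86, 13),
     (1, 772, 13), (4, 652, 14), (4, 567, 14)],
)

# Window version: exactly one row per user, ties broken by business_id.
one_per_user = conn.execute(
    """
    SELECT num, business_id, user_id
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY user_id
                                    ORDER BY num DESC, business_id) AS seqnum
          FROM mytable t) ranked
    WHERE seqnum = 1
    ORDER BY user_id
    """
).fetchall()

# Correlated subquery: every row equal to its user's maximum, ties included.
with_ties = conn.execute(
    """
    SELECT t.num, t.business_id, t.user_id
    FROM mytable t
    WHERE t.num = (SELECT MAX(t2.num) FROM mytable t2 WHERE t2.user_id = t.user_id)
    ORDER BY t.user_id, t.business_id
    """
).fetchall()

print(one_per_user)
print(with_ties)
```

The first query yields three rows, the second four, because user 14 has two rows with num = 4.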

Sort a table but keep groups of rows together

How do I sort a table by its minimum value per group while keeping each group of rows together? Below is a simple example of what I am trying to accomplish. The table is sorted by the lowest value in each group, but the rows of a group stay together. I am pretty sure this question has been asked already, but I could not find an answer.
+---------+-------+
| Group | value |
+---------+-------+
| 1 | 3.99 |
| 1 | 10.99 |
| 3 | 12.69 |
| 1 | 20.95 |
| 2 | 19.95 |
| 3 | 10.09 |
+---------+-------+
Desired output
+---------+-------+
| Group | value |
+---------+-------+
| 1 | 3.99 |
| 1 | 10.99 |
| 1 | 20.95 |
| 3 | 10.09 |
| 3 | 12.69 |
| 2 | 19.95 |
+---------+-------+
If you are running MySQL 8.0, you can sort with window functions (the group column is called grp here):
select t.*
from mytable t
order by min(value) over(partition by grp), value
In earlier versions, one option is to join an aggregate subquery:
select t.*
from mytable t
inner join (
select grp, min(value) min_value from mytable group by grp
) m on m.grp = t.grp
order by m.min_value, t.value
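Both orderings can be compared on the sample rows. A minimal sketch with Python's sqlite3 (SQLite 3.25+ assumed, standing in for MySQL); grp is added as a second sort key, which is my assumption, so that two groups with the same minimum would not interleave.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (grp INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 3.99), (1, 10.99), (3, 12.69), (1, 20.95), (2, 19.95), (3, 10.09)],
)

# Window version: sort by each group's minimum, then by group, then by value.
windowed = conn.execute(
    """
    SELECT grp, value
    FROM mytable
    ORDER BY MIN(value) OVER (PARTITION BY grp), grp, value
    """
).fetchall()

# Pre-8.0 version: join the per-group minimum and sort on it.
joined = conn.execute(
    """
    SELECT t.grp, t.value
    FROM mytable t
    JOIN (SELECT grp, MIN(value) AS min_value FROM mytable GROUP BY grp) m
      ON m.grp = t.grp
    ORDER BY m.min_value, t.grp, t.value
    """
).fetchall()

print(windowed)
print(joined)
```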
SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY VALUE, ID) AS RN FROM TEMP

Retrieving a variable number of rows using a table join

This is an additional layer of complexity on another question I asked here: Using GROUP BY and ORDER BY in same MySQL query
Same table structure and problem, except this time imagine that the past_elections table is now set up as...
| election_ID | Date | jurisdiction | Race | Seats |
|-------------|------------|----------------|---------------|-------|
| 1 | 2016-11-08 | federal | president | 1 |
| 2 | 2016-11-08 | state_district | state senator | 2 |
(The last record has Seats set to 2 instead of 1.)
I want to use the Seats number to grab different numbers of records, ordered by the number of votes, for each group. So in this case with the following additional tables...
candidates
| Candidate_ID | FirstName | LastName | MiddleName |
|--------------|-----------|----------|------------|
| 1 | Aladdin | Arabia | A. |
| 2 | Long | Silver | John |
| 3 | Thor | Odinson | NULL |
| 4 | Baba | Yaga | NULL |
| 5 | Robin | Hood | Locksley |
| 6 | Sherlock | Holmes | J. |
| 7 | King | Kong | NULL |
past_elections-candidates
| ID | PastElection | Candidate | Votes |
|----|--------------|-----------|-------|
| 1 | 1 | 1 | 200 |
| 2 | 1 | 2 | 100 |
| 3 | 1 | 6 | 50 |
| 4 | 2 | 3 | 75 |
| 5 | 2 | 4 | 25 |
| 6 | 2 | 5 | 150 |
| 7 | 2 | 7 | 100 |
I would expect the following output:
| election_ID | FirstName | LastName | votes | percent |
|-------------|-----------|----------|-------|---------|
| 1 | Aladdin | Arabia | 200 | 0.5714 |
| 2 | Robin | Hood | 150 | 0.4286 |
| 2 | King | Kong | 100 | 0.2857 |
I've tried setting a variable and using it in a LIMIT clause, but variables don't work in LIMIT. I've also tried using ROW_NUMBER() (I'm not using MySQL 8.0, so this won't work, but I'd be willing to upgrade if it did) and a related workaround like @row_number := IF ... followed by filtering on the row number, but nothing has worked.
Last tried query:
SELECT pe.election_ID as elec,
pe.Seats as s,
pecs.row_num,
c.FirstName,
c.LastName,
pecs.max_votes AS votes,
pecs.max_votes / pecs.total_votes AS percent
FROM past_elections pe
JOIN `past_elections-candidates` pec ON pec.PastElection = pe.election_ID
JOIN (SELECT PastElection,
Candidate,
@row_num := IF(PastElection = @current_election, @current_election + 1, 1) as row_num,
MAX(Votes) AS max_votes,
SUM(Votes) AS total_votes,
@current_election := PastElection
FROM `past_elections-candidates`
GROUP BY PastElection) pecs ON pecs.PastElection = pec.PastElection AND pecs.row_num <= pe.Seats
JOIN candidates c ON c.Candidate_ID = pec.Candidate
Use MySQL 8 regardless ;)
Use ROW_NUMBER to order the past elections:
SELECT *, ROW_NUMBER() OVER(PARTITION BY pastelection ORDER BY votes DESC) as rown
FROM `past_elections-candidates`
Join this to past_elections as a subquery. This is just the bit you're stuck on ("using pe.seats to vary the number of rows returned per election") and doesn't include the percent bits:
SELECT *
FROM
past_elections pe
INNER JOIN
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY pastelection ORDER BY votes DESC) as rown
FROM `past_elections-candidates`
) pecr
ON pecr.pastelection = pe.election_ID AND
pecr.rown <= pe.seats
If you want to test things out on 8 before you upgrade, loads of the db fiddle sites support v8
PS: the percentage can be computed at the same time as the ROW_NUMBER, e.g.:
votes/SUM(votes) OVER(PARTITION BY pastelection)
For election ID 1, for example, that sum will be 200+100+50 = 350, giving 200/350 = ~57%:
SELECT *, votes/SUM(votes) OVER(PARTITION BY pastelection) as pcnt, ROW_NUMBER() OVER(PARTITION BY pastelection ORDER BY votes DESC) as rown
FROM `past_elections-candidates`
You need to calculate it before filtering on rown; otherwise the sum only covers the rows that survive the filter.
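Putting the pieces together, here is a sketch of the full query run against the question's data with Python's sqlite3 (SQLite 3.25+ assumed, standing in for MySQL 8.0). The tables are trimmed to the columns the query needs, and Votes is multiplied by 1.0 because SQLite, unlike MySQL, does integer division on integer operands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE past_elections (election_ID INTEGER, Seats INTEGER);
    CREATE TABLE candidates (Candidate_ID INTEGER, FirstName TEXT, LastName TEXT);
    CREATE TABLE "past_elections-candidates"
        (PastElection INTEGER, Candidate INTEGER, Votes INTEGER);
    INSERT INTO past_elections VALUES (1, 1), (2, 2);
    INSERT INTO candidates VALUES
        (1, 'Aladdin', 'Arabia'), (2, 'Long', 'Silver'), (3, 'Thor', 'Odinson'),
        (4, 'Baba', 'Yaga'), (5, 'Robin', 'Hood'), (6, 'Sherlock', 'Holmes'),
        (7, 'King', 'Kong');
    INSERT INTO "past_elections-candidates" VALUES
        (1, 1, 200), (1, 2, 100), (1, 6, 50),
        (2, 3, 75), (2, 4, 25), (2, 5, 150), (2, 7, 100);
    """
)

# Share and row number are computed over ALL candidates of an election;
# the join condition then keeps only the top `Seats` rows per election.
sql = """
    SELECT pe.election_ID, c.FirstName, c.LastName, r.Votes, r.pcnt
    FROM past_elections pe
    JOIN (SELECT PastElection, Candidate, Votes,
                 Votes * 1.0 / SUM(Votes) OVER (PARTITION BY PastElection) AS pcnt,
                 ROW_NUMBER() OVER (PARTITION BY PastElection
                                    ORDER BY Votes DESC) AS rown
          FROM "past_elections-candidates") r
      ON r.PastElection = pe.election_ID AND r.rown <= pe.Seats
    JOIN candidates c ON c.Candidate_ID = r.Candidate
    ORDER BY pe.election_ID, r.Votes DESC
"""
rows = [(e, first, last, votes, round(pcnt, 4))
        for e, first, last, votes, pcnt in conn.execute(sql)]
for row in rows:
    print(row)
```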
I don't have the right fields listed but this is as close as I'll probably get tonight... I've gotten the rows I need but need to join the candidate table to get the name out...
Using Dense_Rank seems to work for this...
SELECT * FROM (
SELECT pec.PastElection,
c.FirstName,
c.LastName,
pec.Votes,
pecs.totalVotes,
pe.Seats as s,
DENSE_RANK() OVER(PARTITION BY pec.PastElection ORDER BY pec.Votes DESC) as rank_votes
FROM `past_elections-candidates` pec
JOIN (SELECT PastElection,
Max(Votes) as maxVotes,
Sum(Votes) as totalVotes
FROM `past_elections-candidates`
GROUP BY PastElection) pecs ON pecs.PastElection = pec.PastElection
JOIN `past_elections` pe ON pec.PastElection = pe.election_ID
JOIN candidates c ON c.Candidate_ID = pec.Candidate
) t WHERE rank_votes <= s;
This results in
| PastElection | FirstName | LastName | Votes | totalVotes | s | rank_votes |
|--------------|-----------|----------|-------|------------|---|------------|
| 1 | Aladdin | Arabia | 200 | 350 | 1 | 1 |
| 2 | Robin | Hood | 150 | 350 | 2 | 1 |
| 2 | King | Kong | 100 | 350 | 2 | 2 |
I guess it's just kind of messy having the rank_votes and s columns in the data, but that's honestly fine with me if it gets the results I need.
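One thing worth knowing when choosing between DENSE_RANK and ROW_NUMBER here: they behave differently when candidates are tied. The sketch below uses Python's sqlite3 (SQLite 3.25+ assumed) with a hypothetical results table and made-up data in which two candidates tie for a single seat; none of these names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (election INTEGER, candidate TEXT, votes INTEGER)")
# Hypothetical data: A and B are tied for the single seat.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "A", 100), (1, "B", 100), (1, "C", 40)],
)
SEATS = 1

def winners(rank_function):
    # rank_function comes from a fixed whitelist, never from user input.
    assert rank_function in ("ROW_NUMBER", "DENSE_RANK")
    sql = f"""
        SELECT candidate
        FROM (SELECT candidate,
                     {rank_function}() OVER (PARTITION BY election
                                             ORDER BY votes DESC) AS rnk
              FROM results) ranked
        WHERE rnk <= ?
    """
    return [row[0] for row in conn.execute(sql, (SEATS,))]

by_row_number = winners("ROW_NUMBER")  # one candidate, picked arbitrarily from the tie
by_dense_rank = winners("DENSE_RANK")  # both tied candidates share rank 1

print(len(by_row_number), sorted(by_dense_rank))
```

So DENSE_RANK can return more rows than there are seats, while ROW_NUMBER never does but breaks ties arbitrarily unless the ORDER BY includes a tiebreaker.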

SELECT query where LIMIT is a distinct count of repeating key

I have a problem selecting a specific amount of data. The problem is that one of the columns has repeated values.
--------------------
| id | name | key |
--------------------
| 1 | alfa | a |
| 2 | alfa | b |
| 3 | alfa | c |
| 4 | beal | a |
| 5 | beal | b |
| 6 | gala | c |
| 7 | gala | d |
| 8 | delt | a |
| 9 | ceta | a |
--------------------
In this situation I want to select three distinct names, together with all of their rows. Something like this pseudocode (not valid SQL):
SELECT * in Table
WHERE `name` LIKE '%al%'
LIMIT BY DISTINCT
`name`, 3
------ RESULT ------
| 1 | alfa | a |
| 2 | alfa | b |
| 3 | alfa | c |
| 4 | beal | a |
| 5 | beal | b |
| 6 | gala | c |
| 7 | gala | d |
--------------------
I would be glad for any help.
Without window functions:
select *
from (
select distinct name
from mytable
where `name` like '%al%'
order by name
limit 3
) n
natural join mytable
If you don't like NATURAL JOINs you can also use
select t.*
from (
select distinct name
from mytable
where `name` like '%al%'
order by name
limit 3
) n
join mytable t on t.name = n.name
If window functions are supported, you can use DENSE_RANK():
with cte as (
select *,
dense_rank() over (order by name) as dr
from mytable
where `name` like '%al%'
)
select id, name, `key`
from cte
where dr <= 3
I prefer the LIMIT 3 subquery, since it can stop the index scan (depending on optimizer) after three distinct names are found.
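The LIMIT subquery can be checked on the question's data with a small sketch in Python's sqlite3 (SQLite standing in for MySQL; the function name rows_for_first_names is hypothetical). The limit is passed as a parameter so the effect of a smaller limit is visible too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, "key" TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "alfa", "a"), (2, "alfa", "b"), (3, "alfa", "c"), (4, "beal", "a"),
     (5, "beal", "b"), (6, "gala", "c"), (7, "gala", "d"), (8, "delt", "a"),
     (9, "ceta", "a")],
)

def rows_for_first_names(limit):
    # Pick the first `limit` distinct matching names, then fetch all their rows.
    sql = """
        SELECT t.id, t.name, t."key"
        FROM (SELECT DISTINCT name
              FROM mytable
              WHERE name LIKE '%al%'
              ORDER BY name
              LIMIT ?) n
        JOIN mytable t ON t.name = n.name
        ORDER BY t.id
    """
    return conn.execute(sql, (limit,)).fetchall()

print(rows_for_first_names(3))  # all rows of alfa, beal and gala
print(rows_for_first_names(2))  # all rows of alfa and beal only
```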
MySQL 8.0 solution utilizing Window functions is as follows:
SELECT
dt.id, dt.name, dt.`key`
FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn,
id,
name,
`key`
FROM your_table_name
WHERE name LIKE '%al%'
) AS dt
WHERE dt.rn <= 3
ORDER BY dt.id
Explanation:
In a derived table (subquery), compute ROW_NUMBER() within each partition (group) of name, ordered by id ascending. Only names matching the %al% condition are considered.
Now use the subquery result to SELECT only the rows with a row number up to 3 (basically limiting to 3 rows per name). Note that this limits the rows per name, not the number of distinct names; it matches the expected result here only because exactly three names match and none of them has more than three rows.
By the way, key is a reserved keyword in MySQL. You should consider renaming the column to something else; otherwise you will need to use backticks around it.
Result
| id | name | key |
| --- | ---- | --- |
| 1 | alfa | a |
| 2 | alfa | b |
| 3 | alfa | c |
| 4 | beal | a |
| 5 | beal | b |
| 6 | gala | c |
| 7 | gala | d |

sort data by specific order sequence (mysql)

So, let's say I have this data:
id | value | group
1 | 100 | A
2 | 120 | A
3 | 150 | B
4 | 170 | B
I want to sort it so it becomes like this:
id | value | group
1 | 100 | A
3 | 150 | B
2 | 120 | A
4 | 170 | B
There will be more groups than that, so if the data has the groups in the order (A,C,B,D,B,C,A), it should become (A,B,C,D,A,B,C).
You can add a counter column to the table, which will be used to sort the table:
select t.id, t.value, t.`group`
from (
select t.id, t.value, t.`group`,
(select count(*) from tablename
where `group` = t.`group` and id < t.id) counter
from tablename t
) t
order by t.counter, t.`group`
Results:
| id | value | group |
| --- | ----- | ----- |
| 1 | 100 | A |
| 3 | 150 | B |
| 2 | 120 | A |
| 4 | 170 | B |
You can also approach this with ROW_NUMBER() (MySQL 8.0+), numbering the rows of each group by id:
SELECT *
FROM `tablename`
ORDER BY
row_number() OVER (PARTITION BY `group` ORDER BY `id`), `group`
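To see the round-robin ordering on the longer sequence from the question (A,C,B,D,B,C,A), here is a sketch with Python's sqlite3 (SQLite 3.25+ assumed, standing in for MySQL 8.0). The ids 1 to 7 are hypothetical, the value column is left out, and the row number is computed in a subquery with ORDER BY id inside the window so the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tablename (id INTEGER PRIMARY KEY, "group" TEXT)')
# Hypothetical ids for the group sequence A,C,B,D,B,C,A from the question.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    list(enumerate("ACBDBCA", start=1)),
)

# Number rows inside each group by id, then sort by that number first:
# all first members (by group), then all second members, and so on.
rows = conn.execute(
    """
    SELECT id, "group"
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id) AS rn
          FROM tablename t) numbered
    ORDER BY rn, "group"
    """
).fetchall()

print("".join(group for _, group in rows))
```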