MySQL Select row with lowest value in column - mysql

I have a table
+------+-----+-------+
| name | age | class |
+------+-----+-------+
| Ben | 4 | B |
| Alex | 7 | A |
| Jim | 3 | B |
| Ben | 5 | C |
| Ben | 2 | C |
| Alex | 9 | A |
+------+-----+-------+
I need a query so that I can select the person with the lowest age such that I get:
+------+-----+-------+
| name | age | class |
+------+-----+-------+
| Ben | 2 | C |
| Jim | 3 | B |
| Alex | 7 | A |
+------+-----+-------+
I've been messing with various combinations of GROUP BY and ORDER BY and can't seem to get it right.
Also, the table consists of about 8 million records so performance is important.

You first have to select the minimum age per class:
select min(age) as age, class as class from t group by class
(Note: I am assuming you want the minimum age per class. If you want the minimum age per name, replace class with name in the queries.)
Then you have to join the result with your table to get the respective rows.
The full SQL would be
select t.* from t
inner join
(
select min(age) as age, class as class from t group by class
) min_ages on t.age = min_ages.age and t.class = min_ages.class;
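For optimal performance, make sure that age is indexed as well as class (or name, whichever you use in your GROUP BY expression). A single composite index on (class, age) usually serves this query best, since it covers both the grouping and the join.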
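The join above can be tried on the sample rows from the question. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the index name idx_class_age is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table name `t` follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, age INTEGER, class TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("Ben", 4, "B"), ("Alex", 7, "A"), ("Jim", 3, "B"),
     ("Ben", 5, "C"), ("Ben", 2, "C"), ("Alex", 9, "A")],
)
# Hypothetical composite index covering the GROUP BY column and the MIN column.
conn.execute("CREATE INDEX idx_class_age ON t (class, age)")

# Step 1: minimum age per class. Step 2: join back to fetch the full rows.
rows = conn.execute(
    """
    SELECT t.name, t.age, t.class
    FROM t
    INNER JOIN (
        SELECT MIN(age) AS age, class FROM t GROUP BY class
    ) min_ages ON t.age = min_ages.age AND t.class = min_ages.class
    ORDER BY t.age
    """
).fetchall()

for row in rows:
    print(row)
```

If two rows in a class share the minimum age, this join returns both of them.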

SELECT t1.name, t1.age, t1.class FROM t t1
JOIN
(SELECT name, MIN(age) AS minage FROM t GROUP BY name) t2
ON t1.name = t2.name AND t1.age = t2.minage

Related

Mysql query to get max age by section and if two or more has same age return student with smallest id

I have a table of students with temporary test values like this:
Table students
+----+-------------+-------+-----------+
| id | section_id | age | name |
+----+-------------+-------+-----------+
| 1 | 1 | 18 | Justin |
+----+-------------+-------+-----------+
| 2 | 2 | 14 | Jillian |
+----+-------------+-------+-----------+
| 3 | 2 | 16 | Cherry |
+----+-------------+-------+-----------+
| 4 | 3 | 19 | Ronald |
+----+-------------+-------+-----------+
| 5 | 3 | 21 | Marie |
+----+-------------+-------+-----------+
| 6 | 3 | 21 | Arthur |
+----+-------------+-------+-----------+
I want to query the table to get the student with the maximum age in each section. However, if two students in a section share the maximum age, the result should return the student with the smallest id.
Return:
+----+------------+-----+--------+
| id | section_id | age | name |
+----+------------+-----+--------+
| 1 | 1 | 18 | Justin |
+----+------------+-----+--------+
| 3 | 2 | 16 | Cherry |
+----+------------+-----+--------+
| 5 | 3 | 21 | Marie |
+----+------------+-----+--------+
I tried this query:
SELECT ANY_VALUE(id), ANY_VALUE(section_id), MAX(age), ANY_VALUE(name) FROM
(SELECT id, section_id, age, name FROM students ORDER BY id) as X
GROUP BY section_id
Unfortunately, there are cases where the returned id and name do not belong to the row with the maximum age.
On my end I have:
sql_mode = only_full_group_by
and I don't have the privilege to change it, hence the ANY_VALUE function, but I have no idea how to use it properly.
This will do what you want.
It starts by finding the maximum age per section (including duplicates).
Then it joins those results with the minimum id per section (to eliminate duplicates).
And finally, select all fields for the matching id and section combinations.
SELECT s3.*
FROM students s3
INNER JOIN (
SELECT MIN(s2.id) AS id, s2.section_id
FROM students s2
INNER JOIN (
SELECT s1.section_id, MAX(s1.age) AS age
FROM students s1
GROUP BY s1.section_id
) s1 USING (section_id, age)
GROUP BY s2.section_id
) s2 USING (id, section_id);
Working SQL fiddle: https://www.db-fiddle.com/f/aezgAYM6A5KnXykceB7At1/0
I would simply use a correlated subquery:
select s.*
from students s
where s.id = (select s2.id
from students s2
where s2.section_id = s.section_id
order by s2.age desc, s2.id asc
limit 1
);
This is pretty much the simplest way to express the logic. And with an index on students(section_id, age, id), it should be the most performant as well.
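To check the tie-breaking behaviour, the correlated subquery can be run against the sample students. This is a sketch using Python's sqlite3 module in place of MySQL; the index name idx_section_age_id is made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; schema follows the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, section_id INTEGER, age INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [(1, 1, 18, "Justin"), (2, 2, 14, "Jillian"), (3, 2, 16, "Cherry"),
     (4, 3, 19, "Ronald"), (5, 3, 21, "Marie"), (6, 3, 21, "Arthur")],
)
# Hypothetical index name; the columns match the ones the subquery filters and sorts on.
conn.execute("CREATE INDEX idx_section_age_id ON students (section_id, age, id)")

# For each row, keep it only if it is the oldest student of its section,
# breaking ties on the smallest id.
rows = conn.execute(
    """
    SELECT s.id, s.section_id, s.age, s.name
    FROM students s
    WHERE s.id = (SELECT s2.id
                  FROM students s2
                  WHERE s2.section_id = s.section_id
                  ORDER BY s2.age DESC, s2.id ASC
                  LIMIT 1)
    ORDER BY s.id
    """
).fetchall()

for row in rows:
    print(row)
```

Marie (id 5) is returned for section 3 rather than Arthur (id 6), because both are 21 and the subquery sorts by id after age.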

MySQL select unique rows in two columns with the highest value in one column

I have a basic table:
+-----+--------+------+------+
| id  | name   | cat  | time |
+-----+--------+------+------+
| 1 | jamie | 1 | 100 |
| 2 | jamie | 2 | 100 |
| 3 | jamie | 1 | 50 |
| 4 | jamie | 2 | 150 |
| 5 | bob | 1 | 100 |
| 6 | tim | 1 | 300 |
| 7 | alice | 4 | 100 |
+-----+--------+------+------+
I tried using the "Left Joining with self, tweaking join conditions and filters" part of this answer: SQL Select only rows with Max Value on a Column, but for some reason it breaks when there are records with a value of 0, and it also doesn't return every unique row.
When doing the query on this table I'd like to receive the following values:
+-----+--------+------+------+
| id  | name   | cat  | time |
+-----+--------+------+------+
| 1 | jamie | 1 | 100 |
| 4 | jamie | 2 | 150 |
| 5 | bob | 1 | 100 |
| 6 | tim | 1 | 300 |
| 7 | alice | 4 | 100 |
+-----+--------+------+------+
Because they are unique on name and cat and have the highest time value.
The query I adapted from the answer above is:
SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
SELECT name, cat, id, MAX(time) as time
FROM data
WHERE extra_column = 1
GROUP BY name, cat
) b ON a.id = b.id AND a.time = b.time
The issue here is that id is unique per row, so you can't get a meaningful id from the grouped subquery when taking the max; you have to join on the grouped values instead.
SELECT a.name, a.cat, a.id, a.time
FROM data a
INNER JOIN (
SELECT name, cat, MAX(time) as time
FROM data
WHERE extra_column = 1
GROUP BY name, cat
) b ON a.cat = b.cat AND a.name = b.name AND a.time = b.time
Think about it: which id is MySQL returning from the inline view? For jamie it could be 1 or 3 (cat 1) and 2 or 4 (cat 2). How does the engine know to pick the id of the row with the max time? It doesn't: the server is "free to choose any value from each group, so unless they are the same, the values chosen are indeterminate." It could pick the wrong one, producing incorrect results, so you can't use it to join on.
https://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html
If you want to use a self join, you could use this query:
SELECT
d1.*
FROM
data d1 LEFT JOIN data d2
ON d1.name=d2.name
AND d1.cat=d2.cat
AND d1.time<d2.time
WHERE
d2.time IS NULL
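The self-join can be checked against the sample rows from the question. The sketch below assumes the table is named data, as in the question, and runs the query through Python's sqlite3 module rather than MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table name `data` is taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, name TEXT, cat INTEGER, time INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, "jamie", 1, 100), (2, "jamie", 2, 100), (3, "jamie", 1, 50),
     (4, "jamie", 2, 150), (5, "bob", 1, 100), (6, "tim", 1, 300),
     (7, "alice", 4, 100)],
)

# d2 matches any row of the same (name, cat) with a strictly higher time.
# Rows for which no such d2 exists are the per-group maxima.
rows = conn.execute(
    """
    SELECT d1.id, d1.name, d1.cat, d1.time
    FROM data d1
    LEFT JOIN data d2
      ON d1.name = d2.name
     AND d1.cat = d2.cat
     AND d1.time < d2.time
    WHERE d2.time IS NULL
    ORDER BY d1.id
    """
).fetchall()

for row in rows:
    print(row)
```

Because the comparison is strict, two rows that tie for the highest time in the same (name, cat) group would both be returned.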
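It is very simple if you only need the highest time per name and cat (note that this does not return the id of the matching row):
SELECT name, cat, MAX(time) AS time FROM data GROUP BY name, cat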

SQL Query for selecting multiple rows but highest value for each PK

I know that the title sounds horrible but I have no idea how to summarize it better. I'm pretty sure that somebody had the same problem before but I couldn't find anything. RDBMS: MySQL.
Problem:
I have the following (simplified) table:
+------+------------+---------------------------------+
| name | date | score |
+------+------------+---------------------------------+
| A | 01.01.2015 | 1 |
| A | 01.02.2015 | 3 |
| A | 01.03.2015 | 4 |
| B | 01.01.2015 | 3 |
| B | 01.02.2015 | 4 |
| B | 01.03.2015 | 5 |
| C | 01.01.2015 | 1 |
| C | 01.02.2015 | 2 |
| C | 01.03.2015 | 3 |
+------+------------+---------------------------------+
There is no unique constraint or PK defined.
The table represents the high-score list of a game. Every day the scores of all players are inserted with values such as: name, points, now(),...
The data represent a snapshot of the score of each player at a specific time.
I want the most recent entry for each user only but only for the highest X players. So the result should look like
+------+------------+---------------------------------+
| name | date | score |
+------+------------+---------------------------------+
| A | 01.03.2015 | 4 |
| B | 01.03.2015 | 5 |
+------+------------+---------------------------------+
C doesn't appear since he's not in the top 2 (by score)
A appears with the most recent row (by date)
B appears, like A, with the most recent row (by date) and because he is in the top 2
I hope it becomes clear what I mean.
Thanks in advance!
I understand that what you need is to first select the X players who've gotten the highest score and then get their latest performance. In this case, you should do this:
SELECT t.*
FROM tablename t
JOIN
(
SELECT t.name, max(t.date) as max_date
FROM tablename t
JOIN
(
SELECT name
FROM
(
SELECT name, max(score) as max_score
FROM tablename
GROUP BY name
) all_highscores
ORDER BY max_score DESC
LIMIT X
) top_scores
ON top_scores.name = t.name
GROUP BY t.name
) top_last
on t.name = top_last.name
and t.date = top_last.max_date;
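The nested query can be exercised on the sample data with X = 2. The sketch below makes a few assumptions: Python's sqlite3 module replaces MySQL, the table is called scores, and the dates are stored as ISO strings (reading 01.02.2015 as 1 February 2015) so that MAX() compares them chronologically.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; `scores` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, date TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("A", "2015-01-01", 1), ("A", "2015-02-01", 3), ("A", "2015-03-01", 4),
     ("B", "2015-01-01", 3), ("B", "2015-02-01", 4), ("B", "2015-03-01", 5),
     ("C", "2015-01-01", 1), ("C", "2015-02-01", 2), ("C", "2015-03-01", 3)],
)

TOP_X = 2  # the "X" placeholder from the answer

rows = conn.execute(
    """
    SELECT t.name, t.date, t.score
    FROM scores t
    JOIN (
        -- latest snapshot date for each of the top players
        SELECT s.name, MAX(s.date) AS max_date
        FROM scores s
        JOIN (
            -- the TOP_X players by their best score
            SELECT name
            FROM (
                SELECT name, MAX(score) AS max_score
                FROM scores
                GROUP BY name
            ) all_highscores
            ORDER BY max_score DESC
            LIMIT ?
        ) top_scores ON top_scores.name = s.name
        GROUP BY s.name
    ) top_last ON t.name = top_last.name AND t.date = top_last.max_date
    ORDER BY t.name
    """,
    (TOP_X,),
).fetchall()

for row in rows:
    print(row)
```

Player C is excluded because its best score (3) is lower than those of A (4) and B (5).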

Mysql include column with no rows returned for specific dates

I would like to ask a quick question regarding a mysql query.
I have a table named trans :
+----+---------------------+------+-------+----------+----------+
| ID | Date | User | PCNum | Customer | trans_In |
+----+---------------------+------+-------+----------+----------+
| 8 | 2013-01-23 16:24:10 | test | PC2 | George | 10 |
| 9 | 2013-01-23 16:27:22 | test | PC2 | Nick | 0 |
| 10 | 2013-01-24 16:28:48 | test | PC2 | Ted | 10 |
| 11 | 2013-01-25 16:36:40 | test | PC2 | Danny | 10 |
+----+---------------------+------+-------+----------+----------+
and another named customers :
+----+---------+-----------+
| ID | Name | Surname |
+----+---------+-----------+
| 1 | George | |
| 2 | Nick | |
| 3 | Ted | |
| 4 | Danny | |
| 5 | Alex | |
| 6 | Mike | |
.
.
.
.
+----+---------+-----------+
I want to view the sum of trans_in column for specific customers in a date range BUT ALSO include in the result set, those customers that haven't got any records in the selected date range. Their sum of trans_in could appear as NULL or 0 it doesn't matter...
I have the following query :
SELECT
`Date`,
Customer,
SUM(trans_in) AS 'input'
FROM trans
WHERE Customer IN('George','Nick','Ted','Danny')
AND `Date` >= '2013-01-24'
GROUP BY Customer
ORDER BY input DESC;
But this will only return the sum for 'Ted' and 'Danny' because they only have transactions after the 24th of January...
How can I include all the customers listed in the WHERE ... IN (...) clause, even those who have no transactions in the selected date range?
I suppose I'll have to join them somehow with the customers table but I cannot figure out how.
Thanks in advance!!
:)
In order to include all records from one table without matching records in another, you have to use a LEFT JOIN.
SELECT
t.`Date`,
c.name,
SUM(t.trans_in) AS 'input'
FROM customers c LEFT JOIN trans t ON (c.name = t.Customer AND t.`Date` >= '2013-01-24')
WHERE c.name IN('George','Nick','Ted','Danny')
GROUP BY c.name
ORDER BY input DESC;
Of course, I would mention that you should be referencing customer by ID, and not by name in your related table. Your current setup leads to information duplication. If the customer changes their name, you now have to update all related records in the trans table instead of just in the customer table.
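The placement of the date condition matters: it has to sit in the ON clause, not in WHERE, or the customers without matching transactions are filtered out again. A small sketch of the difference, using Python's sqlite3 module in place of MySQL and a trimmed-down version of the two tables (the User and PCNum columns are omitted here for brevity):

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column subset chosen for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE trans (id INTEGER PRIMARY KEY, date TEXT, customer TEXT, trans_in INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "George"), (2, "Nick"), (3, "Ted"), (4, "Danny"), (5, "Alex"), (6, "Mike")],
)
conn.executemany(
    "INSERT INTO trans VALUES (?, ?, ?, ?)",
    [(8, "2013-01-23 16:24:10", "George", 10),
     (9, "2013-01-23 16:27:22", "Nick", 0),
     (10, "2013-01-24 16:28:48", "Ted", 10),
     (11, "2013-01-25 16:36:40", "Danny", 10)],
)

# Date filter in ON: unmatched customers survive with a NULL sum.
on_rows = conn.execute(
    """
    SELECT c.name, SUM(t.trans_in) AS input
    FROM customers c
    LEFT JOIN trans t ON c.name = t.customer AND t.date >= '2013-01-24'
    WHERE c.name IN ('George', 'Nick', 'Ted', 'Danny')
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

# Date filter in WHERE: rows with no qualifying transaction are removed.
where_rows = conn.execute(
    """
    SELECT c.name, SUM(t.trans_in) AS input
    FROM customers c
    LEFT JOIN trans t ON c.name = t.customer
    WHERE c.name IN ('George', 'Nick', 'Ted', 'Danny')
      AND t.date >= '2013-01-24'
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(on_rows)
print(where_rows)
```

In the first result George and Nick appear with a NULL sum (None in Python); wrapping the sum in COALESCE(..., 0) would show 0 instead.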
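Try this (a LEFT JOIN starting from customers, so customers without matching transactions are kept; COALESCE turns their NULL sum into 0):
SELECT
customers.Name,
COALESCE(SUM(trans.trans_in), 0) AS 'input'
FROM customers
LEFT JOIN trans
on customers.Name = trans.Customer
AND trans.`Date` >= '2013-01-24'
WHERE customers.Name IN('George','Nick','Ted','Danny')
GROUP BY customers.Name
ORDER BY input DESC;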

LIMIT results to n unique column values?

I have some MySQL results like this:
---------------------------
| name | something_random |
---------------------------
| john | ekjalsdjalfjkldd |
| alex | akjsldfjaekallee |
| alex | jkjlkjslakjfjflj |
| alex | kajslejajejjaddd |
| bob | ekakdie33kkd93ld |
| bob | 33kd993kakakl3ll |
| paul | 3k309dki595k3lkd |
| paul | 3k399kkfkg93lk3l |
etc...
This goes on for 1000's of rows of results. I need to limit the number of results to the first 50 unique names. I think there is a simple solution to this but I'm not sure.
I've tried using derived tables and variables but can't quite get there. If I could figure out how to increment a variable once every time a name is different I think I could say WHERE variable <= 50.
UPDATED
I've tried the Inner Join approach(es) suggested below. The problem is this:
The subselect SELECT DISTINCT name FROM testTable LIMIT 50 grabs the first 50 distinct names. Perhaps I wasn't clear enough in my original post, but this limits my query too much. In my query, not every name in the table is returned in the result. Let me modify my original example:
----------------------------------
| id | name | something_random |
----------------------------------
| 1 | john | ekjalsdjalfjkldd |
| 4 | alex | akjsldfjaekallee |
| 4 | alex | jkjlkjslakjfjflj |
| 4 | alex | kajslejajejjaddd |
| 6 | bob | ekakdie33kkd93ld |
| 6 | bob | 33kd993kakakl3ll |
| 12 | paul | 3k309dki595k3lkd |
| 12 | paul | 3k399kkfkg93lk3l |
etc...
So I added in some id numbers here. These ID numbers pertain to the people's names in the tables. So you can see in the results, not every single person/name in the table is necessarily in the result (due to some WHERE condition). So the 50th distinct name in the list will always have an ID number higher than 49. The 50th person could be id 79, 234, 4954 etc...
So back to the problem. The subselect SELECT DISTINCT name FROM testTable LIMIT 50 selects the first 50 names in the table. That means that my search results will be limited to names that have ID <=50, which is too constricting. If there are certain names that don't show up in the query (due to some WHERE condition), then they are still counted as one of the 50 distinct names. So you end up with too few results.
UPDATE 2
To @trapper: This is a basic simplification of what my query looks like:
SELECT
t1.id,
t1.name,
t2.details
FROM t1
LEFT JOIN t2 ON t1.id = t2.some_id
INNER JOIN
(SELECT DISTINCT name FROM t1 ORDER BY id LIMIT 0,50) s ON s.name = t1.name
WHERE
SOME CONDITIONS
ORDER BY
t1.id,
t1.name
And my results look like this:
----------------------------------
| id | name | details |
----------------------------------
| 1 | john | ekjalsdjalfjkldd |
| 3 | alex | akjsldfjaekallee |
| 3 | alex | jkjlkjslakjfjflj |
| 4 | alex | kajslejajejjaddd |
| 6 | bob | ekakdie33kkd93ld |
| 6 | bob | 33kd993kakakl3ll |
| 12 | paul | 3k309dki595k3lkd |
| 12 | paul | 3k399kkfkg93lk3l |
...
| 37 | bill | kajslejajejjaddd |
| 37 | bill | ekakdie33kkd93ld |
| 41 | matt | 33kd993kakakl3ll |
| 50 | jake | 3k309dki595k3lkd |
| 50 | jake | 3k399kkfkg93lk3l |
----------------------------------
The results stop at id=50. There are NOT 50 distinct names in the list. There are only roughly 23 distinct names.
My MySql syntax may be rusty, but the idea is to use a query to select the top 50 distinct names, then do a self-join on name and select the name and other information from the join.
select a.name, b.something_random
from `table` b
inner join (select distinct name from `table` order by RAND() limit 0,50) a
on a.name = b.name
SELECT DISTINCT name FROM `table` LIMIT 0,50
Edited: Ahh yes I misread question first time, this should do the trick though :)
SELECT a.name, b.something_random
FROM `table` b
INNER JOIN (SELECT DISTINCT name FROM `table` ORDER BY RAND() LIMIT 0,50) a
ON a.name = b.name ORDER BY a.name
How this works: the (SELECT DISTINCT name FROM `table` ORDER BY RAND() LIMIT 0,50) part pulls out the names to include in the join. Here I am taking 50 unique names at random, but you can change this to any other selection criteria if you want.
Then you join those results back into your table. This links each of those 50 selected names back to all of the rows with a matching name for your final results. Finally ORDER BY a.name just to be sure all the rows for each name end up grouped together.
This should do it:
SELECT tA.*
FROM
testTable tA
INNER JOIN
(SELECT distinct name FROM testTable LIMIT 50) tB ON tA.name = tB.name
;
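Regarding the update in the question (names excluded by a WHERE condition still count toward the 50), one possible fix is to repeat the same condition inside the DISTINCT subquery, so the limit only counts names that survive the filter. This is a sketch under assumptions: Python's sqlite3 module replaces MySQL, the filter id >= 2 is a made-up stand-in for the asker's unspecified conditions, and the limit is 2 instead of 50 to keep the data small.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; testTable follows the answers above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testTable (id INTEGER, name TEXT, something_random TEXT)")
conn.executemany(
    "INSERT INTO testTable VALUES (?, ?, ?)",
    [(1, "john", "ekjalsdjalfjkldd"),
     (4, "alex", "akjsldfjaekallee"),
     (4, "alex", "jkjlkjslakjfjflj"),
     (6, "bob", "ekakdie33kkd93ld"),
     (6, "bob", "33kd993kakakl3ll"),
     (12, "paul", "3k309dki595k3lkd")],
)

MIN_ID = 2   # hypothetical WHERE condition; excludes john
N_NAMES = 2  # stands in for the 50 in the question

rows = conn.execute(
    """
    SELECT tA.id, tA.name, tA.something_random
    FROM testTable tA
    INNER JOIN (
        -- apply the filter BEFORE limiting, so excluded names are not counted
        SELECT DISTINCT name
        FROM testTable
        WHERE id >= :min_id
        ORDER BY name
        LIMIT :n
    ) tB ON tA.name = tB.name
    WHERE tA.id >= :min_id
    ORDER BY tA.name, tA.something_random
    """,
    {"min_id": MIN_ID, "n": N_NAMES},
).fetchall()

names = sorted({row[1] for row in rows})
print(names)
print(len(rows))
```

The ORDER BY inside the subquery makes the choice of names deterministic; without it, which N names come back is up to the server.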