Getting a sorted distinct list from mySQL - mysql

Goal
I'd like to get a list of unique FIDs, ordered by which one has most recently been changed. In this sample table it should return FIDs in the order 150, 194, 122.
Example Data
ID FID changeDate
----------------------------------------------
1 194 2010-04-01
2 122 2010-04-02
3 194 2010-04-03
4 150 2010-04-04
My Attempt
I thought distinct and order by would do the trick. I initially tried:
SELECT distinct `FID` FROM `tblHistory` WHERE 1 ORDER BY changeDate desc
# Returns 150, 122, 194
Using GROUP BY has the same result. I'm just barely a SQL amateur, and I'm a bit hung up. What seems to be happening is that the aggregating functions find the first occurrence of each FID and then perform the sort.
Is there a way I can get the result I want straight from MySQL, or do I have to grab all the data and then sort it in PHP?

This worked for me:
SELECT FID
FROM tblHistory
GROUP BY FID
ORDER BY MAX(changeDate) DESC;
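To check this against the sample data from the question, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. Using SQLite instead of MySQL is an assumption made only so the example is self-contained; the SQL text itself is unchanged.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblHistory (ID INTEGER PRIMARY KEY, FID INTEGER, changeDate TEXT)"
)
conn.executemany(
    "INSERT INTO tblHistory (ID, FID, changeDate) VALUES (?, ?, ?)",
    [
        (1, 194, "2010-04-01"),
        (2, 122, "2010-04-02"),
        (3, 194, "2010-04-03"),
        (4, 150, "2010-04-04"),
    ],
)

# One row per FID, ordered by that FID's most recent changeDate.
rows = conn.execute(
    "SELECT FID FROM tblHistory GROUP BY FID ORDER BY MAX(changeDate) DESC"
).fetchall()
fids = [fid for (fid,) in rows]
print(fids)  # [150, 194, 122]
```

The key point is that ORDER BY sorts on the aggregate computed per group, rather than on whichever changeDate the engine happens to pick for each FID.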

Okay, been reading questions on the site since I asked this one, and came up with an answer that seems to work, although I hope perhaps someone else may shed light on a simpler way:
SELECT t1.FID, t1.CD
FROM (
SELECT FID, max(changeDate) as CD
FROM tblHistory
GROUP BY FID
) as t1
WHERE 1
order by t1.CD desc
Seems to get the results I expect. I didn't know subqueries existed until a few minutes ago. I'm a real SQL newbie.

select * from tbl A
where changeDate =
(select max(changeDate) from tbl where tbl.fid = A.fid)
order by changeDate desc

Related

Efficiently get latest appointment for every person sorted by oldest first

I already asked this question earlier but forgot a few (important) details or got them wrong.
My table in MySQL 8.0.29 looks like this
UserID Appointment Description
Bob 2022-06-01 Cleaning
Bob 2022-06-03 Toothache
John 2022-06-02 Braces
I'm trying to get the latest appointment for every person sorted by oldest first.
The query should return
UserID Appointment Description
John 2022-06-02 Braces
Bob 2022-06-03 Toothache
Using one of the previous answers I get
SELECT UserID, Appointment, Description
FROM (
SELECT UserID, Appointment, Description, ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Appointment DESC) AS rn
FROM tbl) t1
WHERE rn = 1
The problem is the database currently has 3 million rows and it'll continue to grow so this query ends up being pretty slow.
My plan is to consume the data in chunks so I'd prefer the query having "pagination". Something like a LIMIT 0, 5000 to get 5000 records at a time.
I'm open to even re-architecting the database if it comes to that.
For now I've resorted to creating a new table that just keeps the latest appointment for each user.
You are halfway there. Use that query as a 'derived table' instead of making it permanent:
SELECT b.*
FROM ( SELECT user_id, MAX(appointment) AS last_date
FROM tbl
GROUP BY user_id ) AS x
JOIN tbl AS b ON b.user_id = x.user_id
AND b.appointment = x.last_date
And be sure to have INDEX(user_id, appointment)
I would be interested to see if this and the "OVER" approach both give the same results and which is faster.
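A rough way to compare the two approaches on the sample rows is sketched below. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8 (an assumption; window functions need SQLite 3.25 or later), and the table name tbl and the index name idx_user_appt are illustrative only. It says nothing about speed on 3 million rows; it only checks that the results agree.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8 (assumption); names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (user_id TEXT, appointment TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("Bob", "2022-06-01", "Cleaning"),
        ("Bob", "2022-06-03", "Toothache"),
        ("John", "2022-06-02", "Braces"),
    ],
)
# Composite index suggested in the answer.
conn.execute("CREATE INDEX idx_user_appt ON tbl (user_id, appointment)")

# Derived-table approach: find each user's latest date, then join back.
derived = conn.execute("""
    SELECT b.user_id, b.appointment, b.description
    FROM (SELECT user_id, MAX(appointment) AS last_date
          FROM tbl
          GROUP BY user_id) AS x
    JOIN tbl AS b ON b.user_id = x.user_id
                 AND b.appointment = x.last_date
    ORDER BY b.appointment
""").fetchall()

# Window-function approach: number each user's rows, newest first.
windowed = conn.execute("""
    SELECT user_id, appointment, description
    FROM (SELECT user_id, appointment, description,
                 ROW_NUMBER() OVER (PARTITION BY user_id
                                    ORDER BY appointment DESC) AS rn
          FROM tbl) AS t1
    WHERE rn = 1
    ORDER BY appointment
""").fetchall()

print(derived == windowed)
print(derived)
```

One difference to keep in mind: if a user has two rows with the same latest appointment, the join returns both of them, while ROW_NUMBER() keeps exactly one.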

Looking for a low footprint solution to GROUP rows using HAVING to filter

Here is a table
id date name
1 180101 josh
2 180101 peter
3 180101 julia
4 180102 robert
5 180103 patrick
6 180104 josh
7 180104 adam
I need to get all the names that share a date with 'josh'. How can I achieve that without grouping the whole table together? I need to keep it efficient (this is not my real table; I simplified my problem here. I have hundreds of thousands of records, and 99% of the rows have different dates, so rows that can be grouped by date are rare).
So basically what I want is: if 'josh' is the target, I need to get 'josh,peter,julia,adam' (actually the first 10 distinct names sharing a date with josh).
SELECT
COUNT(date) as datecount,
GROUP_CONCAT(DISTINCT name) as names
FROM
table
GROUP BY
date
HAVING
datecount>1
-- AND name IN ('josh') would work nicely for me, but I'm getting an error because 'name' is not in the GROUP BY
LIMIT 10
Any ideas? As I mentioned, it needs to be fast, and most of the rows have unique dates.
Join the table with itself on date:
select distinct t1.name
from tbl t1
join tbl t2 using (date)
where t2.name = 'josh'
For the best performance you would have indexes on (name) and (date, name).
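Here is a small sketch of the self-join against the sample table, run through Python's sqlite3 module as a stand-in for MySQL (an assumption). The index names are made up for illustration, and the LIMIT 10 comes from the question rather than from the answer's query.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); index names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, date TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, date, name) VALUES (?, ?, ?)",
    [
        (1, "180101", "josh"),
        (2, "180101", "peter"),
        (3, "180101", "julia"),
        (4, "180102", "robert"),
        (5, "180103", "patrick"),
        (6, "180104", "josh"),
        (7, "180104", "adam"),
    ],
)
# Indexes suggested in the answer: one to find josh's rows, one to find
# everyone else on those dates.
conn.execute("CREATE INDEX idx_name ON tbl (name)")
conn.execute("CREATE INDEX idx_date_name ON tbl (date, name)")

# t2 selects the target's rows; t1 picks up every row sharing one of those dates.
rows = conn.execute(
    """
    SELECT DISTINCT t1.name
    FROM tbl t1
    JOIN tbl t2 USING (date)
    WHERE t2.name = ?
    LIMIT 10
    """,
    ("josh",),
).fetchall()
names = sorted(name for (name,) in rows)
print(names)  # ['adam', 'josh', 'julia', 'peter']
```

Only the dates on which josh appears are ever touched, so the rest of the table does not have to be grouped.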

How to form a subquery

I think I need a subquery for this, and while I have read what subqueries are, I have not found help on how to write a subquery. I am interested in learning how to fish, but I also would like a fish soon, please :)
Simple, 1 table of data:
lastname, (found or not found boolean)
I want to generate some stats, across the whole alphabet, of who has been found.
Desired results:
A : 5 of 16 found, or about 31 percent
B : 2 of 4 found, or about 50 percent
C : 30 of 90 found, or about 30 percent
etc
I can form simple SQL, I need help with forming the subquery, if that's what is needed here.
I can write a query to list how many were found by the first letter of the last name:
select substring(lastname,1,1) as lastinitial, count(*) from members where found !=0 and found is not null group by lastinitial;
I can write a query to list how many total there are, by last initial:
select substring(lastname,1,1) as lastinitial, count(*) from members group by lastinitial;
But how do I combine the two queries to yield the desired result? Thanks for the help.
You probably don't need a subquery for this. The grouping can give you both found and not found for each initial. Just add "found" to the grouping and you will get two records for each initial, one for found and another for not found. You also don't need another query for the total; just add the found and not found counts together.
SELECT SUBSTRING(lastname,1,1) AS lastinitial,
(CASE WHEN found = 1 THEN 1 ELSE 0 END) AS found_val,
COUNT(lastname) AS found_count
FROM members
GROUP BY lastinitial, found_val;
If you want to have both of the found and not found in one row for each letter, try this:
SELECT found_list.lastinitial, found_count, not_found_count
FROM (
SELECT SUBSTRING(lastname,1,1) AS lastinitial, COUNT(lastname) AS found_count
FROM members
WHERE found = 1
GROUP BY lastinitial
) AS found_list,
(
SELECT SUBSTRING(lastname,1,1) AS lastinitial, COUNT(lastname) AS not_found_count
FROM members
WHERE found IS NULL OR found = 0
GROUP BY lastinitial
) AS not_found_list
WHERE found_list.lastinitial = not_found_list.lastinitial
As you can see, the first query is much shorter, more elegant, and also performs faster.
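If one row per letter with both the found count and the total is what you are after, a third option is conditional aggregation. This is a different technique from the two queries above, shown here as a hedged sketch: it runs through Python's sqlite3 module as a stand-in for MySQL, and the member names are invented sample data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (lastname TEXT, found INTEGER)")
conn.executemany(
    "INSERT INTO members (lastname, found) VALUES (?, ?)",
    [
        ("Adams", 1), ("Allen", 0), ("Avery", None), ("Abbott", 1),
        ("Baker", 1), ("Brown", 0),
        ("Clark", None),
    ],
)

# SUM over a CASE counts only the found rows; COUNT(*) counts every row
# in the group, so both numbers come from a single pass.
rows = conn.execute("""
    SELECT SUBSTR(lastname, 1, 1) AS lastinitial,
           SUM(CASE WHEN found = 1 THEN 1 ELSE 0 END) AS found_count,
           COUNT(*) AS total_count
    FROM members
    GROUP BY lastinitial
    ORDER BY lastinitial
""").fetchall()

for initial, found_count, total in rows:
    pct = round(100 * found_count / total)
    print(f"{initial} : {found_count} of {total} found, or about {pct} percent")
```

Note that the letter C still shows up here with 0 found. The two-subquery version joins the found list to the not-found list, so any letter missing from either list drops out of its result.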

Selecting most recent as part of group by (or other solution ...)

I've got a table where the columns that matter look like this:
username
source
description
My goal is to get the 10 most recent records where a user/source combination is unique. From the following data:
1 katie facebook loved it!
2 katie facebook it could have been better.
3 tom twitter less then 140
4 katie twitter Wowzers!
The query should return records 2,3 and 4 (assume higher IDs are more recent - the actual table uses a timestamp column).
My current solution 'works' but requires 1 select to generate the 10 records, then 1 select to get the proper description per row (so 11 selects to generate 10 records) ... I have to imagine there's a better way to go. That solution is:
SELECT max(id) as MAX_ID, username, source, topic
FROM events
GROUP BY source, username
ORDER BY MAX_ID desc;
It returns the proper ids, but the wrong descriptions so I can then select the proper descriptions by the record ID.
Untested, but you should be able to handle this with a join:
SELECT
fullEvent.id,
fullEvent.username,
fullEvent.source,
fullEvent.topic
FROM
events fullEvent JOIN
(
SELECT max(id) as MAX_ID, username, source
FROM events
GROUP BY source, username
) maxEvent ON maxEvent.MAX_ID = fullEvent.id
ORDER BY fullEvent.id desc;
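Since the query above is marked untested, here is a quick sketch that runs it on the four sample rows using Python's sqlite3 module as a stand-in for MySQL (an assumption). The column is named topic to match the queries, although the question's column list calls it description, and the LIMIT 10 is added from the stated goal.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, username TEXT, source TEXT, topic TEXT)"
)
conn.executemany(
    "INSERT INTO events (id, username, source, topic) VALUES (?, ?, ?, ?)",
    [
        (1, "katie", "facebook", "loved it!"),
        (2, "katie", "facebook", "it could have been better."),
        (3, "tom", "twitter", "less then 140"),
        (4, "katie", "twitter", "Wowzers!"),
    ],
)

# The subquery finds the newest id per user/source pair; joining back on
# that id fetches the matching row, so the topic belongs to the right record.
rows = conn.execute("""
    SELECT fullEvent.id, fullEvent.username, fullEvent.source, fullEvent.topic
    FROM events fullEvent
    JOIN (SELECT MAX(id) AS MAX_ID, username, source
          FROM events
          GROUP BY source, username) maxEvent
      ON maxEvent.MAX_ID = fullEvent.id
    ORDER BY fullEvent.id DESC
    LIMIT 10
""").fetchall()
ids = [row[0] for row in rows]
print(ids)  # [4, 3, 2]
```

This replaces the 11 selects with one, because the description is read from the joined row instead of from an arbitrary row in each group.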

Returning query results in predefined order

Is it possible to do a SELECT statement with a predetermined order, i.e. selecting IDs 7, 2, 5, 9 and 8 and returning them in that order, based on nothing more than the ID field?
Both these statements return them in the same order:
SELECT id FROM table WHERE id in (7,2,5,9,8)
SELECT id FROM table WHERE id in (8,2,5,9,7)
I didn't think this was possible, but found a blog entry here that seems to do the type of thing you're after:
SELECT id FROM table WHERE id in (7,2,5,9,8)
ORDER BY FIND_IN_SET(id,"7,2,5,9,8");
will give different results to
SELECT id FROM table WHERE id in (7,2,5,9,8)
ORDER BY FIND_IN_SET(id,"8,2,5,9,7");
FIND_IN_SET returns the position of id in the second argument given to it, so for the first case above, an id of 7 is at position 1 in the set, 2 is at position 2, and so on. MySQL internally works out something like
id | FIND_IN_SET
---|-----------
7 | 1
2 | 2
5 | 3
then orders by the results of FIND_IN_SET.
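To make the mechanics concrete, the sketch below emulates the idea in plain Python. The find_in_set helper is a hypothetical stand-in written for this example, not MySQL's implementation; it returns the 1-based position of a value in a comma-separated string, or 0 when the value is absent, and the rows are then sorted by that number.

```python
def find_in_set(value, csv):
    """Return the 1-based position of value in a comma-separated string, or 0."""
    items = csv.split(",")
    text = str(value)
    return items.index(text) + 1 if text in items else 0


# The ids as a table scan might return them, in no useful order.
ids = [2, 5, 7, 8, 9]

# Sorting by the computed position reproduces the order of the list.
first = sorted(ids, key=lambda i: find_in_set(i, "7,2,5,9,8"))
second = sorted(ids, key=lambda i: find_in_set(i, "8,2,5,9,7"))
print(first)   # [7, 2, 5, 9, 8]
print(second)  # [8, 2, 5, 9, 7]
```

Because the sort key is computed per row, an index on id cannot provide this ordering, which matters more as the list of ids grows.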
Your best bet is:
ORDER BY FIELD(ID,7,2,5,9,8)
...but it's still ugly.
Could you include a case expression that maps your IDs 7,2,5,... to the ordinals 1,2,3,... and then order by that expression?
All ordering is done by the ORDER BY keywords; however, you can only sort ascending and descending. If you are using a language such as PHP you can then sort them accordingly using some code, but I do not believe it is possible with MySQL alone.
This works in Oracle. Can you do something similar in MySql?
SELECT ID_FIELD
FROM SOME_TABLE
WHERE ID_FIELD IN(11,10,14,12,13)
ORDER BY
CASE WHEN ID_FIELD = 11 THEN 0
WHEN ID_FIELD = 10 THEN 1
WHEN ID_FIELD = 14 THEN 2
WHEN ID_FIELD = 12 THEN 3
WHEN ID_FIELD = 13 THEN 4
END
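A CASE expression like this is standard SQL, so it should carry over to MySQL. As a hedged sketch, the snippet below builds the CASE from a Python list and runs it through the built-in sqlite3 module (a stand-in for either database, which is an assumption); the table and column names are illustrative.

```python
import sqlite3

wanted = [11, 10, 14, 12, 13]  # the desired output order

# In-memory SQLite stands in for Oracle/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id_field INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO some_table (id_field) VALUES (?)",
    [(n,) for n in range(10, 16)],
)

# One placeholder per id for the IN list, and one WHEN branch per id that
# maps it to its position in the wanted list.
placeholders = ", ".join("?" for _ in wanted)
whens = " ".join(f"WHEN ? THEN {pos}" for pos in range(len(wanted)))
sql = (
    f"SELECT id_field FROM some_table "
    f"WHERE id_field IN ({placeholders}) "
    f"ORDER BY CASE id_field {whens} END"
)

# The ids are bound twice: first for the IN list, then for the CASE branches.
rows = conn.execute(sql, wanted + wanted).fetchall()
result = [value for (value,) in rows]
print(result)  # [11, 10, 14, 12, 13]
```

Generating the branches from the list keeps the IN clause and the ordering in sync, and binding the ids as parameters avoids pasting values into the SQL string.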
You may need to create a temp table with an autonumber field and insert into it in the desired order. Then sort on the new autonumber field.
Erm, not really. Closest you can get is probably:
SELECT * FROM table WHERE id IN (3, 2, 1, 4) ORDER BY id=4, id=1, id=2, id=3
But you probably don't want that :)
It's hard to give you any more specific advice without more information about what's in the tables.
It's hacky (and probably slow), but you can get the effect with UNION ALL:
SELECT id FROM table WHERE id = 7
UNION ALL SELECT id FROM table WHERE id = 2
UNION ALL SELECT id FROM table WHERE id = 5
UNION ALL SELECT id FROM table WHERE id = 9
UNION ALL SELECT id FROM table WHERE id = 8;
Edit: Other people mentioned the find_in_set function which is documented here.
You get answers fast around here, don't you…
The reason I'm asking this is that it's the only way I can think of to avoid sorting a complex multidimensional array. I'm not saying it would be difficult to sort, but if there were a simpler way to do it with straight sql, then why not.
One Oracle solution is:
SELECT id FROM table WHERE id in (7,2,5,9,8)
ORDER BY DECODE(id,7,1,2,2,5,3,9,4,8,5,6);
This assigns an order number to each ID. Works OK for a small set of values.
Best I can think of is adding a second Column orderColumn:
7 1
2 2
5 3
9 4
8 5
And then just do a ORDER BY orderColumn
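One way to read the ordering-column idea is as a small lookup table that is joined to the data. The sketch below tries that with Python's sqlite3 module standing in for MySQL (an assumption); the table names items and wanted_order are made up for the example.

```python
import sqlite3

wanted = [7, 2, 5, 9, 8]  # the desired output order

# In-memory SQLite stands in for MySQL (assumption); table names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(n,) for n in range(1, 11)])

# Temporary lookup table: each wanted id paired with its position.
conn.execute("CREATE TEMP TABLE wanted_order (id INTEGER PRIMARY KEY, orderColumn INTEGER)")
conn.executemany(
    "INSERT INTO wanted_order (id, orderColumn) VALUES (?, ?)",
    [(item_id, pos) for pos, item_id in enumerate(wanted, start=1)],
)

# The join both filters to the wanted ids and supplies the sort key.
rows = conn.execute("""
    SELECT i.id
    FROM items i
    JOIN wanted_order w ON w.id = i.id
    ORDER BY w.orderColumn
""").fetchall()
result = [item_id for (item_id,) in rows]
print(result)  # [7, 2, 5, 9, 8]
```

This takes more setup than FIND_IN_SET or FIELD, but it scales to long id lists and keeps the ordering as data rather than as a long expression.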