While very easy to do in Perl or PHP, I cannot figure how to use mysql only to extract the first unique occurence of a record.
For example, given the following table:
Name Date Time Sale
John 2010-09-12 10:22:22 500
Bill 2010-08-12 09:22:37 2000
John 2010-09-13 10:22:22 500
Sue 2010-09-01 09:07:21 1000
Bill 2010-07-25 11:23:23 2000
Sue 2010-06-24 13:23:45 1000
I would like to extract the first record for each individual in asc time order.
After sorting the table is ascending time order, I need to extract the first unique record by name.
So the output would be :
Name Date Time Sale
John 2010-09-12 10:22:22 500
Bill 2010-07-25 11:23:23 2000
Sue 2010-06-24 13:23:45 1000
Is this doable in an easy fashion with mySQL?
I think that something along the lines of
select name, date, time, sale from mytable order by date, time group by name;
will get you what you're looking for
you need to perform a groupwise max or groupwise min
see below or http://pastie.org/973117 for an example
select
u.user_id,
u.username,
latest.comment_id
from
users u
left outer join
(
select
max(comment_id) as comment_id,
user_id
from
user_comment
group by
user_id
) latest on u.user_id = latest.user_id;
In databases, there really is no "first" or "last" record; think of each record as its own, non-positional entity in the table. The only positions they have are when you give them one, say, using ORDER BY.
This will give you what you want. It might not be efficient, but it works.
select Name, Date, Time, Sale from
(select Name, Date, Time, Sale from MyTable
order by Date asc, Time asc) MyTable_subquery_name
group by Name
Note: MyTable_subquery_name is just a dummy name for the subquery. MySQL will give the error ERROR 1248 (42000): Every derived table must have its own alias without it.
If only GROUP BY and ORDER BY were communicative operations, then this wouldn't have to be a subquery.
Related
I already asked this question earlier but forgot a few (important) details or got them wrong.
My table in MySQL 8.0.29 looks like this
UserID
Appointment
Description
Bob
2022-06-01
Cleaning
Bob
2022-06-03
Toothache
John
2022-06-02
Braces
I'm trying to get the latest appointment for every person sorted by oldest first.
The query should return
UserID
Appointment
Description
John
2022-06-02
Braces
Bob
2022-06-03
Toothache
Using one of the previous answers I get
SELECT Name, Appointment, Description
FROM (
SELECT Name, Appointment, Description, ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Appointment DESC) rn) t1
WHERE rn = 1
The problem is the database currently has 3 million rows and it'll continue to grow so this query ends up being pretty slow.
My plan is to consume the data in chunks so I'd prefer the query having "pagination". Something like a LIMIT 0, 5000 to get 5000 records at a time.
I'm open to even re-architecting the database if it comes to that.
For now i've resorted to creating a new table that just keeps the latest appointment for each user.
You are halfway there. Use that query as a 'derived table' instead of making it permanent:
SELECT b.*
FROM ( SELECT user_id, MAX(appointment) AS last_date)
FROM tbl
GROUP BY user_id ) AS x
JOIN tbl AS b ON b.user_id = x.user_id
AND b.appointment = x.last_date
And be sure to have INDEX(user_id, appointment)
I would be interested to see if this and the "OVER" approach both give the same results and which is faster.
I have a data set like this:
User Date Status
Eric 1/1/2015 4
Eric 2/1/2015 2
Eric 3/1/2015 4
Mike 1/1/2015 4
Mike 2/1/2015 4
Mike 3/1/2015 2
I'm trying to write a query in which I will retrieve users whose MOST RECENT transaction status is a 4. If it's not a 4 I don't want to see that user in the results. This dataset could have 2 potential results, one for Eric and one for Mike. However, Mike's most recent transaction was not a 4, therefore:
The return result would be:
User Date Status
Eric 3/1/2015 4
As this record is the only record for Eric that has a 4 as his latest transaction date.
Here's what I've tried so far:
SELECT
user, MAX(date) as dates, status
FROM
orders
GROUP BY
status,
user
This would get me to a unqiue record for every user for every status type. This would be a subquery, and the parent query would look like:
SELECT
user, dates, status
WHERE
status = 4
GROUP BY
user
However, this is clearly flawed as I don't want status = 4 records IF their most recent record is not a 4. I only want status = 4 when the latest date is a 4. Any thoughts?
SELECT user, date
, actualOrders.status
FROM (
SELECT user, MAX(date) as date
FROM orders
GROUP BY user) AS lastOrderDates
INNER JOIN orders AS actualOrders USING (user, date)
WHERE actualOrders.status = 4
;
-- Since USING is being used, there is not a need to specify source of the
-- user and date fields in the SELECT clause; however, if an ON clause was
-- used instead, either table could be used as the source of those fields.
Also, you may want to rethink the field names used if it is not too late and user and date are both found here.
SELECT user, date, status FROM
(
SELECT user, MAX(date) as date, status FROM orders GROUP BY user
)
WHERE status = 4
The easiest way is to include your order table a second time in a subquery in your from clause in order to retrieve the last date for each user. Then you can add a where clause to match the most recent date per user, and finally filter on the status.
select orders.*
from orders,
(
select ord_user, max(ord_date) ord_date
from orders
group by ord_user
) latestdate
where orders.ord_status = 4
and orders.ord_user = latestdate.ord_user
and orders.ord_date = latestdate.ord_date
Another option is to use the over partition clause:
Oracle SQL query: Retrieve latest values per group based on time
Regards,
My SQL query needs to return a list of values alongside the date, but with my limited knowledge I have only been able to get this far.
This is my SQL:
select lsu_students.student_grouping,lsu_attendance.class_date,
count(lsu_attendance.attendance_status) AS count
from lsu_attendance
inner join lsu_students
ON lsu_students.student_grouping="Central1A"
and lsu_students.student_id=lsu_attendance.student_id
where lsu_attendance.attendance_status="Present"
and lsu_attendance.class_date="2015-02-09";
This returns:
student_grouping class_date count
Central1A 2015-02-09 23
I want it to return:
student_grouping class_date count
Central1A 2015-02-09 23
Central1A 2015-02-10 11
Central1A 2015-02-11 21
Central1A 2015-02-12 25
This query gets the list of the dates according to the student grouping:
select distinct(class_date)from lsu_attendance,lsu_students
where lsu_students.student_grouping like "Central1A"
and lsu_students.student_id = lsu_attendance.student_id
order by class_date
I think you just want a group by:
select s.student_grouping, a.class_date, count(a.attendance_status) AS count
from lsu_attendance a inner join
lsu_students s
ON s.student_grouping = 'Central1A' and
s.student_id = a.student_id
where a.attendance_status = 'Present'
group by s.student_grouping, a.class_date;
Comments:
Using single quotes for string constants, unless you have a good reason.
If you want a range of class dates, then use a where with appropriate filtering logic.
Notice the table aliases. The query is easier to write and to read.
I added student grouping to the group by. This would be required by any SQL engine other than MySQL.
Just take out and lsu_attendance.class_date="2015-02-09" or change it to a range, and then add (at the end) GROUP BY lsu_students.student_grouping,lsu_attendance.class_date.
The group by clause is what you're looking for, to limit aggregates (e.g. the count function) to work within each group.
To get the number of students present in each group on each date, you would do something like this:
select student_grouping, class_date, count(*) as present_count
from lsu_students join lsu_attendance using (student_id)
where attendance_status = 'Present'
group by student_grouping, class_date
Note: for your example, using is simpler than on (if your SQL supports it), and putting the table name before each field name isn't necessary if the column name doesn't appear in more than one table (though it doesn't hurt).
If you want to limit which data rows get included, put your constraints get in the where clause (this constrains which rows are counted). If you want to constrain the aggregate values that are displayed, you have to use the having clause. For example, to see the count of Central1A students present each day, but only display those dates where more than 20 students showed up:
select student_grouping, class_date, count(*) as present_count
from lsu_students join lsu_attendance using (student_id)
where attendance_status = 'Present' and student_grouping = 'Central1A'
group by student_grouping, class_date
having count(*) > 20
I'm getting grey hair by now...
I have a table like this.
ID - Place - Person
1 - London - Anna
2 - Stockholm - Johan
3 - Gothenburg - Anna
4 - London - Nils
And I want to get the result where all the different persons are included, but I want to choose which Place to order by.
For example. I want to get a list where they are ordered by LONDON and the rest will follow, but distinct on PERSON.
Output like this:
ID - Place - Person
1 - London - Anna
4 - London - Nils
2 - Stockholm - Johan
Tried this:
SELECT ID, Person
FROM users
ORDER BY FIELD(Place,'London'), Person ASC "
But it gives me:
ID - Place - Person
1 - London - Anna
4 - London - Nils
3 - Gothenburg - Anna
2 - Stockholm - Johan
And I really dont want Anna, or any person, to be in the result more then once.
This is one way to get the specified output, but this uses MySQL specific behavior which is not guaranteed:
SELECT q.ID
, q.Place
, q.Person
FROM ( SELECT IF(p.Person<=>#prev_person,0,1) AS r
, #prev_person := p.Person AS person
, p.Place
, p.ID
FROM users p
CROSS
JOIN (SELECT #prev_person := NULL) i
ORDER BY p.Person, !(p.Place<=>'London'), p.ID
) q
WHERE q.r = 1
ORDER BY !(q.Place<=>'London'), q.Person
This query uses an inline view to return all the rows in a particular order, by Person, so that all of the 'Anna' rows are together, followed by all the 'Johan' rows, etc. The set of rows for each person is ordered by, Place='London' first, then by ID.
The "trick" is to use a MySQL user variable to compare the values from the current row with values from the previous row. In this example, we're checking if the 'Person' on the current row is the same as the 'Person' on the previous row. Based on that check, we return a 1 if this is the "first" row we're processing for a a person, otherwise we return a 0.
The outermost query processes the rows from the inline view, and excludes all but the "first" row for each Person (the 0 or 1 we returned from the inline view.)
(This isn't the only way to get the resultset. But this is one way of emulating analytic functions which are available in other RDBMS.)
For comparison, in databases other than MySQL, we could use SQL something like this:
SELECT ROW_NUMBER() OVER (PARTITION BY t.Person ORDER BY
CASE WHEN t.Place='London' THEN 0 ELSE 1 END, t.ID) AS rn
, t.ID
, t.Place
, t.Person
FROM users t
WHERE rn=1
ORDER BY CASE WHEN t.Place='London' THEN 0 ELSE 1 END, t.Person
Followup
At the beginning of the answer, I referred to MySQL behavior that was not guaranteed. I was referring to the usage of MySQL User-Defined variables within a SQL statement.
Excerpts from MySQL 5.5 Reference Manual http://dev.mysql.com/doc/refman/5.5/en/user-variables.html
"As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement."
"For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed."
"the order of evaluation for expressions involving user variables is undefined."
Try this:
SELECT ID, Place, Person
FROM users
GROUP BY Person
ORDER BY FIELD(Place,'London') DESC, Person ASC;
You want to use group by instead of distinct:
SELECT ID, Person
FROM users
GROUP BY ID, Person
ORDER BY MAX(FIELD(Place, 'London')), Person ASC;
The GROUP BY does the same thing as SELECT DISTINCT. But, you are allowed to mention other fields in clauses such as HAVING and ORDER BY.
I've got a table where the columns that matter look like this:
username
source
description
My goal is to get the 10 most recent records where a user/source combination is unique. From the following data:
1 katie facebook loved it!
2 katie facebook it could have been better.
3 tom twitter less then 140
4 katie twitter Wowzers!
The query should return records 2,3 and 4 (assume higher IDs are more recent - the actual table uses a timestamp column).
My current solution 'works' but requires 1 select to generate the 10 records, then 1 select to get the proper description per row (so 11 selects to generate 10 records) ... I have to imagine there's a better way to go. That solution is:
SELECT max(id) as MAX_ID, username, source, topic
FROM events
GROUP BY source, username
ORDER BY MAX_ID desc;
It returns the proper ids, but the wrong descriptions so I can then select the proper descriptions by the record ID.
Untested, but you should be able to handle this with a join:
SELECT
fullEvent.id,
fullEvent.username,
fullEvent.source,
fullEvent.topic
FROM
events fullEvent JOIN
(
SELECT max(id) as MAX_ID, username, source
FROM events
GROUP BY source, username
) maxEvent ON maxEvent.MAX_ID = fullEvent.id
ORDER BY fullEvent.id desc;