Selecting most recent as part of group by (or other solution ...) - mysql

I've got a table where the columns that matter look like this:
username
source
description
My goal is to get the 10 most recent records where a user/source combination is unique. From the following data:
1 katie facebook loved it!
2 katie facebook it could have been better.
3 tom twitter less then 140
4 katie twitter Wowzers!
The query should return records 2,3 and 4 (assume higher IDs are more recent - the actual table uses a timestamp column).
My current solution 'works' but requires 1 select to generate the 10 records, then 1 select to get the proper description per row (so 11 selects to generate 10 records) ... I have to imagine there's a better way to go. That solution is:
SELECT max(id) as MAX_ID, username, source, topic
FROM events
GROUP BY source, username
ORDER BY MAX_ID desc;
It returns the proper ids, but the wrong descriptions so I can then select the proper descriptions by the record ID.

Untested, but you should be able to handle this with a join:
SELECT
fullEvent.id,
fullEvent.username,
fullEvent.source,
fullEvent.topic
FROM
events fullEvent JOIN
(
SELECT max(id) as MAX_ID, username, source
FROM events
GROUP BY source, username
) maxEvent ON maxEvent.MAX_ID = fullEvent.id
ORDER BY fullEvent.id desc;

Related

Efficiently get latest appointment for every person sorted by oldest first

I already asked this question earlier but forgot a few (important) details or got them wrong.
My table in MySQL 8.0.29 looks like this
UserID
Appointment
Description
Bob
2022-06-01
Cleaning
Bob
2022-06-03
Toothache
John
2022-06-02
Braces
I'm trying to get the latest appointment for every person sorted by oldest first.
The query should return
UserID
Appointment
Description
John
2022-06-02
Braces
Bob
2022-06-03
Toothache
Using one of the previous answers I get
SELECT Name, Appointment, Description
FROM (
SELECT Name, Appointment, Description, ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Appointment DESC) rn) t1
WHERE rn = 1
The problem is the database currently has 3 million rows and it'll continue to grow so this query ends up being pretty slow.
My plan is to consume the data in chunks so I'd prefer the query having "pagination". Something like a LIMIT 0, 5000 to get 5000 records at a time.
I'm open to even re-architecting the database if it comes to that.
For now i've resorted to creating a new table that just keeps the latest appointment for each user.
You are halfway there. Use that query as a 'derived table' instead of making it permanent:
SELECT b.*
FROM ( SELECT user_id, MAX(appointment) AS last_date)
FROM tbl
GROUP BY user_id ) AS x
JOIN tbl AS b ON b.user_id = x.user_id
AND b.appointment = x.last_date
And be sure to have INDEX(user_id, appointment)
I would be interested to see if this and the "OVER" approach both give the same results and which is faster.

Looking for a low footprint solution to GROUP rows using HAVING to filter

Here is a table
id date name
1 180101 josh
2 180101 peter
3 180101 julia
4 180102 robert
5 180103 patrick
6 180104 josh
7 180104 adam
I need to get all the names whom having the same days as 'josh'. how can i achieve it without groupping the whole table together. i need to keep it efficient (this is not my real table, i just simplified my problem here, and i have hundred thousands of records, and 99% of the rows have different dates, so groupable rows by date is kind of rare).
So basicaly what i want is: if 'josh' is the target, i need to get 'josh,peter,julia,adam' (actually the first 10 distinct names sharing the same date with josh).
SELECT
COUNT(date) as datecount,
GROUP_CONCAT(DISTINCT name) as names,
FROM
table
GROUP BY
date
HAVING
datecount>1
// && name IN ('josh') would work nice for me, but im getting error because 'name' is not in GROUPED BY
LIMIT 10
Any idea ? As i mentioned it needs to be fast, and most of the rows have unique dates
Join the table with itself on date:
select distinct t1.name
from tbl t1
join tbl t2 using (date)
where t2.name = 'josh'
Demo
For the best performance you would have indexes on (name) and (date, name).

MySQL: How to get TOP visited product for each user in a table?

I have a system with products. Everytime a user enters a product, I insert a record into my database.
I have a table with users and id_products, like this:
users id_product
____________________________
jondoe 2
george 9
jondoe 5
jondoe 2
george 9
george 9
george 2
I need a result (query) wich shows what is TOP visited product id for each user, so the result would be something like this:
jondoes most visited product is ID 2
georges most visitedproduct is ID 9
I was looking for the answer but I am not able to figure it out. Thanks a lot for your help, I appreciate it a lot.
Jan
This is a pain because it involves aggregation. One way to solve this uses a very complicated query. Another uses variables. A third method uses an aggregation trick that works under many circumstances:
select user,
substring_index(group_concat(id_product order by cnt desc), ',', 1) as mostCommonProduct
from (select user, id_product, count(*) as cnt
from t
group by user, id_product
) t
group by user;
One danger when using this method is that the intermediate result might be too long. You can set the group_concat_max_len system variable to get around that particular problem.

ORDER BY and GROUP BY those results in a single query

I am trying to query a dataset from a single table, which contains quiz answers/entries from multiple users. I want to pull out the highest scoring entry from each individual user.
My data looks like the following:
ID TP_ID quiz_id name num_questions correct incorrect percent created_at
1 10154312970149546 1 Joe 3 2 1 67 2015-09-20 22:47:10
2 10154312970149546 1 Joe 3 3 0 100 2015-09-21 20:15:20
3 125564674465289 1 Test User 3 1 2 33 2015-09-23 08:07:18
4 10153627558393996 1 Bob 3 3 0 100 2015-09-23 11:27:02
My query looks like the following:
SELECT * FROM `entries`
WHERE `TP_ID` IN('10153627558393996', '10154312970149546')
GROUP BY `TP_ID`
ORDER BY `correct` DESC
In my mind, what that should do is get the two users from the IN clause, order them by the number of correct answers and then group them together, so I should be left with the 2 highest scores from those two users.
In reality it's giving me two results, but the one from Joe gives me the lower of the two values (2), with Bob first with a score of 3. Swapping to ASC ordering keeps the scores the same but places Joe first.
So, how could I achieve what I need?
You're after the groupwise maximum, which can be obtained by joining the grouped results back to the table:
SELECT * FROM entries NATURAL JOIN (
SELECT TP_ID, MAX(correct) correct
FROM entries
WHERE TP_ID IN ('10153627558393996', '10154312970149546')
GROUP BY TP_ID
) t
Of course, if a user has multiple records with the maximal score, it will return all of them; should you only want some subset, you'll need to express the logic for determining which.
MySql is quite lax when it comes to group-by-clauses - but as a rule of thumb you should try to follow the rule that other DBMSs enforce:
In a group-by-query each column should either be part of the group-by-clause or contain a column-function.
For your query I would suggest:
SELECT `TP_ID`,`name`,max(`correct`) FROM `entries`
WHERE `TP_ID` IN('10153627558393996', '10154312970149546')
GROUP BY `TP_ID`,`name`
Since your table seems quite denormalized the group by name-par could be omitted, but it might be necessary in other cases.
ORDER BY is only used to specify in which order the results are returned but does nothing about what results are returned - so you need to apply the max()-function to get the highest number of right answers.

mysql query that a user has a row for every experience id

I have a SQL problem. I have a table where a user gets a row for every experience they complete. The schema looks similar to this fiddle: http://sqlfiddle.com/#!2/5d6a87/4
I am trying to write a query that lists every user that has expid 1-5. So in my example it would list userids: 1,2, and 4. Since userid 3 does not have 5 rows, one for each experience that user shouldn't be listed.
What you would like to use is "group by"
the query would look like it:
select userid
from some_table
group by userid
having count(*)>=5
you can also be creative and force 5 different expeids by
select userid
from some_table
group by userid
having count(distinct expid)>=5
light reading about group by:
http://www.w3schools.com/sql/sql_groupby.asp
Good luck!
Try this:
SELECT id,userid,expid FROM
some_table
GROUP BY userid
HAVING count(*)>= 5
Result:
ID USERID EXPID
1 1 1
6 2 1
13 4 1
See result in SQL Fiddle.
You just need to add the having clause. Read more here.
Here
SELECT * FROM some_table GROUP BY UserID having count(*)>= 5;
They way I see it should be like this but your question is not 100% clear, since you said 5 rows for every ExpID.
SELECT * FROM some_table GROUP BY UserID, ExpID having count(*)>= 5;