MySQL - Fetching users who sent the most recent messages

I have a messages table as follows:
I want to fetch the n users who most recently sent messages to a group. The result therefore has to be grouped by from_user_id and sorted by message id in descending order. I have the following query:
SELECT `users`.`id` AS `user_id`, `users`.`username`, `users`.`image`
FROM `group_messages`
JOIN `users` ON `users`.`id` = `group_messages`.`from_user_id`
WHERE `group_messages`.`to_group_id` = 31
GROUP BY `users`.`id`
ORDER BY `group_messages`.`id` DESC;
The problem is that when I group by users.id, the row with the smallest id is the one taken into account. Therefore the result does not come back in descending id order.
So is there a way to group by while taking the greatest id into account? Or should I approach it another way?
Thanks in advance.
Edit: I think I got it.
SELECT `x`.`id`, `users`.`id` AS `user_id`, `users`.`username`, `users`.`image`
FROM (SELECT * FROM `group_messages` ORDER BY `group_messages`.`id` DESC) `x`
JOIN `users` ON `users`.`id` = `x`.`from_user_id`
WHERE `x`.`to_group_id` = 31
GROUP BY `users`.`id`
ORDER BY `x`.`id` DESC;
Just had to make a select from an already ordered list.

Using ORDER BY inside a subquery is not a very efficient way to do this. Try this instead:
users table
| uid | name | ---------- |
messages table
|id | from_uid | to_group_id | ----------- |
SELECT u.* FROM users as u JOIN (
SELECT g1.*
FROM messages g1 LEFT JOIN messages g2
ON g1.from_uid = g2.from_uid AND g1.to_group_id = g2.to_group_id
AND g1.id<g2.id
WHERE g2.id is NULL) as lastmessage
ON lastmessage.from_uid = u.uid
WHERE lastmessage.to_group_id = 1
ORDER BY lastmessage.id DESC;
It's not good to have ORDER BY in subqueries because it makes the query slow, and in your case you were running it over the whole table.
See also: Retrieving the last record in each group.
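The self-join above can be tried out on a small data set. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows and the helper name latest_senders are hypothetical and only serve to illustrate the anti-join (a row is kept when no later message from the same user to the same group exists).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE messages (id INTEGER PRIMARY KEY, from_uid INTEGER, to_group_id INTEGER);
-- hypothetical sample rows, not taken from the question
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
INSERT INTO messages VALUES (1, 1, 31), (2, 2, 31), (3, 3, 99), (4, 1, 31), (5, 2, 99);
""")

def latest_senders(conn, group_id, n):
    """Return up to n users of a group, most recent sender first."""
    return conn.execute("""
        SELECT u.uid, u.name, g1.id AS last_message_id
        FROM messages g1
        LEFT JOIN messages g2
               ON g1.from_uid = g2.from_uid
              AND g1.to_group_id = g2.to_group_id
              AND g1.id < g2.id          -- g2 is a later message by the same user
        JOIN users u ON u.uid = g1.from_uid
        WHERE g2.id IS NULL              -- keep g1 only if no later message exists
          AND g1.to_group_id = ?
        ORDER BY g1.id DESC
        LIMIT ?
    """, (group_id, n)).fetchall()

result = latest_senders(conn, 31, 10)
print(result)
```

Each user appears once, paired with the id of their latest message in the group, so the LIMIT gives the n most recent distinct senders.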

Related

Avoid using a subquery in a table join

In a MySQL 5.7 database, I have the following User table:
Name | Id
----- | --
David | 1
Frank | 2
And the following Order table:
Id | Price | UserId
-- | ----- | ------
1 | 55 | 1
2 | 68 | 1
3 | 50 | 1
4 | 10 | 2
For every user, I want to select the price of the order with the biggest ID.
I can use the following query which adds additional complexity due to the nested subquery :
SELECT
User.Name,
last_user_order.Price
FROM User
LEFT JOIN (
SELECT Price, UserId FROM Order
ORDER BY Id DESC LIMIT 1
) AS last_user_order ON last_user_order.UserId = User.Id
There are many questions here where the column to be selected is the same as the one being ordered by. In that case it is possible to use MAX in the first SELECT statement to avoid a subquery. Is it possible to avoid a subquery in my case?
For every user, I want to select the price of the order with the biggest ID.
That looks like:
SELECT
u.*,
o.Price
FROM
User u
INNER JOIN `Order` o ON u.ID = o.UserID
INNER JOIN
(
-- Order is a reserved word in MySQL, so the table name is quoted
SELECT MAX(ID) as OrderID FROM `Order` GROUP BY UserId
) maxO ON o.Id = maxO.OrderId
SELECT User.Name,
( SELECT `Order`.Price
FROM `Order`
WHERE `Order`.UserId = User.Id
ORDER BY `Order`.Id DESC LIMIT 1 ) LastPrice
FROM User;
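Both approaches can be checked against the sample rows from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL 5.7 (SQLite accepts the same backtick quoting); the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE `Order` (Id INTEGER PRIMARY KEY, Price INTEGER, UserId INTEGER);
INSERT INTO User VALUES (1, 'David'), (2, 'Frank');
INSERT INTO `Order` VALUES (1, 55, 1), (2, 68, 1), (3, 50, 1), (4, 10, 2);
""")

# Approach 1: join each order to the per-user maximum order id.
join_sql = """
SELECT u.Name, o.Price
FROM User u
JOIN `Order` o ON o.UserId = u.Id
JOIN (SELECT MAX(Id) AS OrderId FROM `Order` GROUP BY UserId) max_o
  ON o.Id = max_o.OrderId
ORDER BY u.Id
"""

# Approach 2: correlated scalar subquery, evaluated per user.
correlated_sql = """
SELECT u.Name,
       (SELECT o.Price FROM `Order` o
        WHERE o.UserId = u.Id
        ORDER BY o.Id DESC LIMIT 1) AS LastPrice
FROM User u
ORDER BY u.Id
"""

join_rows = conn.execute(join_sql).fetchall()
correlated_rows = conn.execute(correlated_sql).fetchall()
print(join_rows, correlated_rows)
```

The two queries agree on this data. They differ for a user without any orders: the inner joins drop that user, while the correlated subquery keeps the user with a NULL price.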

Selecting a count of rows having a max value

Working example: http://sqlfiddle.com/#!9/80995/20
I have three tables, a user table, a user_group table, and a link table.
The link table contains the dates that users were added to user groups. I need a query that returns the count of users currently in each group. The most recent date determines the group that the user is currently in.
SELECT
user_groups.name,
COUNT(l.name) AS ct,
GROUP_CONCAT(l.`name` separator ", ") AS members
FROM user_groups
LEFT JOIN
(SELECT MAX(added), group_id, name FROM link LEFT JOIN users ON users.id = link.user_id GROUP BY user_id) l
ON l.group_id = user_groups.id
GROUP BY user_groups.id
My question is if the query I have written could be optimized, or written better.
Thanks!
Ben
Your actual query is not giving you the answer you want, at least as far as I understand your question. John actually joined Group 2 on 2017-01-05, yet he appears in Group 1 (which he joined on 2017-01-01) in your results. Note also that Group 4 is missing.
Using standard SQL, I think the following query is what you're looking for. The comments in the query should clarify what each part is doing:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT * FROM
(-- For each user, find most recent date s/he got into a group
SELECT
user_id AS the_user_id, MAX(added) AS last_added
FROM
link
GROUP BY
the_user_id
) AS u_a
-- Join back to the link table, so that the `group_id` can be retrieved
JOIN link l2 ON l2.user_id = u_a.the_user_id AND l2.added = u_a.last_added
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
This can be written in a more compact way in MySQL (abusing the fact that, in older versions of MySQL, it doesn't follow the SQL standard for the GROUP BY restrictions).
That's what you'll get:
group_name | member_count | members
:--------- | -----------: | :-------------
Group 1 | 2 | Mikie, Dominic
Group 2 | 2 | John, Paddy
Group 3 | 0 | null
Group 4 | 1 | Nellie
dbfiddle here
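The join-back approach can be reproduced locally. The sketch below assumes Python's built-in sqlite3 module instead of MySQL, so group_concat takes the separator as a second argument rather than with the SEPARATOR keyword. Only John's two dates come from the discussion above; every other row in link is hypothetical sample data chosen to give the same result table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE link (user_id INTEGER, group_id INTEGER, added TEXT);
INSERT INTO users VALUES (1,'John'),(2,'Paddy'),(3,'Mikie'),(4,'Dominic'),(5,'Nellie');
INSERT INTO user_groups VALUES (1,'Group 1'),(2,'Group 2'),(3,'Group 3'),(4,'Group 4');
-- hypothetical membership history, except for the two John rows
INSERT INTO link VALUES
  (1,1,'2017-01-01'),(1,2,'2017-01-05'),(2,2,'2017-01-02'),
  (3,1,'2017-01-03'),(4,1,'2017-01-04'),
  (5,3,'2017-01-01'),(5,4,'2017-01-06');
""")

sql = """
SELECT g.name, COUNT(u.name) AS member_count, group_concat(u.name, ', ') AS members
FROM user_groups g
LEFT JOIN (
    -- most recent link row per user: aggregate first, then join back for group_id
    SELECT l2.user_id, l2.group_id
    FROM (SELECT user_id, MAX(added) AS last_added FROM link GROUP BY user_id) latest
    JOIN link l2 ON l2.user_id = latest.user_id AND l2.added = latest.last_added
) cur ON cur.group_id = g.id
LEFT JOIN users u ON u.id = cur.user_id
GROUP BY g.id, g.name
ORDER BY g.name
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Note that the join-back matches on (user_id, added), so a user with two link rows sharing the same latest date would be counted in both groups.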
Note that this query can be simplified if you use a database with window functions (such as MariaDB 10.2). Then, you can use:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT DISTINCT
user_id AS the_user_id,
-- newest row first, so first_value picks the most recent group for every row of the user
first_value(group_id) OVER (PARTITION BY user_id ORDER BY added DESC) AS group_id
FROM
link
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
dbfiddle here
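One detail worth checking with the window-function approach is the default window frame. With an ORDER BY in the OVER clause and no explicit frame, the frame ends at the current row, so last_value returns each row's own value rather than the newest one in the partition. The sketch below illustrates this, assuming Python's built-in sqlite3 module (built against SQLite 3.25 or later, which added window functions) instead of MariaDB; the three link rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE link (user_id INTEGER, group_id INTEGER, added TEXT);
-- hypothetical rows: user 1 moves from group 1 to group 2
INSERT INTO link VALUES (1, 1, '2017-01-01'), (1, 2, '2017-01-05'), (2, 2, '2017-01-02');
""")

# last_value with the default frame: every row just sees itself as "last".
naive = conn.execute("""
SELECT user_id, added,
       last_value(group_id) OVER (PARTITION BY user_id ORDER BY added) AS group_id
FROM link
ORDER BY user_id, added
""").fetchall()

# Sorting newest first and taking first_value gives the current group on every row;
# DISTINCT then collapses the duplicates to one row per user.
current = conn.execute("""
SELECT DISTINCT user_id,
       first_value(group_id) OVER (PARTITION BY user_id ORDER BY added DESC) AS group_id
FROM link
ORDER BY user_id
""").fetchall()

print(naive)
print(current)
```

An explicit frame (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) would be another way to make last_value see the whole partition.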

How to optimize this MySQL query? (CROSS JOIN, subquery)

I have a challenging question for MySQL experts.
I have a users permissions system with 4 tables:
users (id | email | created_at)
permissions (id | responsibility_id | key | weight)
permission_user (id | permission_id | user_id)
responsibilities (id | key | weight)
Users can have any number of permissions assigned and any permission can be granted to any number of users (many to many). Responsibilities are like groups for permissions, each permission belongs to exactly one responsibility. For example, one permission is called update with responsibility of customers. Another one would be delete with orders responsibility.
I need to get a full map of permissions per user, but only for those who have at least one permission granted. Results should be ordered by:
User's number of permissions from most to least
User's created_at column, oldest first
Responsibility's weight
Permission's weight
Example result set:
user_id | responsibility | permission | granted
-----------------------------------------------
5 | customers | create | 1
5 | customers | update | 1
5 | orders | create | 1
5 | orders | update | 1
2 | customers | create | 0
2 | customers | delete | 0
2 | orders | create | 1
2 | orders | update | 0
Let's say I have 10 users in database, but only two of them have any permissions granted. There are 4 permissions in total:
create of customers responsibility
update of customers responsibility
create of orders responsibility
update of orders responsibility.
That's why we have 8 records in results (2 users with any permission × 4 permissions). User with id = 5 is displayed first, because he's got more permissions. If there were any draws, the ones with older created_at date would go first. Permissions are always sorted by the weight of their responsibility and then by their own weight.
My question is: how do I write an optimal query for this case? I have already written one myself and it works well:
SELECT `users`.`id` AS `user_id`,
`responsibilities`.`key` AS `responsibility`,
`permissions`.`key` AS `permission`,
!ISNULL(`permission_user`.`id`) AS `granted`
FROM `users`
CROSS JOIN `permissions`
JOIN `responsibilities`
ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
ON `permission_user`.`user_id` = `users`.`id`
AND `permission_user`.`permission_id` = `permissions`.`id`
WHERE (
SELECT COUNT(*)
FROM `permission_user`
WHERE `user_id` = `users`.`id`
) > 0
ORDER BY (
SELECT COUNT(*)
FROM `permission_user`
WHERE `user_id` = `users`.`id`
) DESC,
`users`.`created_at` ASC,
`responsibilities`.`weight` ASC,
`permissions`.`weight` ASC
The problem is that I'm using the same subquery twice.
Can I do better? I count on you, MySQL experts!
--- EDIT ---
Thanks to Gordon Linoff's comment I changed it to use a HAVING clause:
SELECT `users`.`email`,
`responsibilities`.`key`,
`permissions`.`key`,
!ISNULL(`permission_user`.`id`) as `granted`,
(
SELECT COUNT(*)
FROM `permission_user`
WHERE `user_id` = `users`.`id`
) AS `total_permissions`
FROM `users`
CROSS JOIN `permissions`
JOIN `responsibilities`
ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
ON `permission_user`.`user_id` = `users`.`id`
AND `permission_user`.`permission_id` = `permissions`.`id`
HAVING `total_permissions` > 0
ORDER BY `total_permissions` DESC,
`users`.`created_at` ASC,
`responsibilities`.`weight` ASC,
`permissions`.`weight` ASC
I was surprised to discover that HAVING can go alone without GROUP BY.
Can it now be improved for better performance?
Probably the most efficient way to do this is:
SELECT u.email, r.`key`, p.`key`,
!ISNULL(pu.id) as `granted`
FROM (SELECT u.*,
(SELECT COUNT(*) FROM `permission_user` pu WHERE pu.user_id = u.id
) AS `total_permissions`
FROM `users` u
) u CROSS JOIN
permissions p JOIN
responsibilities r
ON r.id = p.responsibility_id LEFT JOIN
permission_user pu
ON pu.user_id = u.id AND
pu.permission_id = p.id
WHERE u.total_permissions > 0
ORDER BY u.total_permissions DESC,
u.created_at ASC,
r.weight ASC,
p.weight ASC;
This will run the subquery once per user, rather than once per user/permission combination (as both the modified query and the original query were doing). This has two costs. The first is the materialization of the subquery, so the data in the users table has to be read and written again. Probably not a big deal, given everything else in the query. The second is the loss of indexes on the users table. Once again, with a cross join, indexes are (probably) not being used, so this is also minor.
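To see the shape of the result, the derived-table version can be run on a tiny data set. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL (so `pu.id IS NOT NULL` replaces `!ISNULL(pu.id)`); the e-mail addresses, dates and grants are hypothetical, arranged so that user 5 holds all four permissions, user 2 holds one, and user 7 holds none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT);
CREATE TABLE responsibilities (id INTEGER PRIMARY KEY, `key` TEXT, weight INTEGER);
CREATE TABLE permissions (id INTEGER PRIMARY KEY, responsibility_id INTEGER, `key` TEXT, weight INTEGER);
CREATE TABLE permission_user (id INTEGER PRIMARY KEY, permission_id INTEGER, user_id INTEGER);
-- hypothetical sample data
INSERT INTO users VALUES (2,'b@example.com','2020-01-01'),(5,'e@example.com','2020-02-01'),(7,'g@example.com','2019-01-01');
INSERT INTO responsibilities VALUES (1,'customers',1),(2,'orders',2);
INSERT INTO permissions VALUES (1,1,'create',1),(2,1,'update',2),(3,2,'create',1),(4,2,'update',2);
INSERT INTO permission_user VALUES (1,1,5),(2,2,5),(3,3,5),(4,4,5),(5,3,2);
""")

sql = """
SELECT u.id, r.`key`, p.`key`, pu.id IS NOT NULL AS granted
FROM (
    -- the count is computed once per user here, not once per user/permission pair
    SELECT users.*,
           (SELECT COUNT(*) FROM permission_user cnt WHERE cnt.user_id = users.id) AS total_permissions
    FROM users
) u
CROSS JOIN permissions p
JOIN responsibilities r ON r.id = p.responsibility_id
LEFT JOIN permission_user pu ON pu.user_id = u.id AND pu.permission_id = p.id
WHERE u.total_permissions > 0
ORDER BY u.total_permissions DESC, u.created_at ASC, r.weight ASC, p.weight ASC
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

User 7 has no grants and is filtered out, so the output has 2 users × 4 permissions = 8 rows, with user 5 listed first.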

Duplicate content in LEFT JOIN query + get count on joined table

I am trying to join two tables on matching ids, and then get a sum of two fields as well. Let me explain:
test table: id | post | desc | Date
likes_dislikes table: id | song_id | user_ip | like | dislike
The 'id' in the test table matches the 'song_id' in the likes_dislikes table, so I tried a LEFT JOIN, since not every post will have a row in the likes_dislikes table, but I got duplicate results.
SELECT *
FROM
test
LEFT JOIN likes_dislikes ON test.song_id = likes_dislikes.page_id
GROUP BY test.song_id
ORDER BY test.id DESC LIMIT $start, $limit
How can I prevent the duplicate content, and also get the total likes/dislikes associated with each post as I run through a while loop?
I assume you are looking for something like this:
SELECT
T.`id`,
T.`post`,
T.`desc`,
T.`Date`,
COUNT(L.`like`) as `LikeCount`,
COUNT(L.`dislike`) as `DislikeCount`
FROM `test` T
LEFT JOIN `likes_dislikes` L
ON T.`Id` = L.`song_id`
GROUP BY T.`Id`, T.`post`, T.`desc`, T.`Date`
ORDER BY T.`id` DESC;
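Whether COUNT or SUM is the right aggregate depends on how the like and dislike columns are stored, which the question does not say. The sketch below assumes they are 0/1 flags (an assumption, not something stated above) and uses Python's built-in sqlite3 module in place of MySQL with hypothetical rows. COUNT counts every non-NULL value, including zeros, while SUM adds up the flags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (id INTEGER PRIMARY KEY, post TEXT);
CREATE TABLE likes_dislikes (id INTEGER PRIMARY KEY, song_id INTEGER, user_ip TEXT,
                             `like` INTEGER, dislike INTEGER);
-- hypothetical data: post 1 has one like and one dislike, post 2 has no votes
INSERT INTO test VALUES (1, 'first'), (2, 'second');
INSERT INTO likes_dislikes VALUES (1, 1, '10.0.0.1', 1, 0), (2, 1, '10.0.0.2', 0, 1);
""")

rows = conn.execute("""
SELECT t.id, t.post,
       COUNT(l.`like`)              AS count_like,   -- counts rows, zeros included
       COALESCE(SUM(l.`like`), 0)   AS sum_like,     -- adds up the 0/1 flags
       COALESCE(SUM(l.dislike), 0)  AS sum_dislike
FROM test t
LEFT JOIN likes_dislikes l ON l.song_id = t.id
GROUP BY t.id, t.post
ORDER BY t.id DESC
""").fetchall()
print(rows)
```

If the columns instead hold NULL for "no vote", COUNT already gives the totals. Either way, grouping by the post columns returns one row per post, which removes the duplicates.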

Group messages by latest response in conversation threading

I need a simple internal messaging system between users.
My tables:
+--------------+ +---------------------+
| messages | | users |
+----+---------+ +---------------------+
| id | message | | id | username | ...
+----+---------+ +---------------------+
+------------------------------------------------------------------------------+
| users_messages |
+------------------------------------------------------------------------------+
| id | from_usr_id | to_usr_id | msg_id | thread_id | read | sent_at | read_at |
+------------------------------------------------------------------------------+
INT 'thread_id' represents the conversation thread; it's used to group messages.
BOOLEAN 'read' represents if the user opened/viewed the message or not.
I want to group messages by 'thread_id', sorted by 'sent_at' so I can show the user his latest messages by thread. I want also to count the messages in each thread.
I want to get something like this for a specific user id:
+----------------------------------------------------------------------------
| last_messages_by_conversation
+----------------------------------------------------------------------------
| message | from_username | sent_at | count_thread_msgs | count_unread_msg |
+----------------------------------------------------------------------------
TEXT 'message' is the latest message in the specific 'thread_id'
VARCHAR 'from_username' and DATETIME 'sent_at' are related to the latest message.
INT 'count_thread_msgs' and INT 'count_unread_msg' are related to the thread, representing the total number of messages and the number of unread messages in the thread.
Each row represents a thread/conversation (group by 'thread_id'), showing the last message (sorted by 'sent_at') for that specific thread.
You are looking for the groupwise maximum, which can be found by first grouping the users_messages table by thread_id and selecting MAX(sent_at), then joining the result back onto the users_messages table to find the other fields of that maximum record.
I find that NATURAL JOIN is a very handy shortcut here:
SELECT messages.message,
users.username AS from_username,
t.sent_at,
t.count_thread_msgs,
t.count_unread_msg
FROM users_messages NATURAL JOIN (
SELECT thread_id,
to_usr_id,
MAX(sent_at) AS sent_at,
COUNT(*) AS count_thread_msgs,
SUM(NOT `read`) AS count_unread_msg
FROM users_messages
WHERE to_usr_id = ?
GROUP BY thread_id
) t JOIN messages ON messages.id = users_messages.msg_id
JOIN users ON users.id = users_messages.from_usr_id
SELECT
users.id,
users.username,
user_messages.thread_id,
user_messages.unread ,
messages.message
FROM users
LEFT JOIN (SELECT
from_usr_id ,
msg_id,
count(thread_id) as thread_id,
count(read_at) as unread
FROM users_messages) as user_messages on user_messages.from_usr_id = users.id
LEFT JOIN messages on messages.id = user_messages.msg_id
You can try this solution:
SELECT c.message,
d.username AS from_username,
b.sent_at,
a.count_thread_msgs,
a.count_unread_msg
FROM (
SELECT MAX(id) AS maxid,
COUNT(*) AS count_thread_msgs,
COUNT(CASE WHEN `read` = 0 AND <uid> = to_usr_id THEN 1 END) AS count_unread_msg
FROM users_messages
WHERE <uid> IN (from_usr_id, to_usr_id)
GROUP BY thread_id
) a
JOIN users_messages b ON a.maxid = b.id
JOIN messages c ON b.msg_id = c.id
JOIN users d ON b.from_usr_id = d.id
ORDER BY b.sent_at DESC
This gets the latest message in each thread that the user <uid> started or is a part of.
The latest message is based on the highest id of each thread_id.
This solution makes the following assumptions:
The id in users_messages is a unique auto-incrementing int for each new row.
Each thread contains correspondence between never more than two users.
If the thread can contain more than two users, then the query will need to be slightly adjusted so as to derive an accurate count aggregation.
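The MAX(id)-per-thread query with its conditional count can be exercised on a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL, a named parameter :uid in place of the <uid> placeholder, and hypothetical users and messages; it also relies on the assumptions listed above (auto-incrementing id, two users per thread).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT);
CREATE TABLE users_messages (id INTEGER PRIMARY KEY, from_usr_id INTEGER, to_usr_id INTEGER,
                             msg_id INTEGER, thread_id INTEGER, `read` INTEGER, sent_at TEXT);
-- hypothetical sample data
INSERT INTO users VALUES (1,'ann'),(2,'bob'),(3,'cid');
INSERT INTO messages VALUES (1,'hi'),(2,'hello back'),(3,'ping'),(4,'other');
INSERT INTO users_messages VALUES
  (1, 1, 2, 1, 10, 1, '2020-01-01 10:00'),
  (2, 2, 1, 2, 10, 0, '2020-01-01 11:00'),
  (3, 3, 1, 3, 20, 1, '2020-01-02 09:00'),
  (4, 2, 3, 4, 30, 0, '2020-01-03 09:00');
""")

sql = """
SELECT c.message, d.username AS from_username, b.sent_at,
       a.count_thread_msgs, a.count_unread_msg
FROM (
    SELECT MAX(id) AS maxid,
           COUNT(*) AS count_thread_msgs,
           -- CASE yields NULL for non-matching rows, and COUNT skips NULLs
           COUNT(CASE WHEN `read` = 0 AND to_usr_id = :uid THEN 1 END) AS count_unread_msg
    FROM users_messages
    WHERE :uid IN (from_usr_id, to_usr_id)
    GROUP BY thread_id
) a
JOIN users_messages b ON a.maxid = b.id
JOIN messages c ON b.msg_id = c.id
JOIN users d ON b.from_usr_id = d.id
ORDER BY b.sent_at DESC
"""
threads = conn.execute(sql, {"uid": 1}).fetchall()
for row in threads:
    print(row)
```

For user 1 this returns one row per thread the user takes part in, newest thread first; thread 30 is skipped because user 1 is neither sender nor recipient there.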
Try this and let me know; replace $$ with your user ID:
select u.username,msg.message,m.sent_at,
(select count(*) from users_messages where to_usr_id= $$) as count_thread_msgs,
(select count(*) from users_messages where `read`=0 and to_usr_id=$$) as count_unread_msg
from users as u join users_messages as m
on u.id=m.from_usr_id
join messages as msg on msg.id=m.msg_id
where u.id=$$
group by u.id;
Try this query -
SELECT
m.message,
u.username from_username,
um1.sent_at,
um2.count_thread_msgs,
um2.count_unread_msg
FROM users_messages um1
JOIN (
SELECT
thread_id,
MAX(sent_at) sent_at,
COUNT(*) count_thread_msgs,
COUNT(IF(`read` = 0, 1, NULL)) count_unread_msg
FROM users_messages GROUP BY thread_id) um2
ON um1.thread_id = um2.thread_id AND um1.sent_at = um2.sent_at
JOIN messages m
ON m.id = um1.msg_id
JOIN users u
ON u.id = um1.from_usr_id
-- WHERE u.id = 100 -- specify user id here
Answers to your questions:
About the last datetime: I have changed the query a little, so just try the new one.
About specific users: add a WHERE condition to filter users, e.g. ...WHERE u.id = 100.
About many records: this happens because you join other tables (messages and users), and there can be more than one record with the same thread_id. To avoid this, group the result set by the thread_id field and use an aggregate function to get a single result, e.g. the GROUP_CONCAT function.