MySQL GROUP BY / ORDER BY issue with flat messages table / threads - mysql

Ok, I'm trying to base something similar off of this, but not quite getting it nailed: GROUP BY and ORDER BY
Basically, wanting a query to find the latest messages in each 'thread' between the current logged-in user and any other users, but via a flat (non-'threaded') table of messages:
messages {
id,
from_uid,
to_uid,
message_text,
time_added
}
Assuming the current user's uid is '1', and the latest message in each 'thread' could either be from that user, or to that user (the other party always denoted by thread_recipient):
SELECT a.*,thread_recipient
FROM messages a
JOIN (SELECT IF(from_uid = '1',to_uid,from_uid) AS thread_recipient,
MAX(time_added) AS recency
FROM messages
WHERE (from_uid = '1' OR to_uid = '1')
GROUP BY thread_recipient) b ON thread_recipient = (IF(a.from_uid = '1',a.to_uid,a.from_uid))
AND b.recency = a.time_added
ORDER BY a.time_added DESC
But I fear this ain't gonna work right, and maybe messages sent at the same time might end up being returned for the wrong user?
Is my WHERE condition misplaced?
Any wisdom much appreciated.

Here's an idea: take the UNION of the following two queries, then get
the maximum dates from the result.
SELECT id,to_uid AS other_party,time_added FROM messages WHERE from_uid = '1'
SELECT id,from_uid AS other_party,time_added FROM messages WHERE to_uid = '1'
When you do the following:
SELECT MAX(time_added),other_party
FROM (SELECT id,to_uid AS other_party,time_added FROM messages WHERE from_uid = '1'
UNION
SELECT id,from_uid AS other_party,time_added FROM messages WHERE to_uid = '1'
) MyMessages
GROUP BY other_party
You will get the most recent time associated with a message sent to
each person that user '1' is corresponding with. Then you can join the
results of that to the original messages table to get what you want:
SELECT messages.*
FROM (SELECT MAX(time_added) AS MaxTime,other_party
FROM (SELECT id,to_uid AS other_party,time_added FROM messages WHERE from_uid = '1'
UNION
SELECT id,from_uid AS other_party,time_added FROM messages WHERE to_uid = '1'
) MyMessages
GROUP BY other_party
) Latest
JOIN messages
ON (messages.time_added = Latest.MaxTime AND
((messages.to_uid = Latest.other_party AND messages.from_uid = '1') OR
(messages.from_uid = Latest.other_party AND messages.to_uid = '1'))
)
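To check the worry about two messages sharing a timestamp, here is a minimal sketch of the UNION / MAX / JOIN approach run against a throwaway in-memory database. It assumes Python's sqlite3 module as a stand-in for MySQL; the sample rows and the helper names `latest_per_thread` and `demo_connection` are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are made up.
def latest_per_thread(conn, uid):
    """Return (other_party, id, message_text) for the newest message in each thread."""
    sql = """
        SELECT latest.other_party, m.id, m.message_text
        FROM (SELECT other_party, MAX(time_added) AS max_time
              FROM (SELECT to_uid AS other_party, time_added
                    FROM messages WHERE from_uid = :uid
                    UNION
                    SELECT from_uid AS other_party, time_added
                    FROM messages WHERE to_uid = :uid) my_messages
              GROUP BY other_party) latest
        JOIN messages m
          ON m.time_added = latest.max_time
         AND ((m.from_uid = :uid AND m.to_uid = latest.other_party)
           OR (m.to_uid = :uid AND m.from_uid = latest.other_party))
        ORDER BY m.time_added DESC
    """
    return conn.execute(sql, {"uid": uid}).fetchall()

def demo_connection():
    """Build an in-memory messages table with a deliberate timestamp tie."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, from_uid INTEGER, "
                 "to_uid INTEGER, message_text TEXT, time_added INTEGER)")
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 2, "hi 2", 100),
        (2, 2, 1, "hi 1", 110),
        (3, 1, 3, "hi 3", 110),       # same timestamp as id 2, different thread
        (4, 3, 2, "unrelated", 120),  # does not involve user 1
    ])
    return conn
```

Because the join matches on the other party as well as on the time, messages 2 and 3 each stay in their own thread even though they share a timestamp. Two messages with the same latest time inside one thread would still both come back.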

Try this - I think it gives you what you are looking for more simply and efficiently.
SELECT m.*,
IF(m.from_uid = '1', m.to_uid, m.from_uid) AS thread_recipient
FROM messages m
-- look for a newer message in the same thread
LEFT JOIN messages newer
ON (newer.from_uid = '1' OR newer.to_uid = '1')
AND IF(newer.from_uid = '1', newer.to_uid, newer.from_uid) = IF(m.from_uid = '1', m.to_uid, m.from_uid)
AND newer.time_added > m.time_added
-- find the case where the newer message doesn't exist
-- so you know there is nothing after m
WHERE (m.from_uid = '1' OR m.to_uid = '1')
AND newer.id IS NULL
ORDER BY m.time_added DESC

OK, after the comments: if id is auto-increment, use it instead of the message time. But you already have the proper condition to ensure that messages will not be matched to the wrong person.
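A rough sketch of the id-based variant, assuming id is auto-increment so that a higher id always means a newer message. It uses Python's sqlite3 as a stand-in for MySQL (hence CASE instead of IF), and the rows and the helper name `latest_ids_per_thread` are made up for the example.

```python
import sqlite3

# Assumption: ids grow with insertion order, so MAX(id) picks the newest message.
def latest_ids_per_thread(conn, uid):
    """Return (other_party, newest message id) pairs, newest thread first."""
    sql = """
        SELECT CASE WHEN from_uid = :uid THEN to_uid ELSE from_uid END AS other_party,
               MAX(id) AS latest_id
        FROM messages
        WHERE from_uid = :uid OR to_uid = :uid
        GROUP BY CASE WHEN from_uid = :uid THEN to_uid ELSE from_uid END
        ORDER BY latest_id DESC
    """
    return conn.execute(sql, {"uid": uid}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, from_uid INTEGER, "
             "to_uid INTEGER, time_added INTEGER)")
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", [
    (1, 1, 3, 90),
    (2, 1, 2, 100),
    (3, 2, 1, 100),  # same time_added as id 2, same thread
])
thread_heads = latest_ids_per_thread(conn, 1)
```

Since ids are unique, each thread yields exactly one row even when two messages in it share a timestamp; joining back to messages on id then returns one message per thread.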

Related

Improving the performance of sql joined count query

In my application the users can create campaigns for sending messages. When the campaign tries to send a message, one of the three things can happen:
The message is suppressed and not let through
The message can't reach the recipient and is considered failed
The message is successfully delivered
To keep track of this, I have the following table:
My problem is that when the application has processed a lot of messages (more than 10 million), the query I use for showing campaign statistics for the user slows down by a considerable margin (~ 15 seconds), even when there are only a few (~ 10) campaigns being displayed for the user.
Here is the query I'm using:
select `campaigns`.*, (select count(*) from `processed_messages`
where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'sent') as `messages_sent`,
(select count(*) from `processed_messages` where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'failed') as `messages_failed`,
(select count(*) from `processed_messages` where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'supressed') as `messages_supressed`
from `campaigns` where `user_id` = 1 and `campaigns`.`deleted_at` is null order by `updated_at` desc;
So my question is: how can I make this query run faster? I believe there should be some way of not having to use sub-queries multiple times but I am not very experienced with MySQL syntax yet.
You should write this as a single join, using conditional aggregation:
SELECT
c.*,
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
ORDER BY
c.updated_at DESC;
It should be noted that at first glance, doing SELECT c.* appears to be a violation of the GROUP BY rules which say that only columns which appear in the GROUP BY clause can be selected. However, assuming that campaigns.id is the primary key column, then there is nothing wrong with selecting all columns from this table, provided that we aggregate by the primary key.
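As a quick sanity check of the conditional aggregation idea, here is a small sketch using Python's sqlite3 as a stand-in for MySQL. The table layout is reduced to the columns the counts need, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE processed_messages (
        id INTEGER PRIMARY KEY, campaign_id INTEGER, status TEXT);
    INSERT INTO campaigns VALUES (1, 1), (2, 1);
    INSERT INTO processed_messages (campaign_id, status) VALUES
        (1, 'sent'), (1, 'sent'), (1, 'failed'), (1, 'suppressed');
""")

# One pass over the joined rows; each CASE yields NULL for non-matching
# statuses, and COUNT ignores NULLs.
counts = conn.execute("""
    SELECT c.id,
           COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
           COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
           COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
    FROM campaigns c
    LEFT JOIN processed_messages pm ON pm.campaign_id = c.id
    WHERE c.user_id = 1
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()
```

Campaign 2 has no processed messages, and the LEFT JOIN still returns it with all three counts at zero, which an inner join would not.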
Edit:
If the above answer does not run on your MySQL server version, with an error message complaining about only full group by, then use this version:
SELECT c1.*, c2.messages_sent, c2.messages_failed, c2.messages_suppressed
FROM campaigns c1
INNER JOIN
(
SELECT
c.id,
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
) c2
ON c1.id = c2.id
ORDER BY
c1.updated_at DESC;

MySQL join with two potential field names

For a website I'm working on I was tasked with implementing a private messaging system.
My basic scenario here is, there are several message entries in the database, each containing a sender and a recipient. However, the "current user" should be able to see both those messages, as they are both relevant to him.
The problem is, I am only interested in the data of the other user, not my own. But the "current user" can be both the sender or the recipient.
My query down here gets this job done, but it is hardly elegant. I am joining both users, then deciding using an IF which data I should get.
SELECT
IF(m.sender = ?, 1, 0) AS isself,
IF(m.sender = ?, u_recipient.id, u_sender.id) AS other_id,
IF(m.sender = ?, u_recipient.displayName, u_sender.displayName) AS other_name,
IF(m.sender = ?, u_recipient_avatar.url, u_sender_avatar.url) AS other_avatar,
m.text AS text
FROM messages AS m
LEFT JOIN user AS u_sender
ON u_sender.id = m.sender
LEFT JOIN avatars AS u_sender_avatar
ON u_sender_avatar.id = u_sender.avatarId
LEFT JOIN user AS u_recipient
ON u_recipient.id = m.recipient
LEFT JOIN avatars AS u_recipient_avatar
ON u_recipient_avatar.id = u_recipient.avatarId
WHERE ( m.sender = ? OR m.recipient = ? )
AND UNIX_TIMESTAMP(m.timestamp) > ?
ORDER BY m.timestamp ASC
LIMIT 100
So basically, my question is: is there a more elegant way of doing this, such as storing the sender/recipient int in a single derived table to be reused in the join? Otherwise, is this a performance hog (joining tables I don't need)? Or should I just take care of separating these in the application itself?
Thanks in advance!
I am not allowed to edit the other answer, which was partially correct, so I am posting my own.
My answer here is based on Ben's, but with the syntax errors removed.
SELECT d.isself, other_id, u.displayName AS other_name, a.url AS other_avatar, text
FROM
(
SELECT 0 AS isself, m.sender AS other_id, m.timestamp, m.text
FROM messages AS m
WHERE m.recipient = ?
UNION
SELECT 1 AS isself, m.recipient AS other_id, m.timestamp, m.text
FROM messages AS m
WHERE m.sender = ?
) AS d
LEFT JOIN user AS u
ON u.id = d.other_id
LEFT JOIN avatars AS a
ON a.id = u.avatarId
WHERE UNIX_TIMESTAMP(d.timestamp) > ?
ORDER BY d.timestamp ASC
LIMIT 100
How about something like:
SELECT isself, other_id, u.displayName AS other_name, a.url AS other_avatar, text
FROM
(
SELECT 0 AS isself, m.sender AS other_id, m.timestamp, m.text
FROM messages AS m
WHERE m.recipient = ?
UNION
SELECT 1 AS isself, m.recipient AS other_id, m.timestamp, m.text
FROM messages AS m
WHERE m.sender = ?
) AS d
LEFT JOIN user AS u
ON u.id = d.other_id
LEFT JOIN avatars AS a
ON a.id = u.avatarId
WHERE UNIX_TIMESTAMP(timestamp) > ?
LIMIT 100
If the message's sender and recipient IDs will always be found in the user table, the "LEFT JOIN user" should be changed to an inner join--"JOIN user". If each user has an avatar entry, then that left join should also be changed to an inner join.
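For what it's worth, here is a minimal sketch of the derived-table idea with the inner join suggested above, run through Python's sqlite3 as a stand-in for MySQL. Two things are my own substitutions: it uses UNION ALL rather than UNION so that two identical messages are not collapsed into one row, and it leaves out the avatars join and the timestamp filter to stay short. The rows and the helper name `conversation_rows` are invented.

```python
import sqlite3

def conversation_rows(conn, me):
    """Return (isself, other_id, other_name, text) rows, oldest message first."""
    sql = """
        SELECT d.isself, d.other_id, u.displayName AS other_name, d.text
        FROM (SELECT 0 AS isself, sender AS other_id, timestamp, text
              FROM messages WHERE recipient = :me
              UNION ALL
              SELECT 1 AS isself, recipient AS other_id, timestamp, text
              FROM messages WHERE sender = :me) AS d
        JOIN user AS u ON u.id = d.other_id
        ORDER BY d.timestamp ASC
        LIMIT 100
    """
    return conn.execute(sql, {"me": me}).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, displayName TEXT);
    CREATE TABLE messages (sender INTEGER, recipient INTEGER,
                           timestamp INTEGER, text TEXT);
    INSERT INTO user VALUES (1, 'Me'), (2, 'Bob');
    INSERT INTO messages VALUES (1, 2, 10, 'hello'), (2, 1, 20, 'hey');
""")
rows = conversation_rows(conn, 1)
```

Each row carries only the other user's data, so the user table is joined once instead of twice.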

MySQL nested ANDs and several conditions

I have two tables for a multiple choice questionnaire (each user answers a series of questions):
users (userID, name, email)
votes (voteID, userID, questionID, answerID)
Sample data (users):
0, Some Name, some@thing.com
1, Other Name, some@one.com
Sample data (votes):
0, 1, 1, 1
1, 1, 2, 2
2, 1, 3, 2
I would like to select all users who have the correct answers.
I tried this (where I've hardcoded the answers in):
$sql = "SELECT users.userID, users.name, users.email FROM users
INNER JOIN votes ON (users.userID = votes.userID)
WHERE (votes.questionID = '1' AND votes.answerID = '1')
AND (votes.questionID = '2' AND votes.answerID = '2')
AND (votes.questionID = '3' AND votes.answerID = '2')
AND (votes.questionID = '4' AND votes.answerID = '3')
AND (votes.questionID = '5' AND votes.answerID = '1')
GROUP BY users.userID";
But this doesn't return anything.
I've also tried something like this (where I've also hardcoded the answers in):
$sql = "SELECT users.userID, users.name, users.email FROM users
INNER JOIN transfertipsvotes ON (users.userID = transfertipsvotes.userID)
WHERE (transfertipsvotes.questionID = '1' AND transfertipsvotes.answerID = '1') GROUP BY users.userID
UNION
SELECT users.userID, users.name, users.email FROM users
INNER JOIN transfertipsvotes ON (users.userID = transfertipsvotes.userID)
WHERE (transfertipsvotes.questionID = '2' AND transfertipsvotes.answerID = '2') GROUP BY users.userID
UNION
SELECT users.userID, users.name, users.email FROM users
INNER JOIN transfertipsvotes ON (users.userID = transfertipsvotes.userID)
WHERE (transfertipsvotes.questionID = '3' AND transfertipsvotes.answerID = '2') GROUP BY users.userID";
But this just returns all users with one correct answer.
How do I make the correct query to select all users with the correct answers?
As far as I can see it now you need to use an INNER JOIN for each question, so each inner join will look like this:
INNER JOIN votes AS q1 ON (users.userID = q1.userID) AND q1.questionID = '1' AND q1.answerID ='1'
Repeat this for each question and you can check it.
If I understood you correctly, you want the users who have answered all questions correctly. In that case you should use INTERSECT instead of UNION (note that MySQL only supports INTERSECT from version 8.0.31 onward).
But with that approach you hit the table once per question. It is better to use OR in the WHERE clause, "((question1 AND answer1) OR (question2 AND answer2))", then group by userID, count the correct answers, and keep only the users whose count equals the number of questions.
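The OR plus GROUP BY idea could look roughly like this. It is only a sketch, using Python's sqlite3 as a stand-in for MySQL; the answer key, the sample votes and the helper name `users_with_all_correct` are assumptions made up for the example.

```python
import sqlite3

# Hypothetical answer key: questionID -> correct answerID.
ANSWER_KEY = {1: 1, 2: 2, 3: 2}

def users_with_all_correct(conn, answer_key):
    """Return (userID, name) for users whose votes match every entry in the key."""
    # One "(question AND answer)" pair per question, OR-ed together.
    pairs = " OR ".join("(v.questionID = ? AND v.answerID = ?)" for _ in answer_key)
    params = [value for pair in answer_key.items() for value in pair]
    sql = f"""
        SELECT u.userID, u.name
        FROM users u
        JOIN votes v ON v.userID = u.userID
        WHERE {pairs}
        GROUP BY u.userID, u.name
        HAVING COUNT(DISTINCT v.questionID) = ?
        ORDER BY u.userID
    """
    return conn.execute(sql, params + [len(answer_key)]).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userID INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE votes (voteID INTEGER PRIMARY KEY, userID INTEGER,
                        questionID INTEGER, answerID INTEGER);
    INSERT INTO users VALUES (0, 'Some Name', NULL), (1, 'Other Name', NULL);
    INSERT INTO votes (userID, questionID, answerID) VALUES
        (1, 1, 1), (1, 2, 2), (1, 3, 2),   -- user 1: all correct
        (0, 1, 1), (0, 2, 1), (0, 3, 2);   -- user 0: question 2 wrong
""")
winners = users_with_all_correct(conn, ANSWER_KEY)
```

The WHERE clause keeps only correct votes, and the HAVING clause keeps only users who have one such vote for every question, so the votes table is read once rather than once per question.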

MySql Query with UNION and SORT

I am trying to create a conversation interface like FB (Messages), and for that I need an SQL query that fetches all the people the user has already talked with.
I need the ids of the users he has talked with, most recent conversation first.
For example, if A has chatted with B and C, then B and C will be the result of the query, and B will come first because A chatted with B more recently.
My 'messages' table structure is :
http://www.softnuke.com/me/files/DB.png
This is the FB example:
http://www.softnuke.com/me/files/msg.png
This is my incorrect query which needs to be fixed:
SELECT DISTINCT(`mates`)FROM(
SELECT `time` AS `time`,`from_id` AS `mates`
FROM `messages` AS T WHERE (`from_id`=$uid OR `to_id`=$uid)
UNION
SELECT `time` AS `time`,`to_id` AS `mates`
FROM `messages` AS T WHERE (`from_id`=$uid OR `to_id`=$uid)
) AS T
WHERE `mates`!='$uid'
ORDER BY `time`
$uid holds the id of the user whose list I want to fetch (here, A).
You seem to be getting the main person and the person they were talking to, irrespective of which one is the main person. Also not quite sure how MySQL will work out the time to order things by when you are using DISTINCT which will remove some of the records with their times.
You could get the max time and order by that:-
SELECT `mates`, MAX(`time`) AS LatestConv
FROM(
SELECT `time` AS `time`,`from_id` AS `mates`
FROM `messages` AS T WHERE `to_id`=$uid
UNION
SELECT `time` AS `time`,`to_id` AS `mates`
FROM `messages` AS T WHERE `from_id`=$uid
) AS T
GROUP BY `mates`
ORDER BY LatestConv DESC
To get the status of that latest message:-
SELECT a.mates, a.LatestConv, IFNULL(b.Status, c.Status)
FROM
(
SELECT mates, MAX(`time`) AS LatestConv
FROM(
SELECT `time` AS `time`, from_id AS mates
FROM messages AS T
WHERE to_id = $uid
UNION
SELECT `time` AS `time`, to_id AS mates
FROM messages AS T
WHERE from_id = $uid
) AS T
GROUP BY `mates`
) a
LEFT OUTER JOIN messages b
ON a.mates = b.from_id AND a.LatestConv = b.`time` AND b.to_id = $uid
LEFT OUTER JOIN messages c
ON a.mates = c.to_id AND a.LatestConv = c.`time` AND c.from_id = $uid
ORDER BY LatestConv DESC
Note that this might get a touch confused if there are multiple messages to the same person which all share the same latest time. If this is likely it could be coped with as follows:-
SELECT a.mates, a.LatestConv, MAX(IFNULL(b.Status, c.Status))
FROM
(
SELECT mates, MAX(`time`) AS LatestConv
FROM(
SELECT `time` AS `time`, from_id AS mates
FROM messages AS T
WHERE to_id = $uid
UNION
SELECT `time` AS `time`, to_id AS mates
FROM messages AS T
WHERE from_id = $uid
) AS T
GROUP BY `mates`
) a
LEFT OUTER JOIN messages b
ON a.mates = b.from_id AND a.LatestConv = b.`time` AND b.to_id = $uid
LEFT OUTER JOIN messages c
ON a.mates = c.to_id AND a.LatestConv = c.`time` AND c.from_id = $uid
GROUP BY a.mates, a.LatestConv
ORDER BY LatestConv DESC

Set limit for MySQL query

I have a query like this:
SELECT `all_messages`.`user_1`, `messages`.*, `users`.`username`
FROM `all_messages`
JOIN `messages` ON (`all_messages`.`user_2` = `messages`.`from_user`)
JOIN `users` ON (`all_messages`.`user_2` = `users`.`id`)
WHERE `all_messages`.`user_1` = '12'
ORDER BY `messages`.`sent` DESC LIMIT 2
Now this query does what I need but my problem is with this line
ON (`all_messages`.`user_2` = `messages`.`from_user`)
It selects every row from messages where a match was found, but I need only the newest record. I hope you guys get what I mean.
If you need the one "newest record" you should have a date column or something similar; let's call it "CREATION_TIME". Then you could do something like this:
SELECT AM.user_1, M.*, U.username
FROM all_messages AM, messages M , users U
WHERE AM.user_1 = '12'
AND AM.user_2 = M.from_user
AND AM.user_2 = U.id
AND M.CREATION_TIME =
(
SELECT MAX(CREATION_TIME)
FROM messages
WHERE from_user= M.from_user
)
ORDER BY M.sent DESC LIMIT 2
Edit
SELECT AM.user_1, M.*, U.username
FROM all_messages AM, messages M, users U
WHERE AM.user_1 = '12'
AND AM.user_2 = M.from_user
AND AM.user_2 = U.id
AND M.sent =
(
SELECT MAX(sent)
FROM messages
WHERE from_user= M.from_user
)
ORDER BY M.sent DESC LIMIT 2
It should work
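Here is a stripped-down sketch of the correlated subquery on its own, using Python's sqlite3 as a stand-in for MySQL. It leaves out all_messages and users, and the `body` column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, from_user INTEGER,
                           sent INTEGER, body TEXT);
    INSERT INTO messages VALUES
        (1, 5, 100, 'old'),
        (2, 5, 200, 'new'),
        (3, 6, 150, 'only');
""")

# For each outer row m, the subquery finds the latest sent time for the
# same sender; only the row carrying that time is kept.
newest_per_sender = conn.execute("""
    SELECT m.from_user, m.body
    FROM messages m
    WHERE m.sent = (SELECT MAX(sent)
                    FROM messages
                    WHERE from_user = m.from_user)
    ORDER BY m.sent DESC
""").fetchall()
```

If two messages from the same sender share the same latest sent value, both rows come back, so a unique column such as id is a safer thing to compare on.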