MySQL: Is there a way to eliminate duplicate entries from the result - mysql

I have a table named "message" that stores messages sent from one user to another. I want to build a message box that shows both incoming and outgoing messages for a particular user. The message box should contain only the last message exchanged between each pair of users, so I have to eliminate the duplicate messages between two users. I tried GROUP BY and it eliminates the duplicates, but it doesn't pick the most recent message, because ORDER BY is applied after GROUP BY. I also tried DISTINCT to eliminate the duplicates. It works, but I need to select all columns, which isn't possible with DISTINCT.
My message table:
+-------+---------+------+-----+------------+
| id | from_id | to_id| text| created_at |
+-------+---------+------+-----+------------+
| 1 | 1 | 2 | mes | 2014-01-16 |
| 2 | 2 | 1 | mes | 2014-01-17 |
| 3 | 1 | 3 | mes | 2014-01-18 |
| 4 | 3 | 1 | mes | 2014-01-19 |
+-------+---------+------+-----+------------+
My Group By SQL
SELECT * FROM message WHERE (from_id = 1 OR to_id = 1) GROUP BY(from_id + to_id) ORDER BY created_at DESC;
And Distinct
SELECT DISTINCT(from_id + to_id) FROM message WHERE (from_id = 1 OR to_id = 1)
In the above example, I want to select the second and fourth messages.
Is there a way to eliminate duplicate messages between two users from the result?
EDIT: I've improved the example

I tried group by and it eliminates duplicate messages but I don't pick the most recent message because order by works after group by
So you can order it before:
SELECT *
FROM (SELECT * FROM message ORDER BY created_at DESC) AS m -- MySQL requires an alias for a derived table
WHERE (from_id = 1 OR to_id = 1) GROUP BY(from_id + to_id);

If I understand correctly what you're trying to achieve, you can leverage LEAST(), GREATEST() functions and non-standard GROUP BY extension behavior in MySQL like this
SELECT id, from_id, to_id, text, created_at
FROM
(
SELECT id, from_id, to_id, text, created_at
FROM message
ORDER BY LEAST(from_id, to_id), GREATEST(from_id, to_id), created_at DESC
) q
GROUP BY LEAST(from_id, to_id), GREATEST(from_id, to_id)
That will give you the last message row for each pair of users.
Output:
+------+---------+-------+------+------------+
| id | from_id | to_id | text | created_at |
+------+---------+-------+------+------------+
| 2 | 2 | 1 | mes | 2014-01-17 |
| 4 | 3 | 1 | mes | 2014-01-19 |
+------+---------+-------+------+------------+
Here is SQLFiddle demo
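For anyone who would rather not depend on the non-standard GROUP BY behavior, here is a minimal sketch of the same "last message per pair of users" idea written with NOT EXISTS. It is only an illustration under some assumptions: it runs against an in-memory SQLite database from Python instead of MySQL, it uses SQLite's two-argument min()/max() in place of LEAST()/GREATEST(), and the function name last_message_per_pair is made up for this example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
# SQLite's two-argument min(a, b) / max(a, b) play the role of LEAST() / GREATEST().
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message (id INTEGER PRIMARY KEY, from_id INTEGER, "
    "to_id INTEGER, text TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO message VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 2, "mes", "2014-01-16"),
        (2, 2, 1, "mes", "2014-01-17"),
        (3, 1, 3, "mes", "2014-01-18"),
        (4, 3, 1, "mes", "2014-01-19"),
    ],
)


def last_message_per_pair(conn, user_id):
    """Return the newest message of every conversation that involves user_id."""
    sql = """
        SELECT m.id, m.from_id, m.to_id, m.text, m.created_at
        FROM message AS m
        WHERE (m.from_id = :uid OR m.to_id = :uid)
          AND NOT EXISTS (
              -- a row survives only if no newer row exists for the same pair
              SELECT 1
              FROM message AS newer
              WHERE min(newer.from_id, newer.to_id) = min(m.from_id, m.to_id)
                AND max(newer.from_id, newer.to_id) = max(m.from_id, m.to_id)
                AND (newer.created_at > m.created_at
                     OR (newer.created_at = m.created_at AND newer.id > m.id))
          )
        ORDER BY m.created_at DESC
    """
    return conn.execute(sql, {"uid": user_id}).fetchall()


inbox = last_message_per_pair(conn, 1)
for row in inbox:
    print(row)
```

With the sample table from the question this keeps messages 4 and 2. The tie-break on id only matters when two messages of the same pair share a created_at value.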

You can use:
ORDER BY id DESC LIMIT 1
or by timestamp (assuming it contains date AND time):
ORDER BY created_at DESC LIMIT 1
This will sort all results in a descending order and only give you the last row.
Hope this helps!

Just use a simple select. There is no reason for duplicates to be created.
SELECT from_id, to_id, text, created_at
FROM message
WHERE
(from_id = ? AND to_id = ??)
OR (from_id = ?? AND to_id = ?)
Here ? represents one id and ?? the other.
There would be no duplicates here. Ordering can be achieved in a few ways:
Order by most recent message regardless of sender:
SELECT from_id, to_id, text, created_at
FROM message
WHERE
(from_id = ? AND to_id = ??)
OR (from_id = ?? AND to_id = ?)
ORDER BY created_at DESC
Order all sender message first (then by created_at)
SELECT from_id, to_id, text, created_at
FROM message
WHERE
(from_id = ? AND to_id = ??)
OR (from_id = ?? AND to_id = ?)
ORDER BY from_id = ? DESC, created_at DESC

Try adding a HAVING clause after your GROUP BY: HAVING COUNT(*) > 1
or
SELECT
columns names, COUNT(*)
FROM
(SELECT DISTINCT
column names
FROM
message
)
message
GROUP BY
column names
HAVING COUNT(*) > 1

SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE message
(`id` int, `from_id` int, `to_id` int, `text` varchar(3), `created_at` datetime)
;
INSERT INTO message
(`id`, `from_id`, `to_id`, `text`, `created_at`)
VALUES
(1, 1, 2, 'mes', '2014-01-16 00:00:00'),
(2, 2, 1, 'MUL', '2014-01-17 00:00:00')
;
Query 1:
SELECT *
FROM message
WHERE from_id = 1 OR to_id = 1
ORDER BY created_at DESC
limit 1
Results:
| ID | FROM_ID | TO_ID | TEXT | CREATED_AT |
|----|---------|-------|------|--------------------------------|
| 2 | 2 | 1 | MUL | January, 17 2014 00:00:00+0000 |

Related

Select all columns, only taking into account the highest scores per user

It's been asked before, but I can't get it to work properly. The selected answer doesn't work with duplicate values. The second answer should be able to handle duplicates according to the poster, but it's not functioning correctly with my data.
What I want to achieve is pretty simple:
I have a database containing all scores of all users. I want to build a highscore table, so I want to select all highscore rows of each user. With highscore row I mean the row for that user where his score is the highest.
Here's a demo I made based on the answer I mentioned at the top:
CREATE TABLE test(
score INTEGER,
user_id INTEGER,
info INTEGER
);
insert into test(score, user_id, info)
values
(1000, 1, 1),
(1000, 1, 2),
(2000, 2, 3),
(2001, 2, 1);
--
SELECT t.*
FROM test t
JOIN (SELECT test.user_id, max(score) as mi FROM test GROUP BY user_id) j ON
t.score = j.mi AND
t.user_id = j.user_id
ORDER BY score DESC, info ASC;
Expected output:
+-------+---------+------+
| score | user_id | info |
+-------+---------+------+
| 2001 | 2 | 1 |
| 1000 | 1 | 1 |
+-------+---------+------+
--> every user_id is present with the row where the user had the highest score value.
Real output:
+-------+---------+------+
| score | user_id | info |
+-------+---------+------+
| 2001 | 2 | 1 |
| 1000 | 1 | 1 |
| 1000 | 1 | 2 |
+-------+---------+------+
--> when there are duplicate values, users show up multiple times.
Anyone who can point me in the right direction?
I assume when there are duplicate scores you want the lowest info just like your expected output.
With NOT EXISTS:
select t.* from test t
where not exists (
select 1 from test
where user_id = t.user_id and (
score > t.score or (score = t.score and info < t.info)
)
);
See the demo.
For MySql 8.0+ you can use ROW_NUMBER():
select t.score, t.user_id, t.info
from (
select *, row_number() over (partition by user_id order by score desc, info asc) rn
from test
) t
where t.rn = 1
See the demo.
Results:
| score | user_id | info |
| ----- | ------- | ---- |
| 1000 | 1 | 1 |
| 2001 | 2 | 1 |
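If you want to check the NOT EXISTS variant locally, the following is a small sketch that loads the sample rows from the question into an in-memory SQLite database from Python. SQLite is an assumption here (the question is about MySQL), and the ORDER BY was added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (score INTEGER, user_id INTEGER, info INTEGER)")
conn.executemany(
    "INSERT INTO test (score, user_id, info) VALUES (?, ?, ?)",
    [(1000, 1, 1), (1000, 1, 2), (2000, 2, 3), (2001, 2, 1)],
)

# Keep a row only if no other row of the same user beats it:
# either a higher score, or the same score with a lower info value.
highscore_sql = """
    SELECT t.score, t.user_id, t.info
    FROM test AS t
    WHERE NOT EXISTS (
        SELECT 1
        FROM test AS other
        WHERE other.user_id = t.user_id
          AND (other.score > t.score
               OR (other.score = t.score AND other.info < t.info))
    )
    ORDER BY t.score DESC, t.info ASC
"""
highscores = conn.execute(highscore_sql).fetchall()
for row in highscores:
    print(row)
```

Each user appears once, and the tie between the two 1000-point rows of user 1 is resolved in favor of the lower info value.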
If the combination of (user_id, info) is UNIQUE and NOT NULL (or PRIMARY KEY), then you can use a LIMIT 1 subquery in the WHERE clause:
SELECT t.*
FROM test t
WHERE (t.score, t.info) = (
SELECT t2.score, t2.info
FROM test t2
WHERE t2.user_id = t.user_id
ORDER BY t2.score DESC, t2.info ASC
LIMIT 1
)
ORDER BY t.score DESC, t.info ASC;
The result will be:
| score | user_id | info |
|-------|---------|------|
| 2001 | 2 | 1 |
| 1000 | 1 | 1 |
demo on sqlfiddle
SELECT info FROM test HAVING MAX(score) was used to keep the info field relevant with the row containing the MAX(score).
SELECT MAX(score) score, user_id, (SELECT info FROM test HAVING MAX(score)) AS info FROM test GROUP BY user_id ORDER BY score DESC;

MySQL grouping by 2 columns

I'm trying to return back all the messages for a user_id ( 1 ) in the table sorted by created_at desc, but have it grouped by sender_id or recipient_id depending on whichever created_at is newer.
messages
sender_id | recipient_id | text | created_at
1 | 2 | hey | 2017-03-26 04:00:00
1 | 2 | tees | 2017-03-26 00:00:00
2 | 1 | rrr | 2017-03-27 00:00:00
3 | 1 | edd | 2017-03-27 00:00:00
1 | 3 | cc3 | 2017-02-27 00:00:00
Ideally it would return
2 | 1 | rrr | 2017-03-27 00:00:00
1 | 3 | cc3 | 2017-02-27 00:00:00
The query I have so far is -
select *
from messages
where (
sender_id = 1
or recipient_id = 1
)
group by least(sender_id, recipient_id)
order by created_at desc
but it seems it is doing the order by before the group by.
Any help would be appreciated.
GROUP BY is intended for aggregation (sum, count, etc...), the fact that it orders is little more than an official side effect (that is being deprecated, and not guaranteed behavior in future versions). ORDER BY is done after GROUP BY, it sorts the final results, and can take multiple expressions. Your interchangeable use of the terms makes it difficult to understand exactly what you are looking for, but going by sample desired results this is probably it:
ORDER BY LEAST(sender_id, recipient_id), created_at DESC
However, since all rows in your sample result have the same "LEAST" value, it could be this:
ORDER BY LEAST(sender_id, recipient_id), created_at DESC
If you are attempting to get the most recent post by user 1 where they are the sender and the most recent post by user 1 where they are the recipient, I would use a UNION: the first query gets the most recent post as a sender and the second gets the most recent post as a recipient.
select *
from messages
join
(
select
sender_id,
max(created_at) as max1
from messages
where
sender_id = 1
group by
sender_id
) t1
on messages.created_at = t1.max1
and messages.sender_id = t1.sender_id
union
select *
from messages
join
(
select
recipient_id,
max(created_at) as max2
from messages
where
recipient_id = 1
group by
recipient_id
) t2
on messages.created_at = t2.max2
and messages.recipient_id = t2.recipient_id

Number of unread messages sum

I want to retrieve the messages of a conversation between a sender and a recipient (dest), together with the number of unread messages (read = 0).
+---------------------------------------------------------------+
| messages |
+---------------------------------------------------------------+
| message_id | id_sender | id_dest | subject | message | read |
+---------------------------------------------------------------+
| 1 | 25 | 50 | Hi | message | 0 |
| 2 | 25 | 50 | Hi2 |message2 | 1 |
| 3 | 25 | 50 | Hi3 |message3 | 0 |
+---------------------------------------------------------------+
In this case the result must be 2. I tried:
SELECT *
FROM
(SELECT message,sum(read = 0) as nm_messages
FROM messages
WHERE ( id_sender = id1 AND id_dest = id2 ) or
( id_dest = id1 AND id_sender = id2 )
ORDER BY message_id DESC
LIMIT 10) AS ttbl
ORDER BY message_id ASC
The messages part is OK, but when I add
sum(read = 0) as nm_messages
it returns only the first message. If possible, I need a solution that works for both MySQL and PostgreSQL.
Thanks!
I have used PostgreSQL 9.4.11, compiled by Visual C++ build 1800, 64-bit.
With DISTINCT ON you can collapse rows that share the same value of the given expressions. In this case I have used id_sender.
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal.
For more information, look at this link:
distinct on
The SQL query below will return only the first message and the number of unread messages (read = 0):
SELECT distinct on (id_Sender)
message,
count(case when read=0 then 1 end) over() as nm_messages
FROM messages
group by id_Sender,message,message_id
order by id_Sender,message_id
message | nm_message
message | 2
You should use the sum with if condition. Should be like this:
SELECT *
FROM
(SELECT GROUP_CONCAT(message) AS messages, SUM(IF(`read` = 0,1,0)) AS nm_messages -- READ is a reserved word in MySQL, so it is quoted
FROM messages
WHERE ( id_sender = id1 AND id_dest = id2 ) or
( id_dest = id1 AND id_sender = id2 )
GROUP BY id_sender, id_dest
LIMIT 10) AS ttbl
When the condition is true (read = 0), it adds 1 to the sum, otherwise 0.
I just saw that there was no grouping in the query, so I added it. Also, if you use an aggregate function, it doesn't make sense to apply it to only one field (read) and not to the others (message), so I put GROUP_CONCAT around message. That makes more sense to me.
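Another way to look at it is to keep the two concerns apart: one query for the message list and one conditional aggregate for the unread count. Below is a rough sketch of that idea, run against in-memory SQLite from Python. The function name conversation and the use of SQLite are assumptions made for illustration. CASE WHEN is used for the count because it is standard SQL; the "read" column is double-quoted here, and MySQL would need backticks unless ANSI_QUOTES is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (message_id INTEGER PRIMARY KEY, id_sender INTEGER, "
    'id_dest INTEGER, subject TEXT, message TEXT, "read" INTEGER)'
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 25, 50, "Hi", "message", 0),
        (2, 25, 50, "Hi2", "message2", 1),
        (3, 25, 50, "Hi3", "message3", 0),
    ],
)

# Both directions of the conversation between users :a and :b.
PAIR_FILTER = "(id_sender = :a AND id_dest = :b) OR (id_sender = :b AND id_dest = :a)"


def conversation(conn, id1, id2, limit=10):
    """Return (last `limit` messages oldest first, number of unread messages)."""
    messages = conn.execute(
        "SELECT message_id, message FROM ("
        f"  SELECT message_id, message FROM messages WHERE {PAIR_FILTER}"
        "  ORDER BY message_id DESC LIMIT :n"
        ") AS latest ORDER BY message_id ASC",
        {"a": id1, "b": id2, "n": limit},
    ).fetchall()
    # Conditional aggregation: count only the rows where read = 0.
    unread = conn.execute(
        'SELECT COALESCE(SUM(CASE WHEN "read" = 0 THEN 1 ELSE 0 END), 0) '
        f"FROM messages WHERE {PAIR_FILTER}",
        {"a": id1, "b": id2},
    ).fetchone()[0]
    return messages, unread


messages, unread = conversation(conn, 25, 50)
print(messages, unread)
```

Keeping the count in its own query avoids mixing an aggregate with non-aggregated columns, which is what collapsed the original result to a single row.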

Get column values based on last entry (not null)

I have a table in MySQL which holds conversations. These conversation are composed of messages. A single conversation looks like the table below.
Importance and eId's are only sometimes set. What I am trying to get from the table is the last message in the conversation (messageId = 4) but with the last set importance and last set eId.
So, from this table
+----------------+-----------+----------+----------+------------+-------+---------+
| conversationId | messageId | time | status | importance | eId | message |
+----------------+-----------+----------+----------+------------+-------+---------+
| 25 | 4 | 11:00:00 | feedback | NULL | NULL | d.. |
+----------------+-----------+----------+----------+------------+-------+---------+
| 25 | 3 | 10:00:00 | open | MEDIUM | NULL | c.. |
+----------------+-----------+----------+----------+------------+-------+---------+
| 25 | 2 | 09:00:00 | feedback | NULL | 123 | b... |
+----------------+-----------+----------+----------+------------+-------+---------+
| 25 | 1 | 08:00:00 | open | HIGH | NULL | a... |
+----------------+-----------+----------+----------+------------+-------+---------+
I need to get this result
+----------------+-----------+----------+----------+------------+-------+---------+
| conversationId | messageId | time | status | importance | eId | message |
+----------------+-----------+----------+----------+------------+-------+---------+
| 25 | 4 | 11:00:00 | feedback | MEDIUM | 123 | d.. |
+----------------+-----------+----------+----------+------------+-------+---------+
I can't get the query to work.
Any help would be appreciated. Thanks.
If there is more than one conversationId in the table, and you want to get the desired result for all of the conversationIds at the same time, then I think you need to join 3 subqueries within subqueries. Something like the 3 below:
SELECT messages.conversationId, messages.messageId, messages.time, messages.status, messages.message
FROM messages
JOIN (
SELECT conversationId, MAX(messageId) as messageId
FROM messages
GROUP BY conversationId) as m2
ON (messages.messageId = m2.messageId);
SELECT messages.conversationId, messages.importance
FROM messages
JOIN (
SELECT conversationId, MAX(messageId) as messageId
FROM messages
WHERE importance IS NOT NULL
GROUP BY conversationId) as m3
ON (messages.messageId = m3.messageId);
SELECT messages.conversationId, messages.eId
FROM messages
JOIN (
SELECT conversationId, MAX(messageId) as messageId
FROM messages
WHERE eId IS NOT NULL
GROUP BY conversationId) as m4
ON (messages.messageId = m4.messageId);
The JOIN would look like this:
SELECT
main.conversationId,
main.messageId,
main.time,
main.status,
importance.importance,
eId.eId,
main.message
FROM (
SELECT messages.conversationId, messages.messageId, messages.time, messages.status, messages.message
FROM messages
JOIN (
SELECT conversationId, MAX(messageId) AS messageId
FROM messages
GROUP BY conversationId) AS m2
ON (messages.messageId = m2.messageId)
) AS main
JOIN (
SELECT messages.conversationId, messages.importance
FROM messages
JOIN (
SELECT conversationId, MAX(messageId) AS messageId
FROM messages
WHERE importance IS NOT NULL
GROUP BY conversationId) AS m3
ON (messages.messageId = m3.messageId)
) AS importance
JOIN (
SELECT messages.conversationId, messages.eId
FROM messages
JOIN (
SELECT conversationId, MAX(messageId) AS messageId
FROM messages
WHERE eId IS NOT NULL
GROUP BY conversationId) AS m4
ON (messages.messageId = m4.messageId)
) AS eId
ON (
main.conversationId = importance.conversationId
AND main.conversationId = eId.conversationId
);
Here's an sqlfiddle: http://sqlfiddle.com/#!2/857aa/38. This assumes that messageId is unique. If it is not unique, then you need to join on messageId AND conversationId.
Maybe a bit oldschool but should do the trick :
SELECT c.conversationId as convId, max(c.messageId), t.time, s.status, i.importance, e.eId, c.message
FROM convs c,
(SELECT conversationId, max(time) AS time FROM convs) t,
(SELECT conversationId, status FROM convs WHERE status IS NOT NULL ORDER BY messageId DESC) s,
(SELECT conversationId, importance FROM convs WHERE importance IS NOT NULL ORDER BY messageId DESC) i,
(SELECT conversationId, eId FROM convs WHERE eId IS NOT NULL ORDER BY messageId DESC) e
WHERE 1=1
AND t.conversationId = c.conversationId
AND s.conversationId = c.conversationId
AND i.conversationId = c.conversationId
AND e.conversationId = c.conversationId
GROUP BY c.conversationId
SQL Fiddle demo
This is not necessarily the best solution: I had to write it on my local SQL Server (SQL Fiddle was down when I tried it and I have no MySQL installed), and it is a "quick and dirty" query, so there might be a better solution for your problem. I am posting it anyway in case it helps.
If you prefer to wait for another, possibly better solution from the community, I will not be offended :)
SELECT TOP 1 conversationId,messageId,RecTime,ConvStatus,
(
SELECT TOP 1 importance
FROM Conversations
WHERE importance IS NOT NULL
ORDER BY messageId DESC
) AS importance,
(
SELECT TOP 1 eId
FROM Conversations
WHERE eId IS NOT NULL
ORDER BY messageId DESC
) AS eId,
UsrMessage
FROM Conversations
ORDER BY messageId DESC
GO
(do not forget that you might have to change "formatting items" to make it work in MySQL)
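The scalar-subquery idea from the last answer can also be correlated on conversationId so that it works for several conversations at once, with LIMIT 1 in place of TOP 1. Here is a hedged sketch of that, run on in-memory SQLite from Python. The rows for conversation 26 are made up for the example, and the table and column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (conversationId INTEGER, messageId INTEGER PRIMARY KEY, "
    "time TEXT, status TEXT, importance TEXT, eId INTEGER, message TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (25, 4, "11:00:00", "feedback", None, None, "d.."),
        (25, 3, "10:00:00", "open", "MEDIUM", None, "c.."),
        (25, 2, "09:00:00", "feedback", None, 123, "b..."),
        (25, 1, "08:00:00", "open", "HIGH", None, "a..."),
        # Hypothetical second conversation, not part of the question's data.
        (26, 5, "12:00:00", "open", "LOW", None, "e.."),
        (26, 6, "13:00:00", "feedback", None, None, "f.."),
    ],
)

last_message_sql = """
    SELECT m.conversationId, m.messageId, m.time, m.status,
           -- most recent non-NULL importance within the same conversation
           (SELECT i.importance FROM messages AS i
             WHERE i.conversationId = m.conversationId
               AND i.importance IS NOT NULL
             ORDER BY i.messageId DESC LIMIT 1) AS importance,
           -- most recent non-NULL eId within the same conversation
           (SELECT e.eId FROM messages AS e
             WHERE e.conversationId = m.conversationId
               AND e.eId IS NOT NULL
             ORDER BY e.messageId DESC LIMIT 1) AS eId,
           m.message
    FROM messages AS m
    WHERE m.messageId = (SELECT MAX(x.messageId) FROM messages AS x
                          WHERE x.conversationId = m.conversationId)
    ORDER BY m.conversationId
"""
last_messages = conn.execute(last_message_sql).fetchall()
for row in last_messages:
    print(row)
```

A conversation in which eId (or importance) was never set simply gets NULL in that column, as the made-up conversation 26 shows.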

how to get a single row for each user?

sorry for the title, i don't know how to explain it better...
i have a forum and i want to make a sort of achievement system in php
i want to know when users with posts>10 posted their 10th message...
the post table is like
post_id | post_date | userid | post_message | ...
i can get this result for each user with
select userid, post_date from posts where userid=1 order by post_date limit 9,1
but i need a resultset like
id | date
id | date
id | date
it can only be done with procedures?
Try this query
select
*
from (
select
@rn:=if(@prv=userid, @rn+1, 1) as rid,
@prv:=userid as userid,
post_message
from
tbl
join
(select @rn:=0, @prv:=0) tmp
order by
userid,
post_date) tmp
where
rid=10
SQL FIDDLE
| RID | USERID | POST_MESSAGE |
-------------------------------
| 10 | 1 | asdasd |
| 10 | 2 | asdasd |
try this one:
SELECT userid
, SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(post_date ORDER BY post_date), ',', 10), ',', -1) AS PostDate
FROM posts
GROUP BY userid
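HAVING COUNT(*) >= 10
The HAVING clause keeps only users who have at least 10 posts. You also need to pay attention to the maximum length that GROUP_CONCAT can hold (the group_concat_max_len system variable, 1024 bytes by default); longer results are truncated.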
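If neither user variables nor GROUP_CONCAT appeal to you, a correlated COUNT(*) can pick the n-th post per user as well. This is just a sketch under a few assumptions: it runs on in-memory SQLite from Python, the sample posts are generated for the example, and nth_post_per_user is a made-up helper name. On large tables the correlated count may be slow without an index on (userid, post_date).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (post_id INTEGER PRIMARY KEY, post_date TEXT, "
    "userid INTEGER, post_message TEXT)"
)

# Made-up data: user 1 has 12 posts, user 2 exactly 10, user 3 only 3.
rows = []
for userid, count in ((1, 12), (2, 10), (3, 3)):
    for n in range(1, count + 1):
        rows.append((f"2014-01-{n:02d}", userid, f"post {n}"))
conn.executemany(
    "INSERT INTO posts (post_date, userid, post_message) VALUES (?, ?, ?)", rows
)


def nth_post_per_user(conn, n):
    """Return (userid, post_date) of each user's n-th post, ordered by userid."""
    sql = """
        SELECT p.userid, p.post_date
        FROM posts AS p
        WHERE (
            -- number of posts by the same user that come before this one
            SELECT COUNT(*)
            FROM posts AS earlier
            WHERE earlier.userid = p.userid
              AND (earlier.post_date < p.post_date
                   OR (earlier.post_date = p.post_date
                       AND earlier.post_id < p.post_id))
        ) = :n - 1
        ORDER BY p.userid
    """
    return conn.execute(sql, {"n": n}).fetchall()


tenth = nth_post_per_user(conn, 10)
for row in tenth:
    print(row)
```

Users with fewer than n posts (user 3 in the made-up data) produce no row at all, so no extra HAVING filter is needed.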