I have a table that holds the relations of users participating in conversations, as follows:
CREATE TABLE `so` (
`id` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`user_id` int(11) NOT NULL,
`conversation_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `so`
ADD UNIQUE KEY `uc` (`user_id`,`conversation_id`) USING BTREE;
INSERT INTO `so` (`id`, `user_id`, `conversation_id`) VALUES
(1, 1, 1),
(3, 1, 2),
(2, 2, 1),
(4, 2, 2),
(5, 3, 2);
According to the sample data, users 1 and 2 share the conversation with ID 1, and users 1, 2 and 3 share the conversation with ID 2.
I need to get the unique conversation_id for a given list of user ids.
My current query is:
SELECT conversation_id, COUNT(user_id) as usersCount
FROM so
WHERE user_id IN (1,2)
GROUP BY conversation_id
HAVING usersCount = 2
ORDER BY NULL
But it returns 2 rows, one for each conversation, and I expect only the row with conversation_id 1.
How can I select the row that belongs exactly to users 1 and 2, and not to 1, 2, 3? Thanks.
UPDATE:
I can't use a subquery or join per user for performance reasons: the list of users in the query may contain up to 30 ids, and I'm afraid 30 subqueries is not an option.
You can use group_concat:
select conversation_id
from so
group by conversation_id
having group_concat(user_id order by user_id) = '1,2';
To avoid full index scan, you can put your original query in a subquery:
SELECT a.conversation_id
FROM (
SELECT conversation_id
FROM so
WHERE user_id IN (1,2)
GROUP BY conversation_id
HAVING COUNT(conversation_id) = 2) a
JOIN so b ON a.conversation_id = b.conversation_id
GROUP BY a.conversation_id
HAVING COUNT(a.conversation_id) = 2;
Instead of checking the user_id in the WHERE clause, compare the number of rows that satisfy that condition with the total rows for each conversation.
SELECT conversation_id, COUNT(*) AS allCount, SUM(user_id IN (1, 2)) AS userCount
FROM so
GROUP BY conversation_id
HAVING allCount = 2 AND allCount = userCount
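For a list of up to 30 ids, the same comparison can be built with bound parameters. Below is a minimal sketch, not a definitive implementation: it runs against an in-memory SQLite copy of the sample table rather than MySQL, and the helper names make_so_db and conversations_for_exactly are made up for illustration. It relies on the unique key on (user_id, conversation_id), so each user counts at most once per conversation.

```python
import sqlite3


def make_so_db():
    """Build an in-memory SQLite copy of the question's sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE so ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " conversation_id INTEGER NOT NULL,"
        " UNIQUE (user_id, conversation_id))"
    )
    conn.executemany(
        "INSERT INTO so (id, user_id, conversation_id) VALUES (?, ?, ?)",
        [(1, 1, 1), (3, 1, 2), (2, 2, 1), (4, 2, 2), (5, 3, 2)],
    )
    return conn


def conversations_for_exactly(conn, user_ids):
    """Return conversation ids whose participants are exactly user_ids."""
    ids = sorted(set(user_ids))
    if not ids:
        return []
    # One "?" per id, so the values are bound rather than spliced in.
    placeholders = ",".join("?" * len(ids))
    sql = f"""
        SELECT conversation_id
        FROM so
        GROUP BY conversation_id
        HAVING COUNT(*) = ?
           AND SUM(user_id IN ({placeholders})) = ?
        ORDER BY conversation_id
    """
    params = (len(ids), *ids, len(ids))
    return [row[0] for row in conn.execute(sql, params)]


conn = make_so_db()
print(conversations_for_exactly(conn, [1, 2]))     # [1]
print(conversations_for_exactly(conn, [1, 2, 3]))  # [2]
```

The query text only grows by one placeholder per id, so 30 ids still means a single pass over the grouped rows.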
This answer is an alternative to those already given; it avoids sub-selects, which may make it more efficient.
HAVING COUNT(user_id IN ('1','2') OR NULL) = 2 specifies that you want conversations that contain both user 1 and user 2.
COUNT(user_id) = 2 says that there can only be 2 users in the conversation.
You could even remove COUNT(user_id) as usersCount from the result set if you don't actually use it.
SELECT conversation_id, COUNT(user_id) as usersCount
FROM so
GROUP BY conversation_id
HAVING COUNT(user_id IN ('1','2') OR NULL) = 2 AND
COUNT(user_id) = 2;
To avoid a full index scan you would have to use a WHERE clause, as @Fabricator has shown in his answer. Conditions on groups of rows can only be checked after the rows have been grouped and aggregated, whereas a WHERE clause applies conditions to single rows before grouping. How big is your table, out of interest?
Related
I have a table with the following structure and data. I would like to get the conversation_id of the rows that have the given user_id(s).
For example, I would like to get the conversation_id shared by user_id 1 and user_id 2, so the result should be 1. If I would like to get the conversation_id for user_id 1 and user_id 4, the result should be 4.
How could I write this as an SQL query?
You can GROUP BY conversation_id and set the condition in the HAVING clause:
SELECT conversation_id
FROM tablename
WHERE user_id IN (1, 2) -- the ids of the users
GROUP BY conversation_id
HAVING COUNT(DISTINCT user_id) = 2 -- the number of the users
I have a simple table arranged as below and I'd like to write a query that returns the single client_id that has the most transactions. One client has more transactions than any other client in the table.
transaction_id int
client_id int
comments varchar
Cheers.
Here is one method for obtaining the data by counting the rows, grouping on the client_id and then filtering for top 1 ordered by the count descending.
declare @table table (
transaction_id int,
client_id int,
comments varchar(20)
);
insert into @table (transaction_id, client_id, comments)
values
(1, 1, ''),
(2, 2, ''),
(3, 2, ''),
(4, 3, '');
select top 1 client_id, count(*) as vol
from @table
group by client_id
order by vol desc;
This will do it for you:
SELECT client_id FROM table_name GROUP BY client_id ORDER BY COUNT(*) DESC limit 1;
select client_id, count(transaction_id)
from table
group by client_id
order by 2 desc
limit 1;
Group by gathers records based on a particular column, and count returns the number of records for each value of that column. Order by ... desc arranges the groups in descending order, and limit 1 restricts the result to 1 record.
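To see the group / count / order / limit steps run end to end, here is a small sketch against an in-memory SQLite database. The table name transactions and the helper names are assumptions (the question does not name the table), the four rows are the ones used in the table-variable answer above, and the extra client_id sort key is only there to make a tie deterministic.

```python
import sqlite3


def make_transactions_db():
    """In-memory table with the columns from the question (name is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        " transaction_id INTEGER,"
        " client_id INTEGER,"
        " comments TEXT)"
    )
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?)",
        [(1, 1, ""), (2, 2, ""), (3, 2, ""), (4, 3, "")],
    )
    return conn


def top_client(conn):
    """Return (client_id, transaction_count) for the busiest client, or None."""
    return conn.execute(
        "SELECT client_id, COUNT(*) AS vol"
        " FROM transactions"
        " GROUP BY client_id"
        # Assumption: on a tie, the lowest client_id wins.
        " ORDER BY vol DESC, client_id"
        " LIMIT 1"
    ).fetchone()


print(top_client(make_transactions_db()))  # (2, 2)
```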
TABLE [tbl_hobby]
person_id (int) , hobby_id(int)
has many records. I want a SQL query to find all pairs of person_ids who have exactly the same hobbies (the same set of hobby_ids).
If A has hobby_id 1, B has it too; if A doesn't have hobby_id 2, B doesn't have it either; then we output the person_ids of A & B.
If A, B and C all meet the condition, we output A & B, B & C, A & C.
I've done it with a very, very stupid method: joining the table to itself multiple times, plus multiple sub-queries. And of course my team lead laughed at it.
Is there a high-performance way to do this in SQL?
I have been thinking hard about this for the last 36 hours...
Sample data as a MySQL dump:
CREATE TABLE `tbl_hobby` (
`person_id` int(11) NOT NULL,
`hobby_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tbl_hobby` (`person_id`, `hobby_id`) VALUES
(1, 1),(1, 2),(1, 3),(1, 4),(1, 5),(2, 2),
(2, 3),(2, 4),(3, 1),(3, 2),(3, 3),(3, 4),
(4, 1),(4, 3),(4, 4),(5, 1),(5, 5),(5, 9),
(6, 2),(6, 3),(6, 4),(7, 1),(7, 3),(7, 7),
(8, 2),(8, 3),(8, 4),(9, 1),(9, 2),(9, 3),
(9, 4),(10, 1),(10, 5),(10, 9),(10, 11);
COMMIT;
Expected result: (2, 6 and 8 are the same; 3 and 9 are the same)
2,6
2,8
6,8
3,9
The order of the result records and the order of the two numbers within a record are not important. Results in one column or in two columns are both acceptable, since they can easily be concatenated or separated.
Aggregate per person to get a string of each person's hobbies. Then aggregate per hobby list to find out which lists belong to more than one person.
select hobbies, group_concat(person_id order by person_id) as persons
from
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) persons
group by hobbies
having count(*) > 1
order by hobbies;
This gives a list of persons per hobby list, which is the easiest way to output a solution, as we would otherwise have to build all possible pairs.
UPDATE: If you want pairs, you'll have to query the table twice:
select p1.person_id as person1, p2.person_id as person2
from
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) p1
join
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) p2 on p2.person_id > p1.person_id and p2.hobbies = p1.hobbies
order by person1, person2;
Alternative version, without using any proprietary string handling:
select distinct t1.person_id, t2.person_id
from tbl_hobby t1
join tbl_hobby t2
on t1.person_id < t2.person_id
where 2 = all (select count(*)
from tbl_hobby
where person_id in (t1.person_id, t2.person_id)
group by hobby_id);
Perhaps less efficient, but portable!
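Another portable variant, as a sketch only: count the hobbies each pair shares and keep the pairs where that count equals both persons' total number of hobbies. It assumes (person_id, hobby_id) is unique, which the sample data satisfies but the table definition does not enforce. The code runs the question's sample data in an in-memory SQLite database; the function names are made up for illustration.

```python
import sqlite3

SAMPLE = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 2),
    (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4),
    (4, 1), (4, 3), (4, 4), (5, 1), (5, 5), (5, 9),
    (6, 2), (6, 3), (6, 4), (7, 1), (7, 3), (7, 7),
    (8, 2), (8, 3), (8, 4), (9, 1), (9, 2), (9, 3),
    (9, 4), (10, 1), (10, 5), (10, 9), (10, 11),
]


def make_hobby_db():
    """Load the question's sample rows into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_hobby ("
        " person_id INTEGER NOT NULL,"
        " hobby_id INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO tbl_hobby VALUES (?, ?)", SAMPLE)
    return conn


def same_hobby_pairs(conn):
    """Return (person1, person2) pairs whose hobby sets are identical."""
    sql = """
        WITH counts AS (
            SELECT person_id, COUNT(*) AS n
            FROM tbl_hobby
            GROUP BY person_id
        )
        SELECT a.person_id AS person1, b.person_id AS person2
        FROM tbl_hobby AS a
        JOIN tbl_hobby AS b
          ON a.hobby_id = b.hobby_id AND a.person_id < b.person_id
        JOIN counts AS ca ON ca.person_id = a.person_id
        JOIN counts AS cb ON cb.person_id = b.person_id
        GROUP BY a.person_id, b.person_id, ca.n, cb.n
        -- shared hobbies must cover every hobby of both persons
        HAVING COUNT(*) = ca.n AND COUNT(*) = cb.n
        ORDER BY person1, person2
    """
    return conn.execute(sql).fetchall()


print(same_hobby_pairs(make_hobby_db()))
# [(2, 6), (2, 8), (3, 9), (6, 8)]
```

Persons 5 and 10 show why both counts are checked: they share all three of person 5's hobbies, but person 10 has a fourth one.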
I ran into a problem trying to pull one action per user, the one with the lowest priority. The priority is derived from the content of other columns and is an integer.
This is the initial query:
SELECT
CASE
...
END AS dummy_priority,
id,
user_id
FROM
actions
Result :
id user_id priority
1 2345 1
2 2345 3
3 2999 5
4 2999 2
5 3000 10
Desired result :
id user_id priority
1 2345 1
4 2999 2
5 3000 10
To get that, I tried:
SELECT x.id, x.user_id, MIN(x.priority)
FROM (
SELECT
CASE
...
END AS priority,
id,
user_id
FROM
actions
) x
GROUP BY x.user_id
Which didn't work:
Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'x.id' which is not
functionally dependent on columns in GROUP BY clause;
Most examples of this that I found extract just the user_id and priority and then do an inner join on both of them to get the row, but I can't do that since (priority, user_id) isn't unique.
A simple verifiable example would be
CREATE TABLE `actions` (
`id` int(11) NOT NULL,
`user_id` int(11) DEFAULT NULL,
`priority` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `actions` (`id`, `user_id`, `priority`) VALUES
(1, 2345, 1),
(2, 2345, 3),
(3, 2999, 5),
(4, 2999, 2),
(5, 3000, 10);
How can I extract the desired result (please keep in mind that this table is a subquery)?
The proper way to do this would involve a subquery of some sort . . . and that would require repeating the case definition.
Here is another method, using the substring_index()/group_concat() trick:
SELECT SUBSTRING_INDEX(GROUP_CONCAT(x.id ORDER BY x.priority), ',', 1) as id,
x.user_id, MIN(x.priority)
FROM (SELECT (CASE ...
END) AS priority,
id, user_id
FROM actions a
) x
GROUP BY x.user_id;
And that proper way in full...
SELECT x...
, CASE...x... priority
FROM my_table x
JOIN
( SELECT user_id
, MIN(CASE...) priority
FROM my_table
GROUP
BY user_id
) y
ON y.user_id = x.user_id
AND y.priority = CASE...x...;
This should work ...
SELECT act.id, act.user_id, act.priority FROM actions act
INNER JOIN
(SELECT
user_id, MIN(priority) AS priority
FROM
actions
GROUP BY user_id) pri
ON act.user_id = pri.user_id AND act.priority = pri.priority
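Since the question says (priority, user_id) isn't unique, a join on the minimum priority can return several rows for one user. A possible way around that, sketched here under assumptions, is a NOT EXISTS check that breaks ties on the lowest id. The code uses the question's verifiable example in an in-memory SQLite database with a plain priority column standing in for the CASE expression; the row (6, 3000, 10) is a made-up tie added for illustration, and priorities are assumed to be non-NULL.

```python
import sqlite3


def make_actions_db():
    """The question's sample rows plus one hypothetical tie for user 3000."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE actions ("
        " id INTEGER NOT NULL,"
        " user_id INTEGER,"
        " priority INTEGER)"
    )
    conn.executemany(
        "INSERT INTO actions (id, user_id, priority) VALUES (?, ?, ?)",
        [
            (1, 2345, 1), (2, 2345, 3), (3, 2999, 5),
            (4, 2999, 2), (5, 3000, 10),
            (6, 3000, 10),  # hypothetical row: same user, same priority
        ],
    )
    return conn


def lowest_priority_actions(conn):
    """Return one (id, user_id, priority) row per user: lowest priority, then lowest id."""
    sql = """
        SELECT a.id, a.user_id, a.priority
        FROM actions AS a
        WHERE NOT EXISTS (
            SELECT 1
            FROM actions AS b
            WHERE b.user_id = a.user_id
              AND (b.priority < a.priority
                   OR (b.priority = a.priority AND b.id < a.id))
        )
        ORDER BY a.user_id
    """
    return conn.execute(sql).fetchall()


print(lowest_priority_actions(make_actions_db()))
# [(1, 2345, 1), (4, 2999, 2), (5, 3000, 10)]
```

With a computed priority, the CASE expression would have to appear in both the outer and the inner query, or be wrapped in a derived table that both sides read from.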
I'm stuck with the following problem:
SQL query for the table:
CREATE TABLE IF NOT EXISTS `thread_users` (
`thread_id` bigint(20) unsigned NOT NULL,
`user_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`thread_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Let's say that I have this data:
INSERT INTO `thread_users` (`thread_id`, `user_id`) VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 2),
(2, 3),
(3, 1),
(3, 4);
I need to retrieve the thread_id that is referenced by exactly 2 user ids: X & Y (both known).
With the above data, I want to be able to retrieve the thread_id where only user_id = 1 & user_id = 2 are present.
What I know for sure about this table:
If a thread is composed of only 2 users, there is no other thread containing only those two ids. (This is checked outside MySQL before the insertion.)
A user can't be present in a thread more than once. (primary key)
What I have thought of to solve this problem:
Sum up (user_id 1 + user_id 2), then search for SUMs equal to that result + (user_id = X OR user_id = Y). But I haven't been able to write this query correctly, AND I also need to check the number of user_ids in that thread...
Obviously: searching for the thread_id where the number of user_ids in the thread equals 2 and where the user_ids are equal to X & Y.
Thanks for the help guys!
SELECT tu1.thread_id
FROM thread_users AS tu1
INNER JOIN thread_users AS tu2
ON tu1.thread_id = tu2.thread_id
AND tu1.user_id <> tu2.user_id
LEFT OUTER JOIN thread_users AS tu3
ON tu1.thread_id = tu3.thread_id
AND tu1.user_id <> tu3.user_id
AND tu2.user_id <> tu3.user_id
WHERE tu1.user_id = 1
AND tu2.user_id = 2
AND tu3.user_id IS NULL
Something like this
SELECT thread_id FROM thread_users WHERE user_id IN(1,2) GROUP BY thread_id
HAVING COUNT(user_id)=2
I think this is a bit simpler than the JOIN example, and also a bit faster:
SELECT `thread_id`, GROUP_CONCAT(`user_id` ORDER BY `user_id`) AS this_match FROM `thread_users`
GROUP BY `thread_id` HAVING this_match = '1,2'
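One more way to express "only X and Y are present", offered as a sketch rather than a tuned solution: filter on the two known ids in WHERE, reject any thread that also has a third user with NOT EXISTS, and require both ids to be there. The code runs the question's sample data in an in-memory SQLite database; the helper names make_thread_db and thread_for_pair are made up for illustration.

```python
import sqlite3


def make_thread_db():
    """In-memory copy of thread_users with the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE thread_users ("
        " thread_id INTEGER NOT NULL,"
        " user_id INTEGER NOT NULL,"
        " PRIMARY KEY (thread_id, user_id))"
    )
    conn.executemany(
        "INSERT INTO thread_users (thread_id, user_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1), (3, 4)],
    )
    return conn


def thread_for_pair(conn, x, y):
    """Return the thread_id used by exactly users x and y, or None."""
    sql = """
        SELECT t.thread_id
        FROM thread_users AS t
        WHERE t.user_id IN (?, ?)
          AND NOT EXISTS (
              -- any other participant disqualifies the thread
              SELECT 1
              FROM thread_users AS o
              WHERE o.thread_id = t.thread_id
                AND o.user_id NOT IN (?, ?)
          )
        GROUP BY t.thread_id
        HAVING COUNT(*) = 2
    """
    row = conn.execute(sql, (x, y, x, y)).fetchone()
    return row[0] if row else None


conn = make_thread_db()
print(thread_for_pair(conn, 1, 2))  # 1
print(thread_for_pair(conn, 1, 4))  # 3
print(thread_for_pair(conn, 1, 3))  # None
```

COUNT(*) = 2 is enough here because the primary key guarantees a user appears at most once per thread.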