Retrieve one id referred by two ids - mysql

I'm stuck with the following problem:
SQL query for the table:
CREATE TABLE IF NOT EXISTS `thread_users` (
`thread_id` bigint(20) unsigned NOT NULL,
`user_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`thread_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Let's say that I have this data:
INSERT INTO `thread_users` (`thread_id`, `user_id`) VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 2),
(2, 3),
(3, 1),
(3, 4);
I need to retrieve the thread_id that is referenced by exactly two user ids, X and Y (both known).
With the above data, I want to be able to retrieve the thread_id where only user_id = 1 and user_id = 2 are present (here, thread_id 1).
What I know for sure about this table:
If a thread consists of only 2 users, no other thread contains only those two ids. (This is checked outside MySQL before insertion.)
A user can't be present in a thread more than once (primary key).
What I have thought of to solve this problem:
Sum up the ids (user_id 1 + user_id 2) and search for threads whose SUM equals that result, combined with (user_id = X OR user_id = Y). But I haven't been able to write this query correctly, and I also need to check the number of user_ids in that thread...
Or, more obviously: search for threads where the number of user_ids equals 2 and where the user_ids are X and Y.
Thanks for the help, guys!

SELECT tu1.thread_id
FROM thread_users AS tu1
INNER JOIN thread_users AS tu2
ON tu1.thread_id = tu2.thread_id
AND tu1.user_id <> tu2.user_id
LEFT OUTER JOIN thread_users AS tu3
ON tu1.thread_id = tu3.thread_id
AND tu1.user_id <> tu3.user_id
AND tu2.user_id <> tu3.user_id
WHERE tu1.user_id = 1
AND tu2.user_id = 2
AND tu3.user_id IS NULL

Something like this (no WHERE filter, so that threads with extra users are not counted as matches):
SELECT thread_id FROM thread_users GROUP BY thread_id
HAVING COUNT(user_id)=2 AND SUM(user_id IN(1,2))=2

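I think this is a bit simpler than the JOIN example, and it may also be a bit faster. The ORDER BY inside GROUP_CONCAT matters, because the concatenation order is otherwise not guaranteed:
SELECT `thread_id`, GROUP_CONCAT(`user_id` ORDER BY `user_id`) AS this_match FROM `thread_users`
GROUP BY `thread_id` HAVING this_match = '1,2'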
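For anyone who wants to check the "exactly these two users" logic against the sample data, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, and the names ANTI_JOIN, AGGREGATE and thread_ids are made up for this illustration. It runs the three-way JOIN from the first answer next to a conditional-aggregation form and prints both results for a few user pairs.

```python
import sqlite3

# SQLite in memory stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thread_users ("
    " thread_id INTEGER NOT NULL,"
    " user_id INTEGER NOT NULL,"
    " PRIMARY KEY (thread_id, user_id))"
)
conn.executemany(
    "INSERT INTO thread_users (thread_id, user_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1), (3, 4)],
)

# Anti-join: a third member (tu3) must not exist in the thread.
ANTI_JOIN = """
SELECT tu1.thread_id
FROM thread_users AS tu1
INNER JOIN thread_users AS tu2
        ON tu1.thread_id = tu2.thread_id
       AND tu1.user_id <> tu2.user_id
LEFT OUTER JOIN thread_users AS tu3
        ON tu1.thread_id = tu3.thread_id
       AND tu1.user_id <> tu3.user_id
       AND tu2.user_id <> tu3.user_id
WHERE tu1.user_id = ?
  AND tu2.user_id = ?
  AND tu3.user_id IS NULL
ORDER BY tu1.thread_id
"""

# Conditional aggregation: exactly two members, and both are X and Y.
AGGREGATE = """
SELECT thread_id
FROM thread_users
GROUP BY thread_id
HAVING COUNT(*) = 2
   AND SUM(user_id IN (?, ?)) = 2
ORDER BY thread_id
"""


def thread_ids(sql, x, y):
    """Run one of the queries above for the user pair (x, y), with x != y."""
    return [row[0] for row in conn.execute(sql, (x, y))]


for pair in [(1, 2), (1, 4), (1, 3)]:
    print(pair, thread_ids(ANTI_JOIN, *pair), thread_ids(AGGREGATE, *pair))
```

Thread 2 contains users 1 and 2 as well, but it is rejected by both queries because user 3 is also a member.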

Related

How to count all rows with multiple table joins and where conditions?

I have the following tables:
create table loans
(
id int null,
status int null,
user_id int null
);
INSERT INTO loans VALUES (1, 1, 1);
INSERT INTO loans VALUES (2, 0, 1);
INSERT INTO loans VALUES (3, 1, 1);
create table deals
(
id int null,
status int null,
user_id int null
);
INSERT INTO deals VALUES (2, 0, 1);
INSERT INTO deals VALUES (3, 0, 1);
create table listings
(
id int null,
status int null,
user_id int null
);
INSERT INTO listings VALUES (1, 1, 1);
INSERT INTO listings VALUES (2, 1, 1);
INSERT INTO listings VALUES (3, 1, 1);
And have the following SQL:
SELECT COUNT(*) AS active_items
FROM loans
LEFT JOIN deals ON deals.user_id = 1
LEFT JOIN listings ON listings.user_id = 1
WHERE
loans.status = 1
AND deals.status = 1
AND listings.status = 1
AND loans.user_id = 1
The goal is to count all rows, across the three tables, that have a status of 1, leaving out any that have a status of 0. My query always seems to return 0 and I do not understand why. How can I query the database so that every loan, deal and listing with a status of 1 is counted in a single total called active_items?
DB Fiddle: https://www.db-fiddle.com/f/g9CoA9CdDujqzG4ZpgmJXh/1
The output for active_items is expected to be 5.
Don't use JOIN for this, since you're not relating the tables to each other. Just do 3 separate queries and add the counts.
SELECT SUM(count) AS total
FROM (
SELECT COUNT(*) AS count
FROM loans
WHERE user_id = 1 AND status = 1
UNION ALL
SELECT COUNT(*) AS count
FROM deals
WHERE user_id = 1 AND status = 1
UNION ALL
SELECT COUNT(*)
FROM listings
WHERE user_id = 1 AND status = 1
) AS x
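To see the difference on the sample data, here is a small sketch that assumes an in-memory SQLite database in place of MySQL; the names JOINED, UNIONED, joined_total and unioned_total are only for this illustration. It runs the original LEFT JOIN query and the UNION ALL version side by side. The joined query counts nothing because no deals row has status = 1, so the WHERE clause discards every joined row.

```python
import sqlite3

# SQLite in memory stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loans    (id INT, status INT, user_id INT);
CREATE TABLE deals    (id INT, status INT, user_id INT);
CREATE TABLE listings (id INT, status INT, user_id INT);
INSERT INTO loans    VALUES (1, 1, 1), (2, 0, 1), (3, 1, 1);
INSERT INTO deals    VALUES (2, 0, 1), (3, 0, 1);
INSERT INTO listings VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1);
""")

# The query from the question: every joined row must pass all three
# status checks, and deals.status = 1 is never true in the sample data.
JOINED = """
SELECT COUNT(*) AS active_items
FROM loans
LEFT JOIN deals ON deals.user_id = 1
LEFT JOIN listings ON listings.user_id = 1
WHERE loans.status = 1
  AND deals.status = 1
  AND listings.status = 1
  AND loans.user_id = 1
"""

# One independent count per table, added together.
UNIONED = """
SELECT SUM(cnt) AS total
FROM (
    SELECT COUNT(*) AS cnt FROM loans    WHERE user_id = 1 AND status = 1
    UNION ALL
    SELECT COUNT(*)        FROM deals    WHERE user_id = 1 AND status = 1
    UNION ALL
    SELECT COUNT(*)        FROM listings WHERE user_id = 1 AND status = 1
) AS x
"""

joined_total = conn.execute(JOINED).fetchone()[0]
unioned_total = conn.execute(UNIONED).fetchone()[0]
print(joined_total, unioned_total)  # prints: 0 5
```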
It is not 100% clear to me what you are trying to check.
But if I understand it correctly, I think the problem is that you are only checking for user_id = 1, which might not have status 1 in all the tables (I really can't be sure without seeing your data).
I think you want to do something like:
SELECT COUNT(*) AS active_items
FROM loans
INNER JOIN deals ON deals.user_id = loans.user_id
INNER JOIN listings ON listings.user_id = loans.user_id
WHERE
loans.status = 1
AND deals.status = 1
AND listings.status = 1

SQL multiple JOINs or subqueries but avoid cartesian product

I want to build an SQL database for a game. There are a number of players that participate in different tournaments. For each tournament, a player has a separate account. All games are listed in one large table in which the tournament accounts are used to describe the winner and the loser, along with the score of the game.
The schema is given in http://sqlfiddle.com/#!9/55378a or here again
CREATE TABLE `players` (
`id` int NOT NULL,
`name` varchar(5),
PRIMARY KEY (`id`)
);
CREATE TABLE `tournamentAccounts` (
`tId` int NOT NULL,
`playerId` int NOT NULL,
`handicap` int NOT NULL DEFAULT 10,
PRIMARY KEY (`tId`)
);
CREATE TABLE `games` (
`gameId` int NOT NULL,
`winnerTId` int NOT NULL,
`loserTId` int NOT NULL,
`score` int NOT NULL DEFAULT 0,
PRIMARY KEY (`gameId`)
);
INSERT INTO `players` (`id`, `name`) VALUES
(1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO `tournamentAccounts` (`tId`, `playerId`, `handicap`) VALUES
(1, 1, 10), (2, 1, 2), (3, 2, 0);
INSERT INTO `games` (`gameId`, `winnerTId`, `loserTId`, `score`) VALUES
(1, 1, 3, 3), (2, 1, 3, 2), (3, 3, 1, 6);
What I want to achieve: list all tournament scores for a specific player, i.e. handicap + score points of won games - score points of lost games. For the given inputs, the result set should contain two rows with total scores 9 (for tId=1) and 2 (for tId=2), respectively. The example here is simplified, as in my real case there are more conditions to match between the tournamentAccounts and games tables (e.g. time slots etc.), but I guess I can extend it myself once I understand the basic approach :-)
My approaches so far have failed, as I cannot get a nice JOIN or subqueries to work (I would like to avoid stored procedures).
Attempt 1: straightforward join
SELECT t.*, (t.handicap +COALESCE(SUM(w.score),0) -COALESCE(SUM(l.score),0)) AS score
FROM tournamentAccounts t
LEFT JOIN games w ON w.winnerTId = t.tId
LEFT JOIN games l ON l.loserTId = t.tId
WHERE playerId = 1
GROUP BY t.tId
Although this returns the correct number of rows, the double LEFT JOIN seems to cause a cartesian product: the two won games are each joined with the lost game, giving two rows, hence 10 + 3 - 6 + 2 - 6 = 3 instead of 9. This effect obviously becomes worse the more matching rows I have in the games table.
Attempt 2: UNION with JOIN (similar to sql avoid cartesian product)
SELECT SUM(COALESCE(x.aa,0))
FROM
((SELECT -l.score AS aa FROM games l LEFT JOIN tournamentAccounts t ON l.loserTId = t.tId WHERE t.playerId = 1)
UNION
(SELECT w.score AS aa FROM games w LEFT JOIN tournamentAccounts t ON w.winnerTId = t.tId WHERE t.playerId = 1)) x
With this I get the proper score value summed up, however it is not yet combined with the corresponding handicap value, and also I don't know how to extend from here to cover all tournament accounts of that player (here, I just took a small snapshot of data) in an SQL manner.
I would just make the games portion of your query into a union, not the whole thing:
SELECT t.*, (t.handicap +COALESCE(SUM(win_score),0) -COALESCE(SUM(loss_score),0)) AS score
FROM tournamentAccounts t
LEFT JOIN (
SELECT w.winnerTId AS tId, w.score AS win_score, 0 AS loss_score FROM games w
UNION ALL
SELECT l.loserTId, 0, l.score FROM games l
) games_won_or_lost ON games_won_or_lost.tId=t.tId
WHERE playerId = 1
GROUP BY t.tId
The other alternative is to undo the effects of the cartesian product. You know the win score is too high by a factor of the number of lost games, so replace SUM(w.score) with ROUND(SUM(w.score)/GREATEST(COUNT(DISTINCT l.gameId),1)). And similarly, SUM(l.score) becomes ROUND(SUM(l.score)/GREATEST(COUNT(DISTINCT w.gameId),1)).
How about the following:
SELECT t.*, (t.handicap + coalesce(wscore,0) - coalesce(lscore,0)) AS score
FROM tournamentAccounts t
LEFT JOIN (
select sum(score) wscore, winnerTId wid
from games
group by winnerTid
) as w ON w.wid = t.tid
left join (
select sum(score) lscore, loserTid lid
from games
group by loserTid
) as l ON l.lid = t.tid
where playerId = 1
I got the result as:
tId  playerId  handicap  score
1    1         10        9
2    1         2         2
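The fan-out is easy to reproduce. Below is a minimal sketch, assuming an in-memory SQLite database in place of MySQL; the names NAIVE, PRE_AGGREGATED, naive and pre_aggregated are only for this illustration. It compares the double LEFT JOIN from the question with a version that aggregates wins and losses per account before joining, so each account joins to at most one row per side.

```python
import sqlite3

# SQLite in memory stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tournamentAccounts (
    tId INT PRIMARY KEY, playerId INT NOT NULL, handicap INT NOT NULL DEFAULT 10);
CREATE TABLE games (
    gameId INT PRIMARY KEY, winnerTId INT NOT NULL, loserTId INT NOT NULL,
    score INT NOT NULL DEFAULT 0);
INSERT INTO tournamentAccounts VALUES (1, 1, 10), (2, 1, 2), (3, 2, 0);
INSERT INTO games VALUES (1, 1, 3, 3), (2, 1, 3, 2), (3, 3, 1, 6);
""")

# Attempt 1 from the question: wins and losses multiply each other.
NAIVE = """
SELECT t.tId,
       t.handicap + COALESCE(SUM(w.score), 0) - COALESCE(SUM(l.score), 0) AS score
FROM tournamentAccounts t
LEFT JOIN games w ON w.winnerTId = t.tId
LEFT JOIN games l ON l.loserTId = t.tId
WHERE t.playerId = 1
GROUP BY t.tId, t.handicap
ORDER BY t.tId
"""

# Aggregate each side first, then join the one-row-per-account totals.
PRE_AGGREGATED = """
SELECT t.tId,
       t.handicap + COALESCE(w.wscore, 0) - COALESCE(l.lscore, 0) AS score
FROM tournamentAccounts t
LEFT JOIN (SELECT winnerTId AS wid, SUM(score) AS wscore
           FROM games GROUP BY winnerTId) AS w ON w.wid = t.tId
LEFT JOIN (SELECT loserTId AS lid, SUM(score) AS lscore
           FROM games GROUP BY loserTId) AS l ON l.lid = t.tId
WHERE t.playerId = 1
ORDER BY t.tId
"""

naive = conn.execute(NAIVE).fetchall()
pre_aggregated = conn.execute(PRE_AGGREGATED).fetchall()
print(naive)           # prints: [(1, 3), (2, 2)]
print(pre_aggregated)  # prints: [(1, 9), (2, 2)]
```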

get the id of the row with the least value, group by an other column

I ran into a problem trying to pull one action per user, namely the one with the lowest priority. The priority is an integer derived from the content of other columns.
This is the initial query:
SELECT
CASE
...
END AS priority,
id,
user_id
FROM
actions
Result :
id user_id priority
1 2345 1
2 2345 3
3 2999 5
4 2999 2
5 3000 10
Desired result :
id user_id priority
1 2345 1
4 2999 2
5 3000 10
Following what I want, I tried:
SELECT x.id, x.user_id, MIN(x.priority)
FROM (
SELECT
CASE
...
END AS priority,
id,
user_id
FROM
actions
) x
GROUP BY x.user_id
which didn't work:
Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'x.id' which is not
functionally dependent on columns in GROUP BY clause;
Most examples of this that I found extract just the user_id and priority and then do an inner join on both of them to get the row, but I can't do that since (priority, user_id) isn't unique.
A simple verifiable example would be:
CREATE TABLE `actions` (
`id` int(11) NOT NULL,
`user_id` int(11) DEFAULT NULL,
`priority` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `actions` (`id`, `user_id`, `priority`) VALUES
(1, 2345, 1),
(2, 2345, 3),
(3, 2999, 5),
(4, 2999, 2),
(5, 3000, 10);
How can I extract the desired result (please keep in mind that this table is a subquery)?
The proper way to do this would involve a subquery of some sort . . . and that would require repeating the case definition.
Here is another method, using the substring_index()/group_concat() trick:
SELECT SUBSTRING_INDEX(GROUP_CONCAT(x.id ORDER BY x.priority), ',', 1) as id,
x.user_id, MIN(x.priority)
FROM (SELECT (CASE ...
END) AS priority,
id, user_id
FROM actions a
) x
GROUP BY x.user_id;
And that proper way in full...
SELECT x...
, CASE...x... priority
FROM my_table x
JOIN
( SELECT user_id
, MIN(CASE...) priority
FROM my_table
GROUP
BY user_id
) y
ON y.user_id = x.user_id
AND y.priority = CASE...x...;
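This should work. The selected columns are qualified with act. because user_id and priority exist in both the table and the derived table:
SELECT act.id, act.user_id, act.priority FROM actions act
INNER JOIN
(SELECT
user_id, MIN(priority) AS priority
FROM
actions
GROUP BY user_id) pri
ON act.user_id = pri.user_id AND act.priority = pri.priority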
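Since the question points out that (priority, user_id) is not unique, the join-on-minimum idea can return several rows per user when priorities tie. Here is a minimal sketch of one way to handle that, assuming an in-memory SQLite database in place of MySQL. The row (6, 3000, 10) is a hypothetical addition to the sample data, included only to create a tie, and picking the lowest id as the tie-breaker is an assumption rather than something the question specifies.

```python
import sqlite3

# SQLite in memory stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (id INT NOT NULL, user_id INT, priority INT);
INSERT INTO actions (id, user_id, priority) VALUES
    (1, 2345, 1),
    (2, 2345, 3),
    (3, 2999, 5),
    (4, 2999, 2),
    (5, 3000, 10),
    (6, 3000, 10);  -- hypothetical extra row: ties with id 5
""")

# Step 1 (derived table pri): lowest priority per user.
# Step 2: join back to find the rows that carry that priority.
# Step 3: MIN(act.id) keeps a single row per user when priorities tie.
LOWEST_PER_USER = """
SELECT MIN(act.id) AS id, act.user_id, act.priority
FROM actions act
INNER JOIN (
    SELECT user_id, MIN(priority) AS priority
    FROM actions
    GROUP BY user_id
) pri
    ON act.user_id = pri.user_id
   AND act.priority = pri.priority
GROUP BY act.user_id, act.priority
ORDER BY act.user_id
"""

lowest = conn.execute(LOWEST_PER_USER).fetchall()
for row in lowest:
    print(row)
```

Every selected column is either aggregated or listed in GROUP BY, so this form should also be accepted when ONLY_FULL_GROUP_BY is enabled.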

MySQL: troubles with HAVING COUNT 'exact', not 'at least'

I have a table that holds relations of users participating in conversations like follows:
CREATE TABLE `so` (
`id` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`user_id` int(11) NOT NULL,
`conversation_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `so`
ADD UNIQUE KEY `uc` (`user_id`,`conversation_id`) USING BTREE;
INSERT INTO `so` (`id`, `user_id`, `conversation_id`) VALUES
(1, 1, 1),
(3, 1, 2),
(2, 2, 1),
(4, 2, 2),
(5, 3, 2);
According to the sample data, users 1 and 2 share the conversation with ID 1, and users 1, 2 and 3 share the conversation with ID 2.
I need to get the unique conversation_id for a given list of user ids.
My current query is:
SELECT conversation_id, COUNT(user_id) as usersCount
FROM so
WHERE user_id IN (1,2)
GROUP BY conversation_id
HAVING usersCount = 2
ORDER BY NULL
But it returns 2 rows, one for each conversation, and I expect only the row with conversation_id 1.
How can I select the conversation that belongs exactly to users 1 and 2, and not to 1, 2, 3? Thanks.
UPDATE:
I can't use subqueries on joins for performance reasons, because the list of users in the query may contain up to 30 ids, and I'm afraid 30 subqueries is not an option.
You can use group_concat
select conversation_id
from so
group by conversation_id
having group_concat(user_id order by user_id) = '1,2';
To avoid a full index scan, you can put your original query in a subquery:
SELECT a.conversation_id
FROM (
SELECT conversation_id
FROM so
WHERE user_id IN (1,2)
GROUP BY conversation_id
HAVING COUNT(conversation_id) = 2) a
JOIN so b ON a.conversation_id = b.conversation_id
GROUP BY a.conversation_id
HAVING COUNT(a.conversation_id) = 2;
Instead of checking the user_id in the WHERE clause, compare the number of rows that satisfy that condition with the total rows for each conversation.
SELECT conversation_id, COUNT(*) AS allCount, SUM(user_id IN (1, 2)) AS userCount
FROM so
GROUP BY conversation_id
HAVING allCount = 2 AND allCount = userCount
This answer is an alternative to those already given, and it avoids sub-selects.
HAVING COUNT(user_id IN (1, 2) OR NULL) = 2 specifies that both user 1 and user 2 must be in the conversation (rows that fail the IN test evaluate to NULL and are not counted).
COUNT(user_id) = 2 says that there can only be 2 users in the conversation.
You could even remove COUNT(user_id) AS usersCount from the result set if you don't actually use it.
SELECT conversation_id, COUNT(user_id) as usersCount
FROM so
GROUP BY conversation_id
HAVING COUNT(user_id IN (1, 2) OR NULL) = 2 AND
COUNT(user_id) = 2;
To avoid a full index scan you would have to use a WHERE clause, as @Fabricator has shown in his answer. A WHERE clause filters individual rows before grouping, whereas HAVING conditions are applied only after the rows have been grouped and aggregated. How big is your table, out of interest?
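Here is a small sketch of "exactly" versus "at least" on the sample data, assuming an in-memory SQLite database in place of MySQL. The helper names at_least and exactly are made up for this illustration. Both take a list of user ids, so the same query shape works for 2 ids or for 30 without adding a subquery per user.

```python
import sqlite3

# SQLite in memory stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE so (
    id INTEGER PRIMARY KEY,
    user_id INT NOT NULL,
    conversation_id INT NOT NULL,
    UNIQUE (user_id, conversation_id));
INSERT INTO so (id, user_id, conversation_id) VALUES
    (1, 1, 1), (3, 1, 2), (2, 2, 1), (4, 2, 2), (5, 3, 2);
""")


def at_least(user_ids):
    """Conversations that contain all given users, possibly with others."""
    ids = sorted(set(user_ids))
    marks = ",".join("?" * len(ids))
    sql = (
        "SELECT conversation_id FROM so "
        f"WHERE user_id IN ({marks}) "
        "GROUP BY conversation_id HAVING COUNT(user_id) = ? "
        "ORDER BY conversation_id"
    )
    return [r[0] for r in conn.execute(sql, ids + [len(ids)])]


def exactly(user_ids):
    """Conversations whose participants are the given users and nobody else."""
    ids = sorted(set(user_ids))
    marks = ",".join("?" * len(ids))
    sql = (
        "SELECT conversation_id FROM so "
        "GROUP BY conversation_id "
        f"HAVING SUM(user_id IN ({marks})) = ? AND COUNT(*) = ? "
        "ORDER BY conversation_id"
    )
    return [r[0] for r in conn.execute(sql, ids + [len(ids), len(ids)])]


print(at_least([1, 2]))    # prints: [1, 2]
print(exactly([1, 2]))     # prints: [1]
print(exactly([1, 2, 3]))  # prints: [2]
```

Only the placeholders are built by string formatting; the ids themselves are passed as bound parameters.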

Short-circuit logic evaluation operators

Are there any short-circuit logic operators (specifically short-circuit AND and short-circuit OR) that I can use in a WHERE clause in MySQL 5.5? If there aren't, what are the alternatives?
An abstract view of my problem, along with an explanation as to why I need this, can be found at this fiddle:
http://sqlfiddle.com/#!2/97fd1/3
In reality we are looking at millions of books in millions of bookstores in thousands of cities in hundreds of countries, which is why we cannot accept the overhead of receiving the unneeded information with every query we dispatch and seriously need to find a way to make the evaluation stop as soon as we have all rows that satisfy the current condition, before moving on to the next OR.
Let me know if you need more information. Thanks in advance.
As requested, here is the schema used in the fiddle:
CREATE TABLE quantitycache (
id INT AUTO_INCREMENT,
quantity INT,
book_id INT NOT NULL,
bookstore_id INT NULL,
city_id INT NULL,
country_id INT NULL,
PRIMARY KEY (id)
);
As well as some example data:
INSERT INTO quantitycache
(quantity, book_id, bookstore_id, city_id, country_id)
VALUES
(5, 1, 1, NULL, NULL),
(100, 2, 1, NULL, NULL),
(7, 1, 2, NULL, NULL),
(12, 1, NULL, 1, NULL),
(12, 1, NULL, NULL, 1),
(100, 2, NULL, 1, NULL),
(100, 2, NULL, NULL, 1),
(200, 3, NULL, 1, NULL),
(250, 3, NULL, NULL, 1);
Keep in mind that a query does not execute imperatively. SQL is declarative: the server decides how and in what order the conditions are evaluated, and the WHERE clause is tested for every candidate row, so a short-circuit operator in the WHERE clause would not limit the output to a single row.
Instead, use the LIMIT clause to return only the first row.
SELECT * FROM quantitycache
WHERE bookstore_id = 1 OR city_id = 1 OR country_id = 1
ORDER BY bookstore_id IS NULL ASC,
city_id IS NULL ASC,
country_id IS NULL ASC
LIMIT 1;
To get the best match for all books in a result set, save the results to a temp table, find the best result, then return the interesting fields. MySQL cannot refer to the same TEMPORARY table more than once in a single query, so the best rank per book goes into a second temp table.
CREATE TEMPORARY TABLE results (id int, book_id int, match_rank int);
INSERT INTO results (id, book_id, match_rank)
SELECT id, book_id,
-- this assumes that lower numbers are better
CASE WHEN bookstore_id IS NOT NULL THEN 1
WHEN city_id IS NOT NULL THEN 2
ELSE 3 END AS match_rank
FROM quantitycache
WHERE bookstore_id = 1 OR city_id = 1 OR country_id = 1;
CREATE TEMPORARY TABLE best_ranks (book_id int, best_rank int);
INSERT INTO best_ranks (book_id, best_rank)
SELECT book_id, MIN(match_rank)
FROM results
GROUP BY book_id;
SELECT *
FROM best_ranks AS r
INNER JOIN results AS rid
ON r.book_id = rid.book_id
AND rid.match_rank = r.best_rank
INNER JOIN quantitycache AS q ON q.id = rid.id;
DROP TEMPORARY TABLE results;
DROP TEMPORARY TABLE best_ranks;
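The same "best rank per book" idea can also be written as one statement with a derived table. The following is a minimal sketch that assumes an in-memory SQLite database in place of MySQL; the function best_matches and its parameters store, city and country are hypothetical names. As a small variation, it ranks a row by which of the three ids actually matched rather than by which column is non-NULL.

```python
import sqlite3

# SQLite in memory stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE quantitycache (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    quantity INT, book_id INT NOT NULL,
    bookstore_id INT NULL, city_id INT NULL, country_id INT NULL);
INSERT INTO quantitycache (quantity, book_id, bookstore_id, city_id, country_id)
VALUES (5, 1, 1, NULL, NULL), (100, 2, 1, NULL, NULL), (7, 1, 2, NULL, NULL),
       (12, 1, NULL, 1, NULL), (12, 1, NULL, NULL, 1), (100, 2, NULL, 1, NULL),
       (100, 2, NULL, NULL, 1), (200, 3, NULL, 1, NULL), (250, 3, NULL, NULL, 1);
""")

# Rank 1 = bookstore match, 2 = city match, 3 = country match (lower is better).
BEST_PER_BOOK = """
SELECT q.book_id, q.id, q.quantity
FROM quantitycache q
INNER JOIN (
    SELECT book_id,
           MIN(CASE WHEN bookstore_id = :store THEN 1
                    WHEN city_id = :city THEN 2
                    ELSE 3 END) AS best_rank
    FROM quantitycache
    WHERE bookstore_id = :store OR city_id = :city OR country_id = :country
    GROUP BY book_id
) b ON b.book_id = q.book_id
WHERE (q.bookstore_id = :store OR q.city_id = :city OR q.country_id = :country)
  AND CASE WHEN q.bookstore_id = :store THEN 1
           WHEN q.city_id = :city THEN 2
           ELSE 3 END = b.best_rank
ORDER BY q.book_id
"""


def best_matches(store, city, country):
    """Return (book_id, row id, quantity) for the most specific match per book."""
    params = {"store": store, "city": city, "country": country}
    return conn.execute(BEST_PER_BOOK, params).fetchall()


for row in best_matches(1, 1, 1):
    print(row)
```

With the sample data, books 1 and 2 resolve to their bookstore-level rows, while book 3 has no bookstore row and falls back to the city-level row.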