MySQL query optimization: Multiple SELECT IN to LEFT JOIN - mysql

I usually go with the join approach but in this case I am a bit confused. I am not even sure that it is possible at all. I wonder if the following query can be converted to a left join query instead of the multiple select in used:
select
users.id, users.first_name, users.last_name, users.description, users.email
from users
where id in (
select assigned.id_user from assigned where id_project in (
select assigned.id_project from assigned where id_user = 1
)
)
or id in (
select projects.id_user from projects where projects.id in (
select assigned.id_project from assigned where id_user = 1
)
)
This query returns the correct result set. However, I guess the repetition of the query that selects assigned.id_project is a waste.

You could start with the project assignments of user 1 (a1). Then find all assignments of other people to those projects (a2), and the user recorded in the projects table (p). The users you are looking for are then in either a2 or p. I added distinct to remove duplicates of users who can be reached both ways.
select distinct u.*
from assigned a1
left join
assigned a2
on a1.id_project = a2.id_project
left join
projects p
on a1.id_project = p.id
join users u
on u.id = a2.id_user
or u.id = p.id_user
where a1.id_user = 1

Since both subqueries have a condition where assigned.id_user = 1, I start with that query. Let's call that assignment(s) the 'leading assignment'.
Then join the rest, using left joins for the 'optional' tables.
Use an inner join on user that matches either users of assignments linked to the leading assignment or users of projects linked to the leading project.
I use distinct because I assume you want each user only once, even if they have both an assignment and a project (or multiple projects).
select distinct
u.id, u.first_name, u.last_name, u.description, u.email
from
assigned a
left join assigned ap on ap.id_project = a.id_project
left join projects p on p.id = a.id_project
inner join users u on u.id = ap.id_user or u.id = p.id_user
where
a.id_user = 1

Here's an alternative way to get rid of the repetition:
SELECT
users.id,
users.first_name,
users.last_name,
users.description,
users.email
FROM users
WHERE id IN (
SELECT up.id_user
FROM (
SELECT id_user, id_project FROM assigned
UNION ALL
SELECT id_user, id FROM projects
) up
INNER JOIN assigned a
ON a.id_project = up.id_project
WHERE a.id_user = 1
)
;
That is, the assigned table's (id_user, id_project) pairs are UNIONed with those of projects. The resulting set is then joined with the projects of the user with id_user = 1 to obtain the list of all users who share projects with that user. All that remains is to retrieve the details for those users, which is done the same way as in your query, i.e. using an IN clause.
I'm sorry to say that I don't have MySQL at hand to test the performance of this query thoroughly, so I cannot be sure whether it is better or worse than your original query or than the one suggested by both @GolezTrol and @Andomar. Generally I tend to agree with @GolezTrol's comment that a query with simple (semi- or whatever-) joins and repetitive parts might turn out more efficient than an equivalent sophisticated query without repetitions. In the end, however, only testing can give you the final answer.
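Since the answers agree that only testing settles this, here is a minimal sketch for checking that the three formulations return the same users before comparing their speed. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows are made up, and only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
SETUP = """
CREATE TABLE users    (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE projects (id INTEGER PRIMARY KEY, id_user INTEGER);
CREATE TABLE assigned (id_user INTEGER, id_project INTEGER);
INSERT INTO users    VALUES (1,'Ann'),(2,'Bob'),(3,'Cy'),(4,'Dee'),(5,'Eve');
INSERT INTO projects VALUES (10,3),(20,4),(30,5);
INSERT INTO assigned VALUES (1,10),(2,10),(1,20),(5,30);
"""

# The original query with the repeated subquery.
NESTED_IN = """
SELECT users.id FROM users
WHERE id IN (
    SELECT assigned.id_user FROM assigned WHERE id_project IN (
        SELECT assigned.id_project FROM assigned WHERE id_user = 1))
   OR id IN (
    SELECT projects.id_user FROM projects WHERE projects.id IN (
        SELECT assigned.id_project FROM assigned WHERE id_user = 1))
"""

# The join-based rewrite.
JOINS = """
SELECT DISTINCT u.id
FROM assigned a
LEFT JOIN assigned ap ON ap.id_project = a.id_project
LEFT JOIN projects p ON p.id = a.id_project
INNER JOIN users u ON u.id = ap.id_user OR u.id = p.id_user
WHERE a.id_user = 1
"""

# The UNION ALL rewrite.
UNION_ALL = """
SELECT users.id FROM users
WHERE id IN (
    SELECT up.id_user
    FROM (SELECT id_user, id_project FROM assigned
          UNION ALL
          SELECT id_user, id FROM projects) up
    INNER JOIN assigned a ON a.id_project = up.id_project
    WHERE a.id_user = 1)
"""

def user_ids(conn, sql):
    """Run a query and return the user ids it yields, sorted."""
    return sorted(row[0] for row in conn.execute(sql))

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
results = {
    "nested_in": user_ids(conn, NESTED_IN),
    "joins": user_ids(conn, JOINS),
    "union_all": user_ids(conn, UNION_ALL),
}
for name, ids in results.items():
    print(name, ids)
```

On MySQL itself, the same three statements can be prefixed with EXPLAIN and timed against real data; this sketch only checks that they agree on the result.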

Related

select a column corresponding to max value in two joined tables

I have two tables, say Users and Interviews. One user can have multiple interview records.
Users
-----
UserID
FirstName
LastName
Interviews
----------
InterviewID
UserID
DateOfInterview
I want to get only the latest interview records. Here's my query
select u.UserID, firstname, lastname, max(DateOfInterview) as latestDOI
from users u
left join interviews i
on u.UserID = i.UserID
GROUP BY u.UserID, firstname, lastname
ORDER BY max(DateOfInterview) DESC
How do I update the query to return the InterviewID as well (i.e. the one which corresponds to max(DateOfInterview))?
Instead of using an aggregate function in your select list, you can use an aggregate subquery in your WHERE clause:
select u.UserID, firstname, lastname, i.InterviewId, DateOfInterview as latestDOI
from users u
left join interviews i
on u.UserID = i.UserID
where i.UserId is null or i.DateOfInterview = (
select max(DateOfInterview)
from interviews i2
where i2.UserId = u.UserId
)
That does suppose that max(DateOfInterview) will be unique per user, but the question has no well-defined answer otherwise. Note that the main query is no longer an aggregate query, so the constraints of such queries do not apply.
There are other ways to approach the problem, and it is worthwhile to look into them because a correlated subquery such as I present can be a performance concern. For example, you could use an inline view to generate a table of the per-user latest interview dates, and use joins to that view to connect users with the ID of their latest interview:
select u.*, im.latestDOI, i2.InterviewId
from
users u
left join (
select UserID, max(DateOfInterview) as latestDOI
from interviews i
group by UserID
) im
on u.UserId = im.UserId
left join interviews i2
on im.UserId = i2.UserId and im.latestDOI = i2.DateOfInterview
There are other alternatives, too, some standard and others DB-specific.
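As a quick sanity check of the two approaches above, the following sketch runs both against a tiny made-up data set, using Python's sqlite3 module as a stand-in database. The sample rows and the ISO-formatted date strings are assumptions; the table and column names follow the question.

```python
import sqlite3

# Hypothetical sample data; user 3 has no interviews at all.
SETUP = """
CREATE TABLE users (UserID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE interviews (InterviewID INTEGER PRIMARY KEY, UserID INTEGER,
                         DateOfInterview TEXT);
INSERT INTO users VALUES (1,'Ann','A'),(2,'Bob','B'),(3,'Cy','C');
INSERT INTO interviews VALUES (100,1,'2020-01-01'),(101,1,'2021-06-01'),
                              (102,2,'2019-03-05');
"""

# Correlated subquery in the WHERE clause.
CORRELATED = """
SELECT u.UserID, i.DateOfInterview AS latestDOI, i.InterviewID
FROM users u
LEFT JOIN interviews i ON u.UserID = i.UserID
WHERE i.UserID IS NULL OR i.DateOfInterview = (
    SELECT MAX(DateOfInterview) FROM interviews i2 WHERE i2.UserID = u.UserID)
ORDER BY u.UserID
"""

# Inline view with the per-user maximum, joined back to interviews.
INLINE_VIEW = """
SELECT u.UserID, im.latestDOI, i2.InterviewID
FROM users u
LEFT JOIN (SELECT UserID, MAX(DateOfInterview) AS latestDOI
           FROM interviews GROUP BY UserID) im
       ON u.UserID = im.UserID
LEFT JOIN interviews i2
       ON im.UserID = i2.UserID AND im.latestDOI = i2.DateOfInterview
ORDER BY u.UserID
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
correlated_rows = conn.execute(CORRELATED).fetchall()
inline_view_rows = conn.execute(INLINE_VIEW).fetchall()
print(correlated_rows)
print(inline_view_rows)
```

Both queries keep the user without interviews (with NULLs for the interview columns), and both would return two rows for a user whose latest date is shared by two interviews.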
Rewrite to use an OUTER APPLY when grabbing your interview; that way you can use ORDER BY rather than MAX:
select u.UserID, firstname, lastname, LatestInterviewDetails.InterviewID, LatestInterviewDetails.DateOfInterview as latestDOI
from users u
OUTER APPLY (SELECT TOP 1 InterviewID, DateOfInterview
FROM interviews
WHERE interviews.UserID = u.UserId
ORDER BY interviews.DateOfInterview DESC
) as LatestInterviewDetails
Note: this works only if you are using Microsoft SQL Server.

Get users with followers, and without followers

I have a really simple table - follow - in which I store followers.
user | following
-----------------
1 | 2
The above means user 1 is following user 2.
I want to display all users on the home page and order them by who has the most followers, and then return the rest of the users who have no followers. The query below works as far as displaying the users, but I can't figure out how to retrieve the users who do not have any followers. I've tried RIGHT JOIN users u ON f.following=u.id but that gives me weird results.
This query returns user 2 who has a follower, but doesn't return users 1 and 3, who do not have followers.
Edit: this query is also checking to see if the user is following back, which is why I'm joining using the ID of 1 as a test.
SELECT
u.id
,u.username
,u.avatar
,COUNT(1) AS followers
,ul.*
,fo.*
FROM follow f
LEFT JOIN users u ON f.following=u.id
LEFT JOIN follow fo ON fo.following=u.id AND fo.user=1
LEFT JOIN users_likes ul ON ul.likes=u.id AND ul.user=1
GROUP BY f.following
ORDER BY COUNT(1) DESC
SQL Fiddle: http://sqlfiddle.com/#!2/98f65/1
The problem with your query in the question is that you are left-joining to the follow table. That means that all rows in the follow table are included regardless of their connection to another table. What you want is to show all users, so that is the table that should be on the outer end of the join.
I also think you're trying to do too many things at once here, which is why you're having trouble figuring it out. You want to know who has followers and who doesn't, who's following back, order them, consider the users_likes and so on. I recommend taking a step back and breaking them down into individual queries, and then building those into one result set as needed.
To get the users and number of followers, you can outer join the users table with the follow table like this:
SELECT u.id, u.username, u.avatar, (IFNULL(COUNT(f.following), 0)) AS numFollowers
FROM users u
LEFT JOIN follow f ON f.following = u.id
GROUP BY u.id
ORDER BY numfollowers DESC;
When a user has no followers, the outer join produces a row in which f.following is NULL. COUNT(f.following) counts only non-NULL values, so such a user already gets 0; the IFNULL wrapper is therefore redundant but harmless.
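To see how the count behaves for users without followers, here is a small sketch using Python's sqlite3 module in place of MySQL. The sample rows are made up, and the user and following column names are double-quoted only as a precaution for SQLite. It contrasts COUNT on the joined column with COUNT(*), which is what the query in the question effectively did with COUNT(1).

```python
import sqlite3

# Hypothetical sample data: user 2 has two followers, user 1 has one, user 3 none.
SETUP = """
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE follow ("user" INTEGER, "following" INTEGER);
INSERT INTO users VALUES (1,'ann'),(2,'bob'),(3,'cy');
INSERT INTO follow VALUES (1,2),(3,2),(2,1);
"""

# COUNT(column) skips the NULL produced by the outer join, so the result is 0.
COUNT_COLUMN = """
SELECT u.id, COUNT(f."following") AS numFollowers
FROM users u
LEFT JOIN follow f ON f."following" = u.id
GROUP BY u.id
ORDER BY numFollowers DESC, u.id
"""

# COUNT(*) counts the NULL-extended row too, so a user without followers shows 1.
COUNT_STAR = """
SELECT u.id, COUNT(*) AS numFollowers
FROM users u
LEFT JOIN follow f ON f."following" = u.id
GROUP BY u.id
ORDER BY numFollowers DESC, u.id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
by_column = conn.execute(COUNT_COLUMN).fetchall()
by_star = conn.execute(COUNT_STAR).fetchall()
print(by_column)
print(by_star)
```

The second result is the reason to count a column from the follow table rather than rows when users is the outer table.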
If you want to work in the users_likes table, you should add it in as another left join. The problem this causes, is that it will return null values for all columns if there are no likes. (Example, if I left join the users_likes table here, I will see null for users 1 and 3 because nobody 'likes' them.) To make the result set a little more understandable, I recommend you don't collect all rows of the users_likes table. Perhaps this query would make more sense:
SELECT u.id, u.username, u.avatar, (IFNULL(COUNT(f.following), 0)) AS numFollowers, ul.user AS likedByUser, ul.created_at
FROM users u
LEFT JOIN follow f ON f.following = u.id
LEFT JOIN users_likes ul ON ul.likes = u.id
GROUP BY u.id
ORDER BY numfollowers DESC;
As far as whether or not a user is following back, I think this would change a bit, as the above only shows the number of followers, and doesn't produce a row for each follower.
Let me know if you have any more questions, here is an SQL Fiddle for the above. I will leave it up to you for handling the null values that occur right now.
You can use an outer join (left or right) from users to your current query in any number of ways. Here is an easy example that should get you started. This isn't a cleaned-up solution, just a demo of one way that will work.
SELECT a.*
,b.*
FROM users a
LEFT JOIN (
SELECT
u.id
,u.username
,u.avatar
,COUNT(1) AS followers
FROM follow f
LEFT JOIN users u ON f.following=u.id
LEFT JOIN follow fo ON fo.following=u.id AND fo.user=1
LEFT JOIN users_likes ul ON ul.likes=u.id AND ul.user=1
GROUP BY f.following
) b
ON a.id = b.id
ORDER BY followers DESC
You can do this:
SELECT * FROM (
SELECT u.id, u.username, u.avatar, COUNT(f.user) as followers
FROM users AS u
LEFT JOIN follow AS f ON u.id = f.following
GROUP BY u.id
) AS subselect ORDER BY subselect.followers DESC

mysql joining for relational lookup

I've never been all that great with much more than regular select queries. I have a new project that has users, roles and assigned_roles (lookup table for users with roles).
I want to group_concat the roles.name so that my result shows me what roles each user has assigned.
I've tried several things:
select users.id, users.displayname,users.email, rolenames from `users`
left join `assigned_roles` on `assigned_roles`.`user_id` = `users`.`id`
left join (SELECT `id`, group_concat(`roles`.`name`) as `rolenames` FROM `roles`) as uroles ON `assigned_roles`.`role_id` = `uroles`.`id`
This gives me the grouped role names but shows me duplicate entries if a user has two roles, so the second row in the result shows the same user but no role names.
select users.id, users.displayname,users.email, rolenames from `users`
join `assigned_roles` on `assigned_roles`.`user_id` = `users`.`id`
join (SELECT `id`, group_concat(`roles`.`name`) as `rolenames` FROM `roles`) as uroles ON `assigned_roles`.`role_id` = `uroles`.`id`
Using just regular joins shows me what I want but won't list users who do not have any assigned roles, so it's not complete.
I'll keep plugging away but I thought stack could help, hopefully I'll learn a bit more about joins today.
Thank you.
For GROUP_CONCAT to work in this scenario, you'll need a GROUP BY to get the group info per user, something like:
SELECT u.id, u.displayname, u.email, GROUP_CONCAT(r.name) rolenames
FROM users u
LEFT JOIN assigned_roles ar ON ar.user_id = u.id
LEFT JOIN roles r ON r.id = ar.role_id
GROUP BY u.id, u.displayname, u.email
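A minimal sketch of the GROUP BY plus GROUP_CONCAT approach, run through Python's sqlite3 module (SQLite also provides group_concat, so it serves as a stand-in for MySQL here). The sample users and roles are made up; the table and column names follow the question.

```python
import sqlite3

# Hypothetical sample data: alice has two roles, bob one, carol none.
SETUP = """
CREATE TABLE users (id INTEGER PRIMARY KEY, displayname TEXT, email TEXT);
CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE assigned_roles (user_id INTEGER, role_id INTEGER);
INSERT INTO users VALUES (1,'alice','a@example.com'),
                         (2,'bob','b@example.com'),
                         (3,'carol','c@example.com');
INSERT INTO roles VALUES (1,'admin'),(2,'editor');
INSERT INTO assigned_roles VALUES (1,1),(1,2),(2,2);
"""

# One row per user; the LEFT JOINs keep users that have no role assigned.
SQL = """
SELECT u.displayname, GROUP_CONCAT(r.name) AS rolenames
FROM users u
LEFT JOIN assigned_roles ar ON ar.user_id = u.id
LEFT JOIN roles r ON r.id = ar.role_id
GROUP BY u.id, u.displayname
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
rows = conn.execute(SQL).fetchall()

# The concatenation order is not guaranteed, so split and sort before comparing.
roles_by_user = {
    name: sorted(rolenames.split(",")) if rolenames is not None else []
    for name, rolenames in rows
}
print(roles_by_user)
```

A user without roles gets a NULL rolenames value rather than an empty string, which the sketch maps to an empty list.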

MySQL JOIN ON the one of two columns that doesn't match variable

Let's jump right into it: I've got two simple tables set up in my MySQL database, a users table and a matches table. The users table holds, well, users. The matches table is meant to establish many-to-many connections between users and contains just two userID's.
What I want to query is a list of names of all matched users for the user with userID 1, but I can't wrap my head around it. The problem is that the userID (in this case 1) could be in either field and I don't know in advance which one.
Just to clarify; I mean something like this (please don't mind the weird pseudo-code):
SELECT users.name
FROM matches
INNER JOIN users
ON userId = (userId1 OR userId2 DEPENDS ON WHERE)
WHERE userId1 = '1'
OR userId2 = '1';
Could you please tell me if this is possible with MySQL and if so, what I should look for/if you would be so kind, give a simple example.
Thanks a lot.
The use of OR in a join condition often prevents MySQL from using an index. The use of union or union all makes the query rather cumbersome. You can do what you want with left outer join:
SELECT coalesce(u1.name, u2.name) as name
FROM matches m LEFT JOIN
users u1
ON u1.userId = m.userId1 AND m.userId2 = '1' LEFT JOIN
users u2
ON u2.userId = m.userId2 AND m.userId1 = '1'
WHERE '1' in (m.userId1, m.userId2);
This should take advantage of indexes on users for looking up the values. If you want distinct names, then add the distinct keyword.
Try:
SELECT DISTINCT u.name
FROM matches m
INNER JOIN users u
ON (u.userId = m.userId1 AND m.userId2 = '1')
OR (u.userId = m.userId2 AND m.userId1 = '1')
Added DISTINCT to avoid duplicate rows.
See this fiddle.
Here's one way to do it that avoids excessive JOIN logic (to make sure SQL can use indexes on users.userId, matches.userId1, matches.userId2)
SELECT u.`name`
FROM
matches AS m
JOIN users AS u
ON m.userId1=u.userId
AND m.userId2='1'
UNION
SELECT u.`name`
FROM
matches AS m
JOIN users AS u
ON m.userId2=u.userId
AND m.userId1='1'
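The UNION approach above can be tried out on a few made-up rows with the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL and checks the result against the OR-based join. The sample data and the :me parameter name are assumptions; the table and column names follow the question.

```python
import sqlite3

# Hypothetical sample data: user 1 appears once in each column of matches.
SETUP = """
CREATE TABLE users (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE matches (userId1 INTEGER, userId2 INTEGER);
INSERT INTO users VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D');
INSERT INTO matches VALUES (1,2),(3,1),(2,4);
"""

# One branch per column the searched user can sit in; UNION removes duplicates.
UNION_SQL = """
SELECT u.name FROM matches m
JOIN users u ON m.userId1 = u.userId AND m.userId2 = :me
UNION
SELECT u.name FROM matches m
JOIN users u ON m.userId2 = u.userId AND m.userId1 = :me
"""

# Single join with an OR condition, for comparison.
OR_JOIN_SQL = """
SELECT DISTINCT u.name FROM matches m
JOIN users u
  ON (u.userId = m.userId1 AND m.userId2 = :me)
  OR (u.userId = m.userId2 AND m.userId1 = :me)
"""

def matched_names(conn, sql, me):
    """Return the sorted names of all users matched with user `me`."""
    return sorted(row[0] for row in conn.execute(sql, {"me": me}))

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
union_names = matched_names(conn, UNION_SQL, 1)
or_join_names = matched_names(conn, OR_JOIN_SQL, 1)
print(union_names, or_join_names)
```

Both forms return the same names; which one uses indexes better on MySQL is something to confirm with EXPLAIN on the real tables.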
Something like this:
Select UserId1, UserId2
From Matches
Where UserId1 = 1
Union
Select UserId2, UserId1
From Matches
Where UserId2 = 1
Notice that the order of the UserIds has been swapped in the second Select clause. This gives you a single list of matches with your searched user '1' in one column and all their matches in the other.
This approach will require you then link in your users table as follows:
Select searchmatches.UserId1, searchmatches.UserId2, leftuser.Name, rightuser.name
From (
Select UserId1, UserId2
From Matches
Where UserId1 = 1
Union
Select UserId2, UserId1
From Matches
Where UserId2 = 1
) searchmatches
inner join users leftuser on searchmatches.UserId1 = leftuser.UserId
inner join users rightuser on searchmatches.UserId2 = rightuser.UserId
Hope that Helps! If you want you can remove one of the inner joins to the users table as you know who the left user is as you searched on them!

SQL query to exclude one to many that have a specific value?

Using MySQL, I'd like to list all users that don't have the document "liaison". That could mean users that do not have any document at all, or users that have documents, none of which is "liaison".
How can I do this with a MySQL query? I can't make it work!
Here's the (simple) model
Users (id, name)
Documents (id, user_id, name, path)
The NOT EXISTS is a workable solution. As an alternative, sometimes, with large sets, an "anti JOIN" operation can give better performance:
SELECT u.*
FROM Users u
LEFT
JOIN (SELECT d.user_id
FROM Documents d
WHERE d.name = 'liaison'
) l
ON l.user_id = u.id
WHERE l.user_id IS NULL
The inline view aliased as l returns us a list of user_id that have document named 'liaison'; that result set gets outer joined to the Users table, and then we exclude any rows where we found a match (the test of l.user_id IS NULL).
This returns a resultset equivalent to your query with the NOT EXISTS predicate.
Another alternative is to use a query with a NOT IN predicate. Note that we need to guarantee that the subquery does not return a NULL, so the general approach is to include an IS NOT NULL predicate on the column being returned by the subquery.
SELECT u.*
FROM Users u
WHERE u.id NOT IN
( SELECT d.user_id
FROM Documents d
WHERE d.user_id IS NOT NULL
AND d.name = 'liaison'
)
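The NULL caveat on NOT IN can be reproduced with a few made-up rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (both follow the standard three-valued logic here) and includes one document whose user_id is NULL, which is an assumption added purely to trigger the problem.

```python
import sqlite3

# Hypothetical sample data; document 3 has no owner (user_id is NULL).
SETUP = """
CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Documents (id INTEGER PRIMARY KEY, user_id INTEGER,
                        name TEXT, path TEXT);
INSERT INTO Users VALUES (1,'ann'),(2,'bob'),(3,'cy');
INSERT INTO Documents VALUES (1,1,'liaison','/a'),(2,2,'report','/b'),
                             (3,NULL,'liaison','/c');
"""

QUERIES = {
    # A NULL in the subquery result makes NOT IN unknown for every other id.
    "not_in_unguarded": """
        SELECT u.id FROM Users u
        WHERE u.id NOT IN (SELECT d.user_id FROM Documents d
                           WHERE d.name = 'liaison')""",
    # Filtering NULLs out of the subquery restores the intended behaviour.
    "not_in_guarded": """
        SELECT u.id FROM Users u
        WHERE u.id NOT IN (SELECT d.user_id FROM Documents d
                           WHERE d.user_id IS NOT NULL
                             AND d.name = 'liaison')""",
    # NOT EXISTS is not affected by NULLs in Documents.user_id.
    "not_exists": """
        SELECT u.id FROM Users u
        WHERE NOT EXISTS (SELECT 1 FROM Documents d
                          WHERE d.name = 'liaison' AND d.user_id = u.id)""",
    # The anti-join pattern: outer join, then keep only the unmatched rows.
    "anti_join": """
        SELECT u.id FROM Users u
        LEFT JOIN (SELECT d.user_id FROM Documents d
                   WHERE d.name = 'liaison') l ON l.user_id = u.id
        WHERE l.user_id IS NULL""",
}

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
results = {
    label: sorted(row[0] for row in conn.execute(sql))
    for label, sql in QUERIES.items()
}
for label, ids in results.items():
    print(label, ids)
```

With the unguarded NOT IN, no user is returned at all, whereas the other three forms agree on the users without a 'liaison' document.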
I'd write the NOT EXISTS query like this:
SELECT u.*
FROM Users u
WHERE NOT EXISTS
( SELECT 1
FROM Documents d
WHERE d.name = 'liaison'
AND d.user_id = u.id
)
(My personal preference is to use a literal 1 in the SELECT list of that correlated subquery; it reminds me that the query is just looking for the existence of 1 row.)
Again, I usually find that the "anti-join" pattern gives the best performance with large sets. (You'd need to look at the EXPLAIN output for each statement, and measure the performance of each to determine which will work best in your situation.)
The correct query you are looking for is:
SELECT
*
FROM
Users
WHERE
id NOT IN (
SELECT
user_id
FROM
Documents
WHERE
name = "liaison"
)
This will achieve the exact result you are looking for. If a specific user has no documents, it will be listed. If it has many documents, and one of those is 'liaison', it won't be listed.
If you want to search for 'liaison' anywhere in your document's name, replace name = "liaison" with name LIKE "%liaison%".
It basically says: select all users such that there are no documents with name "liaison" pointing to them.
So, I finally came up with this solution that seems to work well:
SELECT * FROM users u WHERE id NOT IN (SELECT DISTINCT user_id FROM user_documents WHERE name = 'LIAISON') ORDER BY u.lastname, u.firstname
SELECT users.*
FROM users left join Documents
on users.id = Documents.user_id
and documents.name='LIAISON'
WHERE documents.user_id is null
select * from Users where not exists (select id from Documents where Users.id = Documents.user_id and Documents.name = 'liaison')
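Try:
SELECT u.*
FROM users u LEFT JOIN documents d ON d.user_id = u.id AND d.name LIKE '%liaison%'
WHERE d.id IS NULL
The name test belongs in the ON clause: filtering with d.name NOT LIKE '%liaison%' in the WHERE clause would still return users who have a "liaison" document alongside any other document. Use d.name = 'liaison' instead of LIKE if "liaison" is the exact name of the document.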