SQL query to find users with apps with no releases - mysql

In my database, I have users, apps, and releases. A user can have 0..n apps through a permissions table and an app can have 0..n releases.
I'm trying to get a list of users who have at least 1 app, but none of that user's apps have any releases.
The schema is roughly
users    permissions    apps    releases
-----    -----------    ----    --------
id       user_id        id      id
email    app_id                 app_id
I think I've got something working with this, but it appears inefficient to me because I mention the permissions table twice and I'm using nested exists clauses. Is there a more efficient way to write this query?
select u.email from users u
join permissions p on p.user_id = u.id
where not exists (
select a.id from apps a
join permissions p on p.app_id = a.id
where p.user_id = u.id and exists (
select r.id from releases r
where r.app_id = a.id
)
);

You just need to use a LEFT JOIN on releases, and then look for the case where the number of released apps (r.app_id is non-NULL) is 0. If all you want is a list of users, I don't think you need to JOIN the apps table at all, as JOINing on permissions will ensure that only users that have permission for 1 or more apps are included.
SELECT u.email
FROM users u
JOIN permissions p ON p.user_id = u.id
LEFT JOIN releases r ON r.app_id = p.app_id
GROUP BY u.email
HAVING COUNT(r.app_id) = 0
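To see how the COUNT(r.app_id) = 0 filter behaves, here is a minimal sketch that runs the query above from Python against an in-memory SQLite database. SQLite is only a stand-in for MySQL here, and the sample rows and email addresses are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE apps (id INTEGER PRIMARY KEY);
CREATE TABLE permissions (user_id INTEGER, app_id INTEGER);
CREATE TABLE releases (id INTEGER PRIMARY KEY, app_id INTEGER);

-- Hypothetical sample data:
-- alice: one app (1) with no releases         -> should be listed
-- bob:   app 2 (two releases), app 3 (none)   -> should not be listed
-- carol: no apps at all                       -> should not be listed
INSERT INTO users VALUES (1, 'alice@example.com'), (2, 'bob@example.com'), (3, 'carol@example.com');
INSERT INTO apps VALUES (1), (2), (3);
INSERT INTO permissions VALUES (1, 1), (2, 2), (2, 3);
INSERT INTO releases VALUES (10, 2), (11, 2);
""")

query = """
SELECT u.email
FROM users u
JOIN permissions p ON p.user_id = u.id
LEFT JOIN releases r ON r.app_id = p.app_id
GROUP BY u.email
HAVING COUNT(r.app_id) = 0
"""
emails = [row[0] for row in conn.execute(query)]
print(emails)
```

COUNT(r.app_id) skips the NULLs produced by the LEFT JOIN, so a group only reaches 0 when none of that user's apps matched a release.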

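The first join between the users and permissions tables is correct. You then just need to check whether the app_id from the joined result set exists in the releases table. Note that this tests each app separately, so a user is returned once for every app of theirs that has no releases, even if another of their apps does have releases. You can try this query: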
select u.email from users u
join permissions p on p.user_id = u.id
where not exists ( Select 1 from releases r where r.App_id = p.app_id)
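A quick check of this query on sample data, sketched with Python's sqlite3 module as a stand-in for MySQL (all rows are hypothetical). It shows the query testing each permission row on its own: bob, who has one released and one unreleased app, is still returned. The second query is one possible stricter variant, not part of the answer above, that rejects a user as soon as any of their apps has a release.

```python
import sqlite3

# SQLite in memory stands in for MySQL; all rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE permissions (user_id INTEGER, app_id INTEGER);
CREATE TABLE releases (id INTEGER PRIMARY KEY, app_id INTEGER);
INSERT INTO users VALUES (1, 'alice@example.com'), (2, 'bob@example.com');
-- alice has app 1 (no releases); bob has app 2 (released) and app 3 (no releases)
INSERT INTO permissions VALUES (1, 1), (2, 2), (2, 3);
INSERT INTO releases VALUES (10, 2);
""")

# The query from the answer: the check runs once per (user, app) row.
per_app = """
SELECT u.email FROM users u
JOIN permissions p ON p.user_id = u.id
WHERE NOT EXISTS (SELECT 1 FROM releases r WHERE r.app_id = p.app_id)
"""

# Assumed stricter variant: the check looks at all apps of the user.
per_user = """
SELECT DISTINCT u.email FROM users u
JOIN permissions p ON p.user_id = u.id
WHERE NOT EXISTS (
    SELECT 1 FROM permissions p2
    JOIN releases r ON r.app_id = p2.app_id
    WHERE p2.user_id = u.id
)
"""

per_app_emails = sorted(row[0] for row in conn.execute(per_app))
per_user_emails = sorted(row[0] for row in conn.execute(per_user))
print(per_app_emails)
print(per_user_emails)
```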

I would do something like this; hope it helps:
SELECT
u.id, u.email
FROM
users AS u
INNER JOIN
permissions AS p ON p.user_id = u.id
LEFT JOIN
releases AS r ON r.app_id = p.app_id
GROUP BY
u.id, u.email
HAVING
SUM(CASE WHEN r.id IS NOT NULL THEN 1 ELSE 0 END) = 0

Another thing you could try is a combination of left and inner joins like this:
Select
email
From users u
Inner Join (
Select
p.user_id
, p.app_id
From permissions p
Left Join releases r
on p.app_id = r.app_id
Where r.app_id is null) a
on u.id = a.user_id
Group by email
It's hard to tell which is faster between this and the previously posted solutions without knowing the sizes of the different tables (and hence how many rows the database has to join).
One thing is clear: without the 'Group by email' line at the end, a user's email can be repeated in your list, once for each of their apps that has no releases. You may read that a "group by" at the end of a query is a faster way to get a distinct set than a "select distinct" at the beginning, but MySQL generally treats DISTINCT as a special case of GROUP BY, so measure both on your own data before relying on that.

Related

Get users with followers, and without followers

I have a really simple table - follow - in which I store followers.
user | following
-----------------
1 | 2
The above means user 1 is following user 2.
I want to display all users on the home page and order them by who has the most followers, and then return the rest of the users who have no followers. The query below works as far as displaying the users, but I can't figure out how to retrieve the users who do not have any followers. I've tried RIGHT JOIN users u ON f.following=u.id but that gives me weird results.
This query returns user 2 who has a follower, but doesn't return users 1 and 3, who do not have followers.
Edit: this query is also checking to see if the user is following back, which is why I'm joining using the ID of 1 as a test.
SELECT
u.id
,u.username
,u.avatar
,COUNT(1) AS followers
,ul.*
,fo.*
FROM follow f
LEFT JOIN users u ON f.following=u.id
LEFT JOIN follow fo ON fo.following=u.id AND fo.user=1
LEFT JOIN users_likes ul ON ul.likes=u.id AND ul.user=1
GROUP BY f.following
ORDER BY COUNT(1) DESC
SQL Fiddle: http://sqlfiddle.com/#!2/98f65/1
The problem with the query in your question is that follow is the left-hand table of your LEFT JOINs. That means every row of the follow table is kept whether or not it matches a row in another table, while users that nobody follows never appear. What you want is to show all users, so users is the table that should be on the preserved (left) side of the join.
I also think you're trying to do too many things at once here, which is why you're having trouble figuring it out. You want to know who has followers and who doesn't, who's following back, order them, consider the users_likes and so on. I recommend taking a step back and breaking them down into individual queries, and then building those into one result set as needed.
To get the users and number of followers, you can outer join the users table with the follow table like this:
SELECT u.id, u.username, u.avatar, (IFNULL(COUNT(f.following), 0)) AS numFollowers
FROM users u
LEFT JOIN follow f ON f.following = u.id
GROUP BY u.id
ORDER BY numFollowers DESC;
COUNT(f.following) only counts non-NULL values, so users with no followers, for whom the outer join produces a NULL, already get 0. The IFNULL wrapper is a harmless safeguard rather than a requirement.
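To check how the counts come out for users without followers, here is a small sketch using Python's sqlite3 module in place of MySQL. The usernames are invented, and a second follower of user 2 is added to the sample row from the question. It compares counting a column of the outer-joined table with counting rows via COUNT(1), as the original query did.

```python
import sqlite3

# SQLite in memory stands in for MySQL; usernames are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
-- "following" is quoted because FOLLOWING is a keyword in SQLite
CREATE TABLE follow (user INTEGER, "following" INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
-- users 1 and 3 both follow user 2
INSERT INTO follow VALUES (1, 2), (3, 2);
""")

rows = conn.execute("""
SELECT u.id,
       COUNT(f."following") AS by_column,  -- ignores the NULLs of unmatched users
       COUNT(1)             AS by_row      -- counts the NULL-padded row as well
FROM users u
LEFT JOIN follow f ON f."following" = u.id
GROUP BY u.id
ORDER BY by_column DESC, u.id
""").fetchall()
print(rows)
```

Users 1 and 3 get 0 from the column count but 1 from COUNT(1), which is why counting rows makes every user look as if they had at least one follower.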
If you want to work in the users_likes table, you should add it as another left join. The problem this causes is that it returns NULL in every users_likes column when there are no likes. (For example, if I left join the users_likes table here, I will see NULL for users 1 and 3 because nobody 'likes' them.) To make the result set a little easier to read, I recommend you don't select every column of the users_likes table. Perhaps this query would make more sense:
-- DISTINCT keeps the follower count correct when a user also has several likes
SELECT u.id, u.username, u.avatar, (IFNULL(COUNT(DISTINCT f.user), 0)) AS numFollowers, ul.user AS likedByUser, ul.created_at
FROM users u
LEFT JOIN follow f ON f.following = u.id
LEFT JOIN users_likes ul ON ul.likes = u.id
GROUP BY u.id
ORDER BY numFollowers DESC;
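One thing to watch when a second LEFT JOIN is added: every follower row is paired with every like row of the same user, so a plain count of joined rows can be inflated. A rough sketch with Python's sqlite3 module (standing in for MySQL, with hypothetical rows) compares a plain COUNT with COUNT(DISTINCT f.user):

```python
import sqlite3

# SQLite in memory stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
-- "following" is quoted because FOLLOWING is a keyword in SQLite
CREATE TABLE follow (user INTEGER, "following" INTEGER);
CREATE TABLE users_likes (user INTEGER, likes INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
-- user 2 has two followers and two likes
INSERT INTO follow VALUES (1, 2), (3, 2);
INSERT INTO users_likes VALUES (1, 2), (3, 2);
""")

rows = conn.execute("""
SELECT u.id,
       COUNT(f."following")   AS plain_count,
       COUNT(DISTINCT f.user) AS distinct_followers
FROM users u
LEFT JOIN follow f ON f."following" = u.id
LEFT JOIN users_likes ul ON ul.likes = u.id
GROUP BY u.id
ORDER BY u.id
""").fetchall()
print(rows)
```

For user 2 the two followers are combined with the two likes into four joined rows, so the plain count reports 4 while the distinct count stays at 2.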
As far as whether or not a user is following back, I think this would change a bit, as the above only shows the number of followers, and doesn't produce a row for each follower.
Let me know if you have any more questions. Here is an SQL Fiddle for the above. I will leave handling the null values that occur right now up to you.
You can use an outer join (left or right) from users to your current query in any number of ways. Here is an easy example that should get you started. This isn't a cleaned-up solution, just a demo of one way that will work.
SELECT a.*
,b.*
FROM users a
LEFT JOIN (
SELECT
u.id
,u.username
,u.avatar
,COUNT(1) AS followers
FROM follow f
LEFT JOIN users u ON f.following=u.id
LEFT JOIN follow fo ON fo.following=u.id AND fo.user=1
LEFT JOIN users_likes ul ON ul.likes=u.id AND ul.user=1
GROUP BY f.following
) b
ON a.id = b.id
ORDER BY followers DESC
You can do this:
SELECT * FROM (
SELECT u.id, u.username, u.avatar, COUNT(f.user) as followers
FROM users AS u
LEFT JOIN follow AS f ON u.id = f.following
GROUP BY u.id
) AS subselect ORDER BY subselect.followers DESC

Paginating a large user database with joins

I have a WordPress user database table with 100,000+ users. As part of a plugin I need to list the subscribers. Obviously, a list of 100,000 users needs to be paginated. To get the total number of users to work out the pagination, I am running the main query without a limit and doing a PHP count() on the results:
SELECT role.umeta_id, role.user_id, role.meta_key, role.meta_value role, u.ID, u.user_login, u.user_email, u.user_registered
FROM wp_users AS u
LEFT JOIN wp_usermeta role ON role.user_id = u.ID
AND role.meta_key = 'wp_capabilities'
WHERE role.meta_value LIKE '%subscriber%'
GROUP BY u.ID
ORDER BY u.ID ASC
I am (unsurprisingly) running out of memory doing this. I have tried just doing a count similar to
SELECT COUNT( u.ID )
FROM wp_users AS u
LEFT JOIN wp_usermeta role ON role.user_id = u.ID
AND role.meta_key = 'wp_capabilities'
WHERE role.meta_value LIKE '%subscriber%'
GROUP BY u.ID
ORDER BY u.ID ASC
but rather than returning a single value, this returns rows and rows of count = 1.
I know that there are get_user functions in WordPress to do this; I am just using this as a simplified example (the query is actually more complex).
So the question is "How can I efficiently get the total number of rows in such a situation as this?"
The problem with your query is that you're grouping by u.ID, and COUNT is an aggregate function, so it is evaluated once per group: you get one row per user, each with a count of 1.
Edited:
I suggest getting rid of the GROUP BY and the ORDER BY, which leaves you with this:
SELECT COUNT( u.ID )
FROM wp_users AS u
LEFT JOIN wp_usermeta role ON role.user_id = u.ID
AND role.meta_key = 'wp_capabilities'
WHERE role.meta_value LIKE '%subscriber%'
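A small sketch of the count-then-paginate pattern, run on SQLite from Python rather than on MySQL. The tables are a cut-down, hypothetical version of wp_users and wp_usermeta, the meta_value format is simplified, and the page size is arbitrary. It also shows why the grouped COUNT returns one row per user.

```python
import sqlite3

# SQLite in memory stands in for MySQL; schema and data are simplified assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_users (ID INTEGER PRIMARY KEY, user_login TEXT);
CREATE TABLE wp_usermeta (
    umeta_id INTEGER PRIMARY KEY, user_id INTEGER, meta_key TEXT, meta_value TEXT
);
""")
# Seven hypothetical users; user 4 is an administrator, the rest are subscribers.
for uid in range(1, 8):
    role = "administrator" if uid == 4 else "subscriber"
    conn.execute("INSERT INTO wp_users VALUES (?, ?)", (uid, f"user{uid}"))
    conn.execute(
        "INSERT INTO wp_usermeta (user_id, meta_key, meta_value) "
        "VALUES (?, 'wp_capabilities', ?)",
        (uid, f'["{role}"]'),
    )

# Shared FROM/WHERE part, copied from the queries above.
base = """
FROM wp_users AS u
LEFT JOIN wp_usermeta role ON role.user_id = u.ID
    AND role.meta_key = 'wp_capabilities'
WHERE role.meta_value LIKE '%subscriber%'
"""

# With GROUP BY: one row per user, each holding a count of 1.
grouped = conn.execute("SELECT COUNT(u.ID) " + base + " GROUP BY u.ID").fetchall()
# Without GROUP BY: a single row holding the total.
total = conn.execute("SELECT COUNT(u.ID) " + base).fetchone()[0]

per_page, page = 4, 2
total_pages = -(-total // per_page)  # ceiling division
page_ids = [
    row[0]
    for row in conn.execute(
        "SELECT u.ID " + base + " ORDER BY u.ID ASC LIMIT ? OFFSET ?",
        (per_page, (page - 1) * per_page),
    )
]
print(total, total_pages, page_ids)
```

Only the count and one page of rows ever reach the application, so memory use no longer grows with the size of the user table.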

MySQL query optimization: Multiple SELECT IN to LEFT JOIN

I usually go with the join approach, but in this case I am a bit confused. I am not even sure that it is possible at all. I wonder if the following query can be converted to a left join query instead of using multiple nested SELECT ... IN subqueries:
select
users.id, users.first_name, users.last_name, users.description, users.email
from users
where id in (
select assigned.id_user from assigned where id_project in (
select assigned.id_project from assigned where id_user = 1
)
)
or id in (
select projects.id_user from projects where projects.id in (
select assigned.id_project from assigned where id_user = 1
)
)
This query returns the correct result set. However, I guess the repetition of the query that selects assigned.id_project is a waste.
You could start with the project assignments of user 1 (a1). Then find all assignments of other people to those projects (a2), and the user in the projects table (p). The users you are looking for are then in either a2 or p. I added distinct to remove duplicates of users who can be reached in both ways.
select distinct u.*
from assigned a1
left join
assigned a2
on a1.id_project = a2.id_project
left join
projects p
on a1.id_project = p.id
join users u
on u.id = a2.id_user
or u.id = p.id_user
where a1.id_user = 1
Since both subqueries have a condition where assigned.id_user = 1, I start with that query. Let's call that assignment(s) the 'leading assignment'.
Then join the rest, using left joins for the 'optional' tables.
Use an inner join on user that matches either users of assignments linked to the leading assignment or users of projects linked to the leading project.
I use distinct because I assume you'd want each user once, even if they have an assignment and a project (or multiple projects).
select distinct
u.id, u.first_name, u.last_name, u.description, u.email
from
assigned a
left join assigned ap on ap.id_project = a.id_project
left join projects p on p.id = a.id_project
inner join users u on u.id = ap.id_user or u.id = p.id_user
where
a.id_user = 1
Here's an alternative way to get rid of the repetition:
SELECT
users.id,
users.first_name,
users.last_name,
users.description,
users.email
FROM users
WHERE id IN (
SELECT up.id_user
FROM (
SELECT id_user, id_project FROM assigned
UNION ALL
SELECT id_user, id FROM projects
) up
INNER JOIN assigned a
ON a.id_project = up.id_project
WHERE a.id_user = 1
)
;
That is, the (id_user, id_project) pairs of the assigned table are UNIONed with those of projects. The resulting set is then joined with the projects assigned to the user with id_user = 1 to obtain the list of all users who share projects with that user. It then only remains to retrieve the details for those users, which in this case is done in the same way as in your query, i.e. using an IN clause.
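As a sanity check, the nested IN query from the question and the UNION ALL rewrite can be compared on a small data set. The sketch below does that with Python's sqlite3 module as a stand-in for MySQL; the rows are hypothetical and the users table is reduced to two columns.

```python
import sqlite3

# SQLite in memory stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE projects (id INTEGER PRIMARY KEY, id_user INTEGER);
CREATE TABLE assigned (id_user INTEGER, id_project INTEGER);
INSERT INTO users VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cat'), (4, 'Dan'), (5, 'Eve');
-- Project 10 is owned by user 3; users 1 and 2 are assigned to it.
-- Project 20 is owned by user 5; only user 4 is assigned to it.
INSERT INTO projects VALUES (10, 3), (20, 5);
INSERT INTO assigned VALUES (1, 10), (2, 10), (4, 20);
""")

nested_in = """
SELECT users.id FROM users
WHERE id IN (
    SELECT assigned.id_user FROM assigned WHERE id_project IN (
        SELECT assigned.id_project FROM assigned WHERE id_user = 1))
OR id IN (
    SELECT projects.id_user FROM projects WHERE projects.id IN (
        SELECT assigned.id_project FROM assigned WHERE id_user = 1))
ORDER BY users.id
"""

union_all = """
SELECT users.id FROM users
WHERE id IN (
    SELECT up.id_user
    FROM (
        SELECT id_user, id_project FROM assigned
        UNION ALL
        SELECT id_user, id FROM projects
    ) up
    INNER JOIN assigned a ON a.id_project = up.id_project
    WHERE a.id_user = 1)
ORDER BY users.id
"""

nested_ids = [row[0] for row in conn.execute(nested_in)]
union_ids = [row[0] for row in conn.execute(union_all)]
print(nested_ids, union_ids)
```

Both queries return users 1, 2 and 3 here. Agreement on one data set says nothing about speed, so the performance caveat still applies.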
I'm sorry to say that I don't have MySQL to thoroughly test the performance of this query and so cannot be quite sure if it is in any way better or worse than your original query or than the one suggested both by @GolezTrol and by @Andomar. Generally I tend to agree with @GolezTrol's comment that a query with simple (semi- or whatever-) joins and repetitive parts might turn out more efficient than an equivalent sophisticated query that doesn't have repetitions. In the end, however, it is testing that must reveal the final answer for you.

SQL query not returning unique results. Which type of join do I need to use?

I'm trying to run the following MySQL query:
SELECT *
FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE (uc.category_id = 3 OR uc.category_id = 1)
It currently returns:
Joe,Smith,60657,male
Joe,Smith,60657,male
Mickey,Mouse,60613,female
Petter,Pan,60625,male
Petter,Pan,60625,male
Donald,Duck,60615,male
If the user belongs to both categories it currently returns them twice. How can I return the user only once without using SELECT DISTINCT, regardless of how many categories they belong to?
You need a semi join. This can be achieved with a sub query.
SELECT *
FROM user u
WHERE EXISTS(SELECT *
FROM user_categories uc
WHERE u.user_id = uc.user_id AND
uc.category_id IN(1,3))
In MySQL, subquery performance has historically been quite problematic, however, so a JOIN with duplicate elimination via DISTINCT or GROUP BY may perform better.
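The difference between the plain join and the semi join is easy to see on a few rows. This is a minimal sketch using Python's sqlite3 module instead of MySQL, with a trimmed-down user table and hypothetical rows (Goofy is added as a user in neither category).

```python
import sqlite3

# SQLite in memory stands in for MySQL; the schema is trimmed and the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE user_categories (user_id INTEGER, category_id INTEGER);
INSERT INTO user VALUES (1, 'Joe'), (2, 'Mickey'), (3, 'Goofy');
-- Joe is in categories 1 and 3, Mickey only in 3, Goofy only in 2
INSERT INTO user_categories VALUES (1, 1), (1, 3), (2, 3), (3, 2);
""")

# Plain join: one output row per matching category row.
join_names = [row[0] for row in conn.execute("""
SELECT u.first_name
FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE uc.category_id IN (1, 3)
ORDER BY u.user_id
""")]

# Semi join: each user is tested once, so it appears at most once.
exists_names = [row[0] for row in conn.execute("""
SELECT u.first_name
FROM user u
WHERE EXISTS (SELECT *
              FROM user_categories uc
              WHERE u.user_id = uc.user_id
                AND uc.category_id IN (1, 3))
ORDER BY u.user_id
""")]
print(join_names)
print(exists_names)
```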
I don't know about MySQL, but in Postgres you may get better performance in the semi-join version from
SELECT * FROM user u
WHERE u.user_id
IN (SELECT user_id FROM user_categories uc WHERE uc.category_id IN (1,3));
I would expect SELECT DISTINCT to run fastest but I have learned my expectations and DB performance are often much different!
Try using a GROUP BY (selecting only the user columns, since the category columns are not part of the group):
SELECT u.* FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE uc.category_id = 3 OR uc.category_id = 1
GROUP BY u.user_id

multi-table mysql query

I am trying to make a multi-table query that I am not quite sure how to do properly. I have User, Message, Thread, and Project.
A User is associated with Message/Thread/Project as either the Creator or as it being 'shared' with them.
A Message is contained within a Thread (associated by message.thread_id and thread.id), and a Thread is contained within a Project (associated by thread.project_id and project.id).
I would like to create a query where given a User.id value, it will return all messages that the user has access to, as well as the Thread and Project name that that message is under, both as Creator or 'Shared'. I use a table to handle the 'shares'. The rough diagram is: http://min.us/mvpqbAU
There are more columns in each, but I left them out for simplicity.
I've made some assumptions on column names for message, project_name and thread_name as they are not included in the diagram.
/* Get messages where user is creator */
select u.name, m.message, p.project_name, t.thread_name
from user u
inner join message m
on u.id = m.owner_user_id
inner join thread t
on m.group_id = t.id
inner join project p
on t.project_id = p.id
where u.id = @YourUserID
union
/* Get messages where user has shared access */
select u.name, m.message, p.project_name, t.thread_name
from user u
inner join message_share ms
on u.id = ms.user_id
inner join message m
on ms.message_id = m.id
and m.owner_user_id <> @YourUserID
inner join thread t
on m.group_id = t.id
inner join project p
on t.project_id = p.id
where u.id = @YourUserID
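Here is a rough way to try the UNION query on sample data, using Python's sqlite3 module in place of MySQL. The column names follow the assumptions made in the answer, the rows are invented, and the user ID is passed as a named parameter (:uid) instead of a MySQL variable.

```python
import sqlite3

# SQLite in memory stands in for MySQL; schema follows the answer's assumed column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project (id INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE thread (id INTEGER PRIMARY KEY, project_id INTEGER, thread_name TEXT);
CREATE TABLE message (id INTEGER PRIMARY KEY, owner_user_id INTEGER, group_id INTEGER, message TEXT);
CREATE TABLE message_share (message_id INTEGER, user_id INTEGER);
INSERT INTO user VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO project VALUES (1, 'Alpha');
INSERT INTO thread VALUES (1, 1, 'General');
-- Message 1 is Ann's own, message 2 is Ben's and shared with Ann, message 3 is Ben's only.
INSERT INTO message VALUES (1, 1, 1, 'written by Ann'),
                           (2, 2, 1, 'shared with Ann'),
                           (3, 2, 1, 'private to Ben');
INSERT INTO message_share VALUES (2, 1);
""")

query = """
SELECT u.name, m.message, p.project_name, t.thread_name
FROM user u
INNER JOIN message m ON u.id = m.owner_user_id
INNER JOIN thread t ON m.group_id = t.id
INNER JOIN project p ON t.project_id = p.id
WHERE u.id = :uid
UNION
SELECT u.name, m.message, p.project_name, t.thread_name
FROM user u
INNER JOIN message_share ms ON u.id = ms.user_id
INNER JOIN message m ON ms.message_id = m.id AND m.owner_user_id <> :uid
INNER JOIN thread t ON m.group_id = t.id
INNER JOIN project p ON t.project_id = p.id
WHERE u.id = :uid
"""
# Messages visible to user 1 (Ann), sorted in Python for a stable order.
messages = sorted(row[1] for row in conn.execute(query, {"uid": 1}))
print(messages)
```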
I would urge you not to proceed with this design.
You have too many permission levels spread over individual areas. I would suggest that you alter it so members have groups, which in turn can be part of projects and threads. At a message level, you are looking at an administration nightmare if you keep that kind of permissions structure on individual messages.