How to exclude rows when using a LEFT JOIN (MySQL)

I have users with many posts. I want to build an SQL query that would do the following in 1 query (no subquery), and hopefully no unions if possible. I know I can do this with union but I want to learn if this can be done using only joins.
I want to get a list of distinct active users who:
have no posts
have no approved posts
Here's what I have so far:
SELECT DISTINCT u.*
FROM users u
LEFT JOIN posts p
ON p.user_id = u.id
LEFT JOIN posts p2
ON p2.user_id = u.id
WHERE u.status = 'active'
AND (p.status IS NULL
OR p2.status != 'approved');
The problem is when a user has multiple posts and one of them is approved: the query still returns that user, which I do not want. If a user has an approved post, they should be removed from the result set. Any ideas?
Here's what the data looks like:
mysql> select * from users;
+----+---------+
| id | status |
+----+---------+
| 1 | active |
| 2 | pending |
| 3 | pending |
| 4 | active |
| 5 | active |
+----+---------+
5 rows in set (0.00 sec)
mysql> select * from posts;
+----+---------+----------+
| id | user_id | status |
+----+---------+----------+
| 1 | 1 | approved |
| 2 | 1 | pending |
| 3 | 4 | pending |
+----+---------+----------+
3 rows in set (0.00 sec)
The answer here should be only users 4 and 5. 4 doesn't have an approved post and 5 doesn't have a post. It should not include 1, which has an approved post.

Use NOT EXISTS:
SELECT u.*
FROM users u
WHERE u.status = 'active'
AND NOT EXISTS (
SELECT 1
FROM posts p
WHERE p.user_id = u.id AND p.status = 'approved');
Or the equivalent LEFT JOIN:
SELECT u.*
FROM users u
LEFT JOIN posts p
ON p.user_id = u.id AND p.status = 'approved'
WHERE u.status = 'active'
AND p.user_id IS NULL;
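To check both forms against the sample data from the question, here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module rather than MySQL (the two queries use only standard SQL, so they should behave the same way), and it applies the question's active-user filter. The function name is made up for illustration.

```python
import sqlite3


def active_users_without_approved_posts():
    """Run the NOT EXISTS and LEFT JOIN anti-join forms on the question's data."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
        INSERT INTO users VALUES
            (1, 'active'), (2, 'pending'), (3, 'pending'), (4, 'active'), (5, 'active');
        INSERT INTO posts VALUES
            (1, 1, 'approved'), (2, 1, 'pending'), (3, 4, 'pending');
    """)
    # Form 1: keep a user only if no approved post exists for them.
    not_exists = conn.execute("""
        SELECT u.id
        FROM users u
        WHERE u.status = 'active'
          AND NOT EXISTS (
              SELECT 1 FROM posts p
              WHERE p.user_id = u.id AND p.status = 'approved')
        ORDER BY u.id
    """).fetchall()
    # Form 2: join only approved posts, then keep the users that found no match.
    anti_join = conn.execute("""
        SELECT u.id
        FROM users u
        LEFT JOIN posts p
          ON p.user_id = u.id AND p.status = 'approved'
        WHERE u.status = 'active'
          AND p.user_id IS NULL
        ORDER BY u.id
    """).fetchall()
    conn.close()
    return [row[0] for row in not_exists], [row[0] for row in anti_join]


print(active_users_without_approved_posts())  # ([4, 5], [4, 5])
```

The key detail is that the status test on posts sits in the ON clause. Moving it to WHERE would turn the outer join back into a filter on joined rows, which is what the query in the question runs into.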

Taking your requirements and translating them literally to SQL, I get this:
SELECT users.id,
COUNT(posts.id) as posts_count,
COUNT(approved_posts.id) as approved_posts_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
LEFT JOIN posts approved_posts
ON approved_posts.status = 'approved'
AND approved_posts.user_id = users.id
WHERE users.status = "active"
GROUP BY users.id
HAVING (posts_count = 0 OR approved_posts_count = 0);
For your test data above, this returns:
4|1|0
5|0|0
i.e. users with ids 4 and 5, the first of which has 1 post but no approved posts and the second of which has no posts.
However, it seems to me that this can be simplified: any user that has no posts also has no approved posts, so the first condition is redundant and the OR is unnecessary.
In that case, the SQL is simply:
SELECT users.id,
COUNT(approved_posts.id) as approved_posts_count
FROM users
LEFT JOIN posts approved_posts
ON approved_posts.status = 'approved'
AND approved_posts.user_id = users.id
WHERE users.status = "active"
GROUP BY users.id
HAVING approved_posts_count = 0;
This also returns the same two users. Am I missing something?
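The same counts can also be produced with a single join and conditional aggregation. The sketch below is an assumption-laden illustration: it runs on SQLite via Python's sqlite3 instead of MySQL, uses the sample data from the question, and the function name is hypothetical.

```python
import sqlite3


def active_users_with_post_counts():
    """Return (id, posts_count, approved_posts_count) for active users with no approved post."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
        INSERT INTO users VALUES
            (1, 'active'), (2, 'pending'), (3, 'pending'), (4, 'active'), (5, 'active');
        INSERT INTO posts VALUES
            (1, 1, 'approved'), (2, 1, 'pending'), (3, 4, 'pending');
    """)
    rows = conn.execute("""
        SELECT u.id,
               COUNT(p.id) AS posts_count,
               SUM(CASE WHEN p.status = 'approved' THEN 1 ELSE 0 END) AS approved_posts_count
        FROM users u
        LEFT JOIN posts p ON p.user_id = u.id
        WHERE u.status = 'active'
        GROUP BY u.id
        -- the aggregate is repeated here instead of relying on alias support in HAVING
        HAVING SUM(CASE WHEN p.status = 'approved' THEN 1 ELSE 0 END) = 0
        ORDER BY u.id
    """).fetchall()
    conn.close()
    return rows


print(active_users_with_post_counts())  # [(4, 1, 0), (5, 0, 0)]
```

Joining posts once avoids the row multiplication that two separate joins to the same table can cause when a user has several posts.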

Please explain why you don't want subqueries or UNIONs. If it is because of performance, then consider the following:
-- One row per user. MIN(status) is 'approved' whenever the user has an approved post,
-- which relies on 'approved' sorting before every other status value.
CREATE TABLE t ( PRIMARY KEY(user_id) )
SELECT user_id, MIN(status) AS z
FROM posts
GROUP BY user_id;
SELECT u.id AS user,
IFNULL(t.z, 'no_posts') AS post_status
FROM users u
LEFT JOIN t ON t.user_id = u.id
WHERE u.status = 'active'
HAVING post_status != 'approved';
It will make only one pass over each table, thereby being reasonably efficient (considering the complexity of the query).

This one may help:
SELECT DISTINCT u.*
FROM users u
LEFT JOIN posts p ON 1=1
-- matches only if user has any post
AND p.user_id = u.id
-- matches only if user has any approved post
AND p.status = 'approved'
WHERE 1=1
-- matches only active users
AND u.status = 'active'
-- matches only users with no matches on the LEFT JOIN
AND p.status IS NULL
;

I think this should be easy.
SELECT u.`id`, u.`status`
FROM `users` u
LEFT OUTER JOIN `posts` p ON p.`user_id` = u.`id` AND p.`status` = 'approved'
WHERE u.`status` = 'active' AND p.`id` IS NULL;
This returns users 4 and 5.
[Edit] Just wanted to add why this works:
u.status = 'active'
This excludes all users that are not active.
p.status = 'approved'
Because this condition is in the ON clause, only approved posts are joined; a user with no approved post gets a single row with NULL post columns.
p.id IS NULL
This keeps only those NULL rows, so every user with at least one approved post is excluded.
[Edit 2]
If you also need to know how many pending and how many approved, here is an updated version:
SELECT u.`id`, u.`status`,
SUM(IF(p.`status` = 'approved', 1, 0)) AS `Approved_Posts`,
SUM(IF(p.`status` = 'pending', 1, 0)) AS `Pending_Posts`
FROM `test_users` u
LEFT OUTER JOIN `test_post` p ON p.`user_id` = u.`id`
WHERE u.`status` = 'active'
GROUP BY u.`id`
HAVING SUM(IF(p.`status` = 'approved', 1, 0)) = 0;

Try this
SELECT DISTINCT u.*
FROM users u LEFT JOIN posts p
ON p.user_id = u.id
WHERE p.status IS NULL
OR p.status != 'approved';

Can you try with the below query:
SELECT DISTINCT u.*
FROM users u
LEFT JOIN posts p
ON p.user_id = u.id
WHERE
u.status = 'active' AND (
p.user_id IS NULL
OR p.status != 'approved');
EDIT
As per the updated question, the above query will include user 1. To prevent that without an inner query, we can use MySQL's GROUP_CONCAT function to collect all the distinct statuses per user and check whether they contain 'approved'. The query below should give the desired output:
SELECT u.id, group_concat(distinct p.status) as statuses
FROM users u
LEFT JOIN posts p
ON u.id = p.user_id
WHERE
u.status = 'active'
group by u.id
having (statuses is null or statuses not like '%approved%');

Related

Why my query shows no errors, but also returns zero rows?

I have written this query to get as many rows as there are users + count of potentials that each user have created + all potentials that have been converted. This is how it looks like:
SELECT u.*, p.allPotentials, pc.cPotentials
FROM os_user u
JOIN (SELECT FID_author, count(*) allPotentials FROM os_potential) p
ON p.FID_author = u.ID
JOIN (SELECT converted, FID_author, count(*) cPotentials FROM os_potential) pc
ON p.FID_author = u.ID AND pc.converted = 1
I am trying to do it with uncorrelated subqueries, since an answer explained to me that I can combine my queries into one. But I'm getting 0 rows.
My tables looks like this:
Users:
+----+------+-------+
| ID | Name | Email |
+----+------+-------+
Potentials:
+----+------+-------+------------+-----------+
| ID | Name | Email | FID_author | converted |
+----+------+-------+------------+-----------+
FID_author is foreign key, the user id.
My query is returning 0 rows and shows no errors. What am I doing wrong?
EDIT
So far my query:
SELECT u.*, p.allPotentials, pc.cPotentials
FROM os_user u
LEFT JOIN (SELECT FID_author, count(*) allPotentials
FROM os_potential GROUP BY FID_author) p
ON p.FID_author = u.ID
LEFT JOIN (SELECT converted, FID_author, count(*) cPotentials
FROM os_potential GROUP BY FID_author) pc
ON p.FID_author = u.ID
AND pc.converted = 1
GROUP BY u.ID
I am getting results almost as expected, but the problem is that cPotentials contains 1 in every row, which is wrong: there are many more than 1. Where could the problem be?
The subqueries are missing a GROUP BY, and you may want to use LEFT JOIN:
SELECT u.*, p.allPotentials, pc.cPotentials
FROM os_user u
LEFT JOIN (SELECT FID_author, count(*) allPotentials FROM os_potential
GROUP BY FID_author) p
ON p.FID_author = u.ID
LEFT JOIN (SELECT converted, FID_author, count(*) cPotentials FROM os_potential
GROUP BY converted,FID_author) pc
ON pc.FID_author = u.ID AND pc.converted = 1
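As a possible variation, both counts can come from one grouped subquery by using a conditional SUM, so the table is scanned once and the `converted` filter cannot end up in the wrong ON clause. This is only a sketch: it runs on SQLite via Python's sqlite3 instead of MySQL, the rows are invented sample data, and the function name is hypothetical.

```python
import sqlite3


def potentials_per_user():
    """Return (user ID, all potentials, converted potentials) for every user."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE os_user (ID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE os_potential (
            ID INTEGER PRIMARY KEY, FID_author INTEGER, converted INTEGER);
        -- invented sample rows, not taken from the question
        INSERT INTO os_user VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO os_potential VALUES (1, 1, 1), (2, 1, 0), (3, 1, 1), (4, 2, 0);
    """)
    rows = conn.execute("""
        SELECT u.ID,
               COALESCE(p.allPotentials, 0) AS allPotentials,
               COALESCE(p.cPotentials, 0)   AS cPotentials
        FROM os_user u
        LEFT JOIN (
            SELECT FID_author,
                   COUNT(*) AS allPotentials,
                   SUM(CASE WHEN converted = 1 THEN 1 ELSE 0 END) AS cPotentials
            FROM os_potential
            GROUP BY FID_author
        ) p ON p.FID_author = u.ID
        ORDER BY u.ID
    """).fetchall()
    conn.close()
    return rows


print(potentials_per_user())  # [(1, 3, 2), (2, 1, 0), (3, 0, 0)]
```

COALESCE turns the NULLs that the LEFT JOIN produces for users without potentials into zeros.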

Get unique records based on multi-join with conditionals

I have 4 tables: posts, users, mentions, following
posts
----------------------------
id | user_id | post_text
1 1 foo
2 1 bar
3 2 hello
4 3 jason
users
------------
id | name
1 jason
2 nicole
3 frank
mentions
--------------------------
id | post_id | user_id
1 4 1
following
-------------------------------------------------
id | user_id | user_id_of_user_being_followed
1 1 2
posts includes the user_id of the user who posted some text
users has the user id and name of the user
mentions has the post id and user id of any post which has mentioned 1 or more other users
following has the user id and the id of the user they are following (a user can follow 0 to many users)
What I'm trying to do is return all posts from users that a given user follows, PLUS any posts that have mentioned that user (whether or not the given user is following the author), without returning any duplicates.
SELECT p.id, p.post, u.name
FROM following f
JOIN posts p ON f.following = p.user_id
JOIN users u ON u.id = p.user_id
WHERE f.user_id = :user;
The above returns all posts from users that a given user is following, but I'm struggling to figure out how to include mentions as well (remember, a user does not have to follow someone to be able to see the posts they've been mentioned in).
UPDATE:
Thanks to John R I was able to figure this out:
SELECT DISTINCT(p.id), p.post, u.name
FROM posts p
LEFT JOIN following f ON f.following = p.user_id
LEFT JOIN mentions m ON m.posts_id = p.id
JOIN users u ON u.id = p.user_id
WHERE (f.user_id = :user_id OR m.user_id = :user_id)
If I understand your question correctly, you want a LEFT JOIN to include any mentions without filtering out any followers' posts.
If you can add some sample data to play with, I can make sure it works how you want it to.
SELECT
if(p.id is not null, p.id, p1.id) as post_id,
if(p.post_text is not null, p.post_text, p1.post_text) as post_text,
u.name, m.id, m.user_id
FROM posts p
JOIN users u on u.id = p.user_id
JOIN following f on f.user_id_of_user_being_followed = u.id
LEFT JOIN mentions m on m.user_id = f.user_id
LEFT JOIN posts p1 on p1.id = m.post_id
WHERE f.user_id = :user or m.user_id = :user;
I left join mentions to the post made, and also match on the user_id in the mentions table being equal to the specified user, to filter out other users. The left join shouldn't change the number of rows returned; it only includes any mentions.
EDIT:
After playing around with it, I realised it was trying to put all of the data into one row. Try this:
(
SELECT p.id, p.post_text, u.name
FROM posts p
JOIN users u on u.id = p.user_id
JOIN following f on f.user_id_of_user_being_followed = u.id
WHERE f.user_id = 1
)
UNION
(
SELECT p.id, p.post_text, u.name
FROM mentions m
JOIN posts p on p.id = m.post_id
JOIN users u on u.id = p.user_id
WHERE m.user_id = 1
);
Maybe you inherited this db; but the last table is not really in line with good data normalization. The table should be the id and following_id; as set up you'll eventually run out of columns (or have to keep adding them when a user gets an error) - new users won't be able to follow anyone.
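For comparison, here is a sketch of a single query that needs neither UNION nor DISTINCT: each post is tested with two EXISTS conditions, so it can appear at most once. It assumes the table and column names shown in the question, runs on SQLite via Python's sqlite3 instead of MySQL, and the function name is hypothetical.

```python
import sqlite3


def feed_for(user_id):
    """Posts by users that `user_id` follows, plus posts that mention `user_id`."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, post_text TEXT);
        CREATE TABLE mentions (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER);
        CREATE TABLE following (
            id INTEGER PRIMARY KEY, user_id INTEGER,
            user_id_of_user_being_followed INTEGER);
        INSERT INTO users VALUES (1, 'jason'), (2, 'nicole'), (3, 'frank');
        INSERT INTO posts VALUES (1, 1, 'foo'), (2, 1, 'bar'), (3, 2, 'hello'), (4, 3, 'jason');
        INSERT INTO mentions VALUES (1, 4, 1);
        INSERT INTO following VALUES (1, 1, 2);
    """)
    rows = conn.execute("""
        SELECT p.id, p.post_text, u.name
        FROM posts p
        JOIN users u ON u.id = p.user_id
        WHERE EXISTS (              -- the author is followed by the given user
                  SELECT 1 FROM following f
                  WHERE f.user_id = :user
                    AND f.user_id_of_user_being_followed = p.user_id)
           OR EXISTS (              -- the post mentions the given user
                  SELECT 1 FROM mentions m
                  WHERE m.user_id = :user
                    AND m.post_id = p.id)
        ORDER BY p.id
    """, {"user": user_id}).fetchall()
    conn.close()
    return rows


print(feed_for(1))  # [(3, 'hello', 'nicole'), (4, 'jason', 'frank')]
```

Because the mention test does not go through the following table, a user who follows nobody still sees the posts that mention them.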

Check if joined result set contains some value in SQL

I have a database structure similar to this:
asset
+----+---------+
| id | user_id |
+----+---------+
user_favorite
+----------+---------+
| asset_id | user_id |
+----------+---------+
I am looking to create a query where I can return all assets belonging to a given user AND a boolean indicating whether or not it is a "favorite" for them.
I can do this, where a count() equalling zero would mean it's not a favorite (but it seemed hacky and inefficient):
select distinct(a.asset_id),
(select count(*)
from user_favorite f
where f.user_id = MY USER ID
and f.asset_id = a.asset_id)
from asset a
left join user_favorite u on a.asset_id=u.asset_id
where a.user_id = MY USER ID;
I tried this (but it yielded multiple entries from assets when multiple users had favorited them):
select distinct (a.asset_id),
(u.user_id in (MY USER ID))
from asset a
left join user_favorite u on a.asset_id=u.asset_id
where a.user_id = MY USER ID;
I also tried this (but the IN condition wasn't respected):
select distinct(a.asset_id),
(u.user_id in (MY USER ID))
from asset a
left join user_favorite u on a.asset_id=u.asset_id
where a.user_id = MY USER ID group by u.user_id;
Is there some good way to do this query?
This is how I'd do it, but I'm sure there are many acceptable ways:
SELECT DISTINCT a.asset_id
,CASE WHEN u.asset_id IS NULL
THEN 0
ELSE 1
END AS IsFavourite
FROM asset a
LEFT JOIN user_favorite u ON a.asset_id = u.asset_id
AND a.user_id = u.user_id
WHERE a.user_id = MY_USER_ID
This LEFT JOINs favourites to assets: if there is no favourite record present (u.asset_id IS NULL), the asset is not a favourite; otherwise it is.
Maybe this helps a little:
select
a.asset_id,
case when fav.user_id is null then 0 else 1 end as is_favorite
from
asset a
left outer join
user_favorite fav
on a.asset_id = fav.asset_id
and fav.user_id = a.user_id
where
a.user_id = 'foobar'
group by a.asset_id, fav.user_id
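Another way to get the boolean is an EXISTS expression in the select list, which returns one row per asset no matter how many users have favorited it. The sketch below assumes the schema exactly as drawn in the question (asset.id, asset.user_id, user_favorite.asset_id, user_favorite.user_id), invented sample rows, SQLite via Python's sqlite3 instead of MySQL, and a hypothetical function name.

```python
import sqlite3


def assets_with_favorite_flag(user_id):
    """Return (asset id, is_favorite) for every asset owned by `user_id`."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE asset (id INTEGER PRIMARY KEY, user_id INTEGER);
        CREATE TABLE user_favorite (asset_id INTEGER, user_id INTEGER);
        -- invented sample rows: asset 1 is favorited by two different users
        INSERT INTO asset VALUES (1, 1), (2, 1), (3, 2);
        INSERT INTO user_favorite VALUES (1, 1), (1, 2), (2, 2), (3, 2);
    """)
    rows = conn.execute("""
        SELECT a.id,
               EXISTS (SELECT 1
                       FROM user_favorite f
                       WHERE f.asset_id = a.id
                         AND f.user_id = a.user_id) AS is_favorite
        FROM asset a
        WHERE a.user_id = ?
        ORDER BY a.id
    """, (user_id,)).fetchall()
    conn.close()
    return rows


print(assets_with_favorite_flag(1))  # [(1, 1), (2, 0)]
```

Asset 2 is flagged 0 for user 1 even though another user has favorited it, and asset 1 appears only once.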

MySQL join get latest valid row

I have 2 tables:
users
id | email
1 | email1#test.com
2 | email2#test.com
And questions
id | userId | isValid | status
1 | 1 | 0 | pending
2 | 1 | 1 | processed
I want to do a MySQL query that returns all users with the latest valid question (i.e questions.isValid = 1 and questions.id is the highest for that user). I am stumbling on the "latest" part - here is the query so far (which returns all valid questions).
SELECT u.email, q.status
FROM users AS u
LEFT JOIN questions AS q ON u.id = q.userId
WHERE q.isValid = 1
ORDER BY u.id ASC
Any suggestions? There are plenty of similar questions on stackoverflow but I couldn't find one that precisely matches that problem. Thanks!
EDIT: thanks for all the answers! I forgot to mention one important thing: if there is no valid question for that user, I still want the user to show in the results, with status = ''.
Mmmkay, what about this?
http://www.sqlfiddle.com/#!2/b6d65/1
SELECT u.email, q.status
FROM users AS u
LEFT JOIN (
( SELECT MAX(mq.id) AS id
FROM questions AS mq
WHERE mq.isValid = 1
GROUP BY mq.userId
) AS maxq
INNER JOIN questions AS q ON q.id = maxq.id
) ON u.id = q.userId
ORDER BY u.id ASC
If you would just like the latest status, this would work:
SELECT u.email,
(SELECT status FROM questions WHERE userId = u.id ORDER BY id DESC LIMIT 1) status
FROM users AS u
Here is your query:
SELECT u.email, q.status
FROM users AS u
LEFT JOIN questions AS q ON u.id = q.userId
AND q.id = (SELECT MAX(id) FROM questions WHERE isValid = 1 AND userId = u.id)
check demo here
SELECT u.email, CASE WHEN q.isValid = 1 THEN q.status ELSE '' END AS status
FROM users AS u
LEFT JOIN questions AS q ON u.id = q.userId
AND q.id IN (SELECT MAX(id) FROM questions GROUP BY userId)
ORDER BY u.id ASC
http://www.sqlfiddle.com/#!2/003dd/12
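To cover the requirement from the question's EDIT (users without a valid question still appear, with an empty status), one option is to put the "latest valid id" lookup into the ON clause of the LEFT JOIN. This is a sketch under assumptions: SQLite via Python's sqlite3 instead of MySQL, the question's two tables, one extra invalid question row (id 3) added for illustration, and a hypothetical function name.

```python
import sqlite3


def users_with_latest_valid_status():
    """Return (email, status of the latest valid question or '') for every user."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE questions (
            id INTEGER PRIMARY KEY, userId INTEGER, isValid INTEGER, status TEXT);
        INSERT INTO users VALUES (1, 'email1#test.com'), (2, 'email2#test.com');
        INSERT INTO questions VALUES
            (1, 1, 0, 'pending'),
            (2, 1, 1, 'processed'),
            (3, 1, 0, 'pending');  -- extra row, not in the question: newer but invalid
    """)
    rows = conn.execute("""
        SELECT u.email, COALESCE(q.status, '') AS status
        FROM users u
        LEFT JOIN questions q
          ON q.userId = u.id
         AND q.id = (SELECT MAX(q2.id)
                     FROM questions q2
                     WHERE q2.userId = u.id AND q2.isValid = 1)
        ORDER BY u.id
    """).fetchall()
    conn.close()
    return rows


print(users_with_latest_valid_status())
# [('email1#test.com', 'processed'), ('email2#test.com', '')]
```

Keeping the condition in ON rather than WHERE is what preserves user 2: a WHERE filter on q would discard the NULL row that the outer join produces.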

Left Join 2 tables on 1 table

It must be pretty easy, but i can't think of any solution nor can I find an answer somewhere...
I got the table 'users'
and one table 'blogs' (user_id, blogpost)
and one table 'messages' (user_id, message)
I'd like to have the following result:
User | count(blogs) | count(messages)
Jim | 0 | 3
Tom | 2 | 3
Tim | 0 | 1
Foo | 2 | 0
So what I did is:
SELECT u.id, count(b.id), count(m.id) FROM `users` u
LEFT JOIN blogs b ON b.user_id = u.id
LEFT JOIN messages m ON m.user_id = u.id
GROUP BY u.id
It obviously doesn't work, because the second left join relates to blogs not users. Any suggestions?
First, if you only want the count values, you could use subselects:
select u.id, u.name,
(select count(*) from blogs b where b.user_id = u.id) as blogs,
(select count(*) from messages m where m.user_id = u.id) as messages
from users u
Note that this is just a plain SQL example; I have no MySQL database here to test it right now.
On the other hand, you could do a join, but you should use an outer join to include users without blogs but with messages. That would imply that you get several users multiple times, so a group by would be helpful.
If you use an aggregate function in a select, SQL will collapse all your rows into a single row.
In order to get more than 1 row out you must use a group by clause.
Then SQL will generate totals per user.
Fastest option
SELECT
u.id
, (SELECT COUNT(*) FROM blogs b WHERE b.user_id = u.id) as blogcount
, (SELECT COUNT(*) FROM messages m WHERE m.user_id = u.id) as messagecount
FROM users u
Why your code does not work
SELECT u.id, count(b.id), count(m.id)
FROM users u
LEFT JOIN blogs b ON b.user_id = u.id -- 3 matches multiply the number of rows by 3
LEFT JOIN messages m ON m.user_id = u.id -- 5 matches multiply the number of rows by 5
GROUP BY u.id
The count will be off, because you are counting duplicate items.
Simple fix, but will be slower than option 1
If you only count distinct id's, you will get the correct counts:
SELECT u.id, count(DISTINCT b.id), count(DISTINCT m.id)
FROM users u
LEFT JOIN blogs b ON b.user_id = u.id
LEFT JOIN messages m ON m.user_id = u.id
GROUP BY u.id
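The row multiplication is easy to see on a tiny data set. The sketch below compares the naive double join with the COUNT(DISTINCT ...) fix. It assumes that blogs and messages each have an id column (the queries above use b.id and m.id), uses invented rows, runs on SQLite via Python's sqlite3 instead of MySQL, and the function name is hypothetical.

```python
import sqlite3


def blog_and_message_counts():
    """Return (naive_counts, distinct_counts) as lists of (name, blogs, messages)."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE blogs (id INTEGER PRIMARY KEY, user_id INTEGER);
        CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id INTEGER);
        -- invented rows: Tom has 2 blogs and 3 messages, Foo 2 blogs, Tim 1 message
        INSERT INTO users VALUES (1, 'Tom'), (2, 'Foo'), (3, 'Tim');
        INSERT INTO blogs VALUES (1, 1), (2, 1), (3, 2), (4, 2);
        INSERT INTO messages VALUES (1, 1), (2, 1), (3, 1), (4, 3);
    """)
    joins = """
        FROM users u
        LEFT JOIN blogs b ON b.user_id = u.id
        LEFT JOIN messages m ON m.user_id = u.id
        GROUP BY u.id
        ORDER BY u.id
    """
    # Naive version: Tom's 2 blogs x 3 messages give 6 joined rows, all counted.
    naive = conn.execute(
        "SELECT u.name, COUNT(b.id), COUNT(m.id)" + joins).fetchall()
    # DISTINCT collapses the repeated ids produced by the join.
    distinct = conn.execute(
        "SELECT u.name, COUNT(DISTINCT b.id), COUNT(DISTINCT m.id)" + joins).fetchall()
    conn.close()
    return naive, distinct


naive_counts, distinct_counts = blog_and_message_counts()
print(naive_counts)     # [('Tom', 6, 6), ('Foo', 2, 0), ('Tim', 0, 1)]
print(distinct_counts)  # [('Tom', 2, 3), ('Foo', 2, 0), ('Tim', 0, 1)]
```

Only users with rows in both child tables show wrong numbers in the naive version, which is why the problem can go unnoticed on small test data.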