I want to know the difference between these two queries. I have two tables: Users and Emails.
User schema - id, name, email, is_subscribed, created, modified.
Email schema - id, user_id, sent_at, subject.
I need to find the count of those users who have received more than 20 emails in total.
The Users table has roughly 100K records, and the Emails table has nearly 4 million records.
1st Query
SELECT u.id, u.email, count(u.id)
FROM emails as e
LEFT JOIN users as u
ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id HAVING count(u.id) > 20
2nd Query
SELECT u.id, u.email, count(u.id)
FROM users as u
INNER JOIN emails as e
ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id HAVING count(u.id) > 20
What I have tried:
1) On production, these queries take forever to execute, so locally I created sample tables with dummy records, i.e.
Users table - around 5 records, and Emails table - around 100 records.
When I execute the two queries above I get the same result set for both, and profiling shows the same execution time for both (which may differ on production), so it is hard to know which is the better one. (This may not be the optimal way to find the solution.)
2) I used EXPLAIN with the queries, and it shows a scan of all 100 rows of the emails table in both cases.
Please let me know if I have missed any specifics. I will update the question.
Read about MySQL LEFT JOIN optimization. The DBMS can tell that your LEFT JOIN's WHERE filters out all the NULL-extended rows, that is, the rows a LEFT JOIN returns and an INNER JOIN does not, so it just does an INNER JOIN.
MySQL 5.7 Reference Manual
9.2.1.9 LEFT JOIN and RIGHT JOIN Optimization
For a LEFT JOIN, if the WHERE condition is always false for the generated NULL row, the LEFT JOIN is changed to a normal join.
(Since you don't want NULL-extended rows, why would you use LEFT JOIN?)
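To see the equivalence on a toy data set, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it only demonstrates that the result sets match, not what the MySQL optimizer does internally. The sample rows, the lowered threshold (more than 2 instead of more than 20) and the GROUP BY on u.id, u.email are my own assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, email TEXT, is_subscribed INTEGER);
CREATE TABLE emails (id INTEGER PRIMARY KEY, user_id INTEGER, subject TEXT);
INSERT INTO users VALUES (1, 'a@example.com', 1), (2, 'b@example.com', 0);
-- user_id 99 has no matching user, so the LEFT JOIN NULL-extends that row
INSERT INTO emails (user_id, subject)
VALUES (1, 's1'), (1, 's2'), (1, 's3'), (2, 's4'), (99, 's5');
""")

left_join_sql = """
SELECT u.id, u.email, COUNT(u.id)
FROM emails AS e
LEFT JOIN users AS u ON e.user_id = u.id
WHERE u.is_subscribed = 1      -- NULL = 1 is never true, so NULL-extended rows drop out
GROUP BY u.id, u.email
HAVING COUNT(u.id) > 2
"""

inner_join_sql = """
SELECT u.id, u.email, COUNT(u.id)
FROM users AS u
INNER JOIN emails AS e ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY u.id, u.email
HAVING COUNT(u.id) > 2
"""

left_rows = conn.execute(left_join_sql).fetchall()
inner_rows = conn.execute(inner_join_sql).fetchall()
print(left_rows)
print(inner_rows)
```

Both queries return only the subscribed user with three emails. The orphan email (user_id 99) is the only row where the two join types differ, and the WHERE clause removes it.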
Please try the query below:
SELECT u.id, u.email, count(u.id)
FROM users as u
INNER JOIN emails as e ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY u.id
HAVING count(u.id) > 20
Related
I have a simple MySQL InnoDB database with two tables: users and dialogues. I am trying to make a LEFT JOIN query, however, I've run into a performance problem.
When I execute the following statement,
EXPLAIN SELECT u.id FROM users u
LEFT JOIN dialogues d ON u.id = d.creator_id
EXPLAIN reports that the DB uses the index and ref join types (the type column), which is totally fine.
However, when I add an additional clause:
EXPLAIN SELECT u.id FROM users u
LEFT JOIN dialogues d ON (u.id = d.creator_id OR u.id = d.target_id)
suddenly EXPLAIN shows the ALL join type (a full table scan) for the join, which in turn makes the actual query many times slower.
Is there something that could be done to make the DB use a more efficient join type in the second example?
d.creator_id and d.target_id columns have foreign keys connected to u.id.
It is usually faster to do two left joins and coalesce() in the select:
SELECT d.*,
COALESCE(uc.name, ut.name) as name
FROM dialogues d LEFT JOIN
users uc
ON uc.id = d.creator_id LEFT JOIN
users ut
ON ut.id = d.target_id
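A small sketch of the two-join form, run through Python's sqlite3 module rather than MySQL. The table contents are made up for illustration, and I select d.id instead of d.* to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dialogues (id INTEGER PRIMARY KEY, creator_id INTEGER, target_id INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
-- dialogue 10 has both users, 11 only a target, 12 neither
INSERT INTO dialogues VALUES (10, 1, 2), (11, NULL, 2), (12, NULL, NULL);
""")

rows = conn.execute("""
SELECT d.id,
       COALESCE(uc.name, ut.name) AS name   -- creator's name if present, else target's
FROM dialogues d
LEFT JOIN users uc ON uc.id = d.creator_id  -- each join is a plain equality on users.id
LEFT JOIN users ut ON ut.id = d.target_id
ORDER BY d.id
""").fetchall()
print(rows)
```

Note that this returns one row per dialogue, whereas a join on (creator_id OR target_id) can return one row per matching user, so the two forms are not interchangeable in every case.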
I have a really simple table - follow - in which I store followers.
user | following
-----------------
1 | 2
The above means user 1 is following user 2.
I want to display all users on the home page and order them by who has the most followers, and then return the rest of the users who have no followers. The query below works as far as displaying the users, but I can't figure out how to retrieve the users who do not have any followers. I've tried RIGHT JOIN users u ON f.following=u.id but that gives me weird results.
This query returns user 2 who has a follower, but doesn't return users 1 and 3, who do not have followers.
Edit: this query is also checking to see if the user is following back, which is why I'm joining using the ID of 1 as a test.
SELECT
u.id
,u.username
,u.avatar
,COUNT(1) AS followers
,ul.*
,fo.*
FROM follow f
LEFT JOIN users u ON f.following=u.id
LEFT JOIN follow fo ON fo.following=u.id AND fo.user=1
LEFT JOIN users_likes ul ON ul.likes=u.id AND ul.user=1
GROUP BY f.following
ORDER BY COUNT(1) DESC
SQL Fiddle: http://sqlfiddle.com/#!2/98f65/1
The problem with your query in the question is that you are left-joining to the follow table. That means that all rows in the follow table are included regardless of their connection to another table. What you want is to show all users, so that is the table that should be on the outer end of the join.
I also think you're trying to do too many things at once here, which is why you're having trouble figuring it out. You want to know who has followers and who doesn't, who's following back, order them, consider the users_likes and so on. I recommend taking a step back and breaking them down into individual queries, and then building those into one result set as needed.
To get the users and number of followers, you can outer join the users table with the follow table like this:
SELECT u.id, u.username, u.avatar, (IFNULL(COUNT(f.following), 0)) AS numFollowers
FROM users u
LEFT JOIN follow f ON f.following = u.id
GROUP BY u.id
ORDER BY numfollowers DESC;
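IFNULL is meant to cover users with no followers, for whom the outer join finds no match and f.following is NULL. Strictly speaking, COUNT(f.following) already ignores NULLs and returns 0 for those users, so the IFNULL wrapper is harmless but not required.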
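Here is a minimal sketch of the users-first outer join, using Python's sqlite3 module in place of MySQL. The three users and two follow rows are invented sample data, and the extra u.id in ORDER BY is only there to make the order of the zero-follower users predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE follow (user INTEGER, following INTEGER);
INSERT INTO users VALUES (1, 'u1'), (2, 'u2'), (3, 'u3');
-- users 1 and 3 both follow user 2; nobody follows 1 or 3
INSERT INTO follow VALUES (1, 2), (3, 2);
""")

rows = conn.execute("""
SELECT u.id, u.username, COUNT(f.following) AS followers
FROM users u                                -- users drives the join, so every user appears
LEFT JOIN follow f ON f.following = u.id
GROUP BY u.id, u.username
ORDER BY followers DESC, u.id
""").fetchall()
print(rows)
```

Users 1 and 3 come back with a count of 0 because COUNT over a column skips the NULL produced by the unmatched outer join.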
If you want to work in the users_likes table, you should add it as another left join. The problem this causes is that it will return null values for all of its columns if there are no likes. (For example, if I left join the users_likes table here, I will see null for users 1 and 3 because nobody 'likes' them.) To make the result set a little more understandable, I recommend you don't select everything from the users_likes table. Perhaps this query would make more sense:
SELECT u.id, u.username, u.avatar, (IFNULL(COUNT(f.following), 0)) AS numFollowers, ul.user AS likedByUser, ul.created_at
FROM users u
LEFT JOIN follow f ON f.following = u.id
LEFT JOIN users_likes ul ON ul.likes = u.id
GROUP BY u.id
ORDER BY numfollowers DESC;
As far as whether or not a user is following back, I think this would change a bit, as the above only shows the number of followers, and doesn't produce a row for each follower.
Let me know if you have any more questions, here is an SQL Fiddle for the above. I will leave it up to you for handling the null values that occur right now.
You can use an outer join (left or right) from Users to your current query in any number of ways. Here is an easy example that should get you started. This isn't a cleaned-up solution, just a demo of a way that will work.
SELECT a.*
,b.*
FROM users a
LEFT JOIN (
SELECT
u.id
,u.username
,u.avatar
,COUNT(1) AS followers
FROM follow f
LEFT JOIN users u ON f.following=u.id
LEFT JOIN follow fo ON fo.following=u.id AND fo.user=1
LEFT JOIN users_likes ul ON ul.likes=u.id AND ul.user=1
GROUP BY f.following
) b
ON a.id = b.id
ORDER BY followers DESC
You can do this:
SELECT * FROM (
SELECT u.id, u.username, u.avatar, COUNT(f.user) as followers
FROM users AS u
LEFT JOIN follow AS f ON u.id = f.following
GROUP BY u.id
) AS subselect ORDER BY subselect.followers DESC
I'm a bit of a db noob and have a nasty query that is taking over 30 seconds to run. I'm trying to learn a bit more about EXPLAIN and optimize the query but am at a loss. Here is the query:
SELECT
feed.*, users.username, smf_attachments.id_attach AS avatar,
games.name AS item_name, games.image, feed.item_id, u2.username AS follow_name
FROM feed
INNER JOIN following ON following.follow_id = feed.user_id AND following.user_id = 1
LEFT JOIN users ON users.id = feed.user_id
LEFT JOIN smf_members ON smf_members.member_name = users.username
LEFT JOIN smf_attachments ON smf_attachments.id_member = smf_members.id_member
LEFT JOIN games ON games.id = feed.item_id
LEFT JOIN users u2 ON u2.id = feed.item_id
ORDER BY feed.timestamp DESC
LIMIT 25
Explain results:
The result you want to avoid in your execution plan (the output of an EXPLAIN statement) is a full table scan, which EXPLAIN reports as ALL in the type column. To avoid it, you need to create the correct indexes on your tables.
With a table scan, the query engine reads every row of the table sequentially. With index access, the query engine goes more directly to the relevant data.
More explanation here: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
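To illustrate the scan versus index difference, here is a rough sketch using Python's sqlite3 module. It relies on SQLite's EXPLAIN QUERY PLAN, which is not the same as MySQL's EXPLAIN and words its output differently, so treat it only as an analogy. The cut-down feed table, the sample rows, the plan helper and the index name idx_feed_user are all assumptions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feed (id INTEGER PRIMARY KEY, user_id INTEGER, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO feed (user_id, timestamp) VALUES (?, ?)",
    [(i % 50, i) for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM feed WHERE user_id = 1"

before = plan(query)   # no usable index yet: SQLite reports a scan of feed
conn.execute("CREATE INDEX idx_feed_user ON feed (user_id)")
after = plan(query)    # now SQLite reports a search using idx_feed_user

print(before)
print(after)
```

The same idea applies to the query above: the columns used in the join conditions and in ORDER BY are the usual candidates for indexes, but which ones actually help depends on the real EXPLAIN output.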
In this sql:
SELECT s.*,
u.id,
u.name
FROM shops s
LEFT JOIN users u ON u.id = s.user_id
OR u.id = s.owner_user_id
WHERE s.status = 1
For some reason this query takes an amazingly long time, although id is the primary key. The query became slow right after I added the OR u.id = s.owner_user_id part. owner_user_id is usually 0 and is set only a handful of times. But why would it take so long, apparently scanning the whole table? The users table is very long and wide. I didn't design it; it belongs to a client, and subsequent programmers added too many fields. The table has 22k rows and dozens of fields.
*The field names are for demonstration only. The actual names are different, so don't ask me why I'm looking for owner_user_id (; I did solve the slowness by removing the "OR ..." part and instead looking up the id in the loop when it is not 0, but I would like to know why this is happening and how to speed up the query as is.
You may be able to speed it up by using IN instead of the OR but that is minor.
SELECT u.id,
u.name
FROM shops s
LEFT JOIN users u ON u.id IN ( s.user_id, s.owner_user_id )
WHERE s.status = 1
Firstly, are there any indexes on this table? Mainly one on the user.id field or the s.user_id or s.owner_user_id?
However, I must ask why you need to use a LEFT JOIN instead of a regular join. The LEFT JOIN keeps every shops row even when no users row matches. Since I'm assuming the id should be in either the user_id or the owner_user_id field, and that there will always be a match, a plain JOIN should speed the query up a bit.
And as Mitch said, 22k rows is tiny.
How are you going to know which user record is which? Here's how I'd do it
SELECT s.*,
u.name AS user_name,
o.name AS owner_name
FROM shops s
LEFT JOIN users u ON s.user_id = u.id
LEFT JOIN users o ON s.owner_user_id = o.id
WHERE s.status = 1
I've omitted the IDs from the user table in the SELECT as these will be part of s.* anyway.
I'm curious about the left joins too. If shops.user_id and shops.owner_user_id are required foreign keys, use inner joins instead.
I have a system where, essentially, users are able to put in 3 different pieces of information: a tip, a comment, and a vote. These pieces of information are saved to 3 different tables. The linking column of each table is the user ID. I want to do a query to determine if the user has any pieces of information at all, of any of the three types. I'm trying to do it in a single query, but it's coming out totally wrong. Here's what I'm working with now:
SELECT DISTINCT
*
FROM tips T
LEFT JOIN comments C ON T.user_id = C.user_id
LEFT JOIN votes V ON T.user_id = V.user_id
WHERE T.user_id = 1
This seems to only be getting the tips, duplicated for as many votes or comments there are, even if the votes or comments weren't made by the specified user_id.
I only need a single number in return, not individual counts of each type. I basically want a sum of the number of tips, comments, and votes saved under that user_id, but I don't want to do three queries.
Anyone have any ideas?
Edit: Actually, I don't even technically need an actual count, I just need to know if there are any rows in any of those three tables with that user_id.
Edit 2: I almost have it with this:
SELECT
COUNT(DISTINCT T.tip_id),
COUNT(DISTINCT C.tip_id),
COUNT(DISTINCT V.tip_id)
FROM tips T
LEFT JOIN comments C ON T.user_id = C.user_id
LEFT JOIN votes V ON T.user_id = V.user_id
WHERE T.user_id = 1
I'm testing with user_id 1 (me). I've made 11 tips, voted 4 times, and made no comments. My return is a row with 3 columns: 11, 0, 4. That's the proper count. However, I tested it with a user that hasn't made any tips or comments, but has voted 3 times, that returned 0 for all counts, it should have returned: 0, 0, 3.
The problem that I'm having seems to be that if the table that I'm using for the WHERE clause doesn't have any rows from that user_id, then I get 0 across the board, even if the other tables DO have rows with that user_id. I could use this query:
SELECT
(SELECT COUNT(*) FROM tips WHERE user_id = 2) +
(SELECT COUNT(*) FROM comments WHERE user_id = 2) +
(SELECT COUNT(*) FROM votes WHERE user_id = 2) AS total
But I really wanted to avoid running multiple queries, even if they're subqueries like this.
UPDATE
Thanks to ace, I figured this out:
SELECT
(COUNT(DISTINCT T.tip_id) + COUNT(DISTINCT C.tip_id) + COUNT(DISTINCT V.tip_id)) AS total
FROM users U
LEFT JOIN tips T ON U.user_id = T.user_id
LEFT JOIN votes V ON U.user_id = V.user_id
LEFT JOIN comments C ON U.user_id = C.user_id
WHERE U.user_id = 4
The users table contains the actual information about the user including, obviously, the user id. I used the users table as the parent, since I could be 100% sure that the user would be present in that table, even if they weren't in the other tables. I got the proper count that I wanted with this query!
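For anyone who wants to reproduce the difference, here is a small sketch with Python's sqlite3 module standing in for MySQL. It uses only tips and votes to stay short, and the vote_id column and the sample rows are assumptions for illustration. User 2 has votes but no tips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE tips  (tip_id  INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE votes (vote_id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO users VALUES (1), (2);
INSERT INTO tips  (user_id) VALUES (1), (1);
INSERT INTO votes (user_id) VALUES (1), (2), (2), (2);
""")

# Driving from tips: a user with no tips contributes no rows at all.
from_tips = """
SELECT COUNT(DISTINCT T.tip_id), COUNT(DISTINCT V.vote_id)
FROM tips T
LEFT JOIN votes V ON T.user_id = V.user_id
WHERE T.user_id = ?
"""

# Driving from users: the user row always exists, so the votes are still found.
from_users = """
SELECT COUNT(DISTINCT T.tip_id), COUNT(DISTINCT V.vote_id)
FROM users U
LEFT JOIN tips  T ON U.user_id = T.user_id
LEFT JOIN votes V ON U.user_id = V.user_id
WHERE U.user_id = ?
"""

tips_first = conn.execute(from_tips, (2,)).fetchone()
users_first = conn.execute(from_users, (2,)).fetchone()
print(tips_first, users_first)
```

With tips as the driving table the three votes are lost, while the users-first version counts them. COUNT(DISTINCT ...) is still needed because joining two child tables to the same user multiplies the rows.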
As I understand your question, you want to count the total comments + tips + votes for each user. It is not entirely clear to me, but take a look at the query below. I added columns for details; this is a cross-tab query, as someone taught me.
EDITED QUERY:
SELECT
COALESCE(COALESCE(t2.tips,0) + COALESCE(c2.comments,0) + COALESCE(v2.votes,0)) AS `Totals`
FROM parent p
LEFT JOIN (SELECT t.user_id, COUNT(t.tip_id) AS tips FROM tips t GROUP BY t.user_id) t2
ON p.user_id = t2.user_id
LEFT JOIN (SELECT c.user_id, COUNT(c.tip_id) AS comments FROM comments c GROUP BY c.user_id) c2
ON p.user_id = c2.user_id
LEFT JOIN (SELECT v.user_id, COUNT(v.tip_id) AS votes FROM votes v GROUP BY v.user_id) v2
ON p.user_id = v2.user_id
WHERE p.user_id = 1;
Note: This uses a parent table so that a user still appears in the result even when they have no rows in one of the other tables.
The reason I use a sub-query in each JOIN is to create a virtual table that holds the count of rows per user for each table. I also had problems with DISTINCT when using your query, so I ended up with this one.
I know you prefer not to use sub-queries, but I couldn't make it work without them. For now this is all I can offer.
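A quick sketch of the pre-aggregated joins, run with Python's sqlite3 module instead of MySQL. The comment_id and vote_id columns and the sample rows are assumptions, and I dropped the WHERE clause so the totals for several users are visible at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent   (user_id INTEGER PRIMARY KEY);
CREATE TABLE tips     (tip_id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE votes    (vote_id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO parent VALUES (1), (2), (3);
INSERT INTO tips     (user_id) VALUES (1), (1);
INSERT INTO comments (user_id) VALUES (1);
INSERT INTO votes    (user_id) VALUES (1), (1), (1), (2);
""")

totals = conn.execute("""
SELECT p.user_id,
       COALESCE(t2.tips, 0) + COALESCE(c2.comments, 0) + COALESCE(v2.votes, 0) AS total
FROM parent p
-- each derived table has at most one row per user, so the joins cannot multiply rows
LEFT JOIN (SELECT user_id, COUNT(*) AS tips     FROM tips     GROUP BY user_id) t2
       ON p.user_id = t2.user_id
LEFT JOIN (SELECT user_id, COUNT(*) AS comments FROM comments GROUP BY user_id) c2
       ON p.user_id = c2.user_id
LEFT JOIN (SELECT user_id, COUNT(*) AS votes    FROM votes    GROUP BY user_id) v2
       ON p.user_id = v2.user_id
ORDER BY p.user_id
""").fetchall()
print(totals)
```

Because every sub-query is grouped by user_id before it is joined, no DISTINCT is needed, and a user with no activity at all (user 3 here) still gets a total of 0.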