I have a simple MySQL InnoDB database with two tables: users and dialogues. I am trying to make a LEFT JOIN query, but I've run into a performance problem.
When I execute the following statement,
EXPLAIN SELECT u.id FROM users u
LEFT JOIN dialogues d ON u.id = d.creator_id
I get a response showing that the DB uses the join types (the type column of EXPLAIN) index and ref, which is totally fine.
However, when I add an additional clause:
EXPLAIN SELECT u.id FROM users u
LEFT JOIN dialogues d ON (u.id = d.creator_id OR u.id = d.target_id)
suddenly the DB indicates that it uses the ALL join type (a full table scan) when JOINing, which in turn makes the actual query several times slower.
Is there something that could be done to make the DB use a more efficient join type in the second example?
d.creator_id and d.target_id columns have foreign keys connected to u.id.
It is usually faster to do two left joins and coalesce() in the select:
SELECT d.*,
COALESCE(uc.name, ut.name) as name
FROM dialogues d LEFT JOIN
users uc
ON uc.id = d.creator_id LEFT JOIN
users ut
ON ut.id = d.target_id
I have a products table where I include 3 columns, created_user_id, updated_user_id and in_charge_user_id, all of which are related to my user table, where I store the id and name of the users.
I want to build an efficient query to obtain the names of the corresponding user_id's.
The query that I build so far is the following:
SELECT products.*,
(SELECT name FROM user WHERE user_id = products.created_user_id) as created_user,
(SELECT name FROM user WHERE user_id = products.updated_user_id) as updated_user,
(SELECT name FROM user WHERE user_id = products.in_charge_user_id) as in_charge_user
FROM products
The problem with this query is that if I have 30,000 records, I am executing 3 more queries per row.
What would be a more efficient way of achieving this? I am using mysql.
For each type of user id (created, updated, in_charge) you would JOIN the user table once:
SELECT
products.*,
u1.name AS created_username,
u2.name AS updated_username,
u3.name AS in_charge_username
FROM products
JOIN user u1 ON products.created_user_id = u1.user_id
JOIN user u2 ON products.updated_user_id = u2.user_id
LEFT JOIN user u3 ON products.in_charge_user_id = u3.user_id
This is the commonly recommended way to obtain the data.
It is similar to your query with sub-selects, but it is a more modern approach that I think the database can optimize and utilize better.
Important:
You need a foreign key index on all the user_id fields in both tables!
Then the query will be very fast no matter how many rows are in the table. This requires an engine which supports foreign keys, like InnoDB.
LEFT JOIN or INNER JOIN ?
The other answers suggest a LEFT JOIN for every join, but I would not do that.
If you have a user id in the products table, there MUST be a linked user_id in the user table, except for the in_charge_user, which is only present sometimes. Otherwise the data would be semantically corrupt. The foreign keys ensure that you always have a linked user_id and that a user_id can only be deleted when there is no linked product left.
JOIN is equivalent to INNER JOIN.
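To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL, so it only illustrates the join semantics, not MySQL performance. The two sample users and two sample products are made up for the example, and the NOT NULL constraints are an assumption about how the columns would be declared.

```python
import sqlite3

# In-memory SQLite database; schema and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    created_user_id INTEGER NOT NULL REFERENCES user(user_id),
    updated_user_id INTEGER NOT NULL REFERENCES user(user_id),
    in_charge_user_id INTEGER REFERENCES user(user_id)  -- optional
);
INSERT INTO user VALUES (1, 'ann'), (2, 'bob');
INSERT INTO products VALUES (10, 1, 2, 1), (11, 2, 2, NULL);
""")

# INNER JOIN for the mandatory ids, LEFT JOIN for the optional one.
mixed = conn.execute("""
SELECT p.id, u1.name, u2.name, u3.name
FROM products p
JOIN user u1 ON p.created_user_id = u1.user_id
JOIN user u2 ON p.updated_user_id = u2.user_id
LEFT JOIN user u3 ON p.in_charge_user_id = u3.user_id
ORDER BY p.id
""").fetchall()

# INNER JOIN everywhere: the product without an in-charge user disappears.
all_inner = conn.execute("""
SELECT p.id, u1.name, u2.name, u3.name
FROM products p
JOIN user u1 ON p.created_user_id = u1.user_id
JOIN user u2 ON p.updated_user_id = u2.user_id
JOIN user u3 ON p.in_charge_user_id = u3.user_id
ORDER BY p.id
""").fetchall()

print(mixed)      # both products, in_charge name is None for product 11
print(all_inner)  # only product 10
```

So the LEFT JOIN is only needed for the column that may legitimately be empty.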
You can use LEFT JOIN instead of subselects.
Your query should be like:
SELECT
P.*,
CU.name AS created_user,
UU.name AS updated_user,
IU.name AS in_charge_user
FROM products AS P
LEFT JOIN user AS CU ON CU.user_id = P.created_user_id
LEFT JOIN user AS UU ON UU.user_id = P.updated_user_id
LEFT JOIN user AS IU ON IU.user_id = P.in_charge_user_id
First, your query should be fine. You only need an index on user(user_id) or better yet user(user_id, name) for performance. I imagine that the first exists.
Second, you can write this using LEFT JOIN:
SELECT p.*, uc.name as created_user,
uu.name as updated_user, uin.name as in_charge_user
FROM products p LEFT JOIN
user uc
ON uc.user_id = p.created_user_id LEFT JOIN
user uu
ON uu.user_id = p.updated_user_id LEFT JOIN
user uin
ON uin.user_id = p.in_charge_user_id;
With one of the above indexes, the two methods should have very similar performance.
Also note the use of LEFT JOIN. This handles the case where one or more of the user ids is missing.
Try the query below:
SELECT products.*, c.name as created_user,u.name as updated_user,i.name as in_charge_user
FROM products left join user c on(products.created_user_id=c.user_id ) left join user u on(products.updated_user_id=u.user_id ) left join user i on(products.in_charge_user_id=i.user_id )
Also, as Gordon Linoff mentioned, creating an index on the user table will fetch your data faster.
I wanted to know the difference between the 2 queries. I have 2 tables: Users and Emails.
User schema - id, name, email, is_subscribed, created, modified.
Email schema - id, user_id, sent_at, subject.
So I need to find the users who have received more than 20 emails in total.
The Users table has roughly 100K records, and the Emails table has nearly 4 million records.
1st Query
SELECT u.id, u.email, count(u.id)
FROM emails as e
LEFT JOIN users as u
ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id HAVING count(u.id) > 20
2nd Query
SELECT u.id, u.email, count(u.id)
FROM users as u
INNER JOIN emails as e
ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id HAVING count(u.id) > 20
What I have tried:
1) On production, these queries take forever to execute, so locally I created sample tables with dummy records, i.e.
User table - around 5 records and Emails table around 100 records.
When I execute the above two queries I get the same result set for both, and when I profile them I get the same execution time for both (which may be different on production), so it is hard to know which is the better one. (This may not be the optimal way to find the solution.)
2) I used EXPLAIN with the queries, and it shows that all 100 rows of the emails table are scanned in both cases.
Please let me know if I have missed any specifics. I will update the question.
Read about MySQL LEFT JOIN optimization. The DBMS can tell that your LEFT JOIN's WHERE filters out all the NULL-extended rows that a LEFT JOIN produces and an INNER JOIN does not, so it just does an INNER JOIN.
MySQL 5.7 Reference Manual
9.2.1.9 LEFT JOIN and RIGHT JOIN Optimization
For a LEFT JOIN, if the WHERE condition is always false for the generated NULL row, the LEFT JOIN is changed to a normal join.
(Since you don't want NULL-extended rows, why would you use LEFT JOIN?)
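The equivalence can be checked on a toy data set. The following is a rough sketch with Python's sqlite3 module, not MySQL, so it shows only that the two queries return the same rows, not how MySQL plans them. The sample rows are invented, the HAVING threshold is lowered from 20 to 1 to fit the tiny data set, and the GROUP BY lists the selected user columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, is_subscribed INTEGER);
CREATE TABLE emails (id INTEGER PRIMARY KEY, user_id INTEGER, subject TEXT);
INSERT INTO users VALUES (1, 'a@example.com', 1), (2, 'b@example.com', 0);
-- email 4 points at a user id that does not exist (hypothetical orphan row)
INSERT INTO emails VALUES (1, 1, 'x'), (2, 1, 'y'), (3, 2, 'z'), (4, 99, 'orphan');
""")

# LEFT JOIN: the orphan email yields a NULL-extended row, but the WHERE
# condition on u.is_subscribed is never true for NULL, so it is dropped.
left_rows = conn.execute("""
SELECT u.id, u.email, COUNT(u.id)
FROM emails AS e
LEFT JOIN users AS u ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY u.id, u.email
HAVING COUNT(u.id) > 1
""").fetchall()

inner_rows = conn.execute("""
SELECT u.id, u.email, COUNT(u.id)
FROM users AS u
INNER JOIN emails AS e ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY u.id, u.email
HAVING COUNT(u.id) > 1
""").fetchall()

print(left_rows)
print(inner_rows)
```

Both queries return the single subscribed user with two emails, which is why the choice between them should not change the result here.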
Please try the query below:
SELECT u.id, u.email, count(u.id)
FROM users as u
INNER JOIN emails as e ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY u.id
HAVING count(u.id) > 20
I'm a bit of a db noob and have a nasty query that is taking over 30 seconds to run. I'm trying to learn a bit more about EXPLAIN and optimize the query but am at a loss. Here is the query:
SELECT
feed.*, users.username, smf_attachments.id_attach AS avatar,
games.name AS item_name, games.image, feed.item_id, u2.username AS follow_name
FROM feed
INNER JOIN following ON following.follow_id = feed.user_id AND following.user_id = 1
LEFT JOIN users ON users.id = feed.user_id
LEFT JOIN smf_members ON smf_members.member_name = users.username
LEFT JOIN smf_attachments ON smf_attachments.id_member = smf_members.id_member
LEFT JOIN games ON games.id = feed.item_id
LEFT JOIN users u2 ON u2.id = feed.item_id
ORDER BY feed.timestamp DESC
LIMIT 25
Explain results:
The result you want to avoid in your execution plan (the output of an EXPLAIN statement) is a full table scan, which EXPLAIN shows as ALL in the type column. To avoid it, you need to create the correct indexes on your tables.
If you have a table scan, it means the query engine reads each row of the table sequentially. With index access, the query engine goes more directly to the relevant data.
More explanation here: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
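The effect of adding an index on the plan can be seen on a small scale. This is only an analogy: the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, whose output format differs from MySQL's EXPLAIN. The cut-down feed table and the index name idx_feed_user are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of the feed table.
conn.execute("CREATE TABLE feed (id INTEGER PRIMARY KEY, user_id INTEGER, item_id INTEGER)")

def plan(sql):
    """Return the detail text of SQLite's query plan for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)

query = "SELECT * FROM feed WHERE user_id = 1"

before = plan(query)  # no index on user_id: SQLite reports a scan
conn.execute("CREATE INDEX idx_feed_user ON feed (user_id)")
after = plan(query)   # with the index: SQLite reports an index search

print(before)
print(after)
```

In MySQL the corresponding check would be to run EXPLAIN before and after creating the index and compare the type and key columns.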
In this sql:
SELECT s.*,
u.id,
u.name
FROM shops s
LEFT JOIN users u ON u.id = s.user_id
OR u.id = s.owner_user_id
WHERE s.status = 1
For some reason this query takes an amazingly long time, although id is the primary key. It became slow right after I added the OR u.id = s.owner_user_id part. owner_user_id is usually 0 and holds a real id only a handful of times. But why would it take so long, apparently scanning the whole table? The users table is very long and wide. I didn't design it; it belongs to a client whose subsequent programmers added too many fields. The table has 22k rows and dozens of fields.
*The field names are for demonstration only; the actual names are different, so don't ask me why I'm looking for owner_user_id (; I did solve the slowness by removing the "OR ..." part and instead looking up the id in a loop if it is not 0, but I would like to know why this is happening and how to speed up the query as is.
You may be able to speed it up by using IN instead of the OR but that is minor.
SELECT u.id,
u.name
FROM shops s
LEFT JOIN users u ON u.id IN ( s.user_id, s.owner_user_id )
WHERE s.status = 1
Firstly, are there any indexes on these tables? Mainly one on the users.id field or on s.user_id or s.owner_user_id?
However, I must ask why you need a LEFT JOIN instead of a regular join. A LEFT JOIN keeps every row from shops, even when no user matches. Since I'm assuming the id will always be in either the user_id or the owner_user_id field, and that there will always be a match, a plain JOIN should speed the query up a bit.
And as Mitch said, 22k rows is tiny.
How are you going to know which user record is which? Here's how I'd do it
SELECT s.*,
u.name AS user_name,
o.name AS owner_name
FROM shops s
LEFT JOIN users u ON s.user_id = u.id
LEFT JOIN users o ON s.owner_user_id = o.id
WHERE s.status = 1
I've omitted the IDs from the user table in the SELECT as these will be part of s.* anyway.
I'm curious about the left joins too. If shops.user_id and shops.owner_user_id are required foreign keys, use inner joins instead.
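The difference in result shape between the OR join and the two separate joins can be sketched as follows. This uses Python's sqlite3 module instead of MySQL and two invented shops, so it says nothing about speed; it only shows which rows come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE shops (id INTEGER PRIMARY KEY, user_id INTEGER,
                    owner_user_id INTEGER, status INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
-- shop 101 has owner_user_id = 0, i.e. no owner (hypothetical sample rows)
INSERT INTO shops VALUES (100, 1, 2, 1), (101, 1, 0, 1);
""")

# OR in the join condition: a shop with both ids set yields two rows,
# and nothing says which row is the user and which is the owner.
or_rows = conn.execute("""
SELECT s.id, u.name
FROM shops s
LEFT JOIN users u ON u.id = s.user_id OR u.id = s.owner_user_id
WHERE s.status = 1
ORDER BY s.id, u.name
""").fetchall()

# Two joins on users: one row per shop, with user and owner in named columns.
two_join_rows = conn.execute("""
SELECT s.id, u.name AS user_name, o.name AS owner_name
FROM shops s
LEFT JOIN users u ON s.user_id = u.id
LEFT JOIN users o ON s.owner_user_id = o.id
WHERE s.status = 1
ORDER BY s.id
""").fetchall()

print(or_rows)
print(two_join_rows)
```

Each of the two joins compares users.id with a single column, which is the form a primary key lookup can serve directly.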
Hey guys, quick question. I always use LEFT JOIN, but when I left join twice I always get funny results, usually duplicates. I am currently working on a query that left joins twice to retrieve the information I need, but I was wondering whether it is possible to build another SELECT statement in, so that I do not need two left joins or two queries, or whether there is a better way. For example, if I could first select the topic.creator in table.topic AS something, then I could select that variable in users and left join table.scrusersonline. Thanks in advance for any advice.
SELECT * FROM scrusersonline
LEFT JOIN users ON users.id = scrusersonline.id
LEFT JOIN topic ON users.username = topic.creator
WHERE scrusersonline.topic_id = '$topic_id'
The whole point of this query is to check if the topic.creator is online by retrieving his name from table.topic and matching his id in table.users, then checking if he is in table.scrusersonline. It produces duplicate entries unfortunately and is thus inaccurate in my mind.
You use a LEFT JOIN when you want data back regardless. In this case, if the creator is offline, getting no rows back would be a good indication - so remove the LEFT joins and just do regular joins.
SELECT *
FROM scrusersonline AS o
JOIN users AS u ON u.id = o.id
JOIN topic AS t ON u.username = t.creator
WHERE o.topic_id = '$topic_id'
One option is to group your joins thus:
SELECT *
FROM scrusersonline
LEFT JOIN (users
JOIN topic ON users.username = topic.creator)
ON users.id = scrusersonline.id
WHERE scrusersonline.topic_id = '$topic_id'
Try:
select * from topic t
left outer join (
users u
inner join scrusersonline o on u.id = o.id
) on t.creator = u.username
If o.id is null, the user is offline.
Wouldn't it be better to match against topic_id in the topic table by moving the condition to the join? I think it will solve your problem, since the duplicates come from joining with the topic table:
SELECT * FROM scrusersonline
JOIN users
ON users.id = scrusersonline.id
LEFT JOIN topic
ON scrusersonline.topic_id = '$topic_id'
AND users.username = topic.creator
By the way, the LEFT JOIN with users is not required, since you seem to be searching for the intersection between scrusersonline and users.
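One plausible source of the duplicates from the question is a creator who has started more than one topic, so the join on username matches several topic rows. The sketch below reproduces that with Python's sqlite3 module and invented rows. It assumes the topic table has an id column that scrusersonline.topic_id refers to, which the question does not state, and it binds the topic id as a query parameter instead of interpolating '$topic_id' into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE topic (id INTEGER PRIMARY KEY, creator TEXT);  -- id column is assumed
CREATE TABLE scrusersonline (id INTEGER, topic_id INTEGER);
INSERT INTO users VALUES (1, 'ann');
INSERT INTO topic VALUES (5, 'ann'), (6, 'ann');  -- ann created two topics
INSERT INTO scrusersonline VALUES (1, 5);         -- ann is online in topic 5
""")

topic_id = 5

# Joining topic on the username alone matches every topic ann created.
original = conn.execute("""
SELECT o.id, t.id
FROM scrusersonline AS o
LEFT JOIN users AS u ON u.id = o.id
LEFT JOIN topic AS t ON u.username = t.creator
WHERE o.topic_id = ?
ORDER BY t.id
""", (topic_id,)).fetchall()

# Also tying the topic row to the topic being viewed leaves one row.
restricted = conn.execute("""
SELECT o.id, t.id
FROM scrusersonline AS o
JOIN users AS u ON u.id = o.id
JOIN topic AS t ON u.username = t.creator AND t.id = o.topic_id
WHERE o.topic_id = ?
""", (topic_id,)).fetchall()

print(original)    # two rows for the same online user
print(restricted)  # a single row
```

With the extra condition, an empty result means the creator of that topic is not online in it.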