Slow mysql query with joins - mysql

I have to run this query and it is pretty slow (4.86 seconds):
SELECT DISTINCT (users.id), users.*
FROM users
LEFT JOIN user_stages ON users.id = user_stages.user_id
LEFT JOIN user_tags ON users.id = user_tags.user_id
LEFT JOIN log ON log.user_id = users.id
ORDER BY last_activity DESC
When I do profiling it looks like Copying to tmp table takes 91% of the time (3.710409 seconds).
The size of the tables: users - almost 100,000 records, log - 1,443,000 records, user_stages - 66,000 records, user_tags - 260,000 records.
Indexes are in place; I can list all of them if needed. How can I rewrite the query or change the MySQL settings to make this query faster?

Assuming last_activity is in the users table, you can change the query to the following:
SELECT users.*
FROM users
ORDER BY last_activity DESC
Your query is selecting only columns from the users table. The left join ensures that all rows from the table appear at least once. The distinct is removing duplicates added by the other tables. Hence, the joins are unnecessary.
If last_activity is in another table, then you might need to join that information in.
Your joins are probably taking so much time because you are getting cross products of rows for each user from the various tables.
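To put a rough number on that cross product: with the table sizes from the question, an average user has about 14 log rows (1,443,000 / 100,000) and about 2.6 tag rows (260,000 / 100,000). A user with 14 log rows, 3 tags and 1 stage contributes 14 x 3 x 1 = 42 joined rows, all of which DISTINCT then has to collapse back into one. A minimal sketch for measuring this on the real data, using only the tables and columns from the question (the second statement may itself be slow):

```sql
-- Rows in users alone.
SELECT COUNT(*) AS user_rows
FROM users;

-- Rows the three LEFT JOINs generate before DISTINCT removes duplicates.
SELECT COUNT(*) AS joined_rows
FROM users
LEFT JOIN user_stages ON users.id = user_stages.user_id
LEFT JOIN user_tags ON users.id = user_tags.user_id
LEFT JOIN log ON log.user_id = users.id;
```

If joined_rows is many times larger than user_rows, the temporary table seen in the profile is mostly duplicate copies of the same users.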

SELECT `users`.*
FROM `users`
LEFT JOIN `user_stages` ON `users`.`id` = `user_stages`.`user_id`
LEFT JOIN `user_tags` ON `users`.`id` = `user_tags`.`user_id`
LEFT JOIN `log` ON `log`.`user_id` = `users`.`id`
GROUP BY `users`.`id`
ORDER BY `last_activity` DESC;

The query is built on the fly based on user's input. Sometimes it looks like this:
SELECT DISTINCT (users.id), users.*
FROM users
LEFT JOIN user_stages ON users.id = user_stages.user_id
LEFT JOIN user_tags ON users.id = user_tags.user_id
LEFT JOIN log ON log.user_id = users.id
WHERE user_stages.stage_id = 5
AND user_tags.tag_id = 10
ORDER BY last_activity DESC
The query was originally written with GROUP BY, but that was slower (about 8 seconds). I replaced GROUP BY with DISTINCT and it got faster, but still not fast enough. I would appreciate any suggestions.
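For the filtered variant, one option worth testing is to express each filter as an EXISTS subquery, so that matching rows in user_stages and user_tags never multiply the users rows and no DISTINCT is needed. This is only a sketch: it assumes last_activity is a column of users and that nothing from log is needed when log is not filtered on, and the index names are made up for illustration.

```sql
-- Assumption: last_activity lives in users; log is left out because
-- this variant of the query neither selects from it nor filters on it.
SELECT u.*
FROM users AS u
WHERE EXISTS (SELECT 1
              FROM user_stages AS us
              WHERE us.user_id = u.id
                AND us.stage_id = 5)
  AND EXISTS (SELECT 1
              FROM user_tags AS ut
              WHERE ut.user_id = u.id
                AND ut.tag_id = 10)
ORDER BY u.last_activity DESC;

-- Hypothetical composite indexes that let each EXISTS be answered
-- from the index alone (skip them if equivalent indexes already exist).
ALTER TABLE user_stages ADD INDEX idx_user_stages_user_stage (user_id, stage_id);
ALTER TABLE user_tags ADD INDEX idx_user_tags_user_tag (user_id, tag_id);
```

Because the WHERE clause in the original already rejects users without a matching stage and tag, this returns the same users as the LEFT JOIN version with DISTINCT.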

Related

query result taking time to load

I have a query that fetches data from six tables, but it takes too much time. The browser keeps loading and sometimes shows nothing as a result. When I run this query directly in the MySQL database, it takes a long time to execute.
SELECT SQL_CALC_FOUND_ROWS movies.*,
curriculums.name AS curriculum,
teachers.name AS teacher,
movie_sub_categories.name AS sub_cat_name,
movie_categories.name AS cat_name
FROM movies
LEFT JOIN curriculums on movies.curriculum_id = curriculums.id
LEFT JOIN teachers on movies.teacher_id = teachers.id
LEFT JOIN movies_movie_sub_categories on movies.id = movies_movie_sub_categories.movie_id
LEFT JOIN movie_sub_categories on movies_movie_sub_categories.movie_sub_category_id = movie_sub_categories.id
LEFT JOIN movie_categories on movie_sub_categories.movie_category_id = movie_categories.id
ORDER BY id LIMIT 0, 50
Here is all of my table structure:
That's not a very exciting query; it simply delivers the first 50 rows of whichever table id belongs to. When JOINing, please qualify columns so we know what is going on.
Do you really need LEFT?
Rather than computing the total with SQL_CALC_FOUND_ROWS on every request, find how many rows there are in movies only once.
Assuming you need LEFT and id belongs to movies, then this should run a lot faster:
SELECT movies.*, c.name AS curriculum,
t.name AS teacher, msc.name AS sub_cat_name,
mc.name AS cat_name
FROM ( SELECT id FROM movies ORDER BY id LIMIT 0, 50 ) AS m
JOIN movies USING(id)
LEFT JOIN curriculums AS c ON movies.curriculum_id = c.id
LEFT JOIN teachers AS t ON movies.teacher_id = t.id
LEFT JOIN movies_movie_sub_categories AS mmsc ON movies.id = mmsc.movie_id
LEFT JOIN movie_sub_categories AS msc ON mmsc.movie_sub_category_id = msc.id
LEFT JOIN movie_categories AS mc ON msc.movie_category_id = mc.id
ORDER BY m.id
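The one-time row count mentioned above could be as simple as the statement below. The alias total_movies is only an illustrative name.

```sql
-- Run once (and cache the result in the application) instead of
-- recomputing the total on every page request.
SELECT COUNT(*) AS total_movies
FROM movies;
```

Note that this counts movies, not joined rows: if a movie can belong to several sub-categories, the joined result contains more rows than this number.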
Please use SHOW CREATE TABLE; we need to see if you have sufficient indexes, such as
mmsc: INDEX(movie_id)
The table movies_movie_sub_categories needs an index on movie_id and a separate index on movie_sub_category_id. Without those two indexes, MySQL will be forced to scan every record twice (since the query has two separate join clauses that reference that table).
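As a sketch, the two indexes could be added as follows. The index names are invented here, and either statement part should be dropped if SHOW CREATE TABLE shows that an equivalent index already exists.

```sql
-- Check what is already there first.
SHOW CREATE TABLE movies_movie_sub_categories;

-- Hypothetical index names; the columns come from the query above.
ALTER TABLE movies_movie_sub_categories
    ADD INDEX idx_mmsc_movie_id (movie_id),
    ADD INDEX idx_mmsc_sub_category_id (movie_sub_category_id);
```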

Query performance with Left Join vs Inner Join - MySQL

I wanted to know the difference between the 2 queries. I have 2 tables: Users and Emails.
User schema - id, name, email, is_subscribed, created, modified.
Email schema - id, user_id, sent_at, subject.
So I need to find the count of those users who have received more than 20 emails in total.
The User table has roughly 100K records, and the Emails table has nearly 4 million records.
1st Query
SELECT u.id, u.email, count(u.id)
FROM emails as e
LEFT JOIN users as u
ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id HAVING count(u.id) > 20
2nd Query
SELECT u.id, u.email, count(u.id)
FROM users as u
INNER JOIN emails as e
ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id HAVING count(u.id) > 20
What I have tried:
1) On production, these queries take forever to execute, so locally I created sample tables with dummy records, i.e.
User table - around 5 records and Emails table - around 100 records.
When I execute the two queries above, I get the same result set for both, and profiling shows the same execution time for both (which may differ on production), so it is hard to know which is the better one. (This may not be the optimal way to find the solution.)
2) I used EXPLAIN with the queries, and it shows that all 100 rows of the emails table are scanned in both cases.
Please let me know if I have missed any specifics. I will update the question.
Read about MySQL LEFT JOIN optimization. The DBMS can tell that the WHERE clause of your LEFT JOIN query filters out all the NULL-extended rows that a LEFT JOIN produces and an INNER JOIN does not, so it just does an INNER JOIN.
MySQL 5.7 Reference Manual
9.2.1.9 LEFT JOIN and RIGHT JOIN Optimization
For a LEFT JOIN, if the WHERE condition is always false for the generated NULL row, the LEFT JOIN is changed to a normal join.
(Since you don't want NULL-extended rows, why would you use LEFT JOIN?)
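One way to check this on your own server is to run EXPLAIN on the LEFT JOIN query and then SHOW WARNINGS, which prints the statement as the optimizer rewrote it. The sketch below uses a simplified select list (only the grouped column and the count) so that it also runs with ONLY_FULL_GROUP_BY enabled. On MySQL 5.7 and later the extended information is produced by default; older versions need EXPLAIN EXTENDED instead.

```sql
EXPLAIN
SELECT e.user_id, COUNT(u.id)
FROM emails AS e
LEFT JOIN users AS u ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY e.user_id
HAVING COUNT(u.id) > 20;

-- Shows the rewritten statement; if the conversion described above
-- happened, it should contain a plain join rather than a left join.
SHOW WARNINGS;
```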
Please try the query below:
SELECT u.id, u.email, count(u.id)
FROM users as u
INNER JOIN emails as e ON e.user_id = u.id
WHERE u.is_subscribed = 1
GROUP BY u.id
HAVING count(u.id) > 20
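Another variant that may be worth comparing on the production data is to aggregate emails on their own first and only then join the surviving user ids to users. This is a sketch, not a guaranteed improvement: it assumes an index on emails.user_id (the name idx_emails_user_id is made up), and email_count is an illustrative alias.

```sql
-- Hypothetical index so the GROUP BY can read user_id from the index.
ALTER TABLE emails ADD INDEX idx_emails_user_id (user_id);

-- Count per user_id first, keep only those above 20, then join to users.
SELECT u.id, u.email, e.email_count
FROM (
    SELECT user_id, COUNT(*) AS email_count
    FROM emails
    GROUP BY user_id
    HAVING COUNT(*) > 20
) AS e
INNER JOIN users AS u ON u.id = e.user_id
WHERE u.is_subscribed = 1;
```

The join then touches only the users who pass the count threshold instead of all 4 million email rows.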

Optimize Query with MySQL EXPLAIN

I'm a bit of a db noob and have a nasty query that is taking over 30 seconds to run. I'm trying to learn a bit more about EXPLAIN and optimize the query but am at a loss. Here is the query:
SELECT
feed.*, users.username, smf_attachments.id_attach AS avatar,
games.name AS item_name, games.image, feed.item_id, u2.username AS follow_name
FROM feed
INNER JOIN following ON following.follow_id = feed.user_id AND following.user_id = 1
LEFT JOIN users ON users.id = feed.user_id
LEFT JOIN smf_members ON smf_members.member_name = users.username
LEFT JOIN smf_attachments ON smf_attachments.id_member = smf_members.id_member
LEFT JOIN games ON games.id = feed.item_id
LEFT JOIN users u2 ON u2.id = feed.item_id
ORDER BY feed.timestamp DESC
LIMIT 25
Explain results:
What you want to avoid in your execution plan (the output of an EXPLAIN statement) is a full table scan, which EXPLAIN reports as ALL in the type column. To avoid it, you need to create the correct indexes on your tables.
A table scan means the query engine reads every row of the table sequentially. With index access, the query engine goes more directly to the relevant data.
More explanation here: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
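Since the EXPLAIN output is not shown, the following is only a guess at which indexes the query might benefit from, based on the columns in its ON and ORDER BY clauses. All index names are invented, and any index that SHOW INDEX reports as already present should be skipped.

```sql
-- See which indexes exist before adding anything.
SHOW INDEX FROM feed;
SHOW INDEX FROM following;

-- Candidate indexes for the join and sort columns used in the query.
ALTER TABLE following ADD INDEX idx_following_user_follow (user_id, follow_id);
ALTER TABLE feed ADD INDEX idx_feed_user_timestamp (user_id, `timestamp`);
ALTER TABLE smf_members ADD INDEX idx_members_member_name (member_name);
ALTER TABLE smf_attachments ADD INDEX idx_attachments_id_member (id_member);
```

After adding them, rerun EXPLAIN and check whether any table still shows ALL in the type column.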

Joining sql statements

I have this statement:
SELECT board.*, numlikes
FROM board
LEFT JOIN (SELECT
pins.board_id, COUNT(source_user_id) AS numlikes
FROM likes
INNER JOIN pins ON pins.id = likes.pin_id
GROUP BY pins.board_id) likes ON board.id = likes.board_id
WHERE who_can_tag = ''
ORDER BY numlikes DESC LIMIT 10
But I need to also join these other two statements to it:
SELECT COUNT(owner_user_id)
FROM repin
INNER JOIN pins ON pins.id = repin.from_pin_id
WHERE pins.board_id = '$id'
and
SELECT COUNT(is_following_board_id)
FROM follow
WHERE is_following_board_id = '$id'
I managed to get the first one joined but I'm having trouble with the others - thinking it might get too long.
Is there a quicker way to execute?
Ideally, start with the smallest result set, and then start joining to the next smallest table.
You don't want the database to do full table joins on a bunch of big tables, and then at the end have a where clause that removes 99% of the rows the database just created.
In Oracle, I do a:
SELECT *
FROM big_table bt
JOIN DUAL ON bt.best_filter_column='the_value'
--now there are only a few rows
JOIN other_table_1 ...
LEFT JOIN outer_join_tables ...
Include all OUTER JOINS last, since they don't drop any rows, so hopefully you've already filtered out a lot of rows.
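Coming back to the original question, one way to fold the two extra counts into the first statement is to give each count its own grouped derived table and LEFT JOIN all three to board. This is a sketch under assumptions: it presumes source_user_id is a column of likes, owner_user_id a column of repin, and who_can_tag a column of board; the aliases like_counts, repin_counts, follow_counts, numrepins and numfollowers are invented for illustration.

```sql
SELECT board.*,
       like_counts.numlikes,
       repin_counts.numrepins,
       follow_counts.numfollowers
FROM board
LEFT JOIN (
    -- Likes per board (assumes source_user_id belongs to likes).
    SELECT pins.board_id, COUNT(likes.source_user_id) AS numlikes
    FROM likes
    INNER JOIN pins ON pins.id = likes.pin_id
    GROUP BY pins.board_id
) AS like_counts ON like_counts.board_id = board.id
LEFT JOIN (
    -- Repins per board (assumes owner_user_id belongs to repin).
    SELECT pins.board_id, COUNT(repin.owner_user_id) AS numrepins
    FROM repin
    INNER JOIN pins ON pins.id = repin.from_pin_id
    GROUP BY pins.board_id
) AS repin_counts ON repin_counts.board_id = board.id
LEFT JOIN (
    -- Followers per board.
    SELECT is_following_board_id AS board_id,
           COUNT(is_following_board_id) AS numfollowers
    FROM follow
    GROUP BY is_following_board_id
) AS follow_counts ON follow_counts.board_id = board.id
WHERE board.who_can_tag = ''
ORDER BY like_counts.numlikes DESC
LIMIT 10;
```

Each derived table returns at most one row per board, so the three joins do not multiply rows against each other. Boards with no likes, repins or followers get NULL in the corresponding column.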

MySQL query performance problem - INNER JOIN, ORDER BY, DESC

I have got this query:
SELECT
t.type_id, t.product_id, u.account_id, t.name, u.username
FROM
types AS t
INNER JOIN
( SELECT user_id, username, account_id
FROM users WHERE account_id=$account_id ) AS u
ON
t.user_id = u.user_id
ORDER BY
t.type_id DESC
1st question:
It takes around 30 seconds to run at the moment, with only 18k records in the types table.
The only indexes at the moment are the primary keys on the id columns.
Would the long time be caused by a lack of more indexes? Or would it be more to do with the structure of this query?
2nd question:
How can I add the LIMIT so I only get 100 records with the highest type_id?
I think it will be perhaps 100 times faster, with the same results, if you join the users table directly instead of using a subselect. The subselect is not needed at all in this case.
You can just add LIMIT 100 to get only the first 100 results (or fewer if there aren't 100).
SELECT SQL_CALC_FOUND_ROWS /* Calculate the total number of rows, without the LIMIT */
t.type_id, t.product_id, u.account_id, t.name, u.username
FROM
types t
INNER JOIN users u ON u.user_id = t.user_id
WHERE
u.account_id = $account_id
ORDER BY
t.type_id DESC
LIMIT 100
Then, execute a second query to get the total number of rows that is calculated.
SELECT FOUND_ROWS()
That sub select on MySQL is going to slow down your query. I'm assuming that this
SELECT user_id, username, account_id
FROM users WHERE account_id=$account_id
doesn't return many rows at all. If that's the case then the sub select alone won't explain the delay you're seeing.
Try throwing an index on user_id in your types table. Without it, you're doing a full table scan of 18k records for each record returned by that sub select.
Inner join the users table and add that index and I bet you see a huge increase in speed.
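A sketch of the indexes suggested above, using the column names from the question. The index names are invented, and the second index on users.account_id is an additional assumption that the WHERE filter would benefit from it.

```sql
-- Lets the join find the types rows for each user without a full scan.
ALTER TABLE types ADD INDEX idx_types_user_id (user_id);

-- Assumption: users is filtered by account_id often enough to index it.
ALTER TABLE users ADD INDEX idx_users_account_id (account_id);
```

Running EXPLAIN on the rewritten query before and after adding them shows whether MySQL actually picks them up.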