Can't figure out why this MySQL query is slow - mysql

I have one particular MySQL query which is slow, and I can't figure out why.
SELECT
s.title,
p.minPrice,
s.booking, r.url
FROM shows s
INNER JOIN showResources r
ON r.showID = s.id
INNER JOIN performances p
ON p.showID = s.id
WHERE s.lastDate >= CURDATE()
AND r.type = 'rectangle-poster'
AND p.minPrice > 0
GROUP BY s.id
ORDER BY p.minPrice ASC
LIMIT 30
The EXPLAIN for this query is as follows:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | s | range | PRIMARY,lastDate | lastDate | 4 | NULL | 291 | Using index condition; Using temporary; Using filesort
1 | SIMPLE | r | ref | showID,type | showID | 5 | thistle.s.id | 1 | Using where
1 | SIMPLE | p | ref | showID,minPrice | showID | 5 | thistle.s.id | 1 | Using where
Other, seemingly far more complex queries on the same server are blisteringly fast - but this one typically takes about 4 seconds to run, and I just can't figure out why. I've even gone as far as deleting the tables and recreating them just in case it was some weird corruption, but no luck. Can a MySQL expert tell me what I'm doing wrong here?

Try this:
SELECT
s.id AS id,
s.title,
p.minPrice AS min_price,
s.booking,
r.url
FROM shows s
INNER JOIN showResources r
ON r.showID = s.id AND s.lastDate >= CURDATE() AND r.type = 'rectangle-poster'
INNER JOIN performances p
ON p.showID = s.id AND p.minPrice > 0
GROUP BY id
ORDER BY min_price ASC
LIMIT 30
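One further thing worth checking, offered as a sketch rather than a confirmed fix: both versions of the query select p.minPrice without aggregating it while grouping by show, so the price returned for each show is not deterministic. Assuming the intent is the cheapest performance per show, and assuming composite indexes like the ones below do not already exist (the index names are made up for illustration), something along these lines may let MySQL filter each joined table from its index instead of reading rows and applying "Using where":

```sql
-- Hypothetical index names; the columns come from the question.
CREATE INDEX idx_performances_show_price ON performances (showID, minPrice);
CREATE INDEX idx_showresources_show_type ON showResources (showID, type);

-- Assumes "cheapest price per show" is what the ORDER BY is meant to rank by.
SELECT
    s.title,
    MIN(p.minPrice) AS min_price,
    s.booking,
    MIN(r.url)      AS url          -- picks one poster if a show has several
FROM shows s
INNER JOIN showResources r
    ON r.showID = s.id AND r.type = 'rectangle-poster'
INNER JOIN performances p
    ON p.showID = s.id AND p.minPrice > 0
WHERE s.lastDate >= CURDATE()
GROUP BY s.id, s.title, s.booking
ORDER BY min_price ASC
LIMIT 30;
```

Sorting on an aggregate still needs a temporary table, so compare the EXPLAIN output before and after rather than expecting "Using temporary; Using filesort" to disappear.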

Related

Find employees latest activity is slow when adding ORDER BY

I am working on a legacy system in Laravel and I am trying to pull the latest action of some specific types of actions an employee has done.
Performance is good when I don't add ORDER BY. When adding it the query will go from something like 130 ms to 18 seconds. There are about 1.5 million rows in the actions table.
How do I fix the performance problem?
I have tried to isolate the problem by cutting out all the other parts of the query so it is more readable for you:
SELECT
employees.id,
(
SELECT DATE_FORMAT(actions.date, '%Y-%m-%d')
FROM pivot
JOIN actions
ON pivot.actions_id = actions.id
WHERE employees.id = pivot.employee_id
AND (actions.type = 'meeting'
OR (actions.type = 'phone_call'
AND JSON_VALID(actions.data) = 1
AND actions.data->>'$.update_status' = 1))
LIMIT 1
) AS latest_action
FROM employees
ORDER BY latest_action DESC
I tried using LEFT JOIN and MAX() instead but it didn't seem to solve my problem.
I only used a subquery because the original query is already very complex. If you have an alternative suggestion, I am all ears.
UPDATE
Result of EXPLAIN:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | PRIMARY | employees | NULL | ALL | NULL | NULL | NULL | NULL | 15217 | 10 | Using where
2 | DEPENDENT SUBQUERY | pivot | NULL | ref | actions_type_index,pivot_type_index | pivot_type_index | 4 | dev.employees.id | 104 | 11.11 | Using index condition
2 | DEPENDENT SUBQUERY | actions | NULL | eq_ref | PRIMARY,Logs | PRIMARY | 4 | dev.pivot.actions_id | 1 | 6.68 | Using where
UPDATE 2
Here are the indexes. I don't think employee_type is important for my specific query, but maybe the indexes should be reworked?
# pivot table
KEY `actions_type_index` (`actions_id`,`employee_type`),
KEY `pivot_type_index` (`employee_id`,`employee_type`)
# actions table
KEY `Logs` (`type`,`id`,`is_log`)
# I tried to add `date` index to `actions` table but the problem remains.
KEY `date_index` (`date`)
First of all, your query is far from optimal.
I would rewrite it this way:
SELECT
e.id,
DATE_FORMAT(MAX(a.date), '%Y-%m-%d') AS latest_action
FROM employees e
LEFT JOIN pivot p ON p.employee_id = e.id
LEFT JOIN actions a ON p.actions_id = a.id AND (a.type = 'meeting'
OR (a.type = 'phone_call'
AND JSON_VALID(a.data) = 1
AND a.data->>'$.update_status' = 1))
GROUP BY e.id
ORDER BY latest_action DESC
Obviously there must be indexes on p.employee_id, p.actions_id and a.date. An index on a.type would also help.
It would also be good to replace a.data->>'$.update_status' with a plain column that has an index on it.
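A minimal sketch of that last suggestion, assuming MySQL 5.7 or later (generated columns) and assuming the status values are short scalars; the column name update_status and the index name are invented for illustration and are not part of the original schema:

```sql
-- Assumption: values under $.update_status fit in 32 characters; widen if not.
-- The JSON_VALID guard mirrors the check in the original query, so rows holding
-- invalid JSON produce NULL instead of an error.
ALTER TABLE actions
    ADD COLUMN update_status VARCHAR(32)
        GENERATED ALWAYS AS (
            IF(JSON_VALID(data),
               JSON_UNQUOTE(JSON_EXTRACT(data, '$.update_status')),
               NULL)
        ) STORED;

-- Hypothetical composite index covering the filter columns and the date.
CREATE INDEX idx_actions_type_status_date
    ON actions (type, update_status, date);

-- The join condition could then read:
--   a.type = 'meeting'
--   OR (a.type = 'phone_call' AND a.update_status = '1')
```

Comparing against the string '1' is not exactly the same as the original numeric comparison (a stored value such as 1.0 would no longer match), so check how the status is actually written before switching.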

How to convert dependent subquery to join for better performance?

I have a database that stores "themes" and every theme is associated with a whole bunch of images (=screenshots of these themes). Now I want to display the latest 10 themes and for every theme I only want to get one single image from the database (the one with the lowest ID).
Currently my query looks like this (I am using a subquery):
SELECT DISTINCT
t.theme_id, t.theme_name, theme_date_last_modification, image_id, image_type
FROM
themes t, theme_images i
WHERE
i.theme_id = t.theme_id
AND t.theme_status = 3
AND t.theme_date_added < now( )
AND i.image_id = (
SELECT MIN( image_id )
FROM theme_images ii
WHERE ii.theme_id = t.theme_id
)
GROUP BY
t.theme_id
ORDER BY
t.theme_date_last_modification DESC
LIMIT 10
It works, but the query is very slow. When I use EXPLAIN I can see that there's a "dependent subquery". Is it possible to convert this dependent subquery into some kind of join that can be processed faster by mysql?
P.S.: My actual query is much more complex and makes use of more tables. I have already tried to simplify it as much as possible so that you can concentrate on the actual reason for the performance-problems.
EDIT:
This is the output of EXPLAIN:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | t | index | PRIMARY,themes | themes | 212 | NULL | 5846 | Using where; Using index; Using temporary; Using filesort
1 | PRIMARY | i | eq_ref | PRIMARY,theme_id,image_id | PRIMARY | 4 | func | 1 | Using where
2 | DEPENDENT SUBQUERY | ii | ref | theme_id | theme_id | 4 | themes.t.theme_id | 6 |
Try this query first:
SELECT
t.*, ti1.*
FROM
themes t
JOIN theme_images ti1
ON ti1.theme_id = t.theme_id
JOIN (SELECT theme_id, MIN(image_id) image_id FROM theme_images GROUP BY theme_id) ti2
ON ti1.theme_id = ti2.theme_id AND ti1.image_id = ti2.image_id
ORDER BY
t.theme_date_last_modification DESC
LIMIT 10
One more solution -
SELECT
t.*, ti.*
FROM
themes t
JOIN (SELECT * FROM theme_images ORDER BY image_id) ti
ON ti.theme_id = t.theme_id
GROUP BY
theme_id
ORDER BY
t.theme_date_last_modification DESC
LIMIT
10
Then add your WHERE filter.
One approach is to first LIMIT on the themes table, then JOIN to images:
SELECT
t.theme_id, t.theme_name, t.theme_date_last_modification,
ti.image_id, ti.image_type
FROM
( SELECT theme_id, theme_name, theme_date_last_modification
FROM themes t
WHERE theme_status = 3
AND theme_date_added < now( )
ORDER BY
theme_date_last_modification DESC
LIMIT 10
) AS t
JOIN -- LEFT JOIN if you want themes without an image
theme_images AS ti -- to be shown
ON ti.theme_id = t.theme_id
AND ti.image_id =
( SELECT ii.image_id
FROM theme_images AS ii
WHERE ii.theme_id = t.theme_id
ORDER BY ii.image_id
LIMIT 1
)
ORDER BY
t.theme_date_last_modification DESC ;
With an index on themes (theme_status, theme_date_last_modification, theme_id, theme_date_added) the limit subquery should be efficient.
I suppose you also have a (unique) index on theme_images (theme_id, image_id).
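The two indexes described above could be created roughly like this. The index names are placeholders, and the UNIQUE variant assumes no (theme_id, image_id) pair occurs twice, which holds if image_id is the primary key of theme_images:

```sql
-- Hypothetical index names; column order follows the answer above.
CREATE INDEX idx_themes_status_modified
    ON themes (theme_status, theme_date_last_modification, theme_id, theme_date_added);

CREATE UNIQUE INDEX idx_theme_images_theme_image
    ON theme_images (theme_id, image_id);
```

With theme_status leading and theme_date_last_modification second, the inner ORDER BY ... LIMIT 10 has a chance of being read straight off the index instead of sorted; running EXPLAIN on the derived-table subquery is the way to confirm that.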

MySQL Query Times out - Need to speed it up

I whipped up a query here that does something particular with retrieving results that do not match the join (as suggested by this SO question).
SELECT cf.f_id
FROM comments_following AS cf
INNER JOIN comments AS c ON cf.c_id = c.id
WHERE NOT EXISTS (
SELECT 1 FROM follows WHERE f_id = cf.f_id
)
Any ideas on how to speed this up? There are anywhere from 30k-200k rows it's looking through and appears to be using indexes, but the query times out.
EXPLAIN/DESCRIBE Info:
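id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | c | ALL | PRIMARY | NULL | NULL | NULL | 39119 |
1 | PRIMARY | cf | ref | c_id, c_id_2 | c_id | 8 | ...c.id | 11 | Using where; Using index
2 | DEPENDENT SUBQUERY | following | index | NULL | PRIMARY | 8 | NULL | 35612 | Using where; Using index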
The comments table isn't used explicitly in the query. Is it being used for filtering? If not, try:
SELECT cf.f_id
FROM comments_following cf
WHERE NOT EXISTS (
SELECT 1 FROM follows WHERE follows.f_id = cf.f_id
)
By the way, if this generates a syntax error (because follows.f_id does not exist), then that is the problem. In that case, you would think you have a correlated subquery, but there is not really one.
Or the left outer join version:
SELECT cf.f_id
FROM comments_following cf left outer join
follows f
on f.f_id = cf.f_id
where f.f_id is null
Having an index on follows(f_id) should make both these versions run faster.
LEFT JOIN is sometimes faster than a WHERE NOT EXISTS subquery. Try:
SELECT cf.f_id
FROM comments_following AS cf
INNER JOIN comments AS c ON cf.c_id = c.id
LEFT JOIN follows AS f ON f.f_id = cf.f_id
WHERE f.f_id IS NULL
The answer to this problem was to place a second index on follows.f_id.
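For reference, that index could be added as below (the index name is a placeholder), followed by an EXPLAIN to check that the subquery row no longer scans the whole table:

```sql
-- Hypothetical index name; follows(f_id) is the column named in the answers.
CREATE INDEX idx_follows_f_id ON follows (f_id);

-- Re-check the plan for the anti-join afterwards.
EXPLAIN
SELECT cf.f_id
FROM comments_following AS cf
WHERE NOT EXISTS (
    SELECT 1 FROM follows WHERE follows.f_id = cf.f_id
);
```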

Improving query performance/rewriting query to be faster on MySQL

I have a couple of queries that run very slowly (several minutes) with the data currently in my database, and I'd like to improve their performance. Unfortunately they're kind of complex so the info I'm getting via google isn't enough for me to figure out what indexes to add or if I need to rewrite my queries or what... I'm hoping someone can help. I don't think they should be this slow, if things were set up properly.
The first query is:
SELECT i.name, i.id, COUNT(c.id)
FROM cert_certificates c
JOIN cert_histories h ON h.cert_certificate_id = c.id
LEFT OUTER JOIN inspectors i ON h.inspector_id = i.id
LEFT OUTER JOIN cert_histories h2
ON (h2.cert_certificate_id = c.id AND h.date_changed < h2.date_changed)
WHERE (h.cert_status_ref_id = ? OR h.cert_status_ref_id = ?)
AND h2.id IS NULL
GROUP BY i.id, i.name
ORDER BY i.name
The second query is:
SELECT l.letter, c.number
FROM cert_certificates c
JOIN cert_type_letter_refs l ON c.cert_type_letter_ref_id = l.id
JOIN cert_histories h ON h.cert_certificate_id = c.id
LEFT OUTER JOIN cert_histories h2
ON (h2.cert_certificate_id = c.id AND h.date_changed < h2.date_changed)
WHERE h.cert_status_ref_id = ?
AND h2.id IS NULL
AND h.inspector_id = ?
ORDER BY l.letter, c.number
The cert_certificates table contains nearly 19k records as does the cert_histories table (although in the future this table is expected to grow to approximately 2-3x the size of the cert_certificates table). The other tables are all quite small; less than 10 records each.
The only indexes right now are on id for each table and on cert_certificates.number. I read in a couple of places (e.g. here) to add indices for foreign keys, but in the case of the cert_histories table that'd be nearly all the columns (cert_certificate_id, inspector_id, cert_status_ref_id) which is also not advisable (according to some of the answers on that question e.g. Markus Winand's), so I'm kinda lost.
Any help would be greatly appreciated.
ETA: The results from EXPLAIN on the first query are (sorry for the hideous formatting; I'm using SQLyog which presents it in a nice table but it seems StackOverflow doesn't support tables?):
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | h | ALL | NULL | NULL | NULL | NULL | 19740 | Using where; Using temporary; Using filesort
1 | SIMPLE | i | ref | index_inspectors_on_id | index_inspectors_on_id | 768 | marketing_development.h.inspector_id | 1 |
1 | SIMPLE | c | ref | index_cert_certificates_on_id | index_cert_certificates_on_id | 768 | marketing_development.h.cert_certificate_id | 91 | Using where; Using index
1 | SIMPLE | h2 | ALL | NULL | NULL | NULL | NULL | 19740 | Using where
Second query:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | h | ALL | NULL | NULL | NULL | NULL | 19795 | Using where; Using temporary; Using filesort
1 | SIMPLE | c | ref | index_cert_certificates_on_id | index_cert_certificates_on_id | 768 | marketing_development.h.cert_certificate_id | 91 | Using where
1 | SIMPLE | l | ALL | index_cert_type_letter_refs_on_id | NULL | NULL | NULL | 5 | Using where; Using join buffer
1 | SIMPLE | h2 | ALL | NULL | NULL | NULL | NULL | 19795 | Using where
You should create indices on your join fields:
cert_certificates.cert_type_letter_ref_id
cert_histories.cert_certificate_id
cert_histories.date_changed
cert_histories.inspector_id
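As a sketch, the indexes listed above could be created like this. The index names are invented, and whether single-column or composite indexes work best here is something to verify with EXPLAIN on the real data:

```sql
-- Hypothetical index names; columns are the join fields listed above.
CREATE INDEX idx_cert_certificates_letter_ref
    ON cert_certificates (cert_type_letter_ref_id);

CREATE INDEX idx_cert_histories_certificate
    ON cert_histories (cert_certificate_id);

CREATE INDEX idx_cert_histories_date_changed
    ON cert_histories (date_changed);

CREATE INDEX idx_cert_histories_inspector
    ON cert_histories (inspector_id);
```

Because the self-join on h2 matches on cert_certificate_id and then compares date_changed, a single composite index on cert_histories (cert_certificate_id, date_changed) may serve that join better than the two separate ones; that is an assumption worth testing, not something the answer states.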

Need help with an SQL query involving multiple tables - Join not an option

SELECT i.*, i.id IN (
SELECT id
FROM w
WHERE w.status='active') AS wish
FROM i
INNER JOIN r ON i.id=r.id
WHERE r.member_id=1 && r.status='active'
ORDER BY wish DESC
LIMIT 0,50
That's a query that I'm trying to run. It doesn't scale well, and I'm wondering if someone here can tell me where I could improve things. I don't join w to r and i because I need to show rows from i that are unrepresented in w. I tried a left join, but it didn't perform too well. This is better, but not ideal yet. All three tables are very large. All three are indexed on the fields I'm joining and selecting on.
Any comments, pointers, or constructive criticisms would be greatly appreciated.
EDIT Addition:
I should have put this in my original question. This is the EXPLAIN output as returned by SQLyog.
id|select_type |table|type |possible_keys|key |key_len|ref |rows|Extra|
1 |PRIMARY |r |ref |member_id,id |member_id|3 |const|3120|Using where; Using temporary; Using filesort
1 |PRIMARY |i |eq_ref |id |id |8 |r.id |1 |
2 |DEPENDENT SUBQUERY|w |index_subquery|id,status |id |8 |func |8 |Using where
EDIT le dorfier - more comments ...
I should mention that the key for w is (member_id, id). So each id can exist multiple times in w, and I only want to know if it exists.
WHERE x IN () is identical to an INNER JOIN to a SELECT DISTINCT subquery, and in general, a join to a subquery will typically perform better if the optimizer doesn't turn the IN into a JOIN - which it should:
SELECT i.*
FROM i
INNER JOIN (
SELECT DISTINCT id
FROM w
WHERE w.status = 'active'
) AS wish
ON i.id = wish.id
INNER JOIN r
ON i.id = r.id
WHERE r.member_id = 1 && r.status = 'active'
ORDER BY wish.id DESC
LIMIT 0,50
Which would probably be equivalent to this if you don't need the DISTINCT:
SELECT i.*
FROM i
INNER JOIN w
ON w.status = 'active'
AND i.id = w.id
INNER JOIN r
ON i.id = r.id
AND r.member_id = 1 && r.status = 'active'
ORDER BY i.id DESC
LIMIT 0,50
Please post your schema.
If you are using wish as an existence flag, try:
SELECT i.*, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r
ON i.id = r.id
AND r.member_id = 1 && r.status = 'active'
LEFT JOIN w
ON w.status = 'active'
AND i.id = w.id
ORDER BY wish DESC
LIMIT 0,50
You can use the same technique with a LEFT JOIN to a SELECT DISTINCT subquery. I assume you aren't specifying the w.member_id because you want to know if any members have this? In this case, definitely use the SELECT DISTINCT. You should have an index with id as the first column on w as well in order for that to perform:
SELECT i.*, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r
ON i.id = r.id
AND r.member_id = 1 && r.status = 'active'
LEFT JOIN (
SELECT DISTINCT w.id
FROM w
WHERE w.status = 'active'
) AS w
ON i.id = w.id
ORDER BY wish DESC
LIMIT 0,50
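The index mentioned above, with id as its first column on w, might look like the following. The name is a placeholder, and including status is an assumption made so the w.status = 'active' filter can be answered from the index as well:

```sql
-- Hypothetical index name. The existing key on w is reported as (member_id, id),
-- which cannot be used for lookups by id alone.
CREATE INDEX idx_w_id_status ON w (id, status);
```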
Please post the EXPLAIN listing. And explain what the tables and columns mean.
wish appears to be a boolean - and you're ORDERing by it?
EDIT: Well, it looks like it's doing what it's being instructed to do. Cade seems to be thinking expansively on what this all could possibly mean (he probably deserves a vote just for effort.) But I'd really rather you tell us.
Wild guessing just confuses everyone (including you, I'm sure.)
OK, based on new info, here's my (slightly less wild) guess.
SELECT i.*,
CASE WHEN EXISTS (SELECT 1 FROM w WHERE w.id = i.id AND w.status = 'active') THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r ON i.id = r.id AND r.status = 'active'
WHERE r.member_id = 1
Do you want a row for each match in w? Or just to know, for each i.id, whether there is an active w record? I assumed the latter, so you don't need to ORDER BY - it's for only one ID anyway. And since you're only returning columns from i, if there are multiple rows in r, you'll just get duplicate rows.
How about posting what you expect to get for a proper answer?
...
ORDER BY wish DESC
LIMIT 0,50
This appears to be the big expense. You're sorting by a computed column, "wish", which cannot benefit from an index. This forces a filesort (as indicated by the EXPLAIN output): MySQL has to materialize the whole result set and sort it, and once that no longer fits in memory it spills to disk, which is very slow.
When you post questions like this, you should not expect people to guess how you have defined your tables and indexes. It's very simple to get the full definitions:
mysql> SHOW CREATE TABLE w;
mysql> SHOW CREATE TABLE i;
mysql> SHOW CREATE TABLE r;
Then paste the output into your question.
It's not clear what your purpose is for the "wish" column. The "IN" predicate is a boolean expression, so it always results in 0 or 1. But I'm guessing you're trying to use "IN" in hopes of accomplishing a join without doing a join. It would help if you describe what you're trying to accomplish.
Try this:
SELECT i.*
FROM i
INNER JOIN r ON i.id=r.id
LEFT OUTER JOIN w ON i.id=w.id AND w.status='active'
WHERE r.member_id=1 AND r.status='active'
AND w.id IS NULL
LIMIT 0,50;
It uses an additional outer join, but it doesn't incur a filesort according to my test with EXPLAIN.
Have you tried this?
SELECT i.*, w.id as wish FROM i
LEFT OUTER JOIN w ON i.id = w.id
AND w.status = 'active'
WHERE i.id in (SELECT id FROM r WHERE r.member_id = 1 AND r.status = 'active')
ORDER BY wish DESC
LIMIT 0,50