mysql query takes 13min to run - mysql

I have three tables, and i am joining them using the code below
orders has 30k rows
orders_details has over 100k rows
services has 4 rows
when I execute the script below in Fish Database.net, it takes over 13mins to run before displaying the result!!!
select
`o`.`created` AS `created`,
sum(`o`.`total`) AS `total`,
sum(`o`.`paid`) AS `paid`,
`od`.`service_id` AS `service_id`,
`s`.`name` AS `grp`
from ((`orders` `o`
left join `orders_details` `od` on
(
(`od`.`order_id` = `o`.`id`)
))
left join `services` `s` on
(
(`s`.`id` = `od`.`service_id`)
))
group by
`od`.`service_id`,
`o`.`created`,
`s`.`name`
order by
`o`.`created`
all column names with id are integer primary key columns
created is a datetime column
total, paid are decimal(10,2) columns
name is a varchar (60) columns
Here is the explain result
EDITS
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE o ALL NULL NULL NULL NULL 23558 Using temporary; Using filesort
1 SIMPLE od ALL NULL NULL NULL NULL 40304
1 SIMPLE s eq_ref PRIMARY PRIMARY 4 mydb.od.service_id 1
I can i improve/find the bottleneck ??

The use of (unuseful) nested () can degrading performance then
try remove unseful ()
Be sure you have correct index on od.order_id, o.id , s.id , od.service_id
select
o.created AS created,
sum(o.total) AS total,
sum(o.paid) AS paid,
od.service_id AS service_id,
s.name AS grp
from orders o
left join orders_details od on od.order_id = o.id
left join services s on s.id = od.service_id
group by
od.service_id,
o.created,
s.name
order by
o.created
for a better reading i have removed also backtics .. you don't have resersved words od multi words column name so should not be needed)

There are steps to find bottlenecks in sql script.
First of all,examine the execution plans and query statistics.
Execution plan will tell you where is the bottleneck.
Depending on the problem, create indexes and then again examine execution plan.
Apply pagination when needed. Sending data in batches also can make difference.

Related

Find employees latest activity is slow when adding ORDER BY

I am working on a legacy system in Laravel and I am trying to pull the latest action of some specific types of actions an employee has done.
Performance is good when I don't add ORDER BY. When adding it the query will go from something like 130 ms to 18 seconds. There are about 1.5 million rows in the actions table.
How do I fix the performance problem?
I have tried to isolate the problem by cutting out all the other parts of the query so it is more readable for you:
SELECT
employees.id,
(
SELECT DATE_FORMAT(actions.date, '%Y-%m-%d')
FROM pivot
JOIN actions
ON pivot.actions_id = actions.id
WHERE employees.id = pivot.employee_id
AND (actions.type = 'meeting'
OR (actions.type = 'phone_call'
AND JSON_VALID(actions.data) = 1
AND actions.data->>'$.update_status' = 1))
LIMIT 1
) AS latest_action
FROM employees
ORDER BY latest_action DESC
I tried using LEFT JOIN and MAX() instead but it didn't seem to solve my problem.
I just added a subquery because it was the original query is already very complex. But if you have an alternative suggestion I am all ears.
UPDATE
Result of EXPLAIN:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY employees NULL ALL NULL NULL NULL NULL 15217 10 Using where
2 DEPENDENT SUBQUERY pivot NULL ref actions_type_index,pivot_type_index pivot_type_index 4 dev.employees.id 104 11.11 Using index condition
2 DEPENDENT SUBQUERY actions NULL eq_ref PRIMARY,Logs PRIMARY 4 dev.pivot.actions_id 1 6.68 Using where
UPDATE 2
Here is the indexes. The index employee_type I don't think is important for my specific query, but maybe it should be re-worked?
# pivot table
KEY `actions_type_index` (`actions_id`,`employee_type`),
KEY `pivot_type_index` (`employee_id`,`employee_type`)
# actions table
KEY `Logs` (`type`,`id`,`is_log`)
# I tried to add `date` index to `actions` table but the problem remains.
KEY `date_index` (`date`)
First of all your query is very non-optimal.
I would rewrite it this way:
SELECT
e.id,
DATE_FORMAT(vMAX(a.date), '%Y-%m-%d') AS latest_action
FROM employees e
LEFT JOIN pivot p ON p.employee_id = e.id
LEFT JOIN actions a ON p.actions_id = a.id AND (a.type = 'meeting'
OR (a.type = 'phone_call'
AND JSON_VALID(a.data) = 1
AND a.data->>'$.update_status' = 1))
GROUP BY e.id
ORDER BY latest_action DESC
Obviously there must be indexes on p.employee_id, p.actions_id, a.date. Also would be good on a.type.
Also it would be good to replace a.data->>'$.update_status' with some simple field with an index on it.

How to make JOINS faster?

I had this query to start out with:
SELECT DISTINCT spentits.*
FROM `spentits`
WHERE (spentits.user_id IN
(SELECT following_id
FROM `follows`
WHERE `follows`.`follower_id` = '44'
AND `follows`.`accepted` = 1)
OR spentits.user_id = '44')
ORDER BY id DESC
LIMIT 15 OFFSET 0
This query takes 10ms to execute.
But once I add a simple join in:
SELECT DISTINCT spentits.*
FROM `spentits`
LEFT JOIN wishlist_items ON wishlist_items.user_id = 44 AND wishlist_items.spentit_id = spentits.id
WHERE (spentits.user_id IN
(SELECT following_id
FROM `follows`
WHERE `follows`.`follower_id` = '44'
AND `follows`.`accepted` = 1)
OR spentits.user_id = '44')
ORDER BY id DESC
LIMIT 15 OFFSET 0
This execute time increased by 11x. Now it takes around 120ms to execute. What's interesting is that if I remove either the LEFT JOIN clause or the ORDER BY id DESC , the time goes back to 10ms.
I am new to databases so I don't understand this. Why is it that removing either one of these clauses speeds it up 11x ? And how can I keep it as is but make it faster?
I have indexes on spentits.user_id, follows.follower_id, follows.accepted, and on primary ids of each table.
EXPLAIN:
1 PRIMARY spentits index index_spentits_on_user_id PRIMARY 4 NULL 15 Using where; Using temporary
1 PRIMARY wishlist_items ref index_wishlist_items_on_user_id,index_wishlist_items_on_spentit_id index_wishlist_items_on_spentit_id 5 spentit.spentits.id 1 Using where; Distinct
2 SUBQUERY follows index_merge index_follows_on_follower_id,index_follows_on_following_id,index_follows_on_accepted
index_follows_on_follower_id,index_follows_on_accepted 5,2 NULL 566 Using intersect(index_follows_on_follower_id,index_follows_on_accepted); Using where
You should have index also on:
wishlist_items.spentit_id
Because you are joining over that column
The LEFT JOIN is easy to explain: A cross product of all entries against all other entries is made. The conditions of the join (in your case: Take all entries on the left and find fitting ones on the right) are applied afterwards. So if your spentits table is large it will take the server some time. Would suggest you get rid of your subquery and make three joins. Start with the smallest table to avoid big amounts of data.
In the 2nd example the subselect runs for every spentits.user_id.
If you write is like this it will be faster because the subselect runs once:
SELECT DISTINCT spentits.*
FROM `spentits`, (SELECT following_id
FROM `follows`
WHERE `follows`.`follower_id` = '44'
AND `follows`.`accepted` = 1)
OR spentits.user_id = '44') as `follow`
LEFT JOIN wishlist_items ON wishlist_items.user_id = 44 AND wishlist_items.spentit_id = spentits.id
WHERE (spentits.user_id IN
(follow)
ORDER BY id DESC
LIMIT 15 OFFSET 0
As you can see the subselect moved to the FROM-part of the query and creates a imaginary tabel (or view).
This imaginary tabel is a inline-view.
JOINs and inline-views are faster every time than a subselect in the WHERE-part.

MySQL Query Times out - Need to speed it up

I whipped up a query here that does something particular with retrieving results that do not match the join (as suggested by this SO question).
SELECT cf.f_id
FROM comments_following AS cf
INNER JOIN comments AS c ON cf.c_id = c.id
WHERE NOT EXISTS (
SELECT 1 FROM follows WHERE f_id = cf.f_id
)
Any ideas on how to speed this up? There are anywhere from 30k-200k rows it's looking through and appears to be using indexes, but the query times out.
EXPLAIN/DESCRIBE Info:
1 PRIMARY c ALL PRIMARY NULL NULL NULL 39119
1 PRIMARY cf ref c_id, c_id_2 c_id 8 ...c.id 11 Using where; Using index
2 DEPENDENT SUBQUERY following index NULL PRIMARY 8 NULL 35612 Using where; Using index
The comments table isn't used explicitly in the query. Is it being used for filtering? If not, try:
SELECT cf.f_id
FROM comments_following cf
WHERE NOT EXISTS (
SELECT 1 FROM follows WHERE follows.f_id = cf.f_id
)
By the way, if this generates a syntax error (because follows.f_id does not exist), then that is the problem. In that case, you would think you have a correlated subquery, but there is not really one.
Or the left outer join version:
SELECT cf.f_id
FROM comments_following cf left outer join
follows f
on f.f_id = cf.f_id
where f.f_id is null
Having an index on follows(f_id) should make both these versions run faster.
LEFT JOIN sometimes is faster then WHERE NOT EXISTS subquerys, try:
SELECT cf.f_id
FROM comments_following AS cf
INNER JOIN comments AS c ON cf.c_id = c.id
LEFT JOIN follows AS f ON f.f_id = cf.f_id
WHERE f.f_id IS NULL
The answer to this problem was to place a second index on follows.f_id.

SQL optimization - slow query

Given SQL takes 1.2s:
SELECT DISTINCT contracts.id, jt0.id, jt1.id, jt2.id, jt3.id FROM contracts
LEFT JOIN accounts jt0 ON jt0.id = contracts.account_id AND jt0.deleted=0
LEFT JOIN manufacturers jt1 ON jt1.id = contracts.manufacturer_id AND jt1.deleted=0
LEFT JOIN products jt2 ON jt2.id = contracts.product_id AND jt2.deleted=0
LEFT JOIN users jt3 ON jt3.id = contracts.assigned_user_id AND jt3.deleted=0
WHERE contracts.deleted=0
ORDER BY contracts.application_number ASC
LIMIT 0,21
here is what explain extended returns:
id select_type table type possible_keys key key_len ref rows
1 SIMPLE contracts ref idx_contracts_deleted idx_contracts_deleted 2 const 18968 100.00 Using where; Using temporary; Using filesort
1 SIMPLE jt0 eq_ref PRIMARY,idx_accnt_id_del,idx_accnt_assigned_del PRIMARY 108 xxx.contracts.account_id 1 100.00
1 SIMPLE jt1 eq_ref PRIMARY,idx_manufacturers_id_deleted,idx_manufacturers_deleted PRIMARY 108 xxx.contracts.manufacturer_id 1 100.00
1 SIMPLE jt2 eq_ref PRIMARY,idx_products_id_deleted,idx_products_deleted PRIMARY 108 xxx.contracts.product_id 1 100.00
1 SIMPLE jt3 eq_ref PRIMARY,idx_users_id_del,idx_users_id_deleted,idx_users_deleted PRIMARY 108 xxx.contracts.assigned_user_id 1 100.00
I need the distinct, I need all the joins to be left, I need order by and i need limit.
Can i optimize it somehow?
These are the only suggestions i've got
I hope the id's are defined as primary keys and the foreign keys with the relation between the tables.
Maybe the application_number can be indexed (then the sort will be faster)
Maybe, if you are using MyISAM, the sql could be faster if you lock the tables before selecting (don't forget to unlock afterwards)
Try changing the indexes on the subsidiary tables to include the deleted column:
accounts(id, deleted)
manufacturers(id, deleted)
products(id, deleted)
users(id, deleted)
By including all the columns in the index, MySQL has a better opportunity to take advantage of the index.
Another suggestion is to figure out what is causing the duplication in values and to use subqueries to eliminate the duplicates, rather than distinct.
For instance, with the above indexes:
from contracts c left join
(select id
from accounts
where deleted = 0
group by id
) a
on c.account_id = a.id
. . .
The subquery should only use the index, which might speed things up.
First create necessary index on the following columns.
contracts.application_number, manufacturers.deleted, products.deleted, users.deleted
SELECT DISTINCT contracts.id, jt0.id, jt1.id, jt2.id, jt3.id
FROM contracts
LEFT JOIN accounts jt0
ON contracts.deleted=0 AND jt0.id = contracts.account_id
LEFT JOIN manufacturers jt1
ON jt1.deleted=0 AND jt1.id = contracts.manufacturer_id
LEFT JOIN products jt2
ON jt2.deleted=0 AND jt2.id = contracts.product_id
LEFT JOIN users jt3
ON jt3.deleted=0 AND jt3.id = contracts.assigned_user_id
ORDER BY contracts.application_number ASC
LIMIT 0,21
As you have mentioned you are have already index on contracts.deleted
FROM
(SELECT * FROM contracts WHERE contracts.deleted = 0 USE INDEX(<deletedIndexName>))
LEFT JOIN
accounts jt0
ON
jt0.id = contracts.account_id
LEFT JOIN
...
Try a little referential integrity? I bet the query is much faster with inner joins. It should be, because the query optimizer has more to work with. You're paying the price at select time for not taking more care with create.
I would also remove deleted rows to their own tables, and strike the deleted columns. Your queries will be simpler and likely run faster.
try those indexes
CREATE INDEX PAW_IDX1921682121 ON products(deleted,id);
CREATE INDEX PAW_IDX1196677611 ON manufacturers(deleted,id);
CREATE INDEX PAW_IDX1360881332 ON users(deleted,id);
CREATE INDEX PAW_IDX1028958902 ON accounts(deleted,id);
CREATE INDEX PAW_IDX1564931998 ON contracts(deleted,application_number);

Left JOIN faster or Inner Join faster?

So... which one is faster (NULl value is not an issue), and are indexed.
SELECT * FROM A
JOIN B b ON b.id = a.id
JOIN C c ON c.id = b.id
WHERE A.id = '12345'
Using Left Joins:
SELECT * FROM A
LEFT JOIN B ON B.id=A.bid
LEFT JOIN C ON C.id=B.cid
WHERE A.id = '12345'
Here is the actual query
Here it is.. both return the same result
Query (0.2693sec) :
EXPLAIN EXTENDED SELECT *
FROM friend_events, zcms_users, user_events,
EVENTS WHERE friend_events.userid = '13006'
AND friend_events.state =0
AND UNIX_TIMESTAMP( friend_events.t ) >=1258923485
AND friend_events.xid = user_events.id
AND user_events.eid = events.eid
AND events.active =1
AND zcms_users.id = user_events.userid
EXPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE zcms_users ALL PRIMARY NULL NULL NULL 43082
1 SIMPLE user_events ref PRIMARY,eid,userid userid 4 zcms_users.id 1
1 SIMPLE events eq_ref PRIMARY,active PRIMARY4 user_events.eid 1 Using where
1 SIMPLE friend_events eq_ref PRIMARY PRIMARY 8 user_events.id,const 1 Using where
LEFTJOIN QUERY: (0.0393 sec)
EXPLAIN EXTENDED SELECT *
FROM `friend_events`
LEFT JOIN `user_events` ON user_events.id = friend_events.xid
LEFT JOIN `events` ON user_events.eid = events.eid
LEFT JOIN `zcms_users` ON user_events.userid = zcms_users.id
WHERE (
events.active =1
)
AND (
friend_events.userid = '13006'
)
AND (
friend_events.state =0
)
AND (
UNIX_TIMESTAMP( friend_events.t ) >=1258923485
)
EXPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE friend_events ALL PRIMARY NULL NULL NULL 53113 Using where
1 SIMPLE user_events eq_ref PRIMARY,eid PRIMARY 4 friend_events.xid 1 Using where
1 SIMPLE zcms_users eq_ref PRIMARY PRIMARY 4 user_events.userid 1
1 SIMPLE events eq_ref PRIMARY,active PRIMARY 4 user_events.eid 1 Using where
It depends; run them both to find out; then run an 'explain select' for an explanation.
The actual performance difference may range from "virtually non-existent" to "pretty significant" depending on how many rows in A with id='12345' have no matching records in B and C.
Update (based on posted query plans)
When you use INNER JOIN it doesn't matter (results-wise, not performance-wise) which table to start with, so optimizer tries to pick the one it thinks would perform best. It seems you have indexes on all appropriate PK / FK columns and you either don't have an index on friend_events.userid or there are too many records with userid = '13006' and it's not being used; either way optimizer picks the table with less rows as "base" - in this case it's zcms_users.
When you use LEFT JOIN it does matter (results-wise) which table to start with; thus friend_events is picked. Now why it takes less time that way I'm not quite sure; I'm guessing friend_events.userid condition helps. If you were to add an index (is it really varchar, btw? not numeric?) on that, your INNER JOIN might behave differently (and become faster) as well.
The INNER JOIN has to do an extra check to remove any records from A that don't have matching records in B and C. Depending on the number of records initially returned from A it COULD have an impact.
LEFT JOIN shows all data from A and only shows data from B/C only if the condition is true. As for INNER JOIN, it has to do some extra checking on both tables. So, I guess that explains why LEFT JOIN is faster.
Use EXPLAIN to see the query plan. It's probably the same plan for both cases, so I doubt it makes much difference, assuming there are no rows that don't match. But these are two different queries so it really doesn't make sense to compare them - you should just use the correct one.
Why not use the "INNER JOIN" keyword instead of "LEFT JOIN"?