This select query takes about 20 seconds to complete.
select Count(*)
from products as bad_rows
inner join (
select pid, MAX(last_updated_date) as maxdate
from products
group by pid
having count(*) > 1
) as good_rows on good_rows.pid= bad_rows.pid
and good_rows.maxdate <> bad_rows.last_updated_date
where bad_rows.available = 0
The delete, on the other hand, is still running after 30 minutes!
delete bad_rows
from products as bad_rows
inner join (
select pid, MAX(last_updated_date) as maxdate
from products
group by pid
having count(*) > 1
) as good_rows on good_rows.pid= bad_rows.pid
and good_rows.maxdate <> bad_rows.last_updated_date
where bad_rows.available = 0
Why?
Table Schema is as follows:
Explain for the select is as follows:
+----+-------------+------------+------+---------------+------+---------+------+-------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+-------+--------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 6253 | |
| 1 | PRIMARY | bad_rows | ALL | NULL | NULL | NULL | NULL | 34603 | Using where; Using join buffer |
| 2 | DERIVED | products | ALL | NULL | NULL | NULL | NULL | 34603 | Using temporary; Using filesort|
+----+-------------+------------+------+---------------+------+---------+------+-------+--------------------------------+
OK, so I googled how to read the EXPLAIN output, which hinted that my query could be slow because there is no index on pid. It didn't actually say that, but I had a hunch from reading about the EXPLAIN results: every table shows type ALL and key NULL, i.e. full table scans.
So I added an index on pid and voilà: the delete finished in 1 minute!
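For illustration, the fix described above could look like the sketch below. The post only says an index on pid was added; the index name and the second column (last_updated_date, which would let MAX() be read from the index) are assumptions, not what the poster necessarily ran.

```sql
-- Assumed index name. A plain index on (pid) is what the post describes;
-- adding last_updated_date is an optional, untested extension.
ALTER TABLE products
  ADD INDEX idx_products_pid_updated (pid, last_updated_date);

-- Re-run EXPLAIN to see whether the GROUP BY subquery and the join on pid
-- now pick up the index instead of showing type = ALL.
EXPLAIN
SELECT COUNT(*)
FROM products AS bad_rows
INNER JOIN (
    SELECT pid, MAX(last_updated_date) AS maxdate
    FROM products
    GROUP BY pid
    HAVING COUNT(*) > 1
) AS good_rows
   ON good_rows.pid = bad_rows.pid
  AND good_rows.maxdate <> bad_rows.last_updated_date
WHERE bad_rows.available = 0;
```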
I have a MySQL query that joins 12 tables. When I run EXPLAIN on the query, it shows 394699 rows for one table and 185368 rows for another; all other tables show 1-3 rows. The query returns only 472 rows, yet it takes more than 1 minute.
Is there any way to check how many rows were examined to produce this result, so that I can find which table costs the most time?
The query structure is given below. The table structures are too large to include here.
SELECT h.nid,h.attached_nid,h.created, s.field_species_value as species, g.field_gender_value as gender, u.field_unique_id_value as unqid, n.title, dob.field_adult_healthy_weight_value as birth_date, dcolor.field_dog_primary_color_value as dogcolor, ccolor.field_primary_color_value as catcolor, sdcolor.field_dog_secondary_color_value as sdogcolor, sccolor.field_secondary_color_value as scatcolor, dpattern.field_dog_pattern_value as dogpattern, cpattern.field_cat_pattern_value as catpattern
FROM table1 h
JOIN table2 n ON n.nid = h.nid
JOIN table3 s ON n.nid = s.entity_id
JOIN table4 u ON n.nid = u.entity_id
LEFT JOIN table5 g ON n.nid = g.entity_id
LEFT JOIN table6 dob ON n.nid = dob.entity_id
LEFT JOIN table7 AS dcolor ON n.nid = dcolor.entity_id
LEFT JOIN table8 AS ccolor ON n.nid = ccolor.entity_id
LEFT JOIN table9 AS sdcolor ON n.nid = sdcolor.entity_id
LEFT JOIN table10 AS sccolor ON n.nid = sccolor.entity_id
LEFT JOIN table11 AS dpattern ON n.nid = dpattern.entity_id
LEFT JOIN table12 AS cpattern ON n.nid = cpattern.entity_id
WHERE h.title = '4208'
AND ((h.created BETWEEN 1483257600 AND 1485935999))
AND h.uid!=1
AND h.uid IN(
SELECT etid
FROM `table`
WHERE gid=464
AND entity_type='user')
AND h.attached_nid>0
ORDER BY CAST(h.created as UNSIGNED) DESC;
Below is the EXPLAIN result I get:
+------+--------------+---------------+--------+----------------------+---------------------+---------+----------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+---------------+--------+----------------------+---------------------+---------+----------------------+--------+----------------------------------------------+
| 1 | PRIMARY | s | index | entity_id | field_species_value | 772 | NULL | 394699 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | u | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | n | eq_ref | PRIMARY | PRIMARY | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | g | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | dob | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | dcolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | ccolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | sdcolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | sccolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | dpattern | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | cpattern | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | h | ref | attached_nid,nid,uid | nid | 5 | pantheon.s.entity_id | 3 | Using index condition; Using where |
| 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 4 | func | 1 | Using where |
| 2 | MATERIALIZED | og_membership | ref | entity,gid | gid | 4 | const | 185368 | Using where |
+------+--------------+---------------+--------+----------------------+---------------------+---------+----------------------+--------+----------------------------------------------+
You can find the ROWS_EXAMINED by using the Performance Schema.
Here is a link to the performance schema quick start guide.
https://dev.mysql.com/doc/refman/5.5/en/performance-schema-quick-start.html
This is the query I run in PHP applications to find out which queries I need to optimize. You should be able to adapt it pretty easily.
The query returns the stats for the statement that was run immediately before it on the same connection. So in my apps, I run it after every query, store the results, and at the end of the PHP script I output the stats for every query that ran during the request.
SELECT `EVENT_ID`, TRUNCATE(`TIMER_WAIT`/1000000000000,6) as Duration,
`SQL_TEXT`, `DIGEST_TEXT`, `NO_INDEX_USED`, `NO_GOOD_INDEX_USED`, `ROWS_AFFECTED`, `ROWS_SENT`, `ROWS_EXAMINED`
FROM `performance_schema`.`events_statements_history`
WHERE
`CURRENT_SCHEMA` = '{$database}' AND `EVENT_NAME` LIKE 'statement/sql/%'
AND `THREAD_ID` = (SELECT `THREAD_ID` FROM `performance_schema`.`threads` WHERE `performance_schema`.`threads`.`PROCESSLIST_ID` = CONNECTION_ID())
ORDER BY `EVENT_ID` DESC LIMIT 1;
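If the query above returns no rows, one possible cause is that statement history is not being collected. The following is a minimal sketch for checking and enabling it, assuming a server version that has the events_statements_history table (MySQL 5.6 or later), that the Performance Schema itself was enabled at server startup, and that the account has UPDATE privileges on the setup tables.

```sql
-- Check which statement consumers are currently enabled.
SELECT NAME, ENABLED
FROM performance_schema.setup_consumers
WHERE NAME LIKE 'events_statements%';

-- Enable the consumer that feeds events_statements_history.
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME = 'events_statements_history';

-- Make sure statement instruments are enabled and timed,
-- otherwise TIMER_WAIT is not populated.
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'statement/%';
```

These settings are runtime changes and do not survive a server restart unless they are also set in the server configuration.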
To decrease the number of rows accessed from og_membership, try adding an index containing the gid, entity_type, and etid fields. Including gid and entity_type should make the lookup more performant and including etid will make the index a covering index.
After adding the index, run EXPLAIN again to look at the results. Based on the new explain plan, either keep the index, remove the index, and/or add an additional index. Keep doing this until you get results you are satisfied with.
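As a sketch of that loop, assuming the subquery's table is og_membership (the name shown in the EXPLAIN output; the posted query calls it `table`) and using a made-up index name:

```sql
-- gid and entity_type come first because they are the equality filters;
-- etid is appended so the subquery can be answered from the index alone.
ALTER TABLE og_membership
  ADD INDEX idx_gid_type_etid (gid, entity_type, etid);

-- Check whether the new index is chosen and whether the row estimate
-- drops from the 185368 reported earlier.
EXPLAIN
SELECT etid
FROM og_membership
WHERE gid = 464
  AND entity_type = 'user';

-- If the plan does not improve, the index can be dropped again.
-- ALTER TABLE og_membership DROP INDEX idx_gid_type_etid;
```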
You will certainly want to try to eliminate any mentions of Using temporary or Using filesort. Using temporary means MySQL creates a temporary table to hold an intermediate result, here probably because of the sheer size of that intermediate result. Using filesort means the ordering isn't satisfied by an index, so the matching rows have to be sorted in an extra pass.
A detailed explanation of EXPLAIN can be found at https://dev.mysql.com/doc/refman/5.7/en/explain-output.html.
Key-Value (EAV) schema sucks.
Indexes:
table1: INDEX(title, created)
table1: INDEX(uid, title, created)
table: INDEX(gid, entity_type, etid)
table* -- Is `entity_id` already an index? Can it be the PRIMARY KEY?
Does nid need to be NULL instead of NOT NULL?
If those don't do enough, try:
And turn the IN ( SELECT ... ) into a JOIN ( SELECT ... ) USING(hid)
If you still need help, please provide SHOW CREATE TABLE and EXPLAIN SELECT ...
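The IN ( SELECT ... ) to JOIN ( SELECT ... ) rewrite mentioned above could look roughly like the sketch below. The join column is an assumption: the original IN compares h.uid with etid, so the sketch joins on those two columns rather than on the hid named in the suggestion. The eleven other joins and their select-list columns are left out to keep the example short.

```sql
SELECT h.nid, h.attached_nid, h.created
FROM table1 AS h
JOIN (
    -- DISTINCT keeps the semantics of IN: each h row matches at most once.
    SELECT DISTINCT etid
    FROM `table`
    WHERE gid = 464
      AND entity_type = 'user'
) AS m ON m.etid = h.uid
WHERE h.title = '4208'
  AND h.created BETWEEN 1483257600 AND 1485935999
  AND h.uid <> 1
  AND h.attached_nid > 0
ORDER BY CAST(h.created AS UNSIGNED) DESC;
```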
The purpose of this query is to list the distinct users someone has connections to (i.e. users that are followed by or are following the user with id 256), excluding users who are either blocking or blocked by the current user making the request (the user with id 2).
The relationships table is pretty simple. The status column can be one of two values: "following" or "blocked":
mysql> describe relationships;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| follower_id | int(11) | YES | MUL | NULL | |
| followee_id | int(11) | YES | MUL | NULL | |
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| status | varchar(191) | YES | MUL | NULL | |
+-------------+--------------+------+-----+---------+----------------+
This query currently takes about 58 seconds to complete! User 256 only has 1500 connections. To put this in context, there are roughly 10,000 users rows and 5,500 relationships rows.
SELECT DISTINCT `users`.*,
-- "followed" is just a flag indicating if user #2 is currently following a given user
(
SELECT COUNT(*) FROM `relationships`
WHERE `relationships`.`followee_id` = `users`.`id`
AND `relationships`.`follower_id` = 2
) AS 'followed'
FROM `users`
INNER JOIN `relationships`
ON (
(`users`.`id` = `relationships`.`follower_id`
AND `relationships`.`followee_id` = 256
)
OR (`users`.`id` = `relationships`.`followee_id`
AND `relationships`.`follower_id` = 256
)
)
WHERE `relationships`.`status` = 'following'
AND (
-- Ensure we don't return users who are blocked by user #2
`users`.`id` NOT IN (
SELECT `relationships`.`followee_id`
FROM `relationships`
WHERE `relationships`.`follower_id` = 2
AND `relationships`.`status` = 'blocked'
)
)
AND (
-- Ensure we don't return users who are blocking user #2
`users`.`id` NOT IN (
SELECT `relationships`.`follower_id`
FROM `relationships`
WHERE `relationships`.`followee_id` = 2
AND `relationships`.`status` = 'blocked'
)
)
ORDER BY `users`.`id` ASC
LIMIT 10
Here are the current indexes on relationships:
mysql> show index from relationships;
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| relationships | 0 | PRIMARY | 1 | id | A | 3002 | NULL | NULL | | BTREE | | |
| relationships | 0 | index_relationships_on_status_and_follower_id_and_followee_id | 1 | status | A | 2 | NULL | NULL | YES | BTREE | | |
| relationships | 0 | index_relationships_on_status_and_follower_id_and_followee_id | 2 | follower_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 0 | index_relationships_on_status_and_follower_id_and_followee_id | 3 | followee_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_followee_id | 1 | followee_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_follower_id | 1 | follower_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_status_and_followee_id_and_follower_id | 1 | status | A | 2 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_status_and_followee_id_and_follower_id | 2 | followee_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_status_and_followee_id_and_follower_id | 3 | follower_id | A | 3002 | NULL | NULL | YES | BTREE | | |
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
explain results:
mysql> EXPLAIN SELECT DISTINCT `users`.*, (SELECT COUNT(*) FROM `relationships` WHERE `relationships`.`followee_id` = `users`.`id` AND `relationships`.`follower_id` = 2) AS 'followed' FROM `users` INNER JOIN `relationships` ON(`users`.`id` = `relationships`.`follower_id` AND `relationships`.`followee_id` = 256) OR (`users`.`id` = `relationships`.`followee_id` AND `relationships`.`follower_id` = 256) WHERE `relationships`.`status` = 'following' AND (`users`.`id` NOT IN (SELECT `relationships`.`followee_id` FROM `relationships` WHERE `relationships`.`follower_id` = 2 AND `relationships`.`status` = 'blocked')) AND (`users`.`id` NOT IN (SELECT `relationships`.`follower_id` FROM `relationships` WHERE `relationships`.`followee_id` = 2 AND `relationships`.`status` = 'blocked')) ORDER BY `users`.`id` ASC LIMIT 10;
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
| 1 | PRIMARY | relationships | index_merge | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_followee_id,index_relationships_on_follower_id | 5,5 | NULL | 2 | Using union(index_relationships_on_followee_id,index_relationships_on_follower_id); Using where; Using temporary; Using filesort |
| 1 | PRIMARY | users | ALL | PRIMARY | NULL | NULL | NULL | 1534 | Range checked for each record (index map: 0x1) |
| 4 | SUBQUERY | relationships | ref | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_status_and_follower_id_and_followee_id | 767 | const | 1 | Using where; Using index |
| 3 | SUBQUERY | relationships | ref | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_status_and_follower_id_and_followee_id | 772 | const,const | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | relationships | ref | index_relationships_on_followee_id,index_relationships_on_follower_id | index_relationships_on_followee_id | 5 | development.users.id | 1 | Using where |
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.01 sec)
It's hard to give you a concrete answer without testing this, but I think this part of the query is the problem:
SELECT DISTINCT `users`.*, (
SELECT COUNT(*) FROM `relationships`
WHERE `relationships`.`followee_id` = `users`.`id`
AND `relationships`.`follower_id` = 2
) AS 'followed'
You're also using ORDER BY. Remove DISTINCT and ORDER BY and see if things speed up. I know it changes the query, but I suspect that DISTINCT is basically building a bunch of temporary tables and throwing them away for every row that it needs to check. Have a look here:
http://dev.mysql.com/doc/refman/5.7/en/distinct-optimization.html
Counts can be slow. Make sure that the count is working from the fastest column. See this:
https://www.percona.com/blog/2007/04/10/count-vs-countcol/
A good way to think about SQL is in SETS. Luckily MySQL supports sub queries.
https://dev.mysql.com/doc/refman/5.7/en/from-clause-subqueries.html
Some pseudo SQL follows...
select user_id
from relationships as follower, relationships as followee
where ...
In the above we have two sets that we can then manipulate. Using sub queries this gets really interesting
select user_id
from (select user_id as f1 from relationships where ...) as follower,
(select user_id as f2 from relationships where ...) as followee
where ...
I've always found something like the above an easy way to think about self referencing tables.
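To make the set-based idea concrete for the relationships table from the question, here is a hedged sketch: the two sets (followers of 256, and users followed by 256) are built separately and combined with UNION, which also removes duplicates, so no DISTINCT is needed. The blocked-user filters and the followed flag are deliberately left out, and the aliases connections and user_id are made up for illustration.

```sql
SELECT u.*
FROM users AS u
JOIN (
    -- Set 1: users who follow user 256.
    SELECT follower_id AS user_id
    FROM relationships
    WHERE followee_id = 256
      AND status = 'following'
    UNION
    -- Set 2: users whom user 256 follows.
    SELECT followee_id AS user_id
    FROM relationships
    WHERE follower_id = 256
      AND status = 'following'
) AS connections ON connections.user_id = u.id
ORDER BY u.id ASC
LIMIT 10;
```

Each branch filters on status plus one id column, which matches the leading columns of the two existing three-column indexes, so each branch may be answerable from an index; that is untested here.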
It is hard to tell exactly how you should optimize your query and structure. First, some general hints:
use integers/bits/enums instead of varchars
use NOT NULL columns as much as possible
it usually makes sense to use unsigned columns (at least to get a bigger range)
try different approaches to building the query (see below)
DISTINCT is quite an expensive operation
subqueries are sometimes much faster than joins
Anyway, I've prepared a sample fiddle with the proposed optimizations; I've changed the names of the columns to reduce confusion.
The final query could look like this:
select *
from users a
where
(
id in (select follower_id as id from relationships USE INDEX (user_id) where user_id = 256 and status = 'following')
or id in (select user_id from relationships USE INDEX (follower_id) where follower_id = 256 and status = 'following')
)
and id not in (select follower_id from relationships USE INDEX (user_id) where user_id = 2 and status = 'blocked')
and id not in (select user_id from relationships USE INDEX (follower_id) where follower_id = 2 and status = 'blocked')
though, it could be rewritten as follows:
select *
from users a
where
id in (select follower_id as id from relationships USE INDEX (user_id) where user_id = 256 and status = 'following'
union all
select user_id from relationships USE INDEX (follower_id) where follower_id = 256 and status = 'following')
and id not in (select follower_id from relationships USE INDEX (user_id) where user_id = 2 and status = 'blocked'
union all
select user_id from relationships USE INDEX (follower_id) where follower_id = 2 and status = 'blocked')
Benchmark both: whatever the execution plan says, actual performance may differ on a real database.
Don't use IN ( SELECT ... ), it optimizes poorly. Instead, either use a JOIN, or EXISTS ( SELECT ... ).
The OR to UNION trick is good, but not if it is still inside IN(...).
(To aid in readability, please omit the table name when there is only one table. And rename followee_id and/or follower_id; they are too close to each other in spelling.)
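As a sketch of the EXISTS form, the two blocked-user filters from the question could be written with NOT EXISTS as below. Only the exclusion part is shown; the alias names are arbitrary. One caveat: follower_id and followee_id are nullable, and NOT IN returns no rows at all if the subquery yields a NULL, whereas NOT EXISTS ignores such rows, so the two forms are only equivalent when no NULLs are involved.

```sql
SELECT u.*
FROM users AS u
WHERE NOT EXISTS (
        -- user 2 is blocking u
        SELECT 1
        FROM relationships AS r
        WHERE r.status = 'blocked'
          AND r.follower_id = 2
          AND r.followee_id = u.id
      )
  AND NOT EXISTS (
        -- u is blocking user 2
        SELECT 1
        FROM relationships AS r
        WHERE r.status = 'blocked'
          AND r.followee_id = 2
          AND r.follower_id = u.id
      );
```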
I am currently trying to optimize a MySQL statement. It takes about 10 seconds and outputs the average difference of two integers. The event table contains 6 columns and is indexed by its id and also by run_id plus every other key.
The table holds 3308000 rows for run_id 37 and 4162050 in total.
Most of the time seems to be spent on the join, so maybe there is a way to speed it up.
send.element_id and recv.element_id are unique; is there a way to express that in SQL which might lead to better performance?
|-------------------
|Column Type
|-------------------
|run_id int(11)
|element_id int(11)
|event_id int(11) PRIMARY
|event_time int(11)
|event_type varchar(20)
|event_data varchar(20)
The Query:
select avg(recv.event_time-send.event_time)
from
(
select element_id, event_time
from event
where run_id = 37 and event_type='SEND_FLIT'
) send,
(
select element_id, event_time
from event
where run_id = 37 and event_type='RECV_FLIT'
) recv
where recv.element_id = send.element_id
The Explain of the Query:
+----+-------------+------------+------+-----------------------------------------------------+-------------+---------+-------------+--------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
+----+-------------+------------+------+-----------------------------------------------------+-------------+---------+-------------+--------+-----------------------+
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 499458 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | element_id | 10 | NULL |
| 3 | DERIVED | event | ref | run_id,run_id_2,run_id_3,run_id_4,run_id_5,run_id_6 | run_id_5 | 26 | const,const | 499458 | Using index condition |
| 2 | DERIVED | event | ref | run_id,run_id_2,run_id_3,run_id_4,run_id_5,run_id_6 | run_id_5 | 26 | const,const | 562556 | Using index condition |
+----+-------------+------------+------+-----------------------------------------------------+-------------+---------+-------------+--------+-----------------------+
One way is to group by element_id and to use sum to determine the difference, which you can then pass to avg.
select avg(diff) from (
select
sum(case when event_type = 'SEND_FLIT' then -1 * event_time else event_time end)
as diff
from event
where run_id = 37
and event_type in ('SEND_FLIT','RECV_FLIT')
group by element_id
) t
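The sum works because each element contributes its send time negated and its receive time as is: an element sent at 100 and received at 130 yields -100 + 130 = 30. That only holds if every element_id has exactly one SEND_FLIT and one RECV_FLIT row in the run, which the question states but the query does not enforce. The variant below is a sketch that adds such a guard; the HAVING clause and the avg_latency alias are additions, not part of the answer above.

```sql
SELECT AVG(diff) AS avg_latency
FROM (
    SELECT
        SUM(CASE WHEN event_type = 'SEND_FLIT'
                 THEN -event_time
                 ELSE event_time
            END) AS diff
    FROM event
    WHERE run_id = 37
      AND event_type IN ('SEND_FLIT', 'RECV_FLIT')
    GROUP BY element_id
    -- Keep only elements with exactly one send and one receive;
    -- in MySQL a comparison evaluates to 0 or 1, so SUM counts matches.
    HAVING SUM(event_type = 'SEND_FLIT') = 1
       AND SUM(event_type = 'RECV_FLIT') = 1
) AS t;
```

Without the guard, an element that was sent but never received would contribute a large negative value and skew the average.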
The following query operates on two tables: dev_Profile and dev_User.
SELECT
dev_Profile.ID AS pid,
Name AS username,
st1.online
FROM
dev_Profile
LEFT JOIN (
SELECT
dev_User.ID,
lastActivityTime /* DATETIME */
FROM
dev_User)
AS st1 ON st1.ID = dev_Profile.UserID;
There are about 11K rows in each table and this query takes close to 6 seconds to complete. I don't have a lot of experience with databases yet. I thought creating an index for dev_Profile.UserID would do the trick, since dev_Profile.ID already has an index (it's the PK) and dev_Profile.UserID didn't have an index, but this didn't help at all.
EDIT: The EXPLAIN output for this query:
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------+
| 1 | PRIMARY | dev_Profile | ALL | NULL | NULL | NULL | NULL | 11521 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 11191 | |
| 2 | DERIVED | dev_User | ALL | NULL | NULL | NULL | NULL | 11440 | |
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------+
Any suggestions?
Why the nested select? That might be confusing the optimizer. Try eliminating it:
SELECT
dev_Profile.ID AS pid,
Name AS username,
st1.online
FROM
dev_Profile
LEFT JOIN dev_User st1 ON st1.ID = dev_Profile.UserID;
I am trying to improve a query which does the following:
For every job, add up all the costs, add up the invoiced amount, and calculate a profit/loss. The costs come from several different tables, e.g. purchaseorders, users_events (engineer allocated time/time he spent on site), stock used etc.
The query also needs to output some other columns like the name of the site for the work, so that that column can be sorted by (an ORDER BY is appended after all of this).
SELECT
jobs.job_id,
jobs.start_date,
jobs.end_date,
events.time,
sites.name site,
IFNULL(stock_cost,0) stock_cost,
labour,
materials,
labour+materials+plant+expenses revenue,
(labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) profit,
((labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)))/(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) ratio
FROM
jobs
LEFT JOIN (
SELECT
job_id,
SUM(labour_charge) labour,
SUM(materials_charge) materials,
SUM(plant_hire_charge) plant,
SUM(expenses) expenses
FROM invoices
GROUP BY job_id
ORDER BY NULL
) invoices USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IF(start_onsite && end_onsite,end_onsite-start_onsite,end-start)) time,
SUM(travel+parking+materials) user_expenses
FROM users_events
WHERE type='job'
GROUP BY job_id
ORDER BY NULL
) events USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IFNULL(total,0))*0.01 orders_cost
FROM purchaseorders
GROUP BY job_id
ORDER BY NULL
) purchaseorders USING(job_id)
LEFT JOIN (
SELECT
location job_id,
SUM(amount*cost)*0.01 stock_cost
FROM stock_location
LEFT JOIN stock_items ON stock_items.id=stock_location.stock_id
WHERE location>=3000 AND amount>0 AND cost>0
GROUP BY location
ORDER BY NULL
) stock USING(job_id)
LEFT JOIN contacts_sites sites ON sites.id=jobs.site_id;
I read this: http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html but don't see how/if I can apply anything therein.
For testing purposes, I have tried adding all sorts of indices on fields left, right and centre with no improvement to the EXPLAIN output:
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| 1 | PRIMARY | jobs | ALL | NULL | NULL | NULL | NULL | 7088 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5038 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 6476 | |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 904 | |
| 1 | PRIMARY | <derived5> | ALL | NULL | NULL | NULL | NULL | 531 | |
| 1 | PRIMARY | sites | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.jobs.site_id | 1 | |
| 5 | DERIVED | stock_location | ALL | stock,location,amount,…| NULL | NULL | NULL | 5426 | Using where; Using temporary; |
| 5 | DERIVED | stock_items | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.stock_location.stock_id | 1 | Using where |
| 4 | DERIVED | purchaseorders | ALL | NULL | NULL | NULL | NULL | 1445 | Using temporary; |
| 3 | DERIVED | users_events | ALL | type,type_job | NULL | NULL | NULL | 11295 | Using where; Using temporary; |
| 2 | DERIVED | invoices | ALL | NULL | NULL | NULL | NULL | 5320 | Using temporary; |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
The number of rows produced is 5 x 10^21 (down from 3 x 10^42 before I started optimising this query!)
It currently takes seven seconds to execute (down from 26) but I would like that to be under one second.
By the way: GROUP BY x ORDER BY NULL is a great way to eliminate unnecessary filesorts from subqueries! (from http://www.mysqlperformanceblog.com/2006/09/04/group_concat-useful-group-by-extension/)
Based on your comment to my question, I would do the following...
At the very top...
SELECT STRAIGHT_JOIN (just add the "STRAIGHT_JOIN" keyword)
Then, for each of your subqueries for invoices, events, p/o's, etc, change the ORDER BY to the JOB_ID explicitly so it might help the optimization against the primary JOBS table join.
Finally, ensure each of your subquery tables HAS an index on the Job_ID (Invoices, User_events, PurchaseOrders, Stock_Location)
Additionally, for the Stock_Location table, you might want to help the WHERE clause of your subquery by adding a compound index on
(job_id, location, amount). Three fields deep should be enough, even though you have the key plus 3 WHERE condition elements.
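A rough sketch of these suggestions follows. All index names are invented. In the posted query the stock subquery's job_id is an alias for location, so the stock_location index here starts with location and assumes amount is a column of that table; whether the optimizer uses any of these indexes would need to be checked with EXPLAIN.

```sql
-- One index per subquery table, leading with the grouping/join column.
ALTER TABLE invoices       ADD INDEX idx_invoices_job (job_id);
ALTER TABLE users_events   ADD INDEX idx_events_type_job (type, job_id);
ALTER TABLE purchaseorders ADD INDEX idx_po_job (job_id);
ALTER TABLE stock_location ADD INDEX idx_stock_loc_amount (location, amount);

-- STRAIGHT_JOIN goes right after SELECT and makes MySQL join the tables
-- in the order they are written. Shortened to one derived table here,
-- with ORDER BY job_id in place of ORDER BY NULL as suggested.
SELECT STRAIGHT_JOIN
    jobs.job_id,
    invoices.labour
FROM jobs
LEFT JOIN (
    SELECT job_id, SUM(labour_charge) AS labour
    FROM invoices
    GROUP BY job_id
    ORDER BY job_id
) AS invoices USING (job_id);
```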