Optimize MySql query: Too slow when ordering - mysql

(edited) For more details about the app it self, please, also see:
Simple but heavy application consuming a lot of resources. How to Optimize?
(The adopted solution was use both joins and fulltext search)
I have the following query running up to roughly 500.000 rows in 25 seconds. If I remove the ORDER, it takes 0.5 seconds.
Fisrt test
Keeping the ORDER and removing all t. and tu. columns, the query takes 7 seconds.
Second test
If I add or remove an INDEX to the i.created_at field the response time remain the same.
QUERY:
**EDITED: I'VE NOTICED THAT BOTH GROUP BY AND ORDER BY SLOW DOWN THE QUERY (I've also achieve a little gain in the query changing the joins. The gain was to 10secs, but at all, the problem remains). With the modification, the EXPLAIN have stopped to return filesort, but stills returning "using temporary" **
SELECT SQL_NO_CACHE
DISTINCT `i`.`id`,
`i`.`entity`,
`i`.`created_at`,
`i`.`collected_at`,
`t`.`status_id` AS `twt_status_id`,
`t`.`user_id` AS `twt_user_id`,
`t`.`content` AS `twt_content`,
`tu`.`id` AS `twtu_id`,
`tu`.`screen_name` AS `twtu_screen_name`,
`tu`.`profile_image` AS `twtu_profile_image`
FROM `mtrt_items` AS `i`
LEFT JOIN `mtrt_users` AS `u` ON i.user_id =u.id
LEFT JOIN `twt_tweets_content` AS `t` ON t.id =i.id
LEFT JOIN `twt_users` AS `tu` ON u.id = tu.id
INNER JOIN `mtrt_items_searches` AS `r` ON i.id =r.item_id
INNER JOIN `mtrt_searches` AS `s` ON s.id =r.search_id
INNER JOIN `mtrt_searches_groups` AS `sg` ON sg.search_id =s.id
INNER JOIN `mtrt_search_groups` AS `g` ON sg.group_id =g.id
INNER JOIN `account_clients` AS `c` ON g.client_id =c.id
ORDER BY `i`.`created_at` DESC
LIMIT 100 OFFSET 0
Here is the EXPLAIN (EDITED):
+----+-------------+-------+--------+--------------------+-----------+---------+------------------------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------+-----------+---------+------------------------+------+------------------------------+
| 1 | SIMPLE | c | index | PRIMARY | PRIMARY | 4 | NULL | 1 | Using index; Using temporary |
| 1 | SIMPLE | g | ref | PRIMARY,client_id | client_id | 4 | clubr_new.c.id | 3 | Using index |
| 1 | SIMPLE | sg | ref | group_id,search_id | group_id | 4 | clubr_new.g.id | 1 | Using index |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 4 | clubr_new.sg.search_id | 1 | Using index |
| 1 | SIMPLE | r | ref | search_id,item_id | search_id | 4 | clubr_new.s.id | 4359 | Using where |
| 1 | SIMPLE | i | eq_ref | PRIMARY | PRIMARY | 8 | clubr_new.r.item_id | 1 | |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 8 | clubr_new.i.user_id | 1 | Using index |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | clubr_new.i.id | 1 | |
| 1 | SIMPLE | tu | eq_ref | PRIMARY | PRIMARY | 8 | clubr_new.u.id | 1 | |
+----+-------------+-------+--------+--------------------+-----------+---------+------------------------+------+------------------------------+
Here is the mtrt_items table:
+--------------+-------------------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------------------------------------------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| entity | enum('twitter','facebook','youtube','flickr','orkut') | NO | MUL | NULL | |
| user_id | bigint(20) | NO | MUL | NULL | |
| created_at | datetime | NO | MUL | NULL | |
| collected_at | datetime | NO | | NULL | |
+--------------+-------------------------------------------------------+------+-----+---------+----------------+
CREATE TABLE `mtrt_items` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`entity` enum('twitter','facebook','youtube','flickr','orkut') COLLATE utf8_unicode_ci NOT NULL,
`user_id` bigint(20) NOT NULL,
`created_at` datetime NOT NULL,
`collected_at` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `mtrt_user_id` (`user_id`),
KEY `entity` (`entity`),
KEY `created_at` (`created_at`),
CONSTRAINT `mtrt_items_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `mtrt_users` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=309650 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
The twt_tweets_content is MyISAM and is also used for fulltext searches:
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
| status_id | varchar(100) | NO | MUL | NULL | |
| content | varchar(200) | NO | MUL | NULL | |
+-----------+--------------+------+-----+---------+-------+

Instead of placing the Order By into the main query, wrap it, like so:
SELECT * FROM (
... your query
) ORDER BY `created at`
Take a look at the query plan. You will find that in your case, the sort is performed on your table mtrt_items before the outer join is performed. In the rewrite I've partially provided, the sort is applied after the outer joins, and is applied on a much smaller set.
UPDATE
Assuming that the LIMIT is being applied to a large set (500,000?), it looks like you can perform the top before doing any of the joins.
SELECT * from (
SELECT
`id`, ... `created_at`, ...
ORDER BY `i`.`created_at` DESC
LIMIT 100 OFFSET 0) as i
LEFT JOIN `mtrt_users` AS `u` ON i.user_id =u.id
LEFT JOIN `twt_tweets_content` AS `t` ON t.id =i.id
LEFT JOIN `twt_users` AS `tu` ON t.user_id = tu.id
INNER JOIN `mtrt_items_searches` AS `r` ON i.id =r.item_id
INNER JOIN `mtrt_searches` AS `s` ON s.id =r.search_id
INNER JOIN `mtrt_searches_groups` AS `sg` ON sg.search_id =s.id
INNER JOIN `mtrt_search_groups` AS `g` ON sg.group_id =g.id
INNER JOIN `account_clients` AS `c` ON g.client_id =c.id
GROUP BY i.id

Don't include the VARCHAR/TEXT fields in your initial query. This will create the TEMPORARY table required for the sorting, using the MEMORY engine and this will increase the efficiency dramatically. You can collect the text fields later using another query, without any sorting, simply with a condition on the PRIMARY KEY field and merge the data in your script (assuming that you are using one).
Also get rid of any JOINs (INNER or OUTER) that you don't actually take any data from.

Related

Optimizing MySQL select query

MySQL version: 5.7.23
Engine: InnoDB
I created an application that monitors network devices from around the world with ICMP echo request packets. It pings devices on a regular interval and stores the results in a MySQL table.
I have a query that fetches the latest 100 up/down events for a given device, but it takes ~38 seconds to execute, which is way too long. I'm trying to optimize the query but I'm kind of lost.
The query:
select
c.id as clusterId,
c.name as cluster,
m.id as machineId,
m.label as machine,
h.id as pingResultId,
h.timePinged as `timestamp`,
h.status
from pinger_history h
join pinger_history_updown ud on ud.pingResultId = h.id
join pinger_machine_ip_addresses i on h.machineIpId = i.id
join pinger_machines m on i.machineId = m.id
join pinger_clusters c on m.clusterId = c.id
where h.deviceId = ?
order by h.id desc
limit 100
Explain query output:
+----+-------------+-------+------------+--------+------------------------------+---------+---------+---------------------------+--------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+------------------------------+---------+---------+---------------------------+--------+----------+----------------------------------------------+
| 1 | SIMPLE | ud | NULL | index | PRIMARY | PRIMARY | 4 | NULL | 111239 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | h | NULL | eq_ref | PRIMARY,deviceId,machineIpId | PRIMARY | 4 | dashboard.ud.pingResultId | 1 | 5.00 | Using where |
| 1 | SIMPLE | i | NULL | eq_ref | PRIMARY,machineId | PRIMARY | 4 | dashboard.h.machineIpId | 1 | 100.00 | NULL |
| 1 | SIMPLE | m | NULL | eq_ref | PRIMARY,clusterId | PRIMARY | 4 | dashboard.i.machineId | 1 | 100.00 | Using where |
| 1 | SIMPLE | c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dashboard.m.clusterId | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+------------------------------+---------+---------+---------------------------+--------+----------+----------------------------------------------+
The pinger_history table consists of around 483,750,000 rows and pinger_history_updown around 115,520 rows. The other tables are small in comparison (less than 300 rows).
If anyone has experience in optimizing queries or debugging bottlenecks then all help will be greatly appreciated.
Edit:
I added the missing order by h.id desc to the query and I made pinger_history the first table in the query.
Here are the create table queries for pinger_history and pinger_history_updown:
pinger_history:
mysql> show create table pinger_history;
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| pinger_history | CREATE TABLE `pinger_history` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`deviceId` int(10) unsigned NOT NULL,
`machineIpId` int(10) unsigned NOT NULL,
`minRoundTripTime` decimal(6,1) unsigned DEFAULT NULL,
`maxRoundTripTime` decimal(6,1) unsigned DEFAULT NULL,
`averageRoundTripTime` decimal(6,1) unsigned DEFAULT NULL,
`packetLossRatio` decimal(3,2) unsigned DEFAULT NULL,
`timePinged` datetime NOT NULL,
`status` enum('Up','Unstable','Down') DEFAULT NULL,
`firstOppositeStatusPingResultId` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `deviceId` (`deviceId`),
KEY `machineIpId` (`machineIpId`),
KEY `timePinged` (`timePinged`),
KEY `firstOppositeStatusPingResultId` (`firstOppositeStatusPingResultId`),
CONSTRAINT `pinger_history_ibfk_2` FOREIGN KEY (`machineIpId`) REFERENCES `pinger_machine_ip_addresses` (`id`),
CONSTRAINT `pinger_history_ibfk_4` FOREIGN KEY (`deviceId`) REFERENCES `pinger_devices` (`id`) ON DELETE CASCADE,
CONSTRAINT `pinger_history_ibfk_5` FOREIGN KEY (`firstOppositeStatusPingResultId`) REFERENCES `pinger_history` (`id`) ON DELETE SET NULL
) ENGINE=InnoDB AUTO_INCREMENT=483833283 DEFAULT CHARSET=utf8mb4 |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
pinger_history_updown:
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| pinger_history_updown | CREATE TABLE `pinger_history_updown` (
`pingResultId` int(10) unsigned NOT NULL,
`notified` tinyint(1) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`pingResultId`),
CONSTRAINT `pinger_history_updown_ibfk_1` FOREIGN KEY (`pingResultId`) REFERENCES `pinger_history` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Edit 2:
Here is the output of show index for pinger_history:
mysql> show index from pinger_history;
+----------------+------------+---------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+---------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| pinger_history | 0 | PRIMARY | 1 | id | A | 443760800 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | deviceId | 1 | deviceId | A | 288388 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | machineIpId | 1 | machineIpId | A | 71598 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | timePinged | 1 | timePinged | A | 38041236 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | firstOppositeStatusPingResultId | 1 | firstOppositeStatusPingResultId | A | 8973 | NULL | NULL | YES | BTREE | | |
+----------------+------------+---------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Edit 3:
Here is the explain output when I add straight_join:
Note that the query takes almost 2 minutes with straight_join but around 36 seconds without.
+----+-------------+-------+------------+--------+------------------------------+----------+---------+-------------------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+------------------------------+----------+---------+-------------------------+--------+----------+-------------+
| 1 | SIMPLE | h | NULL | ref | PRIMARY,deviceId,machineIpId | deviceId | 4 | const | 344062 | 100.00 | Using where |
| 1 | SIMPLE | ud | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dashboard.h.id | 1 | 100.00 | Using index |
| 1 | SIMPLE | i | NULL | eq_ref | PRIMARY,machineId | PRIMARY | 4 | dashboard.h.machineIpId | 1 | 100.00 | NULL |
| 1 | SIMPLE | m | NULL | eq_ref | PRIMARY,clusterId | PRIMARY | 4 | dashboard.i.machineId | 1 | 100.00 | Using where |
| 1 | SIMPLE | c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dashboard.m.clusterId | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+------------------------------+----------+---------+-------------------------+--------+----------+-------------+
create index on following columns
pingResultId ,machineIpId ,clusterId .pinger_clusters .id,
pinger_machine_ip_addresses.id,pinger_history.id,pinger_machines.id,deviceId ,
pinger_history_updown.pingResultId
indexing will reduce the time taken to fetch the data
You wrote that your query fetches the latest 100 events for device, but there is no ORDER BY clause in your SQL.
Add ORDER BY h.id DESCto your query and create composite index on (devideId, id) fields.
I would add the "STRAIGHT_JOIN" keyword and also put the pinger_history into the first position. Then, include an index on pinger_history by the DeviceID to optimize the WHERE clause. Your other tables would probably already have an index on their respective ID keys implied and should be good. The STRAIGHT_JOIN clause tells MySQL to run the query in the table/join order I gave you, don't imply something else.
select STRAIGHT_JOIN
c.id as clusterId,
c.name as cluster,
m.id as machineId,
m.label as machine,
h.id as pingResultId,
h.timePinged as `timestamp`,
h.status
from
pinger_history h
join pinger_history_updown ud
on h.id = ud.pingResultId
join pinger_machine_ip_addresses i
on h.machineIpId = i.id
join pinger_machines m
on i.machineId = m.id
join pinger_clusters c
on m.clusterId = c.id
where
h.deviceId = ?
order by
h.id desc
limit 100
Since you DO want the most recent records, I would definitely have and index on your pinger_history table on (DeviceID, ID ) -- change your existing key of DeviceID only and change it to (DeviceID, ID)
This way, the WHERE clause is FIRST optimized to get the Device ID records. By having the ID as part of the index, but in the second position, the ORDER by can utilize that to get the most recent first for you.
Plan A: Get rid of pinger_history_updown and move notified into pinger_history. Perhaps augment status to indicate "CameUp" and "WentDown". Pro: That will make the query much faster since it will be able to use INDEX(deviceId). Con: It makes pinger_history a little bigger; adding columns to a huge table will take time.
Plan B: Add deviceId to pinger_history_updown and have INDEX(deviceID, pingResultId). Pro: Much faster query. Con: Redundant data (deviceid) is frowned on.
Plan C: Add an index hint to force the execution to start with pinger_history. Con: "What helps today may hurt tomorrow." (STRAIGHT_JOIN was tested and found to be slower.)
Plan D: See if ANALYZE TABLE for each table will help. Pro: Quick and cheap. Con: May not help.
Plan E: Change to ORDER BY deviceId DESC, id DESC. Pro: Cheap and easy to try. Con: May not help.
Plan F: In pinger_history, change
PRIMARY KEY (`id`),
KEY `deviceId` (`deviceId`),
to
PRIMARY KEY(deviceId, id),
KEY(id)
This will make the desired rows "clustered" much better. Pros: Much faster. Con: ALTER TABLE will take a long time for that huge table.
Plan G: Assume it is an explode-implode problem and move the LIMIT into a derived table:
select c.id as clusterId, c.name as cluster, m.id as machineId,
m.label as machine, h2.id as pingResultId, h2.timePinged as `timestamp`,
h2.status
FROM
( -- "derived table"
SELECT ud1.pingResultId
FROM pinger_history_updown AS ud1
JOIN pinger_history AS h1 ON ud1.pingResultId = h1.id
WHERE h1.deviceId = ?
ORDER BY ud1.pingResultId
LIMIT 100 -- only needed here
) AS ud2
JOIN pinger_history AS h2 ON ud2.pingResultId = h2.id
join pinger_machine_ip_addresses i ON h.machineIpId = i.id
join pinger_machines m ON i.machineId = m.id
join pinger_clusters c ON m.clusterId = c.id
order by h2.id desc -- Yes, this is repeated
Pro: May make better use of 'covering' INDEX(deviceId), especially if merged with Plan B.
Summary: Start with D and E.

Debugging Slow mySQL query with Explain

Have found an inefficient query in our system. content holds versions of slides, and this is supposed to select the highest version of a slide by id.
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901'
group by `slide_id`
) c ON `c`.`version` = `content`.`version`;
EXPLAIN
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 1 | PRIMARY | content | NULL | ref | PRIMARY,version | PRIMARY | 8 | const | 9703 | 100.00 | NULL |
| 2 | DERIVED | content | NULL | ref | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id | fk_content_slides_idx | 8 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
One big issue is that it returns almost all the slides in the system as the outer query does not filter by slide id. After adding that I get...
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901' group by `slide_id`
) c ON `c`.`version` = `content`.`version`
WHERE `slide_id` = '16901';
EXPLAIN
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 1 | PRIMARY | content | NULL | const | PRIMARY,fk_content_slides_idx,version,slide_id | PRIMARY | 16 | const,const | 1 | 100.00 | NULL |
| 2 | DERIVED | content | NULL | ref | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id | fk_content_slides_idx | 8 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
That reduces the amount of rows down to one correctly, but doesnt really speed things up.
There are indexes on version, slide_id and a unique key on version AND slide_id.
Is there anything else I can do to speed this up?
Use a TOP LIMIT 1 insetead of Max ?
m
MySQL seems to take an index (version, slide_id) to join the tables. You should get a better result with
SELECT `content`.*
FROM `content`
FORCE INDEX FOR JOIN (fk_content_slides_idx)
join (
SELECT `slide_id`, max(version) as `version` from `content`
WHERE `slide_id` = '16901' group by `slide_id`
) c ON `c`.`slide_id` = `content`.`slide_id` and `c`.`version` = `content`.`version`
You need an index that has slide_id as first column, I just guessed that's fk_content_slides_idx, if not, take another one.
The part FORCE INDEX FOR JOIN (fk_content_slides_idx) is just to enforce it, you should try if mysql takes it by itself without forcing (it should).
You might get even a slightly better result with an index (slide_id, version), it depends on the amount of data (e.g. the number of versions per id) if you see a difference (but you should not spam indexes, and you already have a lot on this table, but you can try it for fun.)
Just a suggestion i think you should avoid the group by slide_id because you are filter by one slide_id only (16901)
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901'
) c ON `c`.`version` = `content`.`version`
WHERE `slide_id` = '16901';

Mysql is not using index on query

here is my explain result.
+----+-------------+----------------+--------+-------------------------------------------------------+-------------+---------+-------------------------------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+-------------------------------------------------------+-------------+---------+-------------------------------------+------+----------------------------------------------------+
| 1 | SIMPLE | client_address | ALL | client_id | NULL | NULL | NULL | 1619 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | clients | eq_ref | PRIMARY,contract_id,contract_id_2 | PRIMARY | 3 | defenselaw.client_address.client_id | 1 | Using where |
| 1 | SIMPLE | user_infos | eq_ref | user_id | user_id | 3 | defenselaw.clients.client_rep_id | 1 | Using where |
| 1 | SIMPLE | users | eq_ref | PRIMARY | PRIMARY | 2 | defenselaw.user_infos.user_id | 1 | Using where; Using index |
| 1 | SIMPLE | contracts | eq_ref | PRIMARY | PRIMARY | 3 | defenselaw.clients.contract_id | 1 | NULL |
| 1 | SIMPLE | documents | eq_ref | contract_id,contract_id_2,contract_id_3,contract_id_4 | contract_id | 3 | defenselaw.clients.contract_id | 1 | Using where |
| 1 | SIMPLE | client_info | ref | client_id,client_id_2,client_id_3,index5 | client_id | 3 | defenselaw.client_address.client_id | 1 | Using where |
| 1 | SIMPLE | location | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+----------------+--------+-------------------------------------------------------+-------------+---------+-------------------------------------+------+----------------------------------------------------+
here is the query.
SELECT
contracts.contract_id,
documents.documentKey,
documents.document_name,
documents.notify_accountant,
clients.client_first_name,
clients.client_middle_name,
clients.client_last_name,
clients.client_rep_id,
client_address.client_street,
client_address.client_apartment,
client_address.client_city,
client_address.client_state,
client_address.client_zip,
client_info.client_email,
client_info.client_home_phone,
client_info.client_work_phone,
client_info.client_cell_phone,
client_info.client_fax_number,
contracts.contract_agreement_date,
user_infos.user_first_name,
user_infos.user_last_name,
location.location
FROM
`contracts`
LEFT JOIN
`clients`
ON clients.contract_id = contracts.contract_id
INNER JOIN
`client_address`
ON client_address.client_id = clients.client_id
INNER JOIN
`client_info`
ON client_info.client_id = client_address.client_id
LEFT JOIN
`documents`
ON contracts.contract_id = documents.contract_id
LEFT JOIN
`user_infos`
ON user_infos.user_id = clients.client_rep_id
LEFT JOIN
`users` ON
users.user_id = user_infos.user_id
LEFT JOIN
`location`
ON location.location_id = user_infos.location_id
WHERE client_info.client_role = 'primary'
AND documents.archive = 0
ORDER BY
documents.created_on DESC
Could someone explain to me why my index on client_address isn't being used? I am lost here as I just don't know what to do. It's not that complicated of a query its just rather large. Yes, stackoverflow, I understand the post is mostly code. What more do you want from me? How much more can I say about this other than my query isn't using an index and its causing a really slow query here.
+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| client_address | CREATE TABLE `client_address` (
`client_address_id` mediumint(12) NOT NULL AUTO_INCREMENT,
`client_id` mediumint(12) NOT NULL,
`client_street` varchar(40) COLLATE utf8_bin NOT NULL,
`client_apartment` varchar(20) COLLATE utf8_bin NOT NULL,
`client_city` varchar(80) COLLATE utf8_bin NOT NULL,
`client_state` varchar(80) COLLATE utf8_bin NOT NULL,
`client_zip` varchar(80) COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`client_address_id`),
KEY `client_id` (`client_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2167 DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

why inner join on condition does not use key

I am creating a table as
create table temp_test2 (
date_id int(11) NOT NULL DEFAULT '0',
`date` date NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (date_id)
);
create table temp_test1 (
date_id int(11) NOT NULL DEFAULT '0',
`date` date NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (date_id)
);
explain select * from temp_test as t inner join temp_test2 as t2 on (t2.date_id =t.date_id) limit 3;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
| 1 | SIMPLE | t | ALL | date_id | NULL | NULL | NULL | 4 | NULL |
| 1 | SIMPLE | t2 | ALL | date_id | NULL | NULL | NULL | 4 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
why the code_id key is not used in both the table, but when I use code_id=something in on condition it's using the key,
explain select * from temp_test as t inner join temp_test2 as t2 on (t2.date_id =t.date_id and t.date_id =1) limit 3;
+----+-------------+-------+-------+-------------------------------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------------------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | t | const | PRIMARY,date_id,date_id_2,date_id_3 | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | t2 | ref | date_id,date_id_2,date_id_3 | date_id | 4 | const | 1 | NULL |
+----+-------------+-------+-------+-------------------------------------+---------+---------+-------+------+-------+
I tried (unique,composite primary,composite) key also but it is not working.
Can anyone explain why this so?
Because your tables contain a very small number of records, the optimiser decides that it is not worth using the index. A table scan will do just as good.
Also, you selected all fields (SELECT *), if it used the index for executing the JOIN a row scan would still be required to get the full contents.
The query would be more likely to use the index if:
you selected only the date_id field
there were more than 4 rows in temp_test

MySQL Join Optimisation: Improving join type with derived tables and GROUP BY

I am trying to improve a query which does the following:
For every job, add up all the costs, add up the invoiced amount, and calculate a profit/loss. The costs come from several different tables, e.g. purchaseorders, users_events (engineer allocated time/time he spent on site), stock used etc.
The query also needs to output some other columns like the name of the site for the work, so that that column can be sorted by (an ORDER BY is appended after all of this).
SELECT
jobs.job_id,
jobs.start_date,
jobs.end_date,
events.time,
sites.name site,
IFNULL(stock_cost,0) stock_cost,
labour,
materials,
labour+materials+plant+expenses revenue,
(labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) profit,
((labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)))/(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) ratio
FROM
jobs
LEFT JOIN (
SELECT
job_id,
SUM(labour_charge) labour,
SUM(materials_charge) materials,
SUM(plant_hire_charge) plant,
SUM(expenses) expenses
FROM invoices
GROUP BY job_id
ORDER BY NULL
) invoices USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IF(start_onsite && end_onsite,end_onsite-start_onsite,end-start)) time,
SUM(travel+parking+materials) user_expenses
FROM users_events
WHERE type='job'
GROUP BY job_id
ORDER BY NULL
) events USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IFNULL(total,0))*0.01 orders_cost
FROM purchaseorders
GROUP BY job_id
ORDER BY NULL
) purchaseorders USING(job_id)
LEFT JOIN (
SELECT
location job_id,
SUM(amount*cost))*0.01 stock_cost
FROM stock_location
LEFT JOIN stock_items ON stock_items.id=stock_location.stock_id
WHERE location>=3000 AND amount>0 AND cost>0
GROUP BY location
ORDER BY NULL
) stock USING(job_id)
LEFT JOIN contacts_sites sites ON sites.id=jobs.site_id;
I read this: http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html but don't see how/if I can apply anything therein.
For testing purposes, I have tried adding all sorts of indices on fields left, right and centre with no improvement to the EXPLAIN output:
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| 1 | PRIMARY | jobs | ALL | NULL | NULL | NULL | NULL | 7088 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5038 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 6476 | |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 904 | |
| 1 | PRIMARY | <derived5> | ALL | NULL | NULL | NULL | NULL | 531 | |
| 1 | PRIMARY | sites | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.jobs.site_id | 1 | |
| 5 | DERIVED | stock_location | ALL | stock,location,amount,…| NULL | NULL | NULL | 5426 | Using where; Using temporary; |
| 5 | DERIVED | stock_items | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.stock_location.stock_id | 1 | Using where |
| 4 | DERIVED | purchaseorders | ALL | NULL | NULL | NULL | NULL | 1445 | Using temporary; |
| 3 | DERIVED | users_events | ALL | type,type_job | NULL | NULL | NULL | 11295 | Using where; Using temporary; |
| 2 | DERIVED | invoices | ALL | NULL | NULL | NULL | NULL | 5320 | Using temporary; |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
The rows produced is 5 x 10^21 (down from 3 x 10^42 before I started optimising this query!)
It currently takes seven seconds to execute (down from 26) but I would like that to be under one second.
By the way: GROUP BY x ORDER BY NULL is a great way to eliminate unnecessary filesorts from subqueries! (from http://www.mysqlperformanceblog.com/2006/09/04/group_concat-useful-group-by-extension/)
Based on your comment to my question, I would do the following...
At the very top...
SELECT STRAIGHT_JOIN (just add the "STRAIGH_JOIN" keyword)
Then, for each of your subqueries for invoices, events, p/o's, etc, change the ORDER BY to the JOB_ID explicitly so it might help the optimization against the primary JOBS table join.
Finally, ensure each of your subquery tables HAS an index on the Job_ID (Invoices, User_events, PurchaseOrders, Stock_Location)
Additionally, for the Stock_Location table, you might want to help the WHERE clause for your subquery by having a compound index on
(job_id, location, amount) Three fields deep should be enough even though you have the key plus 3 where condition elements.