How can I speed up my query? The subquery is too slow - mysql

The query I have is for a table of inventory. The subquery join gets the total number of work orders for each inventory asset. If I run the base query with the main joins for equipment type, vendor, location, and room, it runs just fine: less than a second to return a result. With the subquery join, it takes 15 to 20 seconds to return a result.
Here is the full query:
SELECT `inventory`.inventory_id AS 'inventory_id',
`inventory`.media_tag AS 'media_tag',
`inventory`.asset_tag AS 'asset_tag',
`inventory`.idea_tag AS 'idea_tag',
`equipTypes`.equipment_type AS 'equipment_type',
`inventory`.equip_make AS 'equip_make',
`inventory`.equip_model AS 'equip_model',
`inventory`.equip_serial AS 'equip_serial',
`inventory`.sales_order AS 'sales_order',
`vendors`.vendor_name AS 'vendor_name',
`inventory`.purchase_order AS 'purchase_order',
`status`.status AS 'status',
`locations`.location_name AS 'location_name',
`rooms`.room_number AS 'room_number',
`inventory`.notes AS 'notes',
`inventory`.send_to AS 'send_to',
`inventory`.one_to_one AS 'one_to_one',
`enteredBy`.user_name AS 'user_name',
from_unixtime(`inventory`.enter_date, '%m/%d/%Y') AS 'enter_date',
from_unixtime(`inventory`.modified_date, '%m/%d/%Y') AS 'modified_date',
COALESCE(at.assets,0) AS assets
FROM mod_inventory_data AS `inventory`
LEFT JOIN mod_inventory_equip_types AS `equipTypes`
ON `equipTypes`.equip_type_id = `inventory`.equip_type_id
LEFT JOIN mod_vendors_main AS `vendors`
ON `vendors`.vendor_id = `inventory`.vendor_id
LEFT JOIN mod_inventory_status AS `status`
ON `status`.status_id = `inventory`.status_id
LEFT JOIN mod_locations_data AS `locations`
ON `locations`.location_id = `inventory`.location_id
LEFT JOIN mod_locations_rooms AS `rooms`
ON `rooms`.room_id = `inventory`.room_id
LEFT JOIN mod_users_data AS `enteredBy`
ON `enteredBy`.user_id = `inventory`.entered_by
LEFT JOIN
( SELECT asset_tag, count(*) AS assets
FROM mod_workorder_data
WHERE asset_tag IS NOT NULL
GROUP BY asset_tag ) AS at
ON at.asset_tag = inventory.asset_tag
ORDER BY inventory_id ASC LIMIT 0,20
The MySQL EXPLAIN output for this is here:
+----+-------------+--------------------+--------+---------------+-----------+---------+-------------------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+--------+---------------+-----------+---------+-------------------------------------+-------+---------------------------------+
| 1 | PRIMARY | inventory | ALL | NULL | NULL | NULL | NULL | 12612 | Using temporary; Using filesort |
| 1 | PRIMARY | equipTypes | eq_ref | PRIMARY | PRIMARY | 4 | spsd_woidbs.inventory.equip_type_id | 1 | |
| 1 | PRIMARY | vendors | eq_ref | PRIMARY | PRIMARY | 4 | spsd_woidbs.inventory.vendor_id | 1 | |
| 1 | PRIMARY | status | eq_ref | PRIMARY | PRIMARY | 4 | spsd_woidbs.inventory.status_id | 1 | |
| 1 | PRIMARY | locations | eq_ref | PRIMARY | PRIMARY | 4 | spsd_woidbs.inventory.location_id | 1 | |
| 1 | PRIMARY | rooms | eq_ref | PRIMARY | PRIMARY | 4 | spsd_woidbs.inventory.room_id | 1 | |
| 1 | PRIMARY | enteredBy | eq_ref | PRIMARY | PRIMARY | 4 | spsd_woidbs.inventory.entered_by | 1 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4480 | |
| 2 | DERIVED | mod_workorder_data | range | asset_tag | asset_tag | 13 | NULL | 15897 | Using where; Using index |
+----+-------------+--------------------+--------+---------------+-----------+---------+-------------------------------------+-------+---------------------------------+
Using MySQL query profiling, I get this:
+--------------------------------+------------+
| Status | Time |
+--------------------------------+------------+
| starting | 0.000020 |
| checking query cache for query | 0.000263 |
| Opening tables | 0.000034 |
| System lock | 0.000013 |
| Table lock | 0.000079 |
| optimizing | 0.000011 |
| statistics | 0.000138 |
| preparing | 0.000019 |
| executing | 0.000010 |
| Sorting result | 0.000004 |
| Sending data | 0.015103 |
| init | 0.000094 |
| optimizing | 0.000009 |
| statistics | 0.000049 |
| preparing | 0.000022 |
| Creating tmp table | 0.000104 |
| executing | 0.000009 |
| Copying to tmp table | 15.410168 |
| Sorting result | 0.009488 |
| Sending data | 0.000215 |
| end | 0.000006 |
| removing tmp table | 0.001997 |
| end | 0.000018 |
| query end | 0.000005 |
| freeing items | 0.000112 |
| storing result in query cache | 0.000011 |
| removing tmp table | 0.000022 |
| closing tables | 0.000036 |
| logging slow query | 0.000005 |
| logging slow query | 0.000005 |
| cleaning up | 0.000013 |
+--------------------------------+------------+
which shows me that the bottleneck is copying to the temp table, but I am unsure how to speed this up. Are there settings on the server end that I can configure to make this faster? Are there changes to the existing query that would yield the same results but run faster?
It seems to me that the LEFT JOIN subquery would give the same resulting data matrix every time, so if it has to run that query for every row in the inventory list, I can see why it would be slow. Or does MySQL cache the subquery when it runs? I thought I read somewhere that MySQL does not cache subqueries; is this true?
Any help is appreciated.

Here is what I did, which seems to be working well. I created a table called mod_workorder_counts. The table has two fields: asset_tag, which is unique, and wo_count, which is an INT(3) field. I am populating that table with this query:
INSERT INTO mod_workorder_counts ( asset_tag, wo_count )
select s.asset_tag, ct
FROM
( SELECT t.asset_tag, count(*) as ct
FROM mod_workorder_data t
WHERE t.asset_tag IS NOT NULL
GROUP BY t.asset_tag
) as s
ON DUPLICATE KEY UPDATE mod_workorder_counts.wo_count = ct
which executed in 0.1580 seconds. That may be considered slightly slow, but it is not bad.
Now when I run this modification of my original query:
SELECT `inventory`.inventory_id AS 'inventory_id',
`inventory`.media_tag AS 'media_tag',
`inventory`.asset_tag AS 'asset_tag',
`inventory`.idea_tag AS 'idea_tag',
`equipTypes`.equipment_type AS 'equipment_type',
`inventory`.equip_make AS 'equip_make',
`inventory`.equip_model AS 'equip_model',
`inventory`.equip_serial AS 'equip_serial',
`inventory`.sales_order AS 'sales_order',
`vendors`.vendor_name AS 'vendor_name',
`inventory`.purchase_order AS 'purchase_order',
`status`.status AS 'status',
`locations`.location_name AS 'location_name',
`rooms`.room_number AS 'room_number',
`inventory`.notes AS 'notes',
`inventory`.send_to AS 'send_to',
`inventory`.one_to_one AS 'one_to_one',
`enteredBy`.user_name AS 'user_name',
from_unixtime(`inventory`.enter_date, '%m/%d/%Y') AS 'enter_date',
from_unixtime(`inventory`.modified_date, '%m/%d/%Y') AS 'modified_date',
COALESCE(at.wo_count, 0) AS workorders
FROM mod_inventory_data AS `inventory`
LEFT JOIN mod_inventory_equip_types AS `equipTypes`
ON `equipTypes`.equip_type_id = `inventory`.equip_type_id
LEFT JOIN mod_vendors_main AS `vendors`
ON `vendors`.vendor_id = `inventory`.vendor_id
LEFT JOIN mod_inventory_status AS `status`
ON `status`.status_id = `inventory`.status_id
LEFT JOIN mod_locations_data AS `locations`
ON `locations`.location_id = `inventory`.location_id
LEFT JOIN mod_locations_rooms AS `rooms`
ON `rooms`.room_id = `inventory`.room_id
LEFT JOIN mod_users_data AS `enteredBy`
ON `enteredBy`.user_id = `inventory`.entered_by
LEFT JOIN mod_workorder_counts AS at
ON at.asset_tag = inventory.asset_tag
ORDER BY inventory_id ASC LIMIT 0,20
It executes in 0.0051 seconds. That puts the total for the two queries at 0.1631 seconds, roughly 1/10th of a second, versus 15+ seconds with the original subquery.
If I just included the field wo_count without using COALESCE, I got NULL values for any asset tags that were not listed in the mod_workorder_counts table. COALESCE gives me a 0 for any NULL value, which is what I want.
Now I will set it up so that when a work order is entered for an asset tag, the INSERT/UPDATE query for the counts table runs at that time, so it doesn't run unnecessarily.
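One way to wire that up is a trigger on the work-order table that bumps the per-asset count on every insert, so the counts table is always current without a scheduled rebuild. Below is a minimal sketch of the idea using SQLite via Python's sqlite3 rather than MySQL (a MySQL `AFTER INSERT` trigger would look very similar, typically using `INSERT ... ON DUPLICATE KEY UPDATE`); the trigger name and trimmed-down columns are illustrative, not from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Simplified stand-ins for the tables in the question.
cur.execute("CREATE TABLE mod_workorder_data (workorder_id INTEGER PRIMARY KEY, asset_tag TEXT)")
cur.execute("CREATE TABLE mod_workorder_counts (asset_tag TEXT PRIMARY KEY, wo_count INTEGER NOT NULL)")

# After each new work order, bump the per-asset count so the summary
# table never has to be rebuilt from scratch.
cur.execute("""
CREATE TRIGGER trg_workorder_count
AFTER INSERT ON mod_workorder_data
WHEN NEW.asset_tag IS NOT NULL
BEGIN
    INSERT OR IGNORE INTO mod_workorder_counts (asset_tag, wo_count)
        VALUES (NEW.asset_tag, 0);
    UPDATE mod_workorder_counts
        SET wo_count = wo_count + 1
        WHERE asset_tag = NEW.asset_tag;
END
""")

cur.executemany("INSERT INTO mod_workorder_data (asset_tag) VALUES (?)",
                [("A100",), ("A100",), ("A200",), (None,)])
counts = dict(cur.execute("SELECT asset_tag, wo_count FROM mod_workorder_counts"))
print(counts)  # {'A100': 2, 'A200': 1}
```

Note that if work orders can be deleted, or their asset_tag changed, matching AFTER DELETE / AFTER UPDATE triggers would be needed to keep the counts honest.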

Related

MySQL query makes DB storage full

I am having an issue with a MySQL query.
SELECT Count(*) AS aggregate
FROM (SELECT Group_concat(gateways.public_name) AS client_gateways,
`clients`.`id`,
`clients`.`name`,
`clients`.`status`,
`clients`.`api_key`,
`clients`.`user_name`,
`clients`.`psp_id`,
`clients`.`suspend`,
`clients`.`secret_key`,
`clients`.`created_at`,
`companies`.`name` AS `company_name`,
`mid_groups_mid`.`mid_id`,
`mid_groups_mid`.`mid_group_id`,
`mid_groups`.`id` AS `group_id`,
`mid_groups`.`user_id`,
`mids`.`mid_group_id` AS `id_of_mid`
FROM `clients`
LEFT JOIN `client_site_gateways`
ON `clients`.`id` = `client_site_gateways`.`client_id`
LEFT JOIN `gateways`
ON `client_site_gateways`.`gateway_id` = `gateways`.`id`
LEFT JOIN `client_broker`
ON `client_broker`.`client_id` = `clients`.`id`
LEFT JOIN `mid_groups`
ON `mid_groups`.`user_id` = `clients`.`psp_id`
LEFT JOIN `mid_groups_mid`
ON `mid_groups_mid`.`mid_group_id` = `mid_groups`.`id`
LEFT JOIN `mids`
ON `mids`.`mid_group_id` = `mid_groups_mid`.`mid_group_id`
INNER JOIN `companies`
ON `companies`.`id` = `clients`.`company_id`
WHERE `is_corp` = 0
AND `clients`.`suspend` = '0'
AND ( `clients`.`company_id` = 1 )
AND `clients`.`deleted_at` IS NULL
GROUP BY `clients`.`id`,
`clients`.`name`,
`clients`.`status`,
`clients`.`api_key`,
`clients`.`suspend`,
`clients`.`secret_key`,
`clients`.`created_at`,
`companies`.`name`,
`clients`.`user_name`,
`clients`.`psp_id`,
`mid_groups_mid`.`mid_id`,
`mid_groups_mid`.`mid_group_id`,
`mid_groups`.`id`,
`mid_groups`.`user_id`,
`mids`.`mid_group_id`) count_row_table
All tables have a few hundred records. Here is the EXPLAIN result:
+------+-------------+----------------------+--------+-------------------------------------+-------------------------------------+---------+----------------------------------------------+------------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------------------+--------+-------------------------------------+-------------------------------------+---------+----------------------------------------------+------------+-------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2849642280 | |
| 2 | DERIVED | companies | const | PRIMARY | PRIMARY | 4 | const | 1 | Using temporary; Using filesort |
| 2 | DERIVED | clients | ref | clients_company_id_foreign | clients_company_id_foreign | 4 | const | 543 | Using where |
| 2 | DERIVED | client_site_gateways | ref | client_id | client_id | 4 | knox_staging.clients.id | 5 | |
| 2 | DERIVED | gateways | eq_ref | PRIMARY | PRIMARY | 4 | knox_staging.client_site_gateways.gateway_id | 1 | Using where |
| 2 | DERIVED | client_broker | ALL | NULL | NULL | NULL | NULL | 6 | Using where; Using join buffer (flat, BNL join) |
| 2 | DERIVED | mid_groups | ref | mid_groups_user_id_foreign | mid_groups_user_id_foreign | 4 | knox_staging.clients.psp_id | 1 | Using where; Using index |
| 2 | DERIVED | mid_groups_mid | ref | mid_groups_mid_mid_group_id_foreign | mid_groups_mid_mid_group_id_foreign | 8 | knox_staging.mid_groups.id | 433 | Using where |
| 2 | DERIVED | mids | ref | mids_mid_group_id_foreign | mids_mid_group_id_foreign | 9 | knox_staging.mid_groups_mid.mid_group_id | 404 | Using where; Using index |
+------+-------------+----------------------+--------+-------------------------------------+-------------------------------------+---------+----------------------------------------------+------------+-------------------------------------------------+
In the EXPLAIN results, what is causing the estimate of 2849642280 rows while the tables have only a few hundred records? All tables have proper indexing.
What I think is filling the storage is the tmp table holding those rows. I tried scaling storage up to 60GB while the database size is only a few MBs, and all of it filled up as soon as I ran the query above. I am not sure what is causing the LEFT JOINs to produce 2849642280 rows.
The problem is probably the "aggregate." Note that 2849642280 is exactly the product of the per-table row estimates in your EXPLAIN output (1 x 543 x 5 x 1 x 6 x 1 x 433 x 404): each one-to-many LEFT JOIN multiplies the intermediate row count before the GROUP BY collapses it again, and that intermediate result is what lands in the temporary table. If the only thing you need is the count of records, you should write a new query that gets just that count.
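As a sketch of that suggestion: when only the count is needed, counting distinct client ids directly avoids materializing the huge grouped derived table, and because LEFT JOINs never remove client rows they can be dropped from the count entirely (any INNER JOINs or WHERE filters must of course stay). Illustrated here with a toy two-table schema in SQLite via Python's sqlite3; the table names mirror the question but the columns are simplified stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy schema: each client has several gateway rows, so a LEFT JOIN
# multiplies rows before the GROUP BY collapses them again.
cur.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, suspend INTEGER)")
cur.execute("CREATE TABLE client_site_gateways (client_id INTEGER, gateway_id INTEGER)")
cur.executemany("INSERT INTO clients VALUES (?, ?)", [(1, 0), (2, 0), (3, 1)])
cur.executemany("INSERT INTO client_site_gateways VALUES (?, ?)",
                [(1, 10), (1, 11), (2, 10), (2, 11), (2, 12)])

# Original pattern: materialize the grouped rows, then count them.
slow = cur.execute("""
    SELECT COUNT(*) FROM (
        SELECT clients.id
        FROM clients
        LEFT JOIN client_site_gateways ON clients.id = client_site_gateways.client_id
        WHERE clients.suspend = 0
        GROUP BY clients.id
    )
""").fetchone()[0]

# Rewrite: count distinct ids directly; no derived table to materialize.
fast = cur.execute("""
    SELECT COUNT(DISTINCT clients.id)
    FROM clients
    WHERE clients.suspend = 0
""").fetchone()[0]

print(slow, fast)  # 2 2
```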

MySQL differences between aggregate order in query vs subquery

I have 2 queries about ordering data:
Query 1:
SELECT * FROM (
SELECT idprovince, COUNT(*) total
FROM cities
JOIN persons USE INDEX (index_5) USING (idcity)
WHERE is_tutor = 'Y'
GROUP BY idprovince
) A
ORDER BY total DESC
Query 2:
SELECT idprovince, COUNT(*) total
FROM cities
JOIN persons USE INDEX (index_5) USING (idcity)
WHERE is_tutor = 'Y'
GROUP BY idprovince
ORDER BY total DESC
Query 1 returns data much faster than query 2. My question is: what is the big difference between ordering in the query directly and ordering in an outer query around the subquery?
NOTE: my DB version is mysql-5.0.96-x64. Data count is about 400k rows in persons, and 500 in cities.
UPDATE:
Output of mysql explain command:
Query 1:
mysql> EXPLAIN
-> SELECT *
-> FROM (
-> SELECT idprovince, COUNT(*) total
-> FROM cities
-> JOIN persons USE INDEX (index_5) USING (idcity)
-> WHERE is_tutor = 'Y'
-> GROUP BY idprovince
-> ) A
-> ORDER BY total DESC
-> ;
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 34 | Using filesort |
| 2 | DERIVED | persons | ref | index_5 | index_5 | 2 | | 163316 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | cities | eq_ref | PRIMARY | PRIMARY | 4 | _myproject_lesaja_2.persons.idcity | 1 | |
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
3 rows in set (1.22 sec)
Query 2:
mysql> EXPLAIN
-> SELECT idprovince, COUNT(*) total
-> FROM cities
-> JOIN persons USE INDEX (index_5) USING (idcity)
-> WHERE is_tutor = 'Y'
-> GROUP BY idprovince
-> ORDER BY total DESC;
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
| 1 | SIMPLE | cities | index | PRIMARY | FK_cities_1 | 4 | NULL | 4 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | persons | ref | index_5 | index_5 | 2 | const | 163316 | Using where |
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
2 rows in set (0.00 sec)
Result Query 1:
mysql> SELECT *
-> FROM (
-> SELECT idprovince, COUNT(*) total
-> FROM cities
-> JOIN persons USE INDEX (index_5) USING (idcity)
-> WHERE is_tutor = 'Y'
-> GROUP BY idprovince
-> ) A
-> ORDER BY total DESC
-> ;
+------------+-------+
| idprovince | total |
+------------+-------+
| 35 | 15797 |
......................
......................
......................
| 76 | 2091 |
| 65 | 2018 |
+------------+-------+
34 rows in set (0.78 sec)
Result Query 2:
mysql> SELECT idprovince, COUNT(*) total
-> FROM cities
-> JOIN persons USE INDEX (index_5) USING (idcity)
-> WHERE is_tutor = 'Y'
-> GROUP BY idprovince
-> ORDER BY total DESC;
+------------+-------+
| idprovince | total |
+------------+-------+
| 35 | 15797 |
| 33 | 14413 |
| 12 | 13683 |
......................
......................
......................
| 34 | 2135 |
| 76 | 2091 |
| 65 | 2018 |
+------------+-------+
34 rows in set (8 min 25.80 sec)
SHOW PROFILE OUTPUT:
QUERY 1:
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000240 |
| Opening tables | 0.000043 |
| System lock | 0.000004 |
| Table lock | 0.000392 |
| optimizing | 0.000084 |
| statistics | 0.004455 |
| preparing | 0.000026 |
| Creating tmp table | 0.000221 |
| executing | 0.000002 |
| Copying to tmp table | 0.913722 |
| Sorting result | 0.000065 |
| Sending data | 0.000020 |
| removing tmp table | 0.000145 |
| Sending data | 0.000008 |
| init | 0.000017 |
| optimizing | 0.000002 |
| statistics | 0.000038 |
| preparing | 0.000007 |
| executing | 0.000001 |
| Sorting result | 0.000012 |
| Sending data | 0.000337 |
| end | 0.000002 |
| end | 0.000002 |
| query end | 0.000002 |
| freeing items | 0.000020 |
| closing tables | 0.000001 |
| removing tmp table | 0.000074 |
| closing tables | 0.000003 |
| logging slow query | 0.000001 |
| cleaning up | 0.000003 |
+----------------------+----------+
QUERY 2:
+----------------------+------------+
| Status | Duration |
+----------------------+------------+
| starting | 0.000195 |
| Opening tables | 0.000029 |
| System lock | 0.000004 |
| Table lock | 0.000011 |
| init | 0.000078 |
| optimizing | 0.000021 |
| statistics | 0.003399 |
| preparing | 0.000025 |
| Creating tmp table | 0.000259 |
| Sorting for group | 0.000007 |
| executing | 0.000001 |
| Copying to tmp table | 506.711308 |
| Sorting result | 0.000049 |
| Sending data | 0.000298 |
| end | 0.000004 |
| removing tmp table | 0.000150 |
| end | 0.000002 |
| end | 0.000002 |
| query end | 0.000002 |
| freeing items | 0.000013 |
| closing tables | 0.000003 |
| logging slow query | 0.000001 |
| logging slow query | 0.000042 |
| cleaning up | 0.000003 |
+----------------------+------------+
CREATE STATEMENT
CREATE TABLE persons (
idperson INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
is_tutor ENUM('Y','N') NULL DEFAULT 'N',
name VARCHAR(64) NOT NULL,
...
idcity INT(10) UNSIGNED NOT NULL,
...
PRIMARY KEY (idperson),
UNIQUE INDEX index_3 (name) USING BTREE,
UNIQUE INDEX index_4 (email) USING BTREE,
INDEX index_5 (is_tutor),
...
CONSTRAINT FK_persons_1 FOREIGN KEY (idcity) REFERENCES cities (idcity)
)
ENGINE=InnoDB
AUTO_INCREMENT=414738;
CREATE TABLE cities (
idcity INT(10) UNSIGNED NOT NULL,
idprovince INT(10) UNSIGNED NOT NULL,
city VARCHAR(64) NOT NULL,
PRIMARY KEY (idcity),
UNIQUE INDEX index_3 (city),
INDEX FK_cities_1 (idprovince),
CONSTRAINT FK_cities_1 FOREIGN KEY (idprovince) REFERENCES provinces (idprovince)
)
ENGINE=InnoDB;
I am admittedly not an expert on this one, but looking at the MySQL documentation on ORDER BY Optimization, you have not one but two un-optimized uses of ORDER BY in your Query No. 2:
SELECT idprovince, COUNT(*) total
FROM cities
JOIN persons USE INDEX (index_5) USING (idcity)
WHERE is_tutor = 'Y'
GROUP BY idprovince
ORDER BY total DESC
First one:
The key used to fetch the rows
WHERE is_tutor = 'Y'
is not the same as the one used in the ORDER BY:
ORDER BY total DESC
Second one:
You have different ORDER BY and GROUP BY expressions.
GROUP BY idprovince
ORDER BY total DESC
In the two cases above, MySQL will not use indexes to resolve the ORDER BY, although it may still use indexes to find the rows matching the WHERE clause.
On the other hand, your Query No. 1 follows the optimized form of ORDER BY, even though the ORDER BY is applied outside the subquery.
That could be the reason Query No. 2 is far slower than Query No. 1.
Additionally, in both cases the index on idcity is virtually useless for resolving the ORDER BY, because the index covers idcity while the ORDER BY clause uses total, which is an aggregate result.
See discussion here also.
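The point that an index cannot resolve an ORDER BY on an aggregate can be seen in miniature with any engine's plan output. Here is a sketch using SQLite's EXPLAIN QUERY PLAN via Python's sqlite3 (MySQL's EXPLAIN plays the same role); the toy table is a simplified stand-in for persons, and the index name is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy version of the persons table: idcity is indexed, total is computed.
cur.execute("CREATE TABLE persons (idperson INTEGER PRIMARY KEY, idcity INTEGER)")
cur.execute("CREATE INDEX idcity_idx ON persons (idcity)")
cur.executemany("INSERT INTO persons (idcity) VALUES (?)", [(1,), (1,), (2,)])

rows = cur.execute("""
    EXPLAIN QUERY PLAN
    SELECT idcity, COUNT(*) AS total
    FROM persons
    GROUP BY idcity
    ORDER BY total DESC
""").fetchall()
plan = " ".join(r[-1] for r in rows)

# The GROUP BY can walk the index, but the ORDER BY on the aggregate
# still needs an explicit sort step (a temp B-tree in SQLite's wording,
# analogous to MySQL's filesort).
print(plan)
```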

MySQL Join Optimisation: Improving join type with derived tables and GROUP BY

I am trying to improve a query which does the following:
For every job, add up all the costs, add up the invoiced amount, and calculate a profit/loss. The costs come from several different tables, e.g. purchaseorders, users_events (engineer allocated time/time he spent on site), stock used etc.
The query also needs to output some other columns like the name of the site for the work, so that that column can be sorted by (an ORDER BY is appended after all of this).
SELECT
jobs.job_id,
jobs.start_date,
jobs.end_date,
events.time,
sites.name site,
IFNULL(stock_cost,0) stock_cost,
labour,
materials,
labour+materials+plant+expenses revenue,
(labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) profit,
((labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)))/(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) ratio
FROM
jobs
LEFT JOIN (
SELECT
job_id,
SUM(labour_charge) labour,
SUM(materials_charge) materials,
SUM(plant_hire_charge) plant,
SUM(expenses) expenses
FROM invoices
GROUP BY job_id
ORDER BY NULL
) invoices USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IF(start_onsite && end_onsite,end_onsite-start_onsite,end-start)) time,
SUM(travel+parking+materials) user_expenses
FROM users_events
WHERE type='job'
GROUP BY job_id
ORDER BY NULL
) events USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IFNULL(total,0))*0.01 orders_cost
FROM purchaseorders
GROUP BY job_id
ORDER BY NULL
) purchaseorders USING(job_id)
LEFT JOIN (
SELECT
location job_id,
SUM(amount*cost)*0.01 stock_cost
FROM stock_location
LEFT JOIN stock_items ON stock_items.id=stock_location.stock_id
WHERE location>=3000 AND amount>0 AND cost>0
GROUP BY location
ORDER BY NULL
) stock USING(job_id)
LEFT JOIN contacts_sites sites ON sites.id=jobs.site_id;
I read this: http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html but don't see how/if I can apply anything therein.
For testing purposes, I have tried adding all sorts of indices on fields left, right and centre with no improvement to the EXPLAIN output:
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| 1 | PRIMARY | jobs | ALL | NULL | NULL | NULL | NULL | 7088 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5038 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 6476 | |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 904 | |
| 1 | PRIMARY | <derived5> | ALL | NULL | NULL | NULL | NULL | 531 | |
| 1 | PRIMARY | sites | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.jobs.site_id | 1 | |
| 5 | DERIVED | stock_location | ALL | stock,location,amount,…| NULL | NULL | NULL | 5426 | Using where; Using temporary; |
| 5 | DERIVED | stock_items | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.stock_location.stock_id | 1 | Using where |
| 4 | DERIVED | purchaseorders | ALL | NULL | NULL | NULL | NULL | 1445 | Using temporary; |
| 3 | DERIVED | users_events | ALL | type,type_job | NULL | NULL | NULL | 11295 | Using where; Using temporary; |
| 2 | DERIVED | invoices | ALL | NULL | NULL | NULL | NULL | 5320 | Using temporary; |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
The product of the rows estimates is 5 x 10^21 (down from 3 x 10^42 before I started optimising this query!)
It currently takes seven seconds to execute (down from 26) but I would like that to be under one second.
By the way: GROUP BY x ORDER BY NULL is a great way to eliminate unnecessary filesorts from subqueries! (from http://www.mysqlperformanceblog.com/2006/09/04/group_concat-useful-group-by-extension/)
Based on your comment to my question, I would do the following...
At the very top...
SELECT STRAIGHT_JOIN (just add the "STRAIGHT_JOIN" keyword)
Then, for each of your subqueries for invoices, events, purchase orders, etc., change the ORDER BY to the JOB_ID explicitly, so it might help the optimization against the primary JOBS table join.
Finally, ensure each of your subquery tables HAS an index on the Job_ID (Invoices, User_events, PurchaseOrders, Stock_Location)
Additionally, for the Stock_Location table, you might want to help the WHERE clause of your subquery by having a compound index on
(job_id, location, amount). Three fields deep should be enough, even though you have the key plus three WHERE-condition elements.
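To verify that a compound index like that is actually picked up, check the plan before relying on it. A sketch using SQLite via Python's sqlite3 (in MySQL you would check EXPLAIN instead); the index name and trimmed-down columns are illustrative, not from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE stock_location (stock_id INTEGER, location INTEGER, amount INTEGER)")
# Compound index covering the filtered and aggregated columns of the subquery.
cur.execute("CREATE INDEX idx_loc_amount ON stock_location (location, amount)")

cur.executemany("INSERT INTO stock_location VALUES (?, ?, ?)",
                [(1, 3001, 5), (2, 2999, 3), (3, 3002, 0)])

plan = cur.execute("""
    EXPLAIN QUERY PLAN
    SELECT location, SUM(amount)
    FROM stock_location
    WHERE location >= 3000 AND amount > 0
    GROUP BY location
""").fetchall()

# The plan should show an index search rather than a full table scan.
print(plan)
```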

slow mysql count because of subselect

How can I make this SELECT statement faster?
The first join with the subselect is making it slower...
mysql> SELECT COUNT(DISTINCT w1.id) AS AMOUNT FROM tblWerbemittel w1
JOIN tblVorgang v1 ON w1.object_group = v1.werbemittel_id
INNER JOIN ( SELECT wmax.object_group, MAX( wmax.object_revision ) wmaxobjrev FROM tblWerbemittel wmax GROUP BY wmax.object_group ) AS wmaxselect ON w1.object_group = wmaxselect.object_group AND w1.object_revision = wmaxselect.wmaxobjrev
LEFT JOIN ( SELECT vmax.object_group, MAX( vmax.object_revision ) vmaxobjrev FROM tblVorgang vmax GROUP BY vmax.object_group ) AS vmaxselect ON v1.object_group = vmaxselect.object_group AND v1.object_revision = vmaxselect.vmaxobjrev
LEFT JOIN tblWerbemittel_has_tblAngebot wha ON wha.werbemittel_id = w1.object_group
LEFT JOIN tblAngebot ta ON ta.id = wha.angebot_id
LEFT JOIN tblLieferanten tl ON tl.id = ta.lieferant_id AND wha.zuschlag = (SELECT MAX(zuschlag) FROM tblWerbemittel_has_tblAngebot WHERE werbemittel_id = w1.object_group)
WHERE w1.flags =0 AND v1.flags=0;
+--------+
| AMOUNT |
+--------+
| 1982 |
+--------+
1 row in set (1.30 sec)
Some indexes have already been set and, as EXPLAIN shows, they were used.
+----+--------------------+-------------------------------+--------+----------------------------------------+----------------------+---------+-----------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------------------------+--------+----------------------------------------+----------------------+---------+-----------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2072 | |
| 1 | PRIMARY | v1 | ref | werbemittel_group,werbemittel_id_index | werbemittel_group | 4 | wmaxselect.object_group | 2 | Using where |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 3376 | |
| 1 | PRIMARY | w1 | eq_ref | object_revision,or_og_index | object_revision | 8 | wmaxselect.wmaxobjrev,wmaxselect.object_group | 1 | Using where |
| 1 | PRIMARY | wha | ref | PRIMARY,werbemittel_id_index | werbemittel_id_index | 4 | dpd.w1.object_group | 1 | |
| 1 | PRIMARY | ta | eq_ref | PRIMARY | PRIMARY | 4 | dpd.wha.angebot_id | 1 | |
| 1 | PRIMARY | tl | eq_ref | PRIMARY | PRIMARY | 4 | dpd.ta.lieferant_id | 1 | Using index |
| 4 | DEPENDENT SUBQUERY | tblWerbemittel_has_tblAngebot | ref | PRIMARY,werbemittel_id_index | werbemittel_id_index | 4 | dpd.w1.object_group | 1 | |
| 3 | DERIVED | vmax | index | NULL | object_revision_uq | 8 | NULL | 4668 | Using index; Using temporary; Using filesort |
| 2 | DERIVED | wmax | range | NULL | or_og_index | 4 | NULL | 2168 | Using index for group-by |
+----+--------------------+-------------------------------+--------+----------------------------------------+----------------------+---------+-----------------------------------------------+------+----------------------------------------------+
10 rows in set (0.01 sec)
The main reason the statement above takes about 2 seconds seems to be the subselect, where no index can be used.
How can the statement be written to be even faster?
Thanks for the help. MT
Do you have the following indexes?
for tblWerbemittel - object_group, object_revision
for tblVorgang - object_group, object_revision
for tblWerbemittel_has_tblAngebot - werbemittel_id, zuschlag
Let me know if that helps; there are a few more that I can see might help, but try those first.
EDIT
Can you try these two queries and see if they run fast?
SELECT w1.id AS AMOUNT
FROM tblWerbemittel w1 INNER JOIN
(SELECT wmax.object_group,
MAX( wmax.object_revision ) AS wmaxobjrev
FROM tblWerbemittel AS wmax
GROUP BY wmax.object_group ) AS wmaxselect ON w1.object_group = wmaxselect.object_group AND
w1.object_revision = wmaxselect.wmaxobjrev
WHERE w1.flags = 0
SELECT v1.werbemittel_id
FROM tblVorgang v1 LEFT JOIN
(SELECT vmax.object_group,
MAX( vmax.object_revision ) AS vmaxobjrev
FROM tblVorgang AS vmax
GROUP BY vmax.object_group ) AS vmaxselect ON v1.object_group = vmaxselect.object_group AND
v1.object_revision = vmaxselect.vmaxobjrev
WHERE v1.flags = 0
While I concede I don't have sufficient data to provide a 100% correct answer, I can throw in a handful of tips.
First of all, MySQL's optimizer is not very clever. Bear that in mind and always rearrange your queries so that the most data is excluded at the beginning. For instance, if the last join reduces the number of results from 10k to 2k while the others don't reduce it much, try swapping their positions so that each subsequent join operates on the smallest possible subset of data.
The same applies to the WHERE clause.
Also, joins sometimes turn out slower than subqueries. I don't know if that's a rule or just something I'm observing in my case, but you can always try substituting a join or two with a subquery.
While I suppose this doesn't really answer your question, I hope it at least gives you an idea about where to start looking for optimisations.

indexes and speeding up 'derived' queries

I've recently noticed that a query I have is running quite slowly, at almost 1 second per query.
The query looks like this
SELECT eventdate.id,
eventdate.eid,
eventdate.date,
eventdate.time,
eventdate.title,
eventdate.address,
eventdate.rank,
eventdate.city,
eventdate.state,
eventdate.name,
source.link,
type,
eventdate.img
FROM source
RIGHT OUTER JOIN
(
SELECT event.id,
event.date,
users.name,
users.rank,
users.eid,
event.address,
event.city,
event.state,
event.lat,
event.`long`,
GROUP_CONCAT(types.type SEPARATOR ' | ') AS type
FROM event FORCE INDEX (latlong_idx)
JOIN users ON event.uid = users.id
JOIN types ON users.tid=types.id
WHERE `long` BETWEEN -74.36829174058 AND -73.64365405942
AND lat BETWEEN 40.35195025942 AND 41.07658794058
AND event.date >= '2009-10-15'
GROUP BY event.id, event.date
ORDER BY event.date, users.rank DESC
LIMIT 0, 20
)eventdate
ON eventdate.uid = source.uid
AND eventdate.date = source.date;
and the explain is
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+-------+---------------------------------+
| 1 | PRIMARY | | ALL | NULL | NULL | NULL | NULL | 20 | |
| 1 | PRIMARY | source | ref | iddate_idx | iddate_idx | 7 | eventdate.id,eventdate.date | 156 | |
| 2 | DERIVED | event | ALL | latlong_idx | NULL | NULL | NULL | 19500 | Using temporary; Using filesort |
| 2 | DERIVED | types | ref | eid_idx | eid_idx | 4 | active.event.id | 10674 | Using index |
| 2 | DERIVED | users | eq_ref | id_idx | id_idx | 4 | active.types.id | 1 | Using where |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+-------+---------------------------------+
I've tried using 'force index' on latlong, but that doesn't seem to speed things up at all.
Is it the derived table that is causing the slow responses? If so, is there a way to improve the performance of this?
--------EDIT-------------
I've attempted to improve the formatting to make it more readable as well.
I ran the same query changing only the WHERE statement to:
WHERE users.id = (
SELECT users.id
FROM users
WHERE uidname = 'frankt1'
ORDER BY users.approved DESC , users.rank DESC
LIMIT 1 )
AND date >= '2009-10-15'
GROUP BY date
ORDER BY date)
That query runs in 0.006 seconds
The EXPLAIN looks like:
+----+-------------+------------+-------+---------------+---------------+---------+------------------------------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+---------------+---------+------------------------------+------+----------------+
| 1 | PRIMARY | | ALL | NULL | NULL | NULL | NULL | 42 | |
| 1 | PRIMARY | source | ref | iddate_idx | iddate_idx | 7 | eventdate.id,eventdate.date | 156 | |
| 2 | DERIVED | users | const | id_idx | id_idx | 4 | | 1 | |
| 2 | DERIVED | event | range | eiddate_idx | eiddate_idx | 7 | NULL | 24 | Using where |
| 2 | DERIVED | types | ref | eid_idx | eid_idx | 4 | active.event.bid | 3 | Using index |
| 3 | SUBQUERY | users | ALL | idname_idx | idname_idx | 767 | | 5 | Using filesort |
+----+-------------+------------+-------+---------------+---------------+---------+------------------------------+------+----------------+
The only way to clean up that mammoth SQL statement is to go back to the drawing board and carefully work through your database design and requirements. As soon as you start joining 6 tables and using an inner select you should expect incredible execution times.
As a start, ensure that all your id fields are indexed, but better still, ensure that your design is valid. I don't know where to START looking at your SQL, even after I reformatted it for you.
Note that 'using indexes' means you need to issue the correct instructions when you CREATE or ALTER the tables you are using. See, for instance, the MySQL 5.0 documentation on creating indexes.
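As a minimal illustration of that last point, here is the effect of indexing a filtered column, shown with SQLite via Python's sqlite3 (the same CREATE INDEX statement works in MySQL); the table and index names are simplified stand-ins for the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, uid INTEGER, date TEXT)")

def plan(sql):
    # Flatten EXPLAIN QUERY PLAN output into one string for inspection.
    return " ".join(row[-1] for row in cur.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT id FROM event WHERE uid = 42"

before = plan(query)   # no index on uid yet: a full table scan
cur.execute("CREATE INDEX uid_idx ON event (uid)")
after = plan(query)    # now an index search on uid_idx

print(before)
print(after)
```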