How to improve this complex query? - mysql

My query takes a very long time to run. I tried to improve its performance by adding an index on website_id in the hh_hits table.
Query:
SELECT
w.state, xx.pretty_name, xx.member_id,
SUM(ce.total_charge) as total_charge,
SUM(ce.shipping_cost) as shipping_cost,
SUM(ce.product_price * product_count) as product_price,
SUM(ce.tax) as tax,
SUM(ce.service_charge) as service_charge,
COUNT(distinct ce.order_id) as order_count,
SUM(ce.product_count) as product_count,
SUM( (select sum(addon_price*addon_count)
from wow.cart_entry_addons_archive cean
where cean.cart_entry_id=ce.cart_entry_id
and cean.addon_type='xx'
group by ce.cart_entry_id)) as giftwrap_total,
sum( (select sum(addon_price*addon_count)
from wow.cart_entry_addons_archive cean2
where cean2.cart_entry_id=ce.cart_entry_id
and cean2.addon_type='xx'
group by ce.cart_entry_id)) as addon_total,
(select sum(number) as hits
from wow.hh_hits thts
where thts.website_id=xx.website_id
and thts.start_date >= 'xxx'
and thts.start_date <= 'xx') as visits
FROM
wow.carts_archive c,
wow.cart_entries_archive ce,
eoe.websites xx
WHERE
ce.order_date >= 'xx' and
ce.order_date <= 'xx' and
ce.website_id=xx.website_id and
lower(ce.status) != 'deleted' and
ce.order_status != 'cancelled' and
ce.cart_id = c.cart_id and
(c.cc_number <> '343334' or c.cc_number is null)
GROUP BY ce.website_id
ORDER BY ce.website_id;
Explain plan:
+----+--------------------+-------+--------+------------------+----------+---------+---------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+--------+------------------+----------+---------+---------------------------+-------+----------------------------------------------+
| 1 | PRIMARY | ce | range | id1_nn,idx_726 | idx_1049 | 9 | NULL | 33 | Using where; Using filesort |
| 1 | PRIMARY | w | ref | idx_1055 | idx_1055 | 5 | wow.ce.website_id | 1 | Using where |
| 1 | PRIMARY | c | eq_ref | PRIMARY | PRIMARY | 4 | eoe.ce.cart_id | 1 | Using where |
| 4 | DEPENDENT SUBQUERY | thts | ALL | hh_n1 | NULL | NULL | NULL | 24493 | Using where |
| 3 | DEPENDENT SUBQUERY | cean2 | ref | idx_1383 | idx_1383 | 4 | wow.ce.cart_entry_id | 1 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | cean | ref | idx_1383 | idx_1383 | 4 | wow.ce.cart_entry_id | 1 | Using where; Using temporary; Using filesort |
+----+--------------------+-------+--------+------------------+----------+---------+---------------------------+-------+----------------------------------------------+
The explain plan suggests that the hh_hits table is not using any index.

On hh_hits, add this compound index:
INDEX(website_id, start_date)
For further discussion, please provide SHOW CREATE TABLE for each of the tables.
If my suggestion does not help enough, then we should talk about building a Summary Table that contains the subtotals for each website-date pair. You would augment the table each night, then run the dependent subquery against it.
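A minimal sketch of both suggestions, under stated assumptions: the index name idx_website_start, the summary table hh_hits_daily, and its column names and types are hypothetical, since the real table definitions (SHOW CREATE TABLE) were not provided. Only website_id, start_date, and number come from the question.

```sql
-- Compound index on hh_hits; the index name is an assumption.
ALTER TABLE wow.hh_hits
  ADD INDEX idx_website_start (website_id, start_date);

-- Hypothetical summary table: one subtotal row per website per day.
-- Column types are guesses and should match the real hh_hits columns.
CREATE TABLE wow.hh_hits_daily (
  website_id INT    NOT NULL,
  hit_date   DATE   NOT NULL,
  hits       BIGINT NOT NULL,
  PRIMARY KEY (website_id, hit_date)
);

-- Nightly augmentation: add yesterday's subtotals.
INSERT INTO wow.hh_hits_daily (website_id, hit_date, hits)
SELECT website_id, DATE(start_date), SUM(number)
FROM wow.hh_hits
WHERE start_date >= CURDATE() - INTERVAL 1 DAY
  AND start_date <  CURDATE()
  AND website_id IS NOT NULL
GROUP BY website_id, DATE(start_date);
```

The dependent subquery would then sum hits from hh_hits_daily over the hit_date range instead of scanning hh_hits. Note that this only gives the same answer if the report boundaries fall on whole days.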

Related

MYSQL Explain 'Using:where; Using temporary; Using filesort'

I am trying to optimize a stored procedure. I have identified some issues with it, but I don't know enough to actually correct the problems. One of the subqueries looks like this
select d.districtCode,
b.year Year,
if(bli.releaseAdjustment = 3 and bli.id not in (select billLineItem_id from AppliedDiscount),
bli.amount ,0) *-1 Refund
from Bill b
join BillLineItem bli on bli.bill_id = b.id
left join Bill_District d on d.bill_id = b.id
and bli.type = d.type
left join AppliedDiscount a on a.billLineItem_id = bli.id
where bli.releaseAdjustment in (1,2,3)
and bli.type in (4,6)
group by d.districtCode, b.year
The EXPLAIN outputs this
+----+------------------------+---------------+--------------+--------------------------------+-----------------+---------+---------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+------------------------+---------------+--------------+--------------------------------+-----------------+---------+---------------------+---------+----------------------------------------------+
| 1 | PRIMARY | bli | ALL | FKF4236A,type,releaseAdjustment| NULL | NULL | NULL | 2787322 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | b | eq_ref | PRIMARY | PRIMARY | 8 | tax.bli.bill_id | 1 | |
| 1 | PRIMARY | d | ref | bill_id,type | bill_id | 8 | tax.bli.bill_id | 1 | |
| 1 | PRIMARY | a | ref | billLineItem_idx | billLineItem_idx| 8 | tax.bli.id | 1 | Using index |
| 1 | DEPENDENT SUBQUERY |AppliedDiscount|index_subquery| billLineItem_idx | billLineItem_idx| 8 | func | 1 | Using index |
+----+------------------------+---------------+--------------+--------------------------------+-----------------+---------+---------------------+---------+----------------------------------------------+
How would you suggest I fix this? This problem, or one very similar, is found throughout this stored procedure numerous times. AppliedDiscount only consists of 3 columns, all of which are indexed already.
Edit: Removing the group by changes the first row of the explain to
| 1 | PRIMARY | bli | ALL | FKF4236A,type,releaseAdjustment,bill_id| NULL | NULL | NULL | 2613847 | Using where |
That's better and technically answers my question, but that just means that I was asking the wrong question.
The 'type' is still ALL. What can I do to improve that?

Query uses temporary table without force index

Query
SELECT SQL_NO_CACHE contacts.id,
contacts.date_modified contacts__date_modified
FROM contacts
INNER JOIN
(SELECT tst.team_set_id
FROM team_sets_teams tst
INNER JOIN team_memberships team_membershipscontacts ON (team_membershipscontacts.team_id = tst.team_id)
AND (team_membershipscontacts.user_id = '5daa2e92-c347-11e9-afc5-525400a80916')
AND (team_membershipscontacts.deleted = 0)
GROUP BY tst.team_set_id) contacts_tf ON contacts_tf.team_set_id = contacts.team_set_id
LEFT JOIN contacts_cstm contacts_cstm ON contacts_cstm.id_c = contacts.id
WHERE contacts.deleted = 0
ORDER BY contacts.date_modified DESC,
contacts.id DESC
LIMIT 21;
It takes extremely long (2 minutes on 2M records). I can't change this query, since it is system generated.
This is its explain:
+----+-------------+--------------------------+------------+--------+-------------------------------------------------------------------------------------------------------+----------------------------+---------+-------------------------------------------+---------+----------+---------------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------------+------------+--------+-------------------------------------------------------------------------------------------------------+----------------------------+---------+-------------------------------------------+---------+----------+---------------------------------------------------------------------+
| 1 | PRIMARY | contacts | NULL | ref | idx_contacts_tmst_id,idx_del_date_modified,idx_contacts_del_last,idx_cont_del_reports,idx_del_id_user | idx_del_date_modified | 2 | const | 1113718 | 100.00 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived3> | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | PRIMARY | contacts_cstm | NULL | eq_ref | PRIMARY | PRIMARY | 144 | sugarcrm.contacts.id | 1 | 100.00 | Using index |
| 3 | DERIVED | team_membershipscontacts | NULL | ref | idx_team_membership,idx_teammemb_team_user,idx_del_team_user | idx_team_membership | 145 | const | 2 | 99.36 | Using index condition; Using where; Using temporary; Using filesort |
| 3 | DERIVED | tst | NULL | ref | idx_ud_set_id,idx_ud_team_id,idx_ud_team_set_id,idx_ud_team_id_team_set_id | idx_ud_team_id_team_set_id | 144 | sugarcrm.team_membershipscontacts.team_id | 1 | 100.00 | Using index |
+----+-------------+--------------------------+------------+--------+-------------------------------------------------------------------------------------------------------+----------------------------+---------+-------------------------------------------+---------+----------+---------------------------------------------------------------------+
But when I use force index(idx_del_date_modified) (which is the same index used in explain), the query takes just 0.01s and I get slightly different explain.
+----+-------------+--------------------------+------------+--------+----------------------------------------------------------------------------+----------------------------+---------+-------------------------------------------+---------+----------+---------------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------------+------------+--------+----------------------------------------------------------------------------+----------------------------+---------+-------------------------------------------+---------+----------+---------------------------------------------------------------------+
| 1 | PRIMARY | contacts | NULL | ref | idx_del_date_modified | idx_del_date_modified | 2 | const | 1113718 | 100.00 | Using where |
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
| 1 | PRIMARY | contacts_cstm | NULL | eq_ref | PRIMARY | PRIMARY | 144 | sugarcrm.contacts.id | 1 | 100.00 | Using index |
| 2 | DERIVED | team_membershipscontacts | NULL | ref | idx_team_membership,idx_teammemb_team_user,idx_del_team_user | idx_team_membership | 145 | const | 2 | 99.36 | Using index condition; Using where; Using temporary; Using filesort |
| 2 | DERIVED | tst | NULL | ref | idx_ud_set_id,idx_ud_team_id,idx_ud_team_set_id,idx_ud_team_id_team_set_id | idx_ud_team_id_team_set_id | 144 | sugarcrm.team_membershipscontacts.team_id | 1 | 100.00 | Using index |
+----+-------------+--------------------------+------------+--------+----------------------------------------------------------------------------+----------------------------+---------+-------------------------------------------+---------+----------+---------------------------------------------------------------------+
The first query uses a temporary table and filesort, but the query with force index uses just where. Shouldn't the plans be the same? Why is the query with force index so much faster when the index used is still the same?
According to MySQL manual:
Temporary tables can be created under conditions such as these:
If there is an ORDER BY clause and a different GROUP BY clause, or if
the ORDER BY or GROUP BY contains columns from tables other than the
first table in the join queue, a temporary table is created.
DISTINCT combined with ORDER BY may require a temporary table.
If you use the SQL_SMALL_RESULT option, MySQL uses an in-memory
temporary table, unless the query also contains elements (described
later) that require on-disk storage.
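The difference most likely comes from MySQL's query optimizer.
Creating an index does not guarantee that the optimizer will use it the way you expect; it may choose a different plan even though the index exists. In the first plan above, the contacts row shows Using temporary; Using filesort, while the forced plan shows only Using where, which suggests that the forced plan reads rows in index order and lets LIMIT 21 stop early.
Using force index(..), you force MySQL to use the index instead.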
Please consider a detailed example here.
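For reference, this is a sketch of where the hint sits in the query from the question. The table, column, and index names are taken from the question; nothing else is assumed. Since the asker cannot edit the generated SQL, this only illustrates the forced variant that produced the second explain.

```sql
-- Same query as in the question, with the index hint placed
-- directly after the table name in the FROM clause.
SELECT SQL_NO_CACHE contacts.id,
       contacts.date_modified contacts__date_modified
FROM contacts FORCE INDEX (idx_del_date_modified)
INNER JOIN
  (SELECT tst.team_set_id
   FROM team_sets_teams tst
   INNER JOIN team_memberships team_membershipscontacts
           ON (team_membershipscontacts.team_id = tst.team_id)
          AND (team_membershipscontacts.user_id = '5daa2e92-c347-11e9-afc5-525400a80916')
          AND (team_membershipscontacts.deleted = 0)
   GROUP BY tst.team_set_id) contacts_tf
        ON contacts_tf.team_set_id = contacts.team_set_id
LEFT JOIN contacts_cstm contacts_cstm ON contacts_cstm.id_c = contacts.id
WHERE contacts.deleted = 0
ORDER BY contacts.date_modified DESC,
         contacts.id DESC
LIMIT 21;
```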

Sql query optimization and debug

I am trying to optimize an SQL query in a MySQL database. I tried different ways of rewriting it and adding/removing indexes, but nothing seems to decrease the load. Maybe I am missing something.
Query:
select co.country_name as state, ci.city_name as city, ci.city_id, ci.country_id,
count(l.id) as num
FROM cities ci
INNER JOIN countries co ON (ci.country_id = co.country_id)
INNER JOIN dancers l ON (l.city_id = ci.city_id AND l.closed = 0 AND l.approved = 1 )
WHERE 1 AND ci.main=1
GROUP BY ci.city_id
ORDER BY city
Duration : 2.01sec - 2.20sec
Optimized query:
select co.country_name as state, ci.city_name as city, ci.city_id, ci.country_id, count(l.id) as num from
(select ci1.city_name, ci1.city_id, ci1.country_id from cities ci1
where ci1.main=1) as ci
INNER JOIN countries co ON (ci.country_id = co.country_id)
INNER JOIN dancers l ON (l.city_id = ci.city_id AND l.closed = 0 AND l.approved = 1 ) GROUP BY ci.city_id ORDER BY city
Duration : 0.82sec - 0.90sec
But I feel that this query can be optimized even more, and I have no idea how to do it. There are 3 tables:
Table 1 : countries ( country_id, country_name)
Table 2 : cities ( city_id, city_name, main, country_id)
Table 3 : dancers ( id, country_id, city_id, closed, approved)
I am trying to get all the cities which have main=1 and, for each one, count all the profiles that are in that city, joining with countries to get the country_name.
Any ideas are welcomed, thank you.
Later edit : - first query explain
+----+-------------+-------+-------------+---------------------------------------------------------------------+-----------------+---------+------------------+-------+--------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------------+---------------------------------------------------------------------+-----------------+---------+------------------+-------+--------------------------------------------------------------------------------+
| 1 | SIMPLE | l | index_merge | city_id,closed,approved,city_id_2 | closed,approved | 1,2 | NULL | 75340 | Using intersect(closed,approved); Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ci | eq_ref | PRIMARY,state_id_2,state_id,city_name,lat,city_name_shorter,city_id | PRIMARY | 4 | db.l.city_id | 1 | Using where |
| 1 | SIMPLE | co | eq_ref | PRIMARY | PRIMARY | 4 | db.ci.country_id | 1 | Using where |
+----+-------------+-------+-------------+---------------------------------------------------------------------+-----------------+---------+------------------+-------+--------------------------------------------------------------------------------+
Second query explain :
+----+-------------+------------+------+-----------------------------------+-------------+---------+------------------+-------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+-----------------------------------+-------------+---------+------------------+-------+------------------------------------+
| 1 | PRIMARY | co | ALL | PRIMARY | NULL | NULL | NULL | 51 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | ref | <auto_key1> | <auto_key1> | 4 | db.co.country_id | 176 | Using where |
| 1 | PRIMARY | l | ref | city_id,closed,approved,city_id_2 | city_id_2 | 4 | ci.city_id | 44 | Using index condition; Using where |
| 2 | DERIVED | ci1 | ALL | NULL | NULL | NULL | NULL | 11765 | Using where |
+----+-------------+------------+------+-----------------------------------+-------------+---------+------------------+-------+------------------------------------+
Explain for the query suggested by @used_by_already:
+----+-------------+------------+-------------+-----------------------------------+-----------------+---------+------------------+-------+--------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------------+-----------------------------------+-----------------+---------+------------------+-------+--------------------------------------------------------------------------------+
| 1 | PRIMARY | co | ALL | PRIMARY | NULL | NULL | NULL | 51 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | db.co.country_id | 565 | Using where |
| 2 | DERIVED | l | index_merge | city_id,closed,approved,city_id_2 | closed,approved | 1,2 | NULL | 75341 | Using intersect(closed,approved); Using where; Using temporary; Using filesort |
| 2 | DERIVED | ci1 | eq_ref | PRIMARY,state_id_2,city_id | PRIMARY | 4 | db.l.city_id | 1 | Using where |
+----+-------------+------------+-------------+-----------------------------------+-----------------+---------+------------------+-------+--------------------------------------------------------------------------------+
I suggest you try this:
SELECT
co.country_name AS state
, ci.city_name AS city
, ci.city_id
, ci.country_id
, ci.num
FROM (
SELECT
ci1.city_id
, ci1.city_name
, ci1.country_id
, COUNT(l.id) AS num
FROM cities ci1
INNER JOIN dancers l ON l.city_id = ci1.city_id
AND l.closed = 0
AND l.approved = 1
WHERE ci1.main = 1
GROUP BY
ci1.city_id
, ci1.city_name
, ci1.country_id
) AS ci
INNER JOIN countries co ON ci.country_id = co.country_id
;
I also suggest that you provide the explain plan output for further analysis if needed. When optimizing, knowing which indexes exist and having the explain plan are essential.
Note also that MySQL permits non-standard GROUP BY syntax (where only one or some of the columns in the select list are included in the GROUP BY clause).
In recent versions of MySQL the default behaviour for GROUP BY has changed to SQL standard syntax (where all "non-aggregating" columns in the select list must be included in the GROUP BY clause). While your existing queries use the non-standard GROUP BY syntax, the query supplied here is compliant.
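The difference between the two GROUP BY forms can be sketched as follows, using the cities and dancers tables from the question. The setting involved is the ONLY_FULL_GROUP_BY SQL mode, which is part of the default sql_mode from MySQL 5.7.5 onward. As a caveat, with that mode enabled MySQL still accepts the shorter form when the ungrouped columns are functionally dependent on the grouped ones (for example, if city_id is the primary key of cities, which is an assumption here); otherwise it rejects the query.

```sql
-- Check whether ONLY_FULL_GROUP_BY is active for this session.
SELECT @@SESSION.sql_mode;

-- Non-standard form: city_name and country_id are selected but not grouped.
SELECT ci.city_id, ci.city_name, ci.country_id, COUNT(l.id) AS num
FROM cities ci
INNER JOIN dancers l ON l.city_id = ci.city_id
GROUP BY ci.city_id;

-- Standard form: every non-aggregated select column appears in GROUP BY.
SELECT ci.city_id, ci.city_name, ci.country_id, COUNT(l.id) AS num
FROM cities ci
INNER JOIN dancers l ON l.city_id = ci.city_id
GROUP BY ci.city_id, ci.city_name, ci.country_id;
```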

MySQL - Optimizing the select query

I have about 4.4K records in the stockmain table, 4.4K records in the stockdetail table, and about 1.04K records in the item table. I have the following query:
SELECT
item.model,
stockdetail.docs,
item.category,
stockdetail.item_id,
stockdetail.chasis,
stockdetail.price,
stockdetail.tax,
stockdetail.recycle,
stockdetail.auction,
stockdetail.shaken,
stockdetail.transport,
stockdetail.fee,
stockdetail.netamount,
IFNULL(SUM(QTY),0) as QTY,
item.DESCRIPTION
FROM
stockmain
INNER JOIN stockdetail
ON stockmain.STID=stockdetail.STID
INNER JOIN item
ON stockdetail.ITEM_ID = item.ITEM_ID
WHERE
stockmain.vrdate
BETWEEN '{$startDate}' AND '{$endDate}'
AND stockmain.company_id={$company_id}
GROUP BY
item.item_id, chasis
HAVING
IFNULL(SUM(QTY),0) > 0
ORDER BY
item.description, item.model
This takes about 45-48 seconds on average to load the data. How can I optimize this query to run faster?
P.S. I have tried adding indexes on stockmain.vrdate and stockmain.company_id, but that changed nothing.
Below is the EXPLAIN for the above query:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | stockdetail | ALL | | | | | 4180 | Using temporary; Using filesort |
| 1 | SIMPLE | item | eq_ref | PRIMARY | PRIMARY | 4 | kashmir.stockdetail.item_id | 1 | |
| 1 | SIMPLE | stockmain | eq_ref | PRIMARY | PRIMARY | 4 | kashmir.stockdetail.stid | 1 | Using where |

MySQL left join performance issues

I have been having issues with MySQL (version 5.5) left join performance on a number of queries. In all cases I have been able to work around the issue by restructuring the queries with unions and subselects (I saw some examples of this in the book High Performance MySQL). The problem is that this leads to very messy queries.
Below is an example of two queries that produce the exact same results. The first query is roughly two orders of magnitude slower than the second. The second query is much less readable than the first.
As far as I can tell these sorts of queries are not performing poorly because of bad indexing. In all cases when I restructure the query it runs just fine. I have also tried carefully looking at the indexes and using hints to no avail.
Has anyone else run into similar issues with MySQL? Are there any server parameters I should try tweaking? Has anyone found a cleaner way to work around this sort of issue?
Query 1
select
i.id,
sum(vp.measurement * pol.quantity_ordered) measurement_on_order
from items i
left join (vendor_products vp, purchase_order_lines pol, purchase_orders po) on
vp.item_id = i.id and
pol.vendor_product_id = vp.id and
pol.purchase_order_id = po.id and
po.received_at is null and
po.closed_at is null
group by i.id
explain:
+----+-------------+-------+--------+-------------------------------+-------------------+---------+-------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------+-------------------+---------+-------------------------------------+------+-------------+
| 1 | SIMPLE | i | index | NULL | PRIMARY | 4 | NULL | 241 | Using index |
| 1 | SIMPLE | po | ref | PRIMARY,received_at,closed_at | received_at | 9 | const | 2 | |
| 1 | SIMPLE | pol | ref | purchase_order_id | purchase_order_id | 4 | nutkernel_dev.po.id | 7 | |
| 1 | SIMPLE | vp | eq_ref | PRIMARY,item_id | PRIMARY | 4 | nutkernel_dev.pol.vendor_product_id | 1 | |
+----+-------------+-------+--------+-------------------------------+-------------------+---------+-------------------------------------+------+-------------+
Query 2
select
i.id,
sum(on_order.measurement_on_order) measurement_on_order
from (
(
select
i.id item_id,
sum(vp.measurement * pol.quantity_ordered) measurement_on_order
from purchase_orders po
join purchase_order_lines pol on pol.purchase_order_id = po.id
join vendor_products vp on pol.vendor_product_id = vp.id
join items i on vp.item_id = i.id
where
po.received_at is null and po.closed_at is null
group by i.id
)
union all
(select id, 0 from items)
) on_order
join items i on on_order.item_id = i.id
group by i.id
explain:
+------+--------------+------------+--------+-------------------------------+--------------------------------+---------+-------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+------------+--------+-------------------------------+--------------------------------+---------+-------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3793 | Using temporary; Using filesort |
| 1 | PRIMARY | i | eq_ref | PRIMARY | PRIMARY | 4 | on_order.item_id | 1 | Using index |
| 2 | DERIVED | po | ALL | PRIMARY,received_at,closed_at | NULL | NULL | NULL | 20 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | pol | ref | purchase_order_id | purchase_order_id | 4 | nutkernel_dev.po.id | 7 | |
| 2 | DERIVED | vp | eq_ref | PRIMARY,item_id | PRIMARY | 4 | nutkernel_dev.pol.vendor_product_id | 1 | |
| 2 | DERIVED | i | eq_ref | PRIMARY | PRIMARY | 4 | nutkernel_dev.vp.item_id | 1 | Using index |
| 3 | UNION | items | index | NULL | index_new_items_on_external_id | 257 | NULL | 3380 | Using index |
| NULL | UNION RESULT | <union2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+------+--------------+------------+--------+-------------------------------+--------------------------------+---------+-------------------------------------+------+----------------------------------------------+