Why does OR in subquery make query so much slower? - mysql

I'm using MySQL and have the following query that I was trying to improve:
SELECT
    *
FROM
    overpayments AS op
    JOIN payment_allocations AS overpayment_pa
        ON overpayment_pa.allocatable_id = op.id
        AND overpayment_pa.allocatable_type = 'Overpayment'
    JOIN (
        SELECT
            pa.payment_source_type,
            pa.payment_source_id,
            ft.conversion_rate
        FROM
            payment_allocations AS pa
            LEFT JOIN line_items AS li ON pa.payment_source_id = li.id
            LEFT JOIN credit_notes AS cn ON li.parent_document_id = cn.id
            LEFT JOIN financial_transactions AS ft ON (
                ft.commercial_document_id = pa.payment_source_id
                AND ft.commercial_document_type = pa.payment_source_type
            )
            OR (
                ft.commercial_document_id = cn.id
                AND ft.commercial_document_type = 'CreditNote'
            )
        WHERE
            pa.allocatable_type = 'Overpayment'
            AND pa.company_id = 14792
            AND ft.company_id = 14792
    ) AS op_bank_transaction_ft
        ON op_bank_transaction_ft.payment_source_id = overpayment_pa.payment_source_id
        AND op_bank_transaction_ft.payment_source_type = overpayment_pa.payment_source_type;
It takes 10s to run. I was able to improve it to 0.047s by removing the OR condition in the subquery and using COALESCE to get the result:
SELECT
    *
FROM
    overpayments AS op
    JOIN payment_allocations AS overpayment_pa
        ON overpayment_pa.allocatable_id = op.id
        AND overpayment_pa.allocatable_type = 'Overpayment'
    JOIN (
        SELECT
            pa.payment_source_type,
            pa.payment_source_id,
            coalesce(ft_one.conversion_rate, ft_two.conversion_rate)
        FROM
            payment_allocations AS pa
            LEFT JOIN line_items AS li ON pa.payment_source_id = li.id
            LEFT JOIN credit_notes AS cn ON li.parent_document_id = cn.id
            LEFT JOIN financial_transactions AS ft_one ON (
                ft_one.commercial_document_id = pa.payment_source_id
                AND ft_one.commercial_document_type = pa.payment_source_type
                AND ft_one.company_id = 14792
            )
            LEFT JOIN financial_transactions AS ft_two ON (
                ft_two.commercial_document_id = cn.id
                AND ft_two.commercial_document_type = 'CreditNote'
                AND ft_two.company_id = 14792
            )
        WHERE
            pa.allocatable_type = 'Overpayment'
            AND pa.company_id = 14792
    ) AS op_bank_transaction_ft
        ON op_bank_transaction_ft.payment_source_id = overpayment_pa.payment_source_id
        AND op_bank_transaction_ft.payment_source_type = overpayment_pa.payment_source_type;
However, I don't really understand why that worked. The original subquery ran very quickly and returned only 2 results, so why would it slow down the query so much? EXPLAIN on the first query returns the following:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | SIMPLE | pa | NULL | ref | index_payment_allocations_on_payment_source_id,index_payment_allocations_on_company_id | index_payment_allocations_on_company_id | 5 | const | 191 | 10.00 | Using where
1 | SIMPLE | overpayment_pa | NULL | ref | index_payment_allocations_on_payment_source_id,index_payment_allocations_on_allocatable_id | index_payment_allocations_on_payment_source_id | 5 | rails.pa.payment_source_id | 1 | 3.42 | Using where
1 | SIMPLE | op | NULL | eq_ref | PRIMARY | PRIMARY | 4 | rails.overpayment_pa.allocatable_id | 1 | 100.00 | NULL
1 | SIMPLE | li | NULL | eq_ref | PRIMARY | PRIMARY | 4 | rails.pa.payment_source_id | 1 | 100.00 | NULL
1 | SIMPLE | cn | NULL | eq_ref | PRIMARY | PRIMARY | 8 | rails.li.parent_document_id | 1 | 100.00 | Using where; Using index
1 | SIMPLE | ft | NULL | ALL | transactions_unique_by_commercial_doc | NULL | NULL | NULL | 12587878 | 0.00 | Range checked for each record (index map: 0x2)
And for the second I get the following:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | SIMPLE | pa | NULL | ref | index_payment_allocations_on_payment_source_id,index_payment_allocations_on_company_id | index_payment_allocations_on_company_id | 5 | const | 191 | 10.00 | Using where
1 | SIMPLE | overpayment_pa | NULL | ref | index_payment_allocations_on_payment_source_id,index_payment_allocations_on_allocatable_id | index_payment_allocations_on_payment_source_id | 5 | rails.pa.payment_source_id | 1 | 3.42 | Using where
1 | SIMPLE | op | NULL | eq_ref | PRIMARY | PRIMARY | 4 | rails.overpayment_pa.allocatable_id | 1 | 100.00 | NULL
1 | SIMPLE | ft_one | NULL | ref | transactions_unique_by_commercial_doc,index_financial_transactions_on_company_id | transactions_unique_by_commercial_doc | 773 | rails.pa.payment_source_id,rails.pa.payment_source_type | 1 | 100.00 | Using where
1 | SIMPLE | li | NULL | eq_ref | PRIMARY | PRIMARY | 4 | rails.pa.payment_source_id | 1 | 100.00 | NULL
1 | SIMPLE | cn | NULL | eq_ref | PRIMARY | PRIMARY | 8 | rails.li.parent_document_id | 1 | 100.00 | Using where; Using index
1 | SIMPLE | ft_two | NULL | ref | transactions_unique_by_commercial_doc,index_financial_transactions_on_company_id | transactions_unique_by_commercial_doc | 773 | rails.cn.id,const | 1 | 100.00 | Using where
but I don't really know how to interpret those results.

Look at the right side of the last row of your first EXPLAIN. The ft table has type ALL with an estimated 12,587,878 rows: no index was chosen up front, so MySQL had to scan millions of rows. That's slow. Your second query used indexes for every step of the query, so it was much faster.
If your second query yields correct results, use it and don't look back. Congratulations! You've optimized a query.
OR operations, especially in ON clauses, are harder than usual for the query planner module to satisfy, because they often mean it has to take the union of two separate subqueries. It looks like the planner chose to brute-force it in your case. (brute force === scanning many rows.)
Without knowing your indexes, it's hard to help you further.
Read this to learn more. https://use-the-index-luke.com
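To make the "union of two separate subqueries" idea concrete, here is a rough sketch of the derived table from the first query with the OR split into two branches, each of which can use an index lookup on financial_transactions. It only uses the table and column names from the question. It is not a drop-in replacement: UNION removes duplicate rows, which the original join would keep, while UNION ALL would keep them but return a financial_transactions row twice if it matched both branches. Check the results against your data before relying on it.

```sql
-- Sketch only: the OR in the ON clause rewritten as two index-friendly branches.
-- Branch 1: financial transaction attached directly to the payment source.
SELECT pa.payment_source_type, pa.payment_source_id, ft.conversion_rate
FROM payment_allocations AS pa
JOIN financial_transactions AS ft
  ON ft.commercial_document_id = pa.payment_source_id
 AND ft.commercial_document_type = pa.payment_source_type
WHERE pa.allocatable_type = 'Overpayment'
  AND pa.company_id = 14792
  AND ft.company_id = 14792
UNION
-- Branch 2: financial transaction attached to the credit note behind the line item.
SELECT pa.payment_source_type, pa.payment_source_id, ft.conversion_rate
FROM payment_allocations AS pa
JOIN line_items AS li ON li.id = pa.payment_source_id
JOIN credit_notes AS cn ON cn.id = li.parent_document_id
JOIN financial_transactions AS ft
  ON ft.commercial_document_id = cn.id
 AND ft.commercial_document_type = 'CreditNote'
WHERE pa.allocatable_type = 'Overpayment'
  AND pa.company_id = 14792
  AND ft.company_id = 14792;
```

The inner joins are safe here because the original WHERE ft.company_id = 14792 already discards rows where the LEFT JOIN found no financial transaction.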

These may further speed up the second formulation:
overpayment_pa: INDEX(payment_source_id, payment_source_type, allocatable_type, allocatable_id)
pa: INDEX(allocatable_type, company_id, payment_source_id, payment_source_type)
financial_transactions: INDEX(commercial_document_id, commercial_document_type, company_id, conversion_rate)
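As a sketch, those suggestions could be written as the statements below. The index names are made up for illustration, and both overpayment_pa and pa are aliases of payment_allocations, so the first two indexes go on the same table. Whether each one helps depends on your data and column types, so compare EXPLAIN before and after.

```sql
-- Index names are hypothetical; column lists are taken from the suggestions above.
ALTER TABLE payment_allocations
  ADD INDEX idx_pa_source_allocatable
    (payment_source_id, payment_source_type, allocatable_type, allocatable_id),
  ADD INDEX idx_pa_type_company_source
    (allocatable_type, company_id, payment_source_id, payment_source_type);

ALTER TABLE financial_transactions
  ADD INDEX idx_ft_doc_company_rate
    (commercial_document_id, commercial_document_type, company_id, conversion_rate);
```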

Related

MySQL query performance improvement

I have a performance problem with MySQL. How can I improve it?
The situation is as follows:
The table "backlogsap" has about 4 million entries.
Indexes have been created.
This table has foreign keys, and other tables have foreign keys to this table, so I can't create partitions.
This query needs about 140 seconds to complete:
select
    idmaterial,
    materialgroup,
    materialgroupcategory,
    name,
    dispatchgroup,
    idsupplier,
    group_concat(distinct sellingorganizationname) as sellingorganizationnames,
    group_concat(distinct idordertype) as idordertypes,
    group_concat(distinct idpositiontype) as idpositiontypes,
    sum(newOrUpdated and isCritical) as classA,
    sum(newOrUpdated and not isCritical) as classB,
    sum(processingstate < 3) as classC,
    (select count(innerBacklogsAp.idmaterial)
     from backlogsap as innerBacklogsAp
     where innerBacklogsAp.idmaterial = src.idmaterial and IsDeleted = 0) as countReplacementVehiclerRequests
from
    (select
        backlogsap.idmaterial as idmaterial,
        backlog.processingstate as processingstate,
        material.idsupplier as idsupplier,
        backlogsap.sellingorganizationname as sellingorganizationname,
        backlogsap.idpositiontype as idpositiontype,
        backlogsap.idordertype as idordertype,
        materialindistributioncenter.dispatchgroup as dispatchgroup,
        material.name as name,
        material.idmaterialgroup as materialgroup,
        materialgroup.idmaterialgroupcategory as materialgroupcategory,
        (processingstate = 0 or processingstate = 1) as newOrUpdated,
        ((cancellation.state is not null and cancellation.state = 0) or
         (reminderrequest.state is not null and (reminderrequest.state = 2 or reminderrequest.state = 0))
        ) as isCritical
    from backlogsap
        join backlog using (idbacklogsap)
        left join cancellation using (idcancellation)
        left join reminderrequest on backlog.IdReminderRequest = reminderrequest.idreminder
        left join material using (idmaterial)
        left join materialindistributioncenter using (idmaterial, iddistributioncenter)
        left join materialgroup using (idmaterialgroup)
    where (idcancellation is null or cancellation.State not in (1)) and
        backlogsap.isdeleted = 0 and
        backlogsap.idordertype not in ('ZAP', 'ZAK', 'ZAKO', 'ZAKZ', 'ZAPM') and
        iddistributioncenter = 1469990
    ) as src
group by idmaterial
order by classA desc, classB desc, classC, idmaterial desc
Explain
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 26960 | Using temporary; Using filesort
3 | DERIVED | backlogsap | index_merge | PRIMARY,fk_BacklogSap_OrderType1_idx,fk_BacklogSap_MaterialInDistributionCenter1_idx,perform_backlogsap_isdeleted,fk_BacklogSap_DistributionCenter_idx | perform_backlogsap_isdeleted,fk_BacklogSap_DistributionCenter_idx | 1,4 | NULL | 35946 | Using intersect(perform_backlogsap_isdeleted,fk_BacklogSap_DistributionCenter_idx); Using where
3 | DERIVED | backlog | eq_ref | idBacklogSAP_UNIQUE,fk_Backlog_BacklogSap1_idx,fk_Backlog_Cancellation1_idx | idBacklogSAP_UNIQUE | 4 | ...backlogsap.IdBacklogSap | 1 | NULL
3 | DERIVED | cancellation | eq_ref | PRIMARY | PRIMARY | 4 | ...backlog.IdCancellation | 1 | Using where
3 | DERIVED | reminderrequest | eq_ref | PRIMARY | PRIMARY | 4 | ...backlog.IdReminderRequest | 1 | NULL
3 | DERIVED | material | eq_ref | PRIMARY | PRIMARY | 45 | ...backlogsap.IdMaterial | 1 | NULL
3 | DERIVED | materialindistributioncenter | eq_ref | PRIMARY,unqiue_IdDistributionCenter_IdMaterial,fk_MaterialDistributionCenter_DistributionCenter1_idx,fk_MaterialDistributionCenter_Material1_idx | PRIMARY | 49 | const,...backlogsap.IdMaterial | 1 | NULL
3 | DERIVED | materialgroup | eq_ref | PRIMARY | PRIMARY | 137 | ....material.IdMaterialGroup | 1 | NULL
2 | DEPENDENT SUBQUERY | innerBacklogsAp | ref | perform_backlogsap_isdeleted,idx_backlogsap_IdMaterial | idx_backlogsap_IdMaterial | 45 | func | 8 | Using where
Solved: I created a combined index on (idmaterial, IsDeleted).
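For reference, a combined index like that could be created as sketched below. The index name is hypothetical; the table and column names come from the query above. With both columns in one index, the dependent subquery that counts rows per idmaterial with IsDeleted = 0 can likely be answered from the index alone.

```sql
-- Index name is hypothetical; it covers the correlated COUNT subquery's WHERE clause.
CREATE INDEX idx_backlogsap_idmaterial_isdeleted
    ON backlogsap (idmaterial, IsDeleted);
```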

mysql Query slow on view for search

Our search against a view is very slow, and we can't define an index on a view. How can we improve this? The query below took 33.3993 sec.
SELECT
    `v_cat_pro`.`product_id`, `v_cat_pro`.`msrp`
FROM
    `v_prod_cat` AS `v_cat_pro`
WHERE
    (product_status="1" and msrp >0 AND (search_text = 'de') )
ORDER BY
    `msrp` ASC LIMIT 50
Explain query result
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cat_product index catproducts_product_id,category_product_index category_product_index 8 NULL 941343 Using index; Using temporary; Using filesort
1 SIMPLE dept eq_ref PRIMARY PRIMARY 4 newdhf.cat_product.category_id 1 Using where
1 SIMPLE team eq_ref PRIMARY PRIMARY 4 newdhf.dept.parent_id 1 Using where
1 SIMPLE league eq_ref PRIMARY PRIMARY 4 newdhf.team.parent_id 1 Using where
1 SIMPLE product eq_ref PRIMARY PRIMARY 4 newdhf.cat_product.product_id 1 Using where
CREATE
ALGORITHM=UNDEFINED VIEW v_prod_cat AS
select
dept.id AS dept_id,team.short_name AS shortteamname,team.url AS team_url,team.id AS team_id,league.id AS league_id,product.product_id AS product_id,product.product_status AS product_status,product.upload_image_l AS upload_image_l,dept.name AS department,team.name AS team,league.name AS league,product.title AS title,cat_product.product_url AS product_url,product.discount AS discount,product.discount_start_date AS discount_start_date,product.discount_end_date AS discount_end_date,product.extra_discount AS extra_discount,product.extra_discount_start_date AS extra_discount_start_date,product.extra_discount_end_date AS extra_discount_end_date,product.global_alt_tag AS global_alt_tag,product.msrp AS msrp,product.cost AS cost,product.vendor_id AS vendor_id,if((cat_product.is_default > 0),1,0) AS is_default,
concat(league.name,_utf8' ',team.name,_utf8' ',dept.name,_utf8' ',replace(replace(replace(replace(product.title,'$leaguename',league.name),'$teamname',team.name),'$shortteamname',team.short_name),'$departmentname',dept.name),' ',product.sku_code,_utf8' ',replace(replace(replace(replace(product.site_search_keyword,'$leaguename',league.name),'$teamname',team.name),'$shortteamname',team.short_name),'$departmentname',dept.name)) AS search_text
from
((((categories dept join categories team on(((team.id = dept.parent_id) and (team.category_type = _utf8'team')))) join categories league on(((league.id = team.parent_id) and (league.category_type = _utf8'league')))) join category_products cat_product on((cat_product.category_id = dept.id))) join products product on((product.product_id = cat_product.product_id))) where (dept.category_type = _utf8'department')
order by
dept.id desc;
If you add indexes to the tables underlying 'v_prod_cat' on the columns that you are searching on, this should help speed up queries against your view.

SELECT query executes for a very long time occasionally

I have a really weird problem in my MySQL InnoDB database. I have the following query:
SELECT DISTINCT p.idProject AS idProject, p.name AS name, 0 AS isConfirm
FROM Projects p
JOIN team_project tp ON (p.idProject = tp.idProject)
JOIN projtimes pt ON (p.idProject = pt.idProject)
JOIN CalledTimesTbl ctt ON (p.idProject = ctt.idProject)
LEFT JOIN NextCalls nc ON (ctt.idCustomer = nc.idCustomer
AND ctt.idProject = nc.idProject)
WHERE tp.idTeam = 158
AND p.activated = 1
AND current_date >= p.projStart
AND current_date < p.confirmStart
AND pt.invitesCount < pt.maxPerPresentation
AND (nc.idCustomer IS NULL OR nc.nextCall < now())
ORDER BY p.name
Generally the query runs fine, but sometimes, for example when I set tp.idTeam = 147, it runs really slowly (10 or 20 seconds). When I create an alternative team and adjust the table values to produce the same result with a different idTeam value, the query executes in a fraction of a second.
I profiled the query and noticed that when it executes slowly, one step consumes most of the time:
Copying to tmp table | 12.489197
I was a bit surprised that the query creates a tmp table, but it does so every time the query executes, including when it runs fast.
I should add that the DB is well designed; all the needed foreign keys are in place.
How can I find the source of the slow executions and eliminate it?
EDIT: EXPLAIN results:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tp ref unique_row,idTeam idTeam 4 const 56 Using temporary; Using filesort
1 SIMPLE p eq_ref PRIMARY,projStart,confirmStart PRIMARY 4 xxx.tp.idProject 1 Using where
1 SIMPLE pt ref uniq_projtimes uniq_projtimes 4 xxx.tp.idProject 1 Using where; Distinct
1 SIMPLE ctt ref idProject idProject 4 xxx.tp.idProject 3966 Using index; Distinct
1 SIMPLE nc eq_ref PRIMARY,idProject PRIMARY 8 xxx.ctt.idCustomer,xxx.tp.idProject 1 Using where; Distinct
EDIT2: Results of EXPLAIN EXTENDED first for fast query, second for the slow one.
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE tp ref unique_row,idTeam idTeam 4 const 1 100 Using temporary
1 SIMPLE p eq_ref PRIMARY,projStart,confirmStart PRIMARY 4 xxx.tp.idProject 1 100 Using where
1 SIMPLE pt ref uniq_projtimes uniq_projtimes 4 xxx.tp.idProject 1 100 Using where; Distinct
1 SIMPLE ctt ref idProject idProject 4 xxx.tp.idProject 46199 100 Using index; Distinct
1 SIMPLE nc eq_ref PRIMARY,idProject PRIMARY 8 xxx.ctt.idCustomer,xxx.tp.idProject 1 100 Using index; Distinct
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE p eq_ref PRIMARY,projStart,confirmStart PRIMARY 4 xxx.ctt.idProject 1 100 Using where
1 SIMPLE pt ref uniq_projtimes uniq_projtimes 4 xxx.ctt.idProject 1 100 Using where; Distinct
1 SIMPLE tp ref unique_row,idTeam unique_row 8 xxx.pt.idProject,const 1 100 Using where; Using index; Distinct
1 SIMPLE nc eq_ref PRIMARY,idProject PRIMARY 8 xxx.ctt.idCustomer,xxx.tp.idProject 1 100 Using index; Distinct
Try this adjusted query. (It will join fewer rows.)
SELECT DISTINCT p.idProject AS idProject, p.name AS name, 0 AS isConfirm
FROM Projects p
JOIN projtimes pt ON
p.idProject = pt.idProject
AND p.activated = 1
AND current_date >= p.projStart
AND current_date < p.confirmStart
AND pt.invitesCount < pt.maxPerPresentation
JOIN team_project tp ON
p.idProject = tp.idProject
AND tp.idTeam = 158
JOIN CalledTimesTbl ctt ON (p.idProject = ctt.idProject)
LEFT JOIN NextCalls nc ON (ctt.idCustomer = nc.idCustomer
AND ctt.idProject = nc.idProject)
WHERE (nc.idCustomer IS NULL OR nc.nextCall < now())
ORDER BY p.name

Optimizing SQL request

I'm using PDO with MySQL and wrote a query to select the cheapest offers for a product in my database. It works, but it is slow: for 200 offers (of which only 25 are returned) it takes almost a second, far longer than I'm aiming for.
I'm no expert in SQL, so I'm asking for your help. Here is the query; I'll be happy to provide more info if needed:
SELECT
mo.id AS id,
mo.stock AS stock,
mo.price AS price,
mo.promotional_price AS promotional_price,
mo.picture_1 AS picture_1,
mo.picture_2 AS picture_2,
mo.picture_3 AS picture_3,
mo.picture_4 AS picture_4,
mo.picture_5 AS picture_5,
mo.title AS title,
mo.description AS description,
mo.state AS state,
mo.is_new AS is_new,
mo.is_original AS is_original,
c.name AS name,
u.id AS user_id,
u.username AS username,
u.postal_code AS postal_code,
p.name AS country_name,
ra.cache_rating_avg AS cache_rating_avg,
ra.cache_rating_nb AS cache_rating_nb,
GROUP_CONCAT(md.delivery_mode_id SEPARATOR ', ') AS delivery_mode_ids,
GROUP_CONCAT(ri.title SEPARATOR ', ') AS delivery_mode_titles
FROM
mp_offer mo, catalog_product_i18n c,
ref_country_i18n p, mp_offer_delivery_mode md,
ref_delivery_mode r,
ref_delivery_mode_i18n ri, user u
LEFT JOIN mp_user_review_rating_i18n ra
ON u.id = ra.user_id
WHERE (mo.product_id = c.id
AND mo.culture = c.culture
AND mo.user_id = u.id
AND u.country_id = p.id
AND mo.id = md.offer_id
AND md.delivery_mode_id = ri.id
AND mo.culture = ri.culture)
AND (mo.culture = 1
AND p.culture = 1)
AND mo.is_deleted = 0
AND mo.product_id = 60
AND ((u.holiday_start IS NULL)
OR (u.holiday_start = '0000-00-00')
OR (u.holiday_end IS NULL)
OR (u.holiday_end = '0000-00-00')
OR (u.holiday_start > '2012-05-03')
OR (u.holiday_end < '2012-05-03'))
AND mo.stock > 0
GROUP BY mo.id
ORDER BY IF (mo.promotional_price IS NULL,
mo.price,
LEAST(mo.price, mo.promotional_price)) ASC
LIMIT 25 OFFSET 0;
I select the offers for a particular product that have their "culture" set to 1, are not deleted, have some stock, and whose seller is not on holiday. I order by price (promotional_price when there is one).
Is LEAST a slow function?
Here is the output of EXPLAIN :
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE c const PRIMARY,catalog_product_i18n_product,catalog_product_i18n_culture PRIMARY 8 const,const 1 "Using temporary; Using filesort"
1 SIMPLE mo ref PRIMARY,culture,is_deleted,product_id,user_id culture 4 const 3 "Using where with pushed condition"
1 SIMPLE u eq_ref PRIMARY,user_country PRIMARY 4 database.mo.user_id 1 "Using where with pushed condition"
1 SIMPLE p eq_ref PRIMARY,ref_country_i18n_culture PRIMARY 8 database.u.country_id,const 1
1 SIMPLE r ALL NULL NULL NULL NULL 3 "Using join buffer"
1 SIMPLE ra ALL NULL NULL NULL NULL 4
1 SIMPLE md ref PRIMARY,fk_offer_has_delivery_mode_delivery_mode1,fk_offer_has_delivery_mode_offer1 PRIMARY 4 database.mo.id 2
1 SIMPLE ri eq_ref PRIMARY PRIMARY 2 database.md.delivery_mode_id,const 1
Thanks in advance for your help on optimizing this request.
J
You are not using the ref_delivery_mode table that you included in the FROM clause. Because it has no join condition, it produces a Cartesian product with the rest of the result.
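A possible rewrite is sketched below, under the assumption that ref_delivery_mode really is not needed. It drops that table and moves the join conditions from the WHERE clause into explicit JOINs, so an unjoined table is easier to spot. The select list is shortened here; the other columns from the question can be added back unchanged.

```sql
-- Sketch only: same tables and conditions as the question, minus ref_delivery_mode.
SELECT
    mo.id AS id,
    mo.price AS price,
    mo.promotional_price AS promotional_price,
    GROUP_CONCAT(md.delivery_mode_id SEPARATOR ', ') AS delivery_mode_ids,
    GROUP_CONCAT(ri.title SEPARATOR ', ') AS delivery_mode_titles
FROM mp_offer mo
JOIN catalog_product_i18n c ON c.id = mo.product_id AND c.culture = mo.culture
JOIN user u ON u.id = mo.user_id
JOIN ref_country_i18n p ON p.id = u.country_id AND p.culture = 1
JOIN mp_offer_delivery_mode md ON md.offer_id = mo.id
JOIN ref_delivery_mode_i18n ri ON ri.id = md.delivery_mode_id AND ri.culture = mo.culture
LEFT JOIN mp_user_review_rating_i18n ra ON ra.user_id = u.id
WHERE mo.culture = 1
  AND mo.is_deleted = 0
  AND mo.product_id = 60
  AND mo.stock > 0
  AND (u.holiday_start IS NULL
       OR u.holiday_start = '0000-00-00'
       OR u.holiday_end IS NULL
       OR u.holiday_end = '0000-00-00'
       OR u.holiday_start > '2012-05-03'
       OR u.holiday_end < '2012-05-03')
GROUP BY mo.id
ORDER BY IF(mo.promotional_price IS NULL,
            mo.price,
            LEAST(mo.price, mo.promotional_price)) ASC
LIMIT 25 OFFSET 0;
```

The EXPLAIN output in the question also shows ra being read with type ALL, so an index on mp_user_review_rating_i18n.user_id may be worth testing as well.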

MySQL query very slow without an explicit "USE INDEX"

I've got a weird situation where my MySQL query is taking forever. I've fixed it by adding an explicit "USE INDEX" statement. My question really is why is this necessary - and what caused the MySQL optimiser to go so drastically wrong.
Here's the SQL statement:
SELECT i._id
FROM interim_table i
JOIN tablea a ON i.table_a_id = a._id
JOIN tableb b ON i.table_b_id = b._id
JOIN levels l ON a.level_id = l._id
JOIN projects p ON a.project_id = p._id
WHERE b.time_stamp > NOW() - INTERVAL 5 DAY AND a.project_id = 13
Note the time_stamp in the WHERE clause. If it's set to 5 days, then the query takes about two seconds. If I change it to 6 days, however, it takes so long that MySQL times out.
These are the EXPLAIN results for the "5 day" interval (which takes 2 seconds):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE b range PRIMARY, time_stamp time_stamp 8 479 Using where; Using index
1 SIMPLE i ref ind_tableb_id,int_tablea_id ind_tableb_id 4 b._id 11
1 SIMPLE a eq_ref PRIMARY,level_id,project_id PRIMARY 4 i.table_a_id 1 Using where
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 a.level_id 1 Using index
These are the EXPLAIN results for the "6 day" interval (which times out):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE a ref PRIMARY,level_id,project_id project_id 4 const 2722
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 a.level_id 1 Using index
1 SIMPLE i ref ind_tableb_id,int_tablea_id int_tablea_id 4 a._id 2
1 SIMPLE b eq_ref PRIMARY,time_stamp PRIMARY 4 i.table_b_id 1 Using where
If I put an explicit "USE INDEX" statement in there, then I get the 6 day interval also down to 2 seconds...
SELECT i._id
FROM interim_table i
JOIN tablea a ON i.table_a_id = a._id
JOIN tableb b USE INDEX (time_stamp) ON i.table_b_id = b._id
JOIN levels l ON a.level_id = l._id
JOIN projects p ON a.project_id = p._id
WHERE b.time_stamp > NOW() - INTERVAL 6 DAY AND a.project_id = 13
Then the EXPLAIN results become:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE s range time_stamp time_stamp 8 504 Using where; Using index
1 SIMPLE i ref ind_tableb_id,ind_tableaid ind_tableb_id 4 s._id 11
1 SIMPLE v eq_ref PRIMARY,level_id,project_id PRIMARY 4 i.table_a_id 1 Using where
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 v.level_id 1 Using index
Any thoughts on why MySQL required me to tell it which index to use?
Have you tried to update statistics?
Posting it again as an answer :)
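In MySQL, updating statistics is done with ANALYZE TABLE. A minimal sketch for the tables in the question is shown below; which tables actually have stale statistics is an assumption, so it simply covers the three joined tables whose access path changed between the two plans.

```sql
-- Refresh the key distribution statistics the optimizer uses to choose a join order.
ANALYZE TABLE interim_table, tablea, tableb;
```

After running it, re-run EXPLAIN on the 6 day query without USE INDEX to see whether the optimizer now picks the time_stamp range scan on its own.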