MySQL - Indexing on inner join condition

I have 2 tables like below
inv_ps
--------
inv_fkid ps_fkid
1 2
1 4
1 5
other_table
----------
id ps_fkid amt other_data
1 2 20 xxx
2 NULL 10 xxx
3 NULL 5 xxx
4 5 6 xxx
5 4 7 xxxx
and here's the query
SELECT inv_ps.ps_fkid, ot.amt FROM invoice_ps inv_ps INNER JOIN other_table ot ON ot.ps_fkid = inv_ps.ps_fkid WHERE inv_ps.inv_fkid=1 GROUP BY inv_ps.ps_fkid
This works fine; however, when I view the EXPLAIN output:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE inv_ps ref inv_fkid,ps_fkid inv_fkid 4 const 1 Using where; Using temporary; Using filesort
1 SIMPLE ot ref ps_fkid ps_fkid 5 inv_ps.ps_fkid 3227 Using where
This is supposed to scan only 3 rows, so why is it examining 3227 rows even though I added indexes on both join columns? Is it because the column ot.ps_fkid was set to NULL?
Please explain

As far as I know, an index is used for the GROUP BY clause only if it is a covering index.
Try EXPLAIN with the following covering indexes on the tables:
ALTER TABLE other_table ADD INDEX ix1 (ps_fkid, amt);
ALTER TABLE invoice_ps ADD INDEX ix1 (inv_fkid, ps_fkid);
SELECT a.ps_fkid, b.amt
FROM (SELECT ps_fkid
FROM invoice_ps
WHERE inv_fkid = 1
GROUP BY ps_fkid
)a
INNER JOIN other_table b
ON a.ps_fkid = b.ps_fkid;
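To check whether the suggested indexes are actually picked up, the rewritten query can be prefixed with EXPLAIN. This is only a sketch and assumes the two ix1 indexes from the answer have already been created. In the output, "Using index" in the Extra column indicates that a table was read from the index alone, and the rows column is an optimizer estimate rather than the number of rows actually read.

```sql
-- Assumes the ix1 indexes from the answer above already exist
-- on invoice_ps (inv_fkid, ps_fkid) and other_table (ps_fkid, amt).
EXPLAIN
SELECT a.ps_fkid, b.amt
FROM (SELECT ps_fkid
      FROM invoice_ps
      WHERE inv_fkid = 1
      GROUP BY ps_fkid) a
INNER JOIN other_table b
        ON a.ps_fkid = b.ps_fkid;
```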

Related

MySQL query performance improvement

I have a performance problem with MySQL. How can I improve it?
The situation is as follows:
The table "backlogsap" has about 4 million entries.
Indexes have been created.
This table has FKs, and other tables have FKs to this table, so I can't create partitions.
This query needs about 140 seconds to complete:
select
idmaterial,
materialgroup,
materialgroupcategory,
name,
dispatchgroup,
idsupplier,
group_concat(distinct sellingorganizationname) as sellingorganizationnames,
group_concat(distinct idordertype) as idordertypes,
group_concat(distinct idpositiontype) as idpositiontypes,
sum(newOrUpdated and isCritical) as classA,
sum(newOrUpdated and not isCritical) as classB,
sum(processingstate <3) as classC,
(select count(innerBacklogsAp.idmaterial)
from backlogsap as innerBacklogsAp
where innerBacklogsAp.idmaterial = src.idmaterial and IsDeleted = 0) as countReplacementVehiclerRequests
from
(select
backlogsap.idmaterial as idmaterial,
backlog.processingstate as processingstate,
material.idsupplier as idsupplier,
backlogsap.sellingorganizationname as sellingorganizationname,
backlogsap.idpositiontype as idpositiontype,
backlogsap.idordertype as idordertype,
materialindistributioncenter.dispatchgroup as dispatchgroup,
material.name as name,
material.idmaterialgroup as materialgroup,
materialgroup.idmaterialgroupcategory as materialgroupcategory,
(processingstate = 0 or processingstate = 1) as newOrUpdated,
((cancellation.state is not null and cancellation.state = 0 ) or
(reminderrequest.state is not null and (reminderrequest.state = 2 or reminderrequest.state = 0))
) as isCritical
from backlogsap
join backlog using (idbacklogsap)
left join cancellation using (idcancellation)
left join reminderrequest on backlog.IdReminderRequest = reminderrequest.idreminder
left join material using (idmaterial)
left join materialindistributioncenter using (idmaterial, iddistributioncenter)
left join materialgroup using (idmaterialgroup)
where (idcancellation is null or cancellation.State not in (1)) and
backlogsap.isdeleted = 0 and
backlogsap.idordertype not in ('ZAP', 'ZAK', 'ZAKO', 'ZAKZ', 'ZAPM') and
iddistributioncenter = 1469990
) as src
group by idmaterial
order by classA desc, classB desc, classC, idmaterial desc
Explain
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived3> ALL 26960 Using temporary; Using filesort
3 DERIVED backlogsap index_merge PRIMARY,fk_BacklogSap_OrderType1_idx,fk_BacklogSap_MaterialInDistributionCenter1_idx,perform_backlogsap_isdeleted,fk_BacklogSap_DistributionCenter_idx perform_backlogsap_isdeleted,fk_BacklogSap_DistributionCenter_idx 1,4 35946 Using intersect(perform_backlogsap_isdeleted,fk_BacklogSap_DistributionCenter_idx); Using where
3 DERIVED backlog eq_ref idBacklogSAP_UNIQUE,fk_Backlog_BacklogSap1_idx,fk_Backlog_Cancellation1_idx idBacklogSAP_UNIQUE 4 ...backlogsap.IdBacklogSap 1
3 DERIVED cancellation eq_ref PRIMARY PRIMARY 4 ...backlog.IdCancellation 1 Using where
3 DERIVED reminderrequest eq_ref PRIMARY PRIMARY 4 ...backlog.IdReminderRequest 1
3 DERIVED material eq_ref PRIMARY PRIMARY 45 ...backlogsap.IdMaterial 1
3 DERIVED materialindistributioncenter eq_ref PRIMARY,unqiue_IdDistributionCenter_IdMaterial,fk_MaterialDistributionCenter_DistributionCenter1_idx,fk_MaterialDistributionCenter_Material1_idx PRIMARY 49 const,...backlogsap.IdMaterial 1
3 DERIVED materialgroup eq_ref PRIMARY PRIMARY 137 ....material.IdMaterialGroup 1
2 DEPENDENT SUBQUERY innerBacklogsAp ref perform_backlogsap_isdeleted,idx_backlogsap_IdMaterial idx_backlogsap_IdMaterial 45 func 8 Using where
Solved: created combined Index (idmaterial, IsDeleted)
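For reference, the combined index from the "Solved" note could be created roughly as follows. The index name is hypothetical; only the table and the column order (idmaterial, IsDeleted) come from the post. The dependent subquery filters on both columns, so a composite index lets it be answered from a single index lookup per idmaterial.

```sql
-- Index name is made up for illustration; columns follow the "Solved" note.
ALTER TABLE backlogsap
  ADD INDEX idx_backlogsap_idmaterial_isdeleted (idmaterial, IsDeleted);
```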

MySQL query slow on view for search

Our search on a view is very slow. We can't define an index on a view. Please help us improve this. The query below took 33.3993 sec.
SELECT
`v_cat_pro`.`product_id`, `v_cat_pro`.`msrp`
FROM
`v_prod_cat` AS `v_cat_pro`
WHERE
(product_status="1" and msrp >0 AND (search_text = 'de') )
ORDER BY
`msrp` ASC LIMIT 50
Explain query result
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cat_product index catproducts_product_id,category_product_index category_product_index 8 NULL 941343 Using index; Using temporary; Using filesort
1 SIMPLE dept eq_ref PRIMARY PRIMARY 4 newdhf.cat_product.category_id 1 Using where
1 SIMPLE team eq_ref PRIMARY PRIMARY 4 newdhf.dept.parent_id 1 Using where
1 SIMPLE league eq_ref PRIMARY PRIMARY 4 newdhf.team.parent_id 1 Using where
1 SIMPLE product eq_ref PRIMARY PRIMARY 4 newdhf.cat_product.product_id 1 Using where
CREATE
ALGORITHM=UNDEFINED VIEW v_prod_cat AS
select
dept.id AS dept_id,team.short_name AS shortteamname,team.url AS team_url,team.id AS team_id,league.id AS league_id,product.product_id AS product_id,product.product_status AS product_status,product.upload_image_l AS upload_image_l,dept.name AS department,team.name AS team,league.name AS league,product.title AS title,cat_product.product_url AS product_url,product.discount AS discount,product.discount_start_date AS discount_start_date,product.discount_end_date AS discount_end_date,product.extra_discount AS extra_discount,product.extra_discount_start_date AS extra_discount_start_date,product.extra_discount_end_date AS extra_discount_end_date,product.global_alt_tag AS global_alt_tag,product.msrp AS msrp,product.cost AS cost,product.vendor_id AS vendor_id,if((cat_product.is_default > 0),1,0) AS is_default,
concat(league.name,_utf8' ',team.name,_utf8' ',dept.name,_utf8' ',replace(replace(replace(replace(product.title,'$leaguename',league.name),'$teamname',team.name),'$shortteamname',team.short_name),'$departmentname',dept.name),' ',product.sku_code,_utf8' ',replace(replace(replace(replace(product.site_search_keyword,'$leaguename',league.name),'$teamname',team.name),'$shortteamname',team.short_name),'$departmentname',dept.name)) AS search_text
from
((((categories dept join categories team on(((team.id = dept.parent_id) and (team.category_type = _utf8'team')))) join categories league on(((league.id = team.parent_id) and (league.category_type = _utf8'league')))) join category_products cat_product on((cat_product.category_id = dept.id))) join products product on((product.product_id = cat_product.product_id))) where (dept.category_type = _utf8'department')
order by
dept.id desc;
If you add indexes on the columns you are searching on to the tables behind 'v_prod_cat' (a view itself cannot be indexed), this should help speed up your view.
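As a sketch of what indexing for this search might look like: in the view definition, product_status and msrp both come from the products table, so an index there is one option to try. The index name is hypothetical, and whether the optimizer uses it for this query would need to be confirmed with EXPLAIN. The search_text column is a CONCAT computed across several tables, so it cannot be covered by an index on any single base table.

```sql
-- Hypothetical index name; columns taken from the WHERE and ORDER BY of the slow query.
ALTER TABLE products
  ADD INDEX idx_products_status_msrp (product_status, msrp);
```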

need to make this query scalable/optimized for larger db in the future (remove full table reads)

I have been working on this for a while now and cannot seem to get it optimized. Although it works, each left-joined logs* table reads every row in the table, regardless of whether the row belongs to the set it is joined to (the user_ids). While it returns correct results as is, this will become a problem as the user base and the db as a whole grow.
Some quick background: for a given account id there can be any number of computers. Each of those computers can have any number of users linked to it. These user_ids are then referenced in the logs tables. Each of these relationships is indexed (account_id, computer_id, user_id) in the relevant tables.
I have put the left joins in subqueries to prevent a cartesian product (a previous issue which subqueries solved).
Query :
SELECT
users.username as username,
computers.computer_name as computer_name,
l1.cnt as cnt1,
l2.cnt as cnt2,
l3.cnt as cnt3,
l4.cnt as cnt4,
l5.cnt as cnt5,
l6.cnt as cnt6
FROM computers
INNER JOIN users
on users.computer_id = computers.computer_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs1
group by user_id
) AS l1
on l1.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs2
group by user_id
) AS l2
on l2.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs3
group by user_id
) AS l3
on l3.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs4
group by user_id
) AS l4
on l4.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs5
group by user_id
) AS l5
on l5.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs6
group by user_id
) AS l6
on l6.user_id = users.user_id
WHERE computers.account_id = :cw_account_id AND computers.status = :cw_status
GROUP BY users.user_id
Plan :
computers 1 PRIMARY ref PRIMARY,unique_filter,status unique_filter 4 const 5 Using where; Using temporary; Using filesort
users 1 PRIMARY ref PRIMARY,unique_filter unique_filter 4 stephen_spcplus_inno.computers.computer_id 1 Using index
<derived2> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 3
logs1 2 DERIVED index user_id user_id 8 33 Using index
<derived3> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 10
logs2 3 DERIVED index user_id user_id 8 101 Using index
<derived4> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 4
logs3 4 DERIVED index user_id user_id 8 41 Using index
<derived5> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 2
logs4 5 DERIVED index user_id user_id 8 28 Using index
<derived6> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 2
logs5 6 DERIVED index user_id user_id 8 28 Using index
<derived7> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 275
logs6 7 DERIVED index user_id user_id 775 27516 Using index
example results :
username computer_name cnt1 cnt2 cnt3 cnt4 cnt5 cnt6
testuser COMPUTER_1 1 2 1 (null) (null) 3
testuser2 COMPUTER_1 (null) (null) (null) (null) (null) (null)
someuser COMPUTER_2 32 83 26 15 28 1157
As an example, for logs6 the plan is reading every row in the table (27516), yet only 1160 'should' have been joined.
I have tried lots of different things but cannot get this to run in an optimized manner. As it stands, all the rows from each table are read because of the COUNT(*) within each join's subquery. If I remove it, only the needed rows are joined, as I want; however, I then do not know how to get the counts in the same grouped result.
Help from any gurus would be great! Yes, I know I do not have a lot of rows in the db, but I can see the results are correct and see that the full table scans are going to be a problem as well.
EDIT (partial solution) :
I have found a partial solution to this problem, but it requires an additional query to get a list of user_ids. By adding WHERE user_id IN (17,22,23) to each log table subquery, where these are the user_ids that should be joined, I get the correct results and the entire table is not scanned.
If anyone knows of a way to make this work without the additional query and the additional WHERE clause, please let me know.
I simplified your question to 2 log-tables and played around with it a bit on SQLFiddle.
=> http://sqlfiddle.com/#!2/a99e4a/2
It seems that using a sub-query makes things worse in my example data, but I wonder how it handles things when there are many more records in the tables that don't fit the criteria.
I'd suggest you give it a try and see what comes out. I don't have a MySql db to play around with here and I'd rather not bring SqlFiddle to its knees =)
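One variant that may be worth trying, shown here for two log tables only, is to replace the derived tables with correlated subqueries in the select list, so that each count is looked up per user through the user_id index. This is a sketch and not necessarily what the fiddle contains. It assumes each user belongs to exactly one computer, so no GROUP BY is needed, and it returns 0 instead of NULL for users without log rows.

```sql
SELECT
  users.username AS username,
  computers.computer_name AS computer_name,
  -- Each subquery runs once per joined user and can use the user_id index.
  (SELECT COUNT(*) FROM logs1 WHERE logs1.user_id = users.user_id) AS cnt1,
  (SELECT COUNT(*) FROM logs2 WHERE logs2.user_id = users.user_id) AS cnt2
FROM computers
INNER JOIN users
  ON users.computer_id = computers.computer_id
WHERE computers.account_id = :cw_account_id
  AND computers.status = :cw_status;
```

Whether this beats the derived-table version depends on how many users match the account; comparing the two with EXPLAIN on realistic data is the only reliable check.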

SELECT query executes for a very long time occasionally

I have a really weird problem in my MySQL InnoDB database. I have the following query:
SELECT DISTINCT p.idProject AS idProject, p.name AS name, 0 AS isConfirm
FROM Projects p
JOIN team_project tp ON (p.idProject = tp.idProject)
JOIN projtimes pt ON (p.idProject = pt.idProject)
JOIN CalledTimesTbl ctt ON (p.idProject = ctt.idProject)
LEFT JOIN NextCalls nc ON (ctt.idCustomer = nc.idCustomer
AND ctt.idProject = nc.idProject)
WHERE tp.idTeam = 158
AND p.activated = 1
AND current_date >= p.projStart
AND current_date < p.confirmStart
AND pt.invitesCount < pt.maxPerPresentation
AND (nc.idCustomer IS NULL OR nc.nextCall < now())
ORDER BY p.name
Generally the query runs fine, but sometimes, for example when I set tp.idTeam = 147, it runs really slowly (10 or 20 seconds). When I create an alternative team and adjust the relevant table values to get the same result with a different idTeam value, the query executes in a fraction of a second.
I profiled the query and noticed that when query executes slowly - there is one thing that consumes most of the time:
Copying to tmp table | 12.489197
I was a bit surprised that the query creates a tmp table, but it does so every time the query executes, including when it executes fast.
I should add that the db is well designed, with all the needed foreign keys, etc.
How can I find the source of the slow executions and eliminate it?
EDIT: EXPLAIN results:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tp ref unique_row,idTeam idTeam 4 const 56 Using temporary; Using filesort
1 SIMPLE p eq_ref PRIMARY,projStart,confirmStart PRIMARY 4 xxx.tp.idProject 1 Using where
1 SIMPLE pt ref uniq_projtimes uniq_projtimes 4 xxx.tp.idProject 1 Using where; Distinct
1 SIMPLE ctt ref idProject idProject 4 xxx.tp.idProject 3966 Using index; Distinct
1 SIMPLE nc eq_ref PRIMARY,idProject PRIMARY 8 xxx.ctt.idCustomer,xxx.tp.idProject 1 Using where; Distinct
EDIT2: Results of EXPLAIN EXTENDED first for fast query, second for the slow one.
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE tp ref unique_row,idTeam idTeam 4 const 1 100 Using temporary
1 SIMPLE p eq_ref PRIMARY,projStart,confirmStart PRIMARY 4 xxx.tp.idProject 1 100 Using where
1 SIMPLE pt ref uniq_projtimes uniq_projtimes 4 xxx.tp.idProject 1 100 Using where; Distinct
1 SIMPLE ctt ref idProject idProject 4 xxx.tp.idProject 46199 100 Using index; Distinct
1 SIMPLE nc eq_ref PRIMARY,idProject PRIMARY 8 xxx.ctt.idCustomer,xxx.tp.idProject 1 100 Using index; Distinct
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE p eq_ref PRIMARY,projStart,confirmStart PRIMARY 4 xxx.ctt.idProject 1 100 Using where
1 SIMPLE pt ref uniq_projtimes uniq_projtimes 4 xxx.ctt.idProject 1 100 Using where; Distinct
1 SIMPLE tp ref unique_row,idTeam unique_row 8 xxx.pt.idProject,const 1 100 Using where; Using index; Distinct
1 SIMPLE nc eq_ref PRIMARY,idProject PRIMARY 8 xxx.ctt.idCustomer,xxx.tp.idProject 1 100 Using index; Distinct
Try this adjusted query. (It should join fewer rows.)
SELECT DISTINCT p.idProject AS idProject, p.name AS name, 0 AS isConfirm
FROM Projects p
JOIN projtimes pt ON
p.idProject = pt.idProject
AND p.activated = 1
AND current_date >= p.projStart
AND current_date < p.confirmStart
AND pt.invitesCount < pt.maxPerPresentation
JOIN team_project tp ON
p.idProject = tp.idProject
AND tp.idTeam = 158
JOIN CalledTimesTbl ctt ON (p.idProject = ctt.idProject)
LEFT JOIN NextCalls nc ON (ctt.idCustomer = nc.idCustomer
AND ctt.idProject = nc.idProject)
WHERE (nc.idCustomer IS NULL OR nc.nextCall < now())
ORDER BY p.name

MySQL query very slow without an explicit "USE INDEX"

I've got a weird situation where my MySQL query is taking forever. I've fixed it by adding an explicit "USE INDEX" statement. My question really is why is this necessary - and what caused the MySQL optimiser to go so drastically wrong.
Here's the SQL statement:
SELECT i._id
FROM interim_table i
JOIN tablea a ON i.table_a_id = a._id
JOIN tableb b ON i.table_b_id = b._id
JOIN levels l ON a.level_id = l._id
JOIN projects p ON a.project_id = p._id
WHERE b.time_stamp > NOW() - INTERVAL 5 DAY AND a.project_id = 13
Note the time_stamp in the WHERE clause. If it's set to 5 days, then the query takes about two seconds. If I change it to 6 days, however, it takes so long that MySQL times out.
These are the "explain" results for the "5 day" interval (which takes 2 seconds):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE b range PRIMARY, time_stamp time_stamp 8 479 Using where; Using index
1 SIMPLE i ref ind_tableb_id,int_tablea_id ind_tableb_id 4 b._id 11
1 SIMPLE a eq_ref PRIMARY,level_id,project_id PRIMARY 4 i.table_a_id 1 Using where
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 a.level_id 1 Using index
These are the "explain" results for the "6 day" interval (which times out):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE a ref PRIMARY,level_id,project_id project_id 4 const 2722
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 a.level_id 1 Using index
1 SIMPLE i ref ind_tableb_id,int_tablea_id int_tablea_id 4 a._id 2
1 SIMPLE b eq_ref PRIMARY,time_stamp PRIMARY 4 i.table_b_id 1 Using where
If I put an explicit "USE INDEX" statement in there, then I get the 6 day interval also down to 2 seconds...
SELECT i._id
FROM interim_table i
JOIN tablea a ON i.table_a_id = a._id
JOIN tableb b USE INDEX (time_stamp) ON i.table_b_id = b._id
JOIN levels l ON a.level_id = l._id
JOIN projects p ON a.project_id = p._id
WHERE b.time_stamp > NOW() - INTERVAL 6 DAY AND a.project_id = 13
Then the explain result becomes:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE b range time_stamp time_stamp 8 504 Using where; Using index
1 SIMPLE i ref ind_tableb_id,ind_tableaid ind_tableb_id 4 b._id 11
1 SIMPLE a eq_ref PRIMARY,level_id,project_id PRIMARY 4 i.table_a_id 1 Using where
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 a.level_id 1 Using index
Any thoughts on why MySQL required me to tell it which index to use?
Have you tried to update statistics?
Posting it again as an answer :)
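Updating statistics in MySQL is done with ANALYZE TABLE, which refreshes the key distribution information the optimizer uses when choosing a join order. A minimal sketch for the tables involved in the plan change, assuming the anonymised table names from the question:

```sql
-- Refresh index statistics, then re-run EXPLAIN on the "6 day" query
-- without the USE INDEX hint to see whether the plan changes.
ANALYZE TABLE interim_table, tablea, tableb;
```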