How to optimize a query that's running too slow (LEFT JOIN)

How to optimize a query that's running too slow (LEFT JOIN) - mysql

My query is running too slow.....
any ways to speed it up?
SELECT SQL_CALC_FOUND_ROWS a.*, b.*, c.*, d.*, e.*, f1.*, st6.*
FROM tbl_a a
LEFT JOIN tbl_b b ON a.m_id = b.m_id
LEFT JOIN tbl_c c ON a.ms_id = c.ms_id
LEFT JOIN tbl_d d ON a.gd_id = d.gd_id
LEFT JOIN tbl_e e ON a.sd_id = e.sd_id
LEFT JOIN tbl_f f1 ON a.tp_id = f1.tp_id
LEFT JOIN
(
SELECT g.*, GROUP_CONCAT(f2.tp_name SEPARATOR ',') tp_name, f2.tp AS tp
FROM tbl_g g
LEFT JOIN tbl_f f2 ON g.tp_id = f2.tp_id
GROUP BY s_id
)st6 ON st6.s_id = a.s_id

Do not use LEFT unless you have a specific reason for it. I see no hint of such here.
A query like this is best optimized if it can perform the subquery first. However, 'LEFT' is a hint that it cannot be. Remove all the LEFTs and see if it is fast enough and gives the "right" answer.
Check to see that there is a PRIMARY KEY or INDEX on gd_id in both of the indicated tables in the following. (And do likewise for other JOINs.)
JOIN tbl_d d ON a.gd_id = d.gd_id
If you still need help, provide SHOW CREATE TABLE for each table. And EXPLAIN SELECT ...;.

Related

SQL query optimization for speed

So I was working on the problem of optimizing the following query I have already optimized this to the fullest from my side can this be further optimized?
select distinct name ad_type
from dim_ad_type x where exists ( select 1
from sum_adserver_dimensions sum
left join dim_ad_tag_map on dim_ad_tag_map.id=sum.ad_tag_map_id and dim_ad_tag_map.client_id=sum.client_id
left join dim_site on dim_site.id = dim_ad_tag_map.site_id
left join dim_geo on dim_geo.id = sum.geo_id
left join dim_region on dim_region.id=dim_geo.region_id
left join dim_device_category on dim_device_category.id=sum.device_category_id
left join dim_ad_unit on dim_ad_unit.id=dim_ad_tag_map.ad_unit_id
left join dim_monetization_channel on dim_monetization_channel.id=dim_ad_tag_map.monetization_channel_id
left join dim_os on dim_os.id = sum.os_id
left join dim_ad_type on dim_ad_type.id = dim_ad_tag_map.ad_type_id
left join dim_integration_type on dim_integration_type.id = dim_ad_tag_map.integration_type_id
where sum.client_id = 50
and dim_ad_type.id=x.id
)
order by 1

Your query although joined ok, is an overall bloat. You are using the dim_ad_type table on the outside, just to make sure it exists on the inside as well. You have all those left-joins that have NO bearing on the final outcome, why are they even there. I would simplify by reversing the logic. By tracing your INNER query for the same dim_ad_type table, I find the following is the direct line. sum -> dim_ad_tag_map -> dim_ad_type. Just run that.
select distinct
dat.name Ad_Type
from
sum_adserver_dimensions sum
join dim_ad_tag_map tm
on sum.ad_tag_map_id = tm.id
and sum.client_id = tm.client_id
join dim_ad_type dat
on tm.ad_type_id = dat.id
where
sum.client_id = 50
order by
1
Your query was running ALL dim_ad_types, then finding all the sums just to find those that matched. Run it direct starting with the one client, then direct with JOINs.

How can I merge these two left joins into a single one?

How can I merge these two left joins: http://sqlfiddle.com/#!9/1d2954/69/0
SELECT d.`id`, (adcount + bdcount)
FROM `docs` d
LEFT JOIN
(
SELECT da.`doc_id`, COUNT(da.`doc_id`) AS adcount FROM `docs_scod_a` da
INNER JOIN `scod_a` a ON a.`id` = da.`scod_a_id`
WHERE a.`ver_a` IN ('AA', 'AB')
GROUP BY da.`doc_id`
) ad ON ad.`doc_id` = d.`id`
LEFT JOIN
(
SELECT db.`doc_id`, COUNT(db.`doc_id`) AS bdcount FROM `docs_scod_b` db
INNER JOIN `scod_b` b ON b.`id` = db.`scod_b_id`
WHERE b.`ver_b` IN ('BA', 'BB')
GROUP BY db.`doc_id`
) bd ON bd.`doc_id` = d.`id`
to be a Single left join just to ease its use in my code, while making it no less slower?

Let me first emphasize that your method of doing the calculation is the better method. You have two separate dimensions and aggregating them separately is often the most efficient method for doing the calculation. It is also the most scalable method.
That said, your query should be equivalent to this version:
SELECT d.id,
count(distinct a.id),
count(distinct b.id)
FROM docs d left join
docs_scod_a da
ON da.doc_id = d.id LEFT JOIN
scod_a a
ON a.id = da.scod_a_id AND a.ver_a IN ('AA', 'AB') LEFT JOIN
docs_scod_b db
ON db.doc_id = d.id LEFT JOIN
scod_b b
ON b.id = db.scod_b_id AND b.ver_b IN ('BA', 'BB')
GROUP BY d.id
ORDER BY d.id;
This query is more expensive than it looks, because the COUNT(DISTINCT) incurs additional overhead compared to COUNT().
And here is the SQL Fiddle.
And, because LEFT JOIN can return NULL values, your query is more correctly written as:
SELECT d.`id`, COALESCE(adcount, 0) + COALESCE(bdcount, 0)
If you were having problems with the results, this small change might fix those problems.

Performance may be a big problem, depending on sizes of each table. It appears to be an "inflate-deflate" situation since it first "inflates" the number of rows via JOIN, then "deflates" via GROUP BY. The formulation below avoids inflation-deflation.
But first, if I understand this subquery correctly, this
SELECT da.`doc_id`, COUNT(da.`doc_id`) AS adcount
FROM `docs_scod_a` da
INNER JOIN `scod_a` a ON a.`id` = da.`scod_a_id`
WHERE a.`ver_a` IN ('AA', 'AB')
GROUP BY da.`doc_id`
can be rewritten as
SELECT `doc_id`,
( SELECT COUNT(*)
FROM `scod_a`
WHERE `id` = da.`scod_a_id`
AND `ver_a` IN ('AA', 'AB')
) AS adcount
FROM `docs_scod_a` AS da
If that is correct, then the entire query becomes
SELECT d.id,
( SELECT COUNT(*)
FROM docs_scod_a ds
JOIN scod_a s ON s.id = ds.scod_a_id
WHERE ds.doc_id = d.id
AND s.ver_a IN ('AA', 'AB')
) +
( SELECT COUNT(*)
FROM docs_scod_b ds
JOIN scod_b s ON s.id = ds.scod_b_id
WHERE ds.doc_id = d.id
AND s.ver_b IN ('BA', 'BB')
)
FROM docs AS d
Which needs these indexes:
docs_scod_a: (doc_id, scod_a_id), (scod_a_id, doc_id)
docs_scod_b: (doc_id, scod_b_id), (scod_b_id, doc_id)
scod_a: (ver_a, id)
scod_b: (ver_b, id)
docs: -- presumably has PRIMARY KEY(id)
Note the lack of GROUP BY.
docs_scod_a smells like a many-to-many mapping table. I recommend you follow the tips here.
(No COALESCE is needed since COUNT will simply return zero.)
(I don't know whether my version is better (faster or whatever) than Gordon's, nor whether my indexes will help his formulation.)

Can't do 2 joins on an MS Access query?

So I have checked all the names etc are correct, the query works with only one of the joins not both with a syntax error:
SELECT *, Invoice_Accounts.Title AS t1, Invoice_Accounts.Forename AS f1, Invoice_Accounts.Surname AS s1
FROM Invoice_Accounts
RIGHT JOIN Delivery_Points ON Invoice_Accounts.Account_No = Delivery_Points.Account_No
RIGHT JOIN Companies ON Invoice_Accounts.account_id = Companies.Company_No
This is the query I want to run but it comes up with the error:
syntax error (missing operator)
UPDATE:
using:
SELECT *,
Invoice_Accounts.Title AS t1,
Invoice_Accounts.Forename AS f1,
Invoice_Accounts.Surname AS s1
FROM (Invoice_Accounts
RIGHT JOIN Delivery_Points
ON Invoice_Accounts.Account_No = Delivery_Points.Account_No)
RIGHT JOIN Companies
ON Invoice_Accounts.account_id = Companies.Company_No
Gets me the error "JOIN expression not supported"

In Access you need to use parenthesis around joins if you have more than one:
SELECT *,
Invoice_Accounts.Title AS t1,
Invoice_Accounts.Forename AS f1,
Invoice_Accounts.Surname AS s1
FROM (Invoice_Accounts
RIGHT JOIN Delivery_Points
ON Invoice_Accounts.Account_No = Delivery_Points.Account_No)
RIGHT JOIN Companies
ON Invoice_Accounts.account_id = Companies.Company_No
Two extend this, if you had three joins, you would need another set of parentheses:
SELECT *
FROM (( A
INNER JOIN B
ON B.AID = A.AID)
INNER JOIN C
ON C.BID = B.BID)
INNER JOIN D
ON D.CID = C.CID;
EDIT
I did not know that you could not RIGHT JOIN to the same table twice, so the above is an error, having said that I have never used a RIGHT JOIN in production code in my life, I would be inclined to switch to the more widely used LEFT JOIN and change the order of the tables:
SELECT *,
ia.Title AS t1,
ia.Forename AS f1,
ia.Surname AS s1
FROM (Companies AS c
LEFT JOIN Invoice_Accounts AS ia
ON ia.account_id = c.Company_No)
LEFT JOIN Delivery_Points AS dp
ON ia.Account_No = dp.Account_No;
N.B. I have used aliases to reduce the amount of text in the query, this (in my opinion) makes them easier to read

Figured out the answer eventually:
SELECT Invoice_Accounts.Title AS t1, Invoice_Accounts.Forename AS f1, Invoice_Accounts.Surname AS s1, *
FROM (Companies
RIGHT JOIN Invoice_Accounts ON Companies.Company_No = Invoice_Accounts.Company_Name
)
RIGHT JOIN Delivery_Points ON Invoice_Accounts.Account_No = Delivery_Points.Account_No;
The issue arises from ancient ms-access technologies where you can't right join to the same table more than once!

Getting data differences between two queries

I have two queries that result two result sets i need to compare both the result sets and need to display the differences between them.Hope i will get good support.Thank you.These are my queries
Query:1
SELECT distinct c.sid_ident,c.fix_ident from corept.std_sid_leg as c INNER JOIN (SELECT sid_ident, transition_ident, max(sequence_num) seq, route_type FROM corept.std_sid_leg WHERE data_supplier='J' AND airport_ident='KBOS' GROUP BY sid_ident,transition_ident) b ON c.sequence_num=b.seq and c.sid_ident = b.sid_ident and c.transition_ident =b.transition_ident WHERE c.data_supplier='J' and c.airport_ident='KBOS';
Query:2
SELECT name,trans FROM skyplan_deploy.deploy_sids ON d.name=c.sid_ident WHERE apt = 'KBOS' AND name != trans;
Comparison is to be done on fields sid_ident in corept.std_sid_leg and name in skplan_deplay.deploy_sids. As Mysql does not support full outer join,I thought of using left join and right join and combine both the results.But i stuck up with this.Please help.I am getting syntax error while using left and right join.Thank you.

The following query should simulate a FULL OUTER JOIN in MySQL.
SELECT *
FROM A
LEFT OUTER JOIN B
ON A.NAME = B.NAME
WHERE B.ID IS NULL
UNION ALL
SELECT *
FROM B
LEFT OUTER JOIN A
ON B.NAME = A.NAME
WHERE A.ID IS NULL;
Compare the results of the with an actual FULL OUTER JOIN in SQL Server and you'll see it works.

Is it possible to convert this subquery into a join?

I want to replace the subquery with a join, if possible.
SELECT `fftenant_farmer`.`person_ptr_id`, `fftenant_surveyanswer`.`text_value`
FROM `fftenant_farmer`
INNER JOIN `fftenant_person`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_person`.`id`)
LEFT OUTER JOIN `fftenant_surveyanswer`
ON fftenant_surveyanswer.surveyquestion_id = 1
AND fftenant_surveyanswer.`surveyresult_id` IN (SELECT y.`surveyresult_id` FROM `fftenant_farmer_surveyresults` y WHERE y.farmer_id = `fftenant_farmer`.`person_ptr_id`)
I tried:
SELECT `fftenant_farmer`.`person_ptr_id`, `fftenant_surveyanswer`.`text_value`#, T5.`text_value`
FROM `fftenant_farmer`
INNER JOIN `fftenant_person`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_person`.`id`)
LEFT OUTER JOIN `fftenant_farmer_surveyresults`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_farmer_surveyresults`.`farmer_id`)
LEFT OUTER JOIN `fftenant_surveyanswer`
ON (`fftenant_farmer_surveyresults`.`surveyresult_id` = `fftenant_surveyanswer`.`surveyresult_id`)
AND fftenant_surveyanswer.surveyquestion_id = 1
But that gave me one record per farmer per survey result for that farmer. I only want one record per farmer as returned by the first query.
A join may be faster on most RDBMs, but the real reason I asked this question is I just can't seem to formulate a join to replace the subquery and I want to know if it's even possible.

You could use DISTINCT or GROUP BY, as mvds and Brilliand suggest, but I think it's closer to the query's design intent if you change the last join to an inner-join, but elevating its precedence:
SELECT farmer.person_ptr_id, surveyanswer.text_value
FROM fftenant_farmer AS farmer
INNER
JOIN fftenant_person AS person
ON person.id = farmer.person_ptr_id
LEFT
OUTER
JOIN
( fftenant_farmer_surveyresults AS farmer_surveyresults
INNER
JOIN fftenant_surveyanswer AS surveyanswer
ON surveyanswer.surveyresult_id = farmer_surveyresults.surveyresult_id
AND surveyanswer.surveyquestion_id = 1
)
ON farmer_surveyresults.farmer_id = farmer.person_ptr_id
Broadly speaking, this will end up giving the same results as the DISTINCT or GROUP BY approach, but in a more principled, less ad hoc way, IMHO.

Use SELECT DISTINCT or GROUP BY to remove the duplicate entries.
Changing your attempt as little as possible:
SELECT DISTINCT `fftenant_farmer`.`person_ptr_id`, `fftenant_surveyanswer`.`text_value`#, T5.`text_value`
FROM `fftenant_farmer`
INNER JOIN `fftenant_person`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_person`.`id`)
LEFT OUTER JOIN `fftenant_farmer_surveyresults`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_farmer_surveyresults`.`farmer_id`)
LEFT OUTER JOIN `fftenant_surveyanswer`
ON (`fftenant_farmer_surveyresults`.`surveyresult_id` = `fftenant_surveyanswer`.`surveyresult_id`)
AND fftenant_surveyanswer.surveyquestion_id = 1

the real reason I asked this question is I just can't seem to formulate a join to replace the subquery and I want to know if it's even possible
Then consider a much simpler example to begin with e.g.
SELECT *
FROM T1
WHERE id IN (SELECT id FROM T2);
This is known as a semi join and if desired may be re-written using (among other possibilities) a JOIN with a SELECT clause to a) project only from the 'outer' table, and b) return only DISTINCT rows:
SELECT DISTINCT T1.*
FROM T1
JOIN T2 USING (id);

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

How to optimize a query that's running too slow (LEFT JOIN) - mysql

Related

SQL query optimization for speed

How can I merge these two left joins into a single one?

Can't do 2 joins on an MS Access query?

Getting data differences between two queries

Is it possible to convert this subquery into a join?

Categories

Resources