optimize sql select with many Joins

How can I optimize this MySQL statement?
SELECT DISTINCT p.name
FROM Something_Meta s1
JOIN Something_Meta s2 ON s1.fk_somethingId = s2.fk_somethingId
JOIN Products p ON s2.fk_productId = p.id
JOIN
(
select fk_id from Restricted where fk_foo != 233 and fk_id NOT IN
(
Select fk_id from Restricted where fk_foo = 233
)
)
r ON r.fk_id = p.id
WHERE s1.fk_somethingId = 63 AND s2.fk_somethingId <> s1.fk_somethingId
order by p.name ASC
My tables look like this:
Product (id,name )
Restricted (id,fk_id,fk_foo )
Something_Meta (id,fk_id,fk_somethingId )
fk_id is a foreign key to Product (id).
This part of the SQL statement probably needs optimization:
select fk_id from Restricted where fk_foo != 233 and fk_id NOT IN
(
Select fk_id from Restricted where fk_foo = 233
)
The whole query takes more than 1.5 seconds to run, which is far too long for a single query on a website.

You could try an index on (fk_foo, fk_id), which would cover the subquery on Restricted:
create index ix_restricted_fk_foo_fk_id on restricted (fk_foo, fk_id)

First off: the DISTINCT is a bit of a red flag. If you need to filter out duplicates, there is probably an error somewhere (either in the query or in the design) causing them.
Second, you could write your query like this:
SELECT DISTINCT p.name
FROM Something_Meta s1
JOIN Something_Meta s2 ON s1.fk_somethingId = s2.fk_somethingId
JOIN Products p ON s2.fk_productId = p.id
WHERE s1.fk_somethingId = 63 AND s2.fk_somethingId <> s1.fk_somethingId
AND EXISTS ( SELECT *
FROM Restricted r
WHERE r.fk_foo != 233
AND r.fk_id = p.id )
AND NOT EXISTS ( SELECT *
FROM Restricted r2
WHERE r2.fk_foo = 233
AND r2.fk_id = p.id )
order by p.name ASC
As I don't have a clue about the Something_Meta table, I'll simply focus on the Restricted table and suggest you put an index on fk_id and fk_foo. The index is NOT part of the query but of the table, so you have to define it up front, once.
CREATE INDEX idx_Restricted ON Restricted (fk_id, fk_foo)
Once the index is there, any query that might benefit from it will automagically use it in the background; there is no need to adapt the query for it.
Side note: since you are obviously looking for products, I find it curious that you don't 'focus' your query on Products.
SELECT p.name
FROM Products p
JOIN etc...
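To make the Restricted filter concrete, here is a small sketch that compares the NOT IN form from the question with an EXISTS / NOT EXISTS form driven from Products. It runs on an in-memory SQLite database as a stand-in for MySQL; the sample rows are invented, and the Something_Meta joins are left out on purpose.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Restricted (id INTEGER PRIMARY KEY, fk_id INTEGER, fk_foo INTEGER);
CREATE INDEX idx_Restricted ON Restricted (fk_id, fk_foo);
INSERT INTO Products VALUES (1, 'apple'), (2, 'banana'), (3, 'cherry');
-- product 1: only foo 100; product 2: foo 100 and foo 233; product 3: no rows
INSERT INTO Restricted (fk_id, fk_foo) VALUES (1, 100), (2, 100), (2, 233);
""")

# The filter as written in the question (derived table with NOT IN).
not_in_form = """
SELECT DISTINCT p.name
FROM Products p
JOIN (SELECT fk_id FROM Restricted
      WHERE fk_foo != 233
        AND fk_id NOT IN (SELECT fk_id FROM Restricted WHERE fk_foo = 233)) r
  ON r.fk_id = p.id
ORDER BY p.name
"""

# The same filter with correlated EXISTS / NOT EXISTS, starting from Products.
exists_form = """
SELECT p.name
FROM Products p
WHERE EXISTS (SELECT 1 FROM Restricted r
              WHERE r.fk_id = p.id AND r.fk_foo != 233)
  AND NOT EXISTS (SELECT 1 FROM Restricted r2
                  WHERE r2.fk_id = p.id AND r2.fk_foo = 233)
ORDER BY p.name
"""

rows_not_in = conn.execute(not_in_form).fetchall()
rows_exists = conn.execute(exists_form).fetchall()
print(rows_not_in, rows_exists)
```

Only product 1 qualifies in both forms: product 2 is excluded because it also has a row with fk_foo = 233, and product 3 has no Restricted rows at all. Whether the EXISTS form is actually faster on MySQL depends on the data and version, so compare the EXPLAIN output of both.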

Related

Order by count(*) of my second table takes long time

Let's assume I have 2 tables. One contains car manufacturers' names and their IDs; the second contains information about car models. I need to select a few manufacturers from the first table, but order them by the number of linked rows in the second table.
Currently, my query looks like this:
SELECT DISTINCT `manufacturers`.`name`,
`manufacturers`.`cars_link`,
`manufacturers`.`slug`
FROM `manufacturers`
JOIN `cars`
ON manufacturers.cars_link = cars.manufacturer
WHERE ( NOT ( `manufacturers`.`cars_link` IS NULL ) )
AND ( `cars`.`class` = 'sedan' )
ORDER BY (SELECT Count(*)
FROM `cars`
WHERE `manufacturers`.cars_link = `cars`.manufacturer) DESC
It worked fine for my scooters table, which is a few dozen MB in size. But now I need to do the same thing for the cars table, which is a few hundred MB. The problem is that the query takes a very long time; sometimes it even causes an nginx timeout. I think I have all the necessary database indexes. Is there an alternative to the query above?

Let's try using a derived table for the count instead:
select t2.name, t2.cars_link, t2.slug from (
select distinct m.name, m.cars_link, m.slug, t1.ct
from manufacturers m
join cars c on m.cars_link=c.manufacturer
left join
(select count(1) ct, c1.manufacturer from manufacturers m1
inner join cars c1 on m1.cars_link=c1.manufacturer
where coalesce(m1.cars_link, '') != '' and c1.class='sedan'
group by c1.manufacturer) as t1
on t1.manufacturer = c.manufacturer
where coalesce(m.cars_link, '') != '' and c.class='sedan') t2
order by t2.ct desc
Note that the count here covers sedans only; remove the class filter inside t1 to order by the number of all linked cars, as in the original query.
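As a rough illustration of the idea (aggregate once per manufacturer, then join and sort), here is a sketch on an in-memory SQLite database standing in for MySQL. The sample rows are invented, and the single grouped derived table is a simplified variant, not the exact query above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manufacturers (name TEXT, cars_link TEXT, slug TEXT);
CREATE TABLE cars (manufacturer TEXT, class TEXT);
INSERT INTO manufacturers VALUES
  ('Alpha', 'a', 'alpha'), ('Beta', 'b', 'beta'), ('Gamma', NULL, 'gamma');
INSERT INTO cars VALUES
  ('a', 'sedan'), ('a', 'suv'),
  ('b', 'sedan'), ('b', 'sedan'), ('b', 'suv');
""")

# One pass over cars: total cars and number of sedans per manufacturer.
# The outer query keeps manufacturers with at least one sedan and sorts
# by the total, so the count is not recomputed for every output row.
query = """
SELECT m.name, m.cars_link, m.slug
FROM manufacturers m
JOIN (SELECT manufacturer,
             COUNT(*) AS ct,
             SUM(class = 'sedan') AS sedans
      FROM cars
      GROUP BY manufacturer) t
  ON t.manufacturer = m.cars_link
WHERE t.sedans > 0
ORDER BY t.ct DESC
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Because the derived table has one row per manufacturer, DISTINCT is no longer needed, and manufacturers with a NULL cars_link drop out through the join itself. On a real MySQL table, an index on cars (manufacturer, class) would probably help the grouping, but that is an assumption to check with EXPLAIN.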

alternative syntax using join

I have this query which works fine.
I need the rows where the subcategory belongs to the company
OR
the company has access to default subcategories (c.plannerdefaults = 1) and the subcategory is a default subcategory (s.company = 0).
SELECT distinct
s.category from planner_subcat s, company c
where
(
c.id = 66
and c.plannerdefaults = 1
and s.company = 0
)
or s.company = 66
The thing is, and maybe my thinking is wrong here, I got the impression that if a query starts with
select col from table1, table2
then there is something wrong with the methodology, but in this case I could not think of an alternative using a join.
Is there one?

I would write the query this way:
SELECT s.category
FROM company c
JOIN planner_subcat s
ON c.id = s.company OR (c.plannerdefaults = 1 AND s.company = 0)
WHERE c.id = 66;
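A quick way to convince yourself that the comma form and the explicit JOIN select the same categories is to run both on a tiny data set. The sketch below uses an in-memory SQLite database as a stand-in for MySQL; the rows are invented, and DISTINCT is kept in both queries so the results are directly comparable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, plannerdefaults INTEGER);
CREATE TABLE planner_subcat (company INTEGER, category TEXT);
INSERT INTO company VALUES (66, 1), (67, 0);
INSERT INTO planner_subcat VALUES
  (0, 'default-a'), (66, 'own-a'), (67, 'other');
""")

# Old-style comma join, as in the question.
comma_form = """
SELECT DISTINCT s.category
FROM planner_subcat s, company c
WHERE (c.id = :cid AND c.plannerdefaults = 1 AND s.company = 0)
   OR s.company = :cid
ORDER BY s.category
"""

# Explicit JOIN with the matching rules in the ON clause.
join_form = """
SELECT DISTINCT s.category
FROM company c
JOIN planner_subcat s
  ON c.id = s.company OR (c.plannerdefaults = 1 AND s.company = 0)
WHERE c.id = :cid
ORDER BY s.category
"""

def categories(sql, company_id):
    """Run one of the queries for a company id and return the category names."""
    return [row[0] for row in conn.execute(sql, {"cid": company_id})]

print(categories(comma_form, 66), categories(join_form, 66))
print(categories(comma_form, 67), categories(join_form, 67))
```

Company 66 has plannerdefaults = 1, so it sees its own subcategory plus the default one; company 67 only sees its own. One difference to keep in mind: if the requested company id does not exist, the JOIN form returns nothing, while the comma form can still return rows whose s.company matches.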

I am not sure what your goal is here.
Why do you want to rewrite the query? Are you seeing performance issues?
If you are just looking for alternative syntax, here is one version using a subquery. It could be a little faster and may reduce row locks if the tables are huge (I am not sure, so you will have to test it). Also, if the relation is one planner subcategory to many companies, you don't need DISTINCT; if it is a different relation, add it back.
SELECT s.category
FROM planner_subcat AS s
WHERE s.company IN(66,0) AND (
s.company = 66 OR EXISTS (SELECT 1 FROM company AS c WHERE c.id = 66 AND c.plannerdefaults = 1 AND s.company = 0 )
)
If you simply want the query to use the newer join syntax, then try this:
SELECT DISTINCT s.category
FROM planner_subcat AS s
INNER JOIN company AS c ON c.id = 66
WHERE s.company IN(0,66) AND ( s.company = 66 OR ( c.id = 66 AND c.plannerdefaults = 1 AND s.company = 0 ) )
But I think my first query will be better in your case, since you no longer need DISTINCT. I would expect MySQL not to execute the subquery unless company = 0, because when company = 66 the condition is already satisfied and there is no reason to check further.

MySQL Query Optimization for the derived & Sub Query Combination queries

Queries like the following, which combine a derived table with a subquery, run on our server. The constraint is that the subqueries are generated by multiple modules depending on the current situation, so they cannot easily be converted into joins.
Please suggest a way to optimize the query:
SELECT COUNT(1)
AS total
FROM member tlb_m
where tlb_m.active = 1
and tlb_m.rank > 0
and tlb_m.member_id not in (5735,134,241,1055,348,272,476,43,7,804,7548,90,229,346,40895)
and tlb_m.type = 'M'
and (tlb_m.hometown_list_id in
(SELECT l2.list_id
FROM ((
SELECT t12.list_id
from list_tree_idx t12
INNER JOIN list_tree_idx t11
ON t12.list_parent_id=t11.list_id
where t11.list_parent_id='205546'
) UNION ALL (
SELECT list_id
from list_tree_idx
where list_parent_id='205546'
) ) as l2
) or tlb_m.hometown_list_id = 205546
)

I would suggest using a closure table for optimal hierarchical queries.
For example, with a closure table that has columns ANCESTOR_ID, CHILD_ID and DEPTH, your query would look like this:
SELECT COUNT(1) AS total
FROM member AS tlb_m
LEFT JOIN hometown_closure AS c ON c.child_id = tlb_m.hometown_list_id
where tlb_m.active = 1
and tlb_m.rank > 0
and tlb_m.member_id not in (5735,134,241,1055,348,272,476,43,7,804,7548,90,229,346,40895)
and tlb_m.type = 'M'
and c.ancestor_id = 205546
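The sketch below shows one possible way to build and query such a closure table. It makes several assumptions: an in-memory SQLite database stands in for MySQL, the rows are invented, the member table is reduced to the columns needed here, and the closure table is filled with a recursive CTE (available in MySQL 8.0 and later) and includes a depth-0 row for every node.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE list_tree_idx (list_id INTEGER PRIMARY KEY, list_parent_id INTEGER);
CREATE TABLE member (member_id INTEGER PRIMARY KEY,
                     hometown_list_id INTEGER, active INTEGER);
CREATE TABLE hometown_closure (ancestor_id INTEGER, child_id INTEGER, depth INTEGER,
                               PRIMARY KEY (ancestor_id, child_id));

-- 205546 -> 10 -> 20 -> 30, 205546 -> 11, and an unrelated root 99
INSERT INTO list_tree_idx VALUES
  (205546, NULL), (10, 205546), (11, 205546), (20, 10), (30, 20), (99, NULL);
INSERT INTO member VALUES
  (1, 205546, 1), (2, 10, 1), (3, 20, 1), (4, 30, 1), (5, 99, 1), (6, 11, 0);

-- Every node is its own ancestor at depth 0; each step adds one level.
INSERT INTO hometown_closure (ancestor_id, child_id, depth)
WITH RECURSIVE walk(ancestor_id, child_id, depth) AS (
  SELECT list_id, list_id, 0 FROM list_tree_idx
  UNION ALL
  SELECT w.ancestor_id, t.list_id, w.depth + 1
  FROM walk w JOIN list_tree_idx t ON t.list_parent_id = w.child_id
)
SELECT ancestor_id, child_id, depth FROM walk;
""")

closure_sql = """
SELECT COUNT(1) FROM member m
JOIN hometown_closure c ON c.child_id = m.hometown_list_id
WHERE m.active = 1 AND c.ancestor_id = 205546 AND c.depth <= :max_depth
"""
# The shape of the original query: the node itself, children, grandchildren.
original_sql = """
SELECT COUNT(1) FROM member m
WHERE m.active = 1
  AND (m.hometown_list_id IN (
         SELECT t12.list_id FROM list_tree_idx t12
         JOIN list_tree_idx t11 ON t12.list_parent_id = t11.list_id
         WHERE t11.list_parent_id = 205546
         UNION ALL
         SELECT list_id FROM list_tree_idx WHERE list_parent_id = 205546)
       OR m.hometown_list_id = 205546)
"""

two_levels = conn.execute(closure_sql, {"max_depth": 2}).fetchone()[0]
all_depths = conn.execute(closure_sql, {"max_depth": 1000}).fetchone()[0]
original = conn.execute(original_sql).fetchone()[0]
print(two_levels, all_depths, original)
```

With a depth limit of 2 the closure query counts the same members as the original nested query. Without a limit it also picks up deeper descendants, which the original two-level query does not. The closure table has to be kept in sync whenever the tree changes.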

subquery: on clause is ambiguous

SELECT *
FROM (
`lecture` AS l
)
LEFT JOIN `professor` AS p ON `p`.`professor_id` = `l`.`professor_id`
WHERE `lecture_sem` = '20141'
AND (
lecture_name LIKE '%KEYWORD%'
OR lecture_code LIKE '%KEYWORD%'
OR p.professor_name LIKE '%KEYWORD%'
)
AND (
SELECT COUNT( DISTINCT s1.yoil, s1.start_time, s1.end_time )
FROM schedule AS s1
INNER JOIN schedule AS s2 ON ( s1.lecture_id
IN (
SELECT lecture_id
FROM timeitem
WHERE timetable_id =890
)
AND s2.yoil = s1.yoil
AND (
(
s1.start_time <= s2.start_time
AND s2.end_time <= s1.end_time
) )
AND s2.lecture_id = lecture_id # <-- HERE
)
) >0
LIMIT 0 , 30
I want to use a condition like this:
s2.lecture_id = lecture_id
or,
s2.lecture_id = l.lecture_id
So I want to use a column from the parent query in the subquery, but an error occurs:
Column 'lecture_id' in on clause is ambiguous
I googled many answers about this problem ("on clause is ambiguous"); they said I should rewrite the query as a join of two queries. But I don't have a clue how to transform this query.

I believe the following does the equivalent query, but I haven't tested it.
The technique is to move the correlated subquery into the FROM clause as a derived table so that it gets run only once, producing a result for each lecture_id (hence the GROUP BY).
I also factored out the subquery for timetable, which I believe can be rewritten as a JOIN.
And I suspect the join to professor may be properly an INNER JOIN -- how can you have a lecture without a professor?
SELECT l.*, p.*
FROM lecture AS l
INNER JOIN professor AS p ON p.professor_id = l.professor_id
INNER JOIN (
SELECT s2.lecture_id, COUNT( DISTINCT s1.yoil, s1.start_time, s1.end_time ) AS count
FROM schedule AS s1
INNER JOIN schedule AS s2 ON s2.yoil = s1.yoil
AND s1.start_time <= s2.start_time AND s2.end_time <= s1.end_time
INNER JOIN timeitem AS t ON s1.lecture_id = t.lecture_id
WHERE t.timetable_id = 890
GROUP BY s2.lecture_id
) AS c ON l.lecture_id = c.lecture_id
WHERE l.lecture_sem = '20141'
AND c.count > 0
AND (
l.lecture_name LIKE '%KEYWORD%'
OR l.lecture_code LIKE '%KEYWORD%'
OR p.professor_name LIKE '%KEYWORD%'
)
LIMIT 0 , 30
Anyway, even if the query isn't perfect, it demonstrates how one would refactor it to avoid a correlated subquery.

The line you identified in your code isn't an ON clause.
Instead, I think the error is referring to the following section.
AND (
SELECT COUNT( DISTINCT s1.yoil, s1.start_time, s1.end_time )
FROM schedule AS s1
INNER JOIN schedule AS s2 ON ( s1.lecture_id
IN (
# vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
SELECT lecture_id # <---- HERE
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FROM timeitem
WHERE timetable_id =890
)
You can fix this by creating an alias for the timeitem table and qualifying the column with that alias:
IN (
SELECT ti.lecture_id
FROM timeitem as ti
WHERE ti.timetable_id =890
)
But as Bill Karwin pointed out in his answer, you have other logical issues that need to be addressed.

MySQL Update query with left join and group by

I am trying to create an update query and making little progress in getting the right syntax.
The following query is working:
SELECT t.Index1, t.Index2, COUNT( m.EventType )
FROM Table t
LEFT JOIN MEvents m ON
(m.Index1 = t.Index1 AND
m.Index2 = t.Index2 AND
(m.EventType = 'A' OR m.EventType = 'B')
)
WHERE (t.SpecialEventCount IS NULL)
GROUP BY t.Index1, t.Index2
It creates a list of triplets (Index1, Index2, EventCount).
It only does this for rows where t.SpecialEventCount is NULL. The update query I am trying to write should set SpecialEventCount to that count, i.e. COUNT(m.EventType) in the query above. This number could be 0 or any positive number (hence the left join). Index1 and Index2 together are unique in Table t, and they are used to identify events in MEvents.
How do I modify the select query to turn it into an update query? I.e. something like
UPDATE Table SET SpecialEventCount=COUNT(m.EventType).....
but I am confused what to put where and have failed with numerous different guesses.

I take it that (Index1, Index2) is a unique key on Table; otherwise I would expect the reference to t.SpecialEventCount to result in an error.
I edited the query to use a subquery, as it didn't work using GROUP BY:
UPDATE
Table AS t
LEFT JOIN (
SELECT
Index1,
Index2,
COUNT(EventType) AS NumEvents
FROM
MEvents
WHERE
EventType = 'A' OR EventType = 'B'
GROUP BY
Index1,
Index2
) AS m ON
m.Index1 = t.Index1 AND
m.Index2 = t.Index2
SET
t.SpecialEventCount = COALESCE(m.NumEvents, 0)
WHERE
t.SpecialEventCount IS NULL

Doing a left join with a subquery can materialize a large temporary table, which (at least in older MySQL versions) has no indexes. For updates, try avoiding joins and use correlated subqueries instead:
UPDATE
Table AS t
SET
t.SpecialEventCount = (
SELECT COUNT(m.EventType)
FROM MEvents m
WHERE m.EventType in ('A','B')
AND m.Index1 = t.Index1
AND m.Index2 = t.Index2
)
WHERE
t.SpecialEventCount IS NULL
Do some profiling, but this can be significantly faster in some cases.
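Here is a small sketch of the correlated-subquery update on an in-memory SQLite database standing in for MySQL. The table name main_table is a hypothetical replacement for Table (a reserved word), and the rows are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; main_table is a hypothetical name
# replacing "Table", and the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_table (Index1 INTEGER, Index2 INTEGER, SpecialEventCount INTEGER,
                         PRIMARY KEY (Index1, Index2));
CREATE TABLE MEvents (Index1 INTEGER, Index2 INTEGER, EventType TEXT);
INSERT INTO main_table VALUES (1, 1, NULL), (1, 2, NULL), (2, 1, 7);
INSERT INTO MEvents VALUES (1, 1, 'A'), (1, 1, 'B'), (1, 1, 'C'), (2, 1, 'A');
""")

# For each row that still has NULL, count its matching 'A'/'B' events.
# COUNT over an empty set is 0, so rows without events get 0, not NULL.
conn.execute("""
UPDATE main_table
SET SpecialEventCount = (
    SELECT COUNT(m.EventType)
    FROM MEvents m
    WHERE m.EventType IN ('A', 'B')
      AND m.Index1 = main_table.Index1
      AND m.Index2 = main_table.Index2
)
WHERE SpecialEventCount IS NULL
""")

result = conn.execute(
    "SELECT Index1, Index2, SpecialEventCount FROM main_table "
    "ORDER BY Index1, Index2"
).fetchall()
print(result)
```

Row (1, 1) gets 2 because the 'C' event is ignored, row (1, 2) gets 0 because it has no events, and row (2, 1) keeps its existing value of 7 because it was not NULL. On a real table, an index on MEvents (Index1, Index2, EventType) is likely what makes the per-row subquery cheap.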

My example:
update card_crowd as cardCrowd
LEFT JOIN
(
select cc.id , count(ccr.crowd_id) as num
from card_crowd cc LEFT JOIN
card_crowd_r ccr on cc.id = ccr.crowd_id
group by cc.id
) as tt
on cardCrowd.id = tt.id
set cardCrowd.join_num = tt.num;
