SQL : Join after Where Clause - mysql

Ive got a big SQL statement.
But i want to test the where clause on the firsttable (T1) and after that, make all the joins on the rows selected using the where clause.
Because actually the query is very slow, cause MySQL join all the tables to t1 and then test the where clause !
SELECT * from FIRSTTABLE T1
LEFT JOIN T2 on (....)
LEFT JOIN T3 on (....)
WHERE T1.column = '1' OR T1.column= '5'
Any ideas ?

You can do something like:
SELECT * from
(
SELECT * from FIRSTTABLE T1
WHERE T1.column = '1' OR T1.column= '5'
) as T2
LEFT JOIN T3 on (....)
LEFT JOIN T4 on (....)

This is too long for a comment.
SQL queries are compiled and optimized before they are executed. The order of clauses in a query really has nothing to do with the final execution plan. More specifically, the filtering conditions in the WHERE clause could take place before, after, or even during the JOIN processing.
You can start to learn about SQL optimization by understanding indexes. The MySQL documentation is a very reasonable place to start.

Related

Difference between "INNER JOIN table" and "INNER JOIN (SELECT table)"?

I work on a query in mysql that spend 30 sec to execute. The format is like this :
SELECT id
FROM table1 t1
INNER JOIN table2 t2
ON t1.id = t2.idt2
The INNER JOIN take 25 of 30 sec. When I write this like this :
SELECT id
FROM table1 t1
INNER JOIN (
SELECT idt2,col1,col2,col3
FROM table2
) t2
ON t1.id = t2.idt2
It take only 8 sec! Why does it work? I'm afraid of losing data.
(obviously, my query is more complex than this one, it's just an exemple)
Well you haven't shown us the EXPLAIN output
EXPLAIN SELECT id
FROM table1 t1
INNER JOIN table2 t2
ON t1.id = t2.idt2
this would definitly give us some insights of your query and table sctructures.
Based on your scenario, 1st query seems like you have issues with indexing.
What happened in your 2nd query is the optimizer is creating a temporary set from your subquery furthering filtering your data. I dont recommend doing that in MOST cases.
Purpose of subquery is to solve complex logic, not an instant solution for everything.

Which of the two SELECT statements is faster?

It seems that the second statement applies the where condition first before joining and the first one does join before applying the where condition, so the second one would be faster because it would do less joining. But is that really the case? Is there a reference which says definitely that in the first statement the where condition is executed after all the other joining operations finish?
SELECT * FROM class t1
LEFT JOIN class_students t2 ON t1.id = t2.class_id
LEFT JOIN student t3 ON t2.student_id = t3.id
WHERE t1.id = 1;
or
SELECT * FROM (SELECT * FROM class WHERE id = 1) t1
LEFT JOIN class_students t2 ON t1.id = t2.class_id
LEFT JOIN student t3 ON t2.student_id = t3.id;
Your second option has a "derived table" (a subquery in FROM or JOIN). Subqueries usually take extra effort. So, usually it is better to avoid them.
In your particular example, the Optimizer will probably start with t1 because the WHERE clause mentions it. That is, the execution will filter based on t1.id = 1, just as you suggest the second version would do.
Note my italicized words... There are exceptions to my statements; if you find a case where the second version runs faster, present it, I may be able to explain why it runs faster. (A likely example is where the subquery has GROUP BY and/or LIMIT. This is different enough from WHERE to make a difference.)

Which Query is faster if we put the "Where" inside the Join Table or put it at the end?

Ok, I am using Mysql DB. I have 2 simple tables.
Table1
ID-Text
12-txt1
13-txt2
42-txt3
.....
Table2
ID-Type-Text
13- 1 - MuTxt1
42- 1 - MuTxt2
12- 2 - Xnnn
Now I want to join these 2 tables to get all data for Type=1 in table 2
SQL1:
Select * from
Table1 t1
Join
(select * from Table2 where Type=1) t2
on t1.ID=t2.ID
SQL2:
Select * from
Table1 t1
Join
Table2 t2
on t1.ID=t2.ID
where t2.Type=1
These 2 queries give the same result, but which one is faster?
I don't know how Mysql does the Join (or How the Join works in Mysql) & that why I wonder this!!
Exxtra info, Now if i don't want type=1 but want t2.text='MuTxt1', so Sql2 will become
Select * from
Table1 t1
Join
Table2 t2
on t1.ID=t2.ID
where t2.text='MuTxt1'
I feel like this query is slower??
Sometimes the MySQL query optimizer does a pretty decent job and sometimes it sucks. Having said that, there are exception to my answer where the optimizer optimizes something else better.
Sub-Queries are generally expensive as MySQL will need to execute and store results seperately. Normally if you could use a sub-query or a join, the join is faster. Especially when using sub-query as part of your where clause and don't put a limit to it.
Select *
from Table1 t1
Join Table2 t2 on t1.ID=t2.ID
where t2.Type=1
and
Select *
from Table1 t1
Join Table2 t2
where t1.ID =t2.ID AND t2.Type=1
should perform equally well, while
Select *
from Table1 t1
Join (select *
from Table2
where Type=1) t2
on t1.ID=t2.ID
most likely is a lot slower as MySQL stores the result of select * from Table2 where Type=1 into a temporary table.
Generally joins work by building a table comprised of all combinations of rows from both table and afterwards removing lines which do not match the conditions. MySQL of course will try to use indexes containing the columns compared in the on clause and specified in the where clause.
If you are interested in which indexes are used, write EXPLAIN in front of your query and execute.
As per my view 2nd query is more better than first query in terms of code readability and performance. You can include filter condition in Join clause also like
Select * from
Table1 t1
Join
Table2 t2 on t1.ID=t2.ID and t2.Type=1
You can compare execution time for all queries in SQL fiddle here :
Query 1
Query 2
My Query
I think this question is hard to answer since we don't exactly know the internals of the query parser in the database. Usually these kind of constructions are evaluated by the database in a similar way (it can see that the first and second query are identical so parses it correctly, or not).
I would write the second one since it is more clear what is happening.

Left join part of the table

I am trying to join two table using left join, that is table1 left join table2.
I would only like part of the rows from A to be joined with B. Is it recommended that i use a sub query to filter rows from table1 or avoid them in where clause to improve my query performance?
select t1.a
,t1.b
,t2.c
from (select *
from table1
where a='x'
) t1 LEFT JOIN table2 t2 on t1.d=t2.d
or
select t1.a
,t1.b
,t2.c
from table1 t1 LEFT JOIN table2 t2 on t1.d=t2.d
where t1.a='x'
Check the query plan but I doubt it would make any difference.
It very depends on the structure and content of your database. The best way is to look into the query plan and compare it for both versions of your query.
You can find this documentation useful: MySQL Query Execution Plan

thoughts on innerjoin mysql

We have tables with more then 3m records. When using innerjoin it is much slower then select * from db1,db2 where db1.field=db2.field
Any thoughts?
INNER JOIN should not be any different from a SELECT FROM t1,t2 WHERE t1.c=t2.c, it is just a different syntax for doing the same thing and is treated the same by the optimiser.
Any difference in performance is in some other aspect of the query. Please POST:
The schema of both tables including their indexes (SHOW CREATE TABLE gives you this)
Both the queries you're comparing
Some detail about your performance testing methodology (it may be flawed)
The EXPLAIN output of both queries.
If you want a reasonable answer.
SELECT * from t1, t2 where t1.id = t2.id
is equivalent to
SELECT * from t1 INNER JOIN t2 on t1.id = t2.id.
However, if there are other criteria for the SQL query, then the behaviour may differ. For instance.
SELECT * from t1, t2 where t1.id = t2.id and t1.col1 is not null;
can be written in two different ways with the INNER JOIN:
SELECT * from t1 INNER JOIN t2 on t1.id = t2.id and t1.col1 is not null
or
SELECT * from t1 INNER JOIN t2 on t1.id = t2.id
WHERE t1.col1 is not null
This may or may not end up being the same query (according to the optimiser), and the complexity of the other parts of the query. The EXPLAIN PLAN will tell you if you are executing the same query.
Why are the above queries different? Because the restriction on not null is done at different stages of the query, which may have an impact on the performance, or even on the number of rows returned.
In general, the ...where db1.field=db2.field... syntax is an inner join. It's just the implicit notation instead of the explicit. If you're joining on the same columns and returning the same columns, performance should be identical. More: http://en.wikipedia.org/wiki/Join_(SQL)#Inner_join
I generally use explicit INNER JOIN or LEFT JOIN syntax according to needs. When the optimizer does a bad job, a STRAIGHT_JOIN can often sort it out, with suitable rearrangement of the query.
With any join involving large tables, it's worth using EXPLAIN.