what is difference between 'where' and different 'joins' in mysql? [duplicate] - mysql

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
In MySQL queries, why use join instead of where?
Joins are always confusing for me..can anyone tell me what is difference between different joins and which one is quickest and when to use which join and what is difference between join and where clause ?? Plz give ur answer in detail as i already read about joins on some websites but didn't get the concept properly.

Rather than quote an entire Wikipedia article, I'm going to suggest your their article on SQL Joins.
An SQL JOIN clause combines records
from two or more tables in a
database. It creates a set that
can be saved as a table or used as is.
A JOIN is a means for combining
fields from two tables by using values
common to each. ANSI standard SQL
specifies four types of JOINs: INNER,
OUTER, LEFT, and RIGHT. In special
cases, a table (base table, view, or
joined table) can JOIN to itself in a
self-join.
A programmer writes a JOIN predicate
to identify the records for joining.
If the evaluated predicate is true,
the combined record is then produced
in the expected format, a record set
or a temporary table, for example.
There is an older syntax that uses WHERE clauses to imply INNER JOINs. While it works, and will produce queries that run exactly like queries specified with the INNER JOIN syntax, it is deprecated because most people find it more confusing.
Here's the documentation for the MySQL JOIN syntax.

This query:
SELECT c.customer_name, o.order_id, o.total
FROM customers c, orders o
WHERE c.id = o.customer_id
and this
SELECT c.customer_name, o.order_id, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
make no difference on what is executed on the mysql server but the join is (imho) way more readable. Expecially for more complex joins:
Here is an article about the topic: http://www.mysqlperformanceblog.com/2010/04/14/is-there-a-performance-difference-between-join-and-where/

Related

MySQL Left Join and Right Join Optimization

I've been looking up some documentation about this topic here: https://dev.mysql.com/doc/refman/5.7/en/left-join-optimization.html
But I don't understand the following example:
The join optimizer calculates the order in which to join tables. The table read order forced by LEFT JOIN or STRAIGHT_JOIN helps the join optimizer do its work much more quickly, because there are fewer table permutations to check. This means that if you execute a query of the following type, MySQL does a full scan on b because the LEFT JOIN forces it to be read before d:
SELECT *
FROM a JOIN b LEFT JOIN c ON (c.key=a.key)
LEFT JOIN d ON (d.key=a.key)
WHERE b.key=d.key;
The fix in this case is reverse the order in which a and b are listed in the FROM clause:
SELECT *
FROM b JOIN a LEFT JOIN c ON (c.key=a.key)
LEFT JOIN d ON (d.key=a.key)
WHERE b.key=d.key;
Why does the order make an optimization? Do JOIN and LEFT_JOIN execute in some order?
I suspect the first quote is not quite correct. I have seen LEFT JOIN turned into JOIN and then the tables touched in the 'wrong' order.
Anyway, don't worry about the work the optimizer needs to do. In thousands of slow JOINs, I have identified only one case where the cost of picking the order was important. And it was a case of multiple joins to a single table; yet another drawback of EAV schema. Anyway, there is a simple setting to avoid that problem.
LEFT/RIGHT/plain JOINs are semantically done left-to-right (regardless of the order the optimizer chooses to touch the tables).
If you are concerned about the ordering, you can add parentheses. For example:
FROM (a JOIN b ON ...) JOIN (c JOIN d ON ...) ON ...
If you are using "commajoin" (FROM a,b...), don't. However, its precedence changed long ago. The workaround was to add parens so that the same SQL would work in versions before and after the change.
Don't use LEFT unless you need it to get NULLs for missing 'right' rows. It just confuses readers into thinking that you expect NULLs.
This example is wrong in many ways, and it is not clear to me what it is trying to convey. Apologies for that. I will file a bug with the documentation team.
Some clarifications:
For the given query, the last LEFT JOIN will be converted to an inner join. This is because the WHERE clause, WHERE b.key=d.key, implies that d.key can not be NULL. Hence, any extra rows produced by LEFT JOIN compared to INNER JOIN would be filtered out by the WHERE clause. (The principles of this transformation is described in the paragraph following the given example.)
The ON clause of the first LEFT JOIN, ON (c.key=a.key), makes table c dependent on table a, but not table b. Hence, the only the requirement wrt join order is that table a is processed before table c. The order in which tables a and b are listed in the query, will not change that.
Tables b and d may be processed in any order, both wrt each other and wrt the other tables of the query.
This paragraph seems to recommend LEFT JOIN as a mechanism to reduce number of "table permutations to check". This is not meaningful since changing from INNER JOIN to LEFT JOIN may change the semantics of the query. For this purpose STRAIGHT_JOIN should be used instead.
For most join queries, execution time by far exceeds optimization time. Reducing the number of "table permutations to check" may cause potentially more efficient permutations to not be explored. Hence, LEFT JOIN should not be used unless it is required to get the wanted semantics.

ON statement in sql for joins

Good day! I'm searching the entire web for the purpose of 'on' statement in MySQL. But i can't find an exact answer about the purpose of it. For example:
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;
What i'm trying to trace is to find the purpose of this:
ON Customers.CustomerID=Orders.CustomerID
You should not look for the ON syntax but for the inner join syntax and there you will find a lot explanations on line. E.g. http://www.w3schools.com/sql/sql_join_inner.asp
The 'ON' in this case is similar to 'Where' It defines on which fields the join is based.
I guess you are fooled by the fact that in your particular case, the joining field name is the same in both table: CustomerID.
That is not always the case, hence the necessity of specifying in SQL which fields are used for joining tables, in the ONclause.
You are joining two tables on customerId.
It means that you will join two tables side by side having same customerId from both the tables, so you could select customer and their order together at the same time.

Explain which table to choose "FROM" in a JOIN statement

I'm new to SQL and am having trouble understanding why there's a FROM keyword in a JOIN statement if I use dot notation to select the tables.columns that I want. Does it matter which table I choose out of the two? I didn't see any explanation for this in w3schools definition on which table is the FROM table. In the example below, how do I know which table to choose for the FROM? Since I essentially already selected which table.column to select, can it be either?
For example:
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;
The order doesn't matter in an INNER JOIN.
However, it does matter in LEFT JOIN and RIGHT JOIN. In a LEFT JOIN, the table in the FROM clause is the primary table; the result will contain every row selected from this table, while rows named in the LEFT JOIN table can be missing (these columns will be NULL in the result). RIGHT JOIN is similar but the reverse: rows can be missing in the table named in FROM.
For instance, if you change your query to use LEFT JOIN, you'll see customers with no orders. But if you swapped the order of the tables and used a LEFT JOIN, you wouldn't see these customers. You would see orders with no customer (although such rows probably shouldn't exist).
The from statement refers to the join not the table. The join of table will create a set from which you will be selecting columns.
For an inner join it does not matter which table is in the from clause and which is in the join clause.
For outer joins it of course does matter, as the table in the outer join is allowed to have "missing" records.
It does not matter for inner joins: the optimizer will figure out the proper sequence of reading the tables, regardless of your choice for the ordering.
For directional outer joins, it does matter, because these are not symmetric. You choose the table in which you want to keep all rows for the first FROM table in a left outer join; for the right outer join it is the other way around.
For full outer joins it does not matter again, because the tables in full outer joins are used symmetrically to each other.
In situations when ordering does not matter you pick the order to be "natural" to the reader of your SQL statement, whatever that means for your model. SQL queries very quickly become rather hard to read, so the proper ordering of your tables is important for human readers of your queries.
Well in your current example the from operator can be applied on both tables.
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers,Orders
WHERE Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;
->Will work like your code
The comma will join the two tables.
From just means which table you are retrieving data from.
In your example, you joined the two tables using different syntax.
it could also have been :
SELECT Customers.CustomerName, Orders.OrderID
FROM Orders
INNER JOIN Customers
ON Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;
all the code written will generate same results

Is this a MySQL JOIN? [duplicate]

This question already has answers here:
Explicit vs implicit SQL joins
(12 answers)
Closed 8 years ago.
SELECT
dim_date.date, dim_locations.city, fact_numbers.metric
FROM
dim_date, fact_numbers
WHERE
dim_date.id = fact_numbers.fk_dim_date_id
AND
dim_locations.city = "Toronto"
AND
dim_date.date = 2010-04-13;
Since I'm not using the JOIN command, I'm wondering if this is indeed a JOIN (and if not, what to call it)?
This is for a dimensional model by the way which is using surrogate and primary keys to match up details
Yes, it is joining the tables together so it will provide related information from all of the tables.
The one major downfall of linking tables via WHERE instead of JOIN is not being able to use LEFT, RIGHT, and Full to show records from one table even if missing in the other.
However, your SQL statement is invalid. You are not linking dim_locations to either of the other tables and it is missing in the FROM clause.
Your query using an INNER JOIN which is comparable to your WHERE clause may look something like the following:
SELECT DD.date, DL.city, FN.metric
FROM dim_date AS DD
JOIN fact_numbers AS FN ON DD.id = FN.fk_dim_date_id
JOIN dim_locations AS DL ON DL.id = FN.fk_dim_locations_id
WHERE DL.city = 'Toronto'
AND DD.date = 2010-04-13
yes, it is joining tables. It is called non-ANSI JOIN syntax when join clause is not used explicitly. And when join clause is used, it is ANSI JOIN
If you reference two different tables in a where clause and compare their referential IDs, then yes, it acts the same as a JOIN.
Note however that this can be very inefficient if the optimizer doesn't optimize it properly see: Is there something wrong with joins that don't use the JOIN keyword in SQL or MySQL? and INNER JOIN keywords | with and without using them

Whats the advantage of using INNER JOIN? [duplicate]

This question already has answers here:
INNER JOIN ON vs WHERE clause
(12 answers)
Closed 7 years ago.
I see that this example:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID;
produces the same output as this example:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders, Customers
WHERE Orders.CustomerID=Customers.CustomerID;
Is there any advantage in the INNER JOIN command?
Your former example is standard ANSI/ISO SQL. Your latter example is deprecated SQL. With the former syntax (ANSI/ISO) , the join criteria are clearly delineated; with the latter (deprecated) syntax, the join criteria are intermixed with the where clause filters making things much more difficult to sort out.
It doesn't matter so much for simply queries like yours, but when you have to deal with much more complicated queries (try 12 or 15 tables with a mixture of left, right and inner joins!), you will understand why the old syntax is deprecated.
INNER JOIN is for clarity. Using WHERE as in your second example can get lost in more complex queries, making it difficult to determine how your tables are joined and make it more difficult to troubleshoot problems.