Difference between USING and ON when joining more than two tables - mysql

Say I have three tables with the following data in them:
CREATE TABLE movies (
movie_id INT,
movie_name VARCHAR(255),
PRIMARY KEY (movie_id)
);
CREATE TABLE movie_ratings (
movie_rating_id INT,
movie_id INT,
rating_value TINYINT,
PRIMARY KEY (movie_rating_id),
KEY movie_id (movie_id)
);
CREATE TABLE movie_actors (
movie_actor_id INT,
movie_id INT,
actor_id INT,
PRIMARY KEY (movie_actor_id),
KEY movie_id (movie_id)
);
INSERT INTO movies VALUES (1, 'Titanic'),(2,'Star Trek');
INSERT INTO movie_ratings VALUES (1,1,5),(2,1,4),(3,1,5);
INSERT INTO movie_actors VALUES (1,1,2),(2,2,2);
If I wanted to get the average rating and number of actors for each movie, I could do this using JOINs:
SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
GROUP BY m.movie_id;
Let's call that query A. Query A can be rewritten with USING thusly:
SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN movie_ratings r USING (movie_id)
LEFT JOIN movie_actors a USING (movie_id)
GROUP BY m.movie_id;
Let's call that query B.
Both of those queries return 1 as numActors for the movie 'Star Trek'. So let's modify that query a bit:
SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
LEFT JOIN movie_actors a ON r.movie_id = a.movie_id
GROUP BY m.movie_id;
Let's call this query C. Instead of doing m.movie_id = a.movie_id I'm now doing r.movie_id = a.movie_id. For query C numActors is 0.
My questions are:
1. How can I write query C using USING? Can I?
2. Is USING essentially doing an ON with the current table and the table mentioned in FROM?
3. If the answer to #2 is yes then what does USING do when an implicit JOIN is used and multiple tables are in the FROM?
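For reference, the difference between queries A and C can be reproduced outside MySQL. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the schema is simplified and the helper name count_actors is hypothetical):

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
INSERT INTO movie_actors VALUES (1, 1, 2), (2, 2, 2);
""")

def count_actors(actor_join_condition):
    """Hypothetical helper: run the query with the given ON condition
    for movie_actors and return {movie_name: numActors}."""
    sql = f"""
        SELECT m.movie_name, COUNT(actor_id) AS numActors
        FROM movies m
        LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
        LEFT JOIN movie_actors a ON {actor_join_condition}
        GROUP BY m.movie_id, m.movie_name
    """
    return dict(conn.execute(sql).fetchall())

query_a = count_actors("m.movie_id = a.movie_id")  # query A
query_c = count_actors("r.movie_id = a.movie_id")  # query C
print(query_a["Star Trek"], query_c["Star Trek"])  # prints "1 0"
```

Star Trek has no ratings, so r.movie_id is NULL on its row and the join to movie_actors in query C finds nothing. Titanic reports 3 in both cases because each of its three rating rows is paired with its single actor row.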

If the column name is the same in both tables then yes, you can use USING().
In other words, this:
SELECT movie_name, AVG(rating_value) AS averageRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
GROUP BY m.movie_id;
Is the same as:
SELECT movie_name, AVG(rating_value) AS averageRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN movie_ratings USING (movie_id)
LEFT JOIN movie_actors USING (movie_id)
GROUP BY movie_id;
As for ambiguity, there won't be any here. The tables are joined where movie_id is equal, and your select statement pulls movie_name, which exists in only one table.
However, if you said this:
SELECT movie_id, AVG(rating_value) AS averageRating, COUNT(actor_id) AS numActors
In the ON version, MySQL will raise an error because movie_id is ambiguous: it exists in more than one of the joined tables. To fix this ambiguity, qualify movie_id with a table alias or name when selecting it.
This is a valid select statement:
SELECT m.movie_id, AVG(rating_value) AS averageRating, COUNT(actor_id) AS numActors
No error would be thrown for this.
I would like to comment that I foresee some danger here. If you left join movies with all of these tables, you could potentially receive null values. If movie_id 1 does not have any ratings, your AVG(rating_value) will return null. You won't have this problem for COUNT(actor_id) as this will just return 0. I don't know if this bothers you, but be aware that that column could return null.
I built the sample tables in MySQL workbench, and I'm unable to get SQL Fiddle to work to show you, but if you would like to see the data I've created let me know and I will edit the question.
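The NULL-versus-0 behaviour described above can be checked with a small sketch. This assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the column alias avgRatingOrZero is hypothetical; it shows one possible way to replace the NULL average with COALESCE:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
""")

rows = conn.execute("""
    SELECT m.movie_name,
           AVG(r.rating_value)              AS avgRating,        -- NULL when there are no ratings
           COALESCE(AVG(r.rating_value), 0) AS avgRatingOrZero,  -- hypothetical fallback value
           COUNT(r.rating_value)            AS numRatings        -- 0 when there are no ratings
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    GROUP BY m.movie_id, m.movie_name
    ORDER BY m.movie_id
""").fetchall()

star_trek = rows[1]
print(star_trek)  # prints ('Star Trek', None, 0, 0)
```

Whether 0 is a sensible replacement for a missing average depends on how the result is used; leaving it NULL is often the more honest answer.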

1. Can C be rewritten using USING?
Yes, you can, using a nested join:
SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN (
movie_ratings r
LEFT JOIN movie_actors a USING (movie_id)
) USING (movie_id)
GROUP BY m.movie_id
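As a rough check of the nested-join rewrite, the sketch below runs query C and the nested USING form side by side. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also accepts a parenthesized join on the right-hand side), so it illustrates the idea rather than MySQL's exact behaviour:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
INSERT INTO movie_actors VALUES (1, 1, 2), (2, 2, 2);
""")

# Query C: actors are joined to ratings, not to movies.
c_rows = conn.execute("""
    SELECT m.movie_name, COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON r.movie_id = a.movie_id
    GROUP BY m.movie_id, m.movie_name
    ORDER BY m.movie_id
""").fetchall()

# Same logic with USING: ratings and actors are joined first, then joined to movies.
nested_rows = conn.execute("""
    SELECT m.movie_name, COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN (
        movie_ratings r
        LEFT JOIN movie_actors a USING (movie_id)
    ) USING (movie_id)
    GROUP BY m.movie_id, m.movie_name
    ORDER BY m.movie_id
""").fetchall()

print(c_rows == nested_rows, nested_rows)
```

The parentheses make the inner USING compare movie_ratings with movie_actors, which is exactly the r.movie_id = a.movie_id condition of query C.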
2. Is USING essentially doing an ON with the current table and the table mentioned in FROM?
No. MySQL Documentation says:
The evaluation of multi-way natural joins differs in a very important way that affects the result of NATURAL or USING joins and that can require query rewriting. Suppose that you have three tables t1(a,b), t2(c,b), and t3(a,c) that each have one row: t1(1,2), t2(10,2), and t3(7,10). Suppose also that you have this NATURAL JOIN on the three tables:
SELECT ... FROM t1 NATURAL JOIN t2 NATURAL JOIN t3;
Previously, the left operand of the second join was considered to be t2, whereas it should be the nested join (t1 NATURAL JOIN t2). As a result, the columns of t3 are checked for common columns only in t2, and, if t3 has common columns with t1, these columns are not used as equi-join columns. Thus, previously, the preceding query was transformed to the following equi-join:
SELECT ... FROM t1, t2, t3
WHERE t1.b = t2.b AND t2.c = t3.c;
So basically, in older versions of MySQL your query B was not the same as query A, but as query C!
3. What does USING do when an implicit JOIN is used and multiple tables are in the FROM?
Again, citing the MySQL Documentation:
Previously, the comma operator (,) and JOIN both had the same precedence, so the join expression t1, t2 JOIN t3 was interpreted as ((t1, t2) JOIN t3). Now JOIN has higher precedence, so the expression is interpreted as (t1, (t2 JOIN t3)). This change affects statements that use an ON clause, because that clause can refer only to columns in the operands of the join, and the change in precedence changes interpretation of what those operands are.
It's all about join-order and precedence. So basically t1, t2 JOIN t3 USING (x) would do t2 JOIN t3 USING(x) first and join that with t1.

There is no ambiguity as USING applies to the tables in the join so this query
SELECT movie_name, AVG(rating_value), COUNT(actor_id)
FROM movies m
LEFT JOIN movie_ratings r USING (movie_id)
LEFT JOIN movie_actors a USING (movie_id)
GROUP BY m.movie_id;
is pretty much equivalent to the one with inner joins except that the movie_id column should only appear once in the results, instead of three times in the inner join case.
See this example for the column elimination: http://ideone.com/qMj5XK (using SQLite I think, SQL Fiddle wouldn't work but MySQL should behave in the same way).
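The column elimination can also be seen by comparing the column lists of SELECT * for the two join forms. This is a minimal sketch using Python's built-in sqlite3 module (SQLite rather than MySQL, as in the linked example); the helper name column_names is hypothetical:

```python
import sqlite3

# In-memory SQLite database; the tables can stay empty because only column names are inspected.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
""")

def column_names(sql):
    """Hypothetical helper: return the result column names of a query."""
    return [col[0] for col in conn.execute(sql).description]

on_cols = column_names("""
    SELECT * FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
""")
using_cols = column_names("""
    SELECT * FROM movies m
    LEFT JOIN movie_ratings r USING (movie_id)
    LEFT JOIN movie_actors a USING (movie_id)
""")

print(on_cols.count("movie_id"), using_cols.count("movie_id"))  # prints "3 1"
```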

How can I write query C using USING? Can I?
As jpw mentioned in his answer, yes, you can use USING with query C. It will join m with r using movie_id and m with a also using movie_id. In fact, USING in MySQL is aligned with the SQL:2003 standard.
Is USING essentially doing an ON with the current table and the table mentioned in FROM?
Yes, USING is doing an ON with the current table and the table mentioned in the FROM clause. The only difference is the number of columns you end up with if you use an asterisk in the SELECT clause. The Oracle documentation for USING is much more helpful than the MySQL documentation about that.
If the answer to #2 is yes then what does USING do when an implicit JOIN is used and multiple tables are in the FROM?
You can try it for yourself, but I'm pretty sure it wouldn't work with an implicit join (FROM tableA, tableB). This might be just another reason why implicit joins should be avoided.
Also, since USING can only be used with explicit joins, that would mean a very awkward query mixing both explicit and implicit joins. Something you probably want to avoid.
Edit:
By the way, numActors is 0 in query C because your joins are incorrect. As written, if there are no movie ratings then there are no actors! If you fix that you should get the same result as query B.
SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
FROM movies m
LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
LEFT JOIN movie_actors a ON m.movie_id = a.movie_id -- Instead of r.movie_id = a.movie_id
GROUP BY m.movie_id;

Display name of foreign key column instead of its id

I have a match table whose structure is displayed here
In this table, the columns team_a and team_b are foreign keys referencing t_id in the teams table. When I select all data from this table, I want it to display the team names for team_a and team_b instead of their t_id values. The structure of the teams table is here
Query which i am writing is below:
select *
from teams,matches
where
matches.team_a=teams.t_id
and matches.team_b=teams.t_id;
You need to join 2 columns of matches to the teams table:
select
m.m_id,
t1.t_name as team_a,
t2.t_name as team_b,
m.m_time
from
matches m inner join teams as t1 on m.team_a=t1.t_id
inner join teams as t2 on m.team_b=t2.t_id
order by m.m_id;
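To see why the teams table has to be joined twice, the sketch below runs both the original query and the two-join version. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the table structures and sample rows are hypothetical, since the original structure images are not available:

```python
import sqlite3

# In-memory SQLite database with a hypothetical schema and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (t_id INT PRIMARY KEY, t_name VARCHAR(50));
CREATE TABLE matches (m_id INT PRIMARY KEY, team_a INT, team_b INT, m_time VARCHAR(20));
INSERT INTO teams VALUES (1, 'Lions'), (2, 'Tigers'), (3, 'Bears');
INSERT INTO matches VALUES (10, 1, 2, '18:00'), (11, 3, 1, '20:00');
""")

# Original attempt: a single teams row must equal both team_a and team_b,
# which only happens if a team plays itself.
single = conn.execute("""
    select *
    from teams, matches
    where matches.team_a = teams.t_id
      and matches.team_b = teams.t_id
""").fetchall()

# Two joins: t1 resolves team_a, t2 resolves team_b.
joined = conn.execute("""
    select m.m_id, t1.t_name as team_a, t2.t_name as team_b, m.m_time
    from matches m
    inner join teams as t1 on m.team_a = t1.t_id
    inner join teams as t2 on m.team_b = t2.t_id
    order by m.m_id
""").fetchall()

print(single)  # prints []
print(joined)  # prints [(10, 'Lions', 'Tigers', '18:00'), (11, 'Bears', 'Lions', '20:00')]
```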
First, never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax. You need two JOINs in fact:
select m.*, ta.t_name as name_a, tb.t_name as name_b
from matches m left join
teams ta
on m.team_a = ta.t_id left join
teams tb
on m.team_b = tb.t_id;
This uses left join just to ensure that you get all matches, even if one of the teams is missing. In this case, that is probably not an important consideration, so inner join would be equivalent.
You want two INNER JOINs from table matches to table teams, like:
SELECT
ta.t_name,
tb.t_name
FROM
matches m
INNER JOIN teams as ta on ta.t_id = m.team_a
INNER JOIN teams as tb on tb.t_id = m.team_b
You can also create a view from the join, which will make further development simpler. I adapted forbas's code as follows:
CREATE VIEW team AS select
m.m_id,
t1.t_name as team_a,
t2.t_name as team_b,
m.m_time
from
matches m inner join teams as t1 on m.team_a=t1.t_id
inner join teams as t2 on m.team_b=t2.t_id
order by m.m_id;

I am unsure: Is this an anti-join?

I am working on the first problem of the Using Null section of the famous SQLZoo: http://sqlzoo.net/wiki/Using_Null
The question is:
List the teachers who have NULL for their department.
The corresponding SQL query would be:
SELECT t.name
FROM teacher t
WHERE t.dept IS NULL
Is this a type of anti-join? Specifically, is this a left-anti-join?
This isn't a join at all.
The statement is filtering only records for teachers who don't have an assigned department.
Set Difference
The set difference of teachers and departments, teacher \ department would be a kind of "anti-join"
SELECT
t.name
FROM teacher t
LEFT JOIN department d ON d.id = t.dept_id
WHERE d.id IS NULL
At first glance, this statement does what your statement does, and if the foreign key reference were enforced, it would be guaranteed to do exactly that. However, one use for this statement would be to retrieve teachers who are assigned to departments that have since been deleted (e.g. if the English Lit Dept. & English as 2nd Lang Dept. were reorganized as the English Dept.).
Symmetric Difference
Another "anti-join" would be the symmetric difference, which selects elements from both sets ONLY if they cannot be joined, i.e.
(teacher \ department) U (department \ teacher)
I can't think of a motivating example using teachers and departments, but one way to write the symmetric difference on databases that support the FULL OUTER JOIN would be:
SELECT
t.name
FROM teacher t
FULL OUTER JOIN department d ON d.id = t.dept_id
WHERE d.id IS NULL OR t.id IS NULL
For MySQL, this statement would have to be written as the union of two statements.
SELECT
t.name teacher_name, d.name department_name
FROM teacher t
LEFT JOIN department d ON d.id = t.dept_id
WHERE d.id IS NULL
UNION ALL
SELECT
t.name teacher_name, d.name department_name
FROM teacher t
RIGHT JOIN department d ON d.id = t.dept_id
WHERE t.id IS NULL
Looking through one of my projects, I found this one use of symmetric difference:
Context:
I have three tables: users, users_gameplay_summary, users_transactions_summary. I needed to email those users who created their accounts in the past 7 days AND have either transacted but not played, or played but not transacted.
To get the list, I have this query (note, this was written for Postgresql, and won't work on MySQL, but it illustrates the symmetric difference use case):
SELECT
COALESCE(g.user_id, t.user_id) user_id
FROM users_gameplay_summary g
FULL OUTER JOIN users_transactions_summary t ON t.user_id = g.user_id
WHERE COALESCE(g.user_id, t.user_id) IN (
SELECT user_id
FROM users
WHERE created_at > CURRENT_DATE - '7 day'::interval)
AND (g.user_id IS NULL OR t.user_id IS NULL)
Not exactly; you're not actually joining anything here.
With a left anti-join you would also have access to the department name (although it would be NULL).
Your SQL would be a correct answer to the question you gave, though.
A left anti join would be:
SELECT t.name
FROM teacher t
LEFT JOIN dept d ON d.id = t.dept
WHERE d.id IS NULL
To solve this problem of listing teachers without assigned departments, you don't need a JOIN between teacher and dept tables.
dept table is basically a dictionary table that you join to, to translate ids to corresponding names.
teacher table has a dept column which normally could have a FOREIGN KEY constraint to id column in dept table.
Your query is not an ANTI-JOIN. This is a simple projection and selection query using one table.
SELECT t.name
FROM teacher t
WHERE t.dept IS NULL
For an ANTI-JOIN you would at least need a JOIN operation between more than one table at first.
Normally an ANTI-JOIN could look like:
Using LEFT JOIN
SELECT *
FROM table1 t1
LEFT JOIN table2 t2
ON t1.join_column = t2.join_column
WHERE t2.join_column IS NULL
Using NOT EXISTS
SELECT *
FROM table1 t1
WHERE NOT EXISTS (
SELECT 1
FROM table2 t2
WHERE t1.join_column = t2.join_column
)
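The three query shapes discussed in this thread can be compared on a small data set. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL, with hypothetical sample rows; one teacher has a NULL dept and another points at a department id that does not exist:

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE teacher (id INT PRIMARY KEY, name VARCHAR(50), dept INT);
INSERT INTO dept VALUES (1, 'Computing'), (2, 'Design');
INSERT INTO teacher VALUES
    (101, 'Ann', 1),
    (102, 'Bob', 2),
    (103, 'Cy', NULL),  -- no department assigned
    (104, 'Di', 9);     -- department 9 does not exist
""")

def names(sql):
    """Hypothetical helper: return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Plain filter on one table: not a join at all.
is_null = names("SELECT t.name FROM teacher t WHERE t.dept IS NULL ORDER BY t.id")

# Anti-join written with LEFT JOIN ... IS NULL.
left_anti = names("""
    SELECT t.name
    FROM teacher t
    LEFT JOIN dept d ON d.id = t.dept
    WHERE d.id IS NULL
    ORDER BY t.id
""")

# Anti-join written with NOT EXISTS.
not_exists = names("""
    SELECT t.name
    FROM teacher t
    WHERE NOT EXISTS (SELECT 1 FROM dept d WHERE d.id = t.dept)
    ORDER BY t.id
""")

print(is_null, left_anti, not_exists)  # prints ['Cy'] ['Cy', 'Di'] ['Cy', 'Di']
```

The two anti-join forms agree with each other, and they differ from the IS NULL filter only when a teacher references a department that is missing, which an enforced foreign key would prevent.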

Finding actor/movie rows where actor has multiple distinct roles in the same movie

I got 3 tables, actor (id,name), movie (id,name) and casts(aid,mid,role) (aid is the actor id and mid is the movie id). I was trying to get the output like this:
if an actor had 3 or more distinct roles in the same movie, print all the combinations, like:
-1.actor.name, movie.name, role1
-2.actor.name, movie.name, role2
-3.actor.name, movie.name, role3
My query is like this:
select a.name, m.name, x.role
from actor a,
movie m,
(select distinct role
from casts c
where c.aid =a.id and c.mid = m.id
group by c.aid and c.mid
having count(distinct role) >=3) as x;
But I got error message:
The multi-part identifier "m.id" could not be bound.
The multi-part identifier "a.id" could not be bound.
Please point out where my thought went wrong, I want to be able to do this next time. Thanks.
Your initial query is close, but the problem is that you can only return a single column from a subquery, whereas your casts table has a composite key* of two foreign key columns.
Instead, you can do the hard work in a derived table (as you've done in your initial subquery). The benefit of the derived table over the subquery is that you can then join the other tables back on the two columns to return the friendly column names:
select a.name, m.name, c.`role`
from
(
select aid, mid
from casts
group by aid, mid
having count(distinct `role`) >= 3
) x
inner join actor a
on a.id = x.aid
inner join movie m
on m.id = x.mid
inner join casts c
on x.mid = c.mid and x.aid = c.aid;
* Actually, it isn't really a key either, given that the same actor can have multiple roles in the same movie. But we are looking for unique combinations, so it's unique after we do the GROUP BY on mid, aid.
SqlFiddle here - Duplicate Roles are ignored, and the threshold of 3 roles, same movie is observed.
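The derived-table approach can be tried on a few rows as follows. This is a sketch using Python's built-in sqlite3 module as a stand-in for the asker's database; the sample actors, movie and roles are hypothetical, and the backticks are dropped because they are not needed here:

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE movie (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE casts (aid INT, mid INT, role VARCHAR(50));
INSERT INTO actor VALUES (1, 'Actor One'), (2, 'Actor Two');
INSERT INTO movie VALUES (1, 'Movie X');
INSERT INTO casts VALUES
    (1, 1, 'Hero'), (1, 1, 'Villain'), (1, 1, 'Narrator'),   -- three distinct roles
    (2, 1, 'Sidekick'), (2, 1, 'Sidekick'), (2, 1, 'Driver'); -- only two distinct roles
""")

rows = conn.execute("""
    select a.name, m.name, c.role
    from (
        -- actor/movie pairs with at least 3 distinct roles
        select aid, mid
        from casts
        group by aid, mid
        having count(distinct role) >= 3
    ) x
    inner join actor a on a.id = x.aid
    inner join movie m on m.id = x.mid
    inner join casts c on x.mid = c.mid and x.aid = c.aid
    order by c.role
""").fetchall()

for row in rows:
    print(row)
```

Actor Two has three casts rows for the movie but only two distinct roles, so count(distinct role) keeps that pair out of the result.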
Do a join between the 3 tables, and have a sub-select to verify at least 3 different roles:
select a.name, m.name, c.role
from actor a
join casts c on c.aid = a.id
join movie m on m.id = c.mid
where a.id in (select aid from casts
where aid = a.id and mid = m.id
group by aid, mid
having count(distinct role) >=3)

Mixing ANSI 1992 JOINs and COMMAs in a query

I'm trying the following MySQL query to fetch some data:
SELECT m.*, t.*
FROM memebers as m, telephone as t
INNER JOIN memeberFunctions as mf ON m.id = mf.memeber
INNER JOIN mitgliedTelephone as mt ON m.id = mt.memeber
WHERE mf.function = 32
But I always get the following error:
#1054 - Unknown column 'm.id' in 'on clause'
The column does exist and the query works fine with only one table (e.g. when I remove telephone).
Does anybody know what I'm doing wrong?
According to this link, you shouldn't mix up both notations when building up joins. The comma you are using to join memebers as m, telephone as t, and the subsequent calls to inner join, are triggering the unknown column error.
To deal with it, use CROSS/INNER/LEFT JOIN instead of commas.
Previously, the comma operator (,) and JOIN both had the same
precedence, so the join expression t1, t2 JOIN t3 was interpreted as
((t1, t2) JOIN t3). Now JOIN has higher precedence, so the expression
is interpreted as (t1, (t2 JOIN t3)). This change affects statements
that use an ON clause, because that clause can refer only to columns
in the operands of the join, and the change in precedence changes
interpretation of what those operands are.
For pedagogic purpose, I'm adding the query as it, I think, should be:
SELECT m.*, t.*
FROM memebers as m
JOIN telephone as t
JOIN memeberFunctions as mf ON m.id = mf.memeber AND mf.function = 32
JOIN mitgliedTelephone as mt ON m.id = mt.memeber
Since you're not joining t and m, the final result will be a cartesian product; you might want it to be revised.
I Hope it helped.
It seems your requirement is to join against the memebers table, but the joins are being applied to the telephone table. Just change their order.
SELECT
`m`.*,
`t`.*
FROM
`memebers` AS `m`
JOIN `telephone` AS `t`
JOIN `memeberFunctions` AS `mf`
ON `m`.`id` = `mf`.`memeber`
AND `mf`.`function` = 32
JOIN `mitgliedTelephone` AS `mt`
ON `m`.`id` = `mt`.`memeber`;
Hope this helps you. Thank you!!

Left outer join to a generated table?

Am I on completely the wrong tack ?
I want to do a left outer join to a query generated from 2 tables , but i keep getting errors. Do I need a different approach?
t1:
ID, Surname,Firstname
t2:
ID,JobNo,Confirmed
I have the following query:
SELECT JobNo AS N, StaffID AS P, Confirmed as C,
FirstName AS F,Surname AS S
FROM gigs_players, Players
WHERE t1.StaffID=t2.StaffID AND JobNo="2"
AND (`Confirmed` IS NULL OR Confirmed ='Y' )
ORDER BY Instrument,Surname
I want to add:
LEFT OUTER JOIN contacted (ON t1.StaffID=contact.ID AND t2.JobNo=contact.JobNo)"
Can I do a left outer join to a query generated from 2 tables ?
To use t1 and t2 in the left outer join you want to add, you first need to join them with the other tables; you can't reference them directly in the left outer join. Something like the following:
SELECT JobNo AS N, StaffID AS P, Confirmed as C,
FirstName AS F,Surname AS S
FROM gigs_players, Players
Inner join t1 on ...
Inner join t2 on ...
LEFT OUTER JOIN contacted c
on t1.StaffID=c.ID AND t2.JobNo = c.JobNo
WHERE t1.StaffID=t2.StaffID AND JobNo="2"
AND (`Confirmed` IS NULL OR Confirmed ='Y' )
ORDER BY Instrument,Surname
So, based on your tables' structure, define the conditions of the two joins with t1 and t2 with the other tables.
Here is an example of a left join to a subquery. This might be what you are looking for.
select
parts.id,
min(inv2.id) as nextFIFOitemid
from test.parts
left join
( select
inventory.id,
coalesce(parts.id, 1) as partid
from test.inventory
left join test.parts
on (parts.id = inventory.partid)
) inv2
on (parts.id = inv2.partid)
group by parts.id;