SQL - Find the object with the most appearances - mysql

I am a newbie to SQL working on an assignment to find the actor or actress with the most appearances. A diagram of the database I'm working with is here:
Here was the query I was trying to use:
SELECT DISTINCT n.name, count(n.name)
FROM cast_info c
INNER JOIN name n
ON (n.id = c.person_id)
INNER JOIN title t
ON (c.movie_id = t.id)
CROSS JOIN role_type r
WHERE (r.role = 'actor' OR r.role = 'actress')
GROUP BY n.name
This is intended to get a count of how many times different actors showed up, which I can then sort and select the top one. But it doesn't work. Something else I did was:
SELECT n.name, count(n.name) AS amount
FROM cast_info c
INNER JOIN name n
ON (n.id = c.person_id)
INNER JOIN title t
ON (c.movie_id = t.id)
LEFT JOIN role_type r
ON c.role_id = r.id
AND (r.role = 'actor' OR r.role = 'actress')
GROUP BY amount
ORDER BY amount DESC
LIMIT 1
But that gives the error
aggregate functions are not allowed in GROUP BY
LINE 1: SELECT COUNT(*) AS total FROM (SELECT n.name, count(n.name) ...
Tips?

I am going to take a stab at each of these questions for you, because this assignment is obviously causing you some trouble.
You can find everything you need in your cast_info table and your role_type table, unless you need to display the actor's or actress's actual name.
I would start by selecting all rows that represent an actor or actress in a movie. This should be a unique combination, as a person can't be an actor in the same movie twice. Once you've done that, group by the person's id and get the count() of rows, which should effectively be the number of movies. I think the error you're getting is exactly for the reason it sounds: you can't use an aggregate column (your amount alias) in your GROUP BY. Group by the person instead, then sort on the count and keep the top row. One way to do that is to use the grouped query as a subquery.
Try this:
SELECT t.person_id, t.numMovies AS mostAppearances
FROM (SELECT c.person_id, COUNT(*) AS numMovies
FROM cast_info c
JOIN role_type r ON r.id = c.role_id
WHERE r.role = 'actor' OR r.role = 'actress'
GROUP BY c.person_id) t
ORDER BY t.numMovies DESC
LIMIT 1
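For anyone who wants to try the group-then-sort idea on a small scale, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table contents are made up, and only the columns the query needs are created; the real schema in the assignment will have more.

```python
import sqlite3

# Hypothetical miniature of the schema; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE role_type (id INTEGER PRIMARY KEY, role TEXT);
CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cast_info (person_id INTEGER, movie_id INTEGER, role_id INTEGER);
INSERT INTO role_type VALUES (1, 'actor'), (2, 'actress'), (3, 'director');
INSERT INTO name VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cy');
INSERT INTO cast_info VALUES
  (10, 100, 2), (10, 101, 2), (10, 102, 2),
  (11, 100, 1), (11, 101, 1),
  (12, 100, 3), (12, 101, 3), (12, 102, 3), (12, 103, 3);
""")

# Group by the person (not by the count), then sort on the count.
top = conn.execute("""
SELECT n.name, COUNT(*) AS amount
FROM cast_info c
JOIN role_type r ON r.id = c.role_id
JOIN name n ON n.id = c.person_id
WHERE r.role IN ('actor', 'actress')
GROUP BY n.id, n.name
ORDER BY amount DESC
LIMIT 1
""").fetchone()

print(top)  # ('Ann', 3): Cy has more rows, but only as a director
```

Grouping on n.id as well as n.name keeps two different people who share a name from being merged into one count.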

Try this
SELECT n.name, count(*) AS amount
FROM cast_info c
INNER JOIN name n
ON n.id = c.person_id
INNER JOIN title t
ON c.movie_id = t.id
INNER JOIN role_type r
ON c.role_id = r.id
AND (r.role = 'actor' OR r.role = 'actress')
GROUP BY n.name
ORDER BY amount DESC
LIMIT 1

Related

SQL - inner join on different criteria

I'm getting confused by some basic stuff.
Could someone explain this query to me?
select s.name from students s
inner join friends f on f.id = s.id
inner join packages p on p.id = s.id
where p.salary < (select pp.salary from packages pp where pp.id = f.friend_id)
order by (select pp.salary from packages pp where pp.id = f.friend_id) ASC;
I don't understand the salary comparison part, i.e. select pp.salary from packages pp where pp.id = f.friend_id. Shouldn't it yield the same salary? If so, how can we compare?
For reference, use the sample tables below:
table 1- students
columns - id, name
table 2 - friends (here each id is linked with one friend_id (his best friend))
columns - id , friend_id
table3 - packages
columns - id , salary
I'm trying to find the names of the students whose best friend's salary is higher than their own.
I am having trouble understanding this solution.
The subquery in the WHERE clause is the problem: if it returns more than one record, it can't be used with the < operator, since < accepts only a scalar value. Rather, change that to a JOIN as well, like
JOIN packages pp ON pp.id = f.friend_id
AND p.salary < pp.salary
Change your query to be
select s.name from students s
inner join friends f on f.id = s.id
inner join packages p on p.id = s.id
JOIN packages pp ON pp.id = f.friend_id
AND p.salary < pp.salary
order by pp.salary;
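To see what joining packages twice actually does, here is a small sketch using Python's sqlite3 module in place of MySQL. The three students and their salaries are invented; p is each student's own package row and pp is the best friend's row.

```python
import sqlite3

# Made-up sample data matching the table layout described in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE friends (id INTEGER PRIMARY KEY, friend_id INTEGER);
CREATE TABLE packages (id INTEGER PRIMARY KEY, salary INTEGER);
INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO friends VALUES (1, 2), (2, 3), (3, 1);
INSERT INTO packages VALUES (1, 10), (2, 20), (3, 5);
""")

rows = conn.execute("""
SELECT s.name
FROM students s
JOIN friends f ON f.id = s.id
JOIN packages p ON p.id = s.id             -- the student's own salary
JOIN packages pp ON pp.id = f.friend_id    -- the best friend's salary
                AND p.salary < pp.salary
ORDER BY pp.salary
""").fetchall()

names = [r[0] for r in rows]
print(names)  # ['Cy', 'Ann']
```

Bob is dropped because his best friend Cy earns less than he does; the other two are ordered by the friend's salary.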

Is this the right way to join tables to fetch data?

I have a database with the tables:
Student(SID,Name,Surname,Age)
Registration(StudentID,CourseID)
Course(CID,Name,Cost)
I would like to extract only the name of the courses with students younger than 20. Will the query below do just that?
SELECT C.NAME
FROM Course C
INNER JOIN Registration
INNER JOIN Student S
WHERE CID = CourseID
AND SID = StudentID
AND Age < 20
GROUP BY C.NAME
I would also like to extract the number of students in each course having students younger than 20. Is it correct to do it as below?
SELECT count(S.NAME)
,C.NAME
FROM Student S
INNER JOIN Course C
INNER JOIN Registration
WHERE Age < 20
AND CID = CourseID
AND SID = StudentID
GROUP BY C.NAME
You are missing the ON part for the joins; without it, each one would just be a CROSS JOIN.
Your first query should look like this if you want just a distinct list of course names:
SELECT DISTINCT C.NAME
FROM Course C
INNER JOIN Registration R ON C.CID = R.CourseID
INNER JOIN Student S ON R.StudentID = S.SID
WHERE Age < 20
Your second query doesn't really need C.NAME in the select if you want to get just a count; keep it if you want to see which course each count belongs to.
SELECT count(*)
FROM Student S
INNER JOIN Registration R ON s.SID = R.StudentID
INNER JOIN Course C ON c.CID = R.CourseID
WHERE Age < 20
GROUP BY C.NAME
First join these tables, then group by Course's PK (CID), and add a HAVING condition to keep only the courses that have students younger than 20.
Then join the result back to the Course table to get the course name and the count of students in the course.
SELECT
T1.Name,
T2.StudentCount
FROM
Course T1
INNER JOIN (
SELECT
c.CID,
COUNT(s.SID) AS StudentCount
FROM
Course c
LEFT JOIN Registration r ON c.CID = r.CourseID
LEFT JOIN Student s ON s.SID = r.StudentID
GROUP BY c.CID
HAVING COUNT(IF(s.Age < 20, 1, NULL)) > 0
) T2 ON T1.CID = T2.CID
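The difference between filtering in WHERE and filtering in HAVING is easy to miss, so here is a rough sketch with Python's sqlite3 module standing in for MySQL. The rows are invented, and because SQLite has no IF() function, the conditional count is written with a CASE expression instead.

```python
import sqlite3

# Invented rows for the Student / Registration / Course tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (SID INTEGER PRIMARY KEY, Name TEXT, Surname TEXT, Age INTEGER);
CREATE TABLE Course (CID INTEGER PRIMARY KEY, Name TEXT, Cost INTEGER);
CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER);
INSERT INTO Student VALUES (1,'A','X',18), (2,'B','Y',25), (3,'C','Z',30), (4,'D','W',19);
INSERT INTO Course VALUES (1,'Math',100), (2,'Art',50), (3,'Law',80);
INSERT INTO Registration VALUES (1,1), (2,1), (4,1), (3,2), (4,3), (2,3);
""")

# WHERE filters rows before grouping: counts only the students under 20.
under_20 = conn.execute("""
SELECT c.Name, COUNT(*)
FROM Course c
JOIN Registration r ON c.CID = r.CourseID
JOIN Student s ON s.SID = r.StudentID
WHERE s.Age < 20
GROUP BY c.CID, c.Name
ORDER BY c.Name
""").fetchall()

# HAVING filters groups after grouping: counts all students, but only in
# courses that contain at least one student under 20.
all_students = conn.execute("""
SELECT c.Name, COUNT(s.SID)
FROM Course c
JOIN Registration r ON c.CID = r.CourseID
JOIN Student s ON s.SID = r.StudentID
GROUP BY c.CID, c.Name
HAVING SUM(CASE WHEN s.Age < 20 THEN 1 ELSE 0 END) > 0
ORDER BY c.Name
""").fetchall()

print(under_20)      # [('Law', 1), ('Math', 2)]
print(all_students)  # [('Law', 2), ('Math', 3)]
```

Art has no student under 20, so it drops out of both results; which of the two counts is wanted depends on how the question is read.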
More correctly, you should move the join conditions into the join statements themselves, by including them in the ON clause instead of the WHERE clause. While the results may not change in this instance, if you were to start including outer joins you would encounter difficulties.
SELECT count(S.NAME)
,C.NAME
FROM Student S
INNER JOIN Registration R
ON s.SID = R.StudentID
INNER JOIN Course C
ON c.CID = R.CourseID
WHERE Age < 20
GROUP BY C.NAME
There's a fiddle here showing it in action: http://sqlfiddle.com/#!9/c3b8f/1
Your first query will also produce the results you want, but again, you should move the join predicates to the join itself. Also, you don't need to perform the grouping just to get distinct values; MySQL has a keyword for that, DISTINCT. So rewritten, the first query would look like:
SELECT DISTINCT C.NAME
FROM Student S
INNER JOIN Registration R
ON s.SID = R.StudentID
INNER JOIN Course C
ON c.CID = R.CourseID
WHERE Age < 20
Again, the results are the same as what you have already but it is easier to 'read' and will put you in good stead when you move on to other queries. As it stands you have mixed implicit and explicit join syntax.
This fiddle demonstrates both queries: http://sqlfiddle.com/#!9/c3b8f/4
edit
I may have misinterpreted your original question - if you want the total number of students enrolled in a course with at least one student under 20, you can use a query like this:
select name, count(*)
from course c
inner join registration r
on c.cid = r.courseid
where exists (
select 1
from course cc
inner join registration r
on cc.cid = r.courseid
inner join student s
on s.sid = r.studentid
where cc.cid = c.cid
group by cc.cid
having min(s.age) < 20
)
group by name;
Again with the updated fiddle here: http://sqlfiddle.com/#!9/c3b8f/17

Count matched words from IN operator

I have this little MySQL query:
select t.title FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select k.keyword
FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where t.id = 166282
)
LIMIT 15
As you can see, it returns all titles from title that share at least one keyword with the movie with id 166282.
Now I have a problem, because I also want to count how many keywords were matched by the IN operator (let's say I want to see only titles that have 3 or more keywords in common). I tried something with aggregate functions, but everything failed, so I came here with my problem. Maybe somebody can give me some advice, or a code example.
I'm also not sure whether this "subquery way" is good, so if there are better options for solving my problem, I am open to any suggestions or tips.
Thank you!
#Edit
So after some problems, I have one more. This is my current query:
SELECT s.title,s.vote,s.rating,count(dk.key) as keywordCnt, count(dg.name) as genreCnt
FROM series s
INNER JOIN series_has_genre shg ON shg.series_id = s.id
INNER JOIN dict_genre dg ON dg.id = shg.dict_genre_id
INNER JOIN series_has_keyword shk ON shk.series_id = s.id
INNER JOIN dict_keyword dk ON dk.id = shk.dict_keyword_id
WHERE dk.key IN (
SELECT dki.key FROM series si
INNER JOIN series_has_keyword shki ON shki.series_id = si.id
INNER JOIN dict_keyword dki ON dki.id = shki.dict_keyword_id
WHERE si.title LIKE 'The Wire'
)
and dg.name IN (
SELECT dgo.name FROM series so
INNER JOIN series_has_genre shgo ON shgo.series_id = so.id
INNER JOIN dict_genre dgo ON dgo.id = shgo.dict_genre_id
WHERE so.title LIKE 'The Wire'
)
and s.production_year > 2000
GROUP BY s.title
ORDER BY s.vote DESC, keywordCnt DESC ,s.rating DESC, genreCnt DESC
LIMIT 5
The problem is, it is very, very, very slow. Any tips on what I should change to make it run faster?
Will this work for you:
select t.title, count(k.keyword) as keywordCount FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select ki.keyword
FROM title ti
inner join movie_keyword mki on mki.movie_id = ti.id
inner join keyword ki on ki.id = mki.keyword_id
where ti.id = 166282
) group by t.title
LIMIT 15
Note that I have changed the table names inside the nested query to avoid confusion.
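To keep only the titles that share 3 or more keywords, a HAVING clause on the count can be added after the GROUP BY. Below is a minimal sketch with Python's sqlite3 module in place of MySQL and invented rows. Two things in it are my own simplifications rather than part of the query above: it compares keyword ids in movie_keyword directly instead of joining the keyword table, and it excludes the reference movie itself. It also assumes movie_keyword holds no duplicate (movie, keyword) pairs.

```python
import sqlite3

REFERENCE_ID = 1  # hypothetical id playing the role of 166282

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE movie_keyword (movie_id INTEGER, keyword_id INTEGER);
INSERT INTO title VALUES (1, 'Ref'), (2, 'Close'), (3, 'Far');
INSERT INTO movie_keyword VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 1), (2, 2), (2, 3), (2, 4),
  (3, 1), (3, 4);
""")

shared = conn.execute("""
SELECT t.title, COUNT(*) AS keywordCount
FROM title t
JOIN movie_keyword mk ON mk.movie_id = t.id
WHERE mk.keyword_id IN (
    SELECT keyword_id FROM movie_keyword WHERE movie_id = ?
)
AND t.id <> ?
GROUP BY t.id, t.title
HAVING COUNT(*) >= 3          -- keep titles with 3 or more shared keywords
ORDER BY keywordCount DESC
""", (REFERENCE_ID, REFERENCE_ID)).fetchall()

print(shared)  # [('Close', 3)]
```

Grouping on t.id as well as t.title avoids merging two different movies that happen to have the same title.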

MySQL inner join fails

I've got this simple join statement and I'm pretty sure the syntax is correct. I looked at some tutorials and I can't find any difference between my code and the examples.
Here's the statement:
SELECT n.id nId, n.news_date, n.news_type,
p.id pId, p.title pTitle, p.file_path pPath,
s.id sId, s.title sTitle, s.content sContent,
v.id vId, v.title vTitle, v.url vUrl
FROM photo_news p, standard_news s, video_news v
INNER JOIN news n
ON p.news_id = n.id OR s.news_id = n.id OR v.news_id = n.id
ORDER BY n.news_date DESC
I get the following error:
Unknown column 's.news_id' in 'on clause'
I really don't know why this error is launched because the column 'news_id' exists in every table it has to exist.
And if I change the order in the ON clause (i.e. I start with p.news_id = n.news_id) I get the same error (unknown column p.news_id). So I think there's a problem with the aliases but I really don't have a clue.
Thanks for your help ;)
You are probably looking for something like this, which returns every news record that has data in at least one of the other tables.
In that case you need to use LEFT JOINs rather than OR in the JOIN conditions.
SELECT n.id nId, n.news_date, n.news_type,
p.id pId, p.title pTitle, p.file_path pPath,
s.id sId, s.title sTitle, s.content sContent,
v.id vId, v.title vTitle, v.url vUrl
FROM news n
LEFT OUTER JOIN photo_news p
ON n.id = p.news_id
LEFT OUTER JOIN standard_news s
ON n.id = s.news_id
LEFT OUTER JOIN video_news v
ON n.id = v.news_id
WHERE p.news_id IS NOT NULL
OR s.news_id IS NOT NULL
OR v.news_id IS NOT NULL
ORDER BY n.news_date DESC
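Here is a rough sketch of how that LEFT JOIN pattern behaves, using Python's sqlite3 module as a stand-in for MySQL. The rows are invented and the tables are cut down to a few columns; the fourth news record deliberately has no row in any of the three detail tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY, news_date TEXT, news_type TEXT);
CREATE TABLE photo_news (id INTEGER PRIMARY KEY, news_id INTEGER, title TEXT);
CREATE TABLE standard_news (id INTEGER PRIMARY KEY, news_id INTEGER, title TEXT);
CREATE TABLE video_news (id INTEGER PRIMARY KEY, news_id INTEGER, title TEXT);
INSERT INTO news VALUES
  (1, '2020-01-01', 'photo'), (2, '2020-01-02', 'video'),
  (3, '2020-01-03', 'standard'), (4, '2020-01-04', 'orphan');
INSERT INTO photo_news VALUES (1, 1, 'P');
INSERT INTO video_news VALUES (1, 2, 'V');
INSERT INTO standard_news VALUES (1, 3, 'S');
""")

rows = conn.execute("""
SELECT n.id, p.title, s.title, v.title
FROM news n
LEFT JOIN photo_news p ON n.id = p.news_id
LEFT JOIN standard_news s ON n.id = s.news_id
LEFT JOIN video_news v ON n.id = v.news_id
WHERE p.news_id IS NOT NULL
   OR s.news_id IS NOT NULL
   OR v.news_id IS NOT NULL
ORDER BY n.news_date DESC
""").fetchall()

for row in rows:
    print(row)
# (3, None, 'S', None)
# (2, None, None, 'V')
# (1, 'P', None, None)
```

Each news record comes back once, with NULL (None in Python) in the columns of the detail tables it has no row in, and record 4 is filtered out by the WHERE clause.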
Try this. You made a mistake in JOINing the tables.
For reference, you can see how multiple tables are JOINed together:
SELECT n.id nId, n.news_date, n.news_type,
p.id pId, p.title pTitle, p.file_path pPath,
s.id sId, s.title sTitle, s.content sContent,
v.id vId, v.title vTitle, v.url vUrl
FROM photo_news p INNER JOIN standard_news s
ON p.news_id = s.news_id
INNER JOIN video_news v
on s.news_id = v.news_id
INNER JOIN news n
on v.news_id = n.id
ORDER BY n.news_date DESC
You are mixing old-style and new-style joins. Just use the explicit join syntax. Your FROM clause should probably be:
FROM news n join
photo_news p
on p.news_id = n.id join
standard_news s
on s.news_id = n.id join
video_news v
on v.news_id = n.id
Using OR between join conditions is not typical.
The error appears because of the precedence rules that MySQL uses. As the documentation explains:
INNER JOIN and , (comma) are semantically equivalent in the absence of
a join condition: both produce a Cartesian product between the
specified tables (that is, each and every row in the first table is
joined to each and every row in the second table).
However, the precedence of the comma operator is less than of INNER
JOIN, CROSS JOIN, LEFT JOIN, and so on. If you mix comma joins with
the other join types when there is a join condition, an error of the
form Unknown column 'col_name' in 'on clause' may occur. Information
about dealing with this problem is given later in this section.
All that said, I'm not sure that this is really the query that you want. You are going to get a cartesian product of the different values from the different tables. You should probably ask another question with sample data and desired results, so someone can help you with the right query.
You are using deprecated join syntax mixed with supported syntax.
do yourself a favor and write those joins properly
http://dev.mysql.com/doc/refman/5.0/en/join.html
SELECT n.id nId, n.news_date, n.news_type,
p.id pId, p.title pTitle, p.file_path pPath,
s.id sId, s.title sTitle, s.content sContent,
v.id vId, v.title vTitle, v.url vUrl
FROM photo_news p
left/inner/right/"" join standard_news s on CONDITION
left/inner/right/"" join video_news v on CONDITION
INNER JOIN news n
ON p.news_id = n.id OR s.news_id = n.id OR v.news_id = n.id
ORDER BY n.news_date DESC
However, I am pretty sure you want to use a UNION or something similar:
SELECT ...
FROM (
SELECT ... FROM photo_news
UNION ALL
SELECT ... FROM standard_news
UNION ALL
SELECT ... FROM video_news
) all_news
INNER JOIN news n ON CONDITION

How to use aliases with MySQL LEFT JOIN

My original query does its joins in the WHERE clause rather than with JOIN. I realized that movies without any stars or genres were not being returned, so I think I have to do a LEFT JOIN in order to show every movie. Here is my original SQL:
SELECT *
FROM movies m, stars s, stars_in_movies sm, genres g, genres_in_movies gm
WHERE m.id = sm.movie_id
AND sm.star_id = s.id
AND gm.genre_id = g.id
AND gm.movie_id = m.id
AND m.title LIKE '%the%'
AND s.first_name LIKE '%Ben%'
ORDER BY m.title ASC
LIMIT 5;
I tried to do a LEFT JOIN on movies, but I'm definitely doing something wrong.
SELECT *
FROM movies m, stars s, stars_in_movies sm, genres g, genres_in_movies gm
LEFT JOIN movies m1 ON m1.id = sm.movie_id
LEFT JOIN movies m2 ON m2.id = gm.movie_id
AND sm.star_id = s.id
AND gm.genre_id = g.id
ORDER BY m.title ASC
LIMIT 5;
I get ERROR 1054 (42S22): Unknown column 'sm.movie_id' in 'on clause' so clearly I'm doing the join wrong, I just don't see what it is.
Don't mix the comma operator with JOIN - they have different precedence! There is even a warning about this in the manual:
However, the precedence of the comma operator is less than of INNER JOIN, CROSS JOIN, LEFT JOIN, and so on. If you mix comma joins with the other join types when there is a join condition, an error of the form Unknown column 'col_name' in 'on clause' may occur. Information about dealing with this problem is given later in this section.
Try this instead:
SELECT *
FROM movies m
LEFT JOIN (
stars s
JOIN stars_in_movies sm
ON sm.star_id = s.id
) ON m.id = sm.movie_id AND s.first_name LIKE '%Ben%'
LEFT JOIN (
genres g
JOIN genres_in_movies gm
ON gm.genre_id = g.id
) ON gm.movie_id = m.id
WHERE m.title LIKE '%the%'
ORDER BY m.title ASC
LIMIT 5;
You should put your conditions related to your JOINs in the same ON clause. However, for your above problem, you should use the following query:
SELECT *
FROM movies m
LEFT JOIN stars_in_movies sm ON sm.movie_id = m.id
LEFT JOIN stars s ON sm.star_id = s.id
LEFT JOIN genres_in_movies gm ON gm.movie_id = m.id
LEFT JOIN genres g ON gm.genre_id = g.id
ORDER BY m.title ASC
LIMIT 5;
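One thing worth checking with a chain like this is what an inner JOIN placed after a LEFT JOIN does to movies that have no stars. The sketch below uses Python's sqlite3 module in place of MySQL, with two invented movies, only one of which has a star, and leaves the genre tables out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE stars (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE stars_in_movies (star_id INTEGER, movie_id INTEGER);
INSERT INTO movies VALUES (1, 'The A'), (2, 'The B');
INSERT INTO stars VALUES (1, 'Ben');
INSERT INTO stars_in_movies VALUES (1, 1);
""")

# LEFT JOIN followed by an inner JOIN: the inner join discards the
# NULL-extended row for the movie that has no star.
mixed = conn.execute("""
SELECT m.title, s.first_name
FROM movies m
LEFT JOIN stars_in_movies sm ON sm.movie_id = m.id
JOIN stars s ON sm.star_id = s.id
ORDER BY m.title
""").fetchall()

# LEFT JOIN all the way down: every movie survives.
all_left = conn.execute("""
SELECT m.title, s.first_name
FROM movies m
LEFT JOIN stars_in_movies sm ON sm.movie_id = m.id
LEFT JOIN stars s ON sm.star_id = s.id
ORDER BY m.title
""").fetchall()

print(mixed)     # [('The A', 'Ben')]
print(all_left)  # [('The A', 'Ben'), ('The B', None)]
```

So if the goal is to show every movie, each join hanging off an outer-joined table needs to be an outer join too, or the pair needs to be nested inside one LEFT JOIN.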
This may be ugly, but here is a way that will work. Beware: it is a hack, and a lot of people warn against this kind of thing.
SELECT *
FROM movies m, stars_in_movies sm LEFT JOIN movies m1 ON m1.id = sm.movie_id, stars s
ORDER BY m.title ASC
LIMIT 5;
When using joins, you must join with the table that actually has the columns you are comparing.
SQL Join (inner join in MySQL)
select emp1.id,emp1.name,emp1.job from (select id, type as name, description as job from component_type as emp1)emp1
inner join
emp
on emp1.id=emp.id;
Left Join
select emp1.id,emp1.name,emp1.job from (select id, type as name, description as job from component_type as emp1 where id between '1' AND '5')emp1
left join
emp
on emp1.id=emp.id;
Right Join
select emp1.id,emp1.name,emp1.job from (select id, type as name, description as job from component_type as emp1)emp1
Right join
(select * from emp where id between '1' and '5')exe
on emp1.id=exe.id;
Using aliases to connect multiple tables without using JOIN:
select sum(s.salary_amount) as total_expenses_paid_to_all_department
from salary_mas_tbl s,dept_mas_tbl d
where s.salary_dept=d.dept_id;