I have the following two tables:
1. Movie Detail (Movie-ID, Movie_Name, Rating, Votes, Year)
2. Movie Genre (Movie-ID, Genre)
I am using the following query to join them and get the movie with the highest rating in each genre.
select Movie_Name,
max(Rating) as Rating,
Genre from movie_test
inner join movie_genre
where movie_test.Movie_ID = movie_genre.Movie_ID
group by Genre
In the output, Rating and Genre are correct, but Movie_Name is incorrect.
Can anyone suggest what changes I should make to get the correct movie name along with the rating and genre?
SELECT g.*, d.*
FROM MovieGenre g
INNER JOIN MovieDetail d
ON g.MovieID = d.MovieID
INNER JOIN
(
SELECT a.Genre, MAX(b.Rating) maxRating
FROM MovieGenre a
INNER JOIN MovieDetail b
ON a.MovieID = b.MovieID
GROUP BY a.Genre
) sub ON g.Genre = sub.Genre AND
d.rating = sub.maxRating
There is something wrong with your schema design. If a movie can have many genres and a genre can apply to many movies, it should be a three-table design.
MovieDetails Table
MovieID (PK)
MovieName
MovieRating
Genre Table
GenreID (PK)
GenreName
Movie_Genre Table
MovieID (FK) -- compound primary key with GenreID
GenreID (FK)
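A minimal sketch of the three-table layout above, run through Python's built-in sqlite3 module so it can be tried without a MySQL server. The column types and the sample rows are assumptions for illustration; only the table and column names come from the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE MovieDetails (
    MovieID     INTEGER PRIMARY KEY,
    MovieName   TEXT NOT NULL,
    MovieRating REAL
);
CREATE TABLE Genre (
    GenreID   INTEGER PRIMARY KEY,
    GenreName TEXT NOT NULL UNIQUE
);
CREATE TABLE Movie_Genre (
    MovieID INTEGER NOT NULL REFERENCES MovieDetails (MovieID),
    GenreID INTEGER NOT NULL REFERENCES Genre (GenreID),
    PRIMARY KEY (MovieID, GenreID)  -- one row per (movie, genre) pair
);
-- Invented sample rows
INSERT INTO MovieDetails VALUES (1, 'Movie A', 8.1), (2, 'Movie B', 7.4);
INSERT INTO Genre VALUES (1, 'Action'), (2, 'Comedy');
INSERT INTO Movie_Genre VALUES (1, 1), (1, 2), (2, 2);
""")

# A movie can carry several genres through the link table.
genres_of_a = [row[0] for row in conn.execute("""
    SELECT g.GenreName
    FROM Movie_Genre mg
    JOIN Genre g ON g.GenreID = mg.GenreID
    WHERE mg.MovieID = 1
    ORDER BY g.GenreName
""")]

# The compound primary key rejects a duplicate (movie, genre) pair.
try:
    conn.execute("INSERT INTO Movie_Genre VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The compound key on Movie_Genre is what keeps a later COUNT(*) per movie or per genre from being inflated by duplicate rows.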
This is a common MySQL problem: selecting columns that are neither aggregated nor listed in GROUP BY in an aggregate query. Other flavours of SQL do not let you do this and will raise an error (as does MySQL when the ONLY_FULL_GROUP_BY SQL mode is enabled).
When you run a query like yours, you are selecting non-aggregate columns in an aggregated group. Since many rows share the same genre, when you select Movie_Name it takes the value from an arbitrary row in each group, because there is no general algorithm to guess which row you want.
You might ask 'why is the row arbitrary? Couldn't it pick the one that max(Rating) belongs to?' But what about other aggregates, like avg(Rating)? Which row would it pick there? And what if two rows share the same max? So there cannot be a general rule for picking a row.
To solve a problem like this, you have to restructure your query, something like:
select mt.Movie_Name,
mt.Rating,
mg.Genre
from movie_test mt
inner join movie_genre mg on mt.Movie_ID = mg.Movie_ID
where mt.Rating = (select max(mt2.Rating)
from movie_test mt2
inner join movie_genre mg2 on mt2.Movie_ID = mg2.Movie_ID
where mg2.Genre = mg.Genre)
This will select the row with the rating being the same as the maximum rating for that genre, using a subquery.
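To see the correlated-subquery idea end to end, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The table and column names follow the question; the sample rows are invented, and Genre is assumed to live in movie_genre as the question describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_test (Movie_ID INTEGER PRIMARY KEY, Movie_Name TEXT, Rating REAL);
CREATE TABLE movie_genre (Movie_ID INTEGER, Genre TEXT);
-- Invented sample rows; Gamma and Delta tie for the best Comedy rating
INSERT INTO movie_test VALUES (1, 'Alpha', 9.0), (2, 'Beta', 7.5),
                              (3, 'Gamma', 8.0), (4, 'Delta', 8.0);
INSERT INTO movie_genre VALUES (1, 'Action'), (2, 'Action'), (2, 'Comedy'),
                               (3, 'Comedy'), (4, 'Comedy');
""")

# Keep a (movie, genre) row only when its rating equals the best rating in that genre.
best_per_genre = conn.execute("""
    SELECT mg.Genre, mt.Movie_Name, mt.Rating
    FROM movie_test mt
    JOIN movie_genre mg ON mt.Movie_ID = mg.Movie_ID
    WHERE mt.Rating = (SELECT MAX(mt2.Rating)
                       FROM movie_test mt2
                       JOIN movie_genre mg2 ON mt2.Movie_ID = mg2.Movie_ID
                       WHERE mg2.Genre = mg.Genre)
    ORDER BY mg.Genre, mt.Movie_Name
""").fetchall()

for row in best_per_genre:
    print(row)
```

With this data both tied Comedy movies come back, which is usually what you want; if you need exactly one row per genre you have to add your own tie-breaker.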
Query:
SELECT t.Movie_Name,
t.Rating,
g.Genre
FROM movie_test t
INNER JOIN movie_genre g ON t.Movie_ID = g.Movie_ID
WHERE t.Movie_ID = (SELECT t1.Movie_ID
FROM movie_test t1
INNER JOIN movie_genre g1 ON t1.Movie_ID = g1.Movie_ID
WHERE g1.Genre = g.Genre
ORDER BY t1.Rating DESC
LIMIT 1)
Related
I'm trying to find all books that have more than one genre using a GROUP BY statement and a subquery. However, it keeps returning the error "Subquery returns more than 1 row". This is what I have so far:
SELECT title
FROM book
WHERE 1 < (SELECT COUNT(genre) FROM genres GROUP BY book_id);
Here's an example:
SELECT b.title
FROM ( SELECT g.book_id
FROM genres g
GROUP
BY g.book_id
HAVING COUNT(1) > 1
) m
JOIN book b
ON b.id = m.book_id
The inline view m returns the values of book_id that appear more than once in the genres table. Depending on the uniqueness constraints, we might want to count distinct values of genre instead:
HAVING COUNT(DISTINCT g.genre) > 1
If we want to find books with exactly three related genres:
HAVING COUNT(DISTINCT g.genre) = 3
Once we have a list of book_id values, we can join to the book table. (The query assumes that book_id in genres is a foreign key reference to the id column in book table.)
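As a quick check of the difference between counting rows and counting distinct genres, here is a sketch with Python's sqlite3 module. The schema follows the assumption stated above (genres.book_id references book.id); the rows are invented, and one book deliberately has a duplicated genre row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE genres (book_id INTEGER REFERENCES book (id), genre TEXT);
-- Invented sample rows; book 2 has the same genre listed twice
INSERT INTO book VALUES (1, 'Dune'), (2, 'Emma'), (3, 'Ulysses');
INSERT INTO genres VALUES (1, 'SciFi'), (1, 'Adventure'),
                          (2, 'Romance'), (2, 'Romance'),
                          (3, 'Modernist');
""")

def titles(having):
    # `having` is a hard-coded fragment for this demo; never build SQL from user input this way.
    sql = f"""
        SELECT b.title
        FROM (SELECT g.book_id FROM genres g GROUP BY g.book_id HAVING {having}) m
        JOIN book b ON b.id = m.book_id
        ORDER BY b.title
    """
    return [row[0] for row in conn.execute(sql)]

by_row_count = titles("COUNT(1) > 1")               # counts rows, duplicates included
by_distinct = titles("COUNT(DISTINCT g.genre) > 1")  # counts different genres only
```

Here the row count also reports the book whose single genre was entered twice, while the DISTINCT version does not.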
You seem to want a correlated subquery:
SELECT b.title
FROM book b
WHERE 1 < (SELECT COUNT(*) FROM genres g WHERE g.book_id = b.book_id);
SELECT DISTINCT a.title
FROM book a, (SELECT bookid, COUNT(DISTINCT genre) AS genres FROM genres GROUP BY bookid) b
WHERE a.book_id = b.bookid AND b.genres > 1
hope it helps!
SELECT DISTINCT actor_id
FROM
(SELECT DISTINCT actor_id
FROM cast
WHERE NOT movie_id in
(SELECT movie_id
FROM cast
INNER JOIN actors
ON actors.ID = cast.actor_id
WHERE full_name = 'Kevin Bacon')) as A
WHERE movie_id in
(SELECT movie_id
FROM cast
WHERE actor_id in
(SELECT DISTINCT actor_id
FROM cast
WHERE movie_id in
(SELECT movie_id
FROM cast
INNER JOIN actors
ON actors.ID = cast.actor_id
WHERE full_name = 'Kevin Bacon')))
AND actor_id <> (SELECT id from actors
where full_name = "Kevin Bacon")
;
I keep getting the error "Unknown column 'movie_id' in 'IN/ALL/ANY subquery'", which I do not understand, as the blocks of this code work just fine when run separately.
What am I missing here?
Thanks!
I see an error in this simplified version of your query:
SELECT DISTINCT actor_id
FROM ( SELECT DISTINCT actor_id FROM ...) as A
WHERE movie_id in (...);
In the WHERE clause you are referencing "movie_id" from the derived table "A", but the inner query "( SELECT DISTINCT actor_id FROM ...)" does not select that column.
Also, there are so many nested queries; I'm sure this can be simplified if you describe in words what you want to get.
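The rule behind the error can be reproduced in a few lines. This is a sketch using Python's sqlite3 module rather than MySQL, so the error text differs, and the table is named casting here (a hypothetical name chosen to avoid the CAST keyword); the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE casting (actor_id INTEGER, movie_id INTEGER);
INSERT INTO casting VALUES (1, 10), (2, 10), (2, 20);
""")

# Fails: derived table A exposes only actor_id, so the outer WHERE cannot see movie_id.
try:
    conn.execute("""
        SELECT DISTINCT actor_id
        FROM (SELECT DISTINCT actor_id FROM casting) AS A
        WHERE movie_id IN (SELECT movie_id FROM casting)
    """).fetchall()
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Works: the derived table now also selects movie_id.
fixed = conn.execute("""
    SELECT DISTINCT actor_id
    FROM (SELECT DISTINCT actor_id, movie_id FROM casting) AS A
    WHERE movie_id IN (SELECT movie_id FROM casting WHERE actor_id = 1)
    ORDER BY actor_id
""").fetchall()

print(error_message)
print(fixed)
```

An outer query can only use the columns that the derived table's own SELECT list returns, no matter what the underlying table contains.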
Response to the commented goal
I did not find an easy answer for your goal, but I would go this way:
First, I would create a view of actor relationships based on the movies they acted in:
CREATE VIEW vw_relations AS (
SELECT
c1.actor_id AS actor1_id, a1.full_name AS actor1_fullname,
c1.movie_id, m.title AS movie_title,
c2.actor_id AS actor2_id, a2.full_name AS actor2_fullname
FROM
cast AS c1
INNER JOIN
cast AS c2 ON c2.movie_id = c1.movie_id AND c2.actor_id != c1.actor_id
INNER JOIN
movies AS m ON m.id = c1.movie_id
INNER JOIN
actors AS a1 ON a1.id = c1.actor_id
INNER JOIN
actors AS a2 ON a2.id = c2.actor_id
);
Now, if actor NAME1 appeared in the same movie as actor NAME2, the view will contain rows with the following values:
(id_name1, name1, movie_id, movie_title, id_name2, name2)
(id_name2, name2, movie_id, movie_title, id_name1, name1)
In other words, each relation appears twice, but this simplifies the next queries.
Now, based on your definition, the 1st degree of closeness to actor "Kevin Bacon" (actors who worked with him) can be obtained like this:
CREATE VIEW vw_1_degree_to_kb AS (
SELECT
actor1_id, actor1_fullname
FROM
vw_relations
WHERE
actor2_fullname = "Kevin Bacon"
);
Now, for the 2nd degree of closeness to actor "Kevin Bacon" (actors who worked with actors who worked with him), I would do this (and save it in a view too):
CREATE VIEW vw_2_degree_to_kb AS (
SELECT
actor1_id, actor1_fullname
FROM
vw_relations
WHERE
actor2_id IN (SELECT actor1_id FROM vw_1_degree_to_kb)
AND
actor1_id NOT IN (SELECT actor1_id FROM vw_1_degree_to_kb)
AND
actor1_fullname != "Kevin Bacon"
);
In other words, this view contains actors who worked with Kevin Bacon's 1st-degree actors but do not already belong to that set.
Furthermore, the 3rd degree of closeness looks like this:
CREATE VIEW vw_3_degree_to_kb AS (
SELECT
actor1_id, actor1_fullname
FROM
vw_relations
WHERE
actor2_id IN (SELECT actor1_id FROM vw_2_degree_to_kb)
AND
actor1_id NOT IN (SELECT actor1_id FROM vw_1_degree_to_kb)
AND
actor1_id NOT IN (SELECT actor1_id FROM vw_2_degree_to_kb)
AND
actor1_fullname != "Kevin Bacon"
);
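To check the view chain on a tiny data set, here is a sketch with Python's sqlite3 module. It is a cut-down version of the views above under a few assumptions: the movies table and title column are left out, the cast table is renamed casting (hypothetical, to avoid the CAST keyword), string literals use single quotes, DISTINCT is added so each actor appears once, and all actors other than Kevin Bacon are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actors (id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE casting (actor_id INTEGER, movie_id INTEGER);
-- Invented chain: Kevin Bacon - Ann (movie 10), Ann - Bob (20), Bob - Cy (30)
INSERT INTO actors VALUES (1, 'Kevin Bacon'), (2, 'Ann'), (3, 'Bob'), (4, 'Cy');
INSERT INTO casting VALUES (1, 10), (2, 10), (2, 20), (3, 20), (3, 30), (4, 30);

-- Every ordered pair of distinct actors who share a movie.
CREATE VIEW vw_relations AS
SELECT c1.actor_id AS actor1_id, a1.full_name AS actor1_fullname,
       c1.movie_id,
       c2.actor_id AS actor2_id, a2.full_name AS actor2_fullname
FROM casting c1
JOIN casting c2 ON c2.movie_id = c1.movie_id AND c2.actor_id != c1.actor_id
JOIN actors a1 ON a1.id = c1.actor_id
JOIN actors a2 ON a2.id = c2.actor_id;

CREATE VIEW vw_1_degree_to_kb AS
SELECT DISTINCT actor1_id, actor1_fullname
FROM vw_relations
WHERE actor2_fullname = 'Kevin Bacon';

CREATE VIEW vw_2_degree_to_kb AS
SELECT DISTINCT actor1_id, actor1_fullname
FROM vw_relations
WHERE actor2_id IN (SELECT actor1_id FROM vw_1_degree_to_kb)
  AND actor1_id NOT IN (SELECT actor1_id FROM vw_1_degree_to_kb)
  AND actor1_fullname != 'Kevin Bacon';
""")

first_degree = [r[0] for r in conn.execute(
    "SELECT actor1_fullname FROM vw_1_degree_to_kb ORDER BY actor1_fullname")]
second_degree = [r[0] for r in conn.execute(
    "SELECT actor1_fullname FROM vw_2_degree_to_kb ORDER BY actor1_fullname")]
```

In this toy data Cy only shares a movie with Bob, so Cy would first show up in the 3rd-degree view.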
For each genre of movie, I want to find the N actors who have played in the most movies of that genre.
I have done this:
select genre.genre_name,actor.actor_id,count(genre.genre_name) from genre
inner join movie_has_genre on movie_has_genre.genre_id=genre.genre_id
inner join movie on movie_has_genre.movie_id=movie.movie_id
inner join role on movie.movie_id=role.movie_id
inner join actor on actor.actor_id=role.actor_id
group by genre.genre_name,actor.actor_id;
This gives, for each genre, how many movies of that genre every actor has played in. Now I want to find, for each genre, the actor who has played in the most movies of that genre.
Tables and their columns:
actor(actor_id,name)
role(actor_id,movie_id)
movie(movie_id,title)
movie_has_genre(movie_id,genre_id)
genre(genre_id,genre_name)
Also the result should be something like this:
Action 22591 7
Horror 25863 3
Horror 24867 3
Comedy 23476 2
Drama 14536 1
Drama 19634 1
Drama 17563 1
Man, here is what I'd do next (assuming your code works):
-- Notice this is your code with some aliases, nothing else.
-- Just to make my job easier.
create view frequency as
select genre.genre_name as genre_name,
actor.actor_id as actor_id,
count(genre.genre_name) as freq
from genre
inner join movie_has_genre on movie_has_genre.genre_id=genre.genre_id
inner join movie on movie_has_genre.movie_id=movie.movie_id
inner join role on movie.movie_id=role.movie_id
inner join actor on actor.actor_id=role.actor_id
group by genre.genre_name,actor.actor_id;
-- And this is my proposal
-- Take the max frequency per each category
-- and find the guy who possesses it (maybe 2 or more...)
select tbl1.genre_name, tbl1.actor_id
from frequency as tbl1 inner join
(
-- The max frequency in a genre.
select f.genre_name,
max(f.freq) as max_freq
from frequency f
group by(genre_name)
) as tbl2 on (tbl1.genre_name = tbl2.genre_name)
where tbl1.freq = tbl2.max_freq;
There is one catch: it may return more than one actor per genre if there's a tie. How to decide the winner in that case, I leave to you. It might be wrong, though I don't think so; we're both learning! Hope this helps.
You need to use the MAX() function. Some SQL implementations (such as Oracle) allow you to do this: SELECT MAX(COUNT(whatever)) but MySQL isn't one of them.
One way to do what you want is this:
select genre_name, max(genrecount) as maxcount
from (
select genre.genre_name, role.actor_id, count(genre.genre_name) as genrecount
from genre
inner join movie_has_genre on movie_has_genre.genre_id=genre.genre_id
inner join movie on movie_has_genre.movie_id=movie.movie_id
inner join role on movie.movie_id=role.movie_id
group by genre.genre_name,role.actor_id
) as topactor
group by genre_name
This runs the outer SELECT on the table derived from the inner SELECT and returns the highest count per genre. To see which actor holds that count, join this result back to the per-actor counts on genre_name and the count, as in the previous answer.
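Since the question asks for the top N actors per genre and not only the single best, here is one possible sketch using a correlated count, run with Python's sqlite3 module. The table names follow the question; the frequency view mirrors the one in the earlier answer but skips the movie and actor tables, and the sample rows and the top_actors helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genre (genre_id INTEGER PRIMARY KEY, genre_name TEXT);
CREATE TABLE movie_has_genre (movie_id INTEGER, genre_id INTEGER);
CREATE TABLE role (actor_id INTEGER, movie_id INTEGER);
-- Invented sample rows
INSERT INTO genre VALUES (1, 'Action'), (2, 'Drama');
INSERT INTO movie_has_genre VALUES (100, 1), (101, 1), (102, 1), (103, 2);
INSERT INTO role VALUES (7, 100), (7, 101), (7, 102), (8, 100), (8, 101),
                        (9, 100), (8, 103), (9, 103);

-- Movies per actor per genre.
CREATE VIEW frequency AS
SELECT g.genre_name, r.actor_id, COUNT(*) AS freq
FROM genre g
JOIN movie_has_genre mg ON mg.genre_id = g.genre_id
JOIN role r ON r.movie_id = mg.movie_id
GROUP BY g.genre_name, r.actor_id;
""")

def top_actors(n):
    # Keep a row when fewer than n actors in the same genre have a strictly higher count.
    return conn.execute("""
        SELECT f.genre_name, f.actor_id, f.freq
        FROM frequency f
        WHERE (SELECT COUNT(*)
               FROM frequency f2
               WHERE f2.genre_name = f.genre_name AND f2.freq > f.freq) < ?
        ORDER BY f.genre_name, f.freq DESC, f.actor_id
    """, (n,)).fetchall()

top1 = top_actors(1)
top2 = top_actors(2)
```

Ties are kept, so a genre can return more than N rows when several actors share the same count.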
I have the following tables:
Movie ( mID, title, year, director )
Reviewer ( rID, name )
Rating ( rID, mID, stars, ratingDate )
What I want to do is get each director's name along with the title of the highest-rated movie he has directed.
For example, if Steven Spielberg has directed two movies (namely A and B) which have got 3-star and 5-star ratings respectively, then the query must show Steven Spielberg and B (the movie with the highest rating).
PS: I only need help with the approach. I hope I made myself clear. Please ask if any more information or explanation is needed.
Why don't you try this:
SELECT TITLE,DIRECTOR FROM MOVIE,
(SELECT MAX(STARS),mID FROM RATING GROUP BY mID) R
WHERE MOVIE.mID=R.mID
Set up a subselect to get each director's highest rating, then join back to it to pick out the matching movie:
SELECT DISTINCT Movie.director, Movie.title, Rating.stars
FROM Movie
INNER JOIN Rating
ON Movie.mID = Rating.mID
INNER JOIN
(
SELECT director, MAX(stars) AS MaxRating
FROM Movie
INNER JOIN Rating
ON Movie.mID = Rating.mID
GROUP BY director
) Sub1
ON Movie.director = Sub1.director
AND Rating.stars = Sub1.MaxRating
However, I presume you will need more details. You do not appear to use the Reviewer table at the moment, and I presume that one movie could have several reviewers who gave different ratings. If so, you would want to use the above as a subselect to join back against the Rating table (matching on the title and stars), and from that to the Reviewer table.
Here you go
SELECT q.* FROM (SELECT m.*,MAX(r.`stars`) AS maxrating FROM `movie` m
INNER JOIN `rating` r ON (m.`mID` = r.`mID` )
GROUP BY r.`mID` ORDER BY maxrating DESC ) q GROUP BY q.director
ORDER BY q.maxrating DESC
And I am sure this question is taken from the quiz of the DB class provided by Stanford University.
Here is your fiddle
Another way to do that is:
select m.title, max(r.stars) as stars
from rating r
inner join movie m on r.mid = m.mid
group by r.mid
order by m.title
This code should suffice:
select distinct m1.director, m1.title, r1.stars from movie m1
join rating r1 on m1.mID = r1.mID
left join (
select m2.director, r2.stars from movie m2
join rating r2 on m2.mID = r2.mID
) s on m1.director = s.director and r1.stars < s.stars
where s.stars is null and m1.director is not null;
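The left-join trick above keeps a rating row only when no higher rating exists for the same director. Here is a sketch that runs it with Python's sqlite3 module; the table layout follows the question, and the sample rows (including the director 'Jane Doe' and the movie with no director) are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (mID INTEGER PRIMARY KEY, title TEXT, year INTEGER, director TEXT);
CREATE TABLE Rating (rID INTEGER, mID INTEGER, stars INTEGER, ratingDate TEXT);
-- Invented sample rows
INSERT INTO Movie VALUES (1, 'A', 2001, 'Steven Spielberg'),
                         (2, 'B', 2005, 'Steven Spielberg'),
                         (3, 'C', 2010, 'Jane Doe'),
                         (4, 'D', 2012, NULL);
INSERT INTO Rating VALUES (201, 1, 3, NULL), (202, 2, 5, NULL), (203, 2, 4, NULL),
                          (201, 3, 2, NULL), (202, 4, 5, NULL);
""")

# s matches every strictly higher rating for the same director;
# rows where no such match exists (s.stars IS NULL) are the director's best.
best_by_director = conn.execute("""
    SELECT DISTINCT m1.director, m1.title, r1.stars
    FROM Movie m1
    JOIN Rating r1 ON m1.mID = r1.mID
    LEFT JOIN (SELECT m2.director, r2.stars
               FROM Movie m2
               JOIN Rating r2 ON m2.mID = r2.mID) s
           ON m1.director = s.director AND r1.stars < s.stars
    WHERE s.stars IS NULL AND m1.director IS NOT NULL
    ORDER BY m1.director
""").fetchall()

for row in best_by_director:
    print(row)
```

The extra "director is not null" filter matters because a NULL director never matches anything in the join and would otherwise always survive the IS NULL test.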
I'm trying to get the movie_name of every movie that belongs to two genres (categories).
Example data in the tags table:
movie_id genre
4 Action
4 Comedy
SQL:
SELECT movies.movie_name
FROM movies
INNER JOIN tags
ON movies.movie_id = tags.movie_id
WHERE tags.genre = 'Comedy'
AND tags.genre = 'Action'
This should return the movie_name of movie_id 4.
Instead it returns zero results, when I know there should be three results with my test data. Am I writing the query wrong?
SELECT movies.movie_name
FROM movies
INNER JOIN tags
ON movies.movie_id = tags.movie_id
WHERE tags.genre IN ('Comedy','Action')
GROUP BY movies.movie_name
HAVING COUNT(*) = 2
If there is no unique constraint on genre for each movie, then you need to add DISTINCT:
SELECT movies.movie_name
FROM movies
INNER JOIN tags
ON movies.movie_id = tags.movie_id
WHERE tags.genre IN ('Comedy','Action')
GROUP BY movies.movie_name
HAVING COUNT(DISTINCT tags.genre) = 2
SQLFiddle Demo (the example data is different, but the idea is the same)
Within a single row, tags.genre cannot be BOTH 'Comedy' and 'Action' at the same time. You need an IN clause like this:
SELECT fields FROM tables WHERE tags.genre IN ('Comedy', 'Action')
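To compare the variants discussed in this thread side by side, here is a sketch with Python's sqlite3 module. The table names follow the question; the movie names and rows are invented, one movie deliberately has 'Comedy' tagged twice, and movie_names is a hypothetical helper for this demo only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, movie_name TEXT);
CREATE TABLE tags (movie_id INTEGER, genre TEXT);
-- Invented sample rows; movie 5 has Comedy listed twice
INSERT INTO movies VALUES (4, 'Movie Four'), (5, 'Movie Five'), (6, 'Movie Six');
INSERT INTO tags VALUES (4, 'Action'), (4, 'Comedy'),
                        (5, 'Comedy'), (5, 'Comedy'),
                        (6, 'Action');
""")

def movie_names(tail):
    # `tail` is a hard-coded SQL fragment, used only to compare the variants below.
    sql = ("SELECT DISTINCT movies.movie_name FROM movies "
           "INNER JOIN tags ON movies.movie_id = tags.movie_id " + tail +
           " ORDER BY movies.movie_name")
    return [row[0] for row in conn.execute(sql)]

in_list = "WHERE tags.genre IN ('Comedy', 'Action') "

# No single tags row has both values, so AND matches nothing.
and_both = movie_names("WHERE tags.genre = 'Comedy' AND tags.genre = 'Action'")
# IN alone returns movies tagged with either genre.
in_only = movie_names(in_list)
# Counting rows is fooled by the duplicated Comedy tag.
count_rows = movie_names(in_list + "GROUP BY movies.movie_name HAVING COUNT(*) = 2")
# Counting distinct genres returns only movies tagged with both.
count_distinct = movie_names(
    in_list + "GROUP BY movies.movie_name HAVING COUNT(DISTINCT tags.genre) = 2")
```

On this data the IN clause by itself is not enough; it is the GROUP BY with HAVING COUNT(DISTINCT ...) that narrows the result to movies carrying both genres.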