The IMDB database has the following tables:
actors(id, first_name, last_name, gender)
directors(id, first_name, last_name)
directors_genres(director_id, genre, prob)
movies(id, name, year, rank)
movies_directors(director_id, movie_id)
roles(actor_id, movie_id, role)
movies_genres(movie_id, genre)
a) Write a query that lists the female actors who appeared in a movie during the 90s (1990-1999) that was rated higher than 8.5.
b) Write a query that lists all actors who were in a movie rated lower than 3.0 two or more times. List the name of the actor, the movie, and each rating, ordered ascending by the actor's last name, then first name.
c) Write a query that lists all actors who have been in two or more movies of different genres. List their name, movie and their respective genres.
My answers:
a)
SELECT actors.firstname
from ((roles inner join movies on roles.mid=movies.id)
inner join actors on actors.id=roles.aid)
where (movies.year between 1990 and 1999)
and
(movies.rank >= 8.5)
Is this correct? Can anyone help with how to approach the other queries? Thanks in advance.
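You forgot the gender filter, and the rating must be higher than 8.5 (not higher than or equal to). The column names should also match the schema (first_name, actor_id, movie_id).
SELECT DISTINCT actors.first_name, actors.last_name
from roles inner join movies on roles.movie_id = movies.id
inner join actors on actors.id = roles.actor_id
where movies.year between 1990 and 1999
and movies.rank > 8.5
and actors.gender = 'F';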
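To check a query like this against known data, a small in-memory SQLite database is enough. The sketch below is only an illustration: the sample rows are made up, the gender code 'F' is an assumption about how the data is stored, and SQLite stands in for whatever engine the course uses.

```python
import sqlite3

# In-memory database with only the three tables the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actors (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, gender TEXT);
CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT, year INTEGER, rank REAL);
CREATE TABLE roles  (actor_id INTEGER, movie_id INTEGER, role TEXT);

-- Hypothetical sample rows, invented for this sketch.
INSERT INTO actors VALUES (1, 'Ann', 'Able', 'F'), (2, 'Bob', 'Baker', 'M'), (3, 'Cy', 'Cole', 'F');
INSERT INTO movies VALUES (10, 'Hit', 1995, 9.0), (11, 'Borderline', 1998, 8.5), (12, 'Late Hit', 2001, 9.1);
INSERT INTO roles VALUES (1, 10, 'Lead'), (1, 10, 'Narrator'), (2, 10, 'Villain'),
                         (3, 11, 'Lead'), (3, 12, 'Lead');
""")

# DISTINCT collapses actors who have several roles in qualifying movies.
query = """
SELECT DISTINCT a.first_name, a.last_name
FROM actors a
JOIN roles  r ON r.actor_id = a.id
JOIN movies m ON m.id = r.movie_id
WHERE a.gender = 'F'
  AND m.year BETWEEN 1990 AND 1999
  AND m.rank > 8.5
ORDER BY a.last_name, a.first_name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only Ann qualifies here: Cy's 1998 movie is rated exactly 8.5 and her other movie is from 2001, and Bob is filtered out by gender.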
P.S. Is this your school work?
SELECT DISTINCT
actors.first_name, actors.last_name
FROM actors
JOIN roles ON roles.actor_id = actors.id
JOIN
movies ON movies.id = roles.movie_id
WHERE actors.gender = 'F' AND movies.rank > 8.5
AND movies.year BETWEEN 1990 AND 1999;
Add role to actors table
actors(id, first_name, last_name, gender,role)
movies(id, name,year, rank)
One more variant:
SELECT actors.first_name, actors.last_name FROM actors WHERE actors.gender = 'F' AND id IN(
SELECT actor_id FROM roles WHERE movie_id IN(
SELECT id FROM movies WHERE (movies.year BETWEEN 1990 and 1999)
and (movies.rank > 8.5)));
I stumbled into a problem using an IMDb dataset that I can't seem to figure out the answer to. The question is:
Create a table that contains the average count of genres per movie for
each genre
We have two tables: Movies: id, name; Genres: id (movieId), genre
Movies:
id,name
1,Toy Story
2,Jumanji
3,Grumpier Old Men
4,Waiting to Exhale
5,Father of the Bride Part II
6,Heat
Genres:
id,genre
1,Animation
1,Children's
1,Comedy
2,Adventure
2,Children's
2,Fantasy
3,Comedy
3,Romance
4,Comedy
4,Drama
5,Comedy
6,Action
6,Crime
6,Thriller
I may be interpreting the question incorrectly, but shouldn't the output be 3 columns: genre, movie, and count?
My answer would start along the lines of:
SELECT genre, name, AVG(COUNT(*)) FROM movies
JOIN genres ON genres.id=movies.id
GROUP BY name;
Any ideas on how you would interpret the question and answer?
Well, I would start with the number of genres per movie:
select id, count(*) as num_genres
from genres g
group by id
Then, I would "attach" this information to the genres information. And aggregate and average:
select g.genre, avg(m.num_genres)
from genres g join
(select id, count(*) as num_genres
from genres g
group by id
) m
on g.id = m.id
group by g.genre;
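As a sanity check, the same two-step query can be run on the sample rows from the question. This is just a sketch using Python's built-in sqlite3 module; the table is loaded by hand and the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genres (id INTEGER, genre TEXT)")
# Sample rows copied from the question; id is the movie id.
conn.executemany("INSERT INTO genres VALUES (?, ?)", [
    (1, "Animation"), (1, "Children's"), (1, "Comedy"),
    (2, "Adventure"), (2, "Children's"), (2, "Fantasy"),
    (3, "Comedy"), (3, "Romance"),
    (4, "Comedy"), (4, "Drama"),
    (5, "Comedy"),
    (6, "Action"), (6, "Crime"), (6, "Thriller"),
])

# Step 1 (subquery): number of genres per movie.
# Step 2 (outer query): for each genre, average that number
# over the movies carrying the genre.
query = """
SELECT g.genre, AVG(m.num_genres) AS avg_genres
FROM genres g
JOIN (SELECT id, COUNT(*) AS num_genres
      FROM genres
      GROUP BY id) m ON g.id = m.id
GROUP BY g.genre
ORDER BY g.genre
"""
avg_by_genre = dict(conn.execute(query).fetchall())
for genre, avg in avg_by_genre.items():
    print(genre, avg)
```

Comedy appears in movies 1, 3, 4 and 5, which have 3, 2, 2 and 1 genres, so its average is 2.0; Animation only appears in movie 1, so its average is 3.0.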
I agree with Gordon: first get the number of genres per movie:
select id, count(*) as num_genres
from genres g
group by id
But the overall average number of genres per movie should be:
SELECT AVG(num_genres)
FROM (
SELECT id, count(*) as num_genres
FROM genres g
GROUP BY id
) t
I'm meant to use an implicit join to get all the movies with Angelina Jolie as director or where she stars. Here's what I have so far:
SELECT DISTINCT title, relYear
FROM actor,movie
WHERE director ='Angelina Jolie' OR aID in (SELECT aID
FROM actor
WHERE fName='Angelina' and surname='Jolie'
Here are the relevant tables
movie(id, title, relYear, category, runTime, director,
studioName, description, rating)
actor(aID, fName, surname, gender)
stars(movieID, actorID)
movGenre(movieID, genre)
This returns all of the movies; I think that's because of aID in (SELECT aID ...).
I don't know how to do this without using an explicit join on three tables. Is the subquery even the most efficient approach? Thanks.
This is what I would do on MSSQL; I think it should work on MySQL too.
Select title, relYear FROM movie WHERE director = 'Angelina Jolie' OR id IN
(SELECT movieId FROM stars inner join actor ON stars.actorId = actor.aID WHERE
actor.fName = 'Angelina' AND surname = 'Jolie')
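Since the question asks for an implicit join, here is a sketch of the same idea with a comma join inside the subquery. The original attempt returns every movie because actor and movie are listed together with no condition linking them, so each movie is paired with the Angelina Jolie row in actor. The sample rows and the reduced movie table (only id, title, relYear, director) are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced version of the movie table: only the columns the query touches.
CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, relYear INTEGER, director TEXT);
CREATE TABLE actor (aID INTEGER PRIMARY KEY, fName TEXT, surname TEXT, gender TEXT);
CREATE TABLE stars (movieID INTEGER, actorID INTEGER);

-- Hypothetical sample rows.
INSERT INTO movie VALUES (1, 'Film A', 2014, 'Angelina Jolie'),
                         (2, 'Film B', 2005, 'Someone Else'),
                         (3, 'Film C', 2010, 'Someone Else');
INSERT INTO actor VALUES (1, 'Angelina', 'Jolie', 'F'), (2, 'Brad', 'Pitt', 'M');
INSERT INTO stars VALUES (2, 1), (2, 2), (3, 2);
""")

# The subquery uses an implicit (comma) join: stars and actor are tied
# together by the WHERE condition s.actorID = a.aID.
query = """
SELECT title, relYear
FROM movie
WHERE director = 'Angelina Jolie'
   OR id IN (SELECT s.movieID
             FROM stars s, actor a
             WHERE s.actorID = a.aID
               AND a.fName = 'Angelina'
               AND a.surname = 'Jolie')
ORDER BY title
"""
result = conn.execute(query).fetchall()
print(result)
```

Keeping the director test outside the subquery means a movie she directed still shows up even when it has no rows in stars, as with Film A here.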
For each movie genre, I want to find the N actors who have played in the most movies
of that genre.
I have done this:
select genre.genre_name,actor.actor_id,count(genre.genre_name) from genre
inner join movie_has_genre on movie_has_genre.genre_id=genre.genre_id
inner join movie on movie_has_genre.movie_id=movie.movie_id
inner join role on movie.movie_id=role.movie_id
inner join actor on actor.actor_id=role.actor_id
group by genre.genre_name,actor.actor_id;
This gives, for each genre, how many movies of that genre every actor has played in. Now I want to find, for each genre, the actor who has played in the most movies of that genre.
Tables and their columns:
actor(actor_id,name)
role(actor_id,movie_id)
movie(movie_id,title)
movie_has_genre(movie_id,genre_id)
genre(genre_id,genre_name)
Also the result should be something like this:
Action 22591 7
Horror 25863 3
Horror 24867 3
Comedy 23476 2
Drama 14536 1
Drama 19634 1
Drama 17563 1
Here's what I'd do next (assuming your code works):
-- Notice this is your code with some aliases, nothing else.
-- Just to make my job easier.
create view frequency as
select genre.genre_name as genre_name,
actor.actor_id as actor_id,
count(genre.genre_name) as freq
from genre
inner join movie_has_genre on movie_has_genre.genre_id=genre.genre_id
inner join movie on movie_has_genre.movie_id=movie.movie_id
inner join role on movie.movie_id=role.movie_id
inner join actor on actor.actor_id=role.actor_id
group by genre.genre_name,actor.actor_id;
-- And this is my proposal:
-- take the max frequency in each genre
-- and find the actor who has it (maybe 2 or more...)
select tbl1.genre_name, tbl1.actor_id, tbl1.freq
from frequency as tbl1 inner join
(
-- The max frequency in a genre.
select f.genre_name,
max(f.freq) as max_freq
from frequency f
group by f.genre_name
) as tbl2 on (tbl1.genre_name = tbl2.genre_name)
where tbl1.freq = tbl2.max_freq;
There's one catch: it may return more than one actor per genre if there's a tie. How to pick a winner in that case I leave to you. I don't think it's wrong, but we're both learning! Hope this helps.
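The max-per-genre approach can be tried on a few rows with SQLite. In this sketch a common table expression replaces the view, the actor and movie tables are left out because role already carries both ids, and COUNT(DISTINCT movie_id) is used so that two roles in one movie count once. All of those are choices made for the illustration, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genre (genre_id INTEGER PRIMARY KEY, genre_name TEXT);
CREATE TABLE movie_has_genre (movie_id INTEGER, genre_id INTEGER);
CREATE TABLE role (actor_id INTEGER, movie_id INTEGER);

-- Hypothetical sample rows.
INSERT INTO genre VALUES (1, 'Action'), (2, 'Drama');
INSERT INTO movie_has_genre VALUES (100, 1), (101, 1), (102, 2), (103, 2);
INSERT INTO role VALUES (7, 100), (7, 101), (8, 100), (8, 102), (9, 103);
""")

query = """
WITH frequency AS (
    -- Movies per actor within each genre.
    SELECT g.genre_name, r.actor_id, COUNT(DISTINCT r.movie_id) AS freq
    FROM genre g
    JOIN movie_has_genre mg ON mg.genre_id = g.genre_id
    JOIN role r ON r.movie_id = mg.movie_id
    GROUP BY g.genre_name, r.actor_id
)
SELECT f.genre_name, f.actor_id, f.freq
FROM frequency f
JOIN (SELECT genre_name, MAX(freq) AS max_freq
      FROM frequency
      GROUP BY genre_name) best
  ON best.genre_name = f.genre_name
 AND best.max_freq = f.freq
ORDER BY f.genre_name, f.actor_id
"""
top_actors = conn.execute(query).fetchall()
print(top_actors)
```

Actor 7 wins Action with two movies, while actors 8 and 9 tie in Drama with one movie each, so both are returned. This covers the "top 1 with ties" case; a true top N per genre would need ranking on top of the same frequency table.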
You need to use the MAX() function. Some SQL implementations (such as Oracle) allow you to do this: SELECT MAX(COUNT(whatever)) but MySQL isn't one of them.
One way to do what you want is this:
select genre_name, max(genrecount) as max_count
from (
select genre.genre_name, role.actor_id, count(genre.genre_name) as genrecount
from genre
inner join movie_has_genre on movie_has_genre.genre_id=genre.genre_id
inner join movie on movie_has_genre.movie_id=movie.movie_id
inner join role on movie.movie_id=role.movie_id
group by genre.genre_name, role.actor_id
) as topactor
group by genre_name;
This runs the outer SELECT on the table derived from the inner SELECT and returns the highest count per genre. The outer query cannot reliably return actor_id as well; to get the actor(s), join this result back to the per-actor counts on genre_name and the count.
I have a DB that tracks movies and actors, with the following tables:
Person(ID, Name)
Movie(ID, Name)
Actors(ID, PersonID, MovieID)
ID fields are all primary keys and PersonID & MovieID are foreign keys.
I want to select all the movies that Tom Cruise and Brad Pitt played in.
I tried to do the following:
SELECT * FROM Person, Actors
WHERE Person.ID = Actors.ActorID AND
(Person.Name= 'Tom Cruise' or Person.Name= 'Brad Pitt')
GROUP BY Actors.MovieID
HAVING COUNT(Actors.MovieID) > 1
This doesn't work, because the person name is not unique (and I can't change it to unique). If I have another actor named Brad Pitt and the two Brad Pitts played in the same movie it would return as a result too.
How can I do it?
NOTE: The number of actors I am querying about can change. I might need a movie with 10 actors that all played in it.
Do an inner join between the tables as below (note that the foreign key in Actors is PersonID):
SELECT m.Name as MovieName
FROM Movie m
inner join Actors a
on m.ID = a.MovieID
inner join Person p
on p.ID = a.PersonID
and p.Name in ('Tom Cruise','Brad Pitt')
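The join by itself returns movies that feature either name. To require all of the names, one common option is to group by movie and compare the number of distinct matched names with the number of names asked for. The sketch below assumes that approach; the helper name movies_with_all and the sample rows are hypothetical, and SQLite is used only so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Movie  (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Actors (ID INTEGER PRIMARY KEY, PersonID INTEGER, MovieID INTEGER);

-- Hypothetical sample rows; persons 2 and 3 share the same name.
INSERT INTO Person VALUES (1, 'Tom Cruise'), (2, 'Brad Pitt'), (3, 'Brad Pitt');
INSERT INTO Movie  VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
INSERT INTO Actors VALUES (1, 1, 1), (2, 2, 1),   -- Movie A: Tom + Brad
                          (3, 2, 2), (4, 3, 2),   -- Movie B: two different Brads
                          (5, 1, 3);              -- Movie C: Tom only
""")

def movies_with_all(conn, names):
    """Return movies whose cast includes every name in `names`."""
    wanted = sorted(set(names))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT m.Name
        FROM Movie m
        JOIN Actors a ON a.MovieID = m.ID
        JOIN Person p ON p.ID = a.PersonID
        WHERE p.Name IN ({placeholders})
        GROUP BY m.ID, m.Name
        -- COUNT(DISTINCT ...) makes two people with the same name count once.
        HAVING COUNT(DISTINCT p.Name) = ?
        ORDER BY m.Name
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

both = movies_with_all(conn, ["Tom Cruise", "Brad Pitt"])
print(both)
```

Movie B is excluded even though it has two matching rows, because both rows carry the same name. The list of names can be any length, which fits the note about querying for 10 actors. If a specific Brad Pitt is meant, filtering on Person.ID instead of the name would be the safer choice.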
I have the following two tables:
1. Movie Detail (Movie-ID, Movie_Name, Rating, Votes, Year)
2. Movie Genre (Movie-ID, Genre)
I am using the following query to perform the join and get the movie with the highest rating in each genre.
select Movie_Name,
max(Rating) as Rating,
Genre from movie_test
inner join movie_genre
where movie_test.Movie_ID = movie_genre.Movie_ID
group by Genre
In the output, Rating and Genre are correct but the Movie_Name is incorrect.
Can anyone suggest what changes I should make to get the correct movie name along with the rating and genre?
SELECT g.*, d.*
FROM MovieGenre g
INNER JOIN MovieDetail d
ON g.MovieID = d.MovieID
INNER JOIN
(
SELECT a.Genre, MAX(b.Rating) maxRating
FROM MovieGenre a
INNER JOIN MovieDetail b
ON a.MovieID = b.MovieID
GROUP BY a.Genre
) sub ON g.Genre = sub.Genre AND
d.rating = sub.maxRating
There is something wrong with your schema design. If a movie can have many genres and a genre can apply to many movies, it should be a three-table design.
MovieDetails Table
MovieID (PK)
MovieName
MovieRating
Genre Table
GenreID (PK)
GenreName
Movie_Genre Table
MovieID (FK) -- compound primary key with GenreID
GenreID (FK)
This is a common MySQL problem: selecting columns that are neither aggregated nor listed in GROUP BY in an aggregate query. Other flavours of SQL do not let you do this and will raise an error.
When you do a query like yours, you are selecting non-aggregate columns in an aggregated group. Since many rows share the same genre, when you select Movie_Name MySQL returns the value from an arbitrary row of each group, because there is no general algorithm to guess the row you want and return the values of that.
You might ask 'why an arbitrary row? Couldn't it pick the one that max(Rating) belongs to?' But what about other aggregate columns, like avg(Rating)? Which row would it pick there? And what if two rows have the same max? So there is no general rule it could use to pick a row.
To solve a problem like this, you have to restructure your query, something like:
select mt.Movie_Name,
mt.Rating,
mg.Genre from movie_test mt
inner join movie_genre mg on mt.Movie_ID = mg.Movie_ID
where mt.Rating = (select max(mt2.Rating)
from movie_test mt2
inner join movie_genre mg2 on mt2.Movie_ID = mg2.Movie_ID
where mg2.Genre = mg.Genre);
This will select the row with the rating being the same as the maximum rating for that genre, using a subquery.
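The correlated-subquery idea can be checked on a handful of rows. This is a sketch with made-up data, run on SQLite through Python's sqlite3 module; the table and column names follow the question's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_test  (Movie_ID INTEGER PRIMARY KEY, Movie_Name TEXT, Rating REAL);
CREATE TABLE movie_genre (Movie_ID INTEGER, Genre TEXT);

-- Hypothetical sample rows; Beta belongs to two genres.
INSERT INTO movie_test  VALUES (1, 'Alpha', 8.0), (2, 'Beta', 9.0), (3, 'Gamma', 9.5);
INSERT INTO movie_genre VALUES (1, 'Action'), (2, 'Action'), (2, 'Drama'), (3, 'Drama');
""")

# For each (movie, genre) row, keep it only if the movie's rating equals
# the best rating found among all movies of that same genre.
query = """
SELECT mt.Movie_Name, mt.Rating, mg.Genre
FROM movie_test mt
JOIN movie_genre mg ON mt.Movie_ID = mg.Movie_ID
WHERE mt.Rating = (SELECT MAX(mt2.Rating)
                   FROM movie_test mt2
                   JOIN movie_genre mg2 ON mt2.Movie_ID = mg2.Movie_ID
                   WHERE mg2.Genre = mg.Genre)
ORDER BY mg.Genre
"""
best_per_genre = conn.execute(query).fetchall()
print(best_per_genre)
```

Beta is the best Action movie but loses Drama to Gamma, so each genre gets the name that actually belongs to its top rating. If two movies tie for the top rating in a genre, both rows are returned.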
Query:
SELECT t.Movie_Name,
t.Rating,
g.Genre
FROM movie_test t
INNER JOIN movie_genre g ON t.Movie_ID = g.Movie_ID
WHERE t.Movie_ID = (SELECT t1.Movie_ID
FROM movie_test t1
INNER JOIN movie_genre g1 ON t1.Movie_ID = g1.Movie_ID
WHERE g1.Genre = g.Genre
ORDER BY t1.Rating DESC
LIMIT 1)