MySQL: Querying a movie database

I'm very new to SQL, so please bear with me.
I've built a movie database and I'm trying to query it so that all my tables display properly.
I have a movies table with the columns movieID, title, releaseYear, directorID, genreID, and actorID.
Inside the table director, I have directorID and Director.
Using the query SELECT * FROM movies INNER JOIN director ON director.directorID = movies.directorID;, I'm able to get everything in the movies and director tables to display (which isn't exactly what I want, but it's on the right track).
My remaining tables are actor (with actorID and the actors' names), starring (with starringID, movieID, and actorID), genre (with genreID and 22 different genres), and moviegenres (with moviegenresID, moviesID, and genreID).
I'm a bit lost and I apologize if this is confusing and messy, but I'm thinking I need to query the database so that all the tables show the data and are associated with the correct column. For example, most movies have multiple genres and actors, which is why I separated them into tables of their own.
I can't figure out how to query everything to display properly in the result grid.
Thanks in advance
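
One possible shape for such a query, as a minimal sketch rather than a definitive answer: join movies to director directly, reach actor and genre through the starring and moviegenres junction tables, and collapse the repeated rows with GROUP_CONCAT. The sketch runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL. The column names actor.name and genre.genre and all sample rows are assumptions, and the actorID and genreID columns of movies are left out because the junction tables already carry that information.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; names follow the question where given.
# Assumed (not in the question): actor.name, genre.genre, and all sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE director    (directorID INTEGER PRIMARY KEY, Director TEXT);
CREATE TABLE movies      (movieID INTEGER PRIMARY KEY, title TEXT,
                          releaseYear INTEGER, directorID INTEGER);
CREATE TABLE actor       (actorID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE genre       (genreID INTEGER PRIMARY KEY, genre TEXT);
CREATE TABLE starring    (starringID INTEGER PRIMARY KEY, movieID INTEGER, actorID INTEGER);
CREATE TABLE moviegenres (moviegenresID INTEGER PRIMARY KEY, moviesID INTEGER, genreID INTEGER);

INSERT INTO director    VALUES (1, 'Director A');
INSERT INTO movies      VALUES (1, 'Movie One', 2001, 1);
INSERT INTO actor       VALUES (1, 'Actor X'), (2, 'Actor Y');
INSERT INTO genre       VALUES (1, 'Drama'), (2, 'Comedy');
INSERT INTO starring    VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO moviegenres VALUES (1, 1, 1), (2, 1, 2);
""")

# One row per movie; DISTINCT removes the duplicates produced by joining
# two independent one-to-many relations (actors x genres).
query = """
SELECT m.title, m.releaseYear, d.Director,
       GROUP_CONCAT(DISTINCT a.name)  AS actors,
       GROUP_CONCAT(DISTINCT g.genre) AS genres
FROM movies m
JOIN director d          ON d.directorID = m.directorID
LEFT JOIN starring s     ON s.movieID    = m.movieID
LEFT JOIN actor a        ON a.actorID    = s.actorID
LEFT JOIN moviegenres mg ON mg.moviesID  = m.movieID
LEFT JOIN genre g        ON g.genreID    = mg.genreID
GROUP BY m.movieID, m.title, m.releaseYear, d.Director
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The LEFT JOINs keep a movie in the result even when it has no actors or genres recorded yet.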

Related

SQL query with distinct values

I have the two following schemes:
Movies[title, year, director, country, rating, genre, gross, producer]
and
Actors[title, year, characterName, actor]
Now I have the following exercise
Find character names that appeared in two movies produced in different countries.
My idea was the following which doesn't really work:
SELECT characterName
FROM Actors a
JOIN Movies m
ON a.title=m.title
AND a.year=m.year
WHERE COUNT(m.title)=2
AND COUNT(DISTINCT(m.country)=2
GROUP BY m.title;
My idea was to obviously select the characterName and join both tables on title and year because they are unique values in combination. Then my plan was to get the movies that are unique (by grouping them) and find the ones with a count of 2 since we want two movies. I hope that I am right till now.
This is where I get stuck: I don't really know how to check whether the two movies were produced in different countries.
I want to somehow make sure that the countries differ.
You are on the right track. Here is a fixed version of your original query, that should get you the results that you expect:
select a.characterName
from actors a
inner join movies m
on m.title = a.title
and m.year = a.year
group by a.characterName
having count(distinct m.country) >= 2
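
To see the grouping in action, here is a small sketch that runs the same kind of query on made-up data. It uses an in-memory SQLite database through Python's sqlite3 module instead of MySQL, only the columns the query needs are created, and every row is invented for illustration.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movies (title TEXT, year INTEGER, country TEXT);
CREATE TABLE Actors (title TEXT, year INTEGER, characterName TEXT, actor TEXT);

INSERT INTO Movies VALUES
  ('Film A', 2000, 'USA'),
  ('Film B', 2005, 'UK'),
  ('Film C', 2010, 'USA');
INSERT INTO Actors VALUES
  ('Film A', 2000, 'Hero',     'Actor 1'),
  ('Film B', 2005, 'Hero',     'Actor 2'),
  ('Film A', 2000, 'Sidekick', 'Actor 3'),
  ('Film C', 2010, 'Sidekick', 'Actor 3');
""")

# Group by character, then keep characters seen in at least two countries.
rows = conn.execute("""
SELECT a.characterName
FROM Actors a
INNER JOIN Movies m ON m.title = a.title AND m.year = a.year
GROUP BY a.characterName
HAVING COUNT(DISTINCT m.country) >= 2
""").fetchall()
print(rows)
```

'Hero' appears in a USA and a UK movie and is returned; 'Sidekick' appears in two movies from the same country and is filtered out, which is why the condition counts distinct countries rather than movies.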
Notes on your design:
It seems like you are using (title, year) as the primary key for the movies table. This does not look like a good design (what if two movies with the same title are produced in the same year?). You would be better off with an identity column (in MySQL, an auto-incremented primary key) that you would reference as a foreign key in the actors table.
Better yet, you would probably want a separate table to store the master data of the actors, plus a junction table, say appearances, that records which actor played which character in which movie.

MySQL - Storing IDs as an array in a table

Let's say I have two tables on MySQL 5.7
Film
---
ID
name
location
year
user_id
actors
Actor
---
ID
name
born
location
Then I want to link each actor to a film, so each film entry would have actors as an array like [5, 2, 12] and on.
Now, that's one way, I have been told. Is this the appropriate way? Is this right? Wrong?
If I understand you correctly then:
You could create a foreign key in your actors table that contains the film ID, but then each actor can be linked to only ONE film.
If you create a table BETWEEN these two tables, you can reach both of them and combine them with joins, so every actor can appear in more than one film.
Never save an array in your database, because you can't easily access its elements with SELECT statements.
In relational databases, instead of storing an array of IDs in a field, you should store a record for each ID in a separate related table.
In this specific case, a film can have many actors, and each actor of that film could also work in other films, so the relation is many-to-many.
To model this relation, you need a third table that holds the IDs of the related actors and the films they work in.
Like this:
Film
---
ID
name
location
year
user_id
Actor
---
ID
name
born
location
ActorInFilm
---
ActorID
FilmID
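
The three-table layout above can be sketched as follows. This is an illustration under assumptions, not a definitive schema: it runs on in-memory SQLite via Python's sqlite3 module rather than MySQL 5.7, keeps only the ID and name columns, and uses invented rows. The composite primary key on ActorInFilm is also an assumption, added so the same actor cannot be linked to the same film twice.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Film  (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Actor (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ActorInFilm (
    ActorID INTEGER NOT NULL REFERENCES Actor(ID),
    FilmID  INTEGER NOT NULL REFERENCES Film(ID),
    PRIMARY KEY (ActorID, FilmID)   -- assumed: one link per actor/film pair
);
INSERT INTO Film  VALUES (1, 'Film One'), (2, 'Film Two');
INSERT INTO Actor VALUES (2, 'Actor Two'), (5, 'Actor Five'), (12, 'Actor Twelve');
-- Film 1 has actors 5, 2 and 12 (the [5, 2, 12] array from the question);
-- actor 2 also appears in film 2.
INSERT INTO ActorInFilm VALUES (5, 1), (2, 1), (12, 1), (2, 2);
""")

# All actors of film 1.
cast = conn.execute("""
SELECT a.name FROM Actor a
JOIN ActorInFilm af ON af.ActorID = a.ID
WHERE af.FilmID = 1
ORDER BY a.ID
""").fetchall()

# All films of actor 2: the reverse lookup needs no extra structure.
films = conn.execute("""
SELECT f.name FROM Film f
JOIN ActorInFilm af ON af.FilmID = f.ID
WHERE af.ActorID = 2
ORDER BY f.ID
""").fetchall()

print(cast)
print(films)
```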
Don't use comma-delimited values in a table. Rather than keeping the actors inside the Films table, make another table called film_actors or whatever, and if you need a table for actor info, make an Actors table as well. Then, in film_actors, make a new entry for each actor in each film. It's much less taxing to search two simple int columns than to read whole rows of other information and parse commas. A sample of some data from film_actors should look like the following:
film: 1, actor: 2
film: 1, actor: 4
film: 2, actor: 2
Searching through csv columns is a lot more taxing than doing a search of all films where film = x and actor = y.
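
Besides the cost, a plain text search over a comma-delimited column can also return wrong rows. The sketch below contrasts the two approaches on invented data, using in-memory SQLite through Python's sqlite3 module as a stand-in for MySQL; the table films_csv is a hypothetical name for the comma-delimited variant.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Variant 1: actor IDs packed into one text column.
CREATE TABLE films_csv (film INTEGER PRIMARY KEY, actors TEXT);
INSERT INTO films_csv VALUES (1, '5,2,12'), (2, '12,7');

-- Variant 2: one row per (film, actor) pair.
CREATE TABLE film_actors (film INTEGER, actor INTEGER, PRIMARY KEY (film, actor));
INSERT INTO film_actors VALUES (1, 5), (1, 2), (1, 12), (2, 12), (2, 7);
""")

# Naive substring search for actor 2 also matches the "2" inside "12".
csv_hits = conn.execute(
    "SELECT film FROM films_csv WHERE actors LIKE '%2%' ORDER BY film"
).fetchall()

# Integer comparison on the junction table matches only actor 2.
exact_hits = conn.execute(
    "SELECT film FROM film_actors WHERE actor = 2 ORDER BY film"
).fetchall()

print(csv_hits)
print(exact_hits)
```

The substring search reports both films even though actor 2 is only in film 1, while the junction-table query returns film 1 alone.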
You can also use a MySQL JSON column to store arrays or lists that can still be queried and indexed by the DB engine.

Proper DB setup - too many joins?

I am setting up a database where I have movies, directors, camera operator, composer of score etc.
Now for each movie there is a row with title, description etc. Also, there should be a column for the director, composer and so forth.
I do not want to repeat the name of a director for every row in the film table, so I put the directors in a different table, the composers in a different table and so on.
In the film table, I then reference the data of the director, composer etc. with a foreign key.
To get ALL data of a movie, I would create a view where I join all the tables together so that a single MySQL query would give me the human readable information of a movie.
Is this a good way to do this? Or are several left joins not a good idea?
The view would look like this (kind of):
SELECT `movies`.`title` as `title`, `movies`.`description` as [...],
`directors`.`name` as `name` [...], `composers`.`name` as `cname` [...]
from
`movies`
left join
`directors` on `movies`.`directors_id`=`directors`.`id`
left join
`musicians` on `movies`.`musicians_id` = `musicians`.`id` [...]
Would that be an efficient way to do it?
As commented by Gordon Linoff, this is called normalization and in general is a good way of storing the data.
You certainly need LEFT joins given the way you define the tables. If a LEFT JOIN is replaced with an INNER JOIN, the result will contain no row at all for a movie that has no musician or no director.
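
The difference between the two join types can be sketched on a tiny example. It runs on in-memory SQLite via Python's sqlite3 module as a stand-in for MySQL, with table and column names taken from the question's view and two invented movies, one of which has no director.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, directors_id INTEGER);
INSERT INTO directors VALUES (1, 'Director A');
INSERT INTO movies VALUES (1, 'With Director', 1), (2, 'No Director', NULL);
""")

# LEFT JOIN keeps every movie; a missing director shows up as NULL (None).
left_rows = conn.execute("""
SELECT m.title, d.name
FROM movies m
LEFT JOIN directors d ON m.directors_id = d.id
ORDER BY m.id
""").fetchall()

# INNER JOIN drops the movie that has no matching director.
inner_rows = conn.execute("""
SELECT m.title, d.name
FROM movies m
INNER JOIN directors d ON m.directors_id = d.id
ORDER BY m.id
""").fetchall()

print(left_rows)
print(inner_rows)
```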
However, your proposed normalization will end up with many tables for the different creative roles in the film industry. Even to cover the major creative roles for which there are Oscars and other awards, you will need ten or more tables.
A common way to avoid this is to have a single table for movies and a single table for creatives. A third table is used to connect the two in different roles. For example:
CREATE TABLE CREATIVE_ROLES_LINK (LINK_ID INT, MOVIE_ID INT, ROLE_NAME VARCHAR(30), PERSON_ID INT);
The above shows the basic idea. In practice, even more normalization can be done with ROLE_NAME stored in another table.
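
A rough sketch of how the link table could be used, assuming two surrounding tables that the answer only describes in words: movies and creatives are hypothetical names, as are their columns and all sample rows. It runs on in-memory SQLite via Python's sqlite3 module rather than MySQL.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL. The movies and creatives tables
# and every row below are assumptions made for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies    (movie_id  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE creatives (person_id INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE creative_roles_link (
    link_id   INTEGER PRIMARY KEY,
    movie_id  INTEGER REFERENCES movies(movie_id),
    role_name VARCHAR(30),
    person_id INTEGER REFERENCES creatives(person_id)
);
INSERT INTO movies    VALUES (1, 'Movie One');
INSERT INTO creatives VALUES (1, 'Person A'), (2, 'Person B');
-- Person A both directs and writes; Person B composes the score.
INSERT INTO creative_roles_link VALUES
  (1, 1, 'director', 1),
  (2, 1, 'writer',   1),
  (3, 1, 'composer', 2);
""")

# Two joins list every credit of a movie, whatever the number of roles.
credits = conn.execute("""
SELECT l.role_name, c.name
FROM movies m
JOIN creative_roles_link l ON l.movie_id  = m.movie_id
JOIN creatives c           ON c.person_id = l.person_id
WHERE m.movie_id = 1
ORDER BY l.role_name
""").fetchall()
print(credits)
```

Adding a new kind of role then means inserting rows with a new role_name, not creating another table and another join.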

Confused on SQL assignment

We are doing a database query in class, and it uses relational keys. I don't know how to get the query to run. Here is what it says.
For each movie, list its number and title, along with the number and name of the actors who appeared in it.
This is what I have, but it doesn't work
SELECT `Movie`,`Movie_ID`,`ActorNum` FROM `Movies`
Union
Select Actor.Fname, Actor.Lname FROM Actor
;
I'm not sure what all the column names are, but if ActorName is the actor's name, would this be what you are looking for?
SELECT Movies.Movie,
Movies.Movie_ID,
Movies.ActorNum,
Actor.ActorName
FROM Movies
JOIN Actor ON
Actor.ActorNum = Movies.ActorNum

Mysql finding duplicates

I have been given an assignment for school. Things have mostly been going well, but one query I must do has me stumped. Here is a description of the two tables:
Movie: MovieId [pk], Title, Year, DirectorCode [fk]
Director: DirectorCode [pk], Name
What I have to do is find any directors that have remade their own movie, and display the name of the movie, the director's name, and the year of the first and second release.
Even if you don't want to give me the answer, I would be very grateful for some hints.
Thanks
Assume that the remake has a different movieId but the same title. Therefore, you can look for movies that have the same title and the same directorCode.
Using GROUP BY and keeping the groups where COUNT(title) > 1 would give you the directorCode and titles to search for. In a second query, you can then pull the full info for both movies (the original and the remake), because that info is lost in the GROUP BY. Another option is to select MIN(year) and MAX(year) to find the years of the first and second release.
If you are allowed to use the HAVING keyword, it is useful for filtering on a GROUP BY aggregate; it is part of standard SQL, not specific to MySQL.
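
The GROUP BY idea with MIN(year) and MAX(year) could be sketched like this. It assumes, as stated above, that a remake carries the same title as the original, and it runs on in-memory SQLite via Python's sqlite3 module instead of MySQL with invented rows. The aliases first_release and second_release are hypothetical.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Director (DirectorCode INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Movie (MovieId INTEGER PRIMARY KEY, Title TEXT,
                    Year INTEGER, DirectorCode INTEGER);
INSERT INTO Director VALUES (1, 'Director A'), (2, 'Director B');
INSERT INTO Movie VALUES
  (1, 'Remade', 1990, 1),
  (2, 'Remade', 2005, 1),   -- same title, same director: a remake
  (3, 'Single', 2000, 2),
  (4, 'Remade', 2010, 2);   -- same title, different director: not counted
""")

# One group per (title, director); HAVING keeps groups with several movies.
rows = conn.execute("""
SELECT m.Title, d.Name,
       MIN(m.Year) AS first_release,
       MAX(m.Year) AS second_release
FROM Movie m
JOIN Director d ON d.DirectorCode = m.DirectorCode
GROUP BY m.Title, d.DirectorCode, d.Name
HAVING COUNT(*) > 1
""").fetchall()
print(rows)
```

With MIN and MAX in the select list, no second query is needed as long as each director remakes a given title only once.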
You don't need to use having or a subquery, or even GROUP. You can do this in one query as long as it is assumed that the movie and remake titles are identical. Since they are remakes, I assume the titles will be identical with different years (otherwise, how can you identify a remake? You would need another field).
SELECT
name
, m1.title
, m1.year
, m2.year as remake
FROM
Movie m1
JOIN Director d USING (directorcode)
JOIN Movie m2 ON (
d.directorcode = m2.directorcode
AND m1.title = m2.title
AND m1.year < m2.year
)
The inner joins from Movie to Director and from Director to Movie again ensure that you will only get results if the same director is on two movies. Then, the titles are compared (this could also be done in the WHERE clause). For organizational purposes, m1.year is required to be less than m2.year (also possible in the WHERE clause). Otherwise, 'remake' could be the earlier one.
One thing to note is that if a director remakes a movie twice, you will get three rows. E.g. if they remake a 2009 movie in 2010 and 2011, you will get the rows (year = 2009, remake = 2010), (year = 2009, remake = 2011), and (year = 2010, remake = 2011). From the context of the question, it seems like a director will only remake a movie once, though.
I tested this out and it will not show results for movies that have been remade by a different director or not at all. If two directors remake the same movie twice (that's three remakes, two from a different director) you will get both of those directors. I think this is desirable.
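
The behaviour described above can be reproduced with a small sketch. It runs the self-join on in-memory SQLite via Python's sqlite3 module as a stand-in for MySQL; the rows are invented, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Director (DirectorCode INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Movie (MovieId INTEGER PRIMARY KEY, Title TEXT,
                    Year INTEGER, DirectorCode INTEGER);
INSERT INTO Director VALUES (1, 'Director A'), (2, 'Director B');
INSERT INTO Movie VALUES
  (1, 'Remade', 2009, 1),
  (2, 'Remade', 2010, 1),
  (3, 'Remade', 2011, 1),   -- Director A remakes the 2009 movie twice
  (4, 'Remade', 2012, 2);   -- Director B's version: no self-match
""")

# Self-join: pair each movie with every later movie of the same
# director and title.
rows = conn.execute("""
SELECT d.Name, m1.Title, m1.Year, m2.Year AS remake
FROM Movie m1
JOIN Director d USING (DirectorCode)
JOIN Movie m2 ON (
    d.DirectorCode = m2.DirectorCode
    AND m1.Title = m2.Title
    AND m1.Year < m2.Year
)
ORDER BY m1.Year, m2.Year
""").fetchall()
for row in rows:
    print(row)
```

Director A's three versions produce three pairs, and Director B's single version produces none.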