I'm trying to find the film that has been rented the most without using LIMIT. I'm trying to use the following query:
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id
HAVING COUNT(r.rental_id) = MAX(
SELECT COUNT(r2.rental_id)
FROM rental r2, inventory i2
WHERE i2.inventory_id = r2.inventory_id
GROUP BY i2.film_id);
but MySQL tells me that I have a syntax error somewhere near
SELECT COUNT(r2.rental_id)
FROM rental r2, inventory
However, when I run the subquery independently it returns the expected table. Am I doing something massively wrong?
Relevant database schema:
film(film_id, title, description, release_year, language_id, original_language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features, last_update)
inventory(inventory_id, film_id, store_id, last_update)
rental(rental_id, rental_date, inventory_id, customer_id, return_date, staff_id, last_update)
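MAX() is an aggregate function that takes an expression, not a subquery's result set, so MAX(SELECT ...) is not valid syntax. You can instead use
someValue >= ALL (subquery)
to achieve what you're attempting, because ALL requires the preceding comparison to be true for every value the subquery returns.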
Try this:
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id
HAVING COUNT(r.rental_id) >= ALL (
SELECT COUNT(r2.rental_id)
FROM rental r2, inventory i2
WHERE i2.inventory_id = r2.inventory_id
GROUP BY i2.film_id);
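For comparison, the same tie-inclusive result can also be reached by wrapping the per-film counts in a derived table and taking MAX() over that. The sketch below is only an illustration: it runs against SQLite through Python's sqlite3 module (SQLite has no ALL operator, which is why the derived-table form is used), and the three-film data set is made up.

```python
import sqlite3

# Hypothetical miniature of the film/inventory/rental schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);
INSERT INTO film VALUES (1, 'ALPHA'), (2, 'BETA'), (3, 'GAMMA');
INSERT INTO inventory VALUES (10, 1), (11, 1), (20, 2), (30, 3);
-- ALPHA: 3 rentals, BETA: 3 rentals, GAMMA: 1 rental
INSERT INTO rental VALUES (1, 10), (2, 10), (3, 11), (4, 20), (5, 20), (6, 20), (7, 30);
""")

# MAX() is applied to a column of a derived table, not to a subquery directly.
most_rented = conn.execute("""
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id, f.title
HAVING COUNT(r.rental_id) = (
    SELECT MAX(per_film.cnt)
    FROM (SELECT COUNT(r2.rental_id) AS cnt
          FROM rental r2
          JOIN inventory i2 ON i2.inventory_id = r2.inventory_id
          GROUP BY i2.film_id) AS per_film)
ORDER BY f.film_id
""").fetchall()
print(most_rented)
```

Both tied films come back, which is the behaviour a LIMIT 1 approach would not give.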
I don't have a database to test in, but this should work:
(Edited to use LIMIT 1 instead of SELECT TOP 1 for MySQL.)
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id
HAVING COUNT(r.rental_id) = (SELECT COUNT(r2.rental_id)
FROM rental r2, inventory i2
WHERE i2.inventory_id = r2.inventory_id
GROUP BY i2.film_id
ORDER BY COUNT(r2.rental_id) desc
LIMIT 1);
In Sakila DB, how to get a list of customers that have never rented out even a single movie from the top 5 actors (the list of top actors is calculated by rental volume).
This is what I used to find the top 5 actors
SELECT a.actor_id, a.first_name, a.last_name,
COUNT(r.rental_id) AS rentalVolume
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
JOIN film f ON fa.film_id = f.film_id
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY rentalVolume DESC
LIMIT 5;
I want to SELECT the customer_id, first_name, last_name, that have never rented out a movie from these actors.
The desired result would be something like this
Customer Number   First Name   Last Name
2                 PETER        OLIVIER
8                 JOHN         DOE
64                GWEN         LORENZO
You can use the NOT EXISTS operator to find the customers who haven't rented any movie featuring the top 5 actors (to identify which of their rentals are movies with those actors, you can use the IN operator):
select c.customer_id, c.first_name, c.last_name
from customer c
where not exists (
    select *
    from rental r
    inner join inventory i on i.inventory_id = r.inventory_id
    inner join film f on f.film_id = i.film_id
    inner join film_actor fa on fa.film_id = f.film_id and
        fa.actor_id in (<< Here_goes_your_top_5_actors_query >>)
    where r.customer_id = c.customer_id
)
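A small runnable sketch of this NOT EXISTS pattern, using SQLite through Python's sqlite3 module and a made-up four-customer data set. Two things here are assumptions rather than part of the answer: the top-actors query is reduced to a single actor_id column, and it is wrapped in a derived table (top_actors), because MySQL rejects LIMIT directly inside an IN subquery.

```python
import sqlite3

TOP_N = 2  # the question uses 5; 2 keeps this hypothetical data set tiny

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER, customer_id INTEGER);
INSERT INTO customer VALUES (1, 'ANN'), (2, 'BOB'), (3, 'CAT'), (4, 'DAN');
INSERT INTO film_actor VALUES (1, 100), (2, 101), (3, 102);
INSERT INTO inventory VALUES (1000, 100), (1001, 101), (1002, 102);
-- rental volume: actor 1 = 3, actor 2 = 2, actor 3 = 1; DAN never rents
INSERT INTO rental VALUES (1, 1000, 1), (2, 1000, 1), (3, 1000, 1),
                          (4, 1001, 2), (5, 1001, 2), (6, 1002, 3);
""")

never_rented_top = conn.execute("""
SELECT c.customer_id, c.first_name
FROM customer c
WHERE NOT EXISTS (
    SELECT 1
    FROM rental r
    JOIN inventory i ON i.inventory_id = r.inventory_id
    JOIN film_actor fa ON fa.film_id = i.film_id
    WHERE r.customer_id = c.customer_id
      AND fa.actor_id IN (
          -- derived table so that LIMIT is not directly inside IN (...)
          SELECT top_actors.actor_id
          FROM (SELECT fa2.actor_id
                FROM film_actor fa2
                JOIN inventory i2 ON i2.film_id = fa2.film_id
                JOIN rental r2 ON r2.inventory_id = i2.inventory_id
                GROUP BY fa2.actor_id
                ORDER BY COUNT(r2.rental_id) DESC
                LIMIT ?) AS top_actors))
ORDER BY c.customer_id
""", (TOP_N,)).fetchall()
print(never_rented_top)
```

A customer with no rentals at all (DAN here) is also returned, since NOT EXISTS is trivially true for them.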
This is the most direct and easiest-to-understand translation of your logic into SQL, but a correlated subquery can perform very badly (especially when large numbers of records are involved). If this query is too slow for you, you can instead select rentals grouped by customer, tallying their films that feature one of those actors, and return only the customers whose tally is zero.
The inner joins now have to be changed to left joins, because we are interested in customers who don't have even a single film_actor row matching the top 5 actors, and an inner join wouldn't return those customers.
select c.customer_id, c.first_name, c.last_name
from customer c
left join rental r on r.customer_id = c.customer_id
left join inventory i on i.inventory_id = r.inventory_id
left join film_actor fa on fa.film_id = i.film_id and
    fa.actor_id in (<< Here_goes_your_top_5_actors_query >>)
group by c.customer_id, c.first_name, c.last_name
-- count() rather than sum(): sum() over only-NULL values is NULL, never 0
having count(fa.film_id) = 0
PS: in this faster query I have removed the join with the film table because it was never needed: we don't use any data from film, so we can join inventory directly to film_actor.
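The grouped LEFT JOIN variant can be checked on a toy data set as below (SQLite via Python's sqlite3; the data and the hard-coded actor ids 1 and 2, which stand in for the top-actors subquery, are hypothetical). It also shows why the zero test is safest written with COUNT: for a customer with no matching film_actor rows every fa.film_id is NULL, and SUM over only NULLs yields NULL, which never equals 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER, customer_id INTEGER);
INSERT INTO customer VALUES (1, 'ANN'), (2, 'BOB'), (3, 'CAT'), (4, 'DAN');
INSERT INTO film_actor VALUES (1, 100), (2, 101), (3, 102);
INSERT INTO inventory VALUES (1000, 100), (1001, 101), (1002, 102);
INSERT INTO rental VALUES (1, 1000, 1), (2, 1000, 1), (3, 1000, 1),
                          (4, 1001, 2), (5, 1001, 2), (6, 1002, 3);
""")

# {having} is filled in with the two candidate HAVING conditions below.
QUERY = """
SELECT c.customer_id, c.first_name
FROM customer c
LEFT JOIN rental r ON r.customer_id = c.customer_id
LEFT JOIN inventory i ON i.inventory_id = r.inventory_id
LEFT JOIN film_actor fa ON fa.film_id = i.film_id
                       AND fa.actor_id IN (1, 2)  -- stand-in for the top-actors list
GROUP BY c.customer_id, c.first_name
HAVING {having}
ORDER BY c.customer_id
"""

with_count = conn.execute(QUERY.format(having="COUNT(fa.film_id) = 0")).fetchall()
with_sum = conn.execute(QUERY.format(having="SUM(fa.film_id) = 0")).fetchall()
print(with_count)  # customers with no matching film_actor row
print(with_sum)    # empty: NULL = 0 is never true
```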
From the database below with schema
movieActor (actorID, movieID)
rental (rentalID, inventoryID, customerID)
inventory (inventoryID, movieID)
I am trying to list pairs of customers who rented movies featuring the same actor. The resulting set should be composed of three columns
customerID1,customerID2,nOfCommonActors
for example
23 44 5
11 44 3
where the first row means that customers 23 and 44 each rented various movies, and 5 actors appear in both the set of movies customer 23 rented and the set customer 44 rented.
I came up with this query; however, it takes so long to run that it times out without returning any result. I was wondering how I can make it more efficient (I am using MySQL):
SELECT r1.customerID AS customerID1,
r2.customerID AS customerID2,
COUNT(DISTINCT fa.actorID) as nOfCommonActors
FROM movieActor AS fa
JOIN (SELECT r.customerID, i.movieID, fa.actorID
FROM rental AS r
JOIN inventory i
ON i.inventoryID=r.inventoryID
JOIN movieActor AS fa
ON fa.actorID=i.movieID
) AS r1
JOIN (SELECT r.customerID, i.movieID, fa.actorID
FROM rental AS r
JOIN inventory i
ON i.inventoryID=r.inventoryID
JOIN movieActor AS fa
ON fa.actorID=i.movieID
) AS r2
ON r2.actorID=r1.actorID
AND r1.customerID < r2.customerID
GROUP BY r1.customerID, r2.customerID
ORDER BY nOfCommonActors DESC;
The one thing I can think of is SELECT DISTINCT in the subqueries (I have also joined movieActor to inventory on movieID rather than actorID, which is what the schema implies):
SELECT ca.customerID AS customerID1,
       ca2.customerID AS customerID2,
       COUNT(*) as nOfCommonActors
FROM (SELECT DISTINCT r.customerID, fa.actorID
      FROM rental r JOIN
           inventory i
           ON i.inventoryID = r.inventoryID JOIN
           movieActor fa
           ON fa.movieID = i.movieID
     ) ca JOIN
     (SELECT DISTINCT r.customerID, fa.actorID
      FROM rental r JOIN
           inventory i
           ON i.inventoryID = r.inventoryID JOIN
           movieActor fa
           ON fa.movieID = i.movieID
     ) ca2
     ON ca.actorID = ca2.actorID AND
        ca.customerID < ca2.customerID
GROUP BY ca.customerID, ca2.customerID
ORDER BY nOfCommonActors DESC;
Your version is multiplying out the number of rows in the subqueries considerably. That makes the JOIN more expensive -- and all that extra work is for nought because you want COUNT(DISTINCT) anyway.
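To make the effect of the DISTINCT pre-aggregation concrete, here is a minimal sketch on made-up data (SQLite via Python's sqlite3). It assumes movieActor is joined to inventory on movieID, as the schema suggests, and the CUSTOMER_ACTOR string is just a convenience for not typing the subquery twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movieActor (actorID INTEGER, movieID INTEGER);
CREATE TABLE inventory (inventoryID INTEGER PRIMARY KEY, movieID INTEGER);
CREATE TABLE rental (rentalID INTEGER PRIMARY KEY, inventoryID INTEGER, customerID INTEGER);
INSERT INTO movieActor VALUES (1, 100), (2, 100), (1, 101), (3, 102);
INSERT INTO inventory VALUES (1000, 100), (1001, 101), (1002, 102);
-- customer 23 rents movie 100 twice; the duplicate is collapsed by DISTINCT
INSERT INTO rental VALUES (1, 1000, 23), (2, 1000, 23),
                          (3, 1000, 44), (4, 1001, 44), (5, 1002, 44),
                          (6, 1002, 11);
""")

# One row per (customer, actor), no matter how many rentals lead to that actor.
CUSTOMER_ACTOR = """
    SELECT DISTINCT r.customerID, fa.actorID
    FROM rental r
    JOIN inventory i ON i.inventoryID = r.inventoryID
    JOIN movieActor fa ON fa.movieID = i.movieID
"""

pairs = conn.execute(f"""
SELECT ca.customerID AS customerID1,
       ca2.customerID AS customerID2,
       COUNT(*) AS nOfCommonActors
FROM ({CUSTOMER_ACTOR}) ca
JOIN ({CUSTOMER_ACTOR}) ca2
  ON ca.actorID = ca2.actorID AND ca.customerID < ca2.customerID
GROUP BY ca.customerID, ca2.customerID
ORDER BY nOfCommonActors DESC, customerID1, customerID2
""").fetchall()
print(pairs)
```

Because each side already holds distinct (customer, actor) rows, a plain COUNT(*) gives the number of common actors.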
Splitting the query into temporary tables lets the optimizer use statistics on the intermediate results to plot the best path (the SELECT ... INTO #t syntax below is SQL Server's; in MySQL the equivalent is CREATE TEMPORARY TABLE ... AS SELECT):
SELECT DISTINCT r.customerID, fa.actorID
INTO #t1
FROM rental r JOIN
     inventory i
     ON i.inventoryID = r.inventoryID JOIN
     movieActor fa
     ON fa.movieID = i.movieID;
SELECT DISTINCT r.customerID, fa.actorID
INTO #t2
FROM rental r JOIN
     inventory i
     ON i.inventoryID = r.inventoryID JOIN
     movieActor fa
     ON fa.movieID = i.movieID;
SELECT t1.customerID AS customerID1,
       t2.customerID AS customerID2,
       COUNT(*) AS nOfCommonActors
FROM #t1 AS t1
JOIN #t2 AS t2
     ON t1.actorID = t2.actorID AND t1.customerID < t2.customerID
GROUP BY t1.customerID, t2.customerID
ORDER BY nOfCommonActors DESC;
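The same materialise-then-join idea can be sketched in SQLite through Python's sqlite3 module. This is an illustration on invented data, not the answer's exact code: SQLite uses CREATE TEMP TABLE ... AS SELECT, and a single temporary table (hypothetically named customer_actor) is self-joined here instead of building two copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movieActor (actorID INTEGER, movieID INTEGER);
CREATE TABLE inventory (inventoryID INTEGER PRIMARY KEY, movieID INTEGER);
CREATE TABLE rental (rentalID INTEGER PRIMARY KEY, inventoryID INTEGER, customerID INTEGER);
INSERT INTO movieActor VALUES (1, 100), (2, 101);
INSERT INTO inventory VALUES (1000, 100), (1001, 101);
INSERT INTO rental VALUES (1, 1000, 7), (2, 1000, 8), (3, 1001, 8),
                          (4, 1001, 9), (5, 1000, 7);

-- Step 1: materialise the distinct (customer, actor) rows once.
CREATE TEMP TABLE customer_actor AS
SELECT DISTINCT r.customerID, fa.actorID
FROM rental r
JOIN inventory i ON i.inventoryID = r.inventoryID
JOIN movieActor fa ON fa.movieID = i.movieID;
""")

# Step 2: the pairing query now only touches the small intermediate table.
pairs_from_temp = conn.execute("""
SELECT t1.customerID AS customerID1,
       t2.customerID AS customerID2,
       COUNT(*) AS nOfCommonActors
FROM customer_actor t1
JOIN customer_actor t2
  ON t1.actorID = t2.actorID AND t1.customerID < t2.customerID
GROUP BY t1.customerID, t2.customerID
ORDER BY nOfCommonActors DESC, customerID1, customerID2
""").fetchall()
print(pairs_from_temp)
```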
Using the Sakila database, write a query that finds, for each customer X, another customer Y who has rented at least one movie in common with X. Find all such pairs of customers (X, Y) and, against each pair, the number of overlapping movies. Order the results by the number of overlapping movies.
I've tried using aliases, inner joins, and sub-queries. However, I believe there is a syntax error with my code.
SELECT o1.customer_id AS CustomerID1,
o2.customer_id AS CustomerID2,
COUNT(*) NoOfOverlappingMovies
FROM( ( (SELECT c.customer_id, f.film_id
FROM customer AS c,
JOIN rental AS r
ON r.customer_id = c.customer_id)
JOIN inventory AS i ON i.inventory_id = r.inventory_id)
JOIN film AS f ON i.film_id = f.film_id
) AS o1
JOIN( ( (SELECT c.customer_id, f.film_id
FROM customer AS c,
JOIN rental AS r
ON r.customer_id = c.customer_id)
JOIN inventory AS i ON i.inventory_id = r.inventory_id)
JOIN film AS f ON i.film_id = f.film_id
) AS o2
ON o2.film_id = o1.film_id AND o2.customer_id < o1.customer_id
GROUP BY o1.customer_id, o2.customer_id
ORDER BY COUNT(*) DESC;
The query should have 3 columns. CustomerID1, CustomerID2, and NoOfOverlappingMovies.
1) Do not use "," between "FROM" and "JOIN" parts.
2) Your parentheses are somewhat off. I tried to correct them as best I could without having the tables present:
SELECT o1.customer_id AS CustomerID1,
o2.customer_id AS CustomerID2,
COUNT(*) NoOfOverlappingMovies
FROM( (SELECT c.customer_id, f.film_id
FROM customer AS c
JOIN rental AS r ON r.customer_id = c.customer_id
JOIN inventory AS i ON i.inventory_id = r.inventory_id
JOIN film AS f ON i.film_id = f.film_id
) AS o1
JOIN (SELECT c.customer_id, f.film_id
FROM customer AS c
JOIN rental AS r ON r.customer_id = c.customer_id
JOIN inventory AS i ON i.inventory_id = r.inventory_id
JOIN film AS f ON i.film_id = f.film_id
) AS o2 ON o2.film_id = o1.film_id AND o2.customer_id < o1.customer_id )
GROUP BY o1.customer_id, o2.customer_id
ORDER BY COUNT(*) DESC;
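The query would seem to be (rental has no film_id column in Sakila, so each rental is mapped to its film through inventory):
select r1.customer_id, r2.customer_id,
       count(distinct i1.film_id) as num_films
from rental r1 join
     inventory i1
     on i1.inventory_id = r1.inventory_id join
     inventory i2
     on i2.film_id = i1.film_id join
     rental r2
     on r2.inventory_id = i2.inventory_id and
        r1.customer_id < r2.customer_id
group by r1.customer_id, r2.customer_id
order by num_films desc;
The customer and film tables do not seem to be needed for this query.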
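A quick way to sanity-check a pair-counting query like this is to run it on a handful of rows where the answer is obvious. The sketch below does that with SQLite via Python's sqlite3; the data is invented, and it assumes the Sakila layout in which rental reaches film_id only through inventory, with several inventory copies possible per film.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER, customer_id INTEGER);
-- film 100 has two copies (inventory 1 and 2)
INSERT INTO inventory VALUES (1, 100), (2, 100), (3, 101), (4, 102);
INSERT INTO rental VALUES (1, 1, 1), (2, 3, 1),             -- customer 1: films 100, 101
                          (3, 2, 2), (4, 3, 2), (5, 3, 2),  -- customer 2: films 100, 101 (twice)
                          (6, 4, 3), (7, 1, 3);             -- customer 3: films 102, 100
""")

overlaps = conn.execute("""
SELECT r1.customer_id, r2.customer_id,
       COUNT(DISTINCT i1.film_id) AS num_films
FROM rental r1
JOIN inventory i1 ON i1.inventory_id = r1.inventory_id
JOIN inventory i2 ON i2.film_id = i1.film_id
JOIN rental r2 ON r2.inventory_id = i2.inventory_id
              AND r1.customer_id < r2.customer_id
GROUP BY r1.customer_id, r2.customer_id
ORDER BY num_films DESC, r1.customer_id, r2.customer_id
""").fetchall()
print(overlaps)
```

COUNT(DISTINCT ...) matters here: customer 2 rented film 101 twice, and without DISTINCT that pair would be over-counted.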
I'm having a bit of trouble trying to reduce the redundancy of a query in MySQL. I currently have it working, but it feels like I have too much overhead because it uses a redundant subquery. What I am trying to do is use a DVD rental database to find which store location rented out the most DVDs in each month of 2005.
Here is the working query
SELECT b.month, c.store_id, b.maxRentals
FROM
(SELECT a.month, MAX(a.rentalCount) as maxRentals
FROM
(SELECT MONTH(rental.rental_date) as month, inventory.store_id, count(1) as rentalCount
FROM rental
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
WHERE YEAR(rental.rental_date) = 2005
GROUP BY MONTH(rental.rental_date), inventory.store_id
) a
GROUP BY a.month
) b
INNER JOIN
(SELECT MONTH(rental.rental_date) as month, inventory.store_id, count(1) as rentalCount
FROM rental
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
WHERE YEAR(rental.rental_date) = 2005
GROUP BY MONTH(rental.rental_date), inventory.store_id
) c
ON b.maxRentals = c.rentalCount
GROUP BY b.month;
Notice how the subquery with the alias of "c" is the exact same subquery of alias "a". I'm not sure if there's a way to get rid of this, as I can't inner join on an alias. Am I just stuck with a giant query, or is there something else I can do?
I am 90% certain this query will achieve your intentions:
SELECT MONTH(r.rental_date), i.store_id, COUNT(*)
FROM rental r
LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id
Let me know how it goes!
Edit: to answer the question of which store location rented out the most DVDs in each month of 2005:
SELECT x.rental_month, x.store_id, MAX(x.rental_count) FROM (
SELECT MONTH(r.rental_date) AS rental_month, i.store_id AS store_id, COUNT(*) AS rental_count
FROM rental r LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id) x
GROUP BY x.rental_month, x.store_id
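I was explicit by using aliases everywhere; you could probably omit some. Hopefully this helps...
Edit: Dirty hack (this relies on MySQL's non-standard handling of non-aggregated columns in GROUP BY, so the store_id returned for each month is not guaranteed to be the one with the highest count):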
SELECT x.rental_month, x.store_id, MAX(x.rental_count) FROM (
SELECT MONTH(r.rental_date) AS rental_month, i.store_id AS store_id, COUNT(*) AS rental_count
FROM rental r LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id
ORDER BY MONTH(r.rental_date) ASC, COUNT(*) DESC) x
GROUP BY x.rental_month
Ref:
http://kristiannielsen.livejournal.com/6745.html
But then does this satisfy you, seeing as you do already have a working query...
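One way to avoid writing the monthly-count subquery twice is a common table expression, which MySQL supports from version 8.0. The sketch below is a hedged illustration on invented data, run in SQLite via Python's sqlite3; SQLite has no MONTH() or YEAR(), so strftime() stands in for them, and the CTE name monthly is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, store_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, rental_date TEXT, inventory_id INTEGER);
INSERT INTO inventory VALUES (1, 1), (2, 2);
INSERT INTO rental VALUES
  (1, '2005-05-10 10:00:00', 1), (2, '2005-05-11 10:00:00', 1),
  (3, '2005-05-12 10:00:00', 2),
  (4, '2005-06-01 10:00:00', 2), (5, '2005-06-02 10:00:00', 2),
  (6, '2005-06-03 10:00:00', 2), (7, '2005-06-04 10:00:00', 1),
  (8, '2006-01-01 10:00:00', 1);  -- outside 2005, ignored
""")

best_store_per_month = conn.execute("""
WITH monthly AS (  -- defined once, referenced twice below
    SELECT CAST(strftime('%m', r.rental_date) AS INTEGER) AS rental_month,
           i.store_id,
           COUNT(*) AS rentalCount
    FROM rental r
    JOIN inventory i ON i.inventory_id = r.inventory_id
    WHERE strftime('%Y', r.rental_date) = '2005'
    GROUP BY rental_month, i.store_id
)
SELECT m.rental_month, m.store_id, m.rentalCount
FROM monthly m
JOIN (SELECT rental_month, MAX(rentalCount) AS maxRentals
      FROM monthly
      GROUP BY rental_month) b
  ON b.rental_month = m.rental_month AND b.maxRentals = m.rentalCount
ORDER BY m.rental_month, m.store_id
""").fetchall()
print(best_store_per_month)
```

Joining on both the month and the maximum count keeps the result deterministic; if two stores tie in a month, both rows are returned.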
I'm working with the Sakila sample database, and trying to get the most viewed film per country. So far I've managed to get the most viewed film of a certain country given its id with the following query:
SELECT
F.title, CO.country, count(F.film_id) as times
FROM
customer C
INNER JOIN
address A ON C.address_id = A.address_id
INNER JOIN
city CI ON A.city_id = CI.city_id
INNER JOIN
country CO ON CI.country_id = CO.country_id
INNER JOIN
rental R ON C.customer_id = R.customer_id
INNER JOIN
inventory I ON R.inventory_id = I.inventory_id
INNER JOIN
film F ON I.film_id = F.film_id
WHERE
CO.country_id = 1
GROUP BY
F.film_id
ORDER BY
times DESC
LIMIT 1;
I suppose that I'll have to use this query or something similar in the FROM clause of another query, but I've tried everything I could think of and am completely unable to figure out how to do so.
Thanks in advance!
I admit, this is a hell of a query. But well, as long as it works.
Explanation:
Subquery: almost the same as the one you already have, without the WHERE and LIMIT. It results in a list of rental counts per movie per country.
The result of that is grouped per country.
GROUP_CONCAT(title ORDER BY times DESC SEPARATOR '|||') gives ALL titles in that 'row', with the most-viewed title first. The separator doesn't matter, as long as you are sure it never occurs in a title.
SUBSTRING_INDEX('...', '|||', 1) returns the part of the string before the first |||, in this case the first (and thus most-viewed) title.
Full query:
SELECT
country_name,
SUBSTRING_INDEX(
GROUP_CONCAT(title ORDER BY times DESC SEPARATOR '|||'),
'|||', 1
) as title,
MAX(times)
FROM (
SELECT
F.title AS title,
CO.country_id AS country_id,
CO.country AS country_name,
count(F.film_id) as times
FROM customer C INNER JOIN address A ON C.address_id = A.address_id
INNER JOIN city CI ON A.city_id = CI.city_id
INNER JOIN country CO ON CI.country_id = CO.country_id
INNER JOIN rental R ON C.customer_id = R.customer_id
INNER JOIN inventory I ON R.inventory_id = I.inventory_id
INNER JOIN film F ON I.film_id = F.film_id
GROUP BY F.film_id, CO.country_id
) AS count_per_movie_per_country
GROUP BY country_id
Proof of concept (as long as the subquery is correct): SQLFiddle
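Where window functions are available (MySQL 8.0+, SQLite 3.25+), a different technique from the GROUP_CONCAT trick is to rank films inside each country with ROW_NUMBER() and keep rank 1. The sketch below runs in SQLite via Python's sqlite3 on invented data; as a simplifying assumption the country name is stored directly on a hypothetical customer table instead of going through address and city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, customer_id INTEGER, inventory_id INTEGER);
INSERT INTO customer VALUES (1, 'Spain'), (2, 'Spain'), (3, 'Chile');
INSERT INTO film VALUES (1, 'ALPHA'), (2, 'BETA');
INSERT INTO inventory VALUES (10, 1), (20, 2);
-- Spain: ALPHA x2, BETA x1; Chile: BETA x2, ALPHA x1
INSERT INTO rental VALUES (1, 1, 10), (2, 2, 10), (3, 1, 20),
                          (4, 3, 20), (5, 3, 20), (6, 3, 10);
""")

top_film_per_country = conn.execute("""
WITH counts AS (
    SELECT c.country, f.title, COUNT(*) AS times
    FROM customer c
    JOIN rental r ON r.customer_id = c.customer_id
    JOIN inventory i ON i.inventory_id = r.inventory_id
    JOIN film f ON f.film_id = i.film_id
    GROUP BY c.country, f.film_id, f.title
),
ranked AS (
    SELECT country, title, times,
           -- title is a tie-breaker so that exactly one row per country gets rn = 1
           ROW_NUMBER() OVER (PARTITION BY country ORDER BY times DESC, title) AS rn
    FROM counts
)
SELECT country, title, times
FROM ranked
WHERE rn = 1
ORDER BY country
""").fetchall()
print(top_film_per_country)
```

Unlike the string-based approach, this does not depend on a separator that must never appear in a title.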