I'm working with the Sakila sample database, and trying to get the most viewed film per country. So far I've managed to get the most viewed film of a certain country given its id with the following query:
SELECT
F.title, CO.country, count(F.film_id) as times
FROM
customer C
INNER JOIN
address A ON C.address_id = A.address_id
INNER JOIN
city CI ON A.city_id = CI.city_id
INNER JOIN
country CO ON CI.country_id = CO.country_id
INNER JOIN
rental R ON C.customer_id = R.customer_id
INNER JOIN
inventory I ON R.inventory_id = I.inventory_id
INNER JOIN
film F ON I.film_id = F.film_id
WHERE
CO.country_id = 1
GROUP BY
F.film_id
ORDER BY
times DESC
LIMIT 1;
I suppose I'll have to use this query, or something similar, in the FROM clause of another query, but I've tried everything I could think of and can't figure out how to do it.
Thanks in advance!
I admit, this is a hell of a query. But well, as long as it works.
Explanation:
Subquery: almost the same as what you already have, but without the WHERE and LIMIT. It produces a rental count per film per country.
The result of that is then grouped per country.
GROUP_CONCAT(title ORDER BY times DESC SEPARATOR '|||') gives ALL titles in that group as one string, with the most-viewed title first. The separator doesn't matter, as long as you are sure it will never occur in a title.
SUBSTRING_INDEX('...', '|||', 1) returns the part of the string before the first |||, in this case the first (and thus most-viewed) title.
Full query:
SELECT
country_name,
SUBSTRING_INDEX(
GROUP_CONCAT(title ORDER BY times DESC SEPARATOR '|||'),
'|||', 1
) as title,
MAX(times)
FROM (
SELECT
F.title AS title,
CO.country_id AS country_id,
CO.country AS country_name,
count(F.film_id) as times
FROM customer C INNER JOIN address A ON C.address_id = A.address_id
INNER JOIN city CI ON A.city_id = CI.city_id
INNER JOIN country CO ON CI.country_id = CO.country_id
INNER JOIN rental R ON C.customer_id = R.customer_id
INNER JOIN inventory I ON R.inventory_id = I.inventory_id
INNER JOIN film F ON I.film_id = F.film_id
GROUP BY F.film_id, CO.country_id
) AS count_per_movie_per_country
GROUP BY country_id
Proof of concept (as long as the subquery is correct): SQLFiddle
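To make the two string steps easier to follow, here is a small Python sketch that mimics what GROUP_CONCAT and SUBSTRING_INDEX do to the subquery's rows. It is only an illustration: the function name, the sample rows, and the film titles are made up and do not come from Sakila.

```python
from collections import defaultdict

SEP = "|||"  # assumed never to appear inside a title


def top_title_per_country(rows):
    """rows: (country, title, times) tuples, one per film per country."""
    by_country = defaultdict(list)
    for country, title, times in rows:
        by_country[country].append((title, times))

    result = {}
    for country, films in by_country.items():
        # GROUP_CONCAT(title ORDER BY times DESC SEPARATOR '|||')
        ordered = sorted(films, key=lambda film: film[1], reverse=True)
        concatenated = SEP.join(title for title, _ in ordered)
        # SUBSTRING_INDEX(..., '|||', 1): everything before the first separator
        top_title = concatenated.split(SEP, 1)[0]
        # MAX(times)
        result[country] = (top_title, max(times for _, times in films))
    return result


# Hypothetical counts, standing in for the subquery's output
sample = [
    ("Spain", "FILM A", 3),
    ("Spain", "FILM B", 5),
    ("Peru", "FILM A", 2),
]
print(top_title_per_country(sample))
```

If two films tie for first place in a country, only one of them survives the split, which matches the behaviour of the SQL version.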
Related
In Sakila DB, how to get a list of customers that have never rented out even a single movie from the top 5 actors (the list of top actors is calculated by rental volume).
This is what I used to find the top 5 actors
SELECT a.actor_id, a.first_name, a.last_name,
COUNT(r.rental_id) AS rentalVolume
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
JOIN film f ON fa.film_id = f.film_id
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY rentalVolume DESC
LIMIT 5;
I want to SELECT the customer_id, first_name and last_name of the customers who have never rented a movie featuring any of these actors.
The desired result would be something like this
Customer Number | First Name | Last Name
2               | PETER      | OLIVIER
8               | JOHN       | DOE
64              | GWEN       | LORENZO
You can use the NOT EXISTS operator to find the customers who haven't rented any movie featuring the top 5 actors. To identify which of their rentals are movies with those actors, you can use the IN operator (the subquery you place there must return only the actor_id column):
select c.customer_id, c.first_name, c.last_name
from customer c
where not exists (
select *
from rental r
inner join inventory i on i.inventory_id = r.inventory_id
inner join film f on f.film_id = i.film_id
inner join film_actor fa on fa.film_id = f.film_id and
fa.actor_id in (<< Here_goes_your_top_5_actors_query >>)
where r.customer_id = c.customer_id
)
This is the most direct and easiest-to-understand translation of your logic into SQL, but a correlated subquery can perform very badly (especially when large numbers of records are involved). If this query is too slow for you, you can instead select the rentals grouped by customer, count the rented films that feature one of those actors, and return only the customers with a count of zero.
The inner joins now have to be changed to left joins, because we are interested in customers who don't have even a single film_actor row matching the top 5 actors, and an inner join wouldn't return those customers. The HAVING clause uses COUNT rather than SUM, because SUM over only NULL values yields NULL, which never equals 0.
select c.customer_id, c.first_name, c.last_name
from customer c
left join rental r on r.customer_id = c.customer_id
left join inventory i on i.inventory_id = r.inventory_id
left join film_actor fa on fa.film_id = i.film_id and
fa.actor_id in (<< Here_goes_your_top_5_actors_query >>)
group by c.customer_id, c.first_name, c.last_name
having count(fa.film_id) = 0
PS: in this second query I have removed the join with the film table because it was never needed: we don't use any data from it, so we can join inventory directly to film_actor.
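As a sanity check, the two approaches can be compared on a toy schema using Python's built-in sqlite3 module. This is a sketch, not the Sakila database: the tables are cut down to the columns the joins need, the data is invented, and a hard-coded actor list (1) stands in for the top-5 subquery.

```python
import sqlite3


def build_db():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
        CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
        CREATE TABLE rental (rental_id INTEGER PRIMARY KEY,
                             customer_id INTEGER, inventory_id INTEGER);
        CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
        INSERT INTO customer VALUES (1, 'ANN'), (2, 'BOB'), (3, 'CAT');
        INSERT INTO inventory VALUES (100, 10), (200, 20);
        INSERT INTO film_actor VALUES (1, 10), (2, 20);
        -- ANN rents a film with actor 1, BOB one with actor 2, CAT rents nothing
        INSERT INTO rental VALUES (1, 1, 100), (2, 2, 200);
    """)
    return con


NOT_EXISTS = """
    SELECT c.customer_id FROM customer c
    WHERE NOT EXISTS (
        SELECT 1 FROM rental r
        JOIN inventory i ON i.inventory_id = r.inventory_id
        JOIN film_actor fa ON fa.film_id = i.film_id AND fa.actor_id IN (1)
        WHERE r.customer_id = c.customer_id)
    ORDER BY c.customer_id
"""

# {agg} is filled with COUNT or SUM to compare the two HAVING conditions
LEFT_JOIN = """
    SELECT c.customer_id FROM customer c
    LEFT JOIN rental r ON r.customer_id = c.customer_id
    LEFT JOIN inventory i ON i.inventory_id = r.inventory_id
    LEFT JOIN film_actor fa ON fa.film_id = i.film_id AND fa.actor_id IN (1)
    GROUP BY c.customer_id
    HAVING {agg}(fa.film_id) = 0
    ORDER BY c.customer_id
"""


def customer_ids(con, sql):
    return [row[0] for row in con.execute(sql)]


con = build_db()
print(customer_ids(con, NOT_EXISTS))                      # NOT EXISTS form
print(customer_ids(con, LEFT_JOIN.format(agg="COUNT")))   # LEFT JOIN + COUNT
print(customer_ids(con, LEFT_JOIN.format(agg="SUM")))     # LEFT JOIN + SUM
```

On this data the NOT EXISTS and COUNT variants agree, while the SUM variant returns nothing, because SUM over only NULLs is NULL and NULL = 0 is never true.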
If I have a database(sakila) with multiple tables, and I want to query multiple columns that relate to each other do I need to use keywords like
SELECT city.city, actor.first_name, actor.last_name
FROM city, actor, staff, address, inventory, film_actor, store
WHERE city.city_id = address.city_id AND
address.address_id = staff.address_id AND
staff.staff_id = store.store_id AND
store.store_id = inventory.store_id AND
inventory.film_id = film_actor.film_id AND
film_actor.actor_id = actor.actor_id
or can I just select them without linking the keys together like this:
SELECT city.city, actor.first_name, actor.last_name
FROM city, actor
EDIT:
So, since I want to see which cities the actors are from, I should use an inner join, because a cross join will just match every city to every actor regardless of whether they are actually related?
What do you mean by multiple columns that relate to each other? Can you explain further? The normal way of writing a select query is like this:
$sql= "Select column name FROM tablename ";
or, to be more specific, like this:
$sql="Select column name FROM tablename Where column name LIKE '%%' ";
You can query related fields by running additional queries, for example $sql, $sql2 and so on.
Yes. You will have to use a JOIN command.
Ex.
SELECT c.city, a.first_name, a.last_name
FROM city c
INNER JOIN address ad ON c.city_id = ad.city_id
INNER JOIN staff s ON ad.address_id = s.address_id
INNER JOIN store st ON s.store_id = st.store_id
INNER JOIN inventory i ON st.store_id = i.store_id
INNER JOIN film_actor fa ON i.film_id = fa.film_id
INNER JOIN actor a ON fa.actor_id = a.actor_id
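The difference between listing tables without a join condition and joining them on their keys can be seen on a tiny example. The sketch below uses Python's sqlite3 module with two cut-down, made-up tables (city and address); it is not the real Sakila data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE city (city_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, city_id INTEGER);
    INSERT INTO city VALUES (1, 'Lima'), (2, 'Oslo');
    INSERT INTO address VALUES (10, 1), (11, 1), (12, 2);
""")

# No join condition: every city is paired with every address (cross join)
cross_rows = con.execute(
    "SELECT city.city, address.address_id FROM city, address"
).fetchall()

# Join on the key: each address is paired only with its own city
inner_rows = con.execute("""
    SELECT c.city, a.address_id
    FROM city c
    INNER JOIN address a ON c.city_id = a.city_id
    ORDER BY a.address_id
""").fetchall()

print(len(cross_rows))  # 2 cities x 3 addresses
print(inner_rows)
```

The cross join yields 6 rows (2 x 3), most of them meaningless pairings, while the inner join yields the 3 rows that actually relate.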
My expected output:
This is the query I wrote:
SELECT c.cID, s.svcID, s.svcNote
FROM company c
LEFT JOIN service s ON s.cID = c.cID
LEFT JOIN (SELECT MAX(s.svcID) AS svcID
FROM service s
GROUP BY s.cID) AS s1 ON s1.svcID = s.svcID
ORDER BY c.cJoinDate DESC
However, I can't get my expected output, and the query takes a very long time to run. Can someone help me?
Since you want only those entries with service, you need to use INNER JOIN instead of LEFT JOIN. LEFT JOIN will list all records in Company, including cID = 4.
Try this instead:
SELECT c.cID, s.svcID, s.svcNote
FROM company c
INNER JOIN service s ON s.cID = c.cID
WHERE s.svcID = (SELECT MAX(s2.svcID)
FROM service s2
WHERE s2.cID = s.cID)
ORDER BY c.cJoinDate DESC
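The idea of keeping only the service row whose svcID is the highest for its company can be checked on toy data. This sketch uses Python's sqlite3 module; the rows are invented, the cJoinDate column is left out, and the result is ordered by cID instead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE company (cID INTEGER PRIMARY KEY);
    CREATE TABLE service (svcID INTEGER PRIMARY KEY, cID INTEGER, svcNote TEXT);
    INSERT INTO company VALUES (1), (2), (4);
    -- company 4 has no service rows at all
    INSERT INTO service VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'only');
""")

latest_services = con.execute("""
    SELECT c.cID, s.svcID, s.svcNote
    FROM company c
    INNER JOIN service s ON s.cID = c.cID
    WHERE s.svcID = (SELECT MAX(s2.svcID)
                     FROM service s2
                     WHERE s2.cID = s.cID)
    ORDER BY c.cID
""").fetchall()

print(latest_services)
```

Each company with services appears once, with the note that belongs to its latest svcID, and the company without services is dropped by the inner join.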
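-- Caution: s.svcNote is neither aggregated nor in the GROUP BY, so MySQL either
-- rejects this query (ONLY_FULL_GROUP_BY) or returns a note from an arbitrary row.
SELECT c.cID, MAX(s.svcID), s.svcNote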
FROM company c
INNER JOIN service s ON s.cID = c.cID
GROUP BY s.cID
ORDER BY c.cJoinDate DESC
I have a database with the tables:
Student(SID,Name,Surname,Age)
Registration(StudentID,CourseID)
Course(CID,Name,Cost)
I would like to extract only the name of the courses with students younger than 20. Will the query below do just that?
SELECT C.NAME
FROM Course C
INNER JOIN Registration
INNER JOIN Student S
WHERE CID = CourseID
AND SID = StudentID
AND Age < 20
GROUP BY C.NAME
I would also like to extract the number of students in each course having students younger than 20. Is it correct to do it as below?
SELECT count(S.NAME)
,C.NAME
FROM Student S
INNER JOIN Course C
INNER JOIN Registration
WHERE Age < 20
AND CID = CourseID
AND SID = StudentID
GROUP BY C.NAME
You are missing the ON part of each join; without it, the join is just a CROSS JOIN.
Your first query should look like this if you want just a distinct list of course names:
SELECT DISTINCT C.NAME
FROM Course C
INNER JOIN Registration R ON C.CID = R.CourseID
INNER JOIN Student S ON R.StudentID = S.SID
WHERE Age < 20
Your second query doesn't really need C.NAME in the select list if you just want the counts; keep it only if you also want to see which course each count belongs to.
SELECT count(*)
FROM Student S
INNER JOIN Registration R ON S.SID = R.StudentID
INNER JOIN Course C ON C.CID = R.CourseID
WHERE Age < 20
GROUP BY C.NAME
First join these tables, then group by Course's primary key (CID) and add a HAVING condition to keep only the courses that have students younger than 20.
Then join the result back to the Course table to get the course name and the number of students in the course.
SELECT
T1.Name,
T2.StudentCount
FROM
Course T1
INNER JOIN (
SELECT
c.CID,
COUNT(s.SID) AS StudentCount
FROM
Course c
LEFT JOIN Registration r ON c.CID = r.CourseID
LEFT JOIN Student s ON s.SID = r.StudentID
GROUP BY c.CID
HAVING COUNT(IF(s.Age < 20, 1, NULL)) > 0
) T2 ON T1.CID = T2.CID
More correctly, you should move the join conditions into the joins themselves by putting them in the ON clause instead of the WHERE clause. While the results may not change in this instance, you would run into difficulties as soon as you started using outer joins.
SELECT count(S.NAME)
,C.NAME
FROM Student S
INNER JOIN Registration R
ON S.SID = R.StudentID
INNER JOIN Course C
ON C.CID = R.CourseID
WHERE Age < 20
GROUP BY C.NAME
There's a fiddle here showing it in action: http://sqlfiddle.com/#!9/c3b8f/1
Your first query will also produce the results you want, but again, you should move the join predicates into the joins themselves. Also, you don't need to group just to get distinct values; MySQL has a keyword for that, DISTINCT. Rewritten, the first query would look like this:
SELECT DISTINCT C.NAME
FROM Student S
INNER JOIN Registration R
ON S.SID = R.StudentID
INNER JOIN Course C
ON C.CID = R.CourseID
WHERE Age < 20
Again, the results are the same as what you have already but it is easier to 'read' and will put you in good stead when you move on to other queries. As it stands you have mixed implicit and explicit join syntax.
This fiddle demonstrates both queries: http://sqlfiddle.com/#!9/c3b8f/4
edit
I may have misinterpreted your original question. If you want the total number of students enrolled in each course that has at least one student under 20, you can use a query like this:
select name, count(*)
from course c
inner join registration r
on c.cid = r.courseid
where exists (
select 1
from course cc
inner join registration r
on cc.cid = r.courseid
inner join student s
on s.sid = r.studentid
where cc.cid = c.cid
group by cc.cid
having min(s.age) < 20
)
group by name;
Again with the updated fiddle here: http://sqlfiddle.com/#!9/c3b8f/17
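The two readings of the question give different counts, which is easy to see on a small data set. The sketch below uses Python's sqlite3 module with made-up rows; the EXISTS subquery is written in a slightly simpler form than the one above (a plain age filter instead of GROUP BY and HAVING), so treat it as an equivalent illustration rather than the exact query from the fiddle.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE course (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registration (studentid INTEGER, courseid INTEGER);
    INSERT INTO student VALUES (1, 'Ann', 18), (2, 'Bob', 25), (3, 'Cat', 30);
    INSERT INTO course VALUES (1, 'Math'), (2, 'Art');
    -- Math has one student under 20 and one over; Art has only an older student
    INSERT INTO registration VALUES (1, 1), (2, 1), (3, 2);
""")

# Reading 1: per course, count only the students younger than 20
young_only = con.execute("""
    SELECT c.name, COUNT(*)
    FROM student s
    INNER JOIN registration r ON s.sid = r.studentid
    INNER JOIN course c ON c.cid = r.courseid
    WHERE s.age < 20
    GROUP BY c.name
""").fetchall()

# Reading 2: count ALL students in courses that have at least one under 20
all_in_young_courses = con.execute("""
    SELECT c.name, COUNT(*)
    FROM course c
    INNER JOIN registration r ON c.cid = r.courseid
    WHERE EXISTS (
        SELECT 1
        FROM registration r2
        INNER JOIN student s ON s.sid = r2.studentid
        WHERE r2.courseid = c.cid AND s.age < 20)
    GROUP BY c.name
""").fetchall()

print(young_only)
print(all_in_young_courses)
```

Here Math is reported with 1 student under the first reading and 2 under the second, and Art is excluded by both.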
When I run this query, I get duplicate lines. Specifically, the order_ID is repeated for every possible Ship_State related to that CUSTOMER_ID. If I remove the CUST_ADDRESS table from the query, I get the correct number of lines. How can I get just the Ship_State related to that particular order? Thanks.
SELECT
co.ID AS order_ID,
col.PART_ID,
col.ORDER_QTY,
co.STATUS,
co.SHIPTO_ID,
co.CUSTOMER_PO_REF,
co.CUSTOMER_ID,
c.STATE AS Bill_State,
ca.STATE AS Ship_State
FROM
dbo.CUSTOMER_ORDER AS co
INNER JOIN
dbo.CUST_ORDER_LINE AS col ON co.ID = col.CUST_ORDER_ID
INNER JOIN
dbo.CUSTOMER AS c ON co.CUSTOMER_ID = c.ID
INNER JOIN
dbo.CUST_ADDRESS AS ca ON c.ID = ca.CUSTOMER_ID
WHERE
(co.ORDER_DATE > '2014-01-01') AND (co.ID NOT LIKE 'rma%')
ORDER BY order_ID
This gave me unique order lines and order numbers for each shipping address. The next challenge is to figure out a way to populate the rows that have the same shipping and billing address. For these orders the shipping fields are null and information from the customer table is used instead.
SELECT ca.STATE AS ship_state,
co.ID,
co.CUSTOMER_ID,
ca.ADDR_NO,
co.SHIP_TO_ADDR_NO,
c.STATE AS Bill_state,
c.NAME AS Bill_name,
ca.NAME AS Ship_name
FROM
dbo.CUST_ORDER_LINE AS col
FULL OUTER JOIN
dbo.CUSTOMER_ORDER AS co ON col.CUST_ORDER_ID = co.ID
FULL OUTER JOIN
dbo.CUST_ADDRESS AS ca
FULL OUTER JOIN
dbo.CUSTOMER AS c ON ca.CUSTOMER_ID = c.ID
ON
co.CUSTOMER_ID = c.ID
AND
co.CUSTOMER_ID = ca.CUSTOMER_ID
AND
co.SHIP_TO_ADDR_NO = ca.ADDR_NO
WHERE
(co.ORDER_DATE > '2014-1-1') AND (co.ID NOT LIKE 'rma%')
ORDER BY co.ID
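One possible way to handle the remaining challenge, offered only as an untested idea against the real schema, is to LEFT JOIN the shipping address on both the customer and the address number, and fall back to the billing value with COALESCE when no shipping row matches. The sketch below tries this with Python's sqlite3 module on cut-down, invented tables; the column subset and the data are assumptions, and the real tables may need more conditions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (id TEXT PRIMARY KEY, state TEXT);
    CREATE TABLE cust_address (customer_id TEXT, addr_no INTEGER, state TEXT);
    CREATE TABLE customer_order (id TEXT PRIMARY KEY, customer_id TEXT,
                                 ship_to_addr_no INTEGER);
    INSERT INTO customer VALUES ('C1', 'NY');
    INSERT INTO cust_address VALUES ('C1', 1, 'CA'), ('C1', 2, 'TX');
    -- O1 ships to address 1; O2 has no shipping address (ships to billing)
    INSERT INTO customer_order VALUES ('O1', 'C1', 1), ('O2', 'C1', NULL);
""")

order_states = con.execute("""
    SELECT co.id,
           c.state AS bill_state,
           COALESCE(ca.state, c.state) AS ship_state
    FROM customer_order co
    INNER JOIN customer c ON c.id = co.customer_id
    LEFT JOIN cust_address ca
           ON ca.customer_id = co.customer_id
          AND ca.addr_no = co.ship_to_addr_no
    ORDER BY co.id
""").fetchall()

print(order_states)
```

Because the address join matches on the address number as well as the customer, each order yields a single row, and orders with a NULL shipping address pick up the billing state.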