Grouping by Count(*) following multiple join operations - mysql

I'm trying to figure out what I thought was a simple query.
I have the following tables, with these schemas:
models(id: int, first_name: string, last_name: string)
painters(id: int, first_name: string, last_name:string)
portraits_painters(painter_id: int, portrait_id: int)
roles(model_id: int, portrait_id: int)
I am trying to write this query: the first name, last name, and count for painters who have been the model for their own painting, i.e. the count column for a first and last name is the number of self-portraits the painter has done (as both the painter and the model). It is safe to assume that equal first and last names mean the same individual.
If a painter has no self-portraits, they should not be included in the final output at all.
SELECT
models.first_name,
models.last_name,
COUNT(portraits_painters.portrait_id)
FROM models
RIGHT JOIN painters
ON models.first_name = painters.first_name
AND models.last_name = painters.last_name
RIGHT JOIN portraits_painters
ON portraits_painters.painter_id = painters.id
RIGHT JOIN roles
ON roles.portrait_id = portraits_painters.portrait_id
GROUP BY portraits_painters.portrait_id;
I essentially tried to narrow the table down to rows where the names matched, then to rows where the painter ids and the portrait ids matched, but evidently there's something wrong in how I am joining the tables.
I'm also getting errors about the GROUP BY portion, so I'm not sure whether there's a better way to approach it. Perhaps aliasing?

You can get the names of the models and painters using:
select m.first_name, m.last_name, p.first_name, p.last_name
from roles r join
portraits_painters pp
on pp.portrait_id = r.portrait_id join
painters p
on p.id = pp.painter_id join
models m
on m.id = r.model_id;
The rest is just a where and group by:
select p.first_name, p.last_name, count(*)
from roles r join
portraits_painters pp
on pp.portrait_id = r.portrait_id join
painters p
on p.id = pp.painter_id join
models m
on m.id = r.model_id
where m.first_name = p.first_name and m.last_name = p.last_name
group by p.first_name, p.last_name;
I am not sure what thinking leads to outer joins or exists clauses.
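To check the join, WHERE and GROUP BY approach on a small case, here is a minimal sketch that runs the same query in SQLite through Python. The sample rows and names are made up for illustration, and SQLite stands in for MySQL here, so treat it as a sanity check rather than a definitive test.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE models (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE painters (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE portraits_painters (painter_id INTEGER, portrait_id INTEGER);
CREATE TABLE roles (model_id INTEGER, portrait_id INTEGER);

INSERT INTO models VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO painters VALUES (10, 'Ann', 'Lee'), (20, 'Cy', 'Dow');
-- Portraits 100 and 101 are painted by Ann Lee, portrait 200 by Cy Dow.
INSERT INTO portraits_painters VALUES (10, 100), (10, 101), (20, 200);
-- Ann Lee models for both of her own portraits; Bob Ray models for Cy Dow.
INSERT INTO roles VALUES (1, 100), (1, 101), (2, 200);
""")

self_portraits = conn.execute("""
SELECT p.first_name, p.last_name, COUNT(*) AS self_portrait_count
FROM roles r
JOIN portraits_painters pp ON pp.portrait_id = r.portrait_id
JOIN painters p ON p.id = pp.painter_id
JOIN models m ON m.id = r.model_id
WHERE m.first_name = p.first_name AND m.last_name = p.last_name
GROUP BY p.first_name, p.last_name
""").fetchall()

print(self_portraits)  # Cy Dow has no self-portrait, so he does not appear
```

Because every join is an inner join, painters without a matching model row simply drop out, which is the behaviour the question asks for.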

Get painter infos for all portraits:
SELECT p.first_name, p.last_name, pp.portrait_id
FROM portraits_painters AS pp
JOIN painters AS p
ON pp.painter_id = p.id
Get model infos for all portraits:
SELECT m.first_name, m.last_name, pp.portrait_id
FROM roles AS r
JOIN models AS m ON m.id = r.model_id
JOIN portraits_painters AS pp ON pp.portrait_id = r.portrait_id
Join these two result sets on first_name, last_name and portrait_id, so that a painter row is only paired with a model row for the same portrait. Then group the records by first_name and last_name and COUNT each group to get the number of self-portraits.
So, the final query would be:
SELECT a.first_name, a.last_name, COUNT(DISTINCT a.portrait_id)
FROM
(
SELECT p.first_name, p.last_name, pp.portrait_id
FROM portraits_painters AS pp
JOIN painters AS p
ON pp.painter_id = p.id
) AS a
JOIN
(
SELECT m.first_name, m.last_name, pp.portrait_id
FROM roles AS r
JOIN models AS m ON m.id = r.model_id
JOIN portraits_painters AS pp ON pp.portrait_id = r.portrait_id
) AS b
ON a.first_name = b.first_name AND a.last_name = b.last_name
AND a.portrait_id = b.portrait_id
GROUP BY a.first_name, a.last_name;
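With the derived-table approach it is worth checking whether the painter side and the model side are tied together by portrait as well as by name. The sketch below (SQLite via Python, with made-up rows) runs the same shape of query with both join conditions. The `{extra}` placeholder and the sample data are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE models (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE painters (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE portraits_painters (painter_id INTEGER, portrait_id INTEGER);
CREATE TABLE roles (model_id INTEGER, portrait_id INTEGER);

INSERT INTO models VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO painters VALUES (10, 'Ann', 'Lee'), (20, 'Cy', 'Dow');
-- Ann Lee paints 100 (herself) and 101 (Bob Ray); Cy Dow paints 200 (Ann Lee).
INSERT INTO portraits_painters VALUES (10, 100), (10, 101), (20, 200);
INSERT INTO roles VALUES (1, 100), (2, 101), (1, 200);
""")

# Same query shape twice; only the extra join condition differs.
QUERY = """
SELECT a.first_name, a.last_name, COUNT(*)
FROM (SELECT p.first_name, p.last_name, pp.portrait_id
      FROM portraits_painters AS pp
      JOIN painters AS p ON pp.painter_id = p.id) AS a
JOIN (SELECT m.first_name, m.last_name, r.portrait_id
      FROM roles AS r
      JOIN models AS m ON m.id = r.model_id) AS b
  ON a.first_name = b.first_name AND a.last_name = b.last_name {extra}
GROUP BY a.first_name, a.last_name
"""

name_only = conn.execute(QUERY.format(extra="")).fetchall()
name_and_portrait = conn.execute(
    QUERY.format(extra="AND a.portrait_id = b.portrait_id")
).fetchall()

print(name_only)          # pairs every portrait Ann painted with every one she sat for
print(name_and_portrait)  # only portrait 100 is a self-portrait
```

In this data Ann Lee has exactly one self-portrait, which only the name-and-portrait condition reports.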

Related

SQL - inner join on different criteria

I'm getting confused about some basic stuff.
Could someone explain this query to me?
select s.name from students s
inner join friends f on f.id = s.id
inner join packages p on p.id = s.id
where p.salary < (select pp.salary from packages pp where pp.id = f.friend_id)
order by (select pp.salary from packages pp where pp.id = f.friend_id) ASC;
My question is about the salary comparison part, i.e. select pp.salary from packages pp where pp.id = f.friend_id. Shouldn't it yield the same salary result every time? If so, how can we compare against it?
For reference, use the sample tables below:
table 1- students
columns - id, name
table 2 - friends (here each id is linked with one friend_id (his best friend))
columns - id , friend_id
table3 - packages
columns - id , salary
The goal is to find the names of the students whose best friend's salary is higher than their own.
I am having trouble understanding this solution.
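The subquery in the WHERE clause is correlated: it is evaluated once per outer row and looks up the salary of that row's friend, so it does not return the same value every time. It only works with the < operator while it returns a single value, though; if packages can hold more than one row per id, the comparison fails. A JOIN states the same comparison more directly: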
JOIN packages pp ON pp.id = f.friend_id
AND p.salary < pp.salary
Change your query to be
select s.name from students s
inner join friends f on f.id = s.id
inner join packages p on p.id = s.id
JOIN packages pp ON pp.id = f.friend_id
AND p.salary < pp.salary
order by pp.salary;
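As a rough check that the JOIN form and the correlated-subquery form agree, here is a small sketch in SQLite via Python. The four students and their salaries are invented, and packages.id is assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE friends (id INTEGER PRIMARY KEY, friend_id INTEGER);
CREATE TABLE packages (id INTEGER PRIMARY KEY, salary INTEGER);

INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cal'), (4, 'Dee');
INSERT INTO friends VALUES (1, 2), (2, 3), (3, 4), (4, 1);
INSERT INTO packages VALUES (1, 10), (2, 20), (3, 15), (4, 30);
""")

# Join the packages table twice: p is the student's own salary,
# pp is the best friend's salary.
with_join = conn.execute("""
SELECT s.name
FROM students s
JOIN friends f ON f.id = s.id
JOIN packages p ON p.id = s.id
JOIN packages pp ON pp.id = f.friend_id AND p.salary < pp.salary
ORDER BY pp.salary
""").fetchall()

# The original form: a correlated subquery looked up once per outer row.
with_subquery = conn.execute("""
SELECT s.name
FROM students s
JOIN friends f ON f.id = s.id
JOIN packages p ON p.id = s.id
WHERE p.salary < (SELECT pp.salary FROM packages pp WHERE pp.id = f.friend_id)
ORDER BY (SELECT pp.salary FROM packages pp WHERE pp.id = f.friend_id) ASC
""").fetchall()

print(with_join)
print(with_subquery)
```

Ana (10) earns less than Ben (20) and Cal (15) earns less than Dee (30), so both forms list Ana and then Cal.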

Is this the right way to join tables to fetch data?

I have a database with the tables:
Student(SID,Name,Surname,Age)
Registration(StudentID,CourseID)
Course(CID,Name,Cost)
I would like to extract only the name of the courses with students younger than 20. Will the query below do just that?
SELECT C.NAME
FROM Course C
INNER JOIN Registration
INNER JOIN Student S
WHERE CID = CourseID
AND SID = StudentID
AND Age < 20
GROUP BY C.NAME
I would also like to extract the number of students in each course having students younger than 20. Is it correct to do it as below?
SELECT count(S.NAME)
,C.NAME
FROM Student S
INNER JOIN Course C
INNER JOIN Registration
WHERE Age < 20
AND CID = CourseID
AND SID = StudentID
GROUP BY C.NAME
You are missing the ON part for the join otherwise it would just be a CROSS JOIN.
Your first query should look like this if you want just a distinct list of course names:
SELECT DISTINCT C.NAME
FROM Course C
INNER JOIN Registration R ON C.CID = R.CourseID
INNER JOIN Student S ON R.StudentID = S.SID
WHERE Age < 20
Your second query shouldn't really have the C.Name in the select if you want to get just a count unless you want a count of how many students have that name.
SELECT count(*)
FROM Student S
INNER JOIN Registration R ON s.SID = R.StudentID
INNER JOIN Course C ON c.CID = R.CourseID
WHERE Age < 20
GROUP BY C.NAME
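To see what the ON clauses change, the following sketch counts the rows produced with and without them. It uses SQLite through Python, which, like MySQL, accepts INNER JOIN without ON and then returns every combination of rows. The sample students and courses are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (SID INTEGER PRIMARY KEY, Name TEXT, Surname TEXT, Age INTEGER);
CREATE TABLE Course (CID INTEGER PRIMARY KEY, Name TEXT, Cost INTEGER);
CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER);

INSERT INTO Student VALUES (1, 'Al', 'A', 18), (2, 'Bo', 'B', 22), (3, 'Cy', 'C', 19);
INSERT INTO Course VALUES (1, 'Math', 100), (2, 'Art', 50);
INSERT INTO Registration VALUES (1, 1), (2, 1), (3, 2), (2, 2);
""")

# No ON clause: every course x every registration x every student.
cross_rows = conn.execute("""
SELECT COUNT(*) FROM Course C INNER JOIN Registration R INNER JOIN Student S
""").fetchone()[0]

# With ON clauses: one row per actual registration.
joined_rows = conn.execute("""
SELECT COUNT(*)
FROM Course C
INNER JOIN Registration R ON C.CID = R.CourseID
INNER JOIN Student S ON R.StudentID = S.SID
""").fetchone()[0]

# Distinct course names that have at least one student younger than 20.
young_courses = conn.execute("""
SELECT DISTINCT C.Name
FROM Course C
INNER JOIN Registration R ON C.CID = R.CourseID
INNER JOIN Student S ON R.StudentID = S.SID
WHERE S.Age < 20
ORDER BY C.Name
""").fetchall()

print(cross_rows, joined_rows, young_courses)
```

With 2 courses, 4 registrations and 3 students, the join without ON yields 2 x 4 x 3 = 24 rows, against 4 rows once the ON clauses are in place.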
First join these tables, then group by the Course primary key (CID), and add a HAVING condition to keep only the courses that have at least one student younger than 20.
Then join the result back to the Course table to get the course name alongside the count of students in the course.
SELECT
T1.Name,
T2.StudentCount
FROM
Course T1
INNER JOIN (
SELECT
c.CID,
COUNT(s.SID) AS StudentCount
FROM
Course c
LEFT JOIN Registration r ON c.CID = r.CourseID
LEFT JOIN Student s ON s.SID = r.StudentID
GROUP BY c.CID
HAVING COUNT(IF(s.Age < 20, 1, NULL)) > 0
) T2 ON T1.CID = T2.CID
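The difference between filtering rows with WHERE and filtering groups with HAVING can be seen on a small data set. This sketch runs in SQLite via Python, so it uses a portable CASE expression in place of MySQL's IF(); the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (SID INTEGER PRIMARY KEY, Name TEXT, Surname TEXT, Age INTEGER);
CREATE TABLE Course (CID INTEGER PRIMARY KEY, Name TEXT, Cost INTEGER);
CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER);

INSERT INTO Student VALUES (1, 'Al', 'A', 18), (2, 'Bo', 'B', 22), (3, 'Cy', 'C', 19);
INSERT INTO Course VALUES (1, 'Math', 100), (2, 'Art', 50), (3, 'Law', 80);
-- Math: Al (18) and Bo (22); Art: Cy (19); Law: Bo (22) only.
INSERT INTO Registration VALUES (1, 1), (2, 1), (3, 2), (2, 3);
""")

# HAVING keeps whole groups: all students of a course are counted
# as long as at least one of them is younger than 20.
all_students = conn.execute("""
SELECT c.Name, COUNT(s.SID) AS StudentCount
FROM Course c
JOIN Registration r ON c.CID = r.CourseID
JOIN Student s ON s.SID = r.StudentID
GROUP BY c.CID, c.Name
HAVING SUM(CASE WHEN s.Age < 20 THEN 1 ELSE 0 END) > 0
ORDER BY c.Name
""").fetchall()

# WHERE removes rows before grouping: only the young students are counted.
young_students_only = conn.execute("""
SELECT c.Name, COUNT(s.SID) AS StudentCount
FROM Course c
JOIN Registration r ON c.CID = r.CourseID
JOIN Student s ON s.SID = r.StudentID
WHERE s.Age < 20
GROUP BY c.CID, c.Name
ORDER BY c.Name
""").fetchall()

print(all_students)
print(young_students_only)
```

Math has two students, one of them under 20, so the HAVING form reports 2 for it while the WHERE form reports 1. Law drops out of both because its only student is 22.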
More correctly, you should move the join conditions into the joins themselves, by putting them in the ON clause instead of the WHERE clause. While the results may not change in this instance, you would run into difficulties as soon as you started including outer joins.
SELECT count(S.NAME)
,C.NAME
FROM Student S
INNER JOIN Registration R
ON s.SID = R.StudentID
INNER JOIN Course C
ON c.CID = R.CourseID
WHERE Age < 20
GROUP BY C.NAME
There's a fiddle here showing it in action: http://sqlfiddle.com/#!9/c3b8f/1
Your first query will also produce the results you want, but again, you should move the join predicates to the join itself. Also, you don't need to group just to get distinct values; MySQL has a keyword for that, DISTINCT. So rewritten, the first query would look like:
SELECT DISTINCT C.NAME
FROM Student S
INNER JOIN Registration R
ON s.SID = R.StudentID
INNER JOIN Course C
ON c.CID = R.CourseID
WHERE Age < 20
Again, the results are the same as what you have already but it is easier to 'read' and will put you in good stead when you move on to other queries. As it stands you have mixed implicit and explicit join syntax.
This fiddle demonstrates both queries: http://sqlfiddle.com/#!9/c3b8f/4
Edit:
I may have misinterpreted your original question. If you want the total number of students enrolled in each course that has at least one student under 20, you can use a query like this:
select name, count(*)
from course c
inner join registration r
on c.cid = r.courseid
where exists (
select 1
from course cc
inner join registration r
on cc.cid = r.courseid
inner join student s
on s.sid = r.studentid
where cc.cid = c.cid
group by cc.cid
having min(s.age) < 20
)
group by name;
Again with the updated fiddle here: http://sqlfiddle.com/#!9/c3b8f/17

SQL - Find the object with the most appearances

I am a newbie to SQL working on an assignment to find the actor or actress with the most appearances. A diagram of the database I'm working with is here:
Here was the query I was trying to use:
SELECT DISTINCT n.name, count(n.name)
FROM cast_info c
INNER JOIN name n
ON (n.id = c.person_id)
INNER JOIN title t
ON (c.movie_id = t.id)
CROSS JOIN role_type r
WHERE (r.role = 'actor' OR r.role = 'actress')
GROUP BY n.name
This is intended to get a count of how many times different actors showed up, which I can then sort and select the top one. But it doesn't work. Something else I did was:
SELECT n.name, count(n.name) AS amount
FROM cast_info c
INNER JOIN name n
ON (n.id = c.person_id)
INNER JOIN title t
ON (c.movie_id = t.id)
LEFT JOIN role_type r
ON c.role_id = r.id
AND (r.role = 'actor' OR r.role = 'actress')
GROUP BY amount
ORDER BY amount DESC
LIMIT 1
But that gives the error
aggregate functions are not allowed in GROUP BY
LINE 1: SELECT COUNT(*) AS total FROM (SELECT n.name, count(n.name) ...
Tips?
I am going to take a stab at each of these questions for you, because this assignment is obviously causing you some trouble.
You can find everything you need in your cast_info table and your role_type table, unless you need to display the actors/actresses actual name.
I would start by selecting all rows that represent an actor or actress in a movie. This should normally be a unique combination, as a person is usually not cast as an actor in the same movie twice. Once you've done that, group by the person's id and get the count() of rows, which should effectively be the number of movies. I think the error you're getting means exactly what it says: you can't GROUP BY an aggregate, and your second query groups by amount, which is an alias for count(n.name). A workaround is to compute the counts per person in a subquery, then sort by the count and keep the top row.
Try this:
SELECT t.person_id, t.numMovies AS mostAppearances
FROM (SELECT c.person_id, COUNT(*) AS numMovies
FROM cast_info c
JOIN role_type r ON r.id = c.role_id
WHERE r.role = 'actor' OR r.role = 'actress'
GROUP BY c.person_id) t
ORDER BY t.numMovies DESC
LIMIT 1
Try this
SELECT n.name, count(n.name) AS amount
FROM cast_info c
INNER JOIN name n
ON n.id = c.person_id
INNER JOIN title t
ON c.movie_id = t.id
INNER JOIN role_type r
ON c.role_id = r.id
AND (r.role = 'actor' OR r.role = 'actress')
GROUP BY n.name
ORDER BY amount DESC
LIMIT 1
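One detail that is easy to miss when counting appearances: a role filter placed in the ON clause of a LEFT JOIN does not remove any cast_info rows, it only leaves the role columns NULL, so non-acting credits still get counted. The sketch below contrasts that with an inner join, using SQLite via Python. The people, movie ids and the 'director' role are invented for the example, and the title table is left out because the count does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE role_type (id INTEGER PRIMARY KEY, role TEXT);
CREATE TABLE cast_info (person_id INTEGER, movie_id INTEGER, role_id INTEGER);

INSERT INTO name VALUES (1, 'Pat One'), (2, 'Sam Two');
INSERT INTO role_type VALUES (1, 'actor'), (2, 'actress'), (3, 'director');
-- Pat One: 2 acting credits and 3 directing credits. Sam Two: 3 acting credits.
INSERT INTO cast_info VALUES
  (1, 1, 1), (1, 2, 1), (1, 3, 3), (1, 4, 3), (1, 5, 3),
  (2, 1, 2), (2, 2, 2), (2, 3, 2);
""")

# Inner join: only rows whose role is actor/actress survive the join.
top_inner = conn.execute("""
SELECT n.name, COUNT(*) AS appearances
FROM cast_info c
JOIN name n ON n.id = c.person_id
JOIN role_type r ON r.id = c.role_id
WHERE r.role IN ('actor', 'actress')
GROUP BY n.id, n.name
ORDER BY appearances DESC
LIMIT 1
""").fetchone()

# Left join with the filter in ON: every cast_info row is kept and counted.
top_left = conn.execute("""
SELECT n.name, COUNT(*) AS appearances
FROM cast_info c
JOIN name n ON n.id = c.person_id
LEFT JOIN role_type r ON r.id = c.role_id AND r.role IN ('actor', 'actress')
GROUP BY n.id, n.name
ORDER BY appearances DESC
LIMIT 1
""").fetchone()

print(top_inner)
print(top_left)
```

Grouping by n.id as well as n.name keeps two different people who share a name apart, which may or may not matter for the assignment.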

Disambiguating identical columns in a join

[Yes, I've searched for an answer for this here and in google but this is a little difficult to query for.]
(MySQL database.)
messages table:
messageid
senderid
recipientid
people table:
personid
name
I wish to issue a query that returns the following:
messageid sender_name recipient_name
1 larry jane
2 mark alice
etc.
The following doesn't do it, and I expected that it would not, but it's a place to start:
select m.messageid, p.name as "sender_name", p.name as "recipient_name"
from messages m, people p
where m.senderid = p.personid and m.recipientid = p.personid
The issue is that I don't know how in sql to specifically reference the sender and the recipient since they are part of the same join clause, if that makes sense.
thanks
try:
select m.messageid, pSender.name as "sender_name", pRecipient.name as "recipient_name"
from messages m
inner join people pSender on m.senderId = pSender.personId
inner join people pRecipient on m.recipientid = pRecipient.personId
For your comma-join method (I think this should work, though I'm not very familiar with comma joins):
select m.messageid, pSender.name as "sender_name", pRecipient.name as "recipient_name"
from messages m, people pSender, people pRecipient
where m.senderid = pSender.personid and m.recipientid = pRecipient.personid
You can join the same table into the query twice; just alias it differently, something akin to:
select m.messageid, s.name as "sender_name", r.name as "recipient_name"
from messages m
inner join people s on m.senderid = s.personid
inner join people r on m.recipientid = r.personid
Your query returns only messages that a person sent to themselves, because the same people row has to match both the sender and the recipient. Try something like:
select m.messageid, p1.name as sender_name, p2.name as recipient_name
from messages m
join people p1
on m.senderid = p1.personid
join people p2
on m.recipientid = p2.personid
That is, you need one join for the sender and one join for the recipient.
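The effect of aliasing the people table twice can be checked with a short sketch in SQLite via Python. The first two rows are taken from the desired output in the question. The third message, which larry sends to himself, is an added assumption to show what the single-alias query returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (personid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE messages (messageid INTEGER PRIMARY KEY, senderid INTEGER, recipientid INTEGER);

INSERT INTO people VALUES (1, 'larry'), (2, 'jane'), (3, 'mark'), (4, 'alice');
INSERT INTO messages VALUES (1, 1, 2), (2, 3, 4), (3, 1, 1);
""")

# One alias: the same people row must be both sender and recipient.
single_alias = conn.execute("""
SELECT m.messageid, p.name AS sender_name, p.name AS recipient_name
FROM messages m, people p
WHERE m.senderid = p.personid AND m.recipientid = p.personid
ORDER BY m.messageid
""").fetchall()

# Two aliases: s is looked up by sender id, r by recipient id.
two_aliases = conn.execute("""
SELECT m.messageid, s.name AS sender_name, r.name AS recipient_name
FROM messages m
INNER JOIN people s ON m.senderid = s.personid
INNER JOIN people r ON m.recipientid = r.personid
ORDER BY m.messageid
""").fetchall()

print(single_alias)
print(two_aliases)
```

Only the self-addressed message survives the single-alias query, while the two-alias query returns one row per message.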

MySQL Join Query (possible two inner joins)

I currently have the following:
Table Town:
id
name
region
Table Supplier:
id
name
town_id
The below query returns the number of suppliers for each town:
SELECT t.id, t.name, count(s.id) as NumSupplier
FROM Town t
INNER JOIN Suppliers s ON s.town_id = t.id
GROUP BY t.id, t.name
I now wish to introduce another table in to the query, Supplier_vehicles. A supplier can have many vehicles:
Table Supplier_vehicles:
id
supplier_id
vehicle_id
Now, the NumSupplier field needs to return the number of suppliers for each town that have any of the given vehicle_id (IN condition):
The following query will simply bring back the suppliers that have any of the given vehicle_id:
SELECT * FROM Supplier s, Supplier_vehicles v WHERE s.id = v.supplier_id AND v.vehicle_id IN (1, 4, 6)
I need to integrate this in to the first query so that it returns the number of suppliers that have any of the given vehicle_id.
SELECT t.id, t.name, count(s.id) as NumSupplier
FROM Town t
INNER JOIN Suppliers s ON s.town_id = t.id
WHERE s.id IN (SELECT sv.supplier_id
FROM supplier_vehicles sv
WHERE sv.vehicle_id IN (1,4,6))
GROUP BY t.id, t.name
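Alternatively, you could bring in Supplier_vehicles with another INNER JOIN (your supplier join is already INNER, so towns with no suppliers having those vehicles are left out either way) and change COUNT(s.id) to COUNT(DISTINCT s.id), so that a supplier with several matching vehicles is counted only once.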
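To illustrate why the join variant needs COUNT(DISTINCT ...), here is a sketch in SQLite via Python. The towns, suppliers and vehicle rows are made up, and the table is named Suppliers as in the queries (the question's table list calls it Supplier).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Town (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE Suppliers (id INTEGER PRIMARY KEY, name TEXT, town_id INTEGER);
CREATE TABLE Supplier_vehicles (id INTEGER PRIMARY KEY, supplier_id INTEGER, vehicle_id INTEGER);

INSERT INTO Town VALUES (1, 'Aville', 'N'), (2, 'Bville', 'S');
INSERT INTO Suppliers VALUES (1, 'S1', 1), (2, 'S2', 1), (3, 'S3', 2);
-- S1 has vehicles 1 and 4 (two matches), S2 has vehicle 2 (no match), S3 has vehicle 6.
INSERT INTO Supplier_vehicles VALUES (1, 1, 1), (2, 1, 4), (3, 2, 2), (4, 3, 6);
""")

# IN subquery: each supplier row appears at most once.
with_in = conn.execute("""
SELECT t.id, t.name, COUNT(s.id) AS NumSupplier
FROM Town t
INNER JOIN Suppliers s ON s.town_id = t.id
WHERE s.id IN (SELECT sv.supplier_id FROM Supplier_vehicles sv
               WHERE sv.vehicle_id IN (1, 4, 6))
GROUP BY t.id, t.name
ORDER BY t.id
""").fetchall()

JOIN_QUERY = """
SELECT t.id, t.name, COUNT({what}) AS NumSupplier
FROM Town t
INNER JOIN Suppliers s ON s.town_id = t.id
INNER JOIN Supplier_vehicles sv ON sv.supplier_id = s.id
WHERE sv.vehicle_id IN (1, 4, 6)
GROUP BY t.id, t.name
ORDER BY t.id
"""
# Plain COUNT sees S1 once per matching vehicle; DISTINCT collapses that.
join_plain = conn.execute(JOIN_QUERY.format(what="s.id")).fetchall()
join_distinct = conn.execute(JOIN_QUERY.format(what="DISTINCT s.id")).fetchall()

print(with_in, join_plain, join_distinct)
```

S1 owns two of the listed vehicles, so the plain join counts it twice for Aville, while the IN form and the DISTINCT form both report one supplier per town.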
If I remember correctly, you can put your second query inside the LEFT OUTER JOIN condition.
So for example, you can do something like
...
LEFT OUTER JOIN (SELECT * FROM Supplier s, Supplier_vehicles ......) s ON s.town_id=t.id
In that way you are "integrating" or combining the two queries into one. Let me know if this works.
SELECT t.name, count(DISTINCT v.supplier_id) as NumSupplier
FROM Town t
LEFT OUTER JOIN Suppliers s ON t.id = s.town_id
LEFT OUTER JOIN Supplier_vehicles v ON s.id = v.supplier_id
AND v.vehicle_id IN (1,4,6)
GROUP BY t.name
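With outer joins, where the vehicle filter sits matters. A condition on the right-hand table in WHERE discards the NULL-extended rows, which turns the LEFT JOIN back into an inner join. The same condition in ON keeps every town. The sketch below shows both placements in SQLite via Python, on invented data where Cville has a supplier but no matching vehicle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Town (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE Suppliers (id INTEGER PRIMARY KEY, name TEXT, town_id INTEGER);
CREATE TABLE Supplier_vehicles (id INTEGER PRIMARY KEY, supplier_id INTEGER, vehicle_id INTEGER);

INSERT INTO Town VALUES (1, 'Aville', 'N'), (2, 'Bville', 'S'), (3, 'Cville', 'E');
INSERT INTO Suppliers VALUES (1, 'S1', 1), (2, 'S2', 1), (3, 'S3', 2), (4, 'S4', 3);
-- S1: vehicles 1 and 4; S2: vehicle 2; S3: vehicle 6; S4: vehicle 2.
INSERT INTO Supplier_vehicles VALUES (1, 1, 1), (2, 1, 4), (3, 2, 2), (4, 3, 6), (5, 4, 2);
""")

# Filter in WHERE: rows with no matching vehicle are removed after the join.
filter_in_where = conn.execute("""
SELECT t.name, COUNT(s.id) AS NumSupplier
FROM Town t
LEFT OUTER JOIN Suppliers s ON t.id = s.town_id
LEFT OUTER JOIN Supplier_vehicles v ON s.id = v.supplier_id
WHERE v.vehicle_id IN (1, 4, 6)
GROUP BY t.name
ORDER BY t.name
""").fetchall()

# Filter in ON: unmatched suppliers keep a NULL v row, so every town stays,
# and counting distinct v.supplier_id ignores those NULLs.
filter_in_on = conn.execute("""
SELECT t.name, COUNT(DISTINCT v.supplier_id) AS NumSupplier
FROM Town t
LEFT OUTER JOIN Suppliers s ON t.id = s.town_id
LEFT OUTER JOIN Supplier_vehicles v ON s.id = v.supplier_id
                                   AND v.vehicle_id IN (1, 4, 6)
GROUP BY t.name
ORDER BY t.name
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

Whether towns with a count of zero should appear at all depends on what the report needs. If they should not, the inner-join forms are simpler.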