SQL - Find object with the highest count in a column - mysql

I am answering questions about an IMDB database as shown below.
I need to find which TV show (which is a kind_type that shows up as 'tv series') has the most episodes, actors and actresses, and seasons (these are separate parts of the question).
To start off, I wrote a query to find the name of the TV show that has the most actresses:
SELECT *
FROM (
SELECT DISTINCT t.title, count(t.title) total
FROM title t
INNER JOIN kind_type k
ON (t.kind_id = k.id)
INNER JOIN cast_info c
ON (c.movie_id = t.id)
CROSS JOIN role_type r
GROUP BY t.title
HAVING r.role = 'actress' AND k.kind = 'tv series'
ORDER BY total DESC
) as newTable
LIMIT 1
However, I get the error:
column "r.role" must appear in the GROUP
BY clause or be used in an aggregate function
LINE 11: HAVING r.role = 'actress' AND k.kind = 'tv series'
So you can think of it as having a lot of cast_info objects, each attached to role_type objects. Each cast_info also has a variable for the movie_id, and I aimed to select a list of all cast_info objects that had role_types with the role 'actress', and then pick out the most frequently occurring 'movie_id' that shows up in that list.
Example:
In this example, the query should ideally return "3" because that is the movie ID that has the most actresses.
Any tips would be greatly appreciated.

This is a simple fix and likely just a mistake on your part.
You're receiving the error because you're putting a regular row-level condition inside your HAVING clause. HAVING is evaluated after grouping and is meant for conditions on aggregate functions, so it can only refer to grouped columns and aggregates.
For example, if you were trying to select only rows with a total greater than 2, you would use HAVING:
HAVING total > 2
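However, what you want needs to go in a WHERE clause, which filters rows before they are grouped. The CROSS JOIN on role_type also has to become a join on c.role_id; otherwise every cast row is paired with every role. Try this: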
SELECT t.title, COUNT(*) AS total
FROM title t
INNER JOIN kind_type k
ON (t.kind_id = k.id)
INNER JOIN cast_info c
ON (c.movie_id = t.id)
-- join each cast row to its own role instead of cross joining every role
INNER JOIN role_type r
ON (r.id = c.role_id)
WHERE r.role = 'actress' AND k.kind = 'tv series'
GROUP BY t.title
-- ORDER BY and LIMIT sit in the same query, so the top row is well defined
ORDER BY total DESC
LIMIT 1
Here is more info on the HAVING clause.
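The split between WHERE and HAVING described above can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, the table and column names follow the question, and the rows are invented for illustration (kind_type is left out to keep it short).

```python
import sqlite3

# Hypothetical miniature of the schema; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE role_type (id INTEGER PRIMARY KEY, role TEXT);
CREATE TABLE cast_info (movie_id INTEGER, role_id INTEGER);
INSERT INTO title VALUES (1, 'Show A'), (2, 'Show B'), (3, 'Show C');
INSERT INTO role_type VALUES (1, 'actor'), (2, 'actress');
INSERT INTO cast_info VALUES
    (1, 2), (1, 1), (1, 1),
    (2, 2), (2, 2),
    (3, 2), (3, 2), (3, 2), (3, 1);
""")

# WHERE filters individual rows before grouping;
# HAVING filters whole groups once the counts exist.
rows = conn.execute("""
    SELECT t.title, COUNT(*) AS total
    FROM title t
    JOIN cast_info c ON c.movie_id = t.id
    JOIN role_type r ON r.id = c.role_id
    WHERE r.role = 'actress'
    GROUP BY t.title
    HAVING COUNT(*) >= 2
    ORDER BY total DESC
""").fetchall()

print(rows)
```

Show A has only one actress row, so the HAVING condition drops it, while the actor rows never reach the grouping step at all.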

Related

Join 2 different counts from 2 different tables into one subtable in sql

I'm having a problem: I want to count how many medals in total a country has won across both the individual and the team competitions, but my query does not give me the desired outcome. So far I have managed to come up with this.
select distinct C.Cname as Country, count(i.medal) as Medals_Won
from individual_results as i, Country as C, participant as p
where (i.Olympian = p.OlympicID and C.Cname = p.country)
union
select distinct C.Cname, count(r.medal) as medals_Won
from team_results as r, Country as C, participant as p, team as t
where (r.team = t.TeamID and t.Member1 = p.OlympicID and C.Cname = p.Country)
group by C.Cname
order by medals_won desc
But I get this result.
Yet if I run the two pieces of code separately, I get the results I want.
You say you can run your query and it gives you a result. This is bad. It indicates that you are using MySQL's notorious cheat mode that lets you run invalid queries.
You have something like this:
select ...
union
select ...
group by ...
order by ...
There are two queries the results of which you glue together, namely
select ...
and
select ...
group by ...
So, your first query becomes:
select distinct C.Cname as Country, count(i.medal) as Medals_Won
from individual_results as i, Country as C, participant as p
where (i.Olympian = p.OlympicID and C.Cname = p.country)
You COUNT medals, i.e. you aggregate your data. And there is no GROUP BY clause. So you get one result row from all your data. You say you want to count all rows for which i.medal is not null. But you also want to select the country. The country? Which??? Is there just one country in the tables? And even then your query would be invalid, because still you'd have to tell the DBMS from which row to pick the country. You can pick the maximum country (MAX(C.Cname)) for instance or the minimum country (MIN(C.Cname)), but not the country.
The DBMS should raise an error on this invalid query, but you switched that off.
Make sure in MySQL to always
SET sql_mode = 'ONLY_FULL_GROUP_BY';
It is the default in more recent versions, so either you are working with old software or you switched from good mode to bad mode voluntarily.
And talking of old software: Even at the first moment MySQL was published, comma joins had long been deprecated. They were made redundant in 1992. Please don't ever use commas in your FROM clause. Use explicit joins ([INNER] JOIN, LEFT [OUTER] JOIN, etc.) instead.
As to the task, here is a straightforward solution with joins:
select
c.cname as country,
coalesce(i.medals, 0) as medals_individual,
coalesce(t.medals, 0) as medals_team,
coalesce(i.medals, 0) + coalesce(t.medals, 0) as medals_total
from country c
left outer join
(
-- medals per country from individual events
select p.country, count(ir.medal) as medals
from participant p
join individual_results ir on ir.olympian = p.olympicid
group by p.country
) i on i.country = c.cname
left outer join
(
-- medals per country from team events
select p.country, count(tr.medal) as medals
from participant p
join team t on t.member1 = p.olympicid
join team_results tr on tr.team = t.teamid
group by p.country
) t on t.country = c.cname
order by medals_total desc;
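As a rough check of the approach above (aggregate each source on its own, then LEFT JOIN both onto country), here is a minimal sketch using Python's sqlite3 module. The table and column names are taken from the question; the rows and the derived-table aliases ind and tm are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (cname TEXT PRIMARY KEY);
CREATE TABLE participant (olympicid INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE individual_results (olympian INTEGER, medal TEXT);
CREATE TABLE team (teamid INTEGER PRIMARY KEY, member1 INTEGER);
CREATE TABLE team_results (team INTEGER, medal TEXT);
-- invented sample rows
INSERT INTO country VALUES ('Norway'), ('Kenya'), ('Chile');
INSERT INTO participant VALUES (1, 'Norway'), (2, 'Norway'), (3, 'Kenya');
INSERT INTO individual_results VALUES
    (1, 'gold'), (2, 'silver'), (3, 'gold'), (3, NULL);
INSERT INTO team VALUES (10, 1);
INSERT INTO team_results VALUES (10, 'bronze');
""")

rows = conn.execute("""
    SELECT c.cname,
           COALESCE(ind.medals, 0) AS medals_individual,
           COALESCE(tm.medals, 0) AS medals_team,
           COALESCE(ind.medals, 0) + COALESCE(tm.medals, 0) AS medals_total
    FROM country c
    LEFT JOIN (
        -- COUNT(medal) skips NULLs, so results without a medal are ignored
        SELECT p.country, COUNT(ir.medal) AS medals
        FROM participant p
        JOIN individual_results ir ON ir.olympian = p.olympicid
        GROUP BY p.country
    ) ind ON ind.country = c.cname
    LEFT JOIN (
        SELECT p.country, COUNT(tr.medal) AS medals
        FROM participant p
        JOIN team t ON t.member1 = p.olympicid
        JOIN team_results tr ON tr.team = t.teamid
        GROUP BY p.country
    ) tm ON tm.country = c.cname
    ORDER BY medals_total DESC, c.cname
""").fetchall()

for row in rows:
    print(row)
```

Because each subquery returns at most one row per country, the two joins cannot multiply each other's rows, and a country with no medals still shows up with zeros.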
You should sum the results of the union, grouped by country. Use UNION ALL rather than UNION, so that a country with the same count in both subqueries is not collapsed into a single row:
select t.Country, sum(t.Medals_Won) as Total_Medals
from (
select C.Cname as Country, count(i.medal) as Medals_Won
from individual_results i
inner join participant p ON i.Olympian = p.OlympicID
inner join Country C ON C.Cname = p.country
group by C.Cname
union all
select C.Cname, count(r.medal)
from team_results as r
inner join team as t ON r.team = t.TeamID
inner join participant as p ON t.Member1 = p.OlympicID
inner join Country as C ON C.Cname = p.Country
group by C.Cname
) t
group by t.Country
order by Total_Medals desc
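One detail worth checking when summing over a union is the choice between UNION and UNION ALL: plain UNION removes duplicate rows, so two identical (country, count) pairs would be counted once. The sketch below uses Python's sqlite3 module; the tables individual_counts and team_counts are hypothetical stand-ins for the two grouped subqueries, and the helper total_medals is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical stand-ins for the two grouped subqueries
CREATE TABLE individual_counts (country TEXT, medals INTEGER);
CREATE TABLE team_counts (country TEXT, medals INTEGER);
INSERT INTO individual_counts VALUES ('Norway', 2), ('Kenya', 1);
INSERT INTO team_counts VALUES ('Norway', 2), ('Kenya', 3);
""")


def total_medals(set_operator):
    """Sum the per-country counts, combining both sources with the given operator."""
    query = f"""
        SELECT country, SUM(medals)
        FROM (
            SELECT country, medals FROM individual_counts
            {set_operator}
            SELECT country, medals FROM team_counts
        ) AS u
        GROUP BY country
        ORDER BY country
    """
    return conn.execute(query).fetchall()


with_union = total_medals("UNION")
with_union_all = total_medals("UNION ALL")
print(with_union)
print(with_union_all)
```

With UNION, Norway's two identical rows merge and its total comes out too low; UNION ALL keeps both rows.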

Return the company with most film in a genre

I am working on this project at my university, where I need to create a query against the database. I want the query to return the company with the most movies in a given genre. At the moment I have this query, but it only returns one company, although there may well be more than one.
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC LIMIT 1
I believe I will need a "having" in here.
Please try this. It may be because you added LIMIT 1, which shows only the first record retrieved.
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC
I assume by "category" you mean "genre" -- or that they are the same thing.
Do not use NATURAL JOIN. It does not even use properly declared foreign key relationships, instead relying merely on name similarity between tables. It is dangerous because the columns used are not specified and can introduce hard-to-debug errors. I often refer to it as an "abomination" because it does not take table declarations into account.
If you have a given category, then I would expect a WHERE clause:
SELECT c.company_id, COUNT(*) as NumberOfMovies
FROM Movie m JOIN
CategoryFilm cf
ON cf.movie_id = m.movie_id JOIN
Company c
ON c.company_id = m.company_id
WHERE cf.category_id = ?
-- the category is fixed by the WHERE clause, so group by company
GROUP BY c.company_id
ORDER BY NumberOfMovies DESC
LIMIT 1;
If you want to allow ties, you can use window function rank():
select *
from (
select
co.companyID,
ca.categoryID,
count(*) NumberOfMovies,
rank() over(partition by ca.categoryID order by count(*) desc) rn
from movie m
inner join categoryFilm cf on cf.movieID = m.movieID
inner join category ca on ca.categoryID = cf.categoryID
inner join company co on co.companyID = m.companyID
group by co.companyID, ca.categoryID
) t
where rn = 1
order by t.categoryID
This gives you the top company for each and every category, ties included. If you want to filter on a given category, you can just add a where clause to the inner query.
Side note: do not use natural joins: they are error-prone. I rewrote the query to use inner joins instead (I made a few assumptions on the relations).
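To see how rank() keeps ties where LIMIT 1 drops them, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The table and column names follow the answer above, the rows are invented, and the counts are computed in an inner query first, which is a slight restructuring of the query rather than a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (movieID INTEGER PRIMARY KEY, companyID INTEGER);
CREATE TABLE categoryFilm (movieID INTEGER, categoryID INTEGER);
-- invented rows: companies 1 and 2 tie with two movies each in category 1
INSERT INTO movie VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3);
INSERT INTO categoryFilm VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1);
""")

counts_sql = """
    SELECT m.companyID, cf.categoryID, COUNT(*) AS NumberOfMovies
    FROM movie m
    JOIN categoryFilm cf ON cf.movieID = m.movieID
    GROUP BY m.companyID, cf.categoryID
"""

# LIMIT 1 returns a single row even when two companies share first place.
top_one = conn.execute(
    counts_sql + " ORDER BY NumberOfMovies DESC LIMIT 1"
).fetchall()

# RANK() gives every tied company rank 1, so both survive the filter.
top_with_ties = conn.execute(f"""
    SELECT companyID, categoryID, NumberOfMovies
    FROM (
        SELECT counts.*,
               RANK() OVER (PARTITION BY categoryID
                            ORDER BY NumberOfMovies DESC) AS rn
        FROM ({counts_sql}) AS counts
    ) AS ranked
    WHERE rn = 1
    ORDER BY categoryID, companyID
""").fetchall()

print(top_one)
print(top_with_ties)
```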

SQL Count in LEFT Joint Aggregate?

I have four tables, as in the picture below.
I want to count how many students have status 'v' and a submission of type '1' in the submission table, grouped by student_id, so that in the end I get a table like this.
I have tried an SQL query like this:
select p.id, (SELECT count(*) FROM (select b.id from student as a , submission as b WHERE a.id = b.student_id and b.id_submission_type =1 and a.status_n='v' and a.id_academic_programe = p.id GROUP BY b.student_id) ) from academic_programe as p
But it gives me this error:
1054 - Unknown column 'p.id' in 'where clause'
Any suggestions? Sorry for my English.
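A derived table (a subquery in the FROM clause) cannot reference columns of the outer query here, so p.id is unknown inside the nested subquery. Fortunately, this is easy to fix by dropping the extra nesting level: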
select p.id,
(select count(distinct st.id) -- each student is counted once
from student st join
submission su
on st.id = su.student_id and
su.id_submission_type = 1 and
st.status_n = 'v'
where st.id_academic_programe = p.id
)
from academic_programe p;
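A correlated scalar subquery of this shape can be sketched with Python's sqlite3 module. The table and column names follow the question and the rows are invented. The sketch returns both a distinct student count and a plain row count, since a student with several type 1 submissions contributes several joined rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE academic_programe (id INTEGER PRIMARY KEY);
CREATE TABLE student (id INTEGER PRIMARY KEY, status_n TEXT,
                      id_academic_programe INTEGER);
CREATE TABLE submission (id INTEGER PRIMARY KEY, student_id INTEGER,
                         id_submission_type INTEGER);
-- invented rows
INSERT INTO academic_programe VALUES (1), (2), (3);
INSERT INTO student VALUES (1, 'v', 1), (2, 'v', 1), (3, 'x', 1), (4, 'v', 2);
INSERT INTO submission VALUES
    (1, 1, 1), (2, 1, 1),  -- student 1 submitted type 1 twice
    (3, 2, 2),             -- student 2 only has a type 2 submission
    (4, 3, 1),             -- student 3 does not have status 'v'
    (5, 4, 1);
""")

# The subqueries sit directly in the SELECT list, so they can see p.id.
rows = conn.execute("""
    SELECT p.id,
           (SELECT COUNT(DISTINCT st.id)
            FROM student st
            JOIN submission su ON su.student_id = st.id
            WHERE su.id_submission_type = 1
              AND st.status_n = 'v'
              AND st.id_academic_programe = p.id) AS students,
           (SELECT COUNT(*)
            FROM student st
            JOIN submission su ON su.student_id = st.id
            WHERE su.id_submission_type = 1
              AND st.status_n = 'v'
              AND st.id_academic_programe = p.id) AS submission_rows
    FROM academic_programe p
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

Programme 3 has no students at all and still appears, with a count of zero.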
Try this out:
select c.academic_program_name, count(distinct a.id) as count
from
(select * from student where status_n = 'v') a
inner join
(select * from submission where id_submission_type = 1) b
on a.id = b.student_id
inner join
academic_programe c
on a.id_academic_programe = c.id
group by c.academic_program_name;
Let me know in case of any queries.
Please try the following...
SELECT student.id,
student_name,
academic_program_name AS Programe,
COUNT( status_n ) AS status_n_count
FROM student
JOIN Submission ON student.id = Submission.student_id
AND id_submission_type = 1
AND status_n = 'v'
RIGHT JOIN academic_programe ON student.id_academic_programe = academic_programe.id
GROUP BY student.id,
student_name,
academic_program_name;
This statement begins by joining student and Submission, keeping only students with status_n 'v' and submissions of type 1. The result is then RIGHT JOINed to academic_programe, so that each academic program is listed along with the details of its matching students, and programs with no matching students are still listed. The filters sit in the ON clause rather than in a WHERE clause, because a WHERE clause would discard those unmatched programs again.
The resulting dataset is then GROUPed and SELECTed.
If you have any questions or comments then please feel free to post a Comment accordingly.

Find most common value of a table

How could I find the most common value in table player_frags of either lasthit or mostdamage and order by asc?
SELECT DISTINCT(name) FROM players p
INNER JOIN player_frags pf ON pf.lasthit = p.name
OR pf.mostdamage = p.name
SELECT name FROM players p
INNER JOIN player_frags pf ON pf.lasthit = p.name
OR pf.mostdamage = p.name GROUP BY name Order By COUNT(*) DESC
You could add LIMIT 1 at the end for the most common name.
I don't think you tried, this looks exactly like the other SQL you posted in the other question you sent
Anyway this will return name vs frequency of appearing:
SELECT COUNT(*) AS Freq, name
FROM players
GROUP BY players.name
ORDER BY COUNT(*)
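A different technique from the OR join shown earlier is to stack the two columns with UNION ALL and count the names in the combined list. This is only a sketch using Python's sqlite3 module; the column names lasthit and mostdamage come from the question, and the rows are invented. Note that a name appearing in both columns of the same row is counted twice here, whereas the OR join counts that row once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player_frags (lasthit TEXT, mostdamage TEXT);
-- invented rows
INSERT INTO player_frags VALUES
    ('ann', 'bob'),
    ('ann', 'ann'),
    ('cid', 'bob'),
    ('ann', 'cid');
""")

# Stack both columns into one list of names, then count each name.
rows = conn.execute("""
    SELECT name, COUNT(*) AS freq
    FROM (
        SELECT lasthit AS name FROM player_frags
        UNION ALL
        SELECT mostdamage FROM player_frags
    ) AS all_names
    GROUP BY name
    ORDER BY freq DESC, name
""").fetchall()

print(rows)
```

Adding LIMIT 1 to the outer query would keep only the most common name.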

Multiple Grouping in mysql queries. Group_concat? Group_by? Inner joins? Where am I going wrong?

I'm finding trouble finding a similar example to what I'm trying to achieve. I have 3 tables. From one table I want to get the linking ID number. From another table I want to find the same ID's and add up another column of numbers in that table where the ID number from the 1st table matches. Then on the 3rd table, which is text, I want to group all the text together where the ID matches the main ID number... and return all this in 1 go. My diagram should show what I mean:
So I have 2 queries that will each return part of the results on their own, but I'm struggling to build them into 1 single query.
SELECT ticket_charges.ticket_id
, sum(ticket_charges.charge_time) AS Seconds
FROM
ticket_charges
LEFT OUTER JOIN tickets
ON ticket_charges.ticket_id = tickets.id
GROUP BY
ticket_charges.ticket_id
, tickets.id
The 77 and 937 for ticket ID 3 have been added up correctly!!
SELECT tickets.id AS `Ticket Number`
, left(tickets_messages.message, 500) AS `Ticket Message`
FROM
tickets
INNER JOIN tickets_messages
ON tickets.id = tickets_messages.id
GROUP BY
tickets_messages.ticket_id
, tickets.id
The messages are joined together correctly.
I've tried some concatenation on messages, selects within selects, different methods to group by, a couple of sums etc., but I just can't seem to get the results of both queries back correctly as 1 single query. Either the summed numbers from "charge_time" are very wrong and bear no resemblance to anything, or I end up with hundreds of "message" rows and strange numbers in "charge_time".
FYI.. If I try this, I get "Sub query returned more than 1 row" but it's what I thought I should be doing.
SELECT ticket_charges.ticket_id
, sum(ticket_charges.charge_time) AS Seconds
FROM
ticket_charges
LEFT OUTER JOIN tickets
ON ticket_charges.ticket_id = tickets.id
Where (SELECT left(tickets_messages.message, 500)
FROM
tickets
INNER JOIN tickets_messages
ON tickets.id = tickets_messages.id
GROUP BY
tickets.id)
GROUP BY
ticket_charges.ticket_id
, tickets.id
If you really need to do that with a single query, the solution is to use a subquery in one of the joins.
SELECT t.id, t.person_id, SUM(tc.charge_time), mc.concat
FROM tickets t
INNER JOIN ticket_charges tc ON tc.ticket_id = t.id
INNER JOIN (
SELECT ticket_id, GROUP_CONCAT(message SEPARATOR ' ') as concat
FROM tickets_messages
GROUP BY ticket_id) AS mc
ON mc.ticket_id = t.id
GROUP BY t.id
Try this query -
SELECT
t.id,
t.person_id,
SUM(tc.charge_time) Seconds,
GROUP_CONCAT(LEFT(tm.message, 20)) Message
FROM
tickets t
LEFT JOIN ticket_charges tc
ON tc.ticket_id = t.id
LEFT JOIN tickets_messages tm
ON tm.ticket_id = t.id
GROUP BY
t.id;
Note that I used LEFT(tm.message, 20) because the GROUP_CONCAT function has a length limit, set by group_concat_max_len.
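The "strange numbers" mentioned in the question are what you would expect when two one-to-many tables are joined to tickets in the same query: every charge row is repeated once per message. The sketch below, using Python's sqlite3 module, compares a direct double join with aggregating each child table before joining. Table names follow the question and the charges 77 and 937 for ticket 3 are taken from it; the other rows are invented. SQLite writes the separator as a second argument, group_concat(message, ' | '), where MySQL uses GROUP_CONCAT(message SEPARATOR ' | ').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tickets (id INTEGER PRIMARY KEY, person_id INTEGER);
CREATE TABLE ticket_charges (ticket_id INTEGER, charge_time INTEGER);
CREATE TABLE tickets_messages (ticket_id INTEGER, message TEXT);
INSERT INTO tickets VALUES (3, 7), (4, 8);
INSERT INTO ticket_charges VALUES (3, 77), (3, 937), (4, 50);
INSERT INTO tickets_messages VALUES (3, 'first'), (3, 'second'), (4, 'only');
""")

# Joining both child tables directly repeats each charge once per message.
naive = conn.execute("""
    SELECT t.id, SUM(tc.charge_time)
    FROM tickets t
    LEFT JOIN ticket_charges tc ON tc.ticket_id = t.id
    LEFT JOIN tickets_messages tm ON tm.ticket_id = t.id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

# Aggregating each child table first leaves one row per ticket on each side.
pre_aggregated = conn.execute("""
    SELECT t.id, c.seconds, m.messages
    FROM tickets t
    LEFT JOIN (
        SELECT ticket_id, SUM(charge_time) AS seconds
        FROM ticket_charges
        GROUP BY ticket_id
    ) c ON c.ticket_id = t.id
    LEFT JOIN (
        SELECT ticket_id, group_concat(message, ' | ') AS messages
        FROM tickets_messages
        GROUP BY ticket_id
    ) m ON m.ticket_id = t.id
    ORDER BY t.id
""").fetchall()

print(naive)
print(pre_aggregated)
```

Ticket 3 has two charges and two messages, so the direct join sums each charge twice, while the pre-aggregated version returns 77 + 937.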