Getting an average for each row based on a separate query - mysql

Above is my table schema. My task is to write a SQL command to display, for each publisher, the publisher’s name, the publisher’s location, and the
average cost of the books that the publisher sells. I have a mostly working query:
SELECT Publisher.name, Publisher.location,
(SELECT AVG(Book.cost)
FROM (Book
INNER JOIN Publisher
ON Book.publisherName = Publisher.name)
WHERE Book.publisherName = Publisher.name
) bookAverage FROM Book
INNER JOIN Publisher ON Book.publisherName = Publisher.name;
The problem is that this returns the average of all books in the Book table. How can I change this to return only the average cost of the books associated with each publisher?
Here's a fiddle with the schema implemented already:
http://sqlfiddle.com/#!9/7a9909/11/0

SELECT p.name, p.location, AVG(b.Cost) as AverageBookCost
FROM
Publisher p
INNER JOIN book b
ON b.publisherName = p.name
GROUP BY
p.name, p.location
http://sqlfiddle.com/#!9/7a9909/18
Only one join is needed: no subqueries and no inner selects, because you are looking for a straightforward aggregate over the join between the two tables. Simply specify your GROUP BY clause correctly.

You are close. You just have too many JOINs. For instance, the subquery only needs the correlation clause:
SELECT p.name, p.location,
(SELECT AVG(b.cost)
FROM Book b
WHERE b.publisherName = p.name
) as bookAverage
FROM Publisher p;
If you were to write this as a JOIN, you would properly write it using a LEFT JOIN and GROUP BY, so that publishers without any books are still returned:
SELECT p.name, p.location, AVG(b.cost) as bookAverage
FROM Publisher p LEFT JOIN
Book b
ON b.publisherName = p.name
GROUP BY p.name, p.location;
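
To see how the correlated subquery and the join forms compare, here is a minimal sketch using Python's sqlite3 module rather than MySQL. The column names follow the queries above, but the sample rows (Acme, Birch, Cedar) are made up for illustration, and Cedar deliberately has no books.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Publisher (name TEXT PRIMARY KEY, location TEXT);
    CREATE TABLE Book (title TEXT, cost REAL, publisherName TEXT);
    INSERT INTO Publisher VALUES ('Acme', 'London'), ('Birch', 'Oslo'), ('Cedar', 'Lima');
    INSERT INTO Book VALUES ('A1', 10.0, 'Acme'), ('A2', 20.0, 'Acme'), ('B1', 8.0, 'Birch');
""")

# Correlated subquery: one average per publisher row.
correlated = conn.execute("""
    SELECT p.name, p.location,
           (SELECT AVG(b.cost) FROM Book b WHERE b.publisherName = p.name) AS bookAverage
    FROM Publisher p
    ORDER BY p.name
""").fetchall()

# LEFT JOIN + GROUP BY: keeps publishers that have no books (average is NULL).
left_join = conn.execute("""
    SELECT p.name, p.location, AVG(b.cost) AS bookAverage
    FROM Publisher p
    LEFT JOIN Book b ON b.publisherName = p.name
    GROUP BY p.name, p.location
    ORDER BY p.name
""").fetchall()

# INNER JOIN + GROUP BY: publishers without books disappear from the result.
inner_join = conn.execute("""
    SELECT p.name, p.location, AVG(b.cost) AS bookAverage
    FROM Publisher p
    JOIN Book b ON b.publisherName = p.name
    GROUP BY p.name, p.location
    ORDER BY p.name
""").fetchall()

print(correlated)
print(left_join)
print(inner_join)
```

With this data the first two queries return the same three rows, with a NULL average for the publisher that has no books, while the inner join returns only the two publishers that have books.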

SELECT Publisher.name, Publisher.location, costs.avg_cost
FROM Publisher
INNER JOIN (
SELECT AVG(Book.cost) as avg_cost, Book.publisherName
FROM Book
GROUP BY Book.publisherName
) AS costs ON costs.publisherName = Publisher.name;

Complementing the great answer from @Matt, you can get the publisher name, publisher location, number of books, total cost of books, and average price per book in a single query:
SELECT b.publisherName, p.location, COUNT(*) AS total_books,
ROUND(SUM(b.cost), 2) AS books_total_cost, ROUND(AVG(b.cost), 2) AS avg_cost_per_book
FROM book b
JOIN publisher p ON p.name = b.publisherName
GROUP BY b.publisherName, p.location
ORDER BY b.publisherName
;
I think the query is fairly self-explanatory, but if you have any questions, please be my guest :)

Related

Join 2 different counts from 2 different tables into one subtable in sql

I'm having a problem: I want to count how many medals in total a country has won from both the individual and team competitions, but my query does not give me the desired outcome. So far I have managed to come up with this:
select distinct C.Cname as Country, count(i.medal) as Medals_Won
from individual_results as i, Country as C, participant as p
where (i.Olympian = p.OlympicID and C.Cname = p.country)
union
select distinct C.Cname, count(r.medal) as medals_Won
from team_results as r, Country as C, participant as p, team as t
where (r.team = t.TeamID and t.Member1 = p.OlympicID and C.Cname = p.Country)
group by C.Cname
order by medals_won desc
But this does not give me the expected result, even though running the two pieces of code separately gives me the results I want.
You say you can run your query and it gives you a result. This is bad. It indicates that you are using MySQL's notorious cheat mode that lets you run invalid queries.
You have something like this:
select ...
union
select ...
group by ...
order by ...
There are two queries the results of which you glue together, namely
select ...
and
select ...
group by ...
So, your first query becomes:
select distinct C.Cname as Country, count(i.medal) as Medals_Won
from individual_results as i, Country as C, participant as p
where (i.Olympian = p.OlympicID and C.Cname = p.country)
You COUNT medals, i.e. you aggregate your data. And there is no GROUP BY clause. So you get one result row from all your data. You say you want to count all rows for which i.medal is not null. But you also want to select the country. The country? Which??? Is there just one country in the tables? And even then your query would be invalid, because still you'd have to tell the DBMS from which row to pick the country. You can pick the maximum country (MAX(C.Cname)) for instance or the minimum country (MIN(C.Cname)), but not the country.
The DBMS should raise an error on this invalid query, but you switched that off.
Make sure in MySQL to always
SET sql_mode = 'ONLY_FULL_GROUP_BY';
It is the default in more recent versions, so either you are working with old software or you switched from good mode to bad mode voluntarily.
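
The one-row collapse described above can be reproduced with a small sketch. It uses Python's sqlite3 module, since SQLite, like MySQL with ONLY_FULL_GROUP_BY disabled, accepts a bare column next to an aggregate. The `results` table and its rows are hypothetical and much simpler than the asker's schema.

```python
import sqlite3

# Hypothetical, simplified table: one row per medal won.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (country TEXT, medal TEXT);
    INSERT INTO results VALUES ('Norway', 'gold'), ('Norway', 'silver'), ('Kenya', 'gold');
""")

# Aggregate without GROUP BY: the whole table collapses into a single row,
# and the country shown is taken from some arbitrary input row.
collapsed = conn.execute(
    "SELECT country, COUNT(medal) FROM results"
).fetchall()

# Aggregate with GROUP BY: one row per country, which is what was intended.
grouped = conn.execute(
    "SELECT country, COUNT(medal) FROM results GROUP BY country ORDER BY country"
).fetchall()

print(collapsed)
print(grouped)
```

The first query yields a single row whose count covers all three medals, whichever country happens to be displayed. The second yields one count per country.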
And talking of old software: Even at the first moment MySQL was published, comma joins had long been deprecated. They were made redundant in 1992. Please don't ever use commas in your FROM clause. Use explicit joins ([INNER] JOIN, LEFT [OUTER] JOIN, etc.) instead.
As to the task, here is a straight-forward solution with joins:
select
c.cname as country,
coalesce(i.medals, 0) as medals_individual,
coalesce(t.medals, 0) as medals_team,
coalesce(i.medals, 0) + coalesce(t.medals, 0) as medals_total
from country c
left outer join
(
select p.country, count(ir.medal) as medals
from participant p
join individual_results ir on ir.olympian = p.olympicid
group by p.country
) i on i.country = c.cname
left outer join
(
select p.country, count(tr.medal) as medals
from participant p
join team tm on tm.member1 = p.olympicid
join team_results tr on tr.team = tm.teamid
group by p.country
) t on t.country = c.cname
order by medals_total desc;
You should sum the union result of the two subqueries, grouped by country. Use UNION ALL so that identical rows coming from the two parts are not merged into one:
select t.Country, sum(t.Medals_Won) as Medals_Won
from (
select C.Cname as Country, count(i.medal) as Medals_Won
from individual_results i
inner join participant p ON i.Olympian = p.OlympicID
inner join Country C ON C.Cname = p.country
group by C.Cname
union all
select C.Cname, count(r.medal)
from team_results as r
inner join team as t ON r.team = t.TeamID
inner join participant as p ON t.Member1 = p.OlympicID
inner join Country as C ON C.Cname = p.Country
group by C.Cname
) t
group by t.Country
order by sum(t.Medals_Won) desc
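
A detail to watch in this sum-over-a-union pattern is the difference between UNION and UNION ALL: plain UNION removes duplicate rows, so a country with the same count in both parts is counted only once. The sketch below shows this with Python's sqlite3 module. The two tables are hypothetical simplifications (one row per medal, with the country stored directly) and not the asker's real schema.

```python
import sqlite3

# Hypothetical, flattened tables: one row per medal won.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE individual_medals (country TEXT, medal TEXT);
    CREATE TABLE team_medals (country TEXT, medal TEXT);
    INSERT INTO individual_medals VALUES
        ('Norway', 'gold'), ('Norway', 'silver'), ('Kenya', 'gold');
    INSERT INTO team_medals VALUES
        ('Norway', 'gold'), ('Norway', 'bronze');
""")

# The same outer query is run twice; only the set operator differs.
template = """
    SELECT country, SUM(medals) AS medals_won
    FROM (
        SELECT country, COUNT(medal) AS medals FROM individual_medals GROUP BY country
        {op}
        SELECT country, COUNT(medal) AS medals FROM team_medals GROUP BY country
    ) AS t
    GROUP BY country
    ORDER BY country
"""

# UNION merges the two identical ('Norway', 2) rows into one before summing.
with_union = conn.execute(template.format(op="UNION")).fetchall()
# UNION ALL keeps both rows, so the sum covers both competitions.
with_union_all = conn.execute(template.format(op="UNION ALL")).fetchall()

print(with_union)
print(with_union_all)
```

Here Norway has two individual and two team medals, so only the UNION ALL version reports four.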

Return the company with most film in a genre

I am working on a project at my university where I need to write a database query. I want the query to return the company with the most movies in a given genre. At the moment I have this query, but it returns only one company, and there could be more than one.
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC LIMIT 1
I believe I will need a "having" in here.
Please try this. The problem may be the LIMIT 1 you added, which shows only the first retrieved record:
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC
I assume by "category" you mean "genre" -- or that they are the same thing.
Do not use NATURAL JOIN. It does not even use properly declared foreign key relationships, instead relying merely on name similarity between tables. It is dangerous because the columns used are not specified and can introduce hard-to-debug errors. I often refer to it as an "abomination" because it does not take table declarations into account.
If you have a given category, then I would expect a WHERE clause:
SELECT c.company_id, COUNT(*) as NumberOfMovies
FROM Movie m JOIN
CategoryFilm cf
ON cf.movie_id = m.movie_id JOIN
Company c
ON c.company_id = m.company_id
WHERE cf.category_id = ?
GROUP BY c.company_id
ORDER BY NumberOfMovies DESC
LIMIT 1;
If you want to allow ties, you can use window function rank():
select *
from (
select
co.companyID,
ca.categoryID,
count(*) NumberOfMovies,
rank() over(partition by ca.categoryID order by count(*) desc) rn
from movie m
inner join categoryFilm cf on cf.movieID = m.movieID
inner join category ca on ca.categoryID = cf.categoryID
inner join company co on co.companyID = m.companyID
group by co.companyID, ca.categoryID
) t
where rn = 1
order by categoryID
This gives you the top company for each and every category, ties included. If you want to filter on a given category, you can just add a where clause to the inner query.
Side note: do not use natural joins: they are error-prone. I rewrote the query to use inner joins instead (I made a few assumptions on the relations).
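
If window functions are not available, ties can also be kept by comparing each company's count with the highest count in its category. This is a different technique from rank(), sketched here with Python's sqlite3 module. The two-table schema (movie with movieID and companyID, categoryFilm with movieID and categoryID) and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Assumed minimal schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movieID INTEGER PRIMARY KEY, companyID INTEGER);
    CREATE TABLE categoryFilm (movieID INTEGER, categoryID INTEGER);
    INSERT INTO movie VALUES (1, 10), (2, 10), (3, 20), (4, 20), (5, 30);
    INSERT INTO categoryFilm VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (5, 2);
""")

# Number of movies per (category, company) pair.
counts_sql = """
    SELECT cf.categoryID, m.companyID, COUNT(*) AS NumberOfMovies
    FROM movie m
    JOIN categoryFilm cf ON cf.movieID = m.movieID
    GROUP BY cf.categoryID, m.companyID
"""

# Keep every pair whose count equals the maximum count in its category,
# so tied companies are all returned.
top_sql = f"""
    SELECT c.categoryID, c.companyID, c.NumberOfMovies
    FROM ({counts_sql}) AS c
    WHERE c.NumberOfMovies = (
        SELECT MAX(c2.NumberOfMovies)
        FROM ({counts_sql}) AS c2
        WHERE c2.categoryID = c.categoryID
    )
    ORDER BY c.categoryID, c.companyID
"""
top_companies = conn.execute(top_sql).fetchall()
print(top_companies)
```

In this data companies 10 and 20 tie in category 1 with two movies each, so both come back, along with company 30 for category 2.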

3 table query with count

I'm having huge difficulty with a 3 table query.
The scenario is that TEAM has many or no MEMBERS, a MEMBER could have many or no TASKS. What I want to get is the number of TASKS for every TEAM. TEAM has its own ID, MEMBER holds this as a FK on TEAM_ID, TASK has MEMBER_ID on the TASK.
I want to get a report of TEAM.NAME, COUNT(Person/Team), Count(Tasks/Team)
I have myself so confused. My thinking was to use an outer join on TEAM and MEMBER so I have all the teams with any members they have. From there I'm getting totally lost. If anyone can point me in the right direction so I have something to work from, I'd be so grateful.
You want to use count distinct:
MySQL COUNT DISTINCT
select t.name as Team,
count(distinct m.ID) as Member_cnt,
count(distinct tk.ID) as Task_cnt
from team t
left join member m
on t.ID = m.TEAM_ID
left join tasks tk
on tk.MEMBER_ID = m.ID
group by t.name;
I think you can do what you want with aggregation -- and count(distinct):
select t.name,
count(distinct m.memberid) as nummembers,
count(distinct tk.taskid) as numtasks
from team t left join
member m
on t.teamid = m.teamid left join
tasks tk
on tk.memberid = m.memberid
group by t.name;
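
Why count(distinct) matters here: joining members to their tasks repeats each member once per task, so a plain COUNT over the member column counts join rows, not members. A small sketch with Python's sqlite3 module follows. The column names mirror the query above, and the sample rows are made up.

```python
import sqlite3

# Made-up rows: team Red has 2 members and 4 tasks, Blue has 1 member and
# no tasks, Empty has no members at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team (teamid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE member (memberid INTEGER PRIMARY KEY, teamid INTEGER);
    CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, memberid INTEGER);
    INSERT INTO team VALUES (1, 'Red'), (2, 'Blue'), (3, 'Empty');
    INSERT INTO member VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO tasks VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

joins = """
    FROM team t
    LEFT JOIN member m ON m.teamid = t.teamid
    LEFT JOIN tasks tk ON tk.memberid = m.memberid
    GROUP BY t.name
    ORDER BY t.name
"""

# Plain COUNT counts joined rows, so members are repeated once per task.
plain = conn.execute(
    "SELECT t.name, COUNT(m.memberid), COUNT(tk.taskid)" + joins
).fetchall()

# COUNT(DISTINCT ...) counts each member and each task once.
distinct = conn.execute(
    "SELECT t.name, COUNT(DISTINCT m.memberid), COUNT(DISTINCT tk.taskid)" + joins
).fetchall()

print(plain)
print(distinct)
```

For team Red the plain version reports four members, because member 1 appears three times, once per task. The distinct version reports the correct two.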
Try this out:
SELECT t.name, COUNT(DISTINCT p.id_person), COUNT(DISTINCT ts.id_task)
FROM Team t
LEFT JOIN Person p ON p.team_id = t.id_team
LEFT JOIN Tasks ts ON ts.person_id = p.id_person
GROUP BY t.id_team, t.name

Need help combining two functional mysql querys into one?

I have this query that will return a list of all of the people associated with Thomas and their ids.
SELECT c.name, c.ID
FROM namesandID s, associations o, namesandID c
WHERE s.name='Thomas' AND o.id = s.ID AND o.associateID = c.ID
GROUP BY c.ID;
Then I have this query that I can manually type in the id number and it will return the correct count of associates.
SELECT count(*) FROM (
SELECT associateID FROM associations WHERE id=18827 GROUP BY associateID
) AS t;
My goal is to have one query that takes Thomas as the name and returns three columns: each associate's name, their id number, and the number of people they are associated with.
As additional information, this is a very large database with about 4 million rows and 300 million associations, so any speed increase on either of these queries would be greatly welcomed.
Not tested, however the below should work:
select
c.name,
c.id,
assoc_count.cnt
from
namesandID s
inner join
associations o on
o.id = s.ID
inner join
namesandID c on
c.ID = o.associateId
left outer join
(
select
id,
count(distinct associateId) as cnt
from
associations
group by
id
) assoc_count on
assoc_count.id = c.ID
where
s.name = 'Thomas'
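
As a quick check of the join-to-a-pre-aggregated-subquery approach, here is a sketch on a tiny data set using Python's sqlite3 module. The table and column names follow the question (namesandID, associations), while the people and association rows are invented, including one duplicated association to show what COUNT(DISTINCT ...) guards against.

```python
import sqlite3

# Invented sample data; the row (2, 3) appears twice on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE namesandID (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE associations (id INTEGER, associateID INTEGER);
    INSERT INTO namesandID VALUES (1, 'Thomas'), (2, 'Ann'), (3, 'Bob'), (4, 'Cy');
    INSERT INTO associations VALUES
        (1, 2), (1, 3),
        (2, 1), (2, 3), (2, 3), (2, 4),
        (3, 1);
""")

# For each associate of Thomas, look up that associate's own number of
# distinct associates from a subquery aggregated once per id.
rows = conn.execute("""
    SELECT c.name, c.ID, assoc_count.cnt
    FROM namesandID s
    JOIN associations o ON o.id = s.ID
    JOIN namesandID c ON c.ID = o.associateID
    LEFT JOIN (
        SELECT id, COUNT(DISTINCT associateID) AS cnt
        FROM associations
        GROUP BY id
    ) AS assoc_count ON assoc_count.id = c.ID
    WHERE s.name = 'Thomas'
    ORDER BY c.ID
""").fetchall()

print(rows)
```

Ann is associated with three distinct people despite the duplicated row, and Bob with one.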
Not very efficient but
SELECT c.name, c.ID, COUNT(DISTINCT o.associateID)
FROM {the rest of the first query}
should do the trick.

SQL query, AVG and COUNT on multiple tables

I need a query, the query that i used doesn't work the way i want for some reason
Here's all the tables involved in the query.
Here's the query i want :
Show a list of books with their average ratings and its number of recommendations
result should be like this :
What i already tried :
SELECT book.isbn, AVG(ratings.rating) AS [AVG Ratings], COUNT(recommend.isbn) AS [Number of recommendation]
FROM book INNER JOIN
recommend ON book.isbn = recommend.isbn INNER JOIN
ratings ON book.isbn = ratings.isbn
GROUP BY book.isbn
But it didn't work. The AVG rating works great, but the number of recommendations does not: it conflicts with the ratings table.
here's what the result is:
However when i try each one alone, everything works great like this :
for AVG ratings :
SELECT book.isbn, AVG(ratings.rating) AS [AVG Ratings]
FROM book INNER JOIN
ratings ON book.isbn = ratings.isbn
GROUP BY book.isbn
Here's the result :
And for the # of recommendations :
SELECT book.isbn, COUNT(recommend.isbn) AS [Number of recommendation]
FROM book INNER JOIN
recommend ON book.isbn = recommend.isbn
GROUP BY book.isbn
Here's the result :
So i want a query to combine the two views into one view
If you want to get accurate results, then you need to do the aggregations before the join:
SELECT b.isbn, r.AvgRating, re.NumRecommendation
FROM book b LEFT JOIN
(SELECT r.isbn, AVG(r.rating) as AvgRating
FROM ratings r
GROUP BY r.isbn
) r
ON b.isbn = r.isbn LEFT JOIN
(SELECT re.isbn, COUNT(*) as NumRecommendation
FROM recommend re
GROUP BY re.isbn
) re
ON b.isbn = re.isbn;
Note that I also switched to left outer joins, so you will get results for all books, even those that are missing either ratings or recommendations.
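
The effect of aggregating before the join can be seen on a small data set. The sketch below uses Python's sqlite3 module. The table names match the question (book, ratings, recommend), but the `iduser` column and all rows are assumptions for illustration.

```python
import sqlite3

# Made-up rows: book A has 2 ratings and 3 recommendations, B has only a
# rating, C has only a recommendation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (isbn TEXT PRIMARY KEY);
    CREATE TABLE ratings (isbn TEXT, rating INTEGER);
    CREATE TABLE recommend (isbn TEXT, iduser INTEGER);
    INSERT INTO book VALUES ('A'), ('B'), ('C');
    INSERT INTO ratings VALUES ('A', 4), ('A', 2), ('B', 5);
    INSERT INTO recommend VALUES ('A', 1), ('A', 2), ('A', 3), ('C', 1);
""")

# Joining both detail tables first pairs every rating with every
# recommendation of the same book, which inflates the count.
naive = conn.execute("""
    SELECT b.isbn, AVG(ra.rating), COUNT(re.isbn)
    FROM book b
    JOIN recommend re ON b.isbn = re.isbn
    JOIN ratings ra ON b.isbn = ra.isbn
    GROUP BY b.isbn
    ORDER BY b.isbn
""").fetchall()

# Aggregating each detail table on its own, then joining one row per book.
pre_aggregated = conn.execute("""
    SELECT b.isbn, r.AvgRating, re.NumRecommendation
    FROM book b
    LEFT JOIN (SELECT isbn, AVG(rating) AS AvgRating
               FROM ratings GROUP BY isbn) AS r ON b.isbn = r.isbn
    LEFT JOIN (SELECT isbn, COUNT(*) AS NumRecommendation
               FROM recommend GROUP BY isbn) AS re ON b.isbn = re.isbn
    ORDER BY b.isbn
""").fetchall()

print(naive)
print(pre_aggregated)
```

The naive query reports six recommendations for book A (3 recommendations times 2 ratings) and drops books B and C. The pre-aggregated query reports three for A and keeps all books, with NULL where a book has no ratings or no recommendations.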
SELECT
book.isbn,
IFNULL(AVG(ratings.rating),"Not yet rated") AS [AVG Ratings],
IFNULL(COUNT(DISTINCT recommend.iduse),0) AS [Number of recommendation]
FROM
book
LEFT JOIN recommend ON book.isbn = recommend.isbn
LEFT JOIN ratings ON book.isbn = ratings.isbn
GROUP BY book.isbn
Should do the trick:
Counting the distinct users recommending a book will fix the cartesian product issue
Left joins with the usual IFNULL() plumbing will make it work on books that have either no recommendations or no ratings
If you are using SQL Server you can use the over clause:
SELECT
B.isbn,
AVG(ra.rating) OVER (PARTITION BY B.isbn) AS [AVG RATE],
COUNT(re.isbn) OVER (PARTITION BY B.isbn) AS [RECOMMEND COUNT]
FROM Book B
LEFT JOIN recommend re ON B.isbn = re.isbn
LEFT JOIN ratings ra ON B.isbn = ra.isbn