I am using MySQL and am trying to create a query to solve this question:
Average number of borrowed books by occupation
My plan was to count the number of instances of 'BorrowID' because each time a book is borrowed it creates a unique BorrowID. Then group those by clientID so that each person has their total listed books borrowed. Then this is where I start to get lost, as I obviously want to average all the grouped occupations however I am not sure if I am doing that...
First I tried:
SELECT client.Occupation, AVG(BorrowIDCount)
FROM
(
SELECT COUNT(BorrowID) as BorrowIDCount
FROM client, borrower
WHERE client.ClientID = borrower.ClientID
GROUP BY borrower.ClientID
) as x
GROUP BY Occupation
But it gives the error:
Unknown column 'client.Occupation' in 'field list'
Which I thought was because the outer query needed to know which tables...
So then I tried:
SELECT client.Occupation, AVG(BorrowIDCount)
FROM client, borrower
WHERE client.ClientID = borrower.ClientID AND
(
SELECT COUNT(BorrowID) as BorrowIDCount
FROM client, borrower
WHERE client.ClientID = borrower.ClientID
GROUP BY borrower.ClientID
)
GROUP BY Occupation
It didn't like the alias for the subquery so I removed it although no idea why, however it then gave this error:
Unknown column 'BorrowIDCount' in 'field list'
I feel like I may be completely off base in terms of how to create this query but I also feel that I might be close and am just not understanding some rules or syntax here. Any help in the right direction would be incredibly appreciated.
Thanks!
It looks to me like you want to figure out the number of books borrowed by client, then to average that number by occupation. So let's do it in steps:
First is a subquery to get the books per client.
SELECT COUNT(*) borrowed, ClientID
FROM borrower
GROUP BY ClientID
Next, we use that subquery in an outer query to get the average you want.
SELECT AVG(byclient.borrowed) average_borrowed,
client.Occupation
FROM (
SELECT COUNT(*) borrowed, ClientID
FROM borrower
GROUP BY ClientID
) byclient
LEFT JOIN client ON byclient.ClientID = client.ClientID
GROUP BY client.Occupation
ORDER BY AVG(byclient.borrowed) DESC, client.Occupation;
Your requirement calls for an aggregate of an aggregate, so you must nest one aggregate query inside another.
LEFT JOIN keeps borrow counts whose ClientID has no matching row in client; those are grouped under a NULL occupation. If you don't want that, just use JOIN.
The first query in your question failed because your FROM clause referred to a subquery (a virtual table) that lacks the Occupation column. The second one failed because AND (virtual table) doesn't mean anything in SQL.
This nesting of virtual tables is the Structured part of Structured Query Language.
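As a rough way to check the aggregate-of-an-aggregate idea, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table and column names follow the question; the sample rows are made up purely for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (ClientID INTEGER PRIMARY KEY, Occupation TEXT);
    CREATE TABLE borrower (BorrowID INTEGER PRIMARY KEY, ClientID INTEGER);
    INSERT INTO client VALUES (1, 'Nurse'), (2, 'Nurse'), (3, 'Teacher');
    -- client 1 borrows 3 books, client 2 borrows 1, client 3 borrows 4
    INSERT INTO borrower (ClientID) VALUES (1), (1), (1), (2), (3), (3), (3), (3);
""")

query = """
    SELECT client.Occupation, AVG(byclient.borrowed) AS average_borrowed
    FROM (
        -- inner aggregate: books per client
        SELECT ClientID, COUNT(*) AS borrowed
        FROM borrower
        GROUP BY ClientID
    ) AS byclient
    JOIN client ON byclient.ClientID = client.ClientID
    -- outer aggregate: average of those counts per occupation
    GROUP BY client.Occupation
    ORDER BY average_borrowed DESC, client.Occupation
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that clients who never borrowed anything have no row in borrower, so they do not pull the average down in this form of the query.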
I am working on a simple problem set, and I cannot seem to find the issue that is generating this same error: "Syntax Error in FROM Clause".
The question involves the use of various databases, in this instance to find "Which employee has sold the most product?"
Here is my code
SELECT (Employees.FirstName + Employees.LastName) as Employee, SUM(Orders.Quantity)
FROM Employees, Orders
JOIN Employees ON Orders.EmployeeID=Employees.EmployeeID
JOIN OrderDetails ON Orders.OrderID=OrderDetails.OrderID
GROUP BY Employee
ORDER BY max(SUM(Quantity)) DESC;
If I am misinterpreting the use of some syntax, please let me know. I am still learning.
Thanks for your help!
When you're using ANSI JOIN you don't list all the tables in the FROM clause. Just list the first table, and the other tables are in JOIN.
You also can't nest aggregate functions as MAX(SUM(Quantity)). If you want to find the employee who sold the most, order by quantity, and use TOP 1 to get the first row.
There's no need to join with OrderDetails, since you're not using anything from that table.
The query should be:
SELECT TOP 1 (Employees.FirstName + Employees.LastName) as Employee, SUM(Orders.Quantity) AS Quantity
FROM Employees
JOIN Orders ON Orders.EmployeeID=Employees.EmployeeID
GROUP BY Employees.FirstName, Employees.LastName
ORDER BY SUM(Orders.Quantity) DESC;
Note that if there's a tie for the most sold, this will just show one of them. Getting all of them is more complex, because you need a second query to get that maximum. See the related question "sql HAVING max(count()) return zero rows".
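One possible way to return every employee tied for the top spot is to compute the per-employee totals once and compare each total against the maximum of those totals. The sketch below uses Python's sqlite3 module, so it uses `||` for concatenation and has no `TOP`; the sample rows are invented, and it assumes, as the answer does, that Orders has a Quantity column.

```python
import sqlite3

# Hypothetical sample data; Ann and Bob tie with 10 units each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, EmployeeID INTEGER, Quantity INTEGER);
    INSERT INTO Employees VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
    INSERT INTO Orders (EmployeeID, Quantity) VALUES (1, 5), (1, 5), (2, 10), (3, 4);
""")

# First query: total quantity per employee.
per_employee = """
    SELECT e.EmployeeID,
           e.FirstName || ' ' || e.LastName AS Employee,
           SUM(o.Quantity) AS Total
    FROM Employees AS e
    JOIN Orders AS o ON o.EmployeeID = e.EmployeeID
    GROUP BY e.EmployeeID, e.FirstName, e.LastName
"""

# Second query: keep only the rows whose total equals the largest total.
query = f"""
    SELECT Employee, Total
    FROM ({per_employee}) AS totals
    WHERE Total = (SELECT MAX(Total) FROM ({per_employee}) AS t2)
    ORDER BY Employee
"""
rows = conn.execute(query).fetchall()
print(rows)
```

This avoids nesting aggregates directly: MAX is applied to the output of the grouped subquery, not to SUM itself.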
select Accounts.name, Accounts.regno, Accounts.model, Accounts.slacc, count(servicing.dt) as total
from Accounts
left outer join servicing on Accounts.slacc = servicing.slacc
group by Accounts.slacc,Accounts.name
The error message is
Major Error 0x80040E14, Minor Error 25515
> select Accounts.name,Accounts .model , Accounts.regno, Accounts.slacc, count (servicing.dt) as total from Accounts left outer join servicing on Accounts.slacc = servicing.slacc group by Accounts.slacc,Accounts.name
In aggregate and grouping expressions, the SELECT clause can contain only
aggregates and grouping expressions. [ Select clause = Accounts,model ]
Your query has a group by clause. If you use a group by clause in the query, then every column in the select statement has to do one of two things - either it has to be part of the group by list, or it has to be an aggregate of some kind (Sum, Count, Avg, Max, etc). If you don't do this, SQL doesn't know what to do with the column. In your case Accounts.regno and Accounts.model are listed in the select, but they are not in the group by clause and they are not aggregates - hence your error.
Assume for the moment you have two account records with the same account name and slacc, but different Regno (or model). The group by clause says they have to be joined into one record for display, but you haven't told SQL how to do that. It doesn't matter if the data isn't like that, SQL looks for possible errors first.
In this case, you probably just want all the details grouped. The simplest way is just to make sure you add all the columns needed to the group by, like this
select Accounts.name, Accounts.regno, Accounts.model, Accounts.slacc, count(servicing.dt) as total
from Accounts
left outer join servicing on Accounts.slacc = servicing.slacc
group by Accounts.slacc, Accounts.name, Accounts.regno, Accounts.model
This will fix the error, but it does extra grouping you don't need, and it would get very cumbersome if you wanted many more columns from Accounts, as you'd have to add them all to the group by. Another way to handle it is to use the minimum number of columns in the grouped query, then join the result of that to your main query to get the other columns. This would probably look something like this:
select Accounts.name, Accounts.regno, Accounts.model, Accounts.slacc, Totals.Total
from Accounts
left outer join
( Select slacc, count(dt) as total
from servicing
group by slacc
) Totals on Totals.slacc = Accounts.slacc
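To compare the two approaches side by side, here is a small sketch with Python's sqlite3 module and made-up rows. One difference worth noticing: with the pre-aggregated subquery, an account that has no servicing rows gets NULL rather than 0, so the sketch wraps the total in COALESCE (an addition of mine, not part of the answer's query).

```python
import sqlite3

# Hypothetical sample data; account 2 has never been serviced.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (slacc INTEGER PRIMARY KEY, name TEXT, regno TEXT, model TEXT);
    CREATE TABLE servicing (slacc INTEGER, dt TEXT);
    INSERT INTO Accounts VALUES (1, 'Ann', 'R1', 'M1'), (2, 'Bob', 'R2', 'M2');
    INSERT INTO servicing VALUES (1, '2020-01-01'), (1, '2020-02-01');
""")

# Approach 1: every selected non-aggregate column is in GROUP BY.
group_all = """
    SELECT a.name, a.regno, a.model, a.slacc, COUNT(s.dt) AS total
    FROM Accounts AS a
    LEFT OUTER JOIN servicing AS s ON a.slacc = s.slacc
    GROUP BY a.slacc, a.name, a.regno, a.model
    ORDER BY a.slacc
"""

# Approach 2: aggregate servicing on its own, then join the totals back.
join_totals = """
    SELECT a.name, a.regno, a.model, a.slacc, COALESCE(t.total, 0) AS total
    FROM Accounts AS a
    LEFT OUTER JOIN (
        SELECT slacc, COUNT(dt) AS total
        FROM servicing
        GROUP BY slacc
    ) AS t ON t.slacc = a.slacc
    ORDER BY a.slacc
"""
rows_a = conn.execute(group_all).fetchall()
rows_b = conn.execute(join_totals).fetchall()
print(rows_a)
print(rows_b)
```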
I'm doing what I would have expected to be a fairly straightforward query on a modified version of the imdb database:
select primary_name, release_year, max(rating)
from titles natural join primary_names natural join title_ratings
group by year
having title_category = 'film' and year > 1989;
However, I'm immediately running into
"column must appear in the GROUP BY clause or be used in an aggregate function."
I've tried researching this but have gotten confusing information; some examples I've found for this problem look structurally identical to mine, where others state that you must group every single selected parameter, which defeats the whole purpose of a group as I'm only wanting to select the maximum entry per year.
What am I doing wrong with this query?
Expected result: table with 3 columns which displays the highest-rated movie of each year.
If you want the maximum entry per year, then you should do something like this:
select r.*
from ratings r
where r.rating = (select max(r2.rating) from ratings r2 where r2.year = r.year) and
r.year > 1989;
In other words, group by is the wrong approach to writing this query.
I would also strongly encourage you to forget that natural join exists at all. It is an abomination. It uses the names of common columns for joins. It does not even use properly declared foreign key relationships. In addition, you cannot see what columns are used for the join.
While I am at it, another piece of advice: qualify all column names in queries that have more than one table reference. That is, include the table alias in the column name.
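A minimal sketch of the correlated-subquery approach, run through Python's sqlite3 module. The single ratings table with title, year and rating columns is a simplification I am assuming for illustration; the real schema spreads these over several tables.

```python
import sqlite3

# Hypothetical, simplified schema and data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ratings (title TEXT, year INTEGER, rating REAL);
    INSERT INTO ratings VALUES
        ('A', 1990, 7.5), ('B', 1990, 8.1),
        ('C', 1991, 6.0), ('E', 1991, 6.0),
        ('D', 1985, 9.0);
""")

query = """
    SELECT r.title, r.year, r.rating
    FROM ratings AS r
    -- keep a row only if its rating equals the best rating of its own year
    WHERE r.rating = (SELECT MAX(r2.rating)
                      FROM ratings AS r2
                      WHERE r2.year = r.year)
      AND r.year > 1989
    ORDER BY r.year, r.title
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because the filter compares against the yearly maximum, titles that tie for the top rating in a year (C and E here) are all returned.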
If you want to display all the rows and show each year's maximum rating alongside them, you can use a window function like this:
select primary_name, year, max(rating) Over (Partition by year) as rating
from titles natural join primary_names natural join ratings
where title_type = 'film' and year > 1989;
I'm trying to write a query which does the below:
For every guest who has the word “Edinburgh” in their address show the total number of nights booked. Be sure to include 0 for those guests who have never had a booking. Show last name, first name, address and number of nights. Order by last name then first name.
I am having problems with making the join work properly,
ER Diagram Snippet:
Here is my current (broken) solution:
SELECT last_name, first_name, address, nights
FROM booking
RIGHT JOIN guest ON (booking.booking_id = guest.id)
WHERE address LIKE '%Edinburgh%';
Here is the results from that query:
The query is partially complete, hoping someone can help me out and create a working version. I'm currently in the process of learning SQL so apologies if its a rather basic or dumb question!
Your query seems almost correct. You were joining the booking id with the guest id, which gave you some results because of overlapping (matching) ids, but this most likely doesn't correspond to the foreign keys. You should join on guest_id from booking to id from guest.
I'd add grouping to sum all booked nights for a particular guest (assuming that nights is an integer):
SELECT g.last_name, g.first_name, g.address, SUM(b.nights) AS nights
FROM guest AS g
LEFT JOIN booking AS b ON b.guest_id = g.id
WHERE g.address LIKE '%Edinburgh%'
GROUP BY g.last_name, g.first_name, g.address;
Are you sure that nights spent should be calculated using nights field? Why can it be null? If you'd like to show zero for null values just wrap it up with a coalesce function like that:
COALESCE(SUM(b.nights), 0)
Notes:
Rewrote RIGHT JOIN as LEFT JOIN; that doesn't affect the results, it's just cleaner to me
Using aliases, e.g. AS g, makes the code shorter when specifying joining columns
Reference every column with their table alias to avoid ambiguity
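Putting the pieces together (the corrected join key, the grouping, COALESCE, and the ordering the exercise asks for), a sketch with Python's sqlite3 module might look like the following. The column names come from the question and answer; the rows are invented, and grouping by g.id as well is my own addition so that two guests with identical names and addresses are not merged.

```python
import sqlite3

# Hypothetical sample data; Cal Dunn lives in Edinburgh but has no bookings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guest (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, address TEXT);
    CREATE TABLE booking (booking_id INTEGER PRIMARY KEY, guest_id INTEGER, nights INTEGER);
    INSERT INTO guest VALUES
        (1, 'Amy', 'Bell', '1 High St, Edinburgh'),
        (2, 'Cal', 'Dunn', 'Edinburgh Road 5'),
        (3, 'Eve', 'Ford', 'London');
    INSERT INTO booking (guest_id, nights) VALUES (1, 2), (1, 3), (3, 4);
""")

query = """
    SELECT g.last_name, g.first_name, g.address,
           COALESCE(SUM(b.nights), 0) AS nights   -- 0 instead of NULL
    FROM guest AS g
    LEFT JOIN booking AS b ON b.guest_id = g.id   -- join on the foreign key
    WHERE g.address LIKE '%Edinburgh%'
    GROUP BY g.id, g.last_name, g.first_name, g.address
    ORDER BY g.last_name, g.first_name
"""
rows = conn.execute(query).fetchall()
print(rows)
```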
SELECT g.first_name,
g.last_name,
g.address,
COALESCE(Sum(b.nights), 0)
FROM booking b
RIGHT JOIN guest g
ON ( b.guest_id = g.id )
WHERE g.address LIKE '%Edinburgh%'
GROUP BY g.last_name,
g.first_name,
g.address;
This post answers your questions about how to make the query.
MySQL SUM with same ID
You can simply use COALESCE as referenced here to avoid the NULL Values
How do I get SUM function in MySQL to return '0' if no values are found?
I have 2 simple tables - Firm and Groups. I also have a table FirmGroupsLink for making connections between them (connection is one to many).
Table Firm has attributes - FirmID, FirmName, City
Table Groups has attributes - GroupID, GroupName
Table FirmGroupsLink has attributes - FrmID, GrpID
Now I want to make a query which will return all those firms that have fewer groups than #num, so I write
SELECT FirmID, FirmName, City
FROM (Firm INNER JOIN FirmGroupsLink ON Firm.FirmID =
FirmGroupsLink.FrmID)
HAVING COUNT(FrmID)<#num
But it doesn't run. I am trying this in Microsoft Access, but it should eventually work for Sybase. Please show me what I'm doing wrong.
Thank you in advance.
In order to count properly, you need to specify which group you are counting by.
The having clause, and moreover the count can't work if you are not grouping.
Here you are counting by Firm. In fact, because you need to retrieve information about the Firm, you are grouping by FirmId, FirmName and City, so the query should look like this:
SELECT Firm.FirmID, Firm.FirmName, Firm.City
FROM Firm
LEFT OUTER JOIN FirmGroupsLink
ON Firm.FirmID = FirmGroupsLink.FrmID
GROUP BY Firm.FirmID, Firm.FirmName, Firm.City
HAVING COUNT(FrmID) < #num
Note that I replaced the INNER JOIN with a LEFT OUTER JOIN, because you might also want firms that don't belong to any group.
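The effect of the join type on the HAVING filter can be seen in a small sketch using Python's sqlite3 module (not Access or Sybase). The helper name `firms_with_fewer_groups` and the sample rows are hypothetical; the threshold is passed as a bound parameter in place of #num.

```python
import sqlite3

# Hypothetical sample data: Acme has 3 groups, Bolt has 1, Core has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Firm (FirmID INTEGER PRIMARY KEY, FirmName TEXT, City TEXT);
    CREATE TABLE FirmGroupsLink (FrmID INTEGER, GrpID INTEGER);
    INSERT INTO Firm VALUES (1, 'Acme', 'Oslo'), (2, 'Bolt', 'Rome'), (3, 'Core', 'Lima');
    INSERT INTO FirmGroupsLink VALUES (1, 10), (1, 11), (1, 12), (2, 10);
""")

def firms_with_fewer_groups(conn, num, include_ungrouped=True):
    """Return firms linked to fewer than `num` groups."""
    # LEFT OUTER JOIN keeps firms with no link rows; COUNT(FrmID) is 0 for them.
    join = "LEFT OUTER JOIN" if include_ungrouped else "INNER JOIN"
    query = f"""
        SELECT Firm.FirmID, Firm.FirmName, Firm.City
        FROM Firm
        {join} FirmGroupsLink ON Firm.FirmID = FirmGroupsLink.FrmID
        GROUP BY Firm.FirmID, Firm.FirmName, Firm.City
        HAVING COUNT(FirmGroupsLink.FrmID) < ?
        ORDER BY Firm.FirmID
    """
    return conn.execute(query, (num,)).fetchall()

with_left = firms_with_fewer_groups(conn, 2)
with_inner = firms_with_fewer_groups(conn, 2, include_ungrouped=False)
print(with_left)
print(with_inner)
```

With the inner join, Core never reaches the HAVING clause because it has no matching link rows, which is why the outer join is needed to report firms with zero groups.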