Error 1111 (HY000) - Invalid use of group function - mysql

I'm getting a problem when trying to run this query:
Select
c.cname as custName,
count(distinct o.orderID) as No_of_orders,
avg(count(distinct o.orderID)) as avg_order_amt
From Customer c
Inner Join Order_ o
On o.customerID = c.customerID
Group by cname;
This is the error message: #1111 (HY000) - Invalid use of group function
I just want to select each customer, find how many orders each customer has, and average the number of orders across customers. I think the problem might be that the query nests one aggregate inside another.

The issue is that you need to have two separate groupings if you want to calculate the average over a count, so this expression isn't valid:
avg(count(distinct o.orderID))
Now it's hard to understand what exactly you mean, but it sounds as if you just want to use avg(o.amount) instead.
[edit] I see your addition now, so while the error is still the same, the solution will be slightly more complex. The last value you need, the average number of orders per customer, is not a value to calculate per customer. You'd need analytical functions to do that, but that might be quite tricky in MySQL. I'd recommend writing a separate query for that; otherwise you would have a very complex query which would return the same number for each row anyway.
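A minimal sketch of the "separate query" suggestion, run through Python's built-in sqlite3 module so it can be executed as-is. The table and column names follow the question; the sample rows are made up, and SQLite stands in for MySQL here (the SQL used is plain enough that it should behave the same way in both).

```python
import sqlite3

# In-memory database. Table/column names come from the question;
# the rows are invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customerID INTEGER PRIMARY KEY, cname TEXT);
    CREATE TABLE Order_ (orderID INTEGER PRIMARY KEY, customerID INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Order_ VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

# Query 1: one row per customer with that customer's order count.
per_customer = conn.execute("""
    SELECT c.cname, COUNT(DISTINCT o.orderID) AS No_of_orders
    FROM Customer c
    JOIN Order_ o ON o.customerID = c.customerID
    GROUP BY c.customerID, c.cname
    ORDER BY c.cname
""").fetchall()

# Query 2: a single value, the average of the per-customer counts.
# The inner query does the first grouping; the outer query aggregates
# its result, which is the "two separate groupings" the answer mentions.
avg_orders = conn.execute("""
    SELECT AVG(No_of_orders)
    FROM (
        SELECT COUNT(DISTINCT o.orderID) AS No_of_orders
        FROM Order_ o
        GROUP BY o.customerID
    ) AS counts
""").fetchone()[0]

print(per_customer)  # [('Ann', 3), ('Bob', 1)]
print(avg_orders)    # 2.0
```

Keeping the average in its own query avoids repeating the same number on every customer row.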

select c.cname, o.customerID, count(*), avg(order_total)
from Order_ o join Customer c using (customerID)
group by 1, 2
This will calculate the number of orders and average order total (substitute the real column name for order_total) for each customer.

To get both how many orders each customer has and the average number of orders across all customers:
SELECT
c1.cname AS custName,
c1.No_of_orders,
c2.avg_order_amt
FROM (
SELECT
c.customerID,
c.cname,
COUNT(DISTINCT o.orderID) AS No_of_orders
FROM
Customer c
JOIN Order_ o ON o.customerID = c.customerID
GROUP BY c.customerID, c.cname
) c1
CROSS JOIN (SELECT AVG(No_of_orders) AS avg_order_amt FROM (
SELECT
c.customerID,
COUNT(DISTINCT o.orderID) AS No_of_orders
FROM
Customer c
JOIN Order_ o ON o.customerID = c.customerID
GROUP BY c.customerID
) counts) c2

Related

Inner Join for 3 tables with SUM of two columns in SQL Query?

I have the following three tables:
I have the following query to join the above 3 tables:
SELECT
customer.customer_id,
customer.name,
SUM(sales.total),
sales.created_at,
SUM(sales_payments.amount)
FROM
sales INNER JOIN customer ON customer.customer_id = sales.customer_id
INNER JOIN sales_payments ON sales.customer_id = sales_payments.customer_id
WHERE sales.created_at ='2020-04-03'
GROUP By customer.name
The result of the above query is given below.
The sum of sales.total comes out as double the actual sum of the sales.total column, which has 2 rows. I need the actual SUM of that column, without the doubling. Thank you in advance for your help.
PROBLEM
The problem here is that there are consecutive inner joins and the number of rows fetched by the second inner join is not restricted. Because there is no condition on sales_payment_id in the join between the sales and sales_payments tables, one row in the sales table (for customer_id 2, in this case) is matched to 2 rows in the payments table. As a result, the same sales.total value is counted twice.
In other words, the mapping for customer_id 2 between the 3 tables is 1:1:2 rather than 1:1:1.
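The row multiplication can be reproduced with a small sketch using Python's built-in sqlite3 module. The table names follow the question, but the sales_id column and all sample values are assumptions made up for illustration, and SQLite stands in for MySQL.

```python
import sqlite3

# Schema loosely modelled on the question; sales_id and the sample
# values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (sales_id INTEGER PRIMARY KEY, customer_id INTEGER,
                        total REAL, created_at TEXT);
    CREATE TABLE sales_payments (sales_payments_id INTEGER PRIMARY KEY,
                                 customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (2, 'Example Customer');
    INSERT INTO sales VALUES (1, 2, 100, '2020-04-03');
    INSERT INTO sales_payments VALUES (1, 2, 60), (2, 2, 40);
""")

# Joining only on customer_id pairs the single sales row with BOTH
# payment rows, so the sale appears twice in the joined result.
joined_rows = conn.execute("""
    SELECT s.sales_id, s.total, sp.sales_payments_id, sp.amount
    FROM sales s
    JOIN sales_payments sp ON sp.customer_id = s.customer_id
    ORDER BY sp.sales_payments_id
""").fetchall()

# Summing over the joined rows therefore counts the sale twice.
naive_total = conn.execute("""
    SELECT SUM(s.total)
    FROM sales s
    JOIN sales_payments sp ON sp.customer_id = s.customer_id
""").fetchone()[0]

# Summing the sales table on its own gives the real total.
actual_total = conn.execute("SELECT SUM(total) FROM sales").fetchone()[0]

print(len(joined_rows), naive_total, actual_total)  # 2 200.0 100.0
```

Inspecting the joined rows before aggregating, as done with joined_rows here, is usually the quickest way to spot this kind of duplication.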
SOLUTION
Solution 1 : As mentioned by Gordon, you could first aggregate the amount values of the sales_payments table and then aggregate the values in sales table.
Solution 2 : Alternatively (IMHO a better approach), you could add a foreign key between sales and sales_payment tables. For example, the sales_payment_id column of sales_payment table can be introduced in the sales table as well. This would facilitate the join between these tables and reduce additional overheads while querying data.
The query would then look like:
SELECT c.customer_id,
c.name,
SUM(s.total),
s.created_at,
SUM(sp.amount)
FROM customer c
INNER JOIN sales s
ON c.customer_id = s.customer_id
INNER JOIN sales_payments sp
ON c.customer_id = sp.customer_id
AND s.sales_payments_id = sp.sales_payments_id
WHERE s.created_at ='2020-04-03'
GROUP BY c.customer_id,
c.name,
s.created_at;
Hope that helps!
You have multiple rows for sales_payments and sales per customer. You need to pre-aggregate to get the right value:
SELECT c.customer_id, c.name, s.created_at, s.total, sp.amount
FROM customer c JOIN
(SELECT s.customer_id, s.created_at, SUM(s.total) as total
FROM sales s
WHERE s.created_at ='2020-04-03'
GROUP BY s.customer_id, s.created_at
) s
ON c.customer_id = s.customer_id JOIN
(SELECT sp.customer_id, SUM(sp.amount) as amount
FROM sales_payments sp
GROUP BY sp.customer_id
) sp
ON s.customer_id = sp.customer_id

Double data amount when joining multiple tables

Link to all my database tables in excel format:
Database
This is the assignment I was tasked with:
Find total deposit, total withdraw, and total difference for each customer.
What I did was
select c.customerid, customername, sum(d.depositamount) as 'Total Deposit', sum(w.withdrawalamount) as 'Total Withdraw', sum(d.depositamount) - sum(w.withdrawalamount) as 'differences'
from customers c left outer join deposits d on c.customerid = d.customerid
left outer join withdrawals w on c.CustomerID = w.CustomerID
group by c.CustomerID
order by c.CustomerID;
The result
My problem is that the "Total Deposit" and "Total Withdraw" values are doubled. Since these two columns are doubled, the difference is doubled as well. I know I could divide every column by 2 to work around it, but I would like to know the proper way to do this.
My question is: how do I join multiple tables in such a way that the data is not doubled?
(For example, "James Carlton Brokeridge" is supposed to have 450, 380, and 70 respectively.)
You are getting incorrect values because every time a customer makes more than one deposit or withdrawal, that causes the rows in the other (withdrawal/deposit) table to be replicated when the two tables are joined. To work around this, do the sums in subqueries:
select c.customerid,
c.customername,
d.totaldeposit as 'Total Deposit',
w.totalwithdrawal as 'Total Withdraw',
d.totaldeposit - w.totalwithdrawal as 'differences'
from customers c
left outer join (
select customerid, sum(depositamount) as totaldeposit
from deposits
group by customerid
) d on c.customerid = d.customerid
left outer join (
select customerid, sum(withdrawalamount) as totalwithdrawal
from withdrawals
group by customerid
) w on c.customerid = w.customerid
order by c.customerid
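A runnable sketch of the subquery approach, using Python's built-in sqlite3 module in place of MySQL. The totals for "James Carlton Brokeridge" (450, 380, 70) come from the question; the individual deposit and withdrawal amounts, and the second customer, are made up. The COALESCE calls are an addition not present in the answer: with a left join, a customer with no deposits or no withdrawals would otherwise get NULL totals and a NULL difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid INTEGER PRIMARY KEY, customername TEXT);
    CREATE TABLE deposits (customerid INTEGER, depositamount INTEGER);
    CREATE TABLE withdrawals (customerid INTEGER, withdrawalamount INTEGER);
    -- Individual amounts are invented; only the totals match the question.
    INSERT INTO customers VALUES (1, 'James Carlton Brokeridge'),
                                 (2, 'No Activity');
    INSERT INTO deposits VALUES (1, 200), (1, 250);
    INSERT INTO withdrawals VALUES (1, 180), (1, 200);
""")

# Naive double join: each of the 2 deposits pairs with each of the
# 2 withdrawals, so both sums are counted twice.
naive = conn.execute("""
    SELECT c.customerid, SUM(d.depositamount), SUM(w.withdrawalamount)
    FROM customers c
    LEFT JOIN deposits d ON d.customerid = c.customerid
    LEFT JOIN withdrawals w ON w.customerid = c.customerid
    WHERE c.customerid = 1
    GROUP BY c.customerid
""").fetchone()

# Pre-aggregated: each subquery yields at most one row per customer,
# so the joins can no longer multiply rows.
fixed = conn.execute("""
    SELECT c.customerid,
           COALESCE(d.totaldeposit, 0),
           COALESCE(w.totalwithdrawal, 0),
           COALESCE(d.totaldeposit, 0) - COALESCE(w.totalwithdrawal, 0)
    FROM customers c
    LEFT JOIN (SELECT customerid, SUM(depositamount) AS totaldeposit
               FROM deposits GROUP BY customerid) d
           ON d.customerid = c.customerid
    LEFT JOIN (SELECT customerid, SUM(withdrawalamount) AS totalwithdrawal
               FROM withdrawals GROUP BY customerid) w
           ON w.customerid = c.customerid
    ORDER BY c.customerid
""").fetchall()

print(naive)  # (1, 900, 760)
print(fixed)  # [(1, 450, 380, 70), (2, 0, 0, 0)]
```

The doubling in the naive result only happens to be a factor of 2 because this customer has exactly two rows in each table; in general each sum is multiplied by the row count of the other table.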

How to make my WHERE clause not run a syntax error in SQL?

The question asks,
"Write a query to display the customer name and the number of payments they have made where the amount on the check is greater than their average payment amount. Order the results by the descending number of payments."
So far I have,
SELECT customerName,
(SELECT COUNT(checkNumber) FROM Payments WHERE
Customers.customerNumber = Payments.customerNumber) AS
NumberOfPayments
FROM Customers
WHERE amount > SELECT AVG(amount)
ORDER BY NumberOfPayments DESC;
But I am getting a syntax error every time I run it. What am I doing incorrectly in this situation?
The syntax error comes from the second subquery being incomplete: amount > SELECT AVG(amount) is not valid SQL.
You could use amount > (SELECT AVG(amount) FROM Payments).
That is: complete the subquery with a FROM clause and wrap it in parentheses.
However, this won't do what you want (and it is inefficient), because it compares against the average over all payments rather than each customer's own average.
Now since this is not a forum to do your homework for you, I will leave it at this and thus only help you with the actual question: why do you get the syntax error. Keep on looking, you will find it. No better way to learn than to search and find yourself.
I would phrase this as an inner join between the two tables, with a correlated subquery to find the average payment amount per customer:
SELECT
c.customerName,
COUNT(CASE WHEN p.amount > (SELECT AVG(p2.amount) FROM Payments p2
WHERE p2.customerNumber = c.customerNumber)
THEN 1 END) AS NumberOfPayments
FROM Customers c
INNER JOIN Payments p
ON c.customerNumber = p.customerNumber
GROUP BY
c.customerNumber
ORDER BY
NumberOfPayments DESC;
Your current query is on the right track, but you need to do something called conditional aggregation to obtain the count. In this case, we aggregate by customer then assert that a given payment amount is greater than his average before we include it in the count.
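Conditional aggregation can be sketched as follows with Python's built-in sqlite3 module (standing in for MySQL). Table and column names follow the question; the sample rows are made up. This variant joins to a per-customer average computed in a derived table instead of using a correlated subquery, which is a substitution made here for simplicity, not the only way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customerNumber INTEGER PRIMARY KEY,
                            customerName TEXT);
    CREATE TABLE Payments (customerNumber INTEGER, checkNumber TEXT,
                           amount REAL);
    -- Invented sample data.
    INSERT INTO Customers VALUES (1, 'Atelier'), (2, 'Beta');
    INSERT INTO Payments VALUES
        (1, 'A1', 100), (1, 'A2', 500), (1, 'A3', 600),  -- average 400
        (2, 'B1', 10),  (2, 'B2', 10),  (2, 'B3', 40);   -- average 20
""")

# COUNT ignores NULLs, and a CASE without ELSE yields NULL when the
# condition is false, so only payments above the customer's own
# average are counted.
rows = conn.execute("""
    SELECT c.customerName,
           COUNT(CASE WHEN p.amount > a.avg_amount THEN 1 END)
               AS NumberOfPayments
    FROM Customers c
    JOIN Payments p ON p.customerNumber = c.customerNumber
    JOIN (SELECT customerNumber, AVG(amount) AS avg_amount
          FROM Payments
          GROUP BY customerNumber) a
      ON a.customerNumber = c.customerNumber
    GROUP BY c.customerNumber, c.customerName
    ORDER BY NumberOfPayments DESC
""").fetchall()

print(rows)  # [('Atelier', 2), ('Beta', 1)]
```

Because the inner joins drop customers without payments, this sketch only lists customers who have at least one payment.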
I would approach this just using JOINs:
SELECT c.customerName,
SUM( p.amount > p2.avg_amount ) as Num_Payments_Larger_Than_Average
FROM Customers c LEFT JOIN
Payments p
ON c.customerNumber = p.customerNumber LEFT JOIN
(SELECT p2.customerNumber, AVG(amount) as avg_amount
FROM payments p2
GROUP BY p2.customerNumber
) p2
ON p2.customerNumber = p.customerNumber
GROUP BY c.customerNumber, c.customerName
ORDER BY Num_Payments_Larger_Than_Average DESC;
Some notes about this answer. First, it uses LEFT JOIN and conditional aggregation. This allows the query to return customers with zero payments larger than their average -- that is, customers with no payments or all of whose payments are the same.
Second, it includes customerNumber in the GROUP BY. I think this is important, because it may be possible for two customers to have the same name.

SQL retrieving filtered value in subquery

In this query, cust_id is a foreign key and ords returns the number of orders for every customer:
SELECT cust_name, (
SELECT COUNT(*)
FROM Orders
WHERE Orders.cust_id = Customers.cust_id
) AS ords
FROM Customers
The output is correct, but I want to filter it to retrieve only the customers with fewer than a given number of orders. I don't know how to filter on the subquery column ords. I tried adding WHERE ords < 2 at the end of the query, but it doesn't work, and I tried adding AND COUNT(*) < 2 after the cust_id comparison, but that doesn't work either. I am using MySQL.
Use the HAVING clause (and use a join instead of a subquery):
SELECT Customers.cust_id, Customers.cust_name, COUNT(*) ords
FROM Orders, Customers
WHERE Orders.cust_id = Customers.cust_id
GROUP BY 1,2
HAVING COUNT(*)<2
If you want to include people with zero orders you change the join to an outer join.
There is no need for a correlated subquery here: it is evaluated once for each row, which tends to perform poorly. A better approach is a regular query with joins, GROUP BY, and a HAVING clause to apply your condition to the groups.
Since your condition is to return only customers that have fewer than 2 orders, a left join instead of an inner join is appropriate. It returns customers that have no orders as well (with a count of 0), provided the count is taken over a column from orders rather than with count(*).
select
c.cust_name, count(o.cust_id)
from
customers c
left join orders o on c.cust_id = o.cust_id
group by c.cust_id, c.cust_name
having count(o.cust_id) < 2
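One detail worth checking when combining LEFT JOIN with HAVING is what exactly gets counted. The sketch below (Python's built-in sqlite3 module standing in for MySQL; order_id and the sample rows are assumptions) contrasts COUNT(*) with COUNT over a column from the Orders side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
    -- Invented data: Ann has 0 orders, Bob has 1, Cid has 3.
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO Orders VALUES (10, 2), (11, 3), (12, 3), (13, 3);
""")

# COUNT(o.cust_id) skips the NULL produced by the left join for a
# customer without orders, so Ann is reported with 0.
by_column = conn.execute("""
    SELECT c.cust_name, COUNT(o.cust_id) AS ords
    FROM Customers c
    LEFT JOIN Orders o ON o.cust_id = c.cust_id
    GROUP BY c.cust_id, c.cust_name
    HAVING COUNT(o.cust_id) < 2
    ORDER BY c.cust_name
""").fetchall()

# COUNT(*) counts rows, including the single NULL-extended row, so a
# customer without orders is reported with 1 instead of 0.
by_star = conn.execute("""
    SELECT c.cust_name, COUNT(*) AS ords
    FROM Customers c
    LEFT JOIN Orders o ON o.cust_id = c.cust_id
    GROUP BY c.cust_id, c.cust_name
    HAVING COUNT(*) < 2
    ORDER BY c.cust_name
""").fetchall()

print(by_column)  # [('Ann', 0), ('Bob', 1)]
print(by_star)    # [('Ann', 1), ('Bob', 1)]
```

With a threshold of 2 both versions return the same customers, but the reported counts differ, and with a threshold of 1 the COUNT(*) version would miss Ann entirely.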

SQL Query, What am I doing wrong?

Hi there fellas I'm trying to do this query but I am having trouble with it. Could anyone give me a hand I'll list below what I've done.
How busy each optometrist has been. Your SQL statement should return the full names of all optometrists, and the total number of appointments they have conducted. You must use the word ‘Optometrist’ not the positionID to select optometrists in your statement. Note that even optometrists with zero appointments should be displayed in the results.
What I've done..
SELECT firstName, lastName, optometristID, COUNT(optometristID)
FROM employee
LEFT JOIN appointment ON employee.employeeID=appointment.optometristID
GROUP BY (optometristID)
The full name, email and primary phone number and total number of invoices for all customers in ascending order of last name. Note that even customers with zero invoices should be displayed in the results.
What I've written:
SELECT c.firstName, c.lastName, c.primaryPhone,
(SELECT count(*) from invoice where customerID = c.customerID) as numInvoices
FROM customer c, invoice i
WHERE c.customerID = c.customerID
ORDER BY lastname ASC
Thank you!
The first one looks OK to me.
The second should be either
SELECT c.firstName, c.lastName, c.primaryPhone,
(SELECT count(*) from invoice where customerID = c.customerID) as numInvoices
FROM customer c
ORDER BY lastname ASC
OR
SELECT c.firstName, c.lastName, c.primaryPhone,
count(i.customerID ) as numInvoices
FROM customer c left join invoice i
on i.customerID = c.customerID
group by c.customerID
ORDER BY lastname ASC
The last one should be faster
In the second query you have to write
WHERE c.customerID = i.customerID
Instead of
WHERE c.customerID = c.customerID
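To see how the variants of the invoice query differ, here is a small sketch using Python's built-in sqlite3 module (standing in for MySQL). Column names follow the question where available; invoiceID and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customerID INTEGER PRIMARY KEY, firstName TEXT,
                           lastName TEXT, primaryPhone TEXT);
    CREATE TABLE invoice (invoiceID INTEGER PRIMARY KEY, customerID INTEGER);
    -- Invented data: Ann Zed has 2 invoices, Bob Young has none.
    INSERT INTO customer VALUES (1, 'Ann', 'Zed', '111'),
                                (2, 'Bob', 'Young', '222');
    INSERT INTO invoice VALUES (1, 1), (2, 1);
""")

# Original attempt: c.customerID = c.customerID is always true, so this
# is a cross join and every customer repeats once per invoice row.
cross_rows = conn.execute("""
    SELECT c.lastName,
           (SELECT COUNT(*) FROM invoice
            WHERE customerID = c.customerID) AS numInvoices
    FROM customer c, invoice i
    WHERE c.customerID = c.customerID
    ORDER BY c.lastName
""").fetchall()

# Comparing against i.customerID turns it into an inner join: customers
# without invoices disappear, the rest still repeat per invoice.
inner_rows = conn.execute("""
    SELECT c.lastName
    FROM customer c, invoice i
    WHERE c.customerID = i.customerID
""").fetchall()

# LEFT JOIN plus GROUP BY: one row per customer, zero counts kept.
grouped = conn.execute("""
    SELECT c.firstName, c.lastName, COUNT(i.customerID) AS numInvoices
    FROM customer c
    LEFT JOIN invoice i ON i.customerID = c.customerID
    GROUP BY c.customerID, c.firstName, c.lastName
    ORDER BY c.lastName ASC
""").fetchall()

print(len(cross_rows))  # 4
print(inner_rows)       # [('Zed',), ('Zed',)]
print(grouped)          # [('Bob', 'Young', 0), ('Ann', 'Zed', 2)]
```

On this sample data only the grouped left join meets the requirement that customers with zero invoices are still displayed, once each.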