in this cust_id is a foreign key and ords returns the number of orders for every customers
SELECT cust_name, (
SELECT COUNT(*)
FROM Orders
WHERE Orders.cust_id = Customers.cust_id
) AS ords
FROM Customers
The output is correct but i want to filter it to retrieve only the customers with less than a given amount of orders, i don't know how to filter the subquery ords, i tried WHERE ords < 2 at the end of the code but it doesn't work and i've tried adding AND COUNT(*)<2 after the cust_id comparison but it doesn't work. I am using MySQL
Use the HAVING clause (and use a join instead of a subquery).....
SELECT Customers.cust_id, Customers.cust_name, COUNT(*) ords
FROM Orders, Customers
WHERE Orders.cust_id = Customers.cust_id
GROUP BY 1,2
HAVING COUNT(*)<2
If you want to include people with zero orders you change the join to an outer join.
There is no need for a correlated subquery here, because it calculates the value for each row which doesn't give a "good" performance. A better approach would be to use a regular query with joins, group by and having clause to apply your condition to groups.
Since your condition is to return only customers that have less than 2 orders, left join instead of inner join would be appropriate. It would return customers that have no orders as well (with 0 count).
select
cust_name, count(*)
from
customers c
left join orders o on c.cust_id = o.cust_id
group by cust_name
having count(*) < 2
Related
I am trying to get the data for the best 5 customers in a railway reservation system. To get that, I tried getting the max value by summing up their fare every time they make a reservation. Here is the code.
SELECT c. firstName, c.lastName,MAX(r.totalFare) as Fare
FROM customer c, Reservation r, books b
WHERE r.resID = b.resID
AND c.username = b.username
AND r.totalfare < (SELECT sum(r1.totalfare) Revenue
from Reservation r1, for_res f1, customer c1,books b1
where r1.resID = f1.resID
and c1.username = b1.username
and r1.resID = b1.resID
group by c1.username
)
GROUP BY c.firstName, c.lastName, r.totalfare
ORDER BY r.totalfare desc
LIMIT 5;
this throws the error:[21000][1242] Subquery returns more than 1 row
If I remove the group by from the subquery the result is:(its a tabular form)
Jade,Smith,1450
Jade,Smith,725
Jade,Smith,25.5
Monica,Geller,20.1
Rach,Jones,10.53
But that's not what I want, as you can see, I want to add the name 'Jade' with the total fare.
I just don't see the point for the subquery. It seems like you can get the result you want with a sum()
select c.firstname, c.lastname, sum(totalfare) as totalfare
from customer c
inner join books b on b.username = c.username
inner join reservation r on r.resid = b.resid
group by c.username
order by totalfare desc
limit 5
This sums all reservations of each client, and use that information to sort the resulstet. This guarantees one row per customer.
The query assumes that username is the primary key of table customer. If that's not the case, you need to add columns firstname and lastname to the group by clause.
Note that this uses standard joins (with the inner join ... on keywords) rather than old-school, implicit joins (with commas in the from clause: these are legacy syntax, that should not be used in new code.
I have a table 'rents' that has the field ID and a Foreign Key events_id and the field total_value, and i have another table 'payments' that has the field ID, a Foreign Key rents_id and the field payment.
Table events:
ID
Table rents:
ID,
total_value,
events_id
Table payments:
ID,
payment,
rents_id
There are many payments for the same ID of rents table, and there are many rents for the same ID of events table.
I need to create a query that shows the sum() for the field 'total_value' that has the same events ID in the rents table, and a sum() for the field 'payment' that has the same rents ID in the payments table.
I've managed to create this query that accomplish this task:
SELECT SUM(r.total_value) 'Total Value', (SELECT SUM(p.payment) FROM payments p INNER JOIN rents r ON p.rents_id = r.id WHERE r.events_id=8) 'Total Payment'
FROM rents r
WHERE r.events_id=8
;
I wonder if that's the only way to do that, or is there a better way to do it?
I would recommend a join - the idea is to compute intermediate payment sums in a subquery first, then join, and aggregate again in the outer query.
select sum(r.total_value) total_value, sum(p.payment) total_payment
from rents r
left join (
select rents_id, sum(payment) payment from payments group by rents_id
) p on p.rents_id = r.id
where r.events_id = 8
The left join avoid filtering out rents that have no payment at all.
You can easily change the query to generate the result for all event_ids at once:
select r.events_id, sum(r.total_value) total_value, sum(p.payment) total_payment
from rents r
left join (
select rents_id, sum(payment) payment from payments group by rents_id
) p on p.rents_id = r.id
group by r.events_id
A correlated subquery is probably the best approach (which I'll explain later). However, you don't need a join in the subquery, just a correlation clause:
SELECT SUM(r.total_value) as Total_Value,
(SELECT SUM(p.payment)
FROM payments p
WHERE p.rents_id = r.id
--------------^ correlation clause that "links" the subquery to the outer query
) as Total_Payment
FROM rents r
WHERE r.events_id = 8;
Why is this the best approach? First, with an index on payments(rents_id) and rests(events_id), this should be the fastest method.
Second, note that the filtering condition is only included once in the query. That makes it easy to update the query and less susceptible to error.
Third, this does not pre-aggregate the payments table. Pre-aggregating either requires repeating the filtering condition (as in your version of the query) or aggregating more data than necessary (which affects performance).
In addition, I would advise you to not use single quotes for column aliases. Only use single quotes for string and date constants.
I have the following three tables:
I have Following Query to Join Above 3 Tables
customer.customer_id,
customer.name,
SUM(sales.total),
sales.created_at,
SUM(sales_payments.amount)
FROM
sales INNER JOIN customer ON customer.customer_id = sales.customer_id
INNER JOIN sales_payments ON sales.customer_id = sales_payments.customer_id
WHERE sales.created_at ='2020-04-03'
GROUP By customer.name
Result for Above Query is given below
Sum of sales.total is double of the actual sum of sales.total column which has 2-row count, I need to have the actual SUM of that column, without doubling the SUM of those rows, Thank you, for your help in advance..
PROBLEM
The problem here is that there are consecutive inner joins and the number of rows getting fetched in the second inner join is not restricted. So, as we have not added a condition on sales_payment_id in the join between the sales and sales_payment tables, one row in sales table(for customer_id 2, in this case) would be mapped to 2 rows in the payment table. This causes the same values to be reconsidered.
In other words, the mapping for customer_id 2 between the 3 tables is 1:1:2 rather than 1:1:1.
SOLUTION
Solution 1 : As mentioned by Gordon, you could first aggregate the amount values of the sales_payments table and then aggregate the values in sales table.
Solution 2 : Alternatively (IMHO a better approach), you could add a foreign key between sales and sales_payment tables. For example, the sales_payment_id column of sales_payment table can be introduced in the sales table as well. This would facilitate the join between these tables and reduce additional overheads while querying data.
The query would then look like:
`SELECT c.customer_id,
c.name,
SUM(s.total),
s.created_at,
SUM(sp.amount)
FROM customer c
INNER JOIN sales s
ON c.customer_id = s.customer_id
INNER JOIN sales_payments sp
ON c.customer_id = sp.customer_id
AND s.sales_payments_id = sp.sales_payments_id
WHERE s.created_at ='2020-04-03'
GROUP BY c.customer_id,
c.name,
s.created_at ;`
Hope that helps!
You have multiple rows for sales_payments and sales per customer. You need to pre-aggregate to get the right value:
SELECT c.customer_id, c.name, s.created_at, s.total, sp.amount
FROM customer c JOIN
(SELECT s.customer_id, s.created_at, SUM(s.total) as total
FROM sales s
WHERE s.created_at ='2020-04-03'
GROUP BY s.customer_id, s.created_at
) s
ON c.customer_id = s.customer_id JOIN
(SELECT sp.customer_id, SUM(sp.amount) as amount
FROM sales_payments sp
GROUP BY sp.customer_id
) sp
ON s.customer_id = sp.customer_id
Given two tables, orders (order_id, date, $, customer_id) and customers (ID, name)
Here's my method but I'm not sure if it's working & I'd like to know if there's faster/better way of solving these problems:
1) find out number of customers who made at least one order on date 7/9/2018
Select count (distinct customer_id)
From
(
Select customer_id from orders a
Left join customer b
On a.customer_id = b.ID
Group by customer_id,date
Having date = 7/9/2018
) a
2) find out number of customers who did not make an order on 7/9/2018
Select count (customer_id) from customer where customer_id not in
(
Select customer_id from orders a
Left join customer b
On a.customer_id = b.ID
Group by customer_id,date
Having date = 7/9/2018
)
3) find the date with most sales between 7/1 and 7/30
select date, max($)
from (
Select sum($),date from orders a
Left join customer b
On a.customer_id = b.ID
Group by date
Having date between 7/1 and 7/30
)
Thanks,
For problem 1, a valid solution might look like this:
SELECT COUNT(DISTINCT customer_id) x
FROM orders
WHERE date = '2018-09-07'; -- or is that '2018-07-09' ??
For problem 2, a valid solution might look like this:
SELECT COUNT(*) x
FROM customer c
LEFT
JOIN orders o
ON o.customer_id = x.customer_id
AND o.date = '2018-07-09'
WHERE o.crder_id IS NULL;
Assuming there are no ties, a valid solution to problem 3 might look like this:
SELECT date
, COUNT(*) sales
FROM orders
WHERE date BETWEEN '2018-07-01' AND '2018-07-30'
GROUP
BY date
ORDER
BY sales DESC
LIMIT 1;
The default format for a date in MySQL is YYYY-MM-DD, although this can be customized. You have to put quotes around it, otherwise it's treated as an arithmetic expression.
And none of your queries need to join with the customer table. The customer ID is already in the orders table, and you're not returning any info about the customers (like the name or address), you're just counting them.
1) You don't need the subquery or grouping.
SELECT COUNT(DISTINCT customer_id)
FROM orders
WHERE date = '2018-07-09'
2) Again, you don't need GROUP BY in the subquery. There's also a better pattern than NOT IN to get the count of non-matching rows.
SELECT COUNT(*)
FROM customer AS c
LEFT JOIN order AS o on c.id = o.customer_id AND o.date = '2018-07-09'
WHERE o.id IS NULL
See Return row only if value doesn't exist for various patterns to do this.
3) You can't use MAX($) in the outer query because the inner query doesn't return a column with that name. But even if you fix that, it still won't work, because the date column won't necessarily come from the same row that has the maximum. See SQL select only rows with max value on a column for more explanation of this.
You don't need a subquery at all. Use a query that returns the total sales for each day, then use ORDER BY to get the highest one.
SELECT date, SUM($) AS total_sales
FROM orders
WHERE date BETWEEN '2018-07-01' AND '2017-07-30'
GROUP BY date
ORDER BY total_sales DESC
LIMIT 1
If "most sales" is supposed to mean "most number of sales", replace SUM($) with COUNT(*).
I'm working on a simple ordering system in MySQL and I came across this snag that I'm hoping some SQL genius can help me out with.
I have a table for Orders, Payments (with a foreign key reference to the Order table), and OrderItems (also, with a foreign key reference to the Order table) and what I would like to do is get the total outstanding balance (Total and Paid) for the Order with a single query. My initial thought was to do something simple like this:
SELECT Order.*, SUM(OrderItem.Amount) AS Total, SUM(Payment.Amount) AS Paid
FROM Order
JOIN OrderItem ON OrderItem.OrderId = Order.OrderId
JOIN Payment ON Payment.OrderId = Order.OrderId
GROUP BY Order.OrderId
However, if there are multiple Payments or multiple OrderItems, it messes up Total or Paid, respectively (eg. One OrderItem record with an amount of 100 along with two Payment Records will produce a Total of 200).
In order to overcome this, I can use some subqueries in the following way:
SELECT Order.OrderId, OrderItemGrouped.Total, PaymentGrouped.Paid
FROM Order
JOIN (
SELECT OrderItem.OrderId, SUM(OrderItem.Amount) AS Total
FROM OrderItem
GROUP BY OrderItem.OrderId
) OrderItemGrouped ON OrderItemGrouped.OrderId = Order.OrderId
JOIN (
SELECT Payment.OrderId, SUM(Payment.Amount) AS Paid
FROM Payment
GROUP BY Payment.OrderId
) PaymentGrouped ON PaymentGrouped.OrderId = Order.OrderId
As you can imagine (and as an EXPLAIN on this query will show), this is not exactly an optimal query so, I'm wondering, is there any way to convert these two subqueries with GROUP BY statements into JOINs?
The following is likely to be faster with the right indexes:
select o.OrderId,
(select sum(oi.Amount)
from OrderItem oi
where oi.OrderId = o.OrderId
) as Total,
(select sum(p.Amount)
from Payment p
where oi.OrderId = o.OrderId
) as Paid
from Order o;
The right indexes are OrderItem(OrderId, Amount) and Payment(OrderId, Amount).
I don't like writing aggregation queries this way, but it can sometimes help performance in MySQL.
Some answers have already suggested using a correlated subquery, but have not really offered an explanation as to why. MySQL does not materialise correlated subqueries, but it will materialise a derived table. That is to say with a simplified version of your query as it is now:
SELECT Order.OrderId, OrderItemGrouped.Total
FROM Order
JOIN (
SELECT OrderItem.OrderId, SUM(OrderItem.Amount) AS Total
FROM OrderItem
GROUP BY OrderItem.OrderId
) OrderItemGrouped ON OrderItemGrouped.OrderId = Order.OrderId;
At the start of execution MySQL will put the results of your subquery into a temporary table, and hash this table on OrderId for faster lookups, whereas if you run:
SELECT Order.OrderId,
( SELECT SUM(OrderItem.Amount)
FROM OrderItem
WHERE OrderItem.OrderId = OrderId
) AS Total
FROM Order;
The subquery will be executed once for each row in Order. If you add something like WHERE Order.OrderId = 1, it is obviously not efficient to aggregate the entire OrderItem table, hash the result to only lookup one value, but if you are returning all orders then the inital cost of creating the hash table will make up for itself it not having to execute the subquery for every row in the Order table.
If you are selecting a lot of rows and feel the materialisation will be of benefit, you can simplifiy your JOIN query as follows:
SELECT Order.OrderId, SUM(OrderItem.Amount) AS Total, PaymentGrouped.Paid
FROM Order
INNER JOIN OrderItem
ON OrderItem.OrderID = Order.OrderID
INNER JOIN
( SELECT Payment.OrderId, SUM(Payment.Amount) AS Paid
FROM Payment
GROUP BY Payment.OrderId
) PaymentGrouped
ON PaymentGrouped.OrderId = Order.OrderId;
GROUP BY Order.OrderId, PaymentGrouped.Paid;
Then you only have one derived table.
What about something like this:
SELECT Order.OrderId, (
SELECT SUM(OrderItem.Amount)
FROM OrderItem as OrderItemGrouped
where
OrderItemGrouped.OrderId = Order.OrderId
), AS Total,
(
SELECT SUM(Payment.Amount)
FROM Payment as PaymentGrouped
where
PaymentGrouped.OrderId = Order.OrderId
) as Paid
FROM Order
PS: You win again #Gordon xD
Select o.orderid, i.total, s.paid
From orders o
Left join (select orderid, sum(amount)
From orderitem) i
On i.orderid = o.orderid
Ieft join (select orderid, sum(amount)
From payments) s
On s.orderid = o.orderid