I have a table 'rents' that has the field ID and a Foreign Key events_id and the field total_value, and i have another table 'payments' that has the field ID, a Foreign Key rents_id and the field payment.
Table events:
ID
Table rents:
ID,
total_value,
events_id
Table payments:
ID,
payment,
rents_id
There are many payments for the same ID of rents table, and there are many rents for the same ID of events table.
I need to create a query that shows the sum() for the field 'total_value' that has the same events ID in the rents table, and a sum() for the field 'payment' that has the same rents ID in the payments table.
I've managed to create this query that accomplish this task:
SELECT SUM(r.total_value) 'Total Value', (SELECT SUM(p.payment) FROM payments p INNER JOIN rents r ON p.rents_id = r.id WHERE r.events_id=8) 'Total Payment'
FROM rents r
WHERE r.events_id=8
;
I wonder if that's the only way to do that, or is there a better way to do it?
I would recommend a join - the idea is to compute intermediate payment sums in a subquery first, then join, and aggregate again in the outer query.
select sum(r.total_value) total_value, sum(p.payment) total_payment
from rents r
left join (
select rents_id, sum(payment) payment from payments group by rents_id
) p on p.rents_id = r.id
where r.events_id = 8
The left join avoid filtering out rents that have no payment at all.
You can easily change the query to generate the result for all event_ids at once:
select r.events_id, sum(r.total_value) total_value, sum(p.payment) total_payment
from rents r
left join (
select rents_id, sum(payment) payment from payments group by rents_id
) p on p.rents_id = r.id
group by r.events_id
A correlated subquery is probably the best approach (which I'll explain later). However, you don't need a join in the subquery, just a correlation clause:
SELECT SUM(r.total_value) as Total_Value,
(SELECT SUM(p.payment)
FROM payments p
WHERE p.rents_id = r.id
--------------^ correlation clause that "links" the subquery to the outer query
) as Total_Payment
FROM rents r
WHERE r.events_id = 8;
Why is this the best approach? First, with an index on payments(rents_id) and rests(events_id), this should be the fastest method.
Second, note that the filtering condition is only included once in the query. That makes it easy to update the query and less susceptible to error.
Third, this does not pre-aggregate the payments table. Pre-aggregating either requires repeating the filtering condition (as in your version of the query) or aggregating more data than necessary (which affects performance).
In addition, I would advise you to not use single quotes for column aliases. Only use single quotes for string and date constants.
Related
This is a slight variant of the question I asked here
SQL Query for getting maximum value from a column
I have a Person Table and an Activity Table with the following data
-- PERSON-----
------ACTIVITY------------
I have got this data in the database about users spending time on a particular activity.
I intend to get the data when every user has spent the maximum number of hours.
My Query is
Select p.Id as 'PersonId',
p.Name as 'Name',
act.HoursSpent as 'Hours Spent',
act.Date as 'Date'
From Person p
Left JOIN (Select MAX(HoursSpent), Date from Activity
Group By HoursSpent, Date) act
on act.personId = p.Id
but it is giving me all the rows for Person and not with the Maximum Numbers of Hours Spent.
This should be my result.
You have several issues with your query:
The subquery to get hours is aggregated by date, not person.
You don't have a way to bring in other columns from activity.
You can take this approach -- joins and group by, but it requires two joins:
select p.*, a.* -- the columns you want
from Person p left join
activity a
on a.personId = p.id left join
(select personid, max(HoursSpent) as max_hoursspent
from activity a
group by personid
) ma
on ma.personId = a.personId and
ma.max_hoursspent = a.hoursspent;
Note that this can return duplicates for a given person -- if there are ties for the maximum.
This is written more colloquially using row_number():
select p.*, a.* -- the columns you want
from Person p left join
(select a.*,
row_number() over (partition by a.personid order by a.hoursspent desc) as seqnum
from activity a
) a
on a.personId = p.id and a.seqnum = 1
ma.max_hoursspent = a.hoursspent;
in this cust_id is a foreign key and ords returns the number of orders for every customers
SELECT cust_name, (
SELECT COUNT(*)
FROM Orders
WHERE Orders.cust_id = Customers.cust_id
) AS ords
FROM Customers
The output is correct but i want to filter it to retrieve only the customers with less than a given amount of orders, i don't know how to filter the subquery ords, i tried WHERE ords < 2 at the end of the code but it doesn't work and i've tried adding AND COUNT(*)<2 after the cust_id comparison but it doesn't work. I am using MySQL
Use the HAVING clause (and use a join instead of a subquery).....
SELECT Customers.cust_id, Customers.cust_name, COUNT(*) ords
FROM Orders, Customers
WHERE Orders.cust_id = Customers.cust_id
GROUP BY 1,2
HAVING COUNT(*)<2
If you want to include people with zero orders you change the join to an outer join.
There is no need for a correlated subquery here, because it calculates the value for each row which doesn't give a "good" performance. A better approach would be to use a regular query with joins, group by and having clause to apply your condition to groups.
Since your condition is to return only customers that have less than 2 orders, left join instead of inner join would be appropriate. It would return customers that have no orders as well (with 0 count).
select
cust_name, count(*)
from
customers c
left join orders o on c.cust_id = o.cust_id
group by cust_name
having count(*) < 2
I have 4 tables that I want to be joined.
Customers
Traffic
Average
Live
I want to insert joined data of these tables to "Details" table.
The relationship between the tables is here:
each of Traffic, Average and Live tables have a "cid" that is the primary key of "Customers" table:
Traffic.cid = Customers.id
Average.cid = Customers.id
Live.cid = Customers.id
The query that I wrote is here:
INSERT INTO Details
(
cid, Customer_Name, Router_Name,
Traffic_Received,
Average_Received,
Live_Received,
date
)
(
SELECT Customers.id AS cid, Customers.name AS Customer_Name, Traffic.Router_Name,
Traffic.Received,
Average.Received,
Live.Received,
Traffic.date
FROM Customers
INNER JOIN Traffic ON Customers.id=Traffic.cid
INNER JOIN Average ON Customers.id=Average.cid
INNER JOIN Live ON Customers.id=Live.cid
WHERE Traffic.date='2015-06-08'
)
But the result will have duplicated rows. I changed the JOIN to both LEFT JOIN, and RIGHT JOIN. but the result does not changed.
What should I do to not have duplicated rows in Details table?
With the LEFT JOIN, you will be joining to the table (e.g. Traffic) even when there is not a record that corresponds to the Customers.id, in which case, you will get the null value for the columns from this table where there is no matching record.
With the RIGHT JOIN, you will get every record from the joined table, even when there is not a corresponding record in Customers.
However, the type of JOIN is not the problem here. If you are getting duplicate records in your results, then this means that is more than one matching record in the tables you are joining to. For example, there may be more than one record in Traffic with the same cid. Use SELECT DISTINCT to remove duplicates, or if you are interested in an aggregate of those duplicates, use an aggregate function, such as count() or sum() and a GROUP BY clause, e.g. GROUP BY Traffic.cid.
If you still have duplicates, then check to make sure that they really are duplicates - I'd suggest that one or more columns is actually different.
Can you please try this
INSERT INTO Details
(
cid, Customer_Name, Router_Name,
Traffic_Received,
Average_Received,
Live_Received,
date
)
(
SELECT Customers.id AS cid,
Customers.name AS Customer_Name,
Traffic.Router_Name,
Traffic.Received,
Average.Received,
Live.Received,
Traffic.date
FROM Customers
INNER JOIN Traffic ON Customers.id=Traffic.cid
INNER JOIN Average ON Customers.id=Average.cid
INNER JOIN Live ON Customers.id=Live.cid
WHERE Traffic.date='2015-06-08'
GROUP BY
cid,
Customer_Name,
Traffic.Router_Name,
Traffic.Received,
Average.Received,
Live.Received,
Traffic.date
)
SELECT Customers.id AS cid, Customers.name AS Customer_Name, Traffic.Router_Name,
Traffic.Received,
Average.Received,
Live.Received,
Traffic.date
FROM Customers
LEFT JOIN Traffic ON Customers.id=Traffic.cid
LEFT JOIN Average ON Traffic.cid=Average.cid
LEFT JOIN Live ON Average.cid=Live.cid
WHERE Traffic.date='2015-06-08'
I'm working on a simple ordering system in MySQL and I came across this snag that I'm hoping some SQL genius can help me out with.
I have a table for Orders, Payments (with a foreign key reference to the Order table), and OrderItems (also, with a foreign key reference to the Order table) and what I would like to do is get the total outstanding balance (Total and Paid) for the Order with a single query. My initial thought was to do something simple like this:
SELECT Order.*, SUM(OrderItem.Amount) AS Total, SUM(Payment.Amount) AS Paid
FROM Order
JOIN OrderItem ON OrderItem.OrderId = Order.OrderId
JOIN Payment ON Payment.OrderId = Order.OrderId
GROUP BY Order.OrderId
However, if there are multiple Payments or multiple OrderItems, it messes up Total or Paid, respectively (eg. One OrderItem record with an amount of 100 along with two Payment Records will produce a Total of 200).
In order to overcome this, I can use some subqueries in the following way:
SELECT Order.OrderId, OrderItemGrouped.Total, PaymentGrouped.Paid
FROM Order
JOIN (
SELECT OrderItem.OrderId, SUM(OrderItem.Amount) AS Total
FROM OrderItem
GROUP BY OrderItem.OrderId
) OrderItemGrouped ON OrderItemGrouped.OrderId = Order.OrderId
JOIN (
SELECT Payment.OrderId, SUM(Payment.Amount) AS Paid
FROM Payment
GROUP BY Payment.OrderId
) PaymentGrouped ON PaymentGrouped.OrderId = Order.OrderId
As you can imagine (and as an EXPLAIN on this query will show), this is not exactly an optimal query so, I'm wondering, is there any way to convert these two subqueries with GROUP BY statements into JOINs?
The following is likely to be faster with the right indexes:
select o.OrderId,
(select sum(oi.Amount)
from OrderItem oi
where oi.OrderId = o.OrderId
) as Total,
(select sum(p.Amount)
from Payment p
where oi.OrderId = o.OrderId
) as Paid
from Order o;
The right indexes are OrderItem(OrderId, Amount) and Payment(OrderId, Amount).
I don't like writing aggregation queries this way, but it can sometimes help performance in MySQL.
Some answers have already suggested using a correlated subquery, but have not really offered an explanation as to why. MySQL does not materialise correlated subqueries, but it will materialise a derived table. That is to say with a simplified version of your query as it is now:
SELECT Order.OrderId, OrderItemGrouped.Total
FROM Order
JOIN (
SELECT OrderItem.OrderId, SUM(OrderItem.Amount) AS Total
FROM OrderItem
GROUP BY OrderItem.OrderId
) OrderItemGrouped ON OrderItemGrouped.OrderId = Order.OrderId;
At the start of execution MySQL will put the results of your subquery into a temporary table, and hash this table on OrderId for faster lookups, whereas if you run:
SELECT Order.OrderId,
( SELECT SUM(OrderItem.Amount)
FROM OrderItem
WHERE OrderItem.OrderId = OrderId
) AS Total
FROM Order;
The subquery will be executed once for each row in Order. If you add something like WHERE Order.OrderId = 1, it is obviously not efficient to aggregate the entire OrderItem table, hash the result to only lookup one value, but if you are returning all orders then the inital cost of creating the hash table will make up for itself it not having to execute the subquery for every row in the Order table.
If you are selecting a lot of rows and feel the materialisation will be of benefit, you can simplifiy your JOIN query as follows:
SELECT Order.OrderId, SUM(OrderItem.Amount) AS Total, PaymentGrouped.Paid
FROM Order
INNER JOIN OrderItem
ON OrderItem.OrderID = Order.OrderID
INNER JOIN
( SELECT Payment.OrderId, SUM(Payment.Amount) AS Paid
FROM Payment
GROUP BY Payment.OrderId
) PaymentGrouped
ON PaymentGrouped.OrderId = Order.OrderId;
GROUP BY Order.OrderId, PaymentGrouped.Paid;
Then you only have one derived table.
What about something like this:
SELECT Order.OrderId, (
SELECT SUM(OrderItem.Amount)
FROM OrderItem as OrderItemGrouped
where
OrderItemGrouped.OrderId = Order.OrderId
), AS Total,
(
SELECT SUM(Payment.Amount)
FROM Payment as PaymentGrouped
where
PaymentGrouped.OrderId = Order.OrderId
) as Paid
FROM Order
PS: You win again #Gordon xD
Select o.orderid, i.total, s.paid
From orders o
Left join (select orderid, sum(amount)
From orderitem) i
On i.orderid = o.orderid
Ieft join (select orderid, sum(amount)
From payments) s
On s.orderid = o.orderid
I have a query to show customers and the total dollar value of all their orders. The query takes about 100 seconds to execute.
I'm querying on an ExpressionEngine CMS database. ExpressionEngine uses one table exp_channel_data, for all content. Therefore, I have to join on that table for both customer and order data. I have about 14,000 customers, 30,000 orders and 160,000 total records in that table.
Can I change this query to speed it up?
SELECT link.author_id AS customer_id,
customers.field_id_122 AS company,
Sum(orders.field_id_22) AS total_orders
FROM exp_channel_data customers
JOIN exp_channel_titles link
ON link.author_id = customers.field_id_117
AND customers.channel_id = 7
JOIN exp_channel_data orders
ON orders.entry_id = link.entry_id
AND orders.channel_id = 3
GROUP BY customer_id
Thanks, and please let me know if I should include other information.
UPDATE SOLUTION
My apologies. I noticed that entry_id for the exp_channel_data table customers corresponds to author_id for the exp_channel_titles table. So I don't have to use field_id_117 in the join. field_id_117 duplicates entry_id, but in a TEXT field. JOINING on that text field slowed things down. The query is now 3 seconds
However, the inner join solution posted by #DRapp is 1.5 seconds. Here is his sql with a minor edit:
SELECT
PQ.author_id CustomerID,
c.field_id_122 CompanyName,
PQ.totalOrders
FROM
( SELECT
t.author_id
SUM( o.field_id_22 ) as totalOrders
FROM
exp_channel_data o
JOIN
exp_channel_titles t ON t.author_id = o.entry_id AND o.channel_id = 3
GROUP BY
t.author_id ) PQ
JOIN
exp_channel_data c ON PQ.author_id = c.entry_id AND c.channel_id = 7
ORDER BY CustomerID
If this is the same table, then the same columns across the board for all alias instances.
I would ensure an index on (channel_id, entry_id, field_id_117 ) if possible. Another index on (author_id) for the prequery of order totals
Then, start first with what will become an inner query doing nothing but a per customer sum of order amounts.. Since the join is the "author_id" as the customer ID, just query/sum that first. Not completely understanding the (what I would consider) poor design of the structure, knowing what the "Channel_ID" really indicates, you don't want to duplicate summation values because of these other things in the mix.
select
o.author_id,
sum( o.field_id_22 ) as totalOrders
FROM
exp_channel_data customers o
where
o.channel_id = 3
group by
o.author_id
If that is correct on the per customer (via author_id column), then that can be wrapped as follows
select
PQ.author_id CustomerID,
c.field_id_122 CompanyName,
PQ.totalOrders
from
( select
o.author_id,
sum( o.field_id_22 ) as totalOrders
FROM
exp_channel_data customers o
where
o.channel_id = 3
group by
o.author_id ) PQ
JOIN exp_channel_data c
on PQ.author_id = c.field_id_117
AND c.channel_id = 7
Can you post the results of an EXPLAIN query?
I'm guessing that your tables are not indexed well for this operation. All of the columns that you join on should probably be indexed. As a first guess I'd look at indexing exp_channel_data.field_id_117
Try something like this. Possibly you have error in joins. also check whether joins on columns are correct in your databases. Cross join may takes time to fetch large data, by mistake if your joins are not proper on columns.
select
link.author_id as customer_id,
customers.field_id_122 as company,
sum(orders.field_id_22) as total_or_orders
from exp_channel_data customers
join exp_channel_titles link on (link.author_id = customers.field_id_117 and
link.author_id = customer.channel_id = 7)
join exp_channel_data orders on (orders.entry_id = link.entry_id and orders.entry_id = orders.channel_id = 3)
group by customer_id