I'm fairly new to more advanced SQL queries.
Given the following tables and associated fields:
Person
PersonId, FirstName, LastName
Order
OrderId, PersonId, OrderDateTime
I want to write a query that will join both tables by PersonId and will retrieve every person and their most recent order.
So if James Doe (PersonId = 1) below has many orders in the orders table,
OrderId, PersonId, OrderDateTime
1 1 12/1/2013 9:01 AM
2 1 2/1/2011 5:01 AM
3 2 10/1/2010 1:10 AM
it should return only his most recent order:
PersonId NameFirst NameLast OrderId OrderDateTime
1 James Doe 1 12/1/2013 9:01 AM
2 John Doe 3 10/1/2010 1:10 AM
I have been trying something like this
SELECT p.PersonID, o.OrderID, MAX(o.OrderDateTime) From Person p
JOIN Orders o ON p.PersonID = o.PersonID
GROUP BY p.PersonID,
Thanks
The inner query in this solution is a derived table containing the most recent order time for each person. I join this back to the Orders table to get the fields you want, and then join again to the Person table.
SELECT p.PersonID, p.NameFirst, p.NameLast, o.OrderID, o.OrderDateTime
FROM Person p INNER JOIN Orders o
ON o.PersonId = p.PersonId
INNER JOIN
(
SELECT o1.PersonId, MAX(o1.OrderDateTime) AS maxTime
FROM Orders o1
GROUP BY o1.PersonId
) t
ON o.PersonId = t.PersonId AND o.OrderDateTime = t.maxTime
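To check the join-back-to-the-maximum approach above against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL or SQL Server. The ISO-formatted date strings are an assumption made so that MAX() compares them chronologically in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, NameFirst TEXT, NameLast TEXT);
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PersonId INTEGER, OrderDateTime TEXT);
INSERT INTO Person VALUES (1, 'James', 'Doe'), (2, 'John', 'Doe');
-- ISO-8601 strings (an assumption) so that MAX() orders them chronologically
INSERT INTO Orders VALUES
    (1, 1, '2013-12-01 09:01'),
    (2, 1, '2011-02-01 05:01'),
    (3, 2, '2010-10-01 01:10');
""")

latest_orders = conn.execute("""
SELECT p.PersonId, p.NameFirst, p.NameLast, o.OrderId, o.OrderDateTime
FROM Person p
INNER JOIN Orders o ON o.PersonId = p.PersonId
INNER JOIN (
    -- one row per person: the time of their latest order
    SELECT o1.PersonId, MAX(o1.OrderDateTime) AS maxTime
    FROM Orders o1
    GROUP BY o1.PersonId
) t ON o.PersonId = t.PersonId AND o.OrderDateTime = t.maxTime
ORDER BY p.PersonId
""").fetchall()

for row in latest_orders:
    print(row)
```

Because the match is on the timestamp, a person with two orders sharing the same latest OrderDateTime would get two rows, and a person with no orders would not appear at all.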
You can use user variables to simulate ROW_NUMBER, which is not available in MySQL before 8.0:
SELECT p.PersonId, FirstName, LastName,
o.OrderId, o.OrderDateTime
FROM Person AS p
LEFT JOIN (
SELECT OrderId, OrderDateTime, PersonId,
@row_number := IF(@pid <> PersonId,
IF(@pid:=PersonId, 1, 1),
IF(@pid:=PersonId, @row_number+1, @row_number+1)) AS rn
FROM `Order`
CROSS JOIN (SELECT @row_number := 0, @pid := 0) vars
ORDER BY PersonId, OrderDateTime DESC
) AS o ON p.PersonId = o.PersonId AND o.rn = 1
rn = 1 marks the most recent record within each PersonId slice of the derived table. Using this predicate in the ON clause of the LEFT JOIN, we can match each row of Person to the most recent row of Order and obtain all Order fields.
EDIT:
In SQL-Server the query looks like this:
SELECT p.*, o.OrderId, o.OrderDateTime
FROM Person AS p
LEFT JOIN (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY PersonId
ORDER BY OrderDateTime DESC) AS rn
FROM [Order]
) AS o ON p.PersonId = o.PersonId AND o.rn = 1
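The ROW_NUMBER variant can be tried out locally with a small sketch in Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first version with window functions). The table is named Orders here to avoid quoting a reserved word, and the third person, Jane Roe, is a hypothetical row added to show what the LEFT JOIN does for someone with no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PersonId INTEGER, OrderDateTime TEXT);
-- Jane Roe is a made-up person with no orders
INSERT INTO Person VALUES (1, 'James', 'Doe'), (2, 'John', 'Doe'), (3, 'Jane', 'Roe');
INSERT INTO Orders VALUES
    (1, 1, '2013-12-01 09:01'),
    (2, 1, '2011-02-01 05:01'),
    (3, 2, '2010-10-01 01:10');
""")

rows = conn.execute("""
SELECT p.PersonId, p.FirstName, p.LastName, o.OrderId, o.OrderDateTime
FROM Person AS p
LEFT JOIN (
    -- number each person's orders from newest (rn = 1) to oldest
    SELECT OrderId, PersonId, OrderDateTime,
           ROW_NUMBER() OVER (PARTITION BY PersonId
                              ORDER BY OrderDateTime DESC) AS rn
    FROM Orders
) AS o ON p.PersonId = o.PersonId AND o.rn = 1
ORDER BY p.PersonId
""").fetchall()

for row in rows:
    print(row)
```

Unlike the INNER JOIN versions, this keeps people without orders, with NULL (None in Python) in the order columns.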
So you want the most recent order for every person. You could do something like this:
select p.*, o.*
from person p
inner join orders o on p.personId = o.personId
inner join (
-- get the max order per person
SELECT max(orderId) as orderId, personId
from orders
group by personId
) maxOrder on o.orderId = maxOrder.orderId
Joining o onto maxOrder on the orderId filters the result set down to the orders that are also the maximum order per person.
Note that I used the ID on the table rather than the datetime, as the ID is guaranteed to be unique (which helps with joins). This only returns the most recent order if order IDs are assigned in chronological order, which is not always the case: in the sample data above, order 2 is older than order 1, so for that data you would need to use the datetime instead.
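Whether the highest OrderId really is the most recent order depends on the data. As a quick check, this sketch (Python's sqlite3, with the question's three sample rows and dates rewritten as ISO strings) runs the MAX(orderId) join and prints what it picks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PersonId INTEGER, OrderDateTime TEXT);
INSERT INTO Orders VALUES
    (1, 1, '2013-12-01 09:01'),
    (2, 1, '2011-02-01 05:01'),
    (3, 2, '2010-10-01 01:10');
""")

by_max_id = conn.execute("""
SELECT o.PersonId, o.OrderId, o.OrderDateTime
FROM Orders o
INNER JOIN (
    -- highest order ID per person, regardless of date
    SELECT MAX(OrderId) AS OrderId, PersonId
    FROM Orders
    GROUP BY PersonId
) maxOrder ON o.OrderId = maxOrder.OrderId
ORDER BY o.PersonId
""").fetchall()

for row in by_max_id:
    print(row)
```

For person 1 this selects order 2 from 2011, although order 1 from 2013 is more recent, so with this sample data the ID-based shortcut and the date-based queries disagree.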
I have this schema here, and I need to find the name of the customer with the highest total amount for the orders. I have a SQL query here:
SELECT Name
FROM (SELECT Name, SUM(Amount) AS Total
FROM customer JOIN orders ON cust_id = ID
GROUP BY Name) AS Totals
WHERE Total = (SELECT MAX(Total)
FROM (SELECT Name, SUM(Amount) AS Total
FROM customer JOIN orders ON cust_id = ID
GROUP BY Name) AS X);
But this is very inefficient as it creates the same table twice. Is there any more efficient way to get the name?
If you want the customer with the greatest total amount, then you can just join, order by and limit:
select c.name
from customer c
inner join orders o on o.cust_id = c.id
group by c.id, c.name
order by sum(o.amount) desc
limit 1
Note that this does not handle possible top ties. For this, you need a little more code. Instead of ordering, you would typically filter with a having clause:
select c.name
from customer c
inner join orders o on o.cust_id = c.id
group by c.id, c.name
having sum(o.amount) = (
select sum(o1.amount)
from orders o1
group by cust_id
order by sum(o1.amount) desc
limit 1
)
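To see the difference between the LIMIT 1 query and the HAVING query when there is a tie, here is a small sketch using Python's sqlite3 module. The customers and amounts are made up for illustration: Ann and Bob both total 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (cust_id INTEGER, amount INTEGER);
-- hypothetical data: Ann (50 + 50) and Bob (100) tie for the top total
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (1, 50), (1, 50), (2, 100), (3, 10);
""")

# ORDER BY ... LIMIT 1 keeps a single row even when totals are tied
top_one = conn.execute("""
SELECT c.name
FROM customer c
INNER JOIN orders o ON o.cust_id = c.id
GROUP BY c.id, c.name
ORDER BY SUM(o.amount) DESC
LIMIT 1
""").fetchall()

# HAVING compares each customer's total with the highest total
top_with_ties = conn.execute("""
SELECT c.name
FROM customer c
INNER JOIN orders o ON o.cust_id = c.id
GROUP BY c.id, c.name
HAVING SUM(o.amount) = (
    SELECT SUM(o1.amount)
    FROM orders o1
    GROUP BY o1.cust_id
    ORDER BY SUM(o1.amount) DESC
    LIMIT 1
)
ORDER BY c.name
""").fetchall()

print(top_one)
print(top_with_ties)
```

The first query returns one of the two tied names (which one is not specified), while the second returns both.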
Finally: if you are running MySQL 8.0, this is more simply done with the window function rank():
select name
from (
select c.name, rank() over(order by sum(o.amount) desc) rn
from customer c
inner join orders o on o.cust_id = c.id
group by c.id, c.name
) t
where rn = 1
Here's my orders table:
I want to select all orders excluding very first order of each customer (if customer has placed multiple orders).
So if a customer, e.g. 215, has 8 orders in total, then I want to select his last 7 orders, excluding his very first order 70000, which was placed on 10 July 2017.
But if a customer e.g. 219 had placed only one order 70007, it must be selected by the query.
Using an anti-join approach:
SELECT o1.order_id, o1.customer_id, o1.order_date, o1.order_value
FROM orders o1
LEFT JOIN
(
SELECT customer_id, MIN(order_date) AS min_order_date, COUNT(*) AS cnt
FROM orders
GROUP BY customer_id
) o2
ON o1.customer_id = o2.customer_id AND
o1.order_date = o2.min_order_date
WHERE
o2.customer_id IS NULL OR
o2.cnt = 1;
The idea here is to try to match each record in orders to a record in the subquery, which contains only first order records, for each customer. If we can't find a match, then such an order record cannot be the first.
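The anti-join idea can be checked on a tiny data set. The sketch below uses Python's sqlite3 module. Only order 70000 (10 July 2017) and order 70007 come from the question. The other two orders and all order values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, order_value REAL);
INSERT INTO orders VALUES
    (70000, 215, '2017-07-10', 10.0),  -- first order of customer 215
    (70001, 215, '2017-08-01', 20.0),  -- hypothetical later order
    (70002, 215, '2017-09-01', 30.0),  -- hypothetical later order
    (70007, 219, '2017-07-15', 40.0);  -- only order of customer 219
""")

selected_ids = [row[0] for row in conn.execute("""
SELECT o1.order_id
FROM orders o1
LEFT JOIN (
    -- first order date and order count per customer
    SELECT customer_id, MIN(order_date) AS min_order_date, COUNT(*) AS cnt
    FROM orders
    GROUP BY customer_id
) o2 ON o1.customer_id = o2.customer_id
    AND o1.order_date = o2.min_order_date
-- keep orders that are not a first order, or that are the customer's only order
WHERE o2.customer_id IS NULL OR o2.cnt = 1
ORDER BY o1.order_id
""")]

print(selected_ids)
```

Order 70000 matches the subquery with cnt = 3 and is dropped. Orders 70001 and 70002 find no match and are kept. Order 70007 matches with cnt = 1 and is kept.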
You can try the query below. The first part returns the orders of customers who have only one order; the second returns every order that is not a customer's first:
select order_id,customer_id,order_date,order_Value
from tablename
where customer_id in (select customer_id from tablename
group by customer_id
having count(order_id)=1)
union all
select order_id,customer_id,order_date,order_Value
from tablename a where order_date not in (select min(order_date) from tablename b
where a.customer_id=b.customer_id)
Solution
Dear @Tim Biegeleisen, your answer is almost there. Just add HAVING COUNT(customer_id) > 1 to the subquery.
So the query is below:
SELECT o1.order_id, o1.customer_id, o1.order_date, o1.order_value
FROM orders o1
LEFT JOIN (
SELECT customer_id, MIN(order_date) AS min_order_date
FROM orders
GROUP BY customer_id
HAVING COUNT(customer_id)>1
) o2
ON o1.customer_id = o2.customer_id AND
o1.order_date = o2.min_order_date
WHERE
o2.customer_id IS NULL;
I have these tables : customers, customer_invoices, customer_invoice_details, each customer has many invoices, and each invoice has many details.
The customer with the ID 574413 has these invoices :
select customer_invoices.customer_id,
customer_invoices.id,
customer_invoices.total_price
from customer_invoices
where customer_invoices.customer_id = 574413;
result :
customer_id invoice_id total_price
574413 662146 700.00
574413 662147 250.00
each invoice here has two details (or invoice lines) :
first invoice 662146:
select customer_invoice_details.id as detail_id,
customer_invoice_details.customer_invoice_id as invoice_id,
customer_invoice_details.total_price as detail_total_price
from customer_invoice_details
where customer_invoice_details.customer_invoice_id = 662146;
result :
detail_id invoice_id detail_total_price
722291 662146 500.00
722292 662146 200.00
second invoice 662147 :
select customer_invoice_details.id as detail_id,
customer_invoice_details.customer_invoice_id as invoice_id,
customer_invoice_details.total_price as detail_total_price
from customer_invoice_details
where customer_invoice_details.customer_invoice_id = 662147;
result :
detail_id invoice_id detail_total_price
722293 662147 100.00
722294 662147 150.00
I have a problem with this query :
select customers.id as customerID,
customers.last_name,
customers.first_name,
SUM(customer_invoices.total_price) as invoice_total,
SUM(customer_invoice_details.total_price) as details_total
from customers
join customer_invoices
on customer_invoices.customer_id = customers.id
join customer_invoice_details
on customer_invoice_details.customer_invoice_id = customer_invoices.id
where customer_id = 574413;
unexpected result :
customerID last_name first_name invoice_total details_total
574413 terry amine 1900.00 950.00
I need the SUM of the total_price of the invoices and the SUM of the total_price of the details for each customer. In this case I should get 950 for both columns (invoice_total and details_total), but I don't. What am I doing wrong, and how can I get the correct result? The answers in similar topics don't have a solution for this case.
When you mix normal columns with aggregate functions (for example SUM), you need to use GROUP BY where you list the normal columns from the SELECT.
The reason for the excessive amount in total_price for invoices is that the SUM is also calculated over each detail row as it is part of the join. Use this:
select c.id as customerID,
c.last_name,
c.first_name,
SUM(ci.total_price) as invoice_total,
SUM((select SUM(d.total_price)
from customer_invoice_details d
where d.customer_invoice_id = ci.id)) as 'detail_total_price'
from customers c
join customer_invoices ci on ci.customer_id = c.id
where c.id = 574413
group by c.id, c.last_name, c.first_name
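To make the duplication visible, this sketch loads the question's exact invoices and details into an in-memory SQLite database (via Python's sqlite3, not MySQL) and prints the rows the two joins produce before any SUM is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT);
CREATE TABLE customer_invoices (id INTEGER PRIMARY KEY, customer_id INTEGER,
                                total_price REAL);
CREATE TABLE customer_invoice_details (id INTEGER PRIMARY KEY,
                                       customer_invoice_id INTEGER,
                                       total_price REAL);
INSERT INTO customers VALUES (574413, 'terry', 'amine');
INSERT INTO customer_invoices VALUES
    (662146, 574413, 700.0), (662147, 574413, 250.0);
INSERT INTO customer_invoice_details VALUES
    (722291, 662146, 500.0), (722292, 662146, 200.0),
    (722293, 662147, 100.0), (722294, 662147, 150.0);
""")

# One output row per detail line, so each invoice price appears twice
joined_rows = conn.execute("""
SELECT ci.id, ci.total_price, d.id, d.total_price
FROM customers c
JOIN customer_invoices ci ON ci.customer_id = c.id
JOIN customer_invoice_details d ON d.customer_invoice_id = ci.id
ORDER BY d.id
""").fetchall()

for row in joined_rows:
    print(row)

invoice_total = sum(row[1] for row in joined_rows)
details_total = sum(row[3] for row in joined_rows)
print(invoice_total, details_total)
```

The four joined rows carry the invoice prices 700, 700, 250 and 250, which is where the unexpected 1900 comes from, while each detail row appears only once and sums to 950.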
I used join against sub queries and then did a sum on the sums
SELECT c.id as customerID,
c.last_name,
c.first_name,
SUM(i.sum) as invoice_total,
SUM(d.sum) AS details_total
FROM customers c
JOIN (SELECT id, customer_id, SUM(total_price) AS sum
FROM customer_invoices
GROUP BY id, customer_id) AS i ON i.customer_id = c.id
JOIN (SELECT customer_invoice_id as id, SUM(total_price) AS sum
FROM customer_invoice_details
GROUP BY customer_invoice_id) AS d ON d.id = i.id
WHERE c.id = 574413
GROUP BY c.id, c.last_name, c.first_name
The issue is in the join logic. The table customers is the driving table, and the second join attaches customer_invoice_details to each invoice row produced by the first join. Because each invoice has two detail rows, every invoice row appears twice in the joined result, so customer_invoices.total_price is counted twice and its rolled-up value is doubled.
At a high level I feel that the purpose of rolling up the prices is already achieved in SUM(customer_invoice_details.total_price).
But if you have a specific project requirement that SUM(customer_invoices.total_price) should also be obtained and must match with SUM(customer_invoice_details.total_price), then you can do this:
In a separate query, join customer_invoice_details and customer_invoices, rolling the details up per invoice first so that invoice prices are not repeated, and roll up the pricing fields so that you have only one record per customer ID.
Then use this as a sub-query and join it with the customers table.
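One possible reading of these steps, sketched with Python's sqlite3 module and the question's data. Pre-aggregating the details per invoice inside the subquery is an assumption on my part about how to avoid the repeated invoice rows. The aliases d and t are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT);
CREATE TABLE customer_invoices (id INTEGER PRIMARY KEY, customer_id INTEGER,
                                total_price REAL);
CREATE TABLE customer_invoice_details (id INTEGER PRIMARY KEY,
                                       customer_invoice_id INTEGER,
                                       total_price REAL);
INSERT INTO customers VALUES (574413, 'terry', 'amine');
INSERT INTO customer_invoices VALUES
    (662146, 574413, 700.0), (662147, 574413, 250.0);
INSERT INTO customer_invoice_details VALUES
    (722291, 662146, 500.0), (722292, 662146, 200.0),
    (722293, 662147, 100.0), (722294, 662147, 150.0);
""")

totals = conn.execute("""
SELECT c.id, c.last_name, c.first_name, t.invoice_total, t.details_total
FROM customers c
JOIN (
    -- one row per customer: invoices joined to per-invoice detail totals
    SELECT ci.customer_id,
           SUM(ci.total_price) AS invoice_total,
           SUM(d.details_total) AS details_total
    FROM customer_invoices ci
    JOIN (
        -- one row per invoice, so invoice rows are not duplicated
        SELECT customer_invoice_id, SUM(total_price) AS details_total
        FROM customer_invoice_details
        GROUP BY customer_invoice_id
    ) d ON d.customer_invoice_id = ci.id
    GROUP BY ci.customer_id
) t ON t.customer_id = c.id
WHERE c.id = 574413
""").fetchone()

print(totals)
```

Because of the inner JOIN, an invoice without any detail rows would be left out of both totals. A LEFT JOIN on d would keep it.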
You are aggregating along multiple dimensions. This is challenging. I would suggest doing the aggregation along each dimension independently:
select c.id as customerID, c.last_name, c.first_name,
ci.invoice_total,
cid.details_total
from customers c join
(select ci.customer_id, sum(ci.total_price) as invoice_total
from customer_invoices ci
group by ci.customer_id
) ci
on ci.customer_id = c.id join
(select ci.customer_id, sum(cid.total_price) as details_total
from customer_invoices ci join
customer_invoice_details cid
on cid.customer_invoice_id = ci.id
group by ci.customer_id
) cid
on cid.customer_id = c.id
where c.id = 574413;
A faster version (for one customer) uses correlated subqueries:
select c.id as customerID, c.last_name, c.first_name,
(select sum(ci.total_price)
from customer_invoices ci
where ci.customer_id = c.id
) as invoice_total,
(select sum(cid.total_price)
from customer_invoices ci join
customer_invoice_details cid
on cid.customer_invoice_id = ci.id
where ci.customer_id = c.id
) as details_total
from customers c
where c.id = 574413;
I have 2 MySQL tables:
Persons:
person_id
person_name
Orders:
person_id
cost
order_name
I need to write a single SQL query that gets each person's name and their cheapest order.
I want to write something like this: SELECT person_name, order=(SELECT order_name FROM orders WHERE order.person_id = person.person_id ORDER BY order.cost ASC LIMIT 1) FROM person
But it doesn't work.
I would use a correlated query. There are many ways to do it: as a subquery in the SELECT clause, or as a subquery in the WHERE clause.
In this case, I implemented the second option:
SELECT
person_name,
order_name
FROM Persons
INNER JOIN Orders ON Persons.person_id = Orders.person_id
WHERE
(
cost = (
SELECT MIN(cost)
FROM Orders
WHERE
(Orders.person_id = Persons.person_id)
)
)
Remove order= and add AS order_name after the subquery.
Something like this:
SELECT p.person_name
, ( SELECT o.order_name
FROM orders o
WHERE o.person_id = p.person_id
ORDER BY o.cost ASC
LIMIT 1
) AS order_name
FROM Persons p
ORDER BY p.person_name
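A runnable sketch of the correlated-subquery version, using Python's sqlite3 module in place of MySQL. All names and costs below are made-up sample data, and Carol has no orders on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (person_id INTEGER PRIMARY KEY, person_name TEXT);
CREATE TABLE Orders (person_id INTEGER, cost INTEGER, order_name TEXT);
-- hypothetical sample data
INSERT INTO Persons VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO Orders VALUES (1, 10, 'pen'), (1, 5, 'pencil'), (2, 7, 'book');
""")

cheapest = conn.execute("""
SELECT p.person_name,
       ( -- cheapest order of this person; NULL if they have none
         SELECT o.order_name
         FROM Orders o
         WHERE o.person_id = p.person_id
         ORDER BY o.cost ASC
         LIMIT 1
       ) AS order_name
FROM Persons p
ORDER BY p.person_name
""").fetchall()

for row in cheapest:
    print(row)
```

Every person appears exactly once, including those without orders. If two orders tie on cost, LIMIT 1 returns just one of them.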
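SELECT persons.person_id, person_name, MIN(cost) AS cost
FROM persons
INNER JOIN orders
ON persons.person_id = orders.person_id
GROUP BY persons.person_id, person_name;
You can get the lowest cost per person just by joining your 2 tables and using MIN with GROUP BY. Note that this returns only the cost; to get the matching order_name as well, join this result back to orders on person_id and cost.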
I am having to set up a query that retrieves the last comment made on a customer, if no one has commented on them for more than 4 weeks. I can make it work using the query below, but for some reason the comment column won't display the latest record. Instead it displays the oldest, however the date shows the newest. It may just be because I'm a noob at SQL, but what exactly am I doing wrong here?
SELECT DISTINCT
customerid, id, customername, user, MAX(date) AS 'maxdate', comment
FROM comments
WHERE customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
GROUP BY customerid
ORDER BY maxdate
The first "WHERE" clause is just ensuring that it shows only customers from a specific area, and that they are "past due enabled". The second makes sure that the customer has not been commented on within the last 27 days. It's grouped by customerid, because that is the number that is associated with each individual customer. When I get the results, everything is right except for the comment column...any ideas?
A join is generally preferable to nested IN subqueries here, and is often faster.
The query below filters the customers with a join and keeps only those whose latest comment is more than 27 days old. Note that comment is still neither aggregated nor grouped, so it can come from any row of the group.
SELECT comments.customerid, comments.id, customername, user, MAX(comments.date) AS 'maxdate', comment
FROM comments inner join customers on comments.customerid = customers.id
WHERE customers.pastdue='1' AND customers.hubarea='1'
GROUP BY comments.customerid
HAVING DATEDIFF(NOW(), MAX(comments.date)) > 27
ORDER BY maxdate
I think this should do what you are trying to achieve. If you can run it and report back whether it works, I can tweak it if needed. Logically, it should work, if I have understood your problem correctly :)
SELECT X.customerid, X.maxdate, co.id, c.customername, co.user, co.comment
FROM
(SELECT customerid, MAX(date) AS 'maxdate'
FROM comments cm
INNER JOIN customers cu ON cu.id = cm.customerid
WHERE cu.pastdue='1'
AND cu.hubarea='1'
GROUP BY customerid
HAVING DATEDIFF(NOW(), MAX(date)) > 27) X
INNER JOIN comments co ON X.customerid = co.customerid and X.maxdate = co.date
INNER JOIN customers c ON X.customerid = c.id
ORDER BY X.maxdate
You need a subquery for each condition.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN
(
SELECT DISTINCT ID
FROM customers
WHERE pastdue = 1 AND hubarea = 1
) c ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL
The first join gets the latest record for each customer.
The second join keeps only customers from the specific area who are "past due enabled".
The third join, which uses LEFT JOIN, matches every customer that has been commented on within the last 27 days. Because of the condition d.customerID IS NULL, only customers with no match in that list are selected.
To make your query shorter: if the customers table already has one unique record per customer, you don't need a subquery on it. Join the table directly and put the conditions in the WHERE clause.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN customers c
ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL AND
c.pastdue = 1 AND
c.hubarea = 1
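The shorter query can be exercised with a sketch in Python's sqlite3 module. Two things are assumptions here: SQLite has no DATEDIFF or NOW, so the day difference is computed with julianday() against a fixed reference date passed as a parameter, and the customers and comments are made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, pastdue INTEGER, hubarea INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, customerid INTEGER,
                       date TEXT, comment TEXT);
-- hypothetical sample data
INSERT INTO customers VALUES (1, 1, 1), (2, 1, 1), (3, 0, 1);
INSERT INTO comments VALUES
    (1, 1, '2024-01-01', 'old'),
    (2, 1, '2024-01-10', 'newer'),    -- latest comment for customer 1
    (3, 2, '2024-02-25', 'recent'),   -- within 27 days of the reference date
    (4, 3, '2024-01-05', 'not past due');
""")

stale = conn.execute("""
SELECT a.customerid, a.date, a.comment
FROM comments a
INNER JOIN (
    -- latest comment date per customer
    SELECT customerid, MAX(date) AS maxDate
    FROM comments
    GROUP BY customerid
) b ON a.customerid = b.customerid AND a.date = b.maxDate
INNER JOIN customers c ON c.id = a.customerid
LEFT JOIN (
    -- customers commented on within the last 27 days
    SELECT DISTINCT customerid
    FROM comments
    WHERE julianday(:today) - julianday(date) <= 27
) d ON a.customerid = d.customerid
WHERE d.customerid IS NULL
  AND c.pastdue = 1
  AND c.hubarea = 1
""", {"today": "2024-03-01"}).fetchall()

print(stale)
```

Customer 1 is returned with the newer of the two comments, customer 2 is excluded by the anti-join, and customer 3 is excluded by the pastdue filter.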
Some of your selected columns are contained in neither an aggregate function nor the GROUP BY clause. For example, suppose that you have two rows with the same customer id and the same date but different comment data. How should SQL aggregate these two rows? Depending on the SQL mode, MySQL either raises an error (with ONLY_FULL_GROUP_BY enabled) or returns the value from an arbitrary row of the group.
Try this:
select customerid, id, customername, user, date, comment from (
select customerid, id, customername, user, date, comment,
@rank := IF(@current_customer = customerid, @rank + 1, 1) AS rnk,
@current_customer := customerid
from comments
where customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
order by customerid, date desc
) ranked where rnk <= 1