JOIN VS SUBQUERY - mysql

I need to do this but with a subquery, not a join. My problem is, how can I use a subquery to display another column? I could grab the info from there, but I'll be missing the order_date column from the orders table. Can I use a subquery to display it?
SELECT CONCAT(c.customer_first_name, ' ' , c.customer_last_name) AS customer_name, MAX(o.order_date) AS recent_order_date
FROM customers AS c
JOIN orders AS o
ON c.customer_id = o.customer_id
GROUP BY customer_name
ORDER BY MAX(o.order_date) DESC

It's not at all clear what result set you are trying to return, but it looks an awful lot like the ubiquitous "latest row" problem.
The normative pattern for the solution to that problem is to use a JOIN to an inline view. If there's no unique constraint, you run the risk of returning more than one matching row.
To get the latest order (the row in the orders table with the maximum order_date) for each customer, assuming that the (customer_id, order_date) tuple is unique, you can do something like this:
SELECT o.*
FROM ( SELECT n.customer_id
, MAX(n.order_date) AS latest_order_date
FROM orders n
GROUP BY n.customer_id
) m
JOIN orders o
ON o.customer_id = m.customer_id
AND o.order_date = m.latest_order_date
If you also want to retrieve columns from the customers table based on the customer_id returned from orders, you'd use a JOIN (not a subquery):
SELECT CONCAT(c.customer_first_name,' ',c.customer_last_name) AS customer_name
, c.whatever
, o.order_date AS recent_order_date
, o.whatever
FROM ( SELECT n.customer_id
, MAX(n.order_date) AS latest_order_date
FROM orders n
GROUP BY n.customer_id
) m
JOIN orders o
ON o.customer_id = m.customer_id
AND o.order_date = m.latest_order_date
JOIN customers c
ON c.customer_id = o.customer_id
ORDER BY o.order_date DESC, o.customer_id DESC
As I mentioned before, if a given customer can have two orders with the exact same value for order_date, there's potential to return more than one order for each customer_id.
To rectify that, we can return a unique key from the inline view, and use that in the join predicate to guarantee only a single row returned from orders.
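(NOTE: this approach is specific to MySQL, and only when the ONLY_FULL_GROUP_BY SQL mode is disabled. With that mode enabled (the default since MySQL 5.7.5), and in other RDBMS, this syntax throws an error that essentially says "the GROUP BY must include all non-aggregates". Also be aware that the order_id MySQL returns is taken from an indeterminate row in each group, not necessarily the row with the latest order_date, so the join can come back empty for a customer.)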
SELECT CONCAT(c.customer_first_name,' ',c.customer_last_name) AS customer_name
, c.whatever
, o.order_date AS recent_order_date
, o.whatever
FROM ( SELECT n.customer_id
, MAX(n.order_date) AS latest_order_date
, n.order_id
FROM orders n
GROUP BY n.customer_id
) m
JOIN orders o
ON o.customer_id = m.customer_id
AND o.order_date = m.latest_order_date
AND o.order_id = m.order_id
JOIN customers c
ON c.customer_id = o.customer_id
ORDER BY o.order_date DESC, o.customer_id DESC
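To make the tie problem concrete, here is a minimal sketch that runs against SQLite through Python's sqlite3 module, used only as a stand-in for MySQL. The sample rows are made up, and the tie-break shown is a different technique from the one above: a correlated subquery with ORDER BY ... LIMIT 1, which should also be valid in MySQL. Treat it as an illustration, not a drop-in replacement.

```python
import sqlite3

# Made-up sample data: customer 1 has two orders on the same latest date.
SAMPLE_ORDERS = [
    (1, 1, "2024-01-05"),
    (2, 1, "2024-03-01"),
    (3, 1, "2024-03-01"),
    (4, 2, "2024-02-10"),
]

def latest_orders(rows):
    """rows: (order_id, customer_id, order_date) tuples; returns both result sets."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER, order_date TEXT)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

    # Join to the inline view on (customer_id, order_date): tied rows all survive.
    by_date = con.execute("""
        SELECT o.customer_id, o.order_id
          FROM ( SELECT n.customer_id, MAX(n.order_date) AS latest_order_date
                   FROM orders n
                  GROUP BY n.customer_id ) m
          JOIN orders o
            ON o.customer_id = m.customer_id
           AND o.order_date = m.latest_order_date
         ORDER BY o.customer_id, o.order_id
    """).fetchall()

    # Tie-break: pick exactly one order_id per customer, latest date first,
    # highest order_id winning when dates are equal.
    tie_broken = con.execute("""
        SELECT o.customer_id, o.order_id
          FROM orders o
         WHERE o.order_id = ( SELECT o2.order_id
                                FROM orders o2
                               WHERE o2.customer_id = o.customer_id
                               ORDER BY o2.order_date DESC, o2.order_id DESC
                               LIMIT 1 )
         ORDER BY o.customer_id
    """).fetchall()
    con.close()
    return by_date, tie_broken
```

Calling latest_orders(SAMPLE_ORDERS) returns two rows for customer 1 from the first query and one row per customer from the second.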

I am not really sure I understand your question, but I think this works... (not tested though...)
SELECT
(
SELECT
CONCAT(c.customer_first_name, ' ' , c.customer_last_name)
FROM
customers c
WHERE
c.customer_id = o.customer_id
LIMIT 1
) AS customer_name,
MAX(o.order_date) AS recent_order_date
FROM
orders o
GROUP BY
customer_name
ORDER BY
MAX(o.order_date) DESC
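For anyone who wants to try the scalar-subquery idea, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. Assumptions to note: the sample rows are invented, SQLite's || operator replaces CONCAT, and the grouping is done on o.customer_id rather than on the alias.

```python
import sqlite3

def recent_order_per_customer():
    """Return (customer_name, recent_order_date) rows, newest first."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY,
                                customer_first_name TEXT,
                                customer_last_name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             customer_id INTEGER,
                             order_date TEXT);
        INSERT INTO customers VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
        INSERT INTO orders VALUES (1, 1, '2024-01-05'),
                                  (2, 1, '2024-03-01'),
                                  (3, 2, '2024-02-10');
    """)
    # The name comes from a correlated scalar subquery instead of a JOIN;
    # in MySQL the concatenation would be written with CONCAT(...).
    rows = con.execute("""
        SELECT ( SELECT c.customer_first_name || ' ' || c.customer_last_name
                   FROM customers c
                  WHERE c.customer_id = o.customer_id ) AS customer_name,
               MAX(o.order_date) AS recent_order_date
          FROM orders o
         GROUP BY o.customer_id
         ORDER BY recent_order_date DESC
    """).fetchall()
    con.close()
    return rows
```

The outer query only touches orders, so every extra customer column you want would need its own scalar subquery.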

Related

Finding first order in a single year

I'm trying to determine how many new people made an order in 2018. This looks straightforward enough, but I get an error when I put calculated fields (aggregates such as MIN) in the WHERE clause.
SELECT DISTINCT COUNT(c.customer_id)
FROM Customer c
LEFT JOIN
Orders o ON c.customer_id=o.customer_id
WHERE MIN(order_date) > '2017-12-31'
AND MIN(order_date) < '2019-01-01';
You can achieve this by putting a sequence number to the orders and then selecting the first row for each customer. Although, I'm not really sure why you're performing a count of the orders when you just want to consider the first orders. Nevertheless the below should work just fine.
-- Number every order per customer first, then keep only first orders that fall in 2018.
SELECT COUNT(res.customer_id) FROM (
SELECT c.customer_id, o.order_date,
ROW_NUMBER() OVER (PARTITION BY c.customer_id ORDER BY o.order_date ASC) row_num
FROM Customer c
LEFT JOIN Orders o ON c.customer_id=o.customer_id
) res WHERE res.row_num=1
AND res.order_date > '2017-12-31'
AND res.order_date < '2019-01-01'
Join with a subquery that finds the customers that were new in 2018.
SELECT COUNT(DISTINCT o.customer_id)
FROM Orders o
JOIN (
SELECT DISTINCT customer_id
FROM Orders
GROUP BY customer_id
HAVING MIN(order_date) > '2017-12-31'
) o1 ON o1.customer_id = o.customer_id
WHERE o.order_date < '2019-01-01';
There's also no need to join with Customer, since the customer ID is in Orders.
And the correct way to get the distinct count is COUNT(DISTINCT o.customer_id), not DISTINCT COUNT(o.customer_id).
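Here is a rough, runnable sketch of the "filter on MIN(order_date) in HAVING" idea, using Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented, dates are stored as ISO text (an assumption of this sketch), and both bounds are placed in HAVING, which is a slight variation on the query above.

```python
import sqlite3

# Made-up data: customers 2 and 4 placed their first order in 2018.
SAMPLE_ORDERS = [
    (1, 1, "2017-06-01"),
    (2, 1, "2018-02-01"),
    (3, 2, "2018-03-15"),
    (4, 2, "2018-09-01"),
    (5, 3, "2019-01-05"),
    (6, 4, "2018-12-31"),
]

def count_new_customers_2018(rows):
    """rows: (order_id, customer_id, order_date) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Orders (order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER, order_date TEXT)"
    )
    con.executemany("INSERT INTO Orders VALUES (?, ?, ?)", rows)
    # Aggregates cannot go in WHERE, so the first-order test lives in HAVING;
    # the outer COUNT(*) counts one row per qualifying customer.
    count = con.execute("""
        SELECT COUNT(*)
          FROM ( SELECT customer_id
                   FROM Orders
                  GROUP BY customer_id
                 HAVING MIN(order_date) > '2017-12-31'
                    AND MIN(order_date) < '2019-01-01' ) AS first_2018
    """).fetchone()[0]
    con.close()
    return count
```

Customer 1 ordered in 2018 too, but is not counted because their first order was in 2017.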

mysql: get two values from subquery

I am trying to use a subquery to retrieve the oldest order for each customer. I want to select email_address, order_id, and order_date
Tables:
customers(customer_id, email_address)
orders(order_id, order_date, customer_id)
What I've tried:
I can get either the order_id or the order_date by doing
SELECT email_address,
(SELECT order_date /* or order_id */
FROM orders o
WHERE o.customer_id = c.customer_id
ORDER BY order_date LIMIT 1)
FROM customers c
GROUP BY email_address;
but if I try to do SELECT order_id, order_date in my subquery, I get the error:
Operand should contain 1 column(s)
You can solve this with a JOIN, but you need to be careful to only JOIN to the oldest values for a given customer:
SELECT c.email_address, o.order_id, o.order_date
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id AND
o.order_date = (SELECT MIN(order_date) FROM orders o2 WHERE o2.customer_id = c.customer_id)
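A quick way to check the JOIN above on toy data, sketched with Python's sqlite3 module standing in for MySQL. The rows are made up, and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

def oldest_order_per_customer():
    """Return (email_address, order_id, order_date) for each customer's oldest order."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email_address TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             order_date TEXT,
                             customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com');
        INSERT INTO orders VALUES (10, '2024-02-01', 1),
                                  (11, '2024-01-15', 1),
                                  (12, '2024-03-01', 2);
    """)
    # Both columns come from the joined orders row, so no scalar subquery
    # has to return more than one column.
    rows = con.execute("""
        SELECT c.email_address, o.order_id, o.order_date
          FROM customers c
          JOIN orders o
            ON o.customer_id = c.customer_id
           AND o.order_date = ( SELECT MIN(o2.order_date)
                                  FROM orders o2
                                 WHERE o2.customer_id = c.customer_id )
         ORDER BY c.email_address
    """).fetchall()
    con.close()
    return rows
```

If a customer has two orders on the same oldest date, this returns both of them.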
You could use a JOIN to get the result you want, or modify your query as below:
SELECT email_address,
(SELECT order_date
FROM orders o1
WHERE o1.customer_id = c.customer_id
ORDER BY order_date LIMIT 1) as `order_date`,
(SELECT order_id
FROM orders o2
WHERE o2.customer_id = c.customer_id
ORDER BY order_date LIMIT 1) as `order_id`
FROM customers c
GROUP BY email_address;
Which of the two to use is up to you.
How can I select multiple columns from a subquery (in SQL Server) that should have one record (select top 1) for each record in the main query?
SELECT o.order_id, c.email_address, o.order_date
FROM customers c
INNER JOIN (
SELECT customer_id, MIN(order_date) AS order_date
FROM orders
GROUP BY customer_id
) AS m ON m.customer_id = c.customer_id
INNER JOIN orders o ON o.customer_id = m.customer_id
AND o.order_date = m.order_date;

inner join three tables results in multiplied values

I'm trying to (let's say) gather a report on customers.
In that report I want to include sum of orders and ticket number for each client.
Tables:
Customer(id, name)
Order(id, customer_id, amount)
support_ticket(id, customer_id)
query:
select
c.id as 'Customer',
count(distinct t.id) as "Ticket count",
count(distinct o.id) as "Order count",
sum(o.amount) as 'Order Amount'
from customer as c
inner join `order` as o on c.id = o.customer_id
inner join support_ticket as t on c.id = t.customer_id
group by c.id
Since I join both tables on customer.id, I get all the rows "duplicated": I get every possible combination, so if the client has multiple tickets, the sum(o.amount) will be multiplied because of the "duplicated rows".
sqlFiddle (mysql): http://sqlfiddle.com/#!9/ba39ba/13
sqlFiddle (pg): http://sqlfiddle.com/#!17/bc32e/7
It seems like a simple case but I've been looking at it too much I think, I can't find the proper way to do that report.
What am I doing wrong?
Your best bet is to rewrite the aggregation of the Order table as a derived table, e.g.:
select
c.id as 'Customer',
count(distinct t.id) as "Ticket count",
o.amount as 'Order Amount' ,
o.`Order count`
from customer as c
inner join
(SELECT
o.customer_id,
sum(o.amount) as amount ,
count(distinct o.id) as `Order count`
from `order` as o
group by o.customer_id)
as o on c.id = o.customer_id
inner join support_ticket as t on c.id = t.customer_id
group by
c.id ,
o.amount ,
o.`Order count`
Note that the Derived Table Columns then are added to the group by clause at the bottom.
Cheers!
Just calculate order values in a sub-query and join it.
SELECT
c.id as 'Customer'
,count(DISTINCT st.id) as 'Ticket Count'
,o.`Order Count`
,o.amount as `Order Amount`
FROM customer c
INNER JOIN support_ticket st
on c.id = st.customer_id
INNER JOIN (
SELECT
customer_id
,SUM(amount) as 'amount'
,count(distinct id) as 'Order Count'
FROM `order`
group by customer_id
) o
on c.id = o.customer_id
GROUP BY c.id;
select c.id as 'Customer'
,t2.count_ticket as "Ticket count"
,t1.count_order as "Order count"
,t1.amount as 'Order Amount'
from customer as c
inner join (select customer_id
,count(id) as count_order
,sum(amount) as amount
from `order` group by customer_id) t1
on c.id = t1.customer_id
inner join (select customer_id
,count(id) as count_ticket
from support_ticket group by customer_id) t2
on c.id = t2.customer_id
In cases like yours, when I think the solution to my problem should be fairly simple but I can't wrap my head around it, I tend to use a WITH clause.
Not because it's better, but because it helps me understand my code by splitting up the complexity. First I create a relatively simple temp that solves the first part of my problem.
WITH temp AS (
SELECT
c.id AS "customer",
COUNT(DISTINCT o.id) AS "order_count",
SUM(o.amount) AS "order_amount"
FROM customer AS c
INNER JOIN "order" AS o on c.id = o.customer_id
GROUP BY c.id
)
Then I simply select the first half of my solution from temp, carrying along all the intermediate results, and solve the second part of my initial SQL.
SELECT
temp.customer,
COUNT(DISTINCT t.id) as "ticket_count",
temp.order_count,
temp.order_amount
FROM temp
INNER JOIN support_ticket as t on temp.customer = t.customer_id
GROUP BY temp.customer, temp.order_count, temp.order_amount
The principle is the same as in all the previous answers, but the SELECTs are separated, so I can check each one quickly and move on once I'm happy with that part of the solution.
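To see the multiplication and the fix side by side, here is a small sketch run against SQLite via Python's sqlite3 module, as a stand-in for MySQL. The data is made up (one customer, two orders of 10 and 20, three tickets), and "order" is double-quoted because that is how SQLite escapes the reserved word.

```python
import sqlite3

def order_report():
    """Return (naive_rows, pre_aggregated_rows) for the same toy data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE "order" (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
        CREATE TABLE support_ticket (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customer VALUES (1, 'Ann');
        INSERT INTO "order" VALUES (1, 1, 10), (2, 1, 20);
        INSERT INTO support_ticket VALUES (1, 1), (2, 1), (3, 1);
    """)
    # Naive double join: 2 orders x 3 tickets = 6 rows, so SUM is tripled.
    naive = con.execute("""
        SELECT c.id, COUNT(DISTINCT t.id), COUNT(DISTINCT o.id), SUM(o.amount)
          FROM customer c
          JOIN "order" o ON c.id = o.customer_id
          JOIN support_ticket t ON c.id = t.customer_id
         GROUP BY c.id
    """).fetchall()
    # Pre-aggregate each child table to one row per customer, then join.
    pre_aggregated = con.execute("""
        SELECT c.id, t.ticket_count, o.order_count, o.amount
          FROM customer c
          JOIN ( SELECT customer_id, COUNT(id) AS order_count, SUM(amount) AS amount
                   FROM "order"
                  GROUP BY customer_id ) o ON c.id = o.customer_id
          JOIN ( SELECT customer_id, COUNT(id) AS ticket_count
                   FROM support_ticket
                  GROUP BY customer_id ) t ON c.id = t.customer_id
    """).fetchall()
    con.close()
    return naive, pre_aggregated
```

COUNT(DISTINCT ...) hides the duplication for the counts, which is why only the sum looks wrong in the naive version.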

Mysql sub select with SUM() and group by

I am trying to create a query that does a calculation in a subquery that requires the SUM function and a group by. My query returns the error "Subquery returns more than 1 row". Essentially I am trying to return the amount "Due" for each order. If the order total is greater than the sum of total_collected (for that order_id) from the payments table there will be amount due. Here is the query:
SELECT o.order_id
, o.server
, o.subtotal
, o.discount
, o.tax, o.total
, (SELECT (o.total - SUM(p.total_collected))
from orders o
join payments p
on o.order_id = p.order_id
group by p.order_id) as 'Due'
FROM orders o
join payments p
on o.order_id = p.order_id
WHERE...;
I cannot include 'p.order_id' in the sub select because it should only contain one column. I understand why I am getting the error, I just don't know how to get the sub select to only perform the SUM on a per order_id basis.
Without changing the structure much, I think the subquery is looking at all of the data in the orders/payments tables. I think you need to filter it down to look only at the relevant order_id like so.
(I also added a SUM around the order total because I am pretty sure that would give a different error without it.)
SELECT o.order_id
, o.server
, o.subtotal
, o.discount
, o.tax
, o.total
, (SELECT (SUM(o2.total) - SUM(p.total_collected))
from orders o2
JOIN payments p
ON o2.order_id = p.order_id
WHERE o2.order_id = o.order_id) as 'Due'
FROM orders o
WHERE...;
Although, if you adjust this so that it uses a join instead of a subquery, I think you will get better performance. Something like this:
SELECT o.order_id
, o.server
, o.subtotal
, o.discount
, o.tax
, o.total
, o.total - c.Collected AS 'Due'
FROM orders o
JOIN (
SELECT p2.order_id, SUM(p2.total_collected) AS 'Collected'
FROM payments p2
GROUP BY p2.order_id) AS c
ON o.order_id = c.order_id
WHERE...;
You do not need a subquery:
SELECT
o.order_id,
o.server,
o.subtotal,
o.discount,
o.tax,
o.total,
o.total - ifnull(sum(p.total_collected),0) As Due
FROM orders AS o
LEFT JOIN payments AS p ON o.order_id = p.order_id
WHERE ...
GROUP BY o.order_id
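A small sketch of the LEFT JOIN plus IFNULL version, run with Python's sqlite3 module as a stand-in for MySQL (IFNULL exists in both). The tables are trimmed to the columns that matter and the amounts are invented whole numbers.

```python
import sqlite3

def amount_due_per_order():
    """Return (order_id, due) rows, including orders without any payment."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total INTEGER);
        CREATE TABLE payments (payment_id INTEGER PRIMARY KEY,
                               order_id INTEGER,
                               total_collected INTEGER);
        INSERT INTO orders VALUES (1, 100), (2, 50), (3, 30);
        INSERT INTO payments VALUES (1, 1, 40), (2, 1, 25), (3, 3, 30);
    """)
    # LEFT JOIN keeps order 2, which has no payments; IFNULL turns its
    # NULL sum into 0 so the subtraction still yields a number.
    rows = con.execute("""
        SELECT o.order_id,
               o.total - IFNULL(SUM(p.total_collected), 0) AS due
          FROM orders o
          LEFT JOIN payments p ON o.order_id = p.order_id
         GROUP BY o.order_id
         ORDER BY o.order_id
    """).fetchall()
    con.close()
    return rows
```

With an inner JOIN instead, order 2 would disappear from the result rather than show its full total as due.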

Can I execute a COUNT() before GROUP BY

I am working on a MySQL assignment for school and I am stuck on a question. I am still new to MySQL. COUNT(o.order_id) is not working the way I want. I want it to count the number of orders, but it is counting all items, i.e. Customer 1 has 2 orders but it returns 3 because one order has two items. I have three tables: one with customers, another with orders, and another with each item on each order. I've posted my query below. Any help would be great.
SELECT email_address, COUNT(o.order_id) AS num_of_orders,
SUM(((item_price - discount_amount) * quantity)) AS total
FROM customers c JOIN orders o
ON c.customer_id = o.customer_id
JOIN order_items ot
ON o.order_id = ot.order_id
GROUP BY o.customer_id
HAVING num_of_orders > 1
ORDER BY total DESC;
It's as simple as using the DISTINCT keyword:
SELECT email_address, COUNT(distinct o.order_id) AS num_of_orders
Looks like you want to count the DISTINCT number of orders. Add a DISTINCT into the COUNT. Although MySQL allows you to use the SELECT expression in the HAVING clause, it's not good practice to do so.
SELECT email_address, COUNT(DISTINCT o.order_id) AS num_of_orders,
SUM(((item_price - discount_amount) * quantity)) AS total
FROM customers c JOIN orders o
ON c.customer_id = o.customer_id
JOIN order_items ot
ON o.order_id = ot.order_id
GROUP BY o.customer_id
HAVING COUNT(DISTINCT o.order_id) > 1
ORDER BY total DESC;
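Here is a toy reproduction of the difference, sketched with Python's sqlite3 module standing in for MySQL. The data is invented: customer 1 has two orders, one of which has two items; customer 2 has a single order and is filtered out by the HAVING clause.

```python
import sqlite3

def orders_and_totals():
    """Return (email, plain_count, distinct_count, total) for repeat customers."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email_address TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
        CREATE TABLE order_items (item_id INTEGER PRIMARY KEY,
                                  order_id INTEGER,
                                  item_price INTEGER,
                                  discount_amount INTEGER,
                                  quantity INTEGER);
        INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com');
        INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
        INSERT INTO order_items VALUES (1, 1, 10, 0, 1),
                                       (2, 1, 5, 1, 2),
                                       (3, 2, 20, 0, 1),
                                       (4, 3, 7, 0, 1);
    """)
    # COUNT(o.order_id) counts one per joined item row;
    # COUNT(DISTINCT o.order_id) counts each order once.
    rows = con.execute("""
        SELECT email_address,
               COUNT(o.order_id) AS plain_count,
               COUNT(DISTINCT o.order_id) AS num_of_orders,
               SUM((item_price - discount_amount) * quantity) AS total
          FROM customers c
          JOIN orders o ON c.customer_id = o.customer_id
          JOIN order_items ot ON o.order_id = ot.order_id
         GROUP BY o.customer_id
        HAVING COUNT(DISTINCT o.order_id) > 1
         ORDER BY total DESC
    """).fetchall()
    con.close()
    return rows
```

The total is unaffected by DISTINCT because each item row should be summed exactly once anyway.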
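Just take out the direct join to items. All it is doing is duplicating order rows when there are multiple items. Since the total still needs the item columns, sum the items per order first and join that instead.
SELECT email_address, COUNT(o.order_id) AS num_of_orders,
SUM(ot.order_total) AS total
FROM customers c JOIN orders o
ON c.customer_id = o.customer_id
JOIN (SELECT order_id,
SUM((item_price - discount_amount) * quantity) AS order_total
FROM order_items
GROUP BY order_id) ot
ON o.order_id = ot.order_id
GROUP BY o.customer_id
HAVING COUNT(o.order_id) > 1
ORDER BY total DESC;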