In MySQL, I need to find the customer names with the highest overall revenue. I used the following query:
SELECT name
FROM
(SELECT name, SUM(sale_price) as revenue
FROM customer
INNER JOIN order USING(customerID)
INNER JOIN orderdetails USING(ordernumber)
GROUP BY name) as revenues
WHERE revenues.revenue = (SELECT MAX(revenue)
FROM revenues);
I now understand that I cannot reference the derived table's alias inside the subquery in my WHERE clause. But how can I keep only the names with the max value without more or less writing the same query again as a filter?
Thank you!
I'm using version 8.0.20
WITH
revenues AS (SELECT name,
DENSE_RANK() OVER (ORDER BY SUM(sale_price) DESC) dr
FROM customer
INNER JOIN `order` USING (customerID)
INNER JOIN orderdetails USING (ordernumber)
GROUP BY name)
SELECT name
FROM revenues
WHERE dr = 1
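To see how DENSE_RANK() handles ties for the top spot, here is a small self-contained sketch for MySQL 8.0. The demo_sales table and its rows are made up for illustration and stand in for the joined customer/order/orderdetails result.

```sql
-- Hypothetical sample data (not from the question).
CREATE TABLE demo_sales (name VARCHAR(50), sale_price DECIMAL(10,2));
INSERT INTO demo_sales VALUES
  ('Alice', 100.00), ('Alice', 50.00),
  ('Bob', 150.00),
  ('Carol', 20.00);

WITH revenues AS (
  SELECT name,
         SUM(sale_price) AS revenue,
         -- Rank the per-customer totals; equal totals share the same rank.
         DENSE_RANK() OVER (ORDER BY SUM(sale_price) DESC) AS dr
  FROM demo_sales
  GROUP BY name
)
SELECT name, revenue
FROM revenues
WHERE dr = 1;
-- Alice and Bob both total 150.00, so both rows come back.
```

Because the rank is computed inside the CTE, the aggregation is written only once.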
I have two tables Orders and RMA. I wrote this command to return an inner join between the two tables. OrderID is the primary key of Orders and foreign key of RMA.
SELECT Orders.SKU, COUNT(*) AS Frequency
FROM Orders
INNER JOIN RMA ON Orders.OrderID = RMA.OrderID
GROUP BY Orders.SKU
ORDER BY COUNT(*) DESC;
This select statement returns a table with one column containing SKU values and one column containing the number of times each SKU value appears in the data. My goal is to create a third column that includes a percent that represents the frequency of each SKU value.
(disclaimer: I'm new to mysql, so if there's more information needed for this question, I am happy to provide it. Thanks!)
Divide COUNT(*) by the total number of rows in RMA (multiply the result by 100 if you want a percentage rather than a fraction):
SELECT Orders.SKU,
COUNT(*) AS Frequency,
COUNT(*) / (SELECT COUNT(*) FROM RMA) AS percent
FROM Orders INNER JOIN RMA
ON Orders.OrderID = RMA.OrderID
GROUP BY Orders.SKU
ORDER BY Frequency DESC;
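On MySQL 8.0 or later, a window function over the grouped counts is one possible alternative to the scalar subquery. This is only a sketch: the demo_orders and demo_rma tables and their rows are hypothetical. Note that the denominator here is the number of joined rows, which matches the RMA row count only when every RMA row has a matching order.

```sql
-- Hypothetical sample data (not from the question).
CREATE TABLE demo_orders (OrderID INT PRIMARY KEY, SKU VARCHAR(20));
CREATE TABLE demo_rma (RmaID INT PRIMARY KEY, OrderID INT);
INSERT INTO demo_orders VALUES (1, 'A'), (2, 'A'), (3, 'B'), (4, 'C');
INSERT INTO demo_rma VALUES (1, 1), (2, 2), (3, 2), (4, 3);

SELECT o.SKU,
       COUNT(*) AS Frequency,
       -- SUM(COUNT(*)) OVER () adds up the per-SKU counts across all groups.
       100 * COUNT(*) / SUM(COUNT(*)) OVER () AS percent
FROM demo_orders AS o
INNER JOIN demo_rma AS r ON o.OrderID = r.OrderID
GROUP BY o.SKU
ORDER BY Frequency DESC;
-- SKU A has 3 of the 4 returns (75 percent), SKU B has 1 (25 percent).
```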
I have two tables, a customers and orders table.
The customers table contains a unique ID for each customer. It contains 1141 entries.
The orders table contains many entries with a customerID and a date.
I am trying to query my database and return a list of customers and the max(date) from the orders list.
SELECT *
FROM customers
INNER JOIN
(
SELECT CustomerID, max(date) as date
FROM orders
GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.date = Sub1.Date
However, this query returns 1726 rows instead of 1141. Where are the extra rows coming from?
I think it's because the ORDERS table contains the same customerID multiple times, so when you join it with CUSTOMERS, each CUSTOMER.id can match multiple rows of ORDERS.
The problem is that there are ties.
Some customers place more than one order per day, so occasionally a customer has more than one order on the date that is their max date.
To fix this, you need to use MAX() on some column that is always unique in the Orders table (or at least unique within a given date). This is easy if you can depend on an auto-increment primary key in the Orders table:
SELECT *
FROM customers
INNER JOIN
(
SELECT CustomerID, max(orderid) as orderid
FROM orders
GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.orderid = Sub1.orderid
This assumes that orderid increases in lock-step with increasing dates. That is, you'll never have an order with a greater auto-inc id but an earlier date. That might happen if you allow data to be entered out of chronological order, e.g. back-dating orders.
;with cte as
(
select CustomerID, date
, row_number() over (partition by CustomerID order by date desc) as rn
from orders
)
select c.*, cte.date
from customers c
join cte on cte.CustomerID = c.id
where rn = 1 -- keeps only the latest order date per customer
I am getting a little tripped up with a SQL query. Here is some background.
Schema:
Product(pid, price, color),
Order(cid, pid, quantity),
Customer(cid, name, age)
I want to get the pid of the most ordered product (greatest quantity).
I have managed to determine the max value with:
Select Max(total)
From (Select Sum(quantity) as total
From Orders Group By pid) as Totals
but I am getting stuck trying to match which products are in this subquery. Here is what I have tried:
Select pid, SUM(quantity) as q
From Orders
Where q in (
Select Max(total)
From (Select Sum(quantity) as total
From Orders
Group By pid) as Totals
)
Group By pid
This says that q is an unknown column.
Any suggestions on how I could do this or do it better?
You can do a JOIN against a grouped subquery, for example:
select p.*
from product p
join
(select pid from `Order`
group by pid
order by sum(quantity) desc
limit 1
) tab on p.pid = tab.pid;
Your posted query fails with "q is an unknown column" because q is a column alias, and an alias defined in the SELECT list cannot be used in the WHERE clause.
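As a sketch of the alias point: WHERE is evaluated before grouping, so it cannot filter on an aggregate, while HAVING runs after grouping. MySQL additionally lets HAVING refer to a select-list alias; other databases may require repeating SUM(quantity) there. The demo_product_orders table and its rows are hypothetical.

```sql
-- Hypothetical sample data (not from the question).
CREATE TABLE demo_product_orders (cid INT, pid INT, quantity INT);
INSERT INTO demo_product_orders VALUES
  (1, 10, 5), (2, 10, 3), (1, 20, 8), (3, 30, 2);

SELECT pid, SUM(quantity) AS q
FROM demo_product_orders
GROUP BY pid
-- The alias q is visible here because HAVING is applied after grouping.
HAVING q = (SELECT MAX(total)
            FROM (SELECT SUM(quantity) AS total
                  FROM demo_product_orders
                  GROUP BY pid) AS totals);
-- pid 10 and pid 20 both total 8, so both are returned.
```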
You should be able to simply include the pid in the original query because you are grouping on it. Then ORDER BY the total in descending order and keep only the top result using LIMIT 1.
SELECT
pid
,Sum(quantity) as total
FROM
Orders
GROUP BY
pid
ORDER BY
Sum(quantity) DESC
LIMIT 1
Here's one way you can do it using a subquery with limit:
select o.pid, sum(o.quantity)
from `order` o
group by o.pid
having sum(o.quantity) =
(
select sum(quantity)
from `order`
group by pid
order by sum(quantity) desc
limit 1
)
If you want only one most ordered product, then Karl's answer is fine. If you want all that have the same quantity, then:
select pid, sum(quantity) as quantity
from orders o
group by pid
having sum(quantity) = (select max(quantity)
from (select sum(quantity) as quantity
from orders o
group by pid
) q
);
I have 2 DB tables that both share an Order Number column.
One table is "orders" and the Order Number is the unique key.
The second table is my "transactions" table, which has one row per transaction made for each order number. Because we take monthly payments, the "transactions" table has multiple rows per Order Number, each with a unique date.
How can I run a query which returns a list of unique OrderNumbers in one column, and the latest "TransDate" (Transaction Date) in the second column?
I tried the query below, but it's pulling back the first TransDate that exists for each ordernumber, not the latest one. I think I need a subquery of some sort:
select orders.ordernumber, transdate from orders
join transactions on transactions.ordernumber = orders.ordernumber
where status = 'booking'
group by ordernumber
order by orders.ordernumber, TransDate DESC
You should just use MAX() function along with grouping on order number. There also doesn't seem to be any reason to do a join here.
SELECT
ordernumber,
MAX(transdate) AS maxtransdate
FROM transactions
WHERE status = 'booking'
GROUP BY ordernumber
ORDER BY ordernumber ASC
Use aggregate functions, specifically max():
select o.ordernumber, max(transdate) as last_transdate
from orders as o
inner join transactions as t on o.ordernumber = t.ordernumber
-- where conditions go here
group by o.ordernumber
If you need to pull the details of the last transaction for each order, you can use the above query as a data source of another query and join it with the transactions table:
select a.*, t.*
from (
select o.ordernumber, max(transdate) as last_transdate
from orders as o
inner join transactions as t on o.ordernumber = t.ordernumber
-- where conditions go here
group by o.ordernumber
) as a
inner join transactions as t on a.ordernumber = t.ordernumber and a.last_transdate = t.transdate
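If two transactions for the same order can share a date, joining back on the max date returns both rows. One possible alternative, assuming MySQL 8.0 or another database with window functions, is ROW_NUMBER() with a tie-breaker. The demo_transactions table, its transid and amount columns, and the rows are hypothetical.

```sql
-- Hypothetical sample data (not from the question).
CREATE TABLE demo_transactions (
  transid INT PRIMARY KEY,
  ordernumber INT,
  transdate DATE,
  amount DECIMAL(10,2)
);
INSERT INTO demo_transactions VALUES
  (1, 100, '2020-01-05', 25.00),
  (2, 100, '2020-02-05', 25.00),
  (3, 200, '2020-01-20', 40.00);

SELECT ordernumber, transdate, amount
FROM (
  SELECT t.*,
         -- Number each order's transactions from newest to oldest;
         -- transid breaks ties when two rows share a date.
         ROW_NUMBER() OVER (PARTITION BY ordernumber
                            ORDER BY transdate DESC, transid DESC) AS rn
  FROM demo_transactions AS t
) AS ranked
WHERE rn = 1;
-- Returns the 2020-02-05 row for order 100 and the 2020-01-20 row for order 200.
```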
Change the order by line to
order by Transdate DESC, orders.ordernumber
Here's the full query with the change
select orders.ordernumber, transdate from orders
join transactions on transactions.ordernumber = orders.ordernumber
where status = 'booking'
group by ordernumber
order by Transdate DESC, orders.ordernumber
I have this situation. I have a table Orders that is related to OrderStatus.
OrderStatus
id | orderId | created
I need to retrieve each order with its latest status. I tried this query, but I don't know whether it performs well, and I'd like to know if there are better solutions.
select Orders.id, OrderStatus.status from Orders
inner join OrderStatus on OrderStatus.id =
(select top 1 id from OrderStatus where orderId = Orders.id order by created desc)
Correlated subquery is usually bad news (sometimes SQL Server can optimize it away, sometimes it acts like a really slow loop). Also not sure why you think you need DISTINCT when you're only taking the latest status, unless you don't have any primary keys...
;WITH x AS
(
SELECT o.id, os.status,
rn = ROW_NUMBER() OVER (PARTITION BY os.orderId ORDER BY created DESC)
FROM dbo.Orders AS o
INNER JOIN dbo.OrderStatus AS os
ON o.id = os.orderId
)
SELECT id, status
FROM x
WHERE rn = 1;
You can use the Row_Number function:
WITH CTE AS
(
SELECT Orders.id, OrderStatus.status,
RN = ROW_NUMBER() OVER (
PARTITION BY OrderStatus.OrderId
ORDER BY created DESC)
FROM Orders
INNER JOIN OrderStatus ON OrderStatus.OrderId = Orders.id
)
SELECT id, status
FROM CTE WHERE RN = 1
I've used a common table expression since it lets you filter on the row number directly, and it's also very readable.