get data based on MAX date and customer id - mysql

I have two tables: customers and contracts. The common key between them is customer_id. I need to link these two tables to represent if my fictitious business is on contract with a customer.
The customer -> contract table has a one to many relationship, so a customer can have an old contract on record. I want the latest. This is currently handled by contract_id which is auto-incremented.
My query is supposed to grab the contract data based on customer_id and the max contract_id for that customer_id.
My query currently looks like this:
SELECT * FROM(
SELECT co.*
FROM contracts co
LEFT JOIN customers c ON co.customer_id = c.customer_id
WHERE co.customer_id ='135') a
where a.contract_id = MAX(a.contract_id);
The answer is probably ridiculously obvious and I'm just not seeing it.

Since the most recent contract is the one with the highest contract_id, simply ORDER BY that column in descending order and keep the first row with LIMIT 1:
SELECT * FROM(
SELECT co.*
FROM contracts co
LEFT JOIN customers c ON co.customer_id = c.customer_id
WHERE co.customer_id ='135') a
ORDER BY a.contract_id DESC
LIMIT 1

You can use NOT EXISTS():
SELECT * FROM contracts co
LEFT JOIN customers c
ON (c.customer_id = co.customer_id)
WHERE co.customer_id = '135'
AND NOT EXISTS (SELECT 1 FROM contracts co2
WHERE co2.customer_id = co.customer_id
AND co2.contract_id > co.contract_id)
This makes sure you get only the latest contract, and it works for all customers: drop the co.customer_id = '135' condition and you get the latest contract for every customer.
In general, you can't use an aggregate function in the WHERE clause, because WHERE filters individual rows before any grouping happens. Aggregates belong in the SELECT list or in HAVING, which is usually combined with a GROUP BY clause.
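To make the point about aggregates concrete, here is a minimal sketch of the usual workaround: compute MAX(contract_id) per customer in a grouped derived table, then join back to fetch the full row. It uses SQLite through Python's sqlite3 module only as a runnable stand-in for MySQL; the name and terms columns and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only customer_id and contract_id come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contracts (
        contract_id INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id INTEGER REFERENCES customers(customer_id),
        terms TEXT
    );
    INSERT INTO customers VALUES (135, 'Acme'), (136, 'Globex');
    INSERT INTO contracts (customer_id, terms) VALUES
        (135, 'old'), (136, 'only'), (135, 'new');
""")

# The aggregate lives in a grouped derived table, not in WHERE.
max_join_sql = """
    SELECT co.customer_id, co.contract_id, co.terms
    FROM contracts co
    JOIN (
        SELECT customer_id, MAX(contract_id) AS max_id
        FROM contracts
        GROUP BY customer_id
    ) m ON m.customer_id = co.customer_id AND m.max_id = co.contract_id
    ORDER BY co.customer_id
"""

# Equivalent formulation: keep a contract only if no newer one exists.
not_exists_sql = """
    SELECT co.customer_id, co.contract_id, co.terms
    FROM contracts co
    WHERE NOT EXISTS (
        SELECT 1 FROM contracts co2
        WHERE co2.customer_id = co.customer_id
          AND co2.contract_id > co.contract_id
    )
    ORDER BY co.customer_id
"""

latest = conn.execute(max_join_sql).fetchall()
latest_ne = conn.execute(not_exists_sql).fetchall()
print(latest)     # [(135, 3, 'new'), (136, 2, 'only')]
print(latest_ne)  # same rows
```

Both forms return one row per customer, so adding WHERE co.customer_id = 135 to either narrows it to the single customer from the question.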

Related

subquery shows more than one row group by

I am trying to get the data for the best 5 customers in a railway reservation system. To get that, I tried getting the max value by summing up their fare every time they make a reservation. Here is the code.
SELECT c. firstName, c.lastName,MAX(r.totalFare) as Fare
FROM customer c, Reservation r, books b
WHERE r.resID = b.resID
AND c.username = b.username
AND r.totalfare < (SELECT sum(r1.totalfare) Revenue
from Reservation r1, for_res f1, customer c1,books b1
where r1.resID = f1.resID
and c1.username = b1.username
and r1.resID = b1.resID
group by c1.username
)
GROUP BY c.firstName, c.lastName, r.totalfare
ORDER BY r.totalfare desc
LIMIT 5;
this throws the error:[21000][1242] Subquery returns more than 1 row
If I remove the group by from the subquery the result is:(its a tabular form)
Jade,Smith,1450
Jade,Smith,725
Jade,Smith,25.5
Monica,Geller,20.1
Rach,Jones,10.53
But that's not what I want, as you can see, I want to add the name 'Jade' with the total fare.
I don't see the point of the subquery. It seems you can get the result you want with a plain sum():
select c.firstname, c.lastname, sum(totalfare) as totalfare
from customer c
inner join books b on b.username = c.username
inner join reservation r on r.resid = b.resid
group by c.username
order by totalfare desc
limit 5
This sums all reservations of each customer and uses that total to sort the result set. Grouping by the customer guarantees one row per customer.
The query assumes that username is the primary key of table customer. If that's not the case, you need to add the columns firstname and lastname to the group by clause.
Note that this uses standard joins (with the inner join ... on keywords) rather than old-school, implicit joins (with commas in the from clause): these are legacy syntax that should not be used in new code.
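As a rough, runnable check of the approach above, the sketch below loads the fares from the question into an in-memory SQLite database (a stand-in for MySQL) and runs the same kind of grouped query. The usernames and resID values are invented for illustration, and firstName and lastName are listed in the GROUP BY so the query does not depend on username being the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (username TEXT PRIMARY KEY, firstName TEXT, lastName TEXT);
    CREATE TABLE reservation (resID INTEGER PRIMARY KEY, totalFare REAL);
    CREATE TABLE books (username TEXT REFERENCES customer(username),
                        resID INTEGER REFERENCES reservation(resID));
""")
# Hypothetical usernames and reservation ids; fares are the ones in the question.
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    ("jade", "Jade", "Smith"),
    ("monica", "Monica", "Geller"),
    ("rach", "Rach", "Jones"),
])
conn.executemany("INSERT INTO reservation VALUES (?, ?)", [
    (1, 1450.0), (2, 725.0), (3, 25.5), (4, 20.1), (5, 10.53),
])
conn.executemany("INSERT INTO books VALUES (?, ?)", [
    ("jade", 1), ("jade", 2), ("jade", 3), ("monica", 4), ("rach", 5),
])

# One row per customer, sorted by the summed fare, top five only.
top = conn.execute("""
    SELECT c.firstName, c.lastName, SUM(r.totalFare) AS total_fare
    FROM customer c
    INNER JOIN books b ON b.username = c.username
    INNER JOIN reservation r ON r.resID = b.resID
    GROUP BY c.username, c.firstName, c.lastName
    ORDER BY total_fare DESC
    LIMIT 5
""").fetchall()

for row in top:
    print(row)
```

Jade's three reservations collapse into a single row with the combined fare, which is the result the question was after.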

Ordering rows in a parent table by SUM of column in a child table

I have 3 tables in my database
companies{
id,
name,
address
}
stores{
id,
name,
address,
company_id
}
invoices{
id,
total,
date_time,
store_id
}
As you can see, each store is connected to a company via a foreign key, and each invoice is connected to a store.
My question is: how can I write a SQL query that returns all stores of a company, ordered by their turnover?
If I use the query:
SELECT s.*,
sum(i.total) as turnover FROM store s
JOIN invoices i
ON i.store_id = s.id
WHERE YEAR(i.date_time) = 2019;
I can see the turnover for one store for the year 2019, for example, but I'm struggling to find a way to get a list of stores ordered by their turnover for a certain period.
You're going to need to join all 3 tables:
SELECT *
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
That's your entire raw data as a detailed list. Then you say you want it for a certain company only:
SELECT *
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
Then you only want the stores and the invoice amounts:
SELECT s.name, i.total
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
Then you want a row set where each line is a single store and the sum of all invoices for that store:
SELECT s.name, SUM(i.total)
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
GROUP BY s.name
Lastly you want them in descending order, highest total first:
SELECT s.name as storename, SUM(i.total) as turnover
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
GROUP BY s.name
ORDER BY turnover DESC
The order of evaluation in SQL is FROM (with joins), WHERE, GROUP BY, SELECT, ORDER BY, which is why I use different names in, e.g., the ORDER BY than I do in the GROUP BY. Conceptually, each step only sees the names output by the step immediately before it. MySQL isn't too picky about this, but some databases are: you couldn't say GROUP BY storename in SQL Server, because the SELECT that creates the storename alias hasn't run yet when the GROUP BY is done.
Note: I wasn't really sure what you were looking for in the WHERE. You started by saying "all stores turnover for a certain company" and finished by saying you were "struggling to get turnover for a period".
If you want a period, use e.g. WHERE somedatecolumn BETWEEN '2000-01-01' AND '2000-12-31' (BETWEEN is inclusive) or WHERE somedatecolumn >= '2000-01-01' AND somedatecolumn < '2001-01-01' (a good pattern to use if the column includes a time too). It is almost never wise to call a function on a column you're searching on, i.e. do not do WHERE YEAR(somedatecolumn) = 2000, because it generally prevents the database from using an index on that column and can make the search very slow.
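The difference between the two period filters is easy to miss, so here is a small sketch that runs both against the same data. It uses SQLite via Python's sqlite3 as a stand-in for MySQL (SQLite compares the ISO-formatted timestamps as text, which gives the same ordering). The company filter, store names and invoice rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE stores (id INTEGER PRIMARY KEY, name TEXT, address TEXT,
                         company_id INTEGER REFERENCES companies(id));
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, total INTEGER,
                           date_time TEXT, store_id INTEGER REFERENCES stores(id));
    INSERT INTO companies VALUES (1, 'Acme Rubber Co', NULL), (2, 'Other Co', NULL);
    INSERT INTO stores VALUES (1, 'A', NULL, 1), (2, 'B', NULL, 1), (3, 'C', NULL, 2);
    INSERT INTO invoices (total, date_time, store_id) VALUES
        (100,  '2019-03-01 10:00:00', 1),
        (50,   '2019-12-31 23:30:00', 1),  -- late on the last day of 2019
        (999,  '2020-01-01 00:00:00', 1),  -- first instant of 2020
        (400,  '2019-06-15 09:00:00', 2),
        (1000, '2019-06-15 09:00:00', 3);  -- belongs to another company
""")

# {period} is swapped for one of the two date filters discussed above.
QUERY = """
    SELECT s.name AS storename, SUM(i.total) AS turnover
    FROM companies c
    INNER JOIN stores s ON s.company_id = c.id
    INNER JOIN invoices i ON i.store_id = s.id
    WHERE c.name = ? AND {period}
    GROUP BY s.id, s.name
    ORDER BY turnover DESC
"""

half_open = conn.execute(
    QUERY.format(period="i.date_time >= ? AND i.date_time < ?"),
    ("Acme Rubber Co", "2019-01-01", "2020-01-01"),
).fetchall()

between = conn.execute(
    QUERY.format(period="i.date_time BETWEEN ? AND ?"),
    ("Acme Rubber Co", "2019-01-01", "2019-12-31"),
).fetchall()

print(half_open)  # [('B', 400), ('A', 150)]
print(between)    # [('B', 400), ('A', 100)]  -- the 23:30 invoice is missed
```

The BETWEEN version silently drops the invoice from the evening of 31 December, because the upper bound is effectively midnight at the start of that day; the half-open range keeps it and still excludes 1 January 2020.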

SQL retrieving filtered value in subquery

In this query, cust_id is a foreign key, and ords returns the number of orders for every customer:
SELECT cust_name, (
SELECT COUNT(*)
FROM Orders
WHERE Orders.cust_id = Customers.cust_id
) AS ords
FROM Customers
The output is correct, but I want to filter it to retrieve only the customers with fewer than a given number of orders, and I don't know how to filter on the subquery ords. I tried WHERE ords < 2 at the end of the query, and I tried adding AND COUNT(*) < 2 after the cust_id comparison, but neither works. I am using MySQL.
Use the HAVING clause (and use a join instead of a subquery):
SELECT Customers.cust_id, Customers.cust_name, COUNT(*) ords
FROM Orders, Customers
WHERE Orders.cust_id = Customers.cust_id
GROUP BY 1,2
HAVING COUNT(*)<2
If you want to include people with zero orders, change the join to an outer join and count a column from Orders (which is NULL for unmatched customers) instead of COUNT(*).
There is no need for a correlated subquery here: it is evaluated once per customer row, which often performs poorly. A better approach is a regular query with a join, GROUP BY, and a HAVING clause that applies your condition to the groups.
Since your condition is to return only customers that have fewer than 2 orders, a left join is more appropriate than an inner join, because it also returns customers that have no orders at all. Note that you have to count a column from orders rather than use count(*): count(*) counts the single NULL-padded row that the left join produces for such a customer, so it would report 1 instead of 0.
select
c.cust_name, count(o.cust_id) as ords
from
customers c
left join orders o on c.cust_id = o.cust_id
group by c.cust_id, c.cust_name
having count(o.cust_id) < 2
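The zero-order case is where left join queries like this usually go wrong, so the following sketch runs the query twice on the same data, once counting a column from Orders and once with COUNT(*). SQLite through Python's sqlite3 stands in for MySQL; the order_id column and the three sample customers are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    -- order_id is a hypothetical key column; the question only shows cust_id.
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY,
                         cust_id INTEGER REFERENCES Customers(cust_id));
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders (cust_id) VALUES (1), (1), (2);  -- Ann: 2, Bob: 1, Cy: 0
""")

# {counter} is replaced by the counting expression under test.
TEMPLATE = """
    SELECT c.cust_name, {counter} AS ords
    FROM Customers c
    LEFT JOIN Orders o ON o.cust_id = c.cust_id
    GROUP BY c.cust_id, c.cust_name
    HAVING {counter} < 2
    ORDER BY c.cust_name
"""

# COUNT(column) skips NULLs, so a customer without orders counts as 0.
by_column = conn.execute(TEMPLATE.format(counter="COUNT(o.order_id)")).fetchall()
# COUNT(*) counts the NULL-padded row the left join creates for that customer.
by_star = conn.execute(TEMPLATE.format(counter="COUNT(*)")).fetchall()

print(by_column)  # [('Bob', 1), ('Cy', 0)]
print(by_star)    # [('Bob', 1), ('Cy', 1)]
```

Both versions return the right customers for a threshold of 2, but only the column count reports the correct number, and a threshold of 1 would make COUNT(*) drop the customer with no orders entirely.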

Getting many shipping addresses for one order number

I'm using 4 tables
CUSTOMER
CUSTOMER_ORDER
CUST_ORDER_LINE
CUST_ADDRESS
I used inner joins to link the tables. CUSTOMER is linked to CUSTOMER_ORDER and CUST_ADDRESS by customer_ID, and CUST_ORDER_LINE is linked to CUSTOMER_ORDER by order_ID. Order_ID does not appear in the CUSTOMER or CUST_ADDRESS tables.
When I run the query below, I get every shipping address on record for that particular customer and order number.
For example, a distributor has 25 possible shipping addresses, but they only ship one order to one shipping address at a time. My query is bringing back one order number 25 times for every address. Any advice would be wonderful. Thank you.
SELECT DISTINCT TOP (100) PERCENT dbo.CUSTOMER_ORDER.ID,
dbo.CUSTOMER.NAME,
dbo.CUST_ORDER_LINE.PART_ID,
dbo.CUST_ORDER_LINE.ORDER_QTY,
dbo.CUSTOMER_ORDER.STATUS,
dbo.CUSTOMER_ORDER.SHIPTO_ID,
dbo.CUST_ADDRESS.NAME AS Expr1
FROM dbo.CUSTOMER
INNER JOIN dbo.CUSTOMER_ORDER
ON dbo.CUSTOMER.ID = dbo.CUSTOMER_ORDER.CUSTOMER_ID
INNER JOIN dbo.CUST_ORDER_LINE
ON dbo.CUSTOMER_ORDER.ID = dbo.CUST_ORDER_LINE.CUST_ORDER_ID
INNER JOIN dbo.CUST_ADDRESS
ON dbo.CUSTOMER.ID = dbo.CUST_ADDRESS.CUSTOMER_ID
WHERE (dbo.CUSTOMER_ORDER.ORDER_DATE > '1/1/2014')
AND (dbo.CUSTOMER_ORDER.ID NOT LIKE 'RMA%')
GROUP BY dbo.CUSTOMER_ORDER.ID,
dbo.CUSTOMER.NAME,
dbo.CUST_ORDER_LINE.PART_ID,
dbo.CUST_ORDER_LINE.ORDER_QTY,
dbo.CUSTOMER_ORDER.STATUS,
dbo.CUSTOMER_ORDER.SHIPTO_ID,
dbo.CUST_ADDRESS.NAME
ORDER BY dbo.CUSTOMER_ORDER.ID
As a shot in the dark it seems your query should be something like this.
SELECT
co.ID,
c.NAME,
col.PART_ID,
col.ORDER_QTY,
co.STATUS,
co.SHIPTO_ID,
ca.NAME AS Expr1
FROM dbo.CUSTOMER c
INNER JOIN dbo.CUSTOMER_ORDER co ON c.ID = co.CUSTOMER_ID
INNER JOIN dbo.CUST_ORDER_LINE col ON co.ID = col.CUST_ORDER_ID
INNER JOIN dbo.CUST_ADDRESS ca ON co.SHIPTO_ID = ca.CUSTOMER_ID --this is now joining to the order table.
WHERE co.ORDER_DATE > '2014-01-01'
AND co.ID NOT LIKE 'RMA%'
GROUP BY co.ID,
c.NAME,
col.PART_ID,
col.ORDER_QTY,
co.STATUS,
co.SHIPTO_ID,
ca.NAME
ORDER BY co.ID
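Notice how using aliases makes this look a lot cleaner. I also changed the string date to the ISO-style 'YYYY-MM-DD' form, which avoids the month/day ambiguity of '1/1/2014'. One caveat: for the legacy datetime type, SQL Server can still interpret a hyphenated literal according to the DATEFORMAT setting; the unseparated form '20140101' is the one that is read the same way regardless.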

Selecting a summed value from a subquery that relies on a joined table

So I'm the lucky guy who gets to optimize a query for our application that's taking far too long for the data we're getting. The data we're looking for isn't significantly complex, but the crappy database design is making it a lot harder than it should be (which is great, because I'm the one who designed it about a year ago).
The general idea is we're trying to calculate the total sales (they buy something that increases their balance) and the total payments (they paid money against their balance) for each customer.
The tables:
Customers
id
company
Sales (invoices):
id
customer_id
Payments (account_payments)
id
customer_id
transaction_id (links to invoice_transactions)
Transactions (invoice_transactions)
id
invoice_id (links to invoices, null if payment)
amount
If a user makes a sale, the info is recorded in invoices and invoice_transactions, with invoice_transactions holding the invoice_id of the invoices record that contains the customer_id.
If the user makes a payment, the info is recorded in account_payments and invoice_transactions, with invoice_transactions having an invoice_id of NULL, and account_payments containing the transaction_id as well as the customer_id.
I know, it's horrible... And I thought I was being clever! Well, I thought the problem through, and came up with a decent solution:
SELECT SQL_NO_CACHE
c.company,
(SELECT SUM(amount) FROM sales),
(SELECT SUM(amount) FROM payments)
FROM customers c
JOIN invoices i ON i.customer_id = c.id
JOIN invoice_transactions sales ON i.invoice_id = sales.id
JOIN account_payments ap ON ap.customer_id = c.id
JOIN invoice_transactions payments ON ap.transaction_id = payments.id
Which does absolutely nothing except give me an error "#1146 - Table 'db.sales' doesn't exist". I'm guessing it has something to do with sub queries being read prior to joins, but I honestly have no idea. And unfortunately I have no idea of another way to approach this problem... Much appreciated if anyone could give me a hand!
I think the best approach would be to separate the Sales and Payments elements into subqueries. Your current method cross joins all of a customer's payments with all of their invoices before doing the aggregation, so every amount is counted multiple times.
SELECT c.ID,
c.Company,
COALESCE(Sales.Amount, 0) AS Sales,
COALESCE(Payments.Amount, 0) AS Payments
FROM Customers c
LEFT JOIN
( SELECT Customer_ID, SUM(Amount) AS Amount
FROM Invoices
INNER JOIN invoice_transactions
ON Invoice_ID = Invoices.ID
GROUP BY Customer_ID
) As Sales
ON Sales.Customer_ID = c.ID
LEFT JOIN
( SELECT Customer_ID, SUM(Amount) AS Amount
FROM Account_Payments
INNER JOIN invoice_transactions tr
ON tr.ID = Transaction_ID
GROUP BY Customer_ID
) AS Payments
ON Payments.Customer_ID = c.ID;
This will include customers with no invoices and no payments. Change a left join to an inner join if you want to drop customers that have no rows on that side.
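To see why aggregating each side separately matters, the sketch below compares the pre-aggregated query with a single flat join over all four tables. It runs on SQLite via Python's sqlite3 as a stand-in for MySQL; the table layout follows the description in the question, and the two customers and their amounts are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, company TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE invoice_transactions (id INTEGER PRIMARY KEY,
                                       invoice_id INTEGER, amount INTEGER);
    CREATE TABLE account_payments (id INTEGER PRIMARY KEY, customer_id INTEGER,
                                   transaction_id INTEGER);
    INSERT INTO customers VALUES (1, 'A'), (2, 'B');      -- B has no activity
    INSERT INTO invoices VALUES (1, 1), (2, 1);            -- two sales for A
    INSERT INTO invoice_transactions VALUES
        (1, 1, 100), (2, 2, 50),                           -- sale amounts
        (3, NULL, 30), (4, NULL, 20);                      -- payment amounts
    INSERT INTO account_payments VALUES (1, 1, 3), (2, 1, 4);
""")

# Each side is summed per customer first, then joined: one row per customer.
pre_aggregated = conn.execute("""
    SELECT c.company,
           COALESCE(s.amount, 0) AS sales,
           COALESCE(p.amount, 0) AS payments
    FROM customers c
    LEFT JOIN (
        SELECT i.customer_id, SUM(t.amount) AS amount
        FROM invoices i
        INNER JOIN invoice_transactions t ON t.invoice_id = i.id
        GROUP BY i.customer_id
    ) s ON s.customer_id = c.id
    LEFT JOIN (
        SELECT ap.customer_id, SUM(t.amount) AS amount
        FROM account_payments ap
        INNER JOIN invoice_transactions t ON t.id = ap.transaction_id
        GROUP BY ap.customer_id
    ) p ON p.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Flat join: every invoice row pairs with every payment row of the customer.
flat_join = conn.execute("""
    SELECT c.company, SUM(sales.amount), SUM(payments.amount)
    FROM customers c
    JOIN invoices i ON i.customer_id = c.id
    JOIN invoice_transactions sales ON sales.invoice_id = i.id
    JOIN account_payments ap ON ap.customer_id = c.id
    JOIN invoice_transactions payments ON payments.id = ap.transaction_id
    GROUP BY c.id, c.company
""").fetchall()

print(pre_aggregated)  # [('A', 150, 50), ('B', 0, 0)]
print(flat_join)       # [('A', 300, 100)]
```

With two invoices and two payments, the flat join produces four rows for customer A, so both sums come out doubled, and customer B disappears because of the inner joins.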
Your query doesn't make sense.
After doing all the joining, why not just use the tables in the "from" clause:
SELECT c.company, SUM(sales.amount), SUM(payments.amount)
FROM customers c JOIN invoices i ON i.customer_id = c.id JOIN
invoice_transactions sales ON sales.invoice_id = i.id JOIN
account_payments ap ON ap.customer_id = c.id JOIN
invoice_transactions payments ON ap.transaction_id = payments.id
group by c.company
Just giving a table an alias in the "from" clause does not make it available in subqueries elsewhere in the query.
I also added a GROUP BY clause, since your query seems to be aggregating by company.