I'm using 4 tables:
CUSTOMER
CUSTOMER_ORDER
CUST_ORDER_LINE
CUST_ADDRESS
I used inner joins to link the tables. CUSTOMER is linked to CUSTOMER_ORDER and CUST_ADDRESS by customer_ID, and CUST_ORDER_LINE is linked to CUSTOMER_ORDER by order_ID. Order_ID does not appear in the CUSTOMER or CUST_ADDRESS tables.
When I run the query below, I get every shipping address on record for that particular customer and order number.
For example, a distributor has 25 possible shipping addresses, but they only ship one order to one shipping address at a time. My query returns the same order number 25 times, once for every address on record. Any advice would be wonderful. Thank you.
SELECT DISTINCT TOP (100) PERCENT dbo.CUSTOMER_ORDER.ID,
dbo.CUSTOMER.NAME,
dbo.CUST_ORDER_LINE.PART_ID,
dbo.CUST_ORDER_LINE.ORDER_QTY,
dbo.CUSTOMER_ORDER.STATUS,
dbo.CUSTOMER_ORDER.SHIPTO_ID,
dbo.CUST_ADDRESS.NAME AS Expr1
FROM dbo.CUSTOMER
INNER JOIN dbo.CUSTOMER_ORDER
ON dbo.CUSTOMER.ID = dbo.CUSTOMER_ORDER.CUSTOMER_ID
INNER JOIN dbo.CUST_ORDER_LINE
ON dbo.CUSTOMER_ORDER.ID = dbo.CUST_ORDER_LINE.CUST_ORDER_ID
INNER JOIN dbo.CUST_ADDRESS
ON dbo.CUSTOMER.ID = dbo.CUST_ADDRESS.CUSTOMER_ID
WHERE (dbo.CUSTOMER_ORDER.ORDER_DATE > '1/1/2014')
AND (dbo.CUSTOMER_ORDER.ID NOT LIKE 'RMA%')
GROUP BY dbo.CUSTOMER_ORDER.ID,
dbo.CUSTOMER.NAME,
dbo.CUST_ORDER_LINE.PART_ID,
dbo.CUST_ORDER_LINE.ORDER_QTY,
dbo.CUSTOMER_ORDER.STATUS,
dbo.CUSTOMER_ORDER.SHIPTO_ID,
dbo.CUST_ADDRESS.NAME
ORDER BY dbo.CUSTOMER_ORDER.ID
As a shot in the dark, it seems your query should be something like this.
SELECT
co.ID,
c.NAME,
col.PART_ID,
col.ORDER_QTY,
co.STATUS,
co.SHIPTO_ID,
ca.NAME AS Expr1
FROM dbo.CUSTOMER c
INNER JOIN dbo.CUSTOMER_ORDER co ON c.ID = co.CUSTOMER_ID
INNER JOIN dbo.CUST_ORDER_LINE col ON co.ID = col.CUST_ORDER_ID
INNER JOIN dbo.CUST_ADDRESS ca ON co.SHIPTO_ID = ca.CUSTOMER_ID --this is now joining to the order table.
WHERE co.ORDER_DATE > '2014-01-01'
AND co.ID NOT LIKE 'RMA%'
GROUP BY co.ID,
c.NAME,
col.PART_ID,
col.ORDER_QTY,
co.STATUS,
co.SHIPTO_ID,
ca.NAME
ORDER BY co.ID
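Notice how using aliases makes this look a lot cleaner. I also changed the date string to the ISO format (YYYY-MM-DD), which is far less ambiguous than '1/1/2014'. If ORDER_DATE is a legacy datetime column, the unseparated form '20140101' is the one that is interpreted the same way regardless of your DATEFORMAT and language settings.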
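To see why the extra rows show up, here is a small sketch using Python's sqlite3 module rather than SQL Server. The cut-down tables and, in particular, the shipto_id column on cust_address are assumptions for illustration, not your real schema. It compares joining the address table on the customer only with joining it on the order's ship-to as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer       (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE customer_order (id TEXT PRIMARY KEY, customer_id TEXT, shipto_id TEXT);
    -- Assumption: each address row carries its own ship-to identifier.
    CREATE TABLE cust_address   (customer_id TEXT, shipto_id TEXT, name TEXT);

    INSERT INTO customer VALUES ('C1', 'Distributor');
    INSERT INTO customer_order VALUES ('O1', 'C1', 'S2');
    INSERT INTO cust_address VALUES
        ('C1', 'S1', 'Warehouse A'),
        ('C1', 'S2', 'Warehouse B'),
        ('C1', 'S3', 'Warehouse C');
""")

# Joining addresses on the customer only: one row per address on record.
by_customer = conn.execute("""
    SELECT co.id, ca.name
    FROM customer c
    JOIN customer_order co ON co.customer_id = c.id
    JOIN cust_address ca   ON ca.customer_id = c.id
    ORDER BY ca.name
""").fetchall()

# Joining on the order's ship-to as well: only the address the order uses.
by_shipto = conn.execute("""
    SELECT co.id, ca.name
    FROM customer c
    JOIN customer_order co ON co.customer_id = c.id
    JOIN cust_address ca   ON ca.customer_id = co.customer_id
                          AND ca.shipto_id   = co.shipto_id
""").fetchall()

print(len(by_customer), by_customer)
print(len(by_shipto), by_shipto)
```

With three addresses on record, the first query repeats order O1 three times, while the second returns it once. GROUP BY or DISTINCT cannot collapse the first result because the address name differs on every row.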
I have 5 SQL tables
store
staff
departments
sold_items
staff_rating
I created a view that JOINs four of these tables together. From the last table (staff_rating), I want to get the rating whose date is closest to when the item was sold (sold_items.date) for each row of the view.
I have tried the following SQL queries, which work but have performance issues.
SQL QUERY 1
SELECT s.name,
s.country,
d.name,
si.item,
si.date,
(SELECT rating
FROM staff_ratings
WHERE staff_id = s.id
ORDER BY DATEDIFF(date, si.date) LIMIT 1) AS rating,
st.name,
st.owner
FROM store st
LEFT OUTER JOIN staff s ON s.store_id = st.id
LEFT JOIN departments d ON d.store_id = st.id
LEFT JOIN sold_items si ON si.store_id = st.id
SQL QUERY 2
SELECT s.name,
s.country,
d.name,
si.item,
si.date,
si.rating,
st.name,
st.owner
FROM store st
LEFT OUTER JOIN staff s ON s.store_id = st.id
LEFT JOIN departments d ON d.store_id = st.id
LEFT JOIN (SELECT *,
(SELECT rating
FROM staff_ratings
WHERE staff_id = si.staff_id
ORDER BY DATEDIFF(date, si.date) LIMIT 1) AS rating
FROM sold_items si) si ON si.store_id = st.id
SQL Query 2 is faster than SQL Query 1, but both still have performance issues. I would appreciate help with a query that performs better. Thanks in advance.
Your query doesn't look right to me (as mentioned in a comment on the original post: it lacks staff_id in the join on the sales, etc.).
Ignoring that, one of your biggest performance hits is likely to be this...
ORDER BY DATEDIFF(date, si.date) LIMIT 1
That order by can only be answered by comparing EVERY record for that staff member to the current sales record.
What you ideally want to be able to do is find the appropriate staff rating from an index, and not to have to run computations that involve dates from both the ratings table and the sales table.
If, for example, you wanted "the most recent rating BEFORE the sale", the query can be substantially improved...
SELECT
s.name,
s.country,
d.name,
si.item,
si.date,
(
SELECT sr.rating
FROM staff_ratings sr
WHERE sr.staff_id = s.id
AND sr.date <= si.date
ORDER BY sr.date DESC
LIMIT 1
)
AS rating,
st.name,
st.owner
FROM store st
LEFT JOIN staff s ON s.store_id = st.id
LEFT JOIN departments d ON d.store_id = st.id
LEFT JOIN sold_items si ON si.store_id = st.id
Then, with an index for staff_ratings(staff_id, date, rating) the optimiser can very quickly look up which rating to use, without having to scan Every Single Rating for that staff member.
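As a rough illustration of the idea, here is a minimal sketch using Python's sqlite3 module instead of MySQL. The sample rows, the index name idx_ratings and the staff_id column on sold_items are assumptions made for the example. It looks up the most recent rating on or before each sale and prints the query plan so you can check whether the index is picked up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff_ratings (staff_id INTEGER, date TEXT, rating INTEGER);
    -- Assumption: each sale records the staff member who made it.
    CREATE TABLE sold_items    (staff_id INTEGER, item TEXT, date TEXT);
    CREATE INDEX idx_ratings ON staff_ratings (staff_id, date, rating);

    INSERT INTO staff_ratings VALUES
        (1, '2020-01-01', 3),
        (1, '2020-02-01', 4),
        (1, '2020-03-01', 5),
        (2, '2020-01-15', 2);
    INSERT INTO sold_items VALUES
        (1, 'lamp',  '2020-02-10'),
        (1, 'chair', '2020-03-05'),
        (2, 'desk',  '2020-01-10');
""")

# The subquery filters and sorts on the raw date column, so no per-row
# date arithmetic is needed to find the matching rating.
query = """
    SELECT si.item,
           (SELECT sr.rating
            FROM staff_ratings sr
            WHERE sr.staff_id = si.staff_id
              AND sr.date <= si.date
            ORDER BY sr.date DESC
            LIMIT 1) AS rating
    FROM sold_items si
    ORDER BY si.item
"""
rows = conn.execute(query).fetchall()
print(rows)

# Inspect the plan; the exact wording depends on the SQLite version.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

The desk sale has no rating on or before its date, so its rating comes back as NULL (None in Python). That is the price of the "before the sale" rule compared with "closest in either direction".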
Why DATEDIFF? Would something like this work better? If so, the given index will make it work much faster.
WHERE staff_id = s.id
AND date >= si.date
ORDER BY date
LIMIT 1
And INDEX(staff_id, date)
Do you need LEFT JOIN? Perhaps plain JOIN?
d may benefit from INDEX(store_id, name)
I have 3 tables in my database
companies{
id,
name,
address
}
stores{
id,
name,
address,
company_id
}
invoices{
id,
total,
date_time,
store_id
}
As you can see, each store is connected to a company via foreign key, also each invoice is connected to a store.
My question is, how can I write a SQL query which will return all stores of a company and order them by their turnover?
If I use the query:
SELECT s.*,
sum(i.total) as turnover FROM stores s
JOIN invoices i
ON i.store_id = s.id
WHERE YEAR(i.date_time) = 2019;
I can see the turnover for one store for the year 2019, for example, but I'm struggling to find a way to get a list of stores ordered by their turnover for a certain period.
You're going to need to join all 3 tables:
SELECT *
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
That's your entire raw data in detailed list. Then you say you want it for a certain company only:
SELECT *
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
Then you only want the stores and the invoices amounts:
SELECT s.name, i.total
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
Then you want a row set where each line is a single store and the sum of all invoices for that store:
SELECT s.name, SUM(i.total)
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
GROUP BY s.name
Lastly you want them in descending order, highest total first:
SELECT s.name as storename, SUM(i.total) as turnover
FROM
companies c
INNER JOIN stores s on s.company_id = c.id
INNER JOIN invoices i ON i.store_id = s.id
WHERE c.name = 'Acme Rubber Co'
GROUP BY s.name
ORDER BY turnover DESC
The logical order of evaluation in SQL is FROM (with joins), WHERE, GROUP BY, SELECT, ORDER BY, which is why I use different names in, e.g., the ORDER BY than I do in the GROUP BY. Conceptually, each step only sees the names output by the steps before it. MySQL isn't too picky about this, but some databases are: you couldn't say GROUP BY storename in SQL Server, because the SELECT that creates the storename alias hasn't run yet when the GROUP BY is done.
Note: I wasn't really sure on what you were looking for in a WHERE - you started by saying "all stores turnover for a certain company" and finished saying you were "struggling to get turnover for a period"
If you want a period, use e.g. WHERE somedatecolumn BETWEEN '2000-01-01' AND '2000-12-31' (BETWEEN is inclusive) or WHERE somedatecolumn >= '2000-01-01' AND somedatecolumn < '2001-01-01' (a good pattern to use if the column includes a time too). It is almost never wise to call a function on a column you're searching on, i.e. do not do WHERE YEAR(somedatecolumn) = 2000, because it generally prevents the database from using an index on that column and can make the search very slow.
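Putting the period filter together with the turnover query, a minimal sketch could look like the following. It runs on Python's sqlite3 module rather than MySQL, and the sample stores and invoices are made up for the example. It also shows what an inclusive BETWEEN on date-only bounds does to an invoice issued late on the last day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stores    (id INTEGER PRIMARY KEY, name TEXT, company_id INTEGER);
    CREATE TABLE invoices  (id INTEGER PRIMARY KEY, total REAL, date_time TEXT, store_id INTEGER);

    INSERT INTO companies VALUES (1, 'Acme Rubber Co');
    INSERT INTO stores VALUES (1, 'North', 1), (2, 'South', 1);
    INSERT INTO invoices VALUES
        (1, 100, '2019-03-01 09:00:00', 1),
        (2, 250, '2019-12-31 23:30:00', 2),  -- late on the last day of the period
        (3,  50, '2019-06-15 12:00:00', 2),
        (4, 999, '2020-01-01 00:00:00', 1);  -- first instant after the period
""")

# Half-open range: start inclusive, end exclusive, no function on the column.
turnover = conn.execute("""
    SELECT s.name AS storename, SUM(i.total) AS turnover
    FROM companies c
    INNER JOIN stores s   ON s.company_id = c.id
    INNER JOIN invoices i ON i.store_id = s.id
    WHERE c.name = ?
      AND i.date_time >= ?
      AND i.date_time <  ?
    GROUP BY s.name
    ORDER BY turnover DESC
""", ("Acme Rubber Co", "2019-01-01", "2020-01-01")).fetchall()
print(turnover)

# BETWEEN with date-only bounds stops at the start of the last day.
between_total = conn.execute("""
    SELECT SUM(total) FROM invoices
    WHERE date_time BETWEEN '2019-01-01' AND '2019-12-31'
""").fetchone()[0]
print(between_total)
```

The half-open range picks up the 23:30 invoice on 31 December and leaves out the one stamped at midnight on 1 January, whereas the BETWEEN version silently drops the 23:30 invoice.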
I want to write a query that can show the amount of purchases made in the month of June, grouped by city. So I wrote this query:
SELECT state, city, COUNT(*)
FROM address
JOIN person
JOIN purchase
WHERE purchase.person_FK = person.id
AND address.person_FK = person.id
AND MONTH(purchase.purchase_date) = 6
GROUP BY state, city
ORDER BY state, city;
But this query doesn't return the cities that have no purchases in that month, and I want to show them. Can you help me?
You need a city table with all the cities, then LEFT JOIN everything else to it.
Put the month filter in the ON clause, not in the WHERE, and count a column from purchase rather than COUNT(*), so that cities without purchases show 0 instead of 1.
SELECT Cities.state, Cities.city, COUNT(purchase.person_FK)
FROM Cities
LEFT JOIN address
ON address.city = Cities.city
AND address.state = Cities.state
LEFT JOIN person
ON person.id = address.person_FK
LEFT JOIN purchase
ON purchase.person_FK = person.id
AND MONTH(purchase.purchase_date) = 6
GROUP BY Cities.state, Cities.city
ORDER BY Cities.state, Cities.city;
Look at your joins: JOIN is the same as INNER JOIN, which only returns rows that have a match in both tables. You'll need to use a LEFT or FULL join to get what you need.
There's a diagram here which explains them well
You will need to have a table that provides a listing of all cities you want to show (if you don't already have that), and start the query from it. Otherwise, your query has no idea which cities to show with a zero count. In addition, you will need to change your JOINs to LEFT JOINs, and keep the month filter in the ON clause so it does not discard the unmatched rows.
SELECT city.state, city.city, COUNT(purchase.person_FK)
FROM city
LEFT JOIN address ON address.city = city.city AND address.state = city.state
LEFT JOIN person ON person.id = address.person_FK
LEFT JOIN purchase ON purchase.person_FK = person.id
AND MONTH(purchase.purchase_date) = 6
GROUP BY city.state, city.city
ORDER BY city.state, city.city;
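The difference between filtering in ON and filtering in WHERE is easy to miss, so here is a small sketch of it using Python's sqlite3 module. The city lookup table and all sample rows are hypothetical, and because SQLite has no MONTH() function the example uses strftime('%m', ...) as a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city     (state TEXT, city TEXT);  -- hypothetical lookup table
    CREATE TABLE person   (id INTEGER PRIMARY KEY);
    CREATE TABLE address  (person_FK INTEGER, state TEXT, city TEXT);
    CREATE TABLE purchase (person_FK INTEGER, purchase_date TEXT);

    INSERT INTO city VALUES ('TX', 'Austin'), ('TX', 'Dallas');
    INSERT INTO person VALUES (1), (2);
    INSERT INTO address VALUES (1, 'TX', 'Austin'), (2, 'TX', 'Dallas');
    INSERT INTO purchase VALUES
        (1, '2020-06-03'),
        (1, '2020-06-20'),
        (2, '2020-07-01');  -- outside June
""")

base = """
    SELECT city.state, city.city, COUNT(purchase.person_FK)
    FROM city
    LEFT JOIN address  ON address.city = city.city AND address.state = city.state
    LEFT JOIN person   ON person.id = address.person_FK
    LEFT JOIN purchase ON purchase.person_FK = person.id
"""
tail = " GROUP BY city.state, city.city ORDER BY city.state, city.city"
june = "strftime('%m', purchase.purchase_date) = '06'"

# Filter as part of the join: unmatched cities survive with a count of 0.
filter_in_on = conn.execute(base + " AND " + june + tail).fetchall()

# Filter after the join: rows without a June purchase are thrown away.
filter_in_where = conn.execute(base + " WHERE " + june + tail).fetchall()

print(filter_in_on)
print(filter_in_where)
```

Counting purchase.person_FK instead of using COUNT(*) matters as well: COUNT(*) counts the row that the LEFT JOIN keeps for an unmatched city, so it would report 1 where 0 is wanted.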
I have two tables: customers and contracts. The common key between them is customer_id. I need to link these two tables to represent if my fictitious business is on contract with a customer.
The customer -> contract table has a one to many relationship, so a customer can have an old contract on record. I want the latest. This is currently handled by contract_id which is auto-incremented.
My query is supposed to grab the contract data based on customer_id and the max contract_id for that customer_id.
My query currently looks like this:
SELECT * FROM(
SELECT co.*
FROM contracts co
LEFT JOIN customers c ON co.customer_id = c.customer_id
WHERE co.customer_id ='135') a
where a.contract_id = MAX(a.contract_id);
The answer is probably ridiculously obvious and I'm just not seeing it.
Since the most recent contract will be the one with the highest a.contract_id, simply ORDER BY and LIMIT 1
SELECT * FROM(
SELECT co.*
FROM contracts co
LEFT JOIN customers c ON co.customer_id = c.customer_id
WHERE co.customer_id ='135') a
ORDER BY a.contract_id DESC
LIMIT 1
You can use NOT EXISTS() :
SELECT * FROM contracts co
LEFT JOIN customers c
ON(c.customer_id = co.customer_id)
WHERE co.customer_id = '135'
AND NOT EXISTS(SELECT 1 FROM contracts co2
WHERE co2.customer_id = co.customer_id
AND co2.contract_id > co.contract_id)
This makes sure you get the latest contract, and it works for all customers: remove the co.customer_id = '135' condition (so the NOT EXISTS becomes the WHERE condition) and you will get the latest contract for every customer.
In general, you can't use an aggregate function in the WHERE clause, only in the SELECT list or in HAVING, which is usually combined with a GROUP BY clause.
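A quick way to convince yourself of this is the sketch below, which uses Python's sqlite3 module with a made-up contracts table (MySQL reports a different error message, but rejects the query in the same way). It shows the aggregate in WHERE failing, then two forms that work: a correlated subquery and GROUP BY with HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (contract_id INTEGER PRIMARY KEY, customer_id TEXT);
    INSERT INTO contracts VALUES (1, '135'), (2, '135'), (3, '200'), (4, '135');
""")

# An aggregate directly in WHERE is rejected by the database.
try:
    conn.execute(
        "SELECT * FROM contracts a WHERE a.contract_id = MAX(a.contract_id)"
    )
    aggregate_in_where_ok = True
except sqlite3.OperationalError as exc:
    aggregate_in_where_ok = False
    print("rejected:", exc)

# Moving the aggregate into a correlated subquery is fine.
latest = conn.execute("""
    SELECT co.customer_id, co.contract_id
    FROM contracts co
    WHERE co.contract_id = (SELECT MAX(co2.contract_id)
                            FROM contracts co2
                            WHERE co2.customer_id = co.customer_id)
    ORDER BY co.customer_id
""").fetchall()
print(latest)

# HAVING filters on the aggregate after the rows have been grouped.
having = conn.execute("""
    SELECT customer_id, MAX(contract_id)
    FROM contracts
    GROUP BY customer_id
    HAVING MAX(contract_id) > 3
""").fetchall()
print(having)
```

The subquery form returns whole contract rows, so it is the one to reach for when you need more columns than just the id.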
I have a table containing customers and another containing all orders.
I want to display a list of customers and along side show the total value of their orders.
Obviously I could loop through the customers and then using PHP run another query to get each customer's revenue. I don't think this is efficient.
I am looking to achieve something like this:
SELECT username, [SELECT sum(revenue) from orders where userID=userID] from customers
And for this to show output:
bob 10000
jeff 25000
alan 500
SELECT a.username, SUM(b.revenue) totalRevenue
FROM customers a
LEFT JOIN Orders b
ON a.userID = b.UserID
GROUP BY a.username
This will list all customers with or without Orders.
To learn more about joins, please see the article below:
Visual Representation of SQL Joins
you're close...
SELECT username, (SELECT sum(revenue) from orders where userID=c.userID) rev
from customers c
You can join the tables and then group them by the customer's username
SELECT c.username,
sum(o.revenue) as sum_revenue
from customers c
left outer join orders o on o.userid = c.userid
group by c.username
No need for a subselect with that. Try something like this:-
SELECT customers.userID, customers.username, SUM(revenue)
FROM customers INNER JOIN orders ON customers.userID = orders.userID
GROUP BY customers.userID, customers.username
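The answers above differ mainly in what happens to customers who have no orders. The sketch below, using Python's sqlite3 module, makes that visible. The sample data, including an extra customer named carol, is invented for the example, and COALESCE is an optional addition to turn the NULL total into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (userID INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE orders    (userID INTEGER, revenue INTEGER);

    INSERT INTO customers VALUES (1, 'bob'), (2, 'jeff'), (3, 'alan'), (4, 'carol');
    INSERT INTO orders VALUES (1, 4000), (1, 6000), (2, 25000), (3, 500);
    -- carol has no orders
""")

# LEFT JOIN keeps every customer; COALESCE replaces the NULL sum with 0.
with_left_join = conn.execute("""
    SELECT c.username, COALESCE(SUM(o.revenue), 0) AS totalRevenue
    FROM customers c
    LEFT JOIN orders o ON o.userID = c.userID
    GROUP BY c.userID, c.username
    ORDER BY c.username
""").fetchall()

# INNER JOIN drops customers that have no matching order.
with_inner_join = conn.execute("""
    SELECT c.username, SUM(o.revenue) AS totalRevenue
    FROM customers c
    INNER JOIN orders o ON o.userID = c.userID
    GROUP BY c.userID, c.username
    ORDER BY c.username
""").fetchall()

print(with_left_join)
print(with_inner_join)
```

Grouping by userID as well as username keeps two customers who happen to share a username from being merged into one row.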