I have 3 tables which need to be linked in an SQL statement (I'm using PHP - MySQL if it helps). I need to extract all orders where the vendor field from the third table equals '3', as below:
orders - orders_items - items
order_id -> order_id
item_id -> id
vendor = '3'
I believe there are many ways to do this with various WHERE clauses and JOINs, but I'm asking for the most efficient method compared with my method below:
SELECT
orders.order_id
FROM
items, orders
INNER JOIN
orders_items
ON
orders.order_id = orders_items.order_id
WHERE
orders_items.item_id = items.id
AND
items.vendor = '3'
GROUP BY
orders.order_id
Using , notation is not universally considered bad practice, but I think those who still favor it are now quite a small minority. Even Oracle (whose users seem to be the most vocal supporters of that syntax) recommends not using it.
But I don't know anyone who would support mixing , and ANSI-92's JOIN syntax. It's just asking for trouble.
SELECT
orders.order_id
FROM
orders
INNER JOIN
orders_items
ON orders.order_id = orders_items.order_id
INNER JOIN
items
ON orders_items.item_id = items.id
WHERE
items.vendor = '3'
GROUP BY
orders.order_id
The SQL Optimiser doesn't execute that exactly as you specified it. SQL is just an expression from which the SQL Optimiser derives a plan to give a result that fits. By writing it as above, the optimiser will find what it sees as the best order in which to filter, join, sort, etc., and the best indexes to use for those things.
EDIT
I've noticed people supporting DISTINCT over GROUP BY.
While DISTINCT is slightly shorter, it is not any quicker, and does place restrictions on you. You can't later add COUNT(*) for example, but with GROUP BY you can.
In short, GROUP BY can do anything DISTINCT can, but that's not true the other way around. I only use DISTINCT in very trivial pieces of code so I can get a whole query on one line. Even then I often later regret it a little as the code develops and I need to revert to GROUP BY.
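To make the DISTINCT versus GROUP BY point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration; only the table and column names come from the question. The same join is run once with DISTINCT and once with GROUP BY, where adding COUNT(*) is a one-line change.

```python
import sqlite3

# In-memory stand-in for the question's schema; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
    CREATE TABLE items (id INTEGER PRIMARY KEY, vendor TEXT);
    CREATE TABLE orders_items (order_id INTEGER, item_id INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO items VALUES (10, '3'), (11, '3'), (12, '7');
    INSERT INTO orders_items VALUES (1, 10), (1, 11), (2, 12), (3, 11);
""")

# The FROM/JOIN/WHERE part is shared by both variants.
JOINS = """
    FROM orders
    INNER JOIN orders_items ON orders.order_id = orders_items.order_id
    INNER JOIN items ON orders_items.item_id = items.id
    WHERE items.vendor = '3'
"""

# DISTINCT removes the duplicate row for order 1 (it has two vendor-3 items).
distinct_rows = conn.execute(
    "SELECT DISTINCT orders.order_id" + JOINS + "ORDER BY orders.order_id"
).fetchall()

# GROUP BY gives the same order ids and can also carry an aggregate.
grouped_rows = conn.execute(
    "SELECT orders.order_id, COUNT(*) AS item_count" + JOINS
    + "GROUP BY orders.order_id ORDER BY orders.order_id"
).fetchall()

print(distinct_rows)
print(grouped_rows)
```

With this sample data both queries return orders 1 and 3; the GROUP BY version additionally reports how many vendor-3 items each order contains.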
select o.order_id from orders o inner join orders_items oi on o.order_id = oi.order_id inner join items i on oi.item_id = i.id where i.vendor='3';
There are many ways to do the same thing: joins, a subquery, or an IN clause. Which is best depends on whether you care more about time or memory, and largely on which columns are indexed and how much data the joined tables hold.
You don't need the GROUP BY, just make a DISTINCT if you need to remove duplicates:
SELECT DISTINCT o.order_id
FROM orders o
INNER JOIN orders_items oi ON oi.order_id = o.order_id
INNER JOIN items i ON i.id = oi.item_id
where i.vendor = '3'
And also, use INNER JOIN on all tables :)
This is efficient and will work too:
SELECT
DISTINCT(orders.order_id)
FROM
items
INNER JOIN orders_items on (items.id=orders_items.item_id )
inner join orders on (orders.order_id=orders_items.order_id)
WHERE
items.vendor = '3'
SELECT
o.order_id
FROM
orders o
INNER JOIN orders_items oi ON o.order_id = oi.order_id
INNER JOIN items i ON oi.item_id = i.id
WHERE
i.vendor = 3
The table1, table2 syntax isn't something that I've used, but I imagine listing the tables as joins is more efficient as that seems to be the most accepted way.
Also, you don't need to put speech marks on the vendor criteria if the field is an integer.
SELECT O.order_id AS Id
FROM orders O
INNER JOIN orders_items OI
ON O.order_id = OI.order_id
INNER JOIN items I
ON OI.item_id = I.id
WHERE I.vendor = '3'
GROUP BY O.order_id
I have come up with two queries, both use an inner join on two different tables.
Query 1
SELECT PRODUCTS.CODE, PRODUCTS.REFERENCE, PRODUCTS.TAXCAT, PRODUCTS.DISPLAY,PRODUCTS.NAME, PRODUCTS.PRICEBUY, PRODUCTS.PRICESELL, CATEGORIES.NAME AS CATEGORY
FROM PRODUCTS INNER JOIN CATEGORIES ON PRODUCTS.CATEGORY = CATEGORIES.ID;
Query 2
SELECT PRODUCTS.CODE, PRODUCTS.REFERENCE, PRODUCTS.TAXCAT, PRODUCTS.DISPLAY,PRODUCTS.NAME, PRODUCTS.PRICEBUY, PRODUCTS.PRICESELL,STOCKCURRENT.UNITS AS UNIT FROM PRODUCTS INNER JOIN STOCKCURRENT ON STOCKCURRENT.PRODUCT = PRODUCTS.ID;
Both queries run fine on their own, but when I try to use both inner joins together I get errors. This is what I came up with on my own. I'm having trouble understanding the syntax to achieve this.
SELECT PRODUCTS.CODE, PRODUCTS.REFERENCE, PRODUCTS.TAXCAT,
PRODUCTS.DISPLAY,PRODUCTS.NAME, PRODUCTS.PRICEBUY,
PRODUCTS.PRICESELL,STOCKCURRENT.UNITS AS UNIT FROM PRODUCTS INNER JOIN
STOCKCURRENT ON STOCKCURRENT.PRODUCT = PRODUCTS.ID, CATEGORIES.NAME AS
CATEGORY FROM PRODUCTS INNER JOIN CATEGORIES ON PRODUCTS.CATEGORY =
CATEGORIES.ID;
Thank you.
Your attempted query has several syntax problems. Assuming you just want to join together the three tables, you may try the following query:
SELECT
p.CODE,
p.REFERENCE,
p.TAXCAT,
p.DISPLAY,
p.NAME,
p.PRICEBUY,
p.PRICESELL,
s.UNITS AS UNIT,
c.NAME AS CATEGORY
FROM PRODUCTS p
INNER JOIN STOCKCURRENT s
ON s.PRODUCT = p.ID
INNER JOIN CATEGORIES c
ON p.CATEGORY = c.ID;
Note that I introduced table aliases here. These aliases can be used elsewhere in the query to avoid having to repeat the entire table name.
By the way, I can also see taking a union of your two original queries. But without expected output, it was not entirely clear what you want.
I want to expand UI on my CodeIgniter shop with suggestions on what other people bought with the current product (either when viewing product or when product is put in the cart, irrelevant now for the question).
I have come up with this query (the orders table contains order details, while order_items contains the products that are in a specific order via a foreign key; the prd alias is for the products table, where all important info about a product is stored).
Query looks like this
SELECT
pr.product_id,
COUNT(*) AS num,
prd.*
FROM
orders AS o
INNER JOIN order_items AS po ON o.id = po.order_id
INNER JOIN order_items AS pr ON o.id = pr.order_id
INNER JOIN products AS prd ON pr.product_id = prd.id
WHERE
po.product_id = '14211'
AND pr.product_id <> '14211'
GROUP BY
pr.product_id
ORDER BY
num DESC
LIMIT 3
It works nice and dandy: query time is 0.030ish seconds and it returns the products that were bought together with the one I am currently viewing.
As for the questions and considerations, the Percona query analyzer complains about these two things: Non-deterministic GROUP BY, and GROUP BY or ORDER BY on different tables. I need both so that the most relevant items come out on top, but I have absolutely no idea how to fix this, or whether I should even be bothered by this notice from the query analyzer.
The second question is regarding performance. Since this query is using temporary and filesort, I was thinking of creating a view out of it and using that instead of actually executing the query each time some product is opened.
Mind you that I am not asking for CI model/view/controller tips, just tips on how to optimize this query, and/or suggestions regarding performance and going for views approach...
Any help is more than appreciated.
SELECT p.num, prd.*
FROM
(
SELECT a.product_id, COUNT(*) AS num
FROM orders AS o
INNER JOIN order_items AS b ON o.id = b.order_id
INNER JOIN order_items AS a ON o.id = a.order_id
WHERE b.product_id = '14211'
AND a.product_id <> '14211'
GROUP BY a.product_id
ORDER BY num DESC
LIMIT 3
) AS p
JOIN products AS prd ON p.product_id = prd.id
ORDER BY p.num DESC
This should:
run faster (especially as your data grows),
avoid the GROUP BY complaint,
not over-inflate the count,
etc.
Ignore the complaint about GROUP BY and ORDER BY coming from different tables -- that is a performance issue; you need it.
As for translating that back to CodeIgniter, good luck.
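As a rough check of the derived-table shape above, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. The sample rows, the product names, the `name` column, and the helper `bought_with` are all made up for illustration; the extra tie-breaker in ORDER BY is also an addition, there only to make the output deterministic.

```python
import sqlite3

# In-memory stand-in; table names follow the question, the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_items (order_id INTEGER, product_id INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO products VALUES
        (14211, 'Lamp'), (20, 'Bulb'), (30, 'Shade'), (40, 'Cable');
    INSERT INTO order_items VALUES
        (1, 14211), (1, 20), (1, 30),
        (2, 14211), (2, 20),
        (3, 40), (3, 20);
""")

def bought_with(product_id, limit=3):
    """Hypothetical helper: products most often ordered with product_id."""
    return conn.execute("""
        SELECT p.num, prd.id, prd.name
        FROM (
            -- Count and limit first, on the narrow order_items rows only.
            SELECT a.product_id, COUNT(*) AS num
            FROM orders AS o
            INNER JOIN order_items AS b ON o.id = b.order_id
            INNER JOIN order_items AS a ON o.id = a.order_id
            WHERE b.product_id = ?
              AND a.product_id <> ?
            GROUP BY a.product_id
            ORDER BY num DESC, a.product_id
            LIMIT ?
        ) AS p
        -- Only the few surviving rows are joined to the wide products table.
        JOIN products AS prd ON p.product_id = prd.id
        ORDER BY p.num DESC, prd.id
    """, (product_id, product_id, limit)).fetchall()

print(bought_with(14211))
```

With this data, product 20 appears in both orders that contain 14211 and product 30 in one of them, so those two come back in that order.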
I have a table let's call it products with a list of Manufacturers and Products.
I have a second table let's call it Customer, Orders.
I can do a join to make a list of all the items from each manufacturer the customer ordered using an Inner Join. Yet trying to do an Inner Join for the items they did not order fails.
I tried an Inner Join with 'Orders.Product != Products.Product' but that only works where the Customer has one order. Once there is more than one order I get the same list I would have got doing an Inner Join. Any thoughts? I'll try to make a SqlFiddle tonight but was hoping a quick description might help a MySql / Join expert who has done a 'NOT Inner Join' before...
It is called an anti join; you can use a left join with an IS NULL check:
select p.*
from products p
left join orders o on p.Product = o.Product
where o.product is null
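If the list is needed per customer, one possible variation is to put the customer condition in the ON clause rather than in WHERE, so that unmatched products still survive the left join. The sketch below assumes column names `Manufacturer`, `Product` and `Customer` (guessed from the question's description), uses invented rows, and runs on Python's built-in sqlite3 module rather than MySQL; `products_not_ordered_by` is a hypothetical helper.

```python
import sqlite3

# In-memory stand-in; column names are assumptions based on the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (Manufacturer TEXT, Product TEXT);
    CREATE TABLE orders (Customer TEXT, Product TEXT);
    INSERT INTO products VALUES
        ('Acme', 'anvil'), ('Acme', 'rocket'), ('Bolt', 'wrench');
    INSERT INTO orders VALUES
        ('alice', 'anvil'), ('alice', 'wrench'), ('bob', 'rocket');
""")

def products_not_ordered_by(customer):
    """Hypothetical helper: anti join restricted to one customer."""
    return conn.execute("""
        SELECT p.Manufacturer, p.Product
        FROM products p
        LEFT JOIN orders o
               ON o.Product = p.Product
              AND o.Customer = ?      -- filter in ON, not in WHERE
        WHERE o.Product IS NULL       -- keep only products with no match
        ORDER BY p.Product
    """, (customer,)).fetchall()

print(products_not_ordered_by("alice"))
print(products_not_ordered_by("bob"))
```

Moving the customer condition into WHERE would discard the NULL-extended rows and turn the query back into an inner join, which is the behaviour the question describes.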
I have created this SQL in order to find customers that haven't ordered for X days.
It is returning a result set, so this post is mainly just to get a second opinion on it, and possible optimizations.
SELECT o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email,
(SELECT COUNT(*)
FROM orders o2
WHERE o2.user_id=o.user_id
AND o2.order_status=1) AS order_count,
(SELECT o4.order_created
FROM orders o4
WHERE o4.user_id=o.user_id
AND o4.order_status=1
ORDER BY o4.order_created DESC LIMIT 1) AS last_order
FROM orders o
INNER JOIN user_identities ui ON o.user_id=ui.user_id
INNER JOIN identities i ON ui.identity_id=i.identity_id
AND i.identity_email!=''
INNER JOIN subscribers s ON i.identity_id=s.identity_id
AND s.subscriber_status=1
AND s.subsriber_type=e
AND s.subscription_id=1
WHERE DATE(o.order_created) = "2013-12-14"
AND o.order_status=1
AND o.user_id NOT IN
(SELECT o3.user_id
FROM orders o3
WHERE o3.user_id=o.user_id
AND o3.order_status=1
AND DATE(o3.order_created) > "2013-12-14")
Can you guys find any potential problems with this SQL? Dates are dynamically inserted.
The final SQL that I put in production, will basically only include o.order_id, i.identity_id and o.order_count - this order_count will need to be correct. The other selected fields and 'last_order' subquery will not be included, it's only for testing.
This should give me a list of users who placed their last order on that particular day and are newsletter subscribers. I am particularly in doubt about the correctness of the NOT IN part in the WHERE clause, and the order_count subquery.
There are several problems:
A. Using functions on indexable columns
You are searching for orders by comparing DATE(order_created) with some constant. This is a terrible idea, because a) the DATE() function is executed for every row (CPU) and b) the database can't use an index on the column (assuming one existed).
B. Using WHERE ID NOT IN (...)
Using a NOT IN (...) is almost always a bad idea, because optimizers usually have trouble with this construct, and often get the plan wrong. You can almost always express it as an outer join with a WHERE condition that filters for misses using an IS NULL condition for a joined column (and adds the side benefit of not needing DISTINCT, because there's only ever one miss returned)
C. Leaving joins that filter out large portions of rows too late
The earlier you can mask off rows by not making joins the better. You can do this by joining less-likely-to-match tables earlier in the joined table list, and by putting non-key conditions into the join rather than the where clause to get the rows excluded as early as possible. Some optimizers do this anyway, but I've often found they don't.
D. Avoid correlated subqueries like the plague!
You have several correlated subqueries - ones that are executed for every row of the main table. That's really an incredibly bad idea. Again, sometimes the optimizer can craft them into a join, but why rely (hope) on that? Most correlated subqueries can be expressed as a join; your examples are no exception.
With the above in mind, there are some specific changes:
o2 and o4 are the same join, so o4 may be dispensed with entirely - just use o2 after conversion to a join
DATE(order_created) = "2013-12-14" should be written as order_created between "2013-12-14 00:00:00" and "2013-12-14 23:59:59"
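A quick way to convince yourself that the range form selects the same rows as the DATE() form is to run both on boundary values. The sketch below does that with Python's built-in sqlite3 module standing in for MySQL; the rows and the helper `ids` are made up. The third, half-open variant (`>=` the day, `<` the next day) is not from the answer above; it is shown only as a possible alternative that also covers timestamps with fractional seconds.

```python
import sqlite3

# In-memory stand-in; rows chosen to sit on the day boundaries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_created TEXT);
    CREATE INDEX idx_orders_created ON orders (order_created);
    INSERT INTO orders VALUES
        (1, '2013-12-13 23:59:59'),
        (2, '2013-12-14 00:00:00'),
        (3, '2013-12-14 12:30:00'),
        (4, '2013-12-14 23:59:59'),
        (5, '2013-12-15 00:00:00');
""")

def ids(where_clause):
    """Hypothetical helper: order ids matching a WHERE clause."""
    sql = ("SELECT order_id FROM orders WHERE " + where_clause
           + " ORDER BY order_id")
    return [row[0] for row in conn.execute(sql)]

# Function applied to the column: every row has to be evaluated.
by_function = ids("DATE(order_created) = '2013-12-14'")

# Plain range on the bare column, as suggested above.
by_between = ids(
    "order_created BETWEEN '2013-12-14 00:00:00' AND '2013-12-14 23:59:59'"
)

# Half-open range: an alternative spelling of the same day.
by_half_open = ids(
    "order_created >= '2013-12-14 00:00:00' "
    "AND order_created < '2013-12-15 00:00:00'"
)

print(by_function, by_between, by_half_open)
```

All three select orders 2, 3 and 4 here. Whether the range forms actually use an index depends on the engine and the schema, so it is worth confirming with EXPLAIN on the real table.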
This query should be what you want:
SELECT
o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email,
count(o2.user_id) AS order_count,
max(o2.order_created) AS last_order
FROM orders o
LEFT JOIN orders o2 ON o2.user_id = o.user_id AND o2.order_status=1
LEFT JOIN orders o3 ON o3.user_id = o.user_id
AND o3.order_status=1
AND o3.order_created >= "2013-12-15 00:00:00"
JOIN user_identities ui ON o.user_id=ui.user_id
JOIN identities i ON ui.identity_id=i.identity_id AND i.identity_email != ''
JOIN subscribers s ON i.identity_id=s.identity_id
AND s.subscriber_status=1
AND s.subsriber_type=e
AND s.subscription_id=1
WHERE o.order_created between "2013-12-14 00:00:00" and "2013-12-14 23:59:59"
AND o.order_status=1
AND o3.order_created IS NULL -- This gets only missed joins on o3
GROUP BY
o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email;
The o3.order_created IS NULL condition in the WHERE clause is how you achieve the same as NOT IN (...) using a LEFT JOIN.
Disclaimer: Not tested.
Can't really comment on the results as you have not posted any table declarations or example data, but your query has 3 correlated sub queries, which is likely to make it perform poorly (OK, one of those is for last_order and is only for testing).
Eliminating the correlated sub queries and replacing them with joins would give something like this:-
SELECT o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email,
Sub1.order_count,
Sub2.last_order
FROM orders o
INNER JOIN user_identities ui ON o.user_id=ui.user_id
INNER JOIN identities i ON ui.identity_id=i.identity_id
AND i.identity_email!=''
INNER JOIN subscribers s ON i.identity_id=s.identity_id
AND s.subscriber_status=1
AND s.subsriber_type=e
AND s.subscription_id=1
LEFT OUTER JOIN
(
SELECT user_id, COUNT(*) AS order_count
FROM orders
WHERE order_status=1
GROUP BY user_id
) Sub1
ON o.user_id = Sub1.user_id
LEFT OUTER JOIN
(
SELECT user_id, MAX(order_created) as last_order
FROM orders
WHERE order_status=1
GROUP BY user_id
) AS Sub2
ON o.user_id = Sub2.user_id
LEFT OUTER JOIN
(
SELECT DISTINCT user_id
FROM orders
WHERE order_status=1
AND DATE(order_created) > "2013-12-14"
) Sub3
ON o.user_id = Sub3.user_id
WHERE DATE(o.order_created) = "2013-12-14"
AND o.order_status=1
AND Sub3.user_id IS NULL
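The core of this rewrite, replacing per-row correlated subqueries with a join to a pre-aggregated derived table, can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module instead of MySQL, with invented rows, and as a simplification it folds the count and the latest date into a single derived table (named `agg` here) rather than the separate Sub1 and Sub2 above.

```python
import sqlite3

# In-memory stand-in; only the columns needed for the comparison.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        order_status INTEGER,
        order_created TEXT
    );
    INSERT INTO orders VALUES
        (1, 100, 1, '2013-12-01 10:00:00'),
        (2, 100, 1, '2013-12-14 09:00:00'),
        (3, 100, 0, '2013-12-20 09:00:00'),
        (4, 200, 1, '2013-12-14 11:00:00'),
        (5, 200, 1, '2013-12-16 08:00:00');
""")

# Correlated form: both subqueries are evaluated once per outer row.
correlated = conn.execute("""
    SELECT o.order_id,
           (SELECT COUNT(*) FROM orders o2
             WHERE o2.user_id = o.user_id
               AND o2.order_status = 1) AS order_count,
           (SELECT MAX(o4.order_created) FROM orders o4
             WHERE o4.user_id = o.user_id
               AND o4.order_status = 1) AS last_order
    FROM orders o
    WHERE o.order_status = 1
    ORDER BY o.order_id
""").fetchall()

# Join form: aggregate once per user, then join the result back.
joined = conn.execute("""
    SELECT o.order_id, agg.order_count, agg.last_order
    FROM orders o
    LEFT OUTER JOIN (
        SELECT user_id,
               COUNT(*) AS order_count,
               MAX(order_created) AS last_order
        FROM orders
        WHERE order_status = 1
        GROUP BY user_id
    ) AS agg ON agg.user_id = o.user_id
    WHERE o.order_status = 1
    ORDER BY o.order_id
""").fetchall()

print(correlated == joined)
```

Both forms return the same rows on this data; which one is faster on a real table is something only the execution plan and a measurement can tell.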
I'm writing a query for an application that needs to list all the products with the number of times they have been purchased.
I came up with this and it works, but I am not too sure how optimized it is. My SQL is really rusty due to my heavy usage of ORMs, but in this case a query is a much more elegant solution.
Can you spot anything wrong (approach wise) with the query?
SELECT products.id,
products.long_name AS name,
count(oi.order_id) AS sold
FROM products
LEFT OUTER JOIN
( SELECT * FROM orderitems
INNER JOIN orders ON orderitems.order_id = orders.id
AND orders.paid = 1 ) AS oi
ON oi.product_id = products.id
GROUP BY products.id
The schema (with relevant fields) looks like this:
*orders* id, paid
*orderitems* order_id, product_id
*products* id
UPDATE
This is for MySQL
I'm not sure about the "(SELECT *" ... business.
This executes (always a good start) and I think is equivalent to what was posted.
SELECT products.id,
products.long_name AS name,
count(oi.order_id) AS sold
FROM products
LEFT OUTER JOIN
orderitems AS oi
INNER JOIN
orders
ON oi.order_id = orders.id AND orders.paid = 1
ON oi.product_id = products.id
GROUP BY products.id
Here is a solution for those of us who are nesting impaired. (I get so confused when I start nesting joins.)
SELECT products.id,
products.long_name AS name,
count(oi.order_id) AS sold
FROM orders
INNER JOIN orderitems AS oi ON oi.order_id = orders.id AND orders.paid = 1
RIGHT JOIN products ON oi.product_id = products.id
GROUP BY products.id
However, I tested your solution, Mike's and mine on MS SQL Server and the query plans are identical. I can't speak for MySql but if MS SQL Server is anything to go by, you may find the performance of all three solutions equivalent. If that is the case I guess you pick which solution is clearest to you.
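Whichever form you pick, the detail that matters for correctness is where the paid = 1 filter sits. The sketch below, run on Python's built-in sqlite3 module with invented rows rather than on MySQL or SQL Server, compares the derived-table form (with an explicit column list instead of SELECT *) against a variant that filters in WHERE after the outer join; the second one silently drops products that were never sold.

```python
import sqlite3

# In-memory stand-in for the posted schema; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, long_name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, paid INTEGER);
    CREATE TABLE orderitems (order_id INTEGER, product_id INTEGER);
    INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
    INSERT INTO orders VALUES (10, 1), (11, 0), (12, 1);
    INSERT INTO orderitems VALUES (10, 1), (12, 1), (11, 2), (10, 2);
""")

# Filter applied inside the joined side: unsold products keep a row
# with NULLs, and COUNT(oi.order_id) counts those as zero.
filter_in_join = conn.execute("""
    SELECT products.id, products.long_name AS name,
           COUNT(oi.order_id) AS sold
    FROM products
    LEFT OUTER JOIN (
        SELECT orderitems.order_id, orderitems.product_id
        FROM orderitems
        INNER JOIN orders ON orderitems.order_id = orders.id
                         AND orders.paid = 1
    ) AS oi ON oi.product_id = products.id
    GROUP BY products.id
    ORDER BY products.id
""").fetchall()

# Filter applied in WHERE after the outer joins: the NULL-extended rows
# fail o.paid = 1, so products without paid sales vanish from the result.
filter_in_where = conn.execute("""
    SELECT products.id, COUNT(oi.order_id) AS sold
    FROM products
    LEFT OUTER JOIN orderitems AS oi ON oi.product_id = products.id
    LEFT OUTER JOIN orders AS o ON oi.order_id = o.id
    WHERE o.paid = 1
    GROUP BY products.id
    ORDER BY products.id
""").fetchall()

print(filter_in_join)
print(filter_in_where)
```

On this data the first query lists all three products, with Gizmo at zero, while the second returns only the two products that have at least one paid sale.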
Does it give you the right answer?
Except for just modifying it to get rid of the SELECT in the inner query, I don't see anything wrong with it.
Well you have "LEFT OUTER JOIN" that can be a performance issue depending on your Database.
Last time I remember it caused hell on MySQL, and it doesn't exist in SQLite. I think Oracle can handle it ok, and I guess DB and MSSQL too.
EDIT: If I remember correctly LEFT OUTER JOIN can be orders of magnitude slower on MySQL, but please correct me if I'm outdated here :)
Untested code, but try it:
SELECT products.id,
MIN(products.long_name) AS name,
count(oi.order_id) AS sold
FROM (products
LEFT OUTER JOIN orderitems AS oi ON oi.product_id = products.id)
INNER JOIN orders AS o ON oi.order_id = o.id
WHERE o.paid = 1
GROUP BY products.id
I don't know if the parentheses are needed for the LEFT OUTER JOIN, nor whether MySQL allows multiple joins; however, the MIN(products.long_name) gives just the description, since for every products.id you have only one description.
Perhaps the parentheses need to be around the INNER JOIN.
Here's a subquery form.
SELECT
p.id,
p.long_name AS name,
(SELECT COUNT(*) FROM OrderItems oi WHERE oi.product_id = p.id AND oi.order_id in
(SELECT o.id FROM Orders o WHERE o.Paid = 1)
) as sold
FROM Products p
It should perform roughly equivalent to the join form. If it doesn't, let me know.