I'm writing a query for an application that needs to list all the products with the number of times they have been purchased.
I came up with this and it works, but I am not too sure how optimized it is. My SQL is really rusty due to my heavy usage of ORMs, but in this case a query is a much more elegant solution.
Can you spot anything wrong (approach wise) with the query?
SELECT products.id,
products.long_name AS name,
count(oi.order_id) AS sold
FROM products
LEFT OUTER JOIN
( SELECT * FROM orderitems
INNER JOIN orders ON orderitems.order_id = orders.id
AND orders.paid = 1 ) AS oi
ON oi.product_id = products.id
GROUP BY products.id
The schema (with relevant fields) looks like this:
*orders* id, paid
*orderitems* order_id, product_id
*products* id
UPDATE
This is for MySQL
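To make the expected result concrete, here is a minimal sketch that runs the query against a tiny data set. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, the two engines can differ), the sample rows are invented, and the derived table lists only the two columns the outer query needs instead of SELECT *.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products   (id INTEGER PRIMARY KEY, long_name TEXT);
    CREATE TABLE orders     (id INTEGER PRIMARY KEY, paid INTEGER);
    CREATE TABLE orderitems (order_id INTEGER, product_id INTEGER);
    INSERT INTO products   VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
    INSERT INTO orders     VALUES (10, 1), (11, 0), (12, 1);
    INSERT INTO orderitems VALUES (10, 1), (11, 1), (12, 1), (12, 2);
""")

# Same shape as the query in the question: paid order items are LEFT JOINed
# onto products, so products that never sold still show up with a count of 0.
sold = conn.execute("""
    SELECT products.id, products.long_name AS name, count(oi.order_id) AS sold
    FROM products
    LEFT OUTER JOIN
        (SELECT orderitems.order_id, orderitems.product_id
         FROM orderitems
         INNER JOIN orders ON orderitems.order_id = orders.id
                          AND orders.paid = 1) AS oi
      ON oi.product_id = products.id
    GROUP BY products.id
    ORDER BY products.id
""").fetchall()

print(sold)
```

The unpaid order (id 11) is not counted, and 'Gizmo', which has no order items at all, is still listed.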
I'm not sure about the "(SELECT *" ... business.
This executes (always a good start) and I think is equivalent to what was posted.
SELECT products.id,
products.long_name AS name,
count(oi.order_id) AS sold
FROM products
LEFT OUTER JOIN
orderitems AS oi
INNER JOIN
orders
ON oi.order_id = orders.id AND orders.paid = 1
ON oi.product_id = products.id
GROUP BY products.id
Here is a solution for those of us who are nesting impaired. (I get so confused when I start nesting joins.)
SELECT products.id,
products.long_name AS name,
count(oi.order_id) AS sold
FROM orders
INNER JOIN orderitems AS oi ON oi.order_id = orders.id AND orders.paid = 1
RIGHT JOIN products ON oi.product_id = products.id
GROUP BY products.id
However, I tested your solution, Mike's and mine on MS SQL Server and the query plans are identical. I can't speak for MySQL, but if MS SQL Server is anything to go by, you may find the performance of all three solutions equivalent. If that is the case, I guess you pick whichever solution is clearest to you.
Does it give you the right answer?
Except for just modifying it to get rid of the SELECT in the inner query, I don't see anything wrong with it.
Well you have "LEFT OUTER JOIN" that can be a performance issue depending on your Database.
Last time I remember it caused hell on MySQL. SQLite does support LEFT OUTER JOIN, though older versions lack RIGHT and FULL OUTER JOIN. I think Oracle can handle it OK, and I guess DB2 and MSSQL too.
EDIT: If I remember correctly LEFT OUTER JOIN can be orders of magnitude slower on MySQL, but please correct me if I'm outdated here :)
Untested code, but try it:
SELECT products.id,
MIN(products.long_name) AS name,
count(oi.order_id) AS sold
FROM (products
LEFT OUTER JOIN orderitems AS oi ON oi.product_id = products.id)
INNER JOIN orders AS o ON oi.order_id = o.id
WHERE o.paid = 1
GROUP BY products.id
I don't know whether the parentheses are needed for the LEFT OUTER JOIN, or whether MySQL allows multiple joins. The MIN(products.long_name) simply returns the description, since each products.id has only one description.
Perhaps the parentheses need to be around the INNER JOIN.
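One thing worth checking with this shape of query is what happens to products that were never sold. The sketch below (Python's sqlite3 as a stand-in for MySQL, invented sample rows) suggests that an INNER JOIN plus a WHERE filter placed after the LEFT OUTER JOIN drops those products. Chaining two LEFT JOINs with the paid check in the ON clause is a different technique from the answer above, shown here only as one possible way to keep them.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products   (id INTEGER PRIMARY KEY, long_name TEXT);
    CREATE TABLE orders     (id INTEGER PRIMARY KEY, paid INTEGER);
    CREATE TABLE orderitems (order_id INTEGER, product_id INTEGER);
    INSERT INTO products   VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
    INSERT INTO orders     VALUES (10, 1), (11, 0), (12, 1);
    INSERT INTO orderitems VALUES (10, 1), (11, 1), (12, 1), (12, 2);
""")

# INNER JOIN + WHERE after the LEFT JOIN: rows with no matching paid order
# are filtered out, so the unsold product 3 disappears from the result.
where_form = conn.execute("""
    SELECT p.id, count(oi.order_id) AS sold
    FROM products AS p
    LEFT OUTER JOIN orderitems AS oi ON oi.product_id = p.id
    INNER JOIN orders AS o ON oi.order_id = o.id
    WHERE o.paid = 1
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

# Alternative: keep both joins as LEFT JOINs, put the paid check in ON,
# and count the order id, which is NULL for unpaid or missing orders.
on_form = conn.execute("""
    SELECT p.id, count(o.id) AS sold
    FROM products AS p
    LEFT OUTER JOIN orderitems AS oi ON oi.product_id = p.id
    LEFT OUTER JOIN orders AS o ON oi.order_id = o.id AND o.paid = 1
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(where_form)
print(on_form)
```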
Here's a subquery form.
SELECT
p.id,
p.long_name AS name,
(SELECT COUNT(*) FROM OrderItems oi WHERE oi.product_id = p.id AND oi.order_id in
(SELECT o.id FROM Orders o WHERE o.Paid = 1)
) as sold
FROM Products p
It should perform roughly equivalent to the join form. If it doesn't, let me know.
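A small sketch of the subquery form, run with Python's sqlite3 as a stand-in for MySQL and invented sample rows. Per the schema in the question, product_id lives on orderitems, so the correlation with the outer product is written against that table here.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products   (id INTEGER PRIMARY KEY, long_name TEXT);
    CREATE TABLE orders     (id INTEGER PRIMARY KEY, paid INTEGER);
    CREATE TABLE orderitems (order_id INTEGER, product_id INTEGER);
    INSERT INTO products   VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
    INSERT INTO orders     VALUES (10, 1), (11, 0), (12, 1);
    INSERT INTO orderitems VALUES (10, 1), (11, 1), (12, 1), (12, 2);
""")

# Correlated scalar subquery: for each product, count its order items
# whose order is marked as paid.
sold_subquery = conn.execute("""
    SELECT p.id,
           p.long_name AS name,
           (SELECT COUNT(*)
            FROM orderitems oi
            WHERE oi.product_id = p.id
              AND oi.order_id IN (SELECT o.id FROM orders o WHERE o.paid = 1)
           ) AS sold
    FROM products p
    ORDER BY p.id
""").fetchall()

print(sold_subquery)
```

No GROUP BY is needed in this form because the outer query returns exactly one row per product.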
Related
The following is the image of the database,
I want to get the names of the Products along with the total quantity in which they are purchased. For the purpose, I have written the following query which is giving correct results,
select (select p.ProductName from Product p where p.id = o.Productid) 'Product Name', sum(o.quantity) 'Total Quantity' from orderitem o group by productid ORDER BY 'Total Quantity' desc
However, I believe that it is traversing the product table for every order item, which is inefficient in terms of time complexity. Changing this from a subquery to a join might resolve the issue, but I can't figure out how to rewrite the query as a join.
You're correct in that using a JOIN is almost always more efficient in SQL. Your query would look something like:
SELECT
p.productName
, SUM(oi.quantity) AS total_quantity
FROM orderitem oi
LEFT JOIN product p
ON oi.ProductId = p.Id
GROUP BY p.productName
Note I'm using a LEFT JOIN here as it's usually best practice to start from your fact table (orderitem in this case) and left join onto your lookup tables (product in this case). This will give you a row with a NULL product name and the total quantity of all unmatched items, which is always a good thing to at least look at. But depending on your use case, you may want a different kind of join.
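To illustrate the NULL row, here is a minimal sketch using Python's sqlite3 as a stand-in for the asker's database. The sample rows are invented, including one order item whose ProductId (99) has no matching product.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the question,
# the data (including the orphaned ProductId 99) is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product   (Id INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE orderitem (ProductId INTEGER, Quantity INTEGER);
    INSERT INTO product   VALUES (1, 'Tea'), (2, 'Coffee');
    INSERT INTO orderitem VALUES (1, 2), (1, 3), (2, 4), (99, 1);
""")

query = """
    SELECT p.ProductName, SUM(oi.Quantity) AS total_quantity
    FROM orderitem oi
    {join} product p ON oi.ProductId = p.Id
    GROUP BY p.ProductName
    ORDER BY total_quantity DESC
"""

# LEFT JOIN keeps the unmatched order item as a row with a NULL name.
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
# INNER JOIN silently drops it.
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

print(left_rows)
print(inner_rows)
```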
Here is your solution, try following query using join
Select P.ProductName 'Product Name',
Sum(O.Quantity) 'Total Quantity'
From PRODUCT P
Join ORDERITEM O ON O.ProductId = P.Id
Group By P.ID,P.ProductName
SELECT p.ProductName "Product Name", sum(o.quantity) "Total Quantity"
FROM Product p
JOIN orderitem o
ON p.id = o.Productid
GROUP BY p.id
ORDER BY "Total Quantity" desc
I have come up with two queries, both use an inner join on two different tables.
Query 1
SELECT PRODUCTS.CODE, PRODUCTS.REFERENCE, PRODUCTS.TAXCAT, PRODUCTS.DISPLAY,PRODUCTS.NAME, PRODUCTS.PRICEBUY, PRODUCTS.PRICESELL, CATEGORIES.NAME AS CATEGORY
FROM PRODUCTS INNER JOIN CATEGORIES ON PRODUCTS.CATEGORY = CATEGORIES.ID;
Query 2
SELECT PRODUCTS.CODE, PRODUCTS.REFERENCE, PRODUCTS.TAXCAT, PRODUCTS.DISPLAY,PRODUCTS.NAME, PRODUCTS.PRICEBUY, PRODUCTS.PRICESELL,STOCKCURRENT.UNITS AS UNIT FROM PRODUCTS INNER JOIN STOCKCURRENT ON STOCKCURRENT.PRODUCT = PRODUCTS.ID;
Both queries run fine on their own, when I try to use both inner joins together I get errors. This is what I came up with on my own. I'm having trouble understanding the syntax to achieve this.
SELECT PRODUCTS.CODE, PRODUCTS.REFERENCE, PRODUCTS.TAXCAT,
PRODUCTS.DISPLAY,PRODUCTS.NAME, PRODUCTS.PRICEBUY,
PRODUCTS.PRICESELL,STOCKCURRENT.UNITS AS UNIT FROM PRODUCTS INNER JOIN
STOCKCURRENT ON STOCKCURRENT.PRODUCT = PRODUCTS.ID, CATEGORIES.NAME AS
CATEGORY FROM PRODUCTS INNER JOIN CATEGORIES ON PRODUCTS.CATEGORY =
CATEGORIES.ID;
Thank you.
Your attempted query has several syntax problems. Assuming you just want to join together the three tables, you may try the following query:
SELECT
p.CODE,
p.REFERENCE,
p.TAXCAT,
p.DISPLAY,
p.NAME,
p.PRICEBUY,
p.PRICESELL,
s.UNITS AS UNIT,
c.NAME AS CATEGORY
FROM PRODUCTS p
INNER JOIN STOCKCURRENT s
ON s.PRODUCT = p.ID
INNER JOIN CATEGORIES c
ON p.CATEGORY = c.ID;
Note that I introduced table aliases here. These aliases can be used elsewhere in the query to avoid having to repeat the entire table name.
By the way, I can also see taking a union of your two original queries. But without expected output, it was not entirely clear what you want.
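As a quick check of the three-table join, here is a sketch with Python's sqlite3 and invented sample rows (only a few of the columns from the question are included). It also shows one consequence of using INNER JOIN for both steps: a product with no STOCKCURRENT row does not appear in the result.

```python
import sqlite3

# In-memory SQLite database; table names follow the question, the rows
# and the reduced column list are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CATEGORIES   (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE PRODUCTS     (ID INTEGER PRIMARY KEY, CODE TEXT,
                               NAME TEXT, CATEGORY INTEGER);
    CREATE TABLE STOCKCURRENT (PRODUCT INTEGER, UNITS INTEGER);
    INSERT INTO CATEGORIES   VALUES (1, 'Drinks'), (2, 'Snacks');
    INSERT INTO PRODUCTS     VALUES (1, 'P001', 'Cola', 1),
                                    (2, 'P002', 'Chips', 2),
                                    (3, 'P003', 'Water', 1);
    INSERT INTO STOCKCURRENT VALUES (1, 12), (2, 5);
""")

# Each INNER JOIN adds one table; a single SELECT list and a single FROM
# clause cover all three tables.
joined = conn.execute("""
    SELECT p.CODE, p.NAME, s.UNITS AS UNIT, c.NAME AS CATEGORY
    FROM PRODUCTS p
    INNER JOIN STOCKCURRENT s ON s.PRODUCT = p.ID
    INNER JOIN CATEGORIES c ON p.CATEGORY = c.ID
    ORDER BY p.CODE
""").fetchall()

print(joined)
```

'Water' is missing because it has no stock row. If such products should still be listed, a LEFT JOIN onto STOCKCURRENT would be one option.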
I want to expand UI on my CodeIgniter shop with suggestions on what other people bought with the current product (either when viewing product or when product is put in the cart, irrelevant now for the question).
I came up with this query (the orders table contains order details, order_items contains the products in a specific order via a foreign key, and the prd alias is for the products table, where all the important info about a product is stored).
Query looks like this
SELECT
pr.product_id,
COUNT(*) AS num,
prd.*
FROM
orders AS o
INNER JOIN order_items AS po ON o.id = po.order_id
INNER JOIN order_items AS pr ON o.id = pr.order_id
INNER JOIN products AS prd ON pr.product_id = prd.id
WHERE
po.product_id = '14211'
AND pr.product_id <> '14211'
GROUP BY
pr.product_id
ORDER BY
num DESC
LIMIT 3
It works nice and dandy: query time is around 0.030 seconds, and it returns the products that were bought together with the one I am currently viewing.
As for the questions and considerations: the Percona query analyzer complains about two things, "Non-deterministic GROUP BY" and "GROUP BY or ORDER BY on different tables". I need both clauses so that the most relevant items come out on top, but I have no idea how to fix the warnings, or whether I should even be bothered by them.
The second question is about performance. Since this query uses a temporary table and a filesort, I was thinking of creating a view from it and using the view instead of executing the query each time a product is opened.
Mind you that I am not asking for CI model/view/controller tips, just tips on how to optimize this query, and/or suggestions regarding performance and going for views approach...
Any help is much appreciated.
SELECT p.num, prd.*
FROM
(
SELECT a.product_id, COUNT(*) AS num
FROM orders AS o
INNER JOIN order_items AS b ON o.id = b.order_id
INNER JOIN order_items AS a ON o.id = a.order_id
WHERE b.product_id = '14211'
AND a.product_id <> '14211'
GROUP BY a.product_id
ORDER BY num DESC
LIMIT 3
) AS p
JOIN products AS prd ON p.product_id = prd.id
ORDER BY p.num DESC
This should:
run faster (especially as your data grows),
avoid the GROUP BY complaint,
not over-inflate the count,
etc.
Ignore the complaint about GROUP BY and ORDER BY coming from different tables -- that is a performance issue; you need it.
As for translating that back to CodeIgniter, good luck.
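For anyone who wants to try the rewritten query on a small data set first, here is a sketch using Python's sqlite3 as a stand-in for MySQL. The sample orders and product names are invented, and the product id is passed as a bound parameter rather than a quoted literal.

```python
import sqlite3

# In-memory SQLite database; table names follow the question, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders      (id INTEGER PRIMARY KEY);
    CREATE TABLE order_items (order_id INTEGER, product_id INTEGER);
    CREATE TABLE products    (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO products    VALUES (14211, 'Camera'), (1, 'Tripod'),
                                   (2, 'Bag'), (3, 'Lens');
    INSERT INTO orders      VALUES (100), (101), (102);
    INSERT INTO order_items VALUES (100, 14211), (100, 1), (100, 2),
                                   (101, 14211), (101, 1),
                                   (102, 3), (102, 2);
""")

current = 14211
# The derived table does the counting and the LIMIT on order_items alone;
# products is joined afterwards, so only the top rows are looked up there.
together = conn.execute("""
    SELECT p.num, prd.id, prd.name
    FROM (
        SELECT a.product_id, COUNT(*) AS num
        FROM orders AS o
        INNER JOIN order_items AS b ON o.id = b.order_id
        INNER JOIN order_items AS a ON o.id = a.order_id
        WHERE b.product_id = ?
          AND a.product_id <> ?
        GROUP BY a.product_id
        ORDER BY num DESC
        LIMIT 3
    ) AS p
    JOIN products AS prd ON p.product_id = prd.id
    ORDER BY p.num DESC
""", (current, current)).fetchall()

print(together)
```

'Lens' never shares an order with the camera, so it is not suggested.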
I have a table let's call it products with a list of Manufacturers and Products.
I have a second table let's call it Customer, Orders.
I can do an Inner Join to make a list of all the items from each manufacturer that the customer ordered. Yet trying to do an Inner Join for the items they did not order fails.
I tried an Inner Join with 'Orders.Product != Products.Product', but that only works when the Customer has one order. Once there is more than one order, I get the same list I would get from a plain Inner Join. Any thoughts? I'll try to make a SqlFiddle tonight but was hoping a quick description might help a MySQL / Join expert who has done a 'NOT Inner Join' before...
This is called an anti-join. You can use a left join with an IS NULL check:
select p.*
from products p
left join orders o on p.Product = o.Product
where o.product is null
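Here is a minimal sketch of the anti-join using Python's sqlite3 as a stand-in for MySQL. The Manufacturer and Customer columns and all sample rows are assumptions, since the question does not show the actual schema. The second query is a variation for a single customer: the customer filter goes into the ON clause so that the IS NULL check still sees the unmatched products.

```python
import sqlite3

# In-memory SQLite database; column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (Manufacturer TEXT, Product TEXT);
    CREATE TABLE orders   (Customer TEXT, Product TEXT);
    INSERT INTO products VALUES ('Acme', 'Anvil'), ('Acme', 'Rocket'),
                                ('Bolt', 'Wrench'), ('Bolt', 'Hammer');
    INSERT INTO orders   VALUES ('alice', 'Anvil'), ('alice', 'Wrench'),
                                ('bob', 'Rocket');
""")

# Anti-join as in the answer: products that nobody has ordered.
never_ordered = conn.execute("""
    SELECT p.Manufacturer, p.Product
    FROM products p
    LEFT JOIN orders o ON p.Product = o.Product
    WHERE o.Product IS NULL
    ORDER BY p.Product
""").fetchall()

# Per-customer variation: restrict the right-hand side in the ON clause,
# then keep the products that found no match for that customer.
not_ordered_by_alice = conn.execute("""
    SELECT p.Manufacturer, p.Product
    FROM products p
    LEFT JOIN orders o ON p.Product = o.Product AND o.Customer = ?
    WHERE o.Product IS NULL
    ORDER BY p.Product
""", ("alice",)).fetchall()

print(never_ordered)
print(not_ordered_by_alice)
```

This keeps working when the customer has several orders, which is where the `!=` join condition from the question breaks down.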
I have 3 tables which need to be linked in an SQL statement (I'm using PHP - MySQL if it helps). I need to extract all orders where the vendor field from the third table equals '3', as below:
orders - orders_items - items
order_id -> order_id
item_id -> id
vendor = '3'
There are many ways to do this I believe with various WHERE and JOINS but I'm asking for the most efficient methods in comparison to my method below:
SELECT
orders.order_id
FROM
items, orders
INNER JOIN
orders_items
ON
orders.order_id = orders_items.order_id
WHERE
orders_items.item_id = items.id
AND
items.vendor = '3'
GROUP BY
orders.order_id
Using , notation is not universally considered bad practice, but I think those who still favour it are now quite a minority. Even Oracle (whose users seem to be the most vocal supporters of that syntax) recommends against using it.
But I don't know anyone who would support mixing , and ANSI-92's JOIN syntax. It's just asking for trouble.
SELECT
orders.order_id
FROM
orders
INNER JOIN
orders_items
ON orders.order_id = orders_items.order_id
INNER JOIN
items
ON orders_items.item_id = items.id
WHERE
items.vendor = '3'
GROUP BY
orders.order_id
The SQL Optimiser doesn't execute that exactly as you specified it. SQL is just an expression from which the SQL Optimiser derives a plan to give a result that fits. By writing it as above, the optimiser will find what it sees as the best order in which to filter, join, sort, etc., and the best indexes to use for those things.
EDIT
I've noticed people supporting DISTINCT over GROUP BY.
While DISTINCT is slightly shorter, it is not any quicker, and does place restrictions on you. You can't later add COUNT(*) for example, but with GROUP BY you can.
In short, GROUP BY can do anything DISTINCT can, but that's not true the other way around. I only use DISTINCT in very trivial pieces of code so I can get a whole query on one line. Even then I often later regret it a little as the code develops and I need to revert to GROUP BY.
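A small sketch of that point, using Python's sqlite3 as a stand-in for MySQL with invented sample rows. DISTINCT and GROUP BY return the same order ids, while only the GROUP BY form can be extended with COUNT(*) per order.

```python
import sqlite3

# In-memory SQLite database; table names follow the question, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders       (order_id INTEGER PRIMARY KEY);
    CREATE TABLE items        (id INTEGER PRIMARY KEY, vendor TEXT);
    CREATE TABLE orders_items (order_id INTEGER, item_id INTEGER);
    INSERT INTO orders       VALUES (1), (2), (3);
    INSERT INTO items        VALUES (10, '3'), (11, '3'), (12, '7');
    INSERT INTO orders_items VALUES (1, 10), (1, 11), (2, 12), (3, 11);
""")

joins = """
    FROM orders
    INNER JOIN orders_items ON orders.order_id = orders_items.order_id
    INNER JOIN items ON orders_items.item_id = items.id
    WHERE items.vendor = '3'
"""

# Order 1 contains two vendor-3 items, so the plain join would list it twice;
# both DISTINCT and GROUP BY collapse that to one row per order.
distinct_ids = conn.execute(
    "SELECT DISTINCT orders.order_id" + joins + "ORDER BY orders.order_id"
).fetchall()
grouped_ids = conn.execute(
    "SELECT orders.order_id" + joins
    + "GROUP BY orders.order_id ORDER BY orders.order_id"
).fetchall()

# Only the GROUP BY form can also report how many matching items each order has.
grouped_counts = conn.execute(
    "SELECT orders.order_id, COUNT(*)" + joins
    + "GROUP BY orders.order_id ORDER BY orders.order_id"
).fetchall()

print(distinct_ids, grouped_ids, grouped_counts)
```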
select o.order_id from orders o inner join orders_items oi on o.order_id = oi.order_id inner join items i on oi.item_id = i.id where i.vendor='3';
There are many ways to do this: joins, subqueries, or an IN clause. Which is best depends on whether you care more about time or memory, and above all on which columns are indexed and how much data the joined tables hold.
You don't need the GROUP BY, just make a DISTINCT if you need to remove duplicates:
SELECT DISTINCT o.order_id
FROM orders o
INNER JOIN orders_items oi ON oi.order_id = o.order_id
INNER JOIN items i ON i.id = oi.item_id
where i.vendor = '3'
And also, use INNER JOIN on all tables :)
This is efficient and will work too:
SELECT
DISTINCT(orders.order_id)
FROM
items
INNER JOIN orders_items on (items.id=orders_items.item_id )
inner join orders on (orders.order_id=orders_items.order_id)
WHERE
items.vendor = '3'
SELECT
o.order_id
FROM
orders o
INNER JOIN orders_items oi ON o.order_id = oi.order_id
INNER JOIN items i ON oi.item_id = i.id
WHERE
i.vendor = 3
The table1, table2 syntax isn't something that I've used, but I imagine listing the tables as joins is more efficient as that seems to be the most accepted way.
Also, you don't need to put speech marks on the vendor criteria if the field is an integer.
SELECT O.order_id AS Id
FROM orders O
INNER JOIN orders_items OI
ON O.order_id = OI.order_id
INNER JOIN items I
ON OI.item_id = I.id
WHERE I.vendor = '3'
GROUP BY O.order_id