Find orders that have more than just a specific SKU - mysql

I need to write a SQL query that returns every order number that contains SKU ENROLL together with at least one other SKU. It should not return an order number where SKU ENROLL is the only SKU on the order.
In this example, this order would be included in the query results.
Order 1001, contains SKU ENROLL and SKU 688631.
In this example, this order would NOT be included in the query results.
Order 1003, contains SKU ENROLL
Important to note, when query results are returned they look like this
Here is what I've written thus far but I am not sure on the rest. I've taken feedback from everyone who has responded and tried to incorporate it but haven't had good results.
SELECT VO.DistID, VO.FirstName, VO.LastName, VO.OrderNumber, Email, Quantity, Sku, OrderStatus
FROM dbo.VwOrders AS VO
INNER JOIN dbo.VwDistributor AS VD ON VO.DistID = VD.DistID
INNER JOIN dbo.VwOrderLines AS VOL ON VO.OrderNumber = VOL.OrderNumber
INNER JOIN dbo.VwInventory AS VI ON VOL.ItemNumber = VI.InventoryNo AND VOL.Warehouse = VI.Warehouse
WHERE Sku = 'ENROLL'

There are several different ways to do this. Here's one option using conditional aggregation:
select orderid
from yourtable
group by orderid
having max(case when sku = 'ENROLL' then 1 else 0 end) = 1
and max(case when sku != 'ENROLL' then 1 else 0 end) = 1
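To see the conditional aggregation behave as described, here is a minimal sketch that runs the same HAVING logic against an in-memory SQLite database from Python. The table name yourtable comes from the answer; order 1002 is invented sample data, and SQLite stands in for MySQL, so treat it as an illustration rather than a MySQL test.

```python
import sqlite3

# Hypothetical sample data mirroring the question: order 1001 has ENROLL plus
# another SKU, order 1003 has ENROLL only, order 1002 (invented) has no ENROLL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (orderid INTEGER, sku TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1001, "ENROLL"), (1001, "688631"), (1002, "688631"), (1003, "ENROLL")],
)

QUERY = """
SELECT orderid
FROM yourtable
GROUP BY orderid
HAVING MAX(CASE WHEN sku = 'ENROLL' THEN 1 ELSE 0 END) = 1
   AND MAX(CASE WHEN sku <> 'ENROLL' THEN 1 ELSE 0 END) = 1
ORDER BY orderid
"""

# Only orders that contain ENROLL and at least one other SKU survive.
mixed_orders = [row[0] for row in conn.execute(QUERY)]
print(mixed_orders)  # [1001]
```

Each MAX(CASE ...) acts as a per-order flag, so the HAVING clause keeps an order only when both flags are set.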

select distinct OrderID from yourtable where SKU = 'ENROLL' and OrderID in
(select OrderID from yourtable group by OrderID having count(distinct SKU) > 1)
Explanation: the inner query returns every OrderID that has more than one distinct SKU. The outer query then keeps only those orders that also contain an 'ENROLL' row.
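Whether the inner query counts rows or distinct SKUs matters when an order can list the same SKU on more than one line. A minimal sketch of the difference, using SQLite from Python; the table name order_lines and order 1004 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (orderid INTEGER, sku TEXT)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?)",
    [
        (1001, "ENROLL"), (1001, "688631"),  # ENROLL plus another SKU
        (1003, "ENROLL"),                    # ENROLL only, one line
        (1004, "ENROLL"), (1004, "ENROLL"),  # ENROLL only, two lines (invented)
    ],
)

def enroll_orders(counter):
    """Run the IN-subquery approach with the given aggregate in HAVING."""
    sql = f"""
        SELECT DISTINCT orderid FROM order_lines
        WHERE sku = 'ENROLL'
          AND orderid IN (SELECT orderid FROM order_lines
                          GROUP BY orderid HAVING {counter} > 1)
        ORDER BY orderid
    """
    return [row[0] for row in conn.execute(sql)]

by_rows = enroll_orders("COUNT(*)")             # counts order lines
by_skus = enroll_orders("COUNT(DISTINCT sku)")  # counts different SKUs
print(by_rows)  # [1001, 1004]
print(by_skus)  # [1001]
```

If every (order, SKU) pair is guaranteed to appear only once, both variants return the same orders.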

You can use exists subqueries:
select *
from tbl t
where exists (select 1 from tbl x where x.orderid = t.orderid and x.sku = 'ENROLL')
and exists (select 1 from tbl x where x.orderid = t.orderid and x.sku <> 'ENROLL');

Very interesting answers so far. I really enjoyed reading them. Here is one option from me.
SELECT DISTINCT t1.OrderID
FROM (SELECT * FROM tbl WHERE SKU = 'ENROLL') t1
INNER JOIN (SELECT * FROM tbl WHERE SKU != 'ENROLL') t2
ON t1.OrderID = t2.OrderID
I build two derived tables, one with every order line where SKU = 'ENROLL' and one with every order line where the SKU is different from 'ENROLL', and join them on OrderID to get the intersection of the two. This is what you are looking for, right?

Related

optimizing SQL counts

I have to select a list of Catalogs from one table, and perform counts in two other tables: Stores and Categories. The counters should show how many Stores and Categories are linked to each Catalog.
I have managed to get the functionality I need using this SQL query:
SELECT `catalog`.`id` AS `id`,
`catalog`.`name` AS `name`,
(
SELECT COUNT(*)
FROM `category`
WHERE `category`.`catalog_id` = `catalog`.`id`
AND `category`.`is_archive` = 0
AND `category`.`company_id` = 2
) AS `category_count`,
(
SELECT COUNT(*)
FROM `store`
WHERE `store`.`catalog_id` = `catalog`.`id`
AND `store`.`is_archive` = 0
AND `store`.`company_id` = 2
) AS `store_count`
FROM `catalog`
WHERE `catalog`.`company_id` = 2
AND `catalog`.`is_archive` = 0
ORDER BY `catalog`.`id` ASC;
This works as expected, but I would rather avoid subqueries: they can be slow, and this query may perform badly on large lists. Is there a way to optimize this SQL using JOINs?
Thanks in advance.
You can make this a lot faster by refactoring the dependent subqueries in your SELECT clause into, as you mention, JOINed aggregate subqueries.
The first subquery you can write this way.
SELECT COUNT(*) num, catalog_id, company_id
FROM category
WHERE is_archive = 0
GROUP BY catalog_id, company_id
The second one like this.
SELECT COUNT(*) num, catalog_id, company_id
FROM store
WHERE is_archive = 0
GROUP BY catalog_id, company_id
Then, use those in your main query as if they were tables containing the counts you want.
SELECT catalog.id,
catalog.name,
COALESCE(category.num, 0) category_count,
COALESCE(store.num, 0) store_count
FROM catalog
LEFT JOIN (
SELECT COUNT(*) num, catalog_id, company_id
FROM category
WHERE is_archive = 0
GROUP BY catalog_id, company_id
) category ON catalog.id = category.catalog_id
AND catalog.company_id = category.company_id
LEFT JOIN (
SELECT COUNT(*) num, catalog_id, company_id
FROM store
WHERE is_archive = 0
GROUP BY catalog_id, company_id
) store ON catalog.id = store.catalog_id
AND catalog.company_id = store.company_id
WHERE catalog.is_archive = 0
AND catalog.company_id = 2
ORDER BY catalog.id ASC;
This is faster than your example because each subquery need only run once, rather than once per catalog entry. It also has the nice feature that you only need say WHERE catalog.company_id = 2 once. The MySQL optimizer knows what to do with that.
I suggest LEFT JOIN operations so you'll still see catalog entries even if they're not mentioned in your category or store tables.
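One detail worth checking with the LEFT JOIN approach: a catalog with no matching category or store rows gets NULL from the joined subquery, whereas the correlated COUNT(*) in the question returns 0. A minimal sketch with SQLite from Python, wrapping each count in COALESCE; the sample rows and the aliases c, cat and st are invented, and the schema is reduced to the columns the query uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalog  (id INTEGER, name TEXT, company_id INTEGER, is_archive INTEGER);
CREATE TABLE category (catalog_id INTEGER, company_id INTEGER, is_archive INTEGER);
CREATE TABLE store    (catalog_id INTEGER, company_id INTEGER, is_archive INTEGER);
INSERT INTO catalog VALUES (1, 'Spring', 2, 0), (2, 'Summer', 2, 0), (3, 'Old', 2, 1);
INSERT INTO category VALUES (1, 2, 0), (1, 2, 0), (1, 2, 1);  -- one archived row
INSERT INTO store VALUES (1, 2, 0);
""")

QUERY = """
SELECT c.id, c.name,
       COALESCE(cat.num, 0) AS category_count,
       COALESCE(st.num, 0)  AS store_count
FROM catalog c
LEFT JOIN (SELECT catalog_id, company_id, COUNT(*) AS num
           FROM category WHERE is_archive = 0
           GROUP BY catalog_id, company_id) cat
       ON cat.catalog_id = c.id AND cat.company_id = c.company_id
LEFT JOIN (SELECT catalog_id, company_id, COUNT(*) AS num
           FROM store WHERE is_archive = 0
           GROUP BY catalog_id, company_id) st
       ON st.catalog_id = c.id AND st.company_id = c.company_id
WHERE c.company_id = 2 AND c.is_archive = 0
ORDER BY c.id
"""

# Catalog 2 has no categories or stores, yet still appears with zero counts.
counts = conn.execute(QUERY).fetchall()
print(counts)  # [(1, 'Spring', 2, 1), (2, 'Summer', 0, 0)]
```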
Subqueries are fine, but you can simplify your query:
SELECT c.id, c.name,
(SELECT COUNT(*)
FROM category cg
WHERE cg.catalog_id = c.id AND
cg.is_archive = 0 AND
cg.company_id = c.company_id
) AS category_count,
(SELECT COUNT(*)
FROM store s
WHERE s.catalog_id = c.id AND
s.is_archive = 0 AND
s.company_id = c.company_id
) AS store_count
FROM catalog c
WHERE c.company_id = 2 AND c.is_archive = 0
ORDER BY c.id ASC;
For performance, you want indexes on:
catalog(company_id, is_archive, id)
store(catalog_id, company_id, is_archive)
Because of the filtering in the outer query, a correlated subquery is probably the best performing way to get the results from store.
Also note some changes to the query:
I removed the backticks. They are unnecessary and just clutter the query.
An expression like c.id as id is redundant. The expression is given id as the alias anyway.
I changed the s.company_id = 2 to s.company_id = c.company_id. It seems like a correlation clause.

MySQL Get products also bought with a product / Optimise IN query

I'm trying to write a simple 'customers who bought this also bought...'
I have an order table, which contains orders, and an order_product table which contains all the products relating to an order.
In an attempt to find out the five most popular products that were bought with product_id = 155 I've composed the following query:
select product_id, count(*) as cnt
from order_product
where product_id != 155
and order_id in
(select order_id from order_product where product_id = 155)
group by product_id
order by cnt desc
limit 5;
So the inner query gets a list of all the orders that contain the product I'm interested in (product_id = 155), then the outer query looks for all the products that aren't the same product but are in one of the orders that my product is in.
They are then ordered and limited to the top 5.
I think this works ok but it takes ages - I imagine this is because I'm using IN with a list of a couple of thousand.
I wonder if anyone could point me in the direction of writing it in a more optimised way.
Any help much appreciated.
You could try changing this:
select p1.product_id, count(*) as cnt
To
select p1.product_id, count(distinct p1.order_id) as cnt
And see if that gives you any different result
Edit:
From the comments
If you prefer having the result you generate in your first query, you can try using this:
select a.product_id, count(*) as cnt
from order_product a
join (select distinct order_id from order_product where product_id = 155) b on (a.order_id = b.order_id)
where a.product_id != 155
group by a.product_id
order by cnt desc
limit 5;
A small alteration to your existing query :)
You can try a join instead of a subselect. Something like:
select p1.product_id, count(*) as cnt
from order_product p1 JOIN order_product p2 on p1.order_id = p2.order_id
where p1.product_id != 155
and p2.product_id = 155
group by p1.product_id
order by cnt desc
limit 5;
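As a quick sanity check that the self-join returns the same ranking as the IN subquery, here is a minimal sketch in Python with SQLite. The sample orders are invented, it assumes each (order_id, product_id) pair appears only once, and the extra product_id in ORDER BY is there only to make ties deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_product (order_id INTEGER, product_id INTEGER)")
conn.executemany(
    "INSERT INTO order_product VALUES (?, ?)",
    [
        (1, 155), (1, 10), (1, 20),
        (2, 155), (2, 10),
        (3, 10), (3, 30),            # order without product 155
        (4, 155), (4, 10), (4, 30),
    ],
)

IN_QUERY = """
SELECT product_id, COUNT(*) AS cnt
FROM order_product
WHERE product_id != 155
  AND order_id IN (SELECT order_id FROM order_product WHERE product_id = 155)
GROUP BY product_id
ORDER BY cnt DESC, product_id
LIMIT 5
"""

JOIN_QUERY = """
SELECT p1.product_id, COUNT(*) AS cnt
FROM order_product p1
JOIN order_product p2 ON p1.order_id = p2.order_id
WHERE p1.product_id != 155
  AND p2.product_id = 155
GROUP BY p1.product_id
ORDER BY cnt DESC, p1.product_id
LIMIT 5
"""

via_in = conn.execute(IN_QUERY).fetchall()
via_join = conn.execute(JOIN_QUERY).fetchall()
print(via_join)  # [(10, 3), (20, 1), (30, 1)]
```

If an order could list product 155 on several lines, the join would multiply the counts, which is where count(distinct p1.order_id) or the select distinct subquery from the other answer comes in.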

MySQL nested counts and returning values

I have searched high and low and can't seem to get a way to do what I want. I have a table with some customers, some products and their relationships.
I want to count the amount of returned rows from this part of the query
SELECT id
FROM customer
WHERE customer.name = 'SMITH'
OR customer.name = 'JONES'
I also want to return the ids that match SMITH and JONES (or other customer names chosen). I want to use the count of the returned rows as a variable (denoted as #var). I only want to return the products, id, and count that match my variable.
My questions are:
Is there a way that this can be done in a single SQL query?
Is there a way to return the count as well as the values?
I don't want to have to throw this into a PHP script or the like.
SELECT x.pId, p.productdesc, count(x.dId) as count
FROM
(
SELECT DISTINCT cId, pId
FROM Client
WHERE cId IN
(
SELECT id
FROM customer
WHERE customer.name = 'SMITH'
OR customer.name = 'JONES'
)
)x
JOIN Products p ON x.pId = p.id
GROUP BY x.pId
HAVING count = #var
Thanks,
M
This is sort of a 'literal' answer to what you asked, as you can use subqueries in the HAVING clause. However, with more information (sample data and expected result) there may be a better way of doing what you want.
select x.pid, p.productdesc, count(x.pid) as count
from (select distinct cl.cid, cl.pid
from client cl
join customer cu
on cl.cid = cu.id
where cu.name in ('SMITH', 'JONES')) x
join products p
on x.pid = p.id
group by x.pid, p.productdesc
having count(x.pid) = (select count(*)
from customer
where name in ('SMITH', 'JONES'))
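To illustrate the HAVING subquery, here is a minimal sketch run through SQLite from Python. The rows are invented (customer BROWN and the product names are hypothetical), the count alias is renamed matches, and it assumes customer names are unique; with two customers named SMITH the comparison value would no longer be 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER, name TEXT);
CREATE TABLE products (id INTEGER, productdesc TEXT);
CREATE TABLE client   (cid INTEGER, pid INTEGER);
INSERT INTO customer VALUES (1, 'SMITH'), (2, 'JONES'), (3, 'BROWN');
INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
-- Widget: SMITH (twice) and JONES; Gadget: SMITH and BROWN; Gizmo: JONES only
INSERT INTO client VALUES (1, 1), (1, 1), (2, 1), (1, 2), (3, 2), (2, 3);
""")

QUERY = """
SELECT x.pid, p.productdesc, COUNT(x.cid) AS matches
FROM (SELECT DISTINCT cl.cid, cl.pid
      FROM client cl
      JOIN customer cu ON cl.cid = cu.id
      WHERE cu.name IN ('SMITH', 'JONES')) x
JOIN products p ON x.pid = p.id
GROUP BY x.pid, p.productdesc
HAVING COUNT(x.cid) = (SELECT COUNT(*)
                       FROM customer
                       WHERE name IN ('SMITH', 'JONES'))
"""

# Only products linked to every selected customer pass the HAVING filter.
shared_products = conn.execute(QUERY).fetchall()
print(shared_products)  # [(1, 'Widget', 2)]
```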

MYSQL - get lowest bid from table group by product (results weird)

Ok, I am baffled as to what is going on with my query. What I am trying to do is the following:
Get the lowest bid grouped by product_id and then load in the product information related to that bid.
Currently when I run the query below, it tells me that the bid_id for product_id = 2 is 30, but it's definitely not 30; it should be 120 (although the bid_price value is correct at 29.99):
SELECT lowbid.bid_id, lowbid.bid_price
FROM (SELECT bid_id, min(bid_price) AS bid_price, product_id FROM tbl_products_bid WHERE is_active = 1 AND is_deleted = 0 GROUP BY product_id) AS lowbid;
Since this query gives me seemingly random bid_ids and I am not sure why, I was wondering if a SQL guru could tell me 1. whether I am being totally thick, or 2. whether there is another way to do this, and why I could be getting a bid_id that is not even related to that bid_price.
I have created a SQLFiddle which shows what I mean; any help would be appreciated.
http://sqlfiddle.com/#!2/de77b/14
Also just to let you know that this query is part of another query but I took out the element that I think is giving me an issue (i.e above)
The part of the bigger query is below:
SELECT lowestbid.bid_id, lowestbid.product_id, lowestbid.bid_price as seller_bid_price, seller_description, pb.is_countdown, pb.startdate, pb.enddate
FROM
tbl_products_bid pb
inner JOIN (
SELECT bid_id, product_id, min(bid_price) as bid_price, seller_id, description as seller_description, is_countdown, startdate, enddate from tbl_products_bid where is_active = 1 group by product_id
) AS lowestbid ON pb.bid_id = lowestbid.bid_id
order by lowestbid.bid_price asc
SELECT a.*
FROM tbl_products_bid a
INNER JOIN
(
SELECT product_id, MIN(bid_price) min_price
FROM tbl_products_bid
GROUP BY product_id
) b ON a.product_id = b.product_id AND
a.bid_price = b.min_price
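The join above first finds the minimum price per product and then joins back to the bid table to pick up the row that actually carries that price, which is why the bid_id is no longer arbitrary. A minimal sketch with SQLite from Python; the rows are invented around the bid_ids 30 and 120 mentioned in the question, and the table is cut down to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_products_bid (bid_id INTEGER, product_id INTEGER, bid_price REAL)"
)
conn.executemany(
    "INSERT INTO tbl_products_bid VALUES (?, ?, ?)",
    [(30, 2, 35.50), (120, 2, 29.99), (45, 1, 10.00), (46, 1, 12.50)],
)

QUERY = """
SELECT a.bid_id, a.product_id, a.bid_price
FROM tbl_products_bid a
INNER JOIN (SELECT product_id, MIN(bid_price) AS min_price
            FROM tbl_products_bid
            GROUP BY product_id) b
        ON a.product_id = b.product_id
       AND a.bid_price = b.min_price
ORDER BY a.product_id
"""

# Each product is matched to the row holding its lowest price.
lowest_bids = conn.execute(QUERY).fetchall()
print(lowest_bids)  # [(45, 1, 10.0), (120, 2, 29.99)]
```

Note that two bids tied at the lowest price for a product would both be returned, and that any is_active or is_deleted filter would need to be applied in both the inner and the outer query.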

MySQL "Distinct" join super slow

I have the following query which gives me the right results. But it's super slow.
What makes it slow is the
AND a.id IN (SELECT id FROM st_address GROUP BY element_id)
part. The query should show from which countries we get how many orders.
A person can have multiple addresses, but in this case we only want one, because otherwise the order is counted multiple times. Maybe there is a better way to achieve this? A distinct join on the person or something?
SELECT cou.title_en, COUNT(co.id), SUM(co.price) AS amount
FROM customer_order co
JOIN st_person p ON (co.person_id = p.id)
JOIN st_address a ON (co.person_id = a.element_id AND a.element_type_id = 1)
JOIN st_country cou ON (a.country_id = cou.id)
WHERE order_status_id != 7 AND a.id IN (SELECT id FROM st_address GROUP BY element_id)
GROUP BY cou.id
Have you tried to replace the IN with an EXISTS?
AND EXISTS (SELECT 1 FROM st_address b WHERE a.id = b.id)
The EXISTS part should stop the subquery as soon as the first row matching the condition is found. I have read conflicting comments on if this is actually happening though so you might throw a limit 1 in there to see if you get any gain.
I found a faster solution. The trick is a join with a sub query:
JOIN (SELECT element_id, country_id, id FROM st_address WHERE element_type_id = 1 GROUP BY element_id) AS a ON (o.person_id = a.element_id)
This is the complete query:
SELECT cou.title_en, COUNT(o.id), SUM(o.price) AS amount
FROM customer_order o
JOIN (SELECT element_id, country_id, id FROM st_address WHERE element_type_id = 1 GROUP BY element_id) AS a ON (o.person_id = a.element_id)
JOIN st_country cou ON (a.country_id = cou.id)
WHERE o.order_status_id != 7
GROUP BY cou.id
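The derived table with GROUP BY element_id selects country_id and id without aggregating them, which MySQL only accepts when ONLY_FULL_GROUP_BY is disabled, and which address it picks is not defined. A variant that pins the choice down is to take MIN(id) per person and join back to st_address. This is a swapped-in technique rather than the query above, sketched here with SQLite from Python on invented rows; the alias pick is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_order (id INTEGER, person_id INTEGER, price REAL, order_status_id INTEGER);
CREATE TABLE st_address (id INTEGER, element_id INTEGER, element_type_id INTEGER, country_id INTEGER);
CREATE TABLE st_country (id INTEGER, title_en TEXT);
INSERT INTO st_country VALUES (1, 'Germany'), (2, 'France');
-- person 1 has two addresses, person 2 has one
INSERT INTO st_address VALUES (1, 1, 1, 1), (2, 1, 1, 1), (3, 2, 1, 2);
INSERT INTO customer_order VALUES (1, 1, 10.0, 1), (2, 1, 20.0, 1), (3, 2, 5.0, 1), (4, 2, 7.0, 7);
""")

NAIVE = """
SELECT cou.title_en, COUNT(o.id), SUM(o.price)
FROM customer_order o
JOIN st_address a ON o.person_id = a.element_id AND a.element_type_id = 1
JOIN st_country cou ON a.country_id = cou.id
WHERE o.order_status_id != 7
GROUP BY cou.id, cou.title_en
ORDER BY cou.title_en
"""

ONE_ADDRESS = """
SELECT cou.title_en, COUNT(o.id), SUM(o.price)
FROM customer_order o
JOIN (SELECT element_id, MIN(id) AS id      -- one address per person
      FROM st_address
      WHERE element_type_id = 1
      GROUP BY element_id) pick ON pick.element_id = o.person_id
JOIN st_address a ON a.id = pick.id
JOIN st_country cou ON a.country_id = cou.id
WHERE o.order_status_id != 7
GROUP BY cou.id, cou.title_en
ORDER BY cou.title_en
"""

naive = conn.execute(NAIVE).fetchall()
deduped = conn.execute(ONE_ADDRESS).fetchall()
print(naive)    # [('France', 1, 5.0), ('Germany', 4, 60.0)]
print(deduped)  # [('France', 1, 5.0), ('Germany', 2, 30.0)]
```

The naive join counts each of person 1's orders once per address, which is the double counting the question describes.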