Getting the most sold product per supplier in sql - mysql

I am struggling with this question in SQL, using MySQL:
I have to return the most-sold product per supplier from the popular open source database NORTHWIND: https://northwinddatabase.codeplex.com
What I have written so far is:
SELECT products.SupplierID ,`order details`.ProductID, count(*) as NumSales FROM `order details`
JOIN products ON `order details`.ProductID = products.ProductID
JOIN orders ON `order details`.OrderID = orders.OrderID
WHERE `order details`.OrderID
IN
(SELECT OrderID FROM orders
WHERE MONTH(OrderDate) = 7 AND YEAR(orderDate) = 1997)
group by products.SupplierID , `order details`.ProductID
ORDER BY NumSales desc
;
The result is fine as far as it goes, but I need to return only the top product per supplier: for example, for Supplier 1 that is Product 1, since it was sold 3 times (in 7/1997).
Adding to the start:
SELECT SupplierID, ProductID, MAX(b.NumSales)
FROM( ... )
gets me closer, but it gives me the highest across all suppliers and not the highest for every supplier.
Help will be great.
P.S.
This question is similar but not the same, and it didn't completely help me.

Please treat this as a pseudo answer and work with it as you will... I appreciate that you are putting in the time to learn this.
select supplier_id, max(num_sales) max_sales
from (put your select statement here)
group by supplier_id
This now gives you the max num_sales for each supplier. Something like
supplier_id max_sales
1 3
2 1
3 2
4 2
Now join this back to your original query to get the product data for whatever matches the max.
select a.supplier_id, b.product_id, a.max_sales
from
(select supplier_id, max(num_sales) max_sales
from (put your select statement here)
group by supplier_id) a
inner join
(your original query again) b
on a.supplier_id = b.supplier_id
and a.max_sales = b.num_sales
As you learn SQL, you will see that there are usually hundreds of valid working scripts that will give you the answer you want... your job is to find the script that is the quickest to write, the most efficient to run, and meets the criteria of your task. The advantage of the method shown here is that it will display multiple records in the event of a tie (supplier_id = 2 has two products that both have a max sales of one; this query returns both of those rows).
Just as additional info: other databases allow common table expressions (the WITH clause), but MySQL versions before 8.0 do not. See: How do you use the "WITH" clause in MySQL? Where CTEs are available, you can simplify this script further.
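To make the join-back pattern concrete, here is a minimal sketch that runs it against a tiny made-up dataset. It uses Python's built-in sqlite3 module rather than MySQL so that it is self-contained; the sales table and its rows are hypothetical stand-ins for "your original query", not the NORTHWIND data. Note that each derived table gets an alias, which MySQL requires.

```python
import sqlite3

# Hypothetical miniature data: one row per sale, (supplier_id, product_id).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (supplier_id INTEGER, product_id INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 1), (1, 2),  # supplier 1: product 1 sold 3x, product 2 once
     (2, 3), (2, 4)],                 # supplier 2: products 3 and 4 tie at 1 sale
)

# Stand-in for "your original query": sales counted per supplier and product.
per_product = """
    SELECT supplier_id, product_id, COUNT(*) AS num_sales
    FROM sales
    GROUP BY supplier_id, product_id
"""

# Derived table a holds the per-supplier maximum; joining it back to b
# recovers the product(s) that reach that maximum.
query = f"""
    SELECT a.supplier_id, b.product_id, a.max_sales
    FROM (SELECT supplier_id, MAX(num_sales) AS max_sales
          FROM ({per_product}) AS counted
          GROUP BY supplier_id) AS a
    INNER JOIN ({per_product}) AS b
        ON a.supplier_id = b.supplier_id
       AND a.max_sales = b.num_sales
    ORDER BY a.supplier_id, b.product_id
"""
top_products = conn.execute(query).fetchall()
print(top_products)
```

Supplier 2 comes back with two rows because both of its products tie for the maximum, which is the tie behaviour described above.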

Related

Need the list of products and orders

This is my homework task:
Products which were ordered along with the 5 most ordered products more than once and the count of orders they were included in. (Do not include the 5 most ordered products in the final result)
Products and orders are in same table. Order detail contain Order detail ID, order id, product id, quantity.
I've tried everything but I'm struggling with "along with" statement in the query.
Here is a query I have tried:
select
productid,
count
(
(select productid from orderdetails)
and
(select productid from orderdetails order by quantity desc limit 5)
) as ORDERS
from orderdetails
group by productid
order by ORDERS desc
You select from orderdetails, aggregate to get one result row per product, and count. It is very common to count rows with COUNT(*), but you can also count expressions, e.g. COUNT(mycolumn), which counts only the rows where the expression is not null. You are counting an expression (because you are using COUNT(something else) rather than COUNT(*)). The expression to test for null and count is
(select productid from orderdetails)
and
(select productid from orderdetails order by quantity desc limit 5)
This, however, is not an expression that yields a single value that is either counted (when it is not null) or skipped (when it is null). You are selecting all product IDs from the orderdetails table, and you are selecting the product IDs of the five orderdetails rows with the highest quantity. Then you apply AND as if these were two booleans, but they are not; they are data sets. Apart from the inappropriate use of AND, which is an operator on booleans and not on data sets, you are missing the point that you should be looking for products in the same order, i.e. compare the order number somehow.
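As a small aside on the COUNT(*) versus COUNT(expression) distinction, here is a sketch using Python's built-in sqlite3 module (not MySQL, but the two behave the same on this point). The table t and its note column are made up purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: note is NULL for some rows.
conn.execute("CREATE TABLE t (productid INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, None), (2, None)],
)

# COUNT(*) counts every row in the group; COUNT(note) skips rows where note is NULL.
count_rows = conn.execute("""
    SELECT productid, COUNT(*) AS all_rows, COUNT(note) AS non_null_notes
    FROM t
    GROUP BY productid
    ORDER BY productid
""").fetchall()
print(count_rows)
```

Product 2 has one row but a non-null count of zero, because its only note is NULL.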
So all in all: This is completely wrong. Sorry to say that. However, the task is not at all easy in my opinion and in order to solve it, you should go slowly, step by step, to build your query.
Products which were ordered along with the 5 most ordered products more than once
Dammit; such a short sentence, but that is deceiving ;-) There is a lot to do for us...
First we must find the 5 products that got ordered most. That means sum up all sales and find the five top ones:
select productid
from orderdetails
group by productid
order by sum(quantity) desc
limit 5
(The problem with this: What if there is a tie around fifth place, e.g. products A, B, and C with a quantity of 200 and products D, E, and F with a quantity of 100? We would get the top three plus two arbitrary ones of D, E, and F. In standard SQL we would solve this with a WITH TIES clause (FETCH FIRST 5 ROWS WITH TIES), but MySQL's LIMIT doesn't feature this.)
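One way to keep the ties is a ranking window function instead of LIMIT. The sketch below reproduces the A to F scenario with made-up rows and compares both approaches. It runs on Python's built-in sqlite3 module and assumes the bundled SQLite is 3.25 or newer (window functions); MySQL 8 offers RANK() as well, but this exact script has not been run there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderdetails (orderid INTEGER, productid TEXT, quantity INTEGER)"
)
# Hypothetical data: A, B, C total 200 each; D, E, F total 100 each; G totals 50.
conn.executemany(
    "INSERT INTO orderdetails VALUES (?, ?, ?)",
    [(1, "A", 200), (1, "B", 200), (2, "C", 200),
     (2, "D", 100), (3, "E", 100), (3, "F", 100), (4, "G", 50)],
)

# Plain LIMIT cuts the D/E/F tie arbitrarily and returns exactly five products.
limited = conn.execute("""
    SELECT productid
    FROM orderdetails
    GROUP BY productid
    ORDER BY SUM(quantity) DESC
    LIMIT 5
""").fetchall()

# RANK() gives tied totals the same rank, so "rank <= 5" keeps all of D, E, F.
with_ties = conn.execute("""
    SELECT productid
    FROM (SELECT productid, RANK() OVER (ORDER BY total DESC) AS rnk
          FROM (SELECT productid, SUM(quantity) AS total
                FROM orderdetails
                GROUP BY productid) AS totals) AS ranked
    WHERE rnk <= 5
    ORDER BY productid
""").fetchall()
print(len(limited), [row[0] for row in with_ties])
```

With RANK(), A, B, and C share rank 1 and D, E, and F share rank 4, so six products pass the filter while G (rank 7) does not.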
Anyway. Now we are looking for products that got ordered along with these five products. Does this mean with all five at once? Probably not. We are rather looking for products that were in the same order as at least one of the top five.
with top_5_products as
(query above)
, orders_with_top_5 as
(select orderid
from orderdetails
where productid in (select productid from top_5_products)
)
, other_products_in_order as
(select distinct productid, orderid
from orderdetails
where orderid in (select orderid from orders_with_top_5)
and productid not in (select productid from top_5_products)
)
And once we've got there, we must still find the products that got ordered with some of the top 5 "more than once", which I interpret as appearing in at least two orders containing top 5 products.
with <all the above>
select productid
from other_products_in_order
group by productid
having count(*) > 1;
And while we have counted how many orders the products share with top 5 products, we are still not there, because we are supposed to show the number of orders the products were included in, which I suppose refers to all orders, not only those containing top 5 products. That is another count, that we can get in the select clause for instance. The query then becomes:
with <all the above>
select
productid,
(select count(distinct od.orderid) from orderdetails od where od.productid = opio.productid) as order_count
from other_products_in_order opio
group by productid
having count(*) > 1;
That's quite a lot for homework seeing that you are struggling with the syntax still. And we haven't even addressed that top-5-or-more ties problem yet (for which analytic functions come in handy).
The WITH clause is available since MySQL 8 and helps keep a query that is built up step by step readable. Old MySQL versions don't support it. If you are working with an old version, I suggest you upgrade :-) Otherwise you can use subqueries directly instead.
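For reference, here is the whole step-by-step query assembled into one statement and run against a small invented dataset. This is only a sketch under my reading of the task: it uses Python's built-in sqlite3 module instead of MySQL, the rows are made up so that products 1 to 5 are clearly the top five, and it counts distinct orders per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderdetails (orderid INTEGER, productid INTEGER, quantity INTEGER)"
)
# Hypothetical data: products 1-5 sell 100 units each, products 6-8 sell very little.
conn.executemany(
    "INSERT INTO orderdetails VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 6, 1),
     (2, 2, 100), (2, 6, 1),
     (3, 3, 100), (3, 7, 1),
     (4, 4, 100), (4, 5, 100),
     (5, 6, 1), (5, 8, 1),
     (6, 7, 1)],
)

companions = conn.execute("""
    WITH top_5_products AS (
        SELECT productid
        FROM orderdetails
        GROUP BY productid
        ORDER BY SUM(quantity) DESC
        LIMIT 5
    ),
    orders_with_top_5 AS (
        SELECT orderid
        FROM orderdetails
        WHERE productid IN (SELECT productid FROM top_5_products)
    ),
    other_products_in_order AS (
        SELECT DISTINCT productid, orderid
        FROM orderdetails
        WHERE orderid IN (SELECT orderid FROM orders_with_top_5)
          AND productid NOT IN (SELECT productid FROM top_5_products)
    )
    SELECT opio.productid,
           -- all orders containing the product, not only those with a top-5 product
           (SELECT COUNT(DISTINCT od.orderid)
            FROM orderdetails od
            WHERE od.productid = opio.productid) AS order_count
    FROM other_products_in_order opio
    GROUP BY opio.productid
    HAVING COUNT(*) > 1   -- shared at least two orders with top-5 products
""").fetchall()
print(companions)
```

Product 6 shares orders 1 and 2 with top-five products and appears in three orders overall; product 7 shares only order 3, so it is filtered out by the HAVING clause.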

sql SELECT query for 3 tables

I have 3 tables:
1. products(product_id,name)
2. orders(id,order_id,product_id)
3. factors(id,order_id,date)
I want to retrieve the product names (products.name) whose orders share an order_id with factors rows on a given date.
I use this query for this purpose:
select products.name
from products
WHERE products.product_id IN
(
SELECT distinct orders.product_id FROM orders WHERE
order_id IN (select order_id FROM factors WHERE
factors.datex ='2017-04-29') GROUP BY product_id
)
but I get no result. Where is my mistake? How can I resolve it? Thanks
Your query should be fine. I am rewriting it to make a few changes to the structure, but not the logic (this makes it easier for me to understand the query):
select p.name
from products p
where p.product_id in (select o.product_id
from orders o
where o.order_id in (select f.order_id
from factors f
where f.datex = '2017-04-29'
)
) ;
Notes on the changes:
When using multiple tables in a query, always qualify the column names.
Use table aliases. They make queries easier to write and to read.
SELECT DISTINCT and GROUP BY are unnecessary in IN subqueries. The logic of IN already handles (i.e. ignores) duplicates. And by explicitly including the operations, you run the risk of a less efficient query plan.
Why might your query not work?
factors.datex has a time component. If so, then this will work: date(f.datex) = '2017-04-29'.
There are no factors on that date.
There are no orders that match factors on that date.
There are no products in the orders that match the factors on that date.
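The first of these reasons is easy to reproduce. The sketch below uses Python's built-in sqlite3 module with invented rows; SQLite stores the timestamp as text, whereas MySQL would use a DATETIME column, but in both the equality test against a bare date fails when a time of day is present. The half-open range at the end is an alternative I am adding, not part of the answer above; it avoids wrapping the column in a function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE factors (id INTEGER, order_id INTEGER, datex TEXT)")
# Hypothetical rows whose datex carries a time component.
conn.executemany(
    "INSERT INTO factors VALUES (?, ?, ?)",
    [(1, 10, "2017-04-29 14:30:00"), (2, 11, "2017-04-30 09:00:00")],
)

# Plain equality never matches, because the stored value includes a time of day.
exact = conn.execute(
    "SELECT order_id FROM factors WHERE datex = '2017-04-29'"
).fetchall()

# Stripping the time with date() makes the comparison work.
by_day = conn.execute(
    "SELECT order_id FROM factors WHERE date(datex) = '2017-04-29'"
).fetchall()

# Alternative: a half-open range covering the whole day.
by_range = conn.execute(
    "SELECT order_id FROM factors "
    "WHERE datex >= '2017-04-29' AND datex < '2017-04-30'"
).fetchall()
print(exact, by_day, by_range)
```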
In the factors table the column name is date, so it should be:
factors.date ='2017-04-29'
You have written:
factors.datex ='2017-04-29'

SQL - Selecting a customer that only bought 1 type of product on a specific date

I need to find out how many pencils were bought on 2017-01-01 by people who had bought only 1 other type of product prior to buying pencils (e.g. they had bought only notebooks beforehand).
This is what I have so far. It shows the customers who bought only one type of product; what I am missing is how many pencils they bought on 2017-01-01:
SELECT
c.name,
s.units_sold AS Sold,
s.product_id
FROM
sales AS s
INNER JOIN customers AS c
ON c.id=s.customer_id
GROUP BY c.name
HAVING COUNT(DISTINCT s.product_id) = 1
I tried to look at similar questions without success.
Hope my question is clear :/
Thanks!
This seems like a very strange question. But if I read literally, then you would seem to want something like this:
select sum(s.units_sold)
from sales s
where s.product_id = 'pencil' and
s.date = '2017-01-01' and
1 = (select count(distinct s2.product_id)
from sales s2
where s2.customer_id = s.customer_id and
s2.date < s.date
);
Gordon's query is legit, although it seems to have a flaw: it may execute the correlated subquery once for every candidate row, i.e. roughly as many times as there are customers.
That's heavy.
Just remember that SQL is a declarative language: you are not going to tell the engine how it should do things, you just declare what you need.
So one way to think it through is:
I need to run my aggregate query on a subset of customers.
So I'm going to determine this subset first: those who bought exactly one type of product (other than pencils) prior to the given date.
Once I have this set of customers, I can look at their purchases of exactly pencils on exactly the given date.
Here, in a first iteration (!), your query becomes pretty obvious (I'll be using Gordon's notation):
SELECT sum(s.units_sold)
FROM sales s
WHERE s.product_id = 'pencil' AND
s.date = '2017-01-01' AND
s.customer_id IN (
SELECT s2.customer_id
FROM sales s2
WHERE s2.date < '2017-01-01'
GROUP BY s2.customer_id
HAVING count(DISTINCT s2.product_id) = 1
);
I haven't actually tested it, but I hope you get the idea: in this case, the work is reduced to two passes, one to get the subset of customers that meet the given criteria, and a second to aggregate those customers' purchases that meet the given conditions.
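As a quick sanity check, the sketch below runs both formulations (the correlated subquery and the IN subquery) on the same invented data and compares the results. It uses Python's built-in sqlite3 module rather than MySQL, and the table layout and rows are assumptions made up for the example; it says nothing about which form is faster on a real engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer_id INTEGER, product_id TEXT, "
    "date TEXT, units_sold INTEGER)"
)
# Hypothetical purchases:
#   customer 1: only notebooks before, then 5 pencils        -> counts
#   customer 2: notebooks and erasers before, then 7 pencils -> excluded
#   customer 3: nothing before, 4 pencils                    -> excluded
#   customer 4: notebooks twice before, then 3 pencils       -> counts
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(1, "notebook", "2016-12-01", 2), (1, "pencil", "2017-01-01", 5),
     (2, "notebook", "2016-12-01", 1), (2, "eraser", "2016-12-05", 1),
     (2, "pencil", "2017-01-01", 7),
     (3, "pencil", "2017-01-01", 4),
     (4, "notebook", "2016-11-01", 1), (4, "notebook", "2016-12-01", 1),
     (4, "pencil", "2017-01-01", 3)],
)

# Correlated version: the subquery is evaluated per candidate pencil sale.
correlated_total = conn.execute("""
    SELECT SUM(s.units_sold)
    FROM sales s
    WHERE s.product_id = 'pencil'
      AND s.date = '2017-01-01'
      AND 1 = (SELECT COUNT(DISTINCT s2.product_id)
               FROM sales s2
               WHERE s2.customer_id = s.customer_id
                 AND s2.date < s.date)
""").fetchone()[0]

# Two-pass version: first the qualifying customers, then their pencil sales.
in_total = conn.execute("""
    SELECT SUM(s.units_sold)
    FROM sales s
    WHERE s.product_id = 'pencil'
      AND s.date = '2017-01-01'
      AND s.customer_id IN (SELECT s2.customer_id
                            FROM sales s2
                            WHERE s2.date < '2017-01-01'
                            GROUP BY s2.customer_id
                            HAVING COUNT(DISTINCT s2.product_id) = 1)
""").fetchone()[0]
print(correlated_total, in_total)
```

Both queries count only customers 1 and 4 here, so they agree on this data.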

counting the most sold products from mysql

I am trying to get a list of the most sold products with a MySQL query. The problem is that it still returns all of the data even after I use COUNT.
select
mf.*,
od.*,
count(od.product_id) as product_count
from masterflight mf ,
order_details od
where od.product_id=mf.ProductCode
group by od.product_id
order by product_count
Here masterflight is the table where the product details are stored with their IDs, and order_details is the table that stores one record for each individual sale of a product. The logic I was trying to implement: if a product with ID 2 is sold 4 times, and each sale has a separate entry, then I would count those entries using COUNT and display the result, which does not seem to be working.
Try something a little neater:
select
mf.ProductCode,
count(*) as product_count
from
order_details od
inner join masterflight mf on
od.product_id = mf.ProductCode
group by
mf.ProductCode
order by product_count desc
The problem is that you're selecting all of od, but you're not grouping by it, so the od columns come from an arbitrary order row in each group, which doesn't really help you at all. I should note that MySQL is the only one of the major RDBMSes that allows that behavior (and since MySQL 5.7 the default ONLY_FULL_GROUP_BY mode rejects it), and it's confusing and tough to debug. I'd advise against using that particular feature. As a general rule, if you've selected a column but don't have an aggregate (e.g. sum, avg, min, etc.) on it, then you need it in the group by clause.
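Here is a minimal run of the neater query on invented data, using Python's built-in sqlite3 module instead of MySQL. The name column and all rows are assumptions for illustration. Only the grouped column and the aggregate are selected, so every selected column is either in the GROUP BY or inside an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE masterflight (ProductCode INTEGER, name TEXT)")
conn.execute("CREATE TABLE order_details (id INTEGER, product_id INTEGER)")
# Hypothetical products; product 3 is never sold.
conn.executemany(
    "INSERT INTO masterflight VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)
# Product 2 is sold four times, product 1 once; each sale is its own row.
conn.executemany(
    "INSERT INTO order_details VALUES (?, ?)",
    [(1, 2), (2, 2), (3, 1), (4, 2), (5, 2)],
)

best_sellers = conn.execute("""
    SELECT mf.ProductCode, COUNT(*) AS product_count
    FROM order_details od
    INNER JOIN masterflight mf ON od.product_id = mf.ProductCode
    GROUP BY mf.ProductCode
    ORDER BY product_count DESC
""").fetchall()
print(best_sellers)
```

Because of the inner join, the unsold product 3 does not appear at all; a LEFT JOIN from masterflight with COUNT(od.id) would list it with a count of zero.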

MYSQL Count group by rows ignoring effect of JOIN and SUM fields on Joined tables

I have 3 tables:
Orders
- id
- customer_id
Details
- id
- order_id
- product_id
- ordered_qty
Parcels
- id
- detail_id
- batch_code
- picked_qty
Orders have multiple Details rows, a detail row per product.
A detail row has multiple parcels, as 10'000 ordered qty may come from 6 different batches, so goods from batches are packed and shipped separately. The picked quantity put in each parcel for a detail row should then be the same as the ordered_qty.
... hope that makes sense.
I'm struggling to write a query that provides summary information for all of this.
I need to Group By customer_id to provide a row of data per customer.
That row should contain
Their total number of orders
Their total ordered_qty of goods across all orders
Their total picked_qty of goods across all orders
I can get the first one with:
SELECT customer_id, COUNT(*) as number_of_orders
FROM Orders
GROUP BY Orders.customer_id
But when I LEFT JOIN the other two tables and add the
SELECT ..... SUM(Details.ordered_qty) AS total_qty_ordered,
SUM(Parcels.picked_qty) AS total_qty_picked
.. then I get results that don't seem to add up for the quantities, and the COUNT(*) seems to include the additional rows from the JOIN, which obviously no longer gives me the number of orders.
Not sure what to try next.
===== EDIT =======
Here's the query I tried:
SELECT
customer_id,
COUNT(*) as number_of_orders,
SUM(Details.ordered_qty) AS total_qty_ordered,
SUM(Parcels.picked_qty) AS total_qty_picked
FROM Orders
LEFT JOIN Details ON Details.order_id=Orders.id
LEFT JOIN Parcels ON Parcels.detail_id=Details.id
GROUP BY Orders.customer_id
Try COUNT(DISTINCT Orders.id) AS number_of_orders,
as in
SELECT
Orders.customer_id,
COUNT(DISTINCT Orders.id) AS number_of_orders,
SUM(Details.ordered_qty) AS total_qty_ordered,
SUM((SELECT SUM(Parcels.picked_qty)
FROM Parcels WHERE Parcels.detail_id = Details.id)) AS total_qty_picked
FROM Orders
LEFT JOIN Details ON Details.order_id = Orders.id
GROUP BY Orders.customer_id
EDIT: added another select with a subselect, so that the parcels no longer multiply the Details rows.
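To show why the quantities stop adding up once Parcels is joined in, here is a sketch on invented data, run with Python's built-in sqlite3 module instead of MySQL. It compares the naive double join with a variant that first sums the parcels per detail row in a derived table. That pre-aggregation is a technique I am substituting for illustration, not the exact query from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders  (id INTEGER, customer_id INTEGER);
    CREATE TABLE Details (id INTEGER, order_id INTEGER, product_id TEXT, ordered_qty INTEGER);
    CREATE TABLE Parcels (id INTEGER, detail_id INTEGER, batch_code TEXT, picked_qty INTEGER);
    -- Hypothetical data: one customer, two orders, three detail rows.
    INSERT INTO Orders  VALUES (1, 100), (2, 100);
    INSERT INTO Details VALUES (1, 1, 'A', 10), (2, 1, 'B', 5), (3, 2, 'A', 20);
    -- Detail 1 is shipped in two parcels (6 + 4), the others in one parcel each.
    INSERT INTO Parcels VALUES (1, 1, 'b1', 6), (2, 1, 'b2', 4), (3, 2, 'b1', 5), (4, 3, 'b3', 20);
""")

# Naive version: detail 1 appears once per parcel, so its ordered_qty is counted twice.
naive = conn.execute("""
    SELECT o.customer_id, COUNT(*), SUM(d.ordered_qty), SUM(p.picked_qty)
    FROM Orders o
    LEFT JOIN Details d ON d.order_id = o.id
    LEFT JOIN Parcels p ON p.detail_id = d.id
    GROUP BY o.customer_id
""").fetchall()

# Pre-aggregated version: parcels are summed per detail row before joining,
# so each detail row appears exactly once in the joined result.
fixed = conn.execute("""
    SELECT o.customer_id,
           COUNT(DISTINCT o.id) AS number_of_orders,
           SUM(d.ordered_qty)   AS total_qty_ordered,
           SUM(p.picked)        AS total_qty_picked
    FROM Orders o
    LEFT JOIN Details d ON d.order_id = o.id
    LEFT JOIN (SELECT detail_id, SUM(picked_qty) AS picked
               FROM Parcels
               GROUP BY detail_id) AS p ON p.detail_id = d.id
    GROUP BY o.customer_id
""").fetchall()
print(naive, fixed)
```

In the naive result the order count is 4 and the ordered quantity is 45 instead of 2 and 35, which is the fan-out effect described in the question.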
Is there any particular reason you feel the need to combine all these in one query? Simplify by breaking it up in to separate queries, and if you want a single call to get the results, put the queries in a stored procedure, using temp tables.