The problem looks like this:
Write an SQL query that selects the product id, year, quantity, and price for the first year of every product sold.
Return the resulting table in any order.
A simple example of the question:
Input:
Sales table:
+---------+------------+------+----------+-------+
| sale_id | product_id | year | quantity | price |
+---------+------------+------+----------+-------+
| 1 | 100 | 2008 | 10 | 5000 |
| 2 | 100 | 2009 | 12 | 5000 |
| 7 | 200 | 2011 | 15 | 9000 |
+---------+------------+------+----------+-------+
Product table:
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 100 | Nokia |
| 200 | Apple |
| 300 | Samsung |
+------------+--------------+
Output:
+------------+------------+----------+-------+
| product_id | first_year | quantity | price |
+------------+------------+----------+-------+
| 100 | 2008 | 10 | 5000 |
| 200 | 2011 | 15 | 9000 |
+------------+------------+----------+-------+
My code:
SELECT product_id, year AS first_year, quantity, price
FROM Sales
WHERE year IN (
    SELECT MIN(year) AS year
    FROM Sales
    GROUP BY product_id);
My code worked with the above simple example but failed on a longer test case.
The expected query:
SELECT product_id, year AS first_year, quantity, price
FROM Sales
WHERE (product_id, year) IN (
    SELECT product_id, MIN(year) AS year
    FROM Sales
    GROUP BY product_id);
So I don't understand why I have to include product_id when filtering with WHERE. Doesn't SQL automatically pair each first year with the corresponding product_id? Any hints would be greatly appreciated!
Here's a step-by-step of why the first query doesn't work. Note that I've omitted some fields that are unused for the sake of brevity.
Imagine your sales data contained the following data:
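+------------+------+
| product_id | year |
+------------+------+
| 100        | 2008 |
| 100        | 2011 |
| 100        | 2011 |
| 100        | 2011 |
| 100        | 2011 |
| 200        | 2011 |
+------------+------+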
Based on this data, the inner subquery of your first query:
SELECT MIN(year) as year
FROM Sales
GROUP BY product_id
will produce a result as follows:
+------+
| year |
+------+
| 2008 |
| 2011 |
+------+
And so then your query is effectively doing the following:
SELECT product_id, year AS first_year
FROM Sales
WHERE year IN (2008, 2011)
So this query finds all the sales that occurred in 2008 and all the sales that occurred in 2011, regardless of product. It does not filter by product_id, because product_id does not appear in the WHERE clause. It therefore yields the following results, which is not what you want:
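+------------+------------+
| product_id | first_year |
+------------+------------+
| 100        | 2008       |
| 100        | 2011       |
| 100        | 2011       |
| 100        | 2011       |
| 100        | 2011       |
| 200        | 2011       |
+------------+------------+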
This is why you need to include product_id in your IN condition: each row must match both a product and that product's first year.
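To see the difference concretely, the sketch below loads the six sample rows from this answer into a scratch table and runs both filters. The table name `first_year_demo` and the reduced column set are assumptions made for the illustration. The row-value form `(product_id, year) IN (...)` is the one from the expected query in the question.

```sql
-- Scratch table with only the columns the filter uses (hypothetical name).
CREATE TABLE first_year_demo (
    product_id INT,
    year       INT
);

INSERT INTO first_year_demo (product_id, year) VALUES
    (100, 2008), (100, 2011), (100, 2011),
    (100, 2011), (100, 2011), (200, 2011);

-- Year-only filter: matches every row whose year is 2008 or 2011,
-- so all six rows come back, including product 100's 2011 sales.
SELECT product_id, year AS first_year
FROM first_year_demo
WHERE year IN (
    SELECT MIN(year)
    FROM first_year_demo
    GROUP BY product_id
);

-- Pair filter: a row must match both the product and that product's
-- first year, so only (100, 2008) and (200, 2011) come back.
SELECT product_id, year AS first_year
FROM first_year_demo
WHERE (product_id, year) IN (
    SELECT product_id, MIN(year)
    FROM first_year_demo
    GROUP BY product_id
);
```

Running the two inner subqueries on their own first, as suggested above, makes the cause visible: the first returns bare years, the second returns (product, year) pairs.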
On a general note, when debugging SQL, evaluate the inner-most queries first and then work outward as I have done in this answer.
Related
I am trying to understand how MySQL processes a date condition differently when I put it in the WHERE clause versus the HAVING clause. Can you please help me understand the difference in logic between the two cases below? Why do I get just one product when I apply the date condition with HAVING, but two when I use WHERE?
Question: Write an SQL query that reports the products that were only sold in spring 2019. That is, between 2019-01-01 and 2019-03-31 inclusive.
Product table:
+------------+--------------+------------+
| product_id | product_name | unit_price |
+------------+--------------+------------+
| 1 | S8 | 1000 |
| 2 | G4 | 800 |
| 3 | iPhone | 1400 |
+------------+--------------+------------+
Sales table:
+-----------+------------+----------+------------+----------+-------+
| seller_id | product_id | buyer_id | sale_date | quantity | price |
+-----------+------------+----------+------------+----------+-------+
| 1 | 1 | 1 | 2019-01-21 | 2 | 2000 |
| 1 | 2 | 2 | 2019-02-17 | 1 | 800 |
| 2 | 2 | 3 | 2019-06-02 | 1 | 800 |
| 3 | 3 | 4 | 2019-05-13 | 2 | 2800 |
+-----------+------------+----------+------------+----------+-------+
HAVING OPTION (the correct one)
SELECT s.product_id, p.product_name
FROM Sales s
JOIN Product p
    ON s.product_id = p.product_id
GROUP BY s.product_id
HAVING MIN(sale_date) >= '2019-01-01' AND MAX(sale_date) <= '2019-03-31'
HAVING RESULT
{"headers": ["product_id", "product_name"], "values": [[1, "S8"]]}
WHERE OPTION
SELECT s.product_id, p.product_name
FROM Sales s
JOIN Product p
    ON s.product_id = p.product_id
WHERE sale_date BETWEEN '2019-01-01' AND '2019-03-31'
GROUP BY s.product_id
WHERE RESULT
{"headers": ["product_id", "product_name"], "values": [[1, "S8"], [2, "G4"]]}
Where does that 2, G4 come from?
(Apologies in advance if this is trivial, I am genuinely trying to learn on my own and I don't have anyone to ask)
The WHERE clause filters the rows so that only sales in January, February, and March are considered. So any product that has at least one sale in that period is included.
The HAVING clause, on the other hand, considers all sales for a product. For each product it looks at the minimum and maximum sale dates, so only products whose sales all fall in January, February, and March are included.
Consider product_id = 2. It has a sale in February, so the WHERE version includes it. However, it also has a sale in June, so its maximum date is in June and the HAVING clause filters it out.
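The same reasoning can be checked on the question's sample sales. This is a minimal sketch: the table name `spring_demo` is hypothetical and only the two columns that matter are kept.

```sql
-- Sample sales from the question, reduced to the relevant columns.
CREATE TABLE spring_demo (
    product_id INT,
    sale_date  DATE
);

INSERT INTO spring_demo (product_id, sale_date) VALUES
    (1, '2019-01-21'),
    (2, '2019-02-17'),
    (2, '2019-06-02'),
    (3, '2019-05-13');

-- What HAVING sees: one row per product with its earliest and latest sale.
-- product 1: 2019-01-21 .. 2019-01-21
-- product 2: 2019-02-17 .. 2019-06-02
-- product 3: 2019-05-13 .. 2019-05-13
SELECT product_id,
       MIN(sale_date) AS first_sale,
       MAX(sale_date) AS last_sale
FROM spring_demo
GROUP BY product_id;

-- WHERE drops the June and May rows before grouping,
-- so products 1 and 2 both survive.
SELECT product_id
FROM spring_demo
WHERE sale_date BETWEEN '2019-01-01' AND '2019-03-31'
GROUP BY product_id;

-- HAVING tests the whole group after grouping,
-- so only product 1 survives.
SELECT product_id
FROM spring_demo
GROUP BY product_id
HAVING MIN(sale_date) >= '2019-01-01'
   AND MAX(sale_date) <= '2019-03-31';
```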
I need a count of visits for each person_id in the database, reported by date.
SELECT date, person_id, count(person_id)
FROM visits
group by date, person_id
I tried this, but it doesn't give the result I expected.
Date       | person_id | count(person_id)
2018-01-01 | 33000     | 10
2018-01-01 | 712000    | 111
2018-01-01 | 730000    | 30
2018-01-01 | 743000    | 5
2018-01-01 | 755000    | 123
Do you need a total appended to each row of your query result? For example:
Date       | person_id | count(person_id) | total
2018-01-01 | 33000     | 10               | 1000
2018-01-01 | 712000    | 111              | 1000
If so, I don't think doing it all in a single SQL query is a good idea. In my case, I would run two queries asynchronously and then merge the results, like this:
query1:
SELECT date, person_id, count(person_id)
FROM visits
group by date, person_id
query2:
SELECT count(person_id) as total
FROM visits
Then merge the two results in application code.
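For comparison, if a single round trip were preferred over merging in application code, one possible sketch attaches the overall total with a scalar subquery. The table name `visits_demo` and its sample rows are made up for the illustration and are not from the question.

```sql
-- Hypothetical sample data: four visits across two days.
CREATE TABLE visits_demo (
    date      DATE,
    person_id INT
);

INSERT INTO visits_demo (date, person_id) VALUES
    ('2018-01-01', 33000),
    ('2018-01-01', 33000),
    ('2018-01-01', 712000),
    ('2018-01-02', 33000);

-- Per-date, per-person counts, with the grand total repeated on every row.
-- The scalar subquery does not depend on the outer row, so it yields
-- the same value (4 for this sample) throughout.
SELECT date,
       person_id,
       COUNT(person_id) AS visit_count,
       (SELECT COUNT(person_id) FROM visits_demo) AS total
FROM visits_demo
GROUP BY date, person_id;
```

Whether this beats two separate queries depends on table size and on how the application consumes the result, so treat it as an option rather than a recommendation.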
This is my table product_details:
Product_Code | Size | Quantity
-------------+------+-----------
CS01 | 10 | 15
CS01 | 11 | 25
CS01 | 12 | 35
PR01 | 40 | 50
PR01 | 41 | 60
I want the following report format, showing the total quantity grouped by product code (across all sizes of each product code):
Product_Code | Size | Quantity
-------------+------------+----------------
CS01 | 10 11 12 | 75
PR01 | 40 41 | 110
I tried the following query but it does not give the result I want.
SELECT product_no, size, SUM(quantity)
FROM product_details
GROUP BY product_no;
Please help me to find the query to format the report.
You can use GROUP_CONCAT:
SELECT
    product_no,
    GROUP_CONCAT(size SEPARATOR ' '),
    SUM(quantity)
FROM product_details
GROUP BY product_no;
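A runnable sketch of this on the question's sample rows follows. The table name `product_details_demo`, the column aliases, and the `ORDER BY` inside `GROUP_CONCAT` are additions for the illustration. The column is named `product_no` to match the queries in this thread, although the question's table header shows `Product_Code`. `GROUP_CONCAT ... SEPARATOR` is MySQL syntax.

```sql
-- Sample rows from the question (hypothetical table name).
CREATE TABLE product_details_demo (
    product_no VARCHAR(10),
    size       INT,
    quantity   INT
);

INSERT INTO product_details_demo (product_no, size, quantity) VALUES
    ('CS01', 10, 15),
    ('CS01', 11, 25),
    ('CS01', 12, 35),
    ('PR01', 40, 50),
    ('PR01', 41, 60);

-- ORDER BY inside GROUP_CONCAT fixes the order of the sizes;
-- without it the concatenation order is not guaranteed.
SELECT product_no,
       GROUP_CONCAT(size ORDER BY size SEPARATOR ' ') AS sizes,
       SUM(quantity) AS total_quantity
FROM product_details_demo
GROUP BY product_no;
-- CS01 | 10 11 12 | 75
-- PR01 | 40 41    | 110
```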
I am trying to build some reports on our orders covering products sold, revenue, and total weight, but when I group our orders together, some queries that should return the same totals give different results.
These are my queries:
Monthly
SELECT
MONTH(orders.date_purchased) as date,
YEAR(orders.date_purchased) as year,
count(DISTINCT orders.orders_id) AS total_orders,
categories.fields_23 as currency
FROM
orders_shipping_products
Inner Join orders ON orders.orders_id = orders_shipping_products.orders_id
Inner Join categories ON categories.fields_6 = orders.shopping_store_category_id
WHERE
orders.orders_status NOT IN (0, 1, 99)
GROUP BY
date,
year,
categories.fields_23
ORDER by
YEAR(orders.date_purchased),
MONTH(orders.date_purchased) ASC,
categories.fields_23
This returns for 2012 the following table, a total of 353:
+------+------+--------------+----------+
| date | year | total_orders | currency |
+------+------+--------------+----------+
| 11 | 2012 | 86 | EUR |
| 12 | 2012 | 267 | EUR |
+------+------+--------------+----------+
Yearly
SELECT
YEAR(orders.date_purchased) as year,
count(DISTINCT orders.orders_id) AS total_orders,
categories.fields_23 as currency
FROM
orders_shipping_products
Inner Join orders ON orders.orders_id = orders_shipping_products.orders_id
Inner Join categories ON categories.fields_6 = orders.shopping_store_category_id
WHERE
orders.orders_status NOT IN (0, 1, 99)
GROUP BY
year,
categories.fields_23
ORDER by
YEAR(orders.date_purchased),
categories.fields_23
It returns the following for 2012:
+------+--------------+----------+
| year | total_orders | currency |
+------+--------------+----------+
| 2012 | 351 | EUR |
+------+--------------+----------+
The only thing that changes is total_orders; the total amount of products, the revenue, and the weight are the same. I just get two more orders when checking per month. I also checked by selecting and grouping by QUARTER and YEAR, which returns the following:
+---------+------+--------------+----------+
| quarter | year | total_orders | currency |
+---------+------+--------------+----------+
| 4 | 2012 | 351 | EUR |
+---------+------+--------------+----------+
This makes me think that I might be doing something wrong with my selections when I generate a report per month.
WITH ROLLUP
Daniel asked me to try adding WITH ROLLUP to my query. This is what I did and what I got in return:
SELECT
MONTH(orders.date_purchased) as date,
YEAR(orders.date_purchased) as year,
count(DISTINCT orders.orders_id) AS total_orders,
categories.fields_23 as currency
FROM
orders_shipping_products
Inner Join orders ON orders.orders_id = orders_shipping_products.orders_id
Inner Join categories ON categories.fields_6 = orders.shopping_store_category_id
WHERE
orders.orders_status NOT IN (0, 1, 99)
GROUP BY
year,
date,
categories.fields_23 WITH ROLLUP
This is the returned data:
+------+------+--------------+----------+
| date | year | total_orders | currency |
+------+------+--------------+----------+
| 11 | 2012 | 86 | EUR |
| 11 | 2012 | 86 | NULL |
| 12 | 2012 | 267 | EUR |
| 12 | 2012 | 267 | NULL |
| NULL | 2012 | 351 | NULL |
+------+------+--------------+----------+
I rechecked orders_id and it was not unique. In fact, the same orders_id appeared 2-3 times, and some of those rows fell in different months. It turned out that the combination of orders_id and another column forms the real unique id. Adding that column to my join solved everything.
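The effect can be reproduced on a tiny made-up data set. In this sketch the table `orders_demo` and its rows are hypothetical, and the last query is one possible way to find ids that span several months. It is not the check used in the thread.

```sql
-- Hypothetical orders where id 2 appears in both November and December.
CREATE TABLE orders_demo (
    orders_id      INT,
    date_purchased DATE
);

INSERT INTO orders_demo (orders_id, date_purchased) VALUES
    (1, '2012-11-05'),
    (2, '2012-11-20'),
    (2, '2012-12-03'),
    (3, '2012-12-15');

-- Monthly: id 2 is distinct within each month, so it is counted twice.
-- November = 2, December = 2, which sums to 4.
SELECT YEAR(date_purchased)      AS yr,
       MONTH(date_purchased)     AS mon,
       COUNT(DISTINCT orders_id) AS total_orders
FROM orders_demo
GROUP BY YEAR(date_purchased), MONTH(date_purchased);

-- Yearly: id 2 is counted once, so the total is 3.
SELECT YEAR(date_purchased)      AS yr,
       COUNT(DISTINCT orders_id) AS total_orders
FROM orders_demo
GROUP BY YEAR(date_purchased);

-- Diagnostic: ids whose rows fall in more than one calendar month.
-- For this sample it returns orders_id 2 with months_seen = 2.
SELECT orders_id,
       COUNT(DISTINCT DATE_FORMAT(date_purchased, '%Y-%m')) AS months_seen
FROM orders_demo
GROUP BY orders_id
HAVING COUNT(DISTINCT DATE_FORMAT(date_purchased, '%Y-%m')) > 1;
```

The monthly rows add up to one more than the yearly total here for the same reason the report above showed 353 versus 351.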
Can anyone help me write a query?
Let's say I have a table sales(saleID, date_of_sales, customerID, itemID, saleprice).
date_of_sales is the datetime field that stores the time of the sale.
customerID is self-explanatory: it tells to whom the item was sold.
itemID is the ID of the item sold.
saleprice is the price at which the item was sold.
I want to construct a query that returns the details of the last purchase by each customer. This could be done using date_of_sales.
Example table
saleID | date_of_sales | customerID | itemID | saleprice
101    | 2008-01-01    | C2000      | I200   | 650
102    | 2010-01-01    | C2000      | I333   | 200
103    | 2007-01-01    | C3333      | I111   | 800
104    | 2009-12-12    | C3333      | I222   | 100
This is the example data; there are only two customers for simplicity.
Customer C2000 made his last purchase on 2010-01-01.
Customer C3333 made his last purchase on 2009-12-12.
I want to get a result like this:
customerID | date_of_sales | itemID | saleprice
C2000      | 2010-01-01    | I333   | 200
C3333      | 2009-12-12    | I222   | 100
This might be what you are looking for...
SELECT *
FROM sales
WHERE sales.date_of_sales = (SELECT MAX(date_of_sales)
                             FROM sales s2
                             WHERE s2.customerID = sales.customerID);
There is a slight problem with it; if there were two sales on the same day to the same customer, you'll get two rows (unless your date-of-sales column includes the time as well). I think the same applies to the answer above, though.
Additionally, if you DO want results based on only a SINGLE entry for the maximum date, I would use the query by @Sachin Shanbhag above, but add the maximum sale ID too. Since the IDs are presumably sequential, whichever sale was entered last would probably be the most recent.
SELECT S.*
FROM sales S
INNER JOIN (
    SELECT
        customerID,
        MAX(date_of_sales) AS dos,
        MAX(saleID) AS maxSale  -- assumes saleID increases with date_of_sales
    FROM sales
    GROUP BY customerID
) S2 ON S.customerID = S2.customerID
    AND S.date_of_sales = S2.dos
    AND S.saleID = S2.maxSale
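If the sale IDs cannot be assumed to increase with the sale date, a window function is another way to keep exactly one row per customer. This is a sketch of a different technique from the two answers above: it assumes a database with window-function support (for example MySQL 8.0 or later), and the table name `last_purchase_demo` is hypothetical.

```sql
-- Sample rows from the question (hypothetical table name).
CREATE TABLE last_purchase_demo (
    saleID        INT,
    date_of_sales DATETIME,
    customerID    VARCHAR(10),
    itemID        VARCHAR(10),
    saleprice     INT
);

INSERT INTO last_purchase_demo
    (saleID, date_of_sales, customerID, itemID, saleprice) VALUES
    (101, '2008-01-01', 'C2000', 'I200', 650),
    (102, '2010-01-01', 'C2000', 'I333', 200),
    (103, '2007-01-01', 'C3333', 'I111', 800),
    (104, '2009-12-12', 'C3333', 'I222', 100);

-- Number each customer's sales from newest to oldest; saleID breaks ties
-- between sales with the same timestamp. Keeping rn = 1 leaves one row
-- per customer: sale 102 for C2000 and sale 104 for C3333.
SELECT customerID, date_of_sales, itemID, saleprice
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY customerID
               ORDER BY date_of_sales DESC, saleID DESC
           ) AS rn
    FROM last_purchase_demo s
) ranked
WHERE rn = 1;
```

Unlike joining on both MAX(date_of_sales) and MAX(saleID), this cannot drop a customer whose highest saleID belongs to an older sale, because both values always come from the same row.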