SQL Listing only unique values with subquery - mysql

Let me start off by saying: yes, this is a homework question.
It seems so simple, but I can't get it to work. I am just starting subqueries, and I'm guessing that's what my teacher wants here.
Here's the question:
5. Write a SELECT statement that returns the name and discount percent
of each product that has a unique discount percent. In other words,
don’t include products that have the same discount percent as another
product.
-Sort the results by the ProductName column.
Here's what I tried
SELECT DISTINCT p1.ProductName, p1.DiscountPercent
FROM Products AS p1
WHERE NOT EXISTS
(SELECT p2.ProductName, p2.DiscountPercent
FROM Products AS p2
WHERE p1.DiscountPercent <> p2.DiscountPercent)
ORDER BY ProductName;
Any help would be highly appreciated. Thanks!

When checking for uniqueness, COUNT() makes it simple. You can either use it in a HAVING clause or select it outright and filter on it.
SELECT a.ProductName, a.DiscountPercent
FROM Products a
JOIN (SELECT DiscountPercent, COUNT(DiscountPercent) AS CT
FROM Products
GROUP BY DiscountPercent
)b
ON a.DiscountPercent = b.DiscountPercent
WHERE b.CT = 1
Or:
SELECT a.ProductName, a.DiscountPercent
FROM Products a
JOIN (SELECT DiscountPercent
FROM Products
GROUP BY DiscountPercent
HAVING COUNT(DiscountPercent) = 1
)b
ON a.DiscountPercent = b.DiscountPercent
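To see the HAVING variant run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up for illustration, and the ORDER BY is added to satisfy the sorting requirement in the question.

```python
import sqlite3

# Hypothetical sample data; SQLite is only a stand-in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, DiscountPercent REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Amp", 30.0), ("Bass", 15.0), ("Cello", 30.0), ("Drums", 25.0)],
)

# Keep only the discount values that occur once, then join back for the names.
query = """
SELECT a.ProductName, a.DiscountPercent
FROM Products a
JOIN (SELECT DiscountPercent
      FROM Products
      GROUP BY DiscountPercent
      HAVING COUNT(DiscountPercent) = 1) b
  ON a.DiscountPercent = b.DiscountPercent
ORDER BY a.ProductName
"""
unique_rows = conn.execute(query).fetchall()
print(unique_rows)  # Amp and Cello share 30.0, so only Bass and Drums remain
```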

Try this. It's quite similar to yours, but NOT IN ensures that the discount does not appear on any other product.
SELECT p1.ProductName, p1.DiscountPercent
FROM Products AS p1
WHERE p1.DiscountPercent NOT IN
(SELECT p2.DiscountPercent
FROM Products AS p2
WHERE p1.ProductName <> p2.ProductName)
ORDER BY ProductName
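One caveat worth checking before relying on NOT IN: if DiscountPercent can be NULL (an assumption, since the question does not show the schema), a NULL in the subquery result makes NOT IN evaluate to unknown, and rows get filtered out. A small sketch with made-up data, run on SQLite as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical data with one NULL discount; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, DiscountPercent REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Amp", 10.0), ("Bass", 20.0), ("Cello", None)],
)

template = """
SELECT p1.ProductName
FROM Products AS p1
WHERE p1.DiscountPercent NOT IN
  (SELECT p2.DiscountPercent
   FROM Products AS p2
   WHERE p1.ProductName <> p2.ProductName {extra})
ORDER BY p1.ProductName
"""

# As written: the NULL in the subquery result suppresses every row.
plain = conn.execute(template.format(extra="")).fetchall()

# Excluding NULLs inside the subquery restores the non-NULL unique discounts.
guarded = conn.execute(
    template.format(extra="AND p2.DiscountPercent IS NOT NULL")
).fetchall()

print(plain)
print(guarded)
```

If the column is declared NOT NULL, the plain form behaves as intended and the extra condition is unnecessary.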

If you need to solve this with a subquery, you want to find products for which there does NOT EXIST another product with an equal (=) DiscountPercent, not a different (<>) one. Using <> in that NOT EXISTS clause returns results only if every DiscountPercent in the table has the same value (no other product with a different discount exists, so all discounts are the same).
Also take into account that you'll need a condition that keeps the subquery from matching the very row being tested (i.e. p1 must not be the same row as p2).
For instance, if ProductName is enough to identify a product:
SELECT DISTINCT p1.ProductName, p1.DiscountPercent
FROM Products AS p1
WHERE NOT EXISTS
(SELECT p2.ProductName, p2.DiscountPercent
FROM Products AS p2
WHERE p1.DiscountPercent = p2.DiscountPercent
AND p1.ProductName <> p2.ProductName)
ORDER BY ProductName;
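The difference between the two conditions is easy to check on a few rows. This is a minimal sketch on SQLite (standing in for MySQL) with invented sample data; `SELECT 1` is used inside the subquery because NOT EXISTS only cares whether a row comes back.

```python
import sqlite3

# Hypothetical sample data; SQLite is only a stand-in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, DiscountPercent REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Amp", 30.0), ("Bass", 15.0), ("Cello", 30.0), ("Drums", 25.0)],
)

template = """
SELECT DISTINCT p1.ProductName, p1.DiscountPercent
FROM Products AS p1
WHERE NOT EXISTS
  (SELECT 1 FROM Products AS p2 WHERE {condition})
ORDER BY ProductName
"""

# The attempt from the question: every row has some product with a different discount.
original = conn.execute(
    template.format(condition="p1.DiscountPercent <> p2.DiscountPercent")
).fetchall()

# The corrected condition: same discount on a different product.
corrected = conn.execute(
    template.format(
        condition="p1.DiscountPercent = p2.DiscountPercent "
                  "AND p1.ProductName <> p2.ProductName"
    )
).fetchall()

print(original)
print(corrected)
```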

Try this query in MySQL:
select ProductName, DiscountPercent
from Products
where DiscountPercent in (select DiscountPercent from Products
group by DiscountPercent
having count(DiscountPercent)=1)

Related

SQL average and Join

I'm trying to merge these two statements into one query to get a list of product names (or ids) against the average of their TTFF data, and I'm stuck.
select AVG(TTFF) from TTFFdata group by product_id
select product.product_name, count(*) from product join TTFFdata on product.product_id = TTFFdata.product_id
I've looked into using a temporary table (CREATE TEMPORARY TABLE IF NOT EXISTS averages AS (select AVG(TTFF) from TTFFdata group by product_id)) but couldn't get that to work with a join.
Anyone able to help me please?
You need to understand the components. Your second query is missing a group by. This would seem to be what you want:
select p.product_name, count(t.product_id), avg(t.TTFF)
from product p left join
TTFFdata t
on p.product_id = t.product_id
group by p.product_name
It is better to group by both product_id and product_name, for two reasons. First, you can select the product id along with the product name. Second, if the product name is not unique, grouping by name alone may give wrong results (a rare scenario, e.g. the same product name distinguished by other columns such as version or model). Below is the final query.
select Product.product_id,
product_name,
AVG(TTFF) as Avg_TTFF
from Product
inner join
TTFFdata
on Product.product_id = TTFFdata.product_id
group by Product.product_id,Product.product_name
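As a quick check of the join-plus-GROUP BY idea, here is a sketch on SQLite (standing in for MySQL) with invented rows. It uses the LEFT JOIN form and groups by both product_id and product_name, so a product without any TTFF rows still appears, with a count of 0 and a NULL average.

```python
import sqlite3

# Hypothetical sample data; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER, product_name TEXT)")
conn.execute("CREATE TABLE TTFFdata (product_id INTEGER, TTFF REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "Alpha"), (2, "Beta"), (3, "Gamma")],
)
conn.executemany(
    "INSERT INTO TTFFdata VALUES (?, ?)",
    [(1, 10.0), (1, 20.0), (2, 30.0)],
)

query = """
SELECT p.product_id, p.product_name, COUNT(t.product_id), AVG(t.TTFF)
FROM product p
LEFT JOIN TTFFdata t ON p.product_id = t.product_id
GROUP BY p.product_id, p.product_name
ORDER BY p.product_id
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)  # Gamma has no measurements: count 0, average None
```

With an INNER JOIN instead, the Gamma row would simply be absent.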

MySQL Join with sum()

I have two tables: one contains product stats and the other contains additional stats.
StatsHourly:
id
product_id (can be multiple)
amount
cost
time
StatsValues:
id
product_id (can be multiple)
value (double)
I need to join those two tables and get something like this in the result:
product_id
sum (amount)
sum (cost)
sum (value)
I'm trying to do this:
"SELECT
SUM(s.amount) as amount,
SUM(s.cost) as cost
FROM StatsHourly s
LEFT JOIN (
SELECT
COALESCE(SUM(value), 0) as value
FROM StatsValues
GROUP BY product_id
) value v ON v.product_id = s.product_id
WHERE 1
AND s.product_id = :product_id";
This doesn't work. Could someone show me the right way to do it?
You have an extra comma after as cost:
SUM(s.cost) as cost, <-- here
You also use 2 aliases for the subquery, you should remove value from there:
) value v
You do not use any output from the subquery.
Coalesce() is unnecessary in the subquery.
This works (tested):
SELECT
s.product_id as product_id,
s.amount_s as amount,
s.cost_s as cost,
v.value_v as value
FROM
(SELECT
product_id,
SUM(amount) as amount_s,
SUM(cost) as cost_s
FROM StatsHourly
GROUP BY product_id) as s
LEFT JOIN
(SELECT
product_id,
SUM(value) as value_v
FROM StatsValues
GROUP BY product_id) as v
ON v.product_id = s.product_id
WHERE s.product_id = :product_id;
The point is:
Because product_id can repeat in BOTH tables, you have to build two aggregated derived tables through subqueries, each of which makes product_id unique and sums the appropriate rows.
After that you can join them and select the already aggregated values.
Regards
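To illustrate why each table is aggregated before joining, the sketch below compares a naive join-then-SUM with the pre-aggregated form. It runs on SQLite as a stand-in for MySQL; the sample rows and column types are invented.

```python
import sqlite3

# Hypothetical sample data: product 7 has 2 hourly rows and 3 value rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StatsHourly (id INTEGER, product_id INTEGER, "
    "amount INTEGER, cost REAL, time TEXT)"
)
conn.execute("CREATE TABLE StatsValues (id INTEGER, product_id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO StatsHourly VALUES (?, ?, ?, ?, ?)",
    [(1, 7, 10, 1.5, "t1"), (2, 7, 20, 2.5, "t2")],
)
conn.executemany(
    "INSERT INTO StatsValues VALUES (?, ?, ?)",
    [(1, 7, 100.0), (2, 7, 200.0), (3, 7, 300.0)],
)

# Naive: joining first yields 2 x 3 = 6 rows, so every sum is inflated.
naive = conn.execute("""
SELECT s.product_id, SUM(s.amount), SUM(s.cost), SUM(v.value)
FROM StatsHourly s
LEFT JOIN StatsValues v ON v.product_id = s.product_id
WHERE s.product_id = :product_id
GROUP BY s.product_id
""", {"product_id": 7}).fetchone()

# Pre-aggregated: one row per product_id on each side before the join.
correct = conn.execute("""
SELECT s.product_id, s.amount_s, s.cost_s, v.value_v
FROM (SELECT product_id, SUM(amount) AS amount_s, SUM(cost) AS cost_s
      FROM StatsHourly GROUP BY product_id) AS s
LEFT JOIN (SELECT product_id, SUM(value) AS value_v
           FROM StatsValues GROUP BY product_id) AS v
  ON v.product_id = s.product_id
WHERE s.product_id = :product_id
""", {"product_id": 7}).fetchone()

print(naive)
print(correct)
```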

sql nested query with group by

I was reading some tutorials about the GROUP BY clause and came across the following problem; I don't know why it was solved this way. The table is as follows:
The requirement is to select the most expensive product in each category, and the following query was the answer:
SELECT
categoryID, productID, productName, MAX(unitprice)
FROM
products A
WHERE
unitprice = (
SELECT
MAX(unitprice)
FROM
products B
WHERE
B.categoryId = A.categoryID)
GROUP BY categoryID;
I don't know why the above query was the answer, and why it wasn't just:
SELECT
categoryID, productID, productName, MAX(unitprice)
FROM
products
GROUP BY categoryID;
Also, if the first query is the right one, why does the MAX function appear in both the outer and the inner query? Isn't it enough to have it in the inner query?
Thanks.
The second query will produce an error in standard SQL (and in MySQL with ONLY_FULL_GROUP_BY enabled), because columns in the SELECT clause must either appear in the GROUP BY clause or be wrapped in an aggregate function.
Therefore you need to first find the highest unit price in each category and then find which product has that unit price. You can accomplish this in many ways; the first query is one of them.
From your picture, and as others have mentioned, it looks like you are using MySQL. The MySQL optimizer has historically handled subqueries poorly, and a correlated subquery like this can be very slow over lots of data, so a good habit is to use joins where possible (the planners in Postgres, Oracle, or MSSQL will often rewrite subqueries as joins for you).
The second query does run on MySQL when ONLY_FULL_GROUP_BY is disabled (the default before MySQL 5.7.5), but productID and productName are then taken from an arbitrary row in each group, not necessarily the row with the highest price.
Below is an example:
SELECT
A.categoryID, A.productID, A.productName, B.max_unitprice
FROM products A
JOIN (
SELECT
MAX(unitprice) AS max_unitprice,
categoryId
FROM products
GROUP BY categoryId) B
ON B.categoryId = A.categoryID
AND B.max_unitprice = A.unitprice;
Alternatively, with NOT EXISTS:
SELECT p.*
FROM products p
WHERE NOT EXISTS ( SELECT 'p2'
FROM products p2
WHERE p2.categoryId = p.categoryId
AND p2.unitPrice > p.unitPrice
)
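Here is a minimal sketch comparing two ways of picking the top-priced product per category: a join against a per-category MAX subquery (matching on both category and price) and the NOT EXISTS form. It runs on SQLite as a stand-in for MySQL, and the rows are invented, including a tie in category 2 to show that both forms return every tied product.

```python
import sqlite3

# Hypothetical sample data with a price tie in category 2.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (productID INTEGER, productName TEXT, "
    "categoryID INTEGER, unitprice REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, "Chai", 1, 18.0),
        (2, "Chang", 1, 19.0),
        (3, "Syrup", 2, 10.0),
        (4, "Seasoning", 2, 22.0),
        (5, "Mix", 2, 22.0),
    ],
)

# Join against the per-category maximum, matching category AND price.
via_join = conn.execute("""
SELECT A.categoryID, A.productID, A.productName, B.max_unitprice
FROM products A
JOIN (SELECT categoryID, MAX(unitprice) AS max_unitprice
      FROM products
      GROUP BY categoryID) B
  ON B.categoryID = A.categoryID AND B.max_unitprice = A.unitprice
ORDER BY A.categoryID, A.productID
""").fetchall()

# Keep a product only if no product in its category is more expensive.
via_not_exists = conn.execute("""
SELECT p.categoryID, p.productID, p.productName, p.unitprice
FROM products p
WHERE NOT EXISTS (SELECT 1
                  FROM products p2
                  WHERE p2.categoryID = p.categoryID
                    AND p2.unitprice > p.unitprice)
ORDER BY p.categoryID, p.productID
""").fetchall()

print(via_join)
print(via_not_exists)
```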

ORDER BY based on maximum number of row appearence

Table Fields:
shop_id , product_id
I want a list of all shops that have specific products (a shop should have at least one of them).
Results should be sorted so that shops having the most of the specified products come first.
I could write the SQL query for the first part, but the list is not sorted by how many of the products each shop matches:
SELECT
shop_id,
product_id
FROM
products_table
WHERE
product_id IN (1,2,3)
ORDER BY ???
Is there an optimal solution?
Join with a subquery that gets the counts for each shop, and order by that.
SELECT a.shop_id, a.product_id
FROM products_table AS a
JOIN (SELECT shop_id, COUNT(*) AS product_count
FROM products_table
WHERE product_id in (1, 2, 3)
GROUP BY shop_id) AS b
ON a.shop_id = b.shop_id
WHERE product_id IN (1, 2, 3)
ORDER BY b.product_count DESC
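A runnable sketch of this join, using SQLite in place of MySQL and invented rows. The extra sort keys after product_count are an addition, there only to make the row order deterministic.

```python
import sqlite3

# Hypothetical sample data: shop 1 stocks all three products, shop 4 none of them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products_table (shop_id INTEGER, product_id INTEGER)")
conn.executemany(
    "INSERT INTO products_table VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 9), (3, 2), (3, 3), (4, 9)],
)

query = """
SELECT a.shop_id, a.product_id
FROM products_table AS a
JOIN (SELECT shop_id, COUNT(*) AS product_count
      FROM products_table
      WHERE product_id IN (1, 2, 3)
      GROUP BY shop_id) AS b
  ON a.shop_id = b.shop_id
WHERE a.product_id IN (1, 2, 3)
ORDER BY b.product_count DESC, a.shop_id, a.product_id
"""
shop_rows = conn.execute(query).fetchall()
print(shop_rows)  # shop 1 (3 matches), then shop 3 (2), then shop 2 (1)
```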
A query like this will avoid repeating the list of product_ids:
with sp as (
select shop_id, product_id
from products_table
where product_id IN (1,2,3)
)
select
shop_id, product_id,
(select count(*) from sp as sp2 where sp2.shop_id = sp.shop_id) as shop_count
from sp
order by shop_count desc
But now I see you're using MySQL, which (before version 8.0) does not support WITH, so it won't work for you as written. It can be expanded, though:
select
shop_id, product_id,
(
select count(*) from products_table as p2
where product_id in (1,2,3) and p2.shop_id = p.shop_id
) as shop_count
from products_table as p
where product_id in (1,2,3)
order by shop_count desc;
It's essentially the same query, but the join is implied. I'm under the impression that MySQL doesn't always handle correlated subqueries very efficiently. I think the flavor of Barmar's answer is the one you'll have to use unless you create a temporary table mirroring "sp" above.
As a side note, I study languages, and it's interesting to me that I chose to call my computed column "shop_count" while Barmar went with "product_count". I focused on the shop as the center of my attention, though we're actually counting up products. To me "shop_count" indicated "count per shop", while Barmar might describe his as "count of products". By no means am I arguing that one approach is more valid or natural. It's just fascinating to me to see the different perspectives that people can take.

Selecting minimum within JOIN

I have two tables (listed only fields important for the question):
t_groups
INT groupId PRIMARY
VARCHAR(255) grname
t_goods
INT goodId PRIMARY
INT groupId
INT price
VARCHAR(255) name
Now I need a query, which selects group names and name of the cheapest good in each group. Tried doing it this way:
SELECT gr.groupId, grname, g.name
FROM t_groups AS gr
LEFT JOIN (SELECT * FROM t_goods ORDER BY PRICE ASC LIMIT 1) AS g
ON g.groupId = gr.groupId
but it doesn't work — returns NULLs in g.name field. It could be easily explained:
SELECT within JOIN statement selects cheapest good first, and then tries to "filter it" by groupId. Obviously, it'll only work for the group cheapest good belongs to.
How do I solve the task?
Why your query does not work
SELECT gr.groupId, grname, g.name
FROM t_groups AS gr
LEFT JOIN (SELECT * FROM t_goods ORDER BY PRICE ASC LIMIT 1) AS g
ON g.groupId = gr.groupId
The inner query selects the absolutely cheapest good (irrespective of group) in your database. Therefore, when you LEFT JOIN the groups to this result set, only the group which actually includes the universally cheapest good has a matching row (that group should get the g.name column filled properly). However, due to the way LEFT JOIN works all other groups will get NULL as the value of all columns in g.
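The effect described above can be reproduced on a few rows. This sketch uses SQLite in place of MySQL, with invented groups and goods; only the group that owns the overall cheapest good gets a name, the rest get NULL.

```python
import sqlite3

# Hypothetical sample data: "Banana" is the cheapest good overall.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_groups (groupId INTEGER PRIMARY KEY, grname TEXT)")
conn.execute(
    "CREATE TABLE t_goods (goodId INTEGER PRIMARY KEY, groupId INTEGER, "
    "price INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO t_groups VALUES (?, ?)",
    [(1, "Fruit"), (2, "Tools"), (3, "Empty")],
)
conn.executemany(
    "INSERT INTO t_goods VALUES (?, ?, ?, ?)",
    [(1, 1, 5, "Apple"), (2, 1, 3, "Banana"), (3, 2, 40, "Hammer"),
     (4, 2, 25, "Wrench"), (5, 2, 25, "Pliers")],
)

# The query from the question: the derived table holds exactly one row.
attempt = conn.execute("""
SELECT gr.groupId, grname, g.name
FROM t_groups AS gr
LEFT JOIN (SELECT * FROM t_goods ORDER BY price ASC LIMIT 1) AS g
  ON g.groupId = gr.groupId
ORDER BY gr.groupId
""").fetchall()
print(attempt)
```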
The correct solution
First, you need to select the cheapest price in each group. This is easy:
SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY (groupId)
However the cheapest price is not useful without the associated goodId. The problem is that it's not meaningful to write something like:
/* does not make sense, although MySql has historically allowed it */
SELECT goodId, groupId, MIN(price) AS minPrice FROM t_goods GROUP BY (groupId)
The reason is that you cannot select a non-grouped column (i.e. goodId) unless you wrap it in an aggregate function (such as MIN): we don't know which goodId you want from among those that share the same groupId.
The correct, portable way to get the goodId of the cheapest goods in each group is
SELECT goodId, temp.groupId, temp.minPrice
FROM (SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY groupId) temp
JOIN t_goods ON temp.groupId = t_goods.groupId AND temp.minPrice = t_goods.price
The above query first finds out the cheapest price per group, and then joins to the goods table again to find the goodIds of the goods having that price inside that group.
Important: if multiple goods have an equal cheapest price in a group, this query will return all of them. If you only want one result per group you have to specify the tiebreaker, for example:
SELECT MIN(goodId), temp.groupId, MIN(temp.minPrice)
FROM (SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY groupId) temp
JOIN t_goods ON temp.groupId = t_goods.groupId AND temp.minPrice = t_goods.price
GROUP BY temp.groupId
With this query in hand, you can then find the name and price of the single cheapest good in each group (lowest goodId will be used as tiebreaker):
SELECT gr.groupId, grname, gd.name, t3.minPrice
FROM t_groups AS gr
LEFT JOIN (SELECT MIN(goodId) AS goodId, t1.groupId, MIN(t1.minPrice) AS minPrice
FROM (SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY groupId) t1
JOIN t_goods ON t1.groupId = t_goods.groupId AND t1.minPrice = t_goods.price
GROUP BY t1.groupId
) t3 ON gr.groupId = t3.groupId
LEFT JOIN t_goods gd ON t3.goodId = gd.goodId
This final query performs two joins at its "outer" level:
joins groups with the "goodId and cheapest price for each group" table to get the goodId and cheapest price
then joins with the goods table to get the name of the good with this goodId
It will produce only one good per group, even if multiple goods are tied for cheapest.
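The two-level join can be tried out as follows. This is a sketch on SQLite (standing in for MySQL) with invented rows; the tiebreaker subquery is written with an explicit GROUP BY t1.groupId, and the outer groupId is qualified as gr.groupId to avoid ambiguity.

```python
import sqlite3

# Hypothetical sample data: "Wrench" and "Pliers" tie for cheapest in Tools.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_groups (groupId INTEGER PRIMARY KEY, grname TEXT)")
conn.execute(
    "CREATE TABLE t_goods (goodId INTEGER PRIMARY KEY, groupId INTEGER, "
    "price INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO t_groups VALUES (?, ?)",
    [(1, "Fruit"), (2, "Tools"), (3, "Empty")],
)
conn.executemany(
    "INSERT INTO t_goods VALUES (?, ?, ?, ?)",
    [(1, 1, 5, "Apple"), (2, 1, 3, "Banana"), (3, 2, 40, "Hammer"),
     (4, 2, 25, "Wrench"), (5, 2, 25, "Pliers")],
)

query = """
SELECT gr.groupId, grname, gd.name, t3.minPrice
FROM t_groups AS gr
LEFT JOIN (SELECT MIN(goodId) AS goodId, t1.groupId, MIN(t1.minPrice) AS minPrice
           FROM (SELECT groupId, MIN(price) AS minPrice
                 FROM t_goods GROUP BY groupId) t1
           JOIN t_goods
             ON t1.groupId = t_goods.groupId AND t1.minPrice = t_goods.price
           GROUP BY t1.groupId) t3
  ON gr.groupId = t3.groupId
LEFT JOIN t_goods gd ON t3.goodId = gd.goodId
ORDER BY gr.groupId
"""
cheapest = conn.execute(query).fetchall()
for row in cheapest:
    print(row)  # one row per group; the lower goodId wins the Tools tie
```

Because both outer joins are LEFT JOINs, a group with no goods still shows up, with NULL for the name and price.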
Here's how you could do it:
select
t_groups.grname as `name of group`,
t_goods.name as `name of good`
from (
select
groupId,
min(price) as min_price
from t_goods
group by groupId
) as mins
inner join t_goods
on mins.groupId = t_goods.groupId and mins.min_price = t_goods.price
inner join t_groups
on mins.groupId = t_groups.groupId
How this works:
mins subquery gets the minimum price for each groupId
joining mins to t_goods pulls all of the goods out that have the minimum price in their group. Note that this could return multiple goods in a single group, if there are multiple goods with the minimum price
that's then joined to t_groups to get the group name
Your query was presumably returning NULLs because it was left joining to a subquery with only one row.
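For comparison, here is the inner-join version run on invented rows, with SQLite standing in for MySQL. Plain aliases replace the backtick-quoted ones, and an ORDER BY is added for a stable row order; both are changes made for this sketch only.

```python
import sqlite3

# Hypothetical sample data: a tie in Tools and a group with no goods.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_groups (groupId INTEGER PRIMARY KEY, grname TEXT)")
conn.execute(
    "CREATE TABLE t_goods (goodId INTEGER PRIMARY KEY, groupId INTEGER, "
    "price INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO t_groups VALUES (?, ?)",
    [(1, "Fruit"), (2, "Tools"), (3, "Empty")],
)
conn.executemany(
    "INSERT INTO t_goods VALUES (?, ?, ?, ?)",
    [(1, 1, 5, "Apple"), (2, 1, 3, "Banana"), (3, 2, 40, "Hammer"),
     (4, 2, 25, "Wrench"), (5, 2, 25, "Pliers")],
)

query = """
SELECT t_groups.grname AS group_name, t_goods.name AS good_name
FROM (SELECT groupId, MIN(price) AS min_price
      FROM t_goods
      GROUP BY groupId) AS mins
INNER JOIN t_goods
  ON mins.groupId = t_goods.groupId AND mins.min_price = t_goods.price
INNER JOIN t_groups
  ON mins.groupId = t_groups.groupId
ORDER BY t_groups.grname, t_goods.goodId
"""
cheapest_with_ties = conn.execute(query).fetchall()
print(cheapest_with_ties)
```

Both tied goods in Tools are returned, and the group without goods is left out because every join here is an inner join.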