Selecting all rows in a table with a distinct key - MySQL

I want to retrieve only the table rows where productId is DISTINCT.
So for this case, I want to retrieve:
productID| name | price | ...
1 | ... | ... | ...
2 | ... | ... | ...
3 | ... | ... | ...
filtering out the repeated productID
What I have tried - Product is the table:
SELECT DISTINCT `productId` FROM `Product`
SELECT *
FROM `Product`
WHERE DISTINCT `productId`
Neither of these works; any help would be greatly appreciated.

You can join on a subquery that returns each productId together with its minimum productindex. The subquery uses GROUP BY to make productId "distinct" and the aggregate function MIN to pick the row with the lowest productindex (i.e. the first record with that productId). No HAVING clause is needed; HAVING MIN(`productindex`) would only drop groups whose minimum is 0 or NULL.
SELECT p.*
FROM Product AS `p`
INNER JOIN
(
SELECT `productId`, MIN(`productindex`) AS `productindex`
FROM Product
GROUP BY `productId`
) AS `x`
ON `x`.`productId` = `p`.`productId` AND `x`.`productindex` = `p`.`productindex`
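To try the join-back idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The `productindex` column (assumed unique per row, as the answer implies), the `name` values, and the function name are illustrative assumptions, not taken from the asker's real table.

```python
import sqlite3

def first_row_per_product():
    """Return one row per productId: the one with the lowest productindex."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Product (
            productindex INTEGER PRIMARY KEY,  -- assumed unique row index
            productId    INTEGER NOT NULL,
            name         TEXT
        );
        INSERT INTO Product (productindex, productId, name) VALUES
            (1, 1, 'first row for 1'),
            (2, 1, 'repeat of 1'),
            (3, 2, 'first row for 2'),
            (4, 3, 'first row for 3'),
            (5, 3, 'repeat of 3');
    """)
    rows = conn.execute("""
        SELECT p.productId, p.name
        FROM Product AS p
        INNER JOIN (
            -- one row per productId, carrying its smallest productindex
            SELECT productId, MIN(productindex) AS productindex
            FROM Product
            GROUP BY productId
        ) AS x
          ON x.productId = p.productId
         AND x.productindex = p.productindex
        ORDER BY p.productId
    """).fetchall()
    conn.close()
    return rows

print(first_row_per_product())
```

The repeated rows (productindex 2 and 5) are filtered out because they do not match the per-group minimum.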

Related

SQL where not exists with multiple rows and status

I have the following tables (minified for the sake of simplicity):
CREATE TABLE IF NOT EXISTS `product_bundles` (
bundle_id int AUTO_INCREMENT PRIMARY KEY,
-- More columns here for bundle attributes
) ENGINE=InnoDB;
CREATE TABLE IF NOT EXISTS `product_bundle_parts` (
`part_id` int AUTO_INCREMENT PRIMARY KEY,
`bundle_id` int NOT NULL,
`sku` varchar(255) NOT NULL,
-- More columns here for product attributes
KEY `bundle_id` (`bundle_id`),
KEY `sku` (`sku`)
) ENGINE=InnoDB;
CREATE TABLE IF NOT EXISTS `products` (
`product_id` mediumint(8) AUTO_INCREMENT PRIMARY KEY,
`sku` varchar(64) NOT NULL DEFAULT '',
`status` char(1) NOT NULL default 'A',
-- More columns here for product attributes
KEY (`sku`)
) ENGINE=InnoDB;
And I want to show only the 'product bundles' that are currently completely in stock and defined in the database (since these get retrieved from a third party vendor, there is no guarantee the SKU is defined). So I figured I'd need an anti-join to retrieve it accordingly:
SELECT SQL_CALC_FOUND_ROWS *
FROM product_bundles AS bundles
WHERE 1
AND NOT EXISTS (
SELECT *
FROM product_bundle_parts AS parts
LEFT JOIN products AS products ON parts.sku = products.sku
WHERE parts.bundle_id = bundles.bundle_id
AND products.status = 'A'
AND products.product_id IS NULL
)
-- placeholder for other dynamic conditions for e.g. sorting
LIMIT 0, 24
Now, I sincerely thought this would filter out the products by status, however, that seems not to be the case. I then changed one thing up a bit, and the query never finished (although I believe it to be correct):
SELECT SQL_CALC_FOUND_ROWS *
FROM product_bundles AS bundles
WHERE 1
AND NOT EXISTS (
SELECT *
FROM product_bundle_parts AS parts
LEFT JOIN products AS products ON parts.sku = products.sku
AND products.status = 'A'
WHERE parts.bundle_id = bundles.bundle_id
AND products.product_id IS NULL
)
-- placeholder for other dynamic conditions for e.g. sorting
LIMIT 0, 24
Example data:
product_bundles
bundle_id | etc.
1 |
2 |
3 |
product_bundle_parts
part_id | bundle_id | sku
1 | 1 | 'sku11'
2 | 1 | 'sku22'
3 | 1 | 'sku33'
4 | 1 | 'sku44'
5 | 2 | 'sku55'
6 | 2 | 'sku66'
7 | 3 | 'sku77'
8 | 3 | 'sku88'
products
product_id | sku | status
101 | 'sku11' | 'A'
102 | 'sku22' | 'A'
103 | 'sku33' | 'A'
104 | 'sku44' | 'A'
105 | 'sku55' | 'D'
106 | 'sku66' | 'A'
107 | 'sku77' | 'A'
108 | 'sku99' | 'A'
Example result: Since the product status of product #105 is 'D' and 'sku88' from part #8 was not found:
bundle_id | etc.
1 |
I am running Server version: 10.3.25-MariaDB-0ubuntu0.20.04.1 Ubuntu 20.04
So there are a few questions I have.
Why does the first query not filter out products that do not have the status A?
Why does the second query not finish?
Are there alternative ways of achieving the same thing in a more efficient manner, as this looks rather cumbersome?
First of all, I've read that SQL_CALC_FOUND_ROWS is much slower than running two separate queries (a COUNT(*) and then the SELECT *), or, if you run the query from another programming language such as PHP, executing the SELECT * and then counting the rows of the result set.
Second: the subquery in your first query can never return a row. products.status = 'A' is only true when a matching product exists, while products.product_id IS NULL is only true when none does, so NOT EXISTS is always true and every bundle is returned. What you need is the bundles whose parts ALL have an active product.
I'd change it to the following:
SELECT SQL_CALC_FOUND_ROWS *
FROM product_bundles AS bundles
WHERE NOT EXISTS (
SELECT 'x'
FROM product_bundle_parts AS parts
LEFT JOIN products ON (parts.sku = products.sku)
WHERE parts.bundle_id = bundles.bundle_id
AND COALESCE(products.status, 'X') != 'A'
)
-- placeholder for other dynamic conditions for e.g. sorting
LIMIT 0, 24
I changed products.status = 'A' to COALESCE(products.status, 'X') != 'A': this way the query returns all the bundles that DON'T have any inactive or missing product. (I also removed the condition AND products.product_id IS NULL: it would have had to be combined with OR, at a cost in performance, and the COALESCE already covers parts with no matching product.)
You can see my solution in SQLFiddle.
Finally, to find out why your second query doesn't finish, you should check the structure of your tables and how they are indexed. Running EXPLAIN on the query can help you find potential issues. Just put the keyword EXPLAIN before the SELECT and you'll get your "report" (EXPLAIN SELECT * ....).
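As a rough check of the two NOT EXISTS variants against the example data from the question, here is a sketch that loads that data into an in-memory SQLite database via Python's sqlite3 module. SQLite stands in for MariaDB here, so it says nothing about the performance problem; the function name and the simplified schema are assumptions made for illustration.

```python
import sqlite3

SETUP = """
CREATE TABLE product_bundles (bundle_id INTEGER PRIMARY KEY);
CREATE TABLE product_bundle_parts (
    part_id INTEGER PRIMARY KEY, bundle_id INTEGER NOT NULL, sku TEXT NOT NULL);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY, sku TEXT NOT NULL, status TEXT NOT NULL);
INSERT INTO product_bundles VALUES (1), (2), (3);
INSERT INTO product_bundle_parts VALUES
    (1, 1, 'sku11'), (2, 1, 'sku22'), (3, 1, 'sku33'), (4, 1, 'sku44'),
    (5, 2, 'sku55'), (6, 2, 'sku66'), (7, 3, 'sku77'), (8, 3, 'sku88');
INSERT INTO products VALUES
    (101, 'sku11', 'A'), (102, 'sku22', 'A'), (103, 'sku33', 'A'),
    (104, 'sku44', 'A'), (105, 'sku55', 'D'), (106, 'sku66', 'A'),
    (107, 'sku77', 'A'), (108, 'sku99', 'A');
"""

# The asker's first attempt: the two product conditions contradict each other.
ORIGINAL = """
SELECT bundle_id FROM product_bundles AS bundles
WHERE NOT EXISTS (
    SELECT 1
    FROM product_bundle_parts AS parts
    LEFT JOIN products ON parts.sku = products.sku
    WHERE parts.bundle_id = bundles.bundle_id
      AND products.status = 'A'
      AND products.product_id IS NULL
)
ORDER BY bundle_id
"""

# The answer's version: a part disqualifies the bundle if its product
# is missing (status is NULL, mapped to 'X') or not active.
FIXED = """
SELECT bundle_id FROM product_bundles AS bundles
WHERE NOT EXISTS (
    SELECT 1
    FROM product_bundle_parts AS parts
    LEFT JOIN products ON parts.sku = products.sku
    WHERE parts.bundle_id = bundles.bundle_id
      AND COALESCE(products.status, 'X') != 'A'
)
ORDER BY bundle_id
"""

def bundle_ids(query):
    """Run one of the queries above on a fresh copy of the example data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids

print(bundle_ids(ORIGINAL))  # every bundle passes the contradictory filter
print(bundle_ids(FIXED))     # only the fully active, fully defined bundle
```

On this data the first query lets all three bundles through, while the COALESCE version keeps only bundle 1, matching the expected result in the question.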

sql query with grouped by column

I have two tables, transactions and listings.
Table T has the fields:
order_date timestamp
order_id BIGINT
listing_id INT
price INT
Table L has the fields:
listing_id INT
price INT
category varchar
How can I get the sell ratio for each category, where the sell ratio is defined as the number of sold listings divided by the total number of listings * 100? Would a CASE statement or a CTE work better?
The listings table holds all available listings, and transactions represents everything sold.
Thanks
Is this what you want?
select
l.category,
count(*) no_listing_transactions,
100.0 * count(*) / sum(count(*)) over() per100
from t
inner join l on l.listing_id = t.listing_id
group by l.category
This gives you the count of transactions per category, and the percent that this count represents over the total number of transactions.
Note that this makes use of window functions, which require MySQL 8.0. In earlier versions, one solution would be to use a scalar subquery for the total (assuming that there are no "orphan" transactions):
select
l.category,
count(*) no_listing_transactions,
100.0 * count(*) / (select count(*) from t) per100
from t
inner join l on l.listing_id = t.listing_id
group by l.category
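A small sketch of the scalar-subquery variant, run against SQLite through Python's sqlite3 module rather than MySQL. The sample rows and the function name are made up for illustration; only the table and column names `t`, `l`, `listing_id`, and `category` come from the question.

```python
import sqlite3

def transaction_share_per_category():
    """Share of all transactions that falls into each category, in percent."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE l (listing_id INTEGER PRIMARY KEY, price INTEGER, category TEXT);
        CREATE TABLE t (order_id INTEGER PRIMARY KEY, listing_id INTEGER, price INTEGER);
        INSERT INTO l VALUES (1, 100, 'FOOD'), (2, 50, 'DRINKS');
        INSERT INTO t (listing_id, price) VALUES
            (1, 100), (1, 100), (1, 100), (2, 50), (2, 50);
    """)
    rows = conn.execute("""
        SELECT l.category,
               COUNT(*) AS no_listing_transactions,
               -- the scalar subquery supplies the grand total once
               100.0 * COUNT(*) / (SELECT COUNT(*) FROM t) AS per100
        FROM t
        INNER JOIN l ON l.listing_id = t.listing_id
        GROUP BY l.category
        ORDER BY l.category
    """).fetchall()
    conn.close()
    return rows

print(transaction_share_per_category())
```

Note that this measures each category's share of transactions, which is not the same thing as the sell ratio defined in the question.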
Try this one
Schema (MySQL v5.7)
Query #1
Create Table `gilbertdim_333952_L` (
listing_id int NOT NULL AUTO_INCREMENT,
price float,
category varchar(10),
PRIMARY KEY (listing_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
There are no results to be displayed.
Query #2
INSERT INTO gilbertdim_333952_L (price, category) VALUES
(100, 'FOOD'),
(50, 'DRINKS');
There are no results to be displayed.
Query #3
Create Table `gilbertdim_333952_T` (
order_id int NOT NULL AUTO_INCREMENT,
order_date timestamp NULL DEFAULT CURRENT_TIMESTAMP,
listing_id int,
price float,
PRIMARY KEY (order_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
There are no results to be displayed.
Query #4
INSERT INTO gilbertdim_333952_T (listing_id, price) VALUES
(1, 100),(1, 100),(1, 100),
(2, 50),(2, 50);
There are no results to be displayed.
Query #5
SELECT l.*, (COUNT(t.order_id) / (SELECT COUNT(1) FROM gilbertdim_333952_T) * 100) as sales
FROM gilbertdim_333952_L l
LEFT JOIN gilbertdim_333952_T t ON l.listing_id = t.listing_id
GROUP BY l.listing_id;
| listing_id | price | category | sales |
| ---------- | ----- | -------- | ----- |
| 1 | 100 | FOOD | 60 |
| 2 | 50 | DRINKS | 40 |
View on DB Fiddle
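For comparison, the sell ratio exactly as the question defines it (sold listings divided by all listings, times 100, per category) might be sketched as below. This is one possible reading, not something either answer proposes: it assumes a listing counts as sold if it has at least one transaction, and the sample rows and function name are invented. It runs on SQLite via Python's sqlite3 module.

```python
import sqlite3

def sell_ratio_per_category():
    """Per category: sold listings, total listings, and sold/total * 100."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE l (listing_id INTEGER PRIMARY KEY, price INTEGER, category TEXT);
        CREATE TABLE t (order_id INTEGER PRIMARY KEY, listing_id INTEGER, price INTEGER);
        INSERT INTO l VALUES
            (1, 10, 'FOOD'), (2, 10, 'FOOD'), (3, 10, 'FOOD'), (4, 10, 'FOOD'),
            (5, 5, 'DRINKS'), (6, 5, 'DRINKS');
        -- listing 1 sold twice, listings 2, 3 and 5 once, listings 4 and 6 never
        INSERT INTO t (listing_id, price) VALUES
            (1, 10), (1, 10), (2, 10), (3, 10), (5, 5);
    """)
    rows = conn.execute("""
        SELECT l.category,
               COUNT(DISTINCT t.listing_id) AS sold,      -- NULLs are not counted
               COUNT(DISTINCT l.listing_id) AS listings,
               100.0 * COUNT(DISTINCT t.listing_id)
                     / COUNT(DISTINCT l.listing_id) AS sell_ratio
        FROM l
        LEFT JOIN t ON t.listing_id = l.listing_id
        GROUP BY l.category
        ORDER BY l.category
    """).fetchall()
    conn.close()
    return rows

print(sell_ratio_per_category())
```

Starting from the listings table with a LEFT JOIN keeps unsold listings in the denominator, and COUNT(DISTINCT ...) stops a listing that sold twice from being counted twice.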

How to get multiple columns on subquery or group by

I have two tables in MySQL; the first contains an ID and the name of some products. I have to get the cheapest combination of brand/market for each product. So, I've inserted some items into both tables:
UPDATE: Inserted new product (bed) with no 'Product_Brand_Market' to test LEFT JOIN.
UPDATE: Changed some product prices for better testing.
CREATE TABLE Product(
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL);
CREATE TABLE Product_Brand_Market(
product INT UNSIGNED,
market INT UNSIGNED, /*this will be a FOREIGN KEY*/
brand INT UNSIGNED, /*this will be a FOREIGN KEY*/
price DECIMAL(10,2) UNSIGNED NOT NULL,
PRIMARY KEY(product, market, brand),
CONSTRAINT FOREIGN KEY (product) REFERENCES Product(id));
INSERT INTO Product
(name) VALUES
('Chair'), /*will get id=1*/
('Table'), /*will get id=2*/
('Bed'); /*will get id=3*/
INSERT INTO Product_Brand_Market
(product, market, brand, price) VALUES
(1, 1, 1, 8.00), /*cheapest chair (brand=1, market=1)*/
(1, 1, 2, 8.50),
(1, 2, 1, 9.00),
(1, 2, 2, 9.50),
(2, 1, 1, 11.50),
(2, 1, 2, 11.00),
(2, 2, 1, 10.50),
(2, 2, 2, 10.00); /*cheapest table (brand=2, market=2)*/
/*no entries for bed, must return null*/
And tried the following code to get the desired values:
UPDATE: Changed INNER JOIN for LEFT JOIN.
SELECT p.id product, MIN(pbm.price) price, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON p.id = pbm.product
GROUP BY p.id;
The returned price is OK, but I'm getting the wrong keys:
| product | price | brand | market |
|---------|-------|-------|--------|
| 1 | 8 | 1 | 1 |
| 2 | 10 | 1 | 1 |
| 3 | null | null | null |
So the only way I could think to solve it is with subqueries, but I had to use two subqueries to get both brand and market:
SELECT
p.id product,
(
SELECT pbm.brand
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as brand,
(
SELECT pbm.market
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as market
FROM Product p;
It returns the desired table:
| product | brand | market |
|---------|-------|--------|
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | null | null |
But I want to know whether I really need these two similar subqueries or whether there is a better way to do this in MySQL. Any ideas?
Use a correlated subquery with LIMIT 1 in the WHERE clause:
SELECT product, brand, market
FROM Product_Brand_Market pbm
WHERE (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
)
This will return only one row per product, even if two or more rows share the same lowest price.
Demo: http://rextester.com/UIC44628
Update:
To get all products even if they have no entries in the Product_Brand_Market table, you will need a LEFT JOIN. Note that the condition should be moved to the ON clause.
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON pbm.product = p.id
AND (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
);
Demo: http://rextester.com/MGXN36725
The following query might make better use of your PK for the JOIN:
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON (pbm.product, pbm.market, pbm.brand) = (
SELECT pbm1.product, pbm1.market, pbm1.brand
FROM Product_Brand_Market pbm1
WHERE pbm1.product = p.id
ORDER BY pbm1.price ASC
LIMIT 1
);
An index on Product_Brand_Market(product, price) should also help to improve the performance of the subquery.
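The last query can be tried out on the question's sample data with the sketch below, which uses SQLite through Python's sqlite3 module as a stand-in for MySQL (row-value comparisons need SQLite 3.15 or newer). The extra `market, brand` tie-breakers in the ORDER BY and the function name are my additions, meant only to make the pick deterministic when prices tie.

```python
import sqlite3

def cheapest_offer_per_product():
    """Return (product, brand, market) of the cheapest offer, NULLs if none."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Product (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL
        );
        CREATE TABLE Product_Brand_Market (
            product INTEGER REFERENCES Product(id),
            market  INTEGER,
            brand   INTEGER,
            price   REAL NOT NULL,
            PRIMARY KEY (product, market, brand)
        );
        INSERT INTO Product (name) VALUES ('Chair'), ('Table'), ('Bed');
        INSERT INTO Product_Brand_Market (product, market, brand, price) VALUES
            (1, 1, 1, 8.00), (1, 1, 2, 8.50), (1, 2, 1, 9.00), (1, 2, 2, 9.50),
            (2, 1, 1, 11.50), (2, 1, 2, 11.00), (2, 2, 1, 10.50), (2, 2, 2, 10.00);
    """)
    rows = conn.execute("""
        SELECT p.id AS product, pbm.brand, pbm.market
        FROM Product AS p
        LEFT JOIN Product_Brand_Market AS pbm
          ON (pbm.product, pbm.market, pbm.brand) = (
                -- full primary key of the cheapest row for this product
                SELECT pbm1.product, pbm1.market, pbm1.brand
                FROM Product_Brand_Market AS pbm1
                WHERE pbm1.product = p.id
                ORDER BY pbm1.price ASC, pbm1.market ASC, pbm1.brand ASC
                LIMIT 1
             )
        ORDER BY p.id
    """).fetchall()
    conn.close()
    return rows

print(cheapest_offer_per_product())
```

For the bed, the subquery finds no row, the comparison is never true, and the LEFT JOIN fills brand and market with NULL.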

select min value of a field from joins table

Hi guys! I have three tables:
Products
Categories
Prices
A product belongs to one category and may have several prices.
Consider this set of data:
Product:
id | title | featured | category_id
1 | bread | yes | 99
2 | milk | yes | 99
3 | honey | yes | 99
Price:
id | product_id | price | quantity
1 | 1 | 99.99 | 10
2 | 1 | 150.00 | 50
3 | 2 | 33.10 | 20
4 | 2 | 10.00 | 11
I need to create a view: a full list of products that, for each product, selects the min price and the product's category.
E.g.:
id | title | featured | cat.name | price | quantity
1 | bread | yes | food | 99.99 | 10
I tried the following query, but it only gets Price.price right: MIN picks the minimum price, while Price.quantity, for example, comes from another row. I need to find the row with the min Price.price and use that row's Price.quantity.
CREATE VIEW products_view
AS
SELECT `Prod`.`id`, `Prod`.`title`, `Prod`.`featured`, `Cat`.`name`, MIN(`Price`.`price`) as price,`Price`.`quantity`
FROM `products` AS `Prod`
LEFT JOIN `prices` AS `Price` ON (`Price`.`product_id` = `Prod`.`id`)
LEFT JOIN `categories` AS `Cat` ON (`Prod`.`category_id` = `Cat`.`id`)
GROUP BY `Prod`.`id`
ORDER BY `Prod`.`id` ASC
My result is:
id | title | featured | cat.name | price | quantity
1 | bread | yes | food | 99.99 | **50** <-- wrong
Can you help me? Thanks in advance!
As documented under MySQL Extensions to GROUP BY (emphasis added):
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses.
What you are looking for is the group-wise minimum, which can be obtained by joining the grouped results back to the table:
SELECT Prod.id, Prod.title, Prod.featured, Cat.name, Price.price, Price.quantity
FROM products AS Prod
LEFT JOIN categories AS Cat ON Prod.category_id = Cat.id
LEFT JOIN (
prices AS Price NATURAL JOIN (
SELECT product_id, MIN(price) AS price
FROM prices
GROUP BY product_id
) t
) ON Price.product_id = Prod.id
ORDER BY Prod.id
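Here is a minimal sketch of the group-wise minimum on the question's sample data, using SQLite via Python's sqlite3 module instead of MySQL. It swaps the nested NATURAL JOIN for an explicit derived table with spelled-out join conditions, which is my substitution for portability; the alias names `cheapest` and `m`, the `min_price` column, and the function name are assumptions.

```python
import sqlite3

def products_with_min_price():
    """Each product with its category and the full row of its lowest price."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (
            id INTEGER PRIMARY KEY, title TEXT, featured TEXT, category_id INTEGER);
        CREATE TABLE prices (
            id INTEGER PRIMARY KEY, product_id INTEGER, price REAL, quantity INTEGER);
        INSERT INTO categories VALUES (99, 'food');
        INSERT INTO products VALUES
            (1, 'bread', 'yes', 99), (2, 'milk', 'yes', 99), (3, 'honey', 'yes', 99);
        INSERT INTO prices VALUES
            (1, 1, 99.99, 10), (2, 1, 150.00, 50), (3, 2, 33.10, 20), (4, 2, 10.00, 11);
    """)
    rows = conn.execute("""
        SELECT prod.id, prod.title, prod.featured, cat.name,
               cheapest.price, cheapest.quantity
        FROM products AS prod
        LEFT JOIN categories AS cat ON prod.category_id = cat.id
        LEFT JOIN (
            -- join the per-product minimum back to prices to recover
            -- the quantity that belongs to that minimum price
            SELECT pr.product_id, pr.price, pr.quantity
            FROM prices AS pr
            INNER JOIN (
                SELECT product_id, MIN(price) AS min_price
                FROM prices
                GROUP BY product_id
            ) AS m
              ON m.product_id = pr.product_id
             AND m.min_price = pr.price
        ) AS cheapest ON cheapest.product_id = prod.id
        ORDER BY prod.id
    """).fetchall()
    conn.close()
    return rows

for row in products_with_min_price():
    print(row)
```

Bread now comes back with quantity 10, the value from the same row as its lowest price, and honey (which has no prices) is kept with NULLs. If two price rows tie for the minimum, the product appears once per tied row.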

MySQL SELECT combining 3 SELECTs INTO 1

Consider following tables in MySQL database:
entries:
creator_id INT
entry TEXT
is_expired BOOL
other:
creator_id INT
entry TEXT
userdata:
creator_id INT
name VARCHAR
etc...
In entries and other, there can be multiple entries by one creator. The userdata table is read-only for me (it lives in another database).
I'd like to achieve the following SELECT result:
+------------+---------+---------+-------+
| creator_id | entries | expired | other |
+------------+---------+---------+-------+
| 10951 | 59 | 55 | 39 |
| 70887 | 41 | 34 | 108 |
| 88309 | 38 | 20 | 102 |
| 94732 | 0 | 0 | 86 |
... where entries is equal to SELECT COUNT(entry) FROM entries GROUP BY creator_id,
expired is equal to SELECT COUNT(entry) FROM entries WHERE is_expired = 1 GROUP BY creator_id and
other is equal to SELECT COUNT(entry) FROM other GROUP BY creator_id.
I need this structure because after doing this SELECT, I need to look for user data in the "userdata" table, which I planned to do with INNER JOIN and select desired columns.
I solved this problem by selecting NULL into the columns that do not apply to a given SELECT:
SELECT
creator_id,
COUNT(any_entry) as entries,
COUNT(expired_entry) as expired,
COUNT(other_entry) as other
FROM (
SELECT
creator_id,
entry AS any_entry,
NULL AS expired_entry,
NULL AS other_entry
FROM entries
UNION ALL -- plain UNION would drop duplicate rows and undercount
SELECT
creator_id,
NULL AS any_entry,
entry AS expired_entry,
NULL AS other_entry
FROM entries
WHERE is_expired = 1
UNION ALL
SELECT
creator_id,
NULL AS any_entry,
NULL AS expired_entry,
entry AS other_entry
FROM other
) AS tTemp
GROUP BY creator_id
ORDER BY
entries DESC,
expired DESC,
other DESC
;
I've left out the INNER JOIN and selecting other columns from userdata table on purpose (my question being about combining 3 SELECTs into 1).
Is my idea valid? = Am I trying to use the right "construction" for this?
Are these kind of SELECTs possible without creating an "empty" column? (some kind of JOIN)
Should I do it "outside the DB": make 3 SELECTs, make some order in it (let's say python lists/dicts) and then do the additional SELECTs for userdata?
Solution for a similar question does not return rows where entries and expired are 0.
Thank you for your time.
This should work (assuming all creator_ids appear in the userdata table):
SELECT userdata.creator_id, COALESCE(entries_count_,0) AS entries_count, COALESCE(expired_count_,0) AS expired_count, COALESCE(other_count_,0) AS other_count
FROM userdata
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS entries_count_
FROM entries
GROUP BY creator_id) AS entries_q
ON userdata.creator_id=entries_q.creator_id
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS expired_count_
FROM entries
WHERE is_expired=1
GROUP BY creator_id) AS expired_q
ON userdata.creator_id=expired_q.creator_id
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS other_count_
FROM other
GROUP BY creator_id) AS other_q
ON userdata.creator_id=other_q.creator_id;
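The pattern of LEFT JOINing pre-aggregated subqueries onto userdata can be tried with the sketch below, which runs on SQLite through Python's sqlite3 module. The sample rows, the short aliases, and the function name are invented for illustration, and the sketch assumes that is_expired = 1 marks an expired entry, as in the asker's own UNION query.

```python
import sqlite3

def counts_per_creator():
    """Per creator: number of entries, expired entries, and 'other' rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE userdata (creator_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE entries (creator_id INTEGER, entry TEXT, is_expired INTEGER);
        CREATE TABLE other (creator_id INTEGER, entry TEXT);
        INSERT INTO userdata VALUES (10, 'ann'), (20, 'bob'), (30, 'cy');
        INSERT INTO entries VALUES
            (10, 'a', 1), (10, 'b', 1), (10, 'c', 0), (20, 'd', 0);
        INSERT INTO other VALUES (10, 'x'), (30, 'y'), (30, 'z');
    """)
    rows = conn.execute("""
        SELECT u.creator_id,
               COALESCE(e.n, 0) AS entries_count,   -- 0 when no group matched
               COALESCE(x.n, 0) AS expired_count,
               COALESCE(o.n, 0) AS other_count
        FROM userdata AS u
        LEFT JOIN (SELECT creator_id, COUNT(entry) AS n
                   FROM entries
                   GROUP BY creator_id) AS e ON e.creator_id = u.creator_id
        LEFT JOIN (SELECT creator_id, COUNT(entry) AS n
                   FROM entries
                   WHERE is_expired = 1
                   GROUP BY creator_id) AS x ON x.creator_id = u.creator_id
        LEFT JOIN (SELECT creator_id, COUNT(entry) AS n
                   FROM other
                   GROUP BY creator_id) AS o ON o.creator_id = u.creator_id
        ORDER BY u.creator_id
    """).fetchall()
    conn.close()
    return rows

print(counts_per_creator())
```

Because each subquery is aggregated before it is joined, the three counts cannot multiply each other, and creators with no entries at all (creator 30 here) still appear with zeros.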
Basically, what you are doing looks correct to me.
I would rewrite it as follows, though:
SELECT entries.creator_id
, any_entry
, expired_entry
, other_entry
FROM (
SELECT creator_id, COUNT(entry) AS any_entry
FROM entries
GROUP BY creator_id
) entries
LEFT OUTER JOIN (
SELECT creator_id, COUNT(entry) AS expired_entry
FROM entries
WHERE is_expired = 1
GROUP BY creator_id
) expired ON expired.creator_id = entries.creator_id
LEFT OUTER JOIN (
SELECT creator_id, COUNT(entry) AS other_entry
FROM other
GROUP BY creator_id
) other ON other.creator_id = entries.creator_id
How about:
SELECT creator_id,
(SELECT COUNT(*)
FROM entries e
WHERE e.creator_id = main.creator_id) AS entries,
(SELECT COUNT(*)
FROM entries e
WHERE e.creator_id = main.creator_id AND
e.is_expired = 1) as expired,
(SELECT COUNT(*)
FROM other
WHERE other.creator_id = main.creator_id) AS other
FROM entries main
GROUP BY main.creator_id;