sql query with grouped by column - mysql

I have two tables, transactions (T) and listings (L).
Table T has these fields:
order_date timestamp
order_id BIGINT
listing_id INT
price INT
Table L has these fields:
listing_id INT
price INT
category varchar
I want to get the sell ratio for each category, where the sell ratio is defined as the number of sold listings divided by the total number of listings, times 100. How can I compose this? Would a CASE statement or a CTE work better?
The listings table holds all available listings, and the transactions table holds all sold ones.
Thanks

Is this what you want?
select
l.category,
count(*) no_listing_transactions,
100.0 * count(*) / sum(count(*)) over() per100
from t
inner join l on l.listing_id = t.listing_id
group by l.category
This gives you the count of transactions per category, and the percent that this count represents over the total number of transactions.
Note that this makes use of window functions, which require MySQL 8.0. In earlier versions, one solution would be to use a scalar subquery for the total (assuming that there are no "orphan" transactions):
select
l.category,
count(*) no_listing_transactions,
100.0 * count(*) / (select count(*) from t) per100
from t
inner join l on l.listing_id = t.listing_id
group by l.category
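Neither query above computes the ratio exactly as the question defines it (sold listings divided by all listings, per category). Here is a minimal sketch of that definition. It runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration. The SELECT uses only standard SQL, so it should carry over to MySQL, but that is an assumption worth checking on your version.

```python
import sqlite3


def sell_ratio_by_category(conn):
    """Return (category, sell_ratio) rows: sold listings / all listings * 100."""
    return conn.execute(
        """
        SELECT l.category,
               100.0 * COUNT(DISTINCT t.listing_id)
                     / COUNT(DISTINCT l.listing_id) AS sell_ratio
        FROM l
        LEFT JOIN t ON t.listing_id = l.listing_id  -- keep unsold listings too
        GROUP BY l.category
        ORDER BY l.category
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE l (listing_id INTEGER, price INTEGER, category TEXT);
    CREATE TABLE t (order_date TEXT, order_id INTEGER, listing_id INTEGER, price INTEGER);
    -- Hypothetical sample data: 2 FOOD, 4 DRINKS and 1 TOYS listing
    INSERT INTO l VALUES (1, 10, 'FOOD'), (2, 20, 'FOOD'),
                         (3, 5, 'DRINKS'), (4, 5, 'DRINKS'),
                         (5, 6, 'DRINKS'), (6, 7, 'DRINKS'),
                         (7, 30, 'TOYS');
    -- Listing 1 sold twice, listing 3 sold once
    INSERT INTO t VALUES ('2020-01-01', 1, 1, 10),
                         ('2020-01-02', 2, 1, 10),
                         ('2020-01-03', 3, 3, 5);
    """
)
print(sell_ratio_by_category(conn))
```

COUNT(DISTINCT t.listing_id) counts a listing once however often it was sold, and the LEFT JOIN keeps categories with no sales at all, which then get a ratio of 0.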

Try this one
Schema (MySQL v5.7)
Query #1
Create Table `gilbertdim_333952_L` (
listing_id int NOT NULL AUTO_INCREMENT,
price float,
category varchar(10),
PRIMARY KEY (listing_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query #2
INSERT INTO gilbertdim_333952_L (price, category) VALUES
(100, 'FOOD'),
(50, 'DRINKS');
Query #3
Create Table `gilbertdim_333952_T` (
order_id int NOT NULL AUTO_INCREMENT,
order_date timestamp NULL DEFAULT CURRENT_TIMESTAMP,
listing_id int,
price float,
PRIMARY KEY (order_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query #4
INSERT INTO gilbertdim_333952_T (listing_id, price) VALUES
(1, 100),(1, 100),(1, 100),
(2, 50),(2, 50);
Query #5
SELECT l.*, (COUNT(1) / (SELECT COUNT(1) FROM gilbertdim_333952_T) * 100) as sales
FROM gilbertdim_333952_L l
LEFT JOIN gilbertdim_333952_T t ON l.listing_id = t.listing_id
GROUP BY l.listing_id;
| listing_id | price | category | sales |
| ---------- | ----- | -------- | ----- |
| 1 | 100 | FOOD | 60 |
| 2 | 50 | DRINKS | 40 |

Related

Selecting all rows in a table with a distinct key

I want to only retrieve the table rows where productId is DISTINCT
So for this case, I want to retrieve:
productID| name | price | ...
1 | ... | ... | ...
2 | ... | ... | ...
3 | ... | ... | ...
filtering out the repeated productID
What I have tried - Product is the table:
SELECT DISTINCT `productId` FROM `Product`
SELECT *
FROM `Product`
WHERE DISTINCT `productId`
Both of these don't work, any help would be greatly appreciated.
You can join on a subquery that returns each productId together with its minimum productindex. The subquery uses GROUP BY to get one row per productId (the "distinct" values), and the aggregate function MIN picks the record with the lowest productindex (i.e. the first record with that productId). Joining back on both columns returns the full row. No HAVING clause is needed; a bare HAVING MIN(`productindex`) would only drop groups whose minimum is 0 or NULL.
SELECT p.*
FROM Product as `p`
inner join
(
select `productId`, MIN(`productindex`) as `productindex`
from Product
group by `productId`
) as `x`
on `x`.`productId` = `p`.`productId` and `x`.`productindex` = `p`.`productindex`
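To see the join in action, here is a small sketch run through Python's built-in sqlite3 module (a stand-in for MySQL). The Product layout with a productindex column follows the answer's assumption, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical layout: productindex is unique, productId repeats
    CREATE TABLE Product (productindex INTEGER PRIMARY KEY, productId INTEGER, name TEXT);
    INSERT INTO Product VALUES (1, 1, 'a'), (2, 1, 'a-dup'),
                               (3, 2, 'b'),
                               (4, 3, 'c'), (5, 3, 'c-dup');
    """
)

# Keep only the row with the lowest productindex for each productId.
rows = conn.execute(
    """
    SELECT p.productId, p.name
    FROM Product AS p
    INNER JOIN (
        SELECT productId, MIN(productindex) AS productindex
        FROM Product
        GROUP BY productId
    ) AS x
      ON x.productId = p.productId AND x.productindex = p.productindex
    ORDER BY p.productId
    """
).fetchall()
print(rows)
```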

MySql join by json key and multiply quantity for each order item and get total price

I have not been able to figure this out for hours now.
I would like to have the total price in one sql select.
Given is a json column where the key is the productId and the value is the quantity.
The customer can have multiple order items.
The quantity must be multiplied by net_price and tax_price.
The SUM of these gives the total price.
I can do this relationally without JSON, but I would prefer a JSON column.
I prepared an example to make it clear:
Given:
CREATE TABLE order_items (
`customer_id` VARCHAR(26),
`products` json
);
INSERT INTO order_items VALUES ('01G51A4EK52RHB361SMXH2D5KL', '{"01G51A4EK52RHB361SMXH2D5KH": 10, "01G51A4EK52RHB361SMXH2D5KK": 20}');
INSERT INTO order_items VALUES ('01G51A4EK52RHB361SMXH2D5KL', '{"01G51A4EK52RHB361SMXH2D5KH": 30}');
INSERT INTO order_items VALUES ('01G51A4EK52RHB361SMXH2D5KL', '{"01G51A4EK52RHB361SMXH2D5KH": 30}');
CREATE TABLE product (
`productId` VARCHAR(26),
`net_price` INTEGER,
`tax_price` INTEGER
);
INSERT INTO product VALUES ('01G51A4EK52RHB361SMXH2D5KH', 100, 20);
INSERT INTO product VALUES ('01G51A4EK52RHB361SMXH2D5KK', 200, 10);
What I have so far, which is incomplete:
SELECT
JSON_UNQUOTE(
JSON_EXTRACT(
JSON_KEYS(`products`),
CONCAT(
'$[',
ROW_NUMBER() OVER(PARTITION BY `products`) -1,
']'
)
)
) AS "productId",quantity
FROM order_items
JOIN JSON_TABLE(
products,
'$.*' COLUMNS (
quantity VARCHAR(50) PATH '$'
)
) j
WHERE `order_items`.`customer_id` = '01G51A4EK52RHB361SMXH2D5KL';
DB-Fiddle:
https://www.db-fiddle.com/f/reewoqUCQxeDLJb6zpb1RG/1
Could someone help me out here? Is this even possible?
Thank you!
Here's a solution to get the corresponding net_price and tax_price. I am not sure how you want to use them.
SELECT j.productId,
JSON_UNQUOTE(JSON_EXTRACT(i.products, CONCAT('$."', j.productId, '"'))) AS quantity,
p.net_price,
p.tax_price
FROM order_items AS i
CROSS JOIN JSON_TABLE(JSON_KEYS(i.products),
'$[*]' COLUMNS (
productId VARCHAR(26) PATH '$'
)
) AS j
JOIN product AS p USING (productId)
WHERE i.`customer_id` = '01G51A4EK52RHB361SMXH2D5KL';
Output given your sample data:
+----------------------------+----------+-----------+-----------+
| productId | quantity | net_price | tax_price |
+----------------------------+----------+-----------+-----------+
| 01G51A4EK52RHB361SMXH2D5KH | 30 | 100 | 20 |
| 01G51A4EK52RHB361SMXH2D5KH | 30 | 100 | 20 |
| 01G51A4EK52RHB361SMXH2D5KH | 10 | 100 | 20 |
| 01G51A4EK52RHB361SMXH2D5KK | 20 | 200 | 10 |
+----------------------------+----------+-----------+-----------+
Calculating the total aggregate price:
SELECT SUM(
JSON_UNQUOTE(JSON_EXTRACT(i.products, CONCAT('$."', j.productId, '"')))
* (p.net_price + p.tax_price)
) AS total_price
FROM order_items AS i
CROSS JOIN JSON_TABLE(JSON_KEYS(i.products),
'$[*]' COLUMNS (
productId VARCHAR(26) PATH '$'
)
) AS j
JOIN product AS p USING (productId)
WHERE i.`customer_id` = '01G51A4EK52RHB361SMXH2D5KL';
Output:
+-------------+
| total_price |
+-------------+
| 12600 |
+-------------+
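As a cross-check of that figure, the same arithmetic can be reproduced outside the database. This is only a sketch in plain Python using the sample data from the question; the function name total_price is made up for illustration.

```python
import json

# The three JSON documents from order_items for this customer
order_items = [
    '{"01G51A4EK52RHB361SMXH2D5KH": 10, "01G51A4EK52RHB361SMXH2D5KK": 20}',
    '{"01G51A4EK52RHB361SMXH2D5KH": 30}',
    '{"01G51A4EK52RHB361SMXH2D5KH": 30}',
]

# productId -> (net_price, tax_price), as in the product table
product = {
    "01G51A4EK52RHB361SMXH2D5KH": (100, 20),
    "01G51A4EK52RHB361SMXH2D5KK": (200, 10),
}


def total_price(json_rows, prices):
    """Sum quantity * (net_price + tax_price) over every key of every JSON row."""
    total = 0
    for raw in json_rows:
        for product_id, quantity in json.loads(raw).items():
            net, tax = prices[product_id]
            total += quantity * (net + tax)
    return total


print(total_price(order_items, product))  # 1200 + 4200 + 3600 + 3600 = 12600
```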

SQL where not exists with multiple rows and status

I have the following tables (minified for the sake of simplicity):
CREATE TABLE IF NOT EXISTS `product_bundles` (
bundle_id int AUTO_INCREMENT PRIMARY KEY
-- More columns here for bundle attributes
) ENGINE=InnoDB;
CREATE TABLE IF NOT EXISTS `product_bundle_parts` (
`part_id` int AUTO_INCREMENT PRIMARY KEY,
`bundle_id` int NOT NULL,
`sku` varchar(255) NOT NULL,
-- More columns here for product attributes
KEY `bundle_id` (`bundle_id`),
KEY `sku` (`sku`)
) ENGINE=InnoDB;
CREATE TABLE IF NOT EXISTS `products` (
`product_id` mediumint(8) AUTO_INCREMENT PRIMARY KEY,
`sku` varchar(64) NOT NULL DEFAULT '',
`status` char(1) NOT NULL default 'A',
-- More columns here for product attributes
KEY (`sku`)
) ENGINE=InnoDB;
And I want to show only the 'product bundles' that are currently completely in stock and defined in the database (since these get retrieved from a third party vendor, there is no guarantee the SKU is defined). So I figured I'd need an anti-join to retrieve it accordingly:
SELECT SQL_CALC_FOUND_ROWS *
FROM product_bundles AS bundles
WHERE 1
AND NOT EXISTS (
SELECT *
FROM product_bundle_parts AS parts
LEFT JOIN products AS products ON parts.sku = products.sku
WHERE parts.bundle_id = bundles.bundle_id
AND products.status = 'A'
AND products.product_id IS NULL
)
-- placeholder for other dynamic conditions for e.g. sorting
LIMIT 0, 24
Now, I sincerely thought this would filter out the products by status, however, that seems not to be the case. I then changed one thing up a bit, and the query never finished (although I believe it to be correct):
SELECT SQL_CALC_FOUND_ROWS *
FROM product_bundles AS bundles
WHERE 1
AND NOT EXISTS (
SELECT *
FROM product_bundle_parts AS parts
LEFT JOIN products AS products ON parts.sku = products.sku
AND products.status = 'A'
WHERE parts.bundle_id = bundles.bundle_id
AND products.product_id IS NULL
)
-- placeholder for other dynamic conditions for e.g. sorting
LIMIT 0, 24
Example data:
product_bundles
bundle_id | etc.
1 |
2 |
3 |
product_bundle_parts
part_id | bundle_id | sku
1 | 1 | 'sku11'
2 | 1 | 'sku22'
3 | 1 | 'sku33'
4 | 1 | 'sku44'
5 | 2 | 'sku55'
6 | 2 | 'sku66'
7 | 3 | 'sku77'
8 | 3 | 'sku88'
products
product_id | sku | status
101 | 'sku11' | 'A'
102 | 'sku22' | 'A'
103 | 'sku33' | 'A'
104 | 'sku44' | 'A'
105 | 'sku55' | 'D'
106 | 'sku66' | 'A'
107 | 'sku77' | 'A'
108 | 'sku99' | 'A'
Example result: Since the product status of product #105 is 'D' and 'sku88' from part #8 was not found:
bundle_id | etc.
1 |
I am running Server version: 10.3.25-MariaDB-0ubuntu0.20.04.1 Ubuntu 20.04
So there are a few questions I have.
Why does the first query not filter out products that do not have the status A?
Why does the second query not finish?
Are there alternative ways of achieving the same thing in a more efficient manner, as this looks rather cumbersome?
First of all, I've read that SQL_CALC_FOUND_ROWS is much slower than running two separate queries (a COUNT(*) and then the SELECT *, or, if you issue the query from another programming language such as PHP, executing the SELECT * and then counting the rows of the result set).
Second: in your first query the subquery can never return a row. products.status = 'A' requires a matching product, while products.product_id IS NULL requires that there is none, so NOT EXISTS is always true and every bundle is returned. What you need are the bundles whose parts ALL have an active product.
I'd change it to the following:
SELECT SQL_CALC_FOUND_ROWS *
FROM product_bundles AS bundles
WHERE NOT EXISTS (
SELECT 'x'
FROM product_bundle_parts AS parts
LEFT JOIN products ON (parts.sku = products.sku)
WHERE parts.bundle_id = bundles.bundle_id
AND COALESCE(products.status, 'X') != 'A'
)
-- placeholder for other dynamic conditions for e.g. sorting
LIMIT 0, 24
I changed products.status = 'A' to COALESCE(products.status, 'X') != 'A': this way the query returns only the bundles that DON'T have any inactive part. COALESCE turns the NULL status of a part with no matching product into 'X', so undefined SKUs count as inactive too. That is also why I removed the condition AND products.product_id IS NULL: it would have had to be combined with OR, at a cost in performance.
You can see my solution in SQLFiddle.
Finally, to find out why your second query doesn't finish, you should check the structure of your tables and how they are indexed. Running EXPLAIN on the query could help you find potential issues in the structure. Just put the keyword EXPLAIN before the SELECT and you'll have your "report" (EXPLAIN SELECT * ....).
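The NOT EXISTS filter can be checked against the example data from the question. The sketch below uses Python's built-in sqlite3 module instead of MariaDB, so SQL_CALC_FOUND_ROWS and LIMIT are left out and the table definitions are simplified; treat it as an illustration of the filter only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_bundles (bundle_id INTEGER PRIMARY KEY);
    CREATE TABLE product_bundle_parts (
        part_id INTEGER PRIMARY KEY, bundle_id INTEGER NOT NULL, sku TEXT NOT NULL);
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY, sku TEXT NOT NULL, status TEXT NOT NULL DEFAULT 'A');

    INSERT INTO product_bundles VALUES (1), (2), (3);
    INSERT INTO product_bundle_parts VALUES
        (1, 1, 'sku11'), (2, 1, 'sku22'), (3, 1, 'sku33'), (4, 1, 'sku44'),
        (5, 2, 'sku55'), (6, 2, 'sku66'),
        (7, 3, 'sku77'), (8, 3, 'sku88');
    INSERT INTO products VALUES
        (101, 'sku11', 'A'), (102, 'sku22', 'A'), (103, 'sku33', 'A'), (104, 'sku44', 'A'),
        (105, 'sku55', 'D'), (106, 'sku66', 'A'), (107, 'sku77', 'A'), (108, 'sku99', 'A');
    """
)

# A bundle qualifies when it has no part that is inactive or missing from products.
complete_bundles = [
    row[0]
    for row in conn.execute(
        """
        SELECT bundles.bundle_id
        FROM product_bundles AS bundles
        WHERE NOT EXISTS (
            SELECT 'x'
            FROM product_bundle_parts AS parts
            LEFT JOIN products ON parts.sku = products.sku
            WHERE parts.bundle_id = bundles.bundle_id
              AND COALESCE(products.status, 'X') != 'A'
        )
        ORDER BY bundles.bundle_id
        """
    )
]
print(complete_bundles)
```

Bundle 2 drops out because of the 'D' status of sku55, and bundle 3 because sku88 has no row in products, which leaves bundle 1 as in the expected result.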

Using MIN value after JOIN MAX(date) MYSQL

I have 3 tables: manufacturers, products and prices.
I want to get the latest prices of a product and select the minimum of them.
Table manufacturers:
# manufacturers
id name
1 Manufacturer 1
2 Manufacturer 2
Table products:
# products
id name
1 Product 1
2 Product 2
Table prices:
# prices
id price manufacturerId createdAt
1 10 1 '2019-09-09 00:00:00'
2 20 1 '2019-09-10 00:00:00'
3 11 2 '2019-09-09 00:00:00'
4 21 2 '2019-09-10 00:00:00'
Full code:
DROP DATABASE if exists ssg ;
CREATE DATABASE ssg;
USE ssg;
# Create table manufacturers
CREATE TABLE manufacturers (id INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(256) NOT NULL);
# Insert value
INSERT INTO manufacturers (name) VALUES ('Manufacturer 1');
INSERT INTO manufacturers (name) VALUES ('Manufacturer 2');
# Create table products
CREATE TABLE products (id INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(256) NOT NULL);
# Insert value
INSERT INTO products (name) VALUES ('Product 1');
# Create table prices
CREATE TABLE prices (id INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
productId INT(11) UNSIGNED NOT NULL,
price BIGINT UNSIGNED NOT NULL,
manufacturerId INT(11) UNSIGNED NOT NULL,
createdAt DATETIME NOT NULL);
# Insert value
INSERT INTO prices (productId, price, manufacturerId, createdAt) VALUES (1, 10, 1, '2019-09-09 00:00:00');
INSERT INTO prices (productId, price, manufacturerId, createdAt) VALUES (1, 20, 1, '2019-09-10 00:00:00');
INSERT INTO prices (productId, price, manufacturerId, createdAt)VALUES (1, 11, 2, '2019-09-09 00:00:00');
INSERT INTO prices (productId, price, manufacturerId, createdAt)VALUES (1, 21, 2, '2019-09-10 00:00:00');
# Query
SELECT products.id, products.name, lastValue.price as latestPrice, lastValue.manufacturerId
FROM products
LEFT JOIN(
SELECT productId, COUNT(DISTINCT manufacturerId) AS total
FROM prices
GROUP BY prices.productId) counts ON counts.productId = products.id
LEFT JOIN (
SELECT prices.*
FROM (
SELECT productId, MAX(createdAt) createdAt
FROM prices
GROUP BY productId) latest
JOIN prices ON latest.productId = prices.productId
AND prices.createdAt = latest.createdAt
) lastValue
ON lastValue.productId = products.id
and I got:
id name latestPrice manufacturerId
1 Product 1 20 1
1 Product 1 21 2
So how can I get each product with only the MIN of latestPrice?
I have posted it at http://sqlfiddle.com/#!9/418cb7/1 . Please "Build Schema" then "Run SQL".
Sorry for my bad English.
In MySQL 8.0, you can do this with window functions only:
select id, name, price, manufacturerId
from (
select
t.*,
rank() over(partition by id order by price) rn2
from (
select
p.id,
p.name,
i.price,
i.manufacturerId,
rank() over(partition by p.id order by i.createdAt desc) rn1
from products p
inner join prices i on i.productId = p.id
) t
where rn1 = 1
) t
where rn2 = 1
This reads as:
first rank the prices of each product by descending date, and keep only the latest prices per product
then rank the latest prices by ascending price, and keep only the lowest of them
Demo on DB Fiddle:
id | name | price | manufacturerId
-: | :-------- | ----: | -------------:
1 | Product 1 | 20 | 1
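For MySQL versions before 8.0, where window functions are not available, the same two steps can be sketched with correlated subqueries instead. This is an alternative technique, not the query from the answer above. It is run here through Python's built-in sqlite3 module as a stand-in for MySQL, with the sample data from the question and simplified column types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE prices (
        id INTEGER PRIMARY KEY, productId INTEGER NOT NULL, price INTEGER NOT NULL,
        manufacturerId INTEGER NOT NULL, createdAt TEXT NOT NULL);
    INSERT INTO products (name) VALUES ('Product 1');
    INSERT INTO prices (productId, price, manufacturerId, createdAt) VALUES
        (1, 10, 1, '2019-09-09 00:00:00'),
        (1, 20, 1, '2019-09-10 00:00:00'),
        (1, 11, 2, '2019-09-09 00:00:00'),
        (1, 21, 2, '2019-09-10 00:00:00');
    """
)

rows = conn.execute(
    """
    SELECT p.id, p.name, pr.price, pr.manufacturerId
    FROM products p
    JOIN prices pr ON pr.productId = p.id
    -- step 1: keep only the rows carrying the product's latest date
    WHERE pr.createdAt = (SELECT MAX(createdAt) FROM prices WHERE productId = p.id)
    -- step 2: among those rows, keep the lowest price
      AND pr.price = (SELECT MIN(p2.price)
                      FROM prices p2
                      WHERE p2.productId = p.id
                        AND p2.createdAt = pr.createdAt)
    ORDER BY p.id
    """
).fetchall()
print(rows)
```

Like rank(), this returns several rows for a product when two manufacturers tie on the lowest latest price.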

How to get multiple columns on subquery or group by

I have two tables on MySql, the first contains an ID and the name of some products. I have to get the cheapest combination of brand/market for each product. So, I've inserted some items into both tables:
UPDATE: Inserted new product (bed) with no 'Product_Brand_Market' to test LEFT JOIN.
UPDATE: Changed some product prices for better testing.
CREATE TABLE Product(
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL);
CREATE TABLE Product_Brand_Market(
product INT UNSIGNED,
market INT UNSIGNED, /*this will be a FOREIGN KEY*/
brand INT UNSIGNED, /*this will be a FOREIGN KEY*/
price DECIMAL(10,2) UNSIGNED NOT NULL,
PRIMARY KEY(product, market, brand),
CONSTRAINT FOREIGN KEY (product) REFERENCES Product(id));
INSERT INTO Product
(name) VALUES
('Chair'), /*will get id=1*/
('Table'), /*will get id=2*/
('Bed'); /*will get id=3*/
INSERT INTO Product_Brand_Market
(product, market, brand, price) VALUES
(1, 1, 1, 8.00), /*cheapest chair (brand=1, market=1)*/
(1, 1, 2, 8.50),
(1, 2, 1, 9.00),
(1, 2, 2, 9.50),
(2, 1, 1, 11.50),
(2, 1, 2, 11.00),
(2, 2, 1, 10.50),
(2, 2, 2, 10.00); /*cheapest table (brand=2, market=2)*/
/*no entries for bed, must return null*/
And tried the following code to get the desired values:
UPDATE: Changed INNER JOIN for LEFT JOIN.
SELECT p.id product, MIN(pbm.price) price, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON p.id = pbm.product
GROUP BY p.id;
The returned price is OK, but I'm getting the wrong keys:
| product | price | brand | market |
|---------|-------|-------|--------|
| 1 | 8 | 1 | 1 |
| 2 | 10 | 1 | 1 |
| 3 | null | null | null |
So the only way I could think to solve it is with subqueries, but I had to use two subqueries to get both brand and market:
SELECT
p.id product,
(
SELECT pbm.brand
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as brand,
(
SELECT pbm.market
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as market
FROM Product p;
It returns the desired table:
| product | brand | market |
|---------|-------|--------|
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | null | null |
But I want to know if I really should use these two similar subqueries or there is a better way to do that on MySql, any ideas?
Use a correlated subquery with LIMIT 1 in the WHERE clause:
SELECT product, brand, market
FROM Product_Brand_Market pbm
WHERE (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
)
This will return only one row per product, even if two or more of them share the same lowest price.
Demo: http://rextester.com/UIC44628
Update:
To get all products even if they have no entries in the Product_Brand_Market table, you will need a LEFT JOIN. Note that the condition should be moved to the ON clause.
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON pbm.product = p.id
AND (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
);
Demo: http://rextester.com/MGXN36725
The following query might make better use of your PK for the JOIN:
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON (pbm.product, pbm.market, pbm.brand) = (
SELECT pbm1.product, pbm1.market, pbm1.brand
FROM Product_Brand_Market pbm1
WHERE pbm1.product = p.id
ORDER BY pbm1.price ASC
LIMIT 1
);
An index on Product_Brand_Market(product, price) should also help to improve the performance of the subquery.
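The last query can be tried against the sample data from the question. The sketch below runs it through Python's built-in sqlite3 module, which also accepts row-value comparisons against a subquery. SQLite is only a stand-in for MySQL here: the foreign key is left out and the index name idx_product_price is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Product (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL);
    CREATE TABLE Product_Brand_Market (
        product INTEGER, market INTEGER, brand INTEGER, price REAL NOT NULL,
        PRIMARY KEY (product, market, brand));
    -- Index suggested in the answer (the name is hypothetical)
    CREATE INDEX idx_product_price ON Product_Brand_Market (product, price);

    INSERT INTO Product (name) VALUES ('Chair'), ('Table'), ('Bed');
    INSERT INTO Product_Brand_Market (product, market, brand, price) VALUES
        (1, 1, 1, 8.00), (1, 1, 2, 8.50), (1, 2, 1, 9.00), (1, 2, 2, 9.50),
        (2, 1, 1, 11.50), (2, 1, 2, 11.00), (2, 2, 1, 10.50), (2, 2, 2, 10.00);
    """
)

# One LEFT JOIN whose ON clause picks the cheapest (product, market, brand) key.
rows = conn.execute(
    """
    SELECT p.id AS product, pbm.brand, pbm.market
    FROM Product p
    LEFT JOIN Product_Brand_Market pbm
      ON (pbm.product, pbm.market, pbm.brand) = (
          SELECT pbm1.product, pbm1.market, pbm1.brand
          FROM Product_Brand_Market pbm1
          WHERE pbm1.product = p.id
          ORDER BY pbm1.price ASC
          LIMIT 1
      )
    ORDER BY p.id
    """
).fetchall()
print(rows)
```

The bed has no entries, so the subquery finds nothing for it and the LEFT JOIN fills brand and market with NULL (None in Python).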