A better solution than heavy use of conditional JOINs - MySQL

I'm trying to create a "complex" view in MySQL. I need good performance because I have to query it twice per second, and each result contains about 1200 rows.
Here is an example schema with data:
CREATE TABLE objects (
object_id INT AUTO_INCREMENT,
model_id INT,
mode TINYINT,
recipe_id INT,
CONSTRAINT pk_objects PRIMARY KEY (object_id));
INSERT INTO objects (model_id, mode, recipe_id) VALUES (1, 0, 1), (1, 1, 1), (2, 1, 1);
CREATE TABLE models (
model_id INT AUTO_INCREMENT,
family_id INT,
CONSTRAINT pk_models PRIMARY KEY (model_id));
INSERT INTO models (family_id) VALUES (0), (1);
CREATE TABLE models_recipes (
model_id INT,
recipe_id INT,
distinction_id INT,
CONSTRAINT pk_models_recipes PRIMARY KEY (model_id, recipe_id, distinction_id));
INSERT INTO models_recipes (model_id, recipe_id, distinction_id) VALUES (1, 2, 1), (1, 3, 2);
CREATE TABLE families (
family_id INT AUTO_INCREMENT,
name VARCHAR(45),
CONSTRAINT pk_families PRIMARY KEY (family_id));
INSERT INTO families (name) VALUES ("Family_1");
CREATE TABLE families_recipes (
family_id INT,
recipe_id INT,
distinction_id INT,
CONSTRAINT pk_families_recipes PRIMARY KEY (family_id, recipe_id, distinction_id));
INSERT INTO families_recipes (family_id, recipe_id, distinction_id) VALUES (1, 3, 1), (1, 2, 2);
CREATE TABLE recipes (
recipe_id INT AUTO_INCREMENT,
name VARCHAR(45),
CONSTRAINT pk_recipes PRIMARY KEY (recipe_id));
INSERT INTO recipes (name) VALUES ("recipe1"), ("recipe2"), ("recipe3");
My view needs to report the recipe name in these different conditions:
IF 'objects.mode' is 0 -> the name of 'objects.recipe_id'
IF 'objects.mode' is 1
IF 'models.family_id > 0' -> the name of 'families_recipes.recipe_id' WHERE distinction_id = foo
ELSE -> the name of 'models_recipes.recipe_id' WHERE distinction_id = foo
I have written this query:
SELECT o.object_id, o.mode, o.model_id,
CASE
WHEN o.mode = 1 THEN
CASE
WHEN m.family_id > 0 THEN rf.name
ELSE rm.name
END
WHEN o.mode = 0 THEN ro.name
END AS 'recipe_name'
FROM objects AS o
LEFT JOIN models AS m
ON o.model_id = m.model_id
LEFT JOIN (SELECT * FROM models_recipes WHERE distinction_id = 1) AS mr
ON m.model_id = mr.model_id
LEFT JOIN recipes AS rm
ON mr.recipe_id = rm.recipe_id
LEFT JOIN (SELECT * FROM families_recipes WHERE distinction_id = 1) AS fr
ON m.family_id = fr.family_id
LEFT JOIN recipes AS rf
ON fr.recipe_id = rf.recipe_id
LEFT JOIN recipes AS ro
ON o.recipe_id = ro.recipe_id;
and the result is right
object_id | mode | model_id | recipe_name
-----------------------------------------
1 | 0 | 1 | recipe1
2 | 1 | 1 | recipe2
3 | 1 | 2 | recipe3
But I'm looking for a better solution that avoids joining the wanted table (recipes) once for every condition.
Thanks

You can join recipes only once if you use conditional aggregation:
select o.object_id, o.mode, o.model_id,
case o.mode
when 0 then max(case when r.recipe_id = o.recipe_id then r.name end)
when 1 then case
when m.family_id > 0 then max(case when r.recipe_id = fr.recipe_id then r.name end)
else max(case when r.recipe_id = mr.recipe_id then r.name end)
end
end recipe_name
from objects o
left join models m on m.model_id = o.model_id
left join families f on f.family_id = m.family_id
left join families_recipes fr on fr.family_id = f.family_id and fr.distinction_id = 1
left join models_recipes mr on mr.model_id = m.model_id and mr.distinction_id = 1
left join recipes r on r.recipe_id in (o.recipe_id, fr.recipe_id, mr.recipe_id)
group by o.object_id, o.mode, o.model_id
See the demo.
Results:
object_id | mode | model_id | recipe_name
--------: | ---: | -------: | :----------
1 | 0 | 1 | recipe1
2 | 1 | 1 | recipe2
3 | 1 | 2 | recipe3
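As a possible alternative to the aggregation above, here is a minimal sketch that also joins recipes only once, by picking the effective recipe_id with a CASE expression in the join condition. It assumes the same schema and the fixed distinction_id = 1 used in the question. It is not taken from either post, and whether it performs better would need to be checked with EXPLAIN on real data.

```sql
-- Sketch: compute the recipe_id that applies to each object, then join recipes once.
SELECT o.object_id, o.mode, o.model_id, r.name AS recipe_name
FROM objects AS o
LEFT JOIN models AS m
       ON m.model_id = o.model_id
LEFT JOIN families_recipes AS fr
       ON fr.family_id = m.family_id AND fr.distinction_id = 1
LEFT JOIN models_recipes AS mr
       ON mr.model_id = m.model_id AND mr.distinction_id = 1
LEFT JOIN recipes AS r
       ON r.recipe_id = CASE
            WHEN o.mode = 0 THEN o.recipe_id                       -- recipe stored on the object
            WHEN o.mode = 1 AND m.family_id > 0 THEN fr.recipe_id  -- recipe defined for the family
            WHEN o.mode = 1 THEN mr.recipe_id                      -- recipe defined for the model
          END;
```

On the sample data this returns the same three rows as the original query. Like the original, it yields more than one row per object if a model or family has several recipes for the same distinction_id.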

Related

How to select differences using three tables?

I need to run a script to fix some rows in my table company_menu.
However, I can't build the query to get these rows.
I built the schema at this link: http://sqlfiddle.com/#!9/5ab86b
Below I show the expected result.
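companies
id | name
--------------
1 | company 1
2 | company 2
3 | company 3
menu_items
id | name
-------------
1 | home
2 | charts
3 | users
4 | projects
company_menu
id | company_id | menu_item_id
------------------------------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 1 | 4
5 | 2 | 1
6 | 2 | 3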
This is the result that I expect:
id | company_id | menu_item_id
------------------------------
1 | 2 | 2
2 | 2 | 4
3 | 3 | 1
4 | 3 | 2
5 | 3 | 3
6 | 3 | 4
CREATE TABLE companies(
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50)
);
CREATE TABLE menu_items(
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50)
);
CREATE TABLE company_menu(
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
company_id INT,
menu_item_id INT,
FOREIGN KEY(company_id) REFERENCES companies(id),
FOREIGN KEY(menu_item_id) REFERENCES menu_items(id)
);
INSERT INTO companies (name) VALUES ("Company 1"),("Company 2"),("Company 3");
INSERT INTO menu_items (name) VALUES ("home"),("charts"),("users"),("projects");
INSERT INTO company_menu (company_id, menu_item_id) VALUES (1, 1),(1, 2),(1,3),(1,4);
INSERT INTO company_menu (company_id, menu_item_id) VALUES (2, 1),(2,3);
I can think of two ways; I don't know which is more efficient. Both start with a full companies-menu_items join to get all possible combinations, and then cut out the existing ones:
WHERE NOT EXISTS:
select c.id company_id, m.id menu_item_id
from companies c
join menu_items m
where not exists (
select * from company_menu where company_id = c.id and menu_item_id = m.id
);
LEFT JOIN + IS NULL:
select c.id company_id, m.id menu_item_id
from companies c
join menu_items m
left join company_menu cm on cm.company_id = c.id and cm.menu_item_id = m.id
where cm.id is null;
Both are sortable on any company or menu_item column.
http://sqlfiddle.com/#!9/5ab86b/11
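A minimal sketch of the first variant with the cross join written out explicitly and an ORDER BY added, so that the rows come back in the order shown in the question. The explicit CROSS JOIN keyword and the ordering are additions for illustration, not part of the original answer.

```sql
-- Sketch: every company/menu_item pair that has no row in company_menu yet.
SELECT c.id AS company_id, m.id AS menu_item_id
FROM companies AS c
CROSS JOIN menu_items AS m            -- all possible combinations
WHERE NOT EXISTS (
    SELECT 1
    FROM company_menu AS cm
    WHERE cm.company_id = c.id
      AND cm.menu_item_id = m.id      -- drop the combinations that already exist
)
ORDER BY c.id, m.id;
```

With the sample data this yields the pairs (2, 2), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4), which match the company_id and menu_item_id columns of the expected result.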

MySQL: Group count of a specific column per user?

I want to count a column per user, using data from 3 tables.
TABLE 1 (users) :
CREATE TABLE `datastore`.`users` ( `uid` INT NOT NULL AUTO_INCREMENT , `name` VARCHAR(30) NOT NULL DEFAULT 'john' , `class` VARCHAR(20) NOT NULL DEFAULT 'NEW' , PRIMARY KEY (`uid`)) ENGINE = InnoDB;
INSERT INTO `users` (`uid`, `name`, `class`) VALUES (NULL, 'john', 'NEW'), (NULL, 'mark', 'OLD');
SAMPLE :
uid name class
1 john NEW
2 mark OLD
TABLE 2 (data) :
CREATE TABLE `datastore`.`data` ( `id` INT NOT NULL AUTO_INCREMENT , `source` VARCHAR(30) NULL DEFAULT NULL , `destination` VARCHAR(30) NULL DEFAULT NULL , PRIMARY KEY (`id`)) ENGINE = InnoDB;
INSERT INTO `data` (`id`, `source`, `destination`) VALUES (NULL, 'NETWORK', 'SERVER_1'), (NULL, 'STATION', 'SERVER_2'), (NULL, 'DATASTORE', 'SERVER_1');
SAMPLE :
id source destination
1 NETWORK SERVER_1
2 STATION SERVER_2
3 DATASTORE SERVER_1
TABLE 3 (access):
CREATE TABLE `datastore`.`access` ( `id` INT(11) NOT NULL AUTO_INCREMENT , `uid` INT(11) NULL DEFAULT NULL , `source` VARCHAR(30) NULL DEFAULT NULL , PRIMARY KEY (`id`)) ENGINE = InnoDB;
INSERT INTO `access` (`id`, `uid`, `source`) VALUES (NULL, '1', 'NETWORK'), (NULL, '2', 'STATION'), (NULL, '1', 'STATION'), (NULL, '1', 'STATION');
SAMPLE :
id uid source
1 1 NETWORK
2 2 STATION
3 1 STATION
4 1 STATION
What I have tried so far:
SELECT access.uid, data.destination, COUNT(*) as count FROM data, access WHERE access.source = data.source GROUP BY destination, uid
Result :
uid destination count
1 SERVER_1 1
1 SERVER_2 2
2 SERVER_2 1
I want to link it with the user name as well.
Desired Result :
uid name destination count
1 john SERVER_1 1
1 john SERVER_2 2
2 mark SERVER_2 1
It seems you also need a join to users:
SELECT access.uid
, users.name
, data.destination
, COUNT(*) as count
FROM data
INNER JOIN access ON access.source = data.source
INNER JOIN users ON users.uid = access.uid
GROUP BY data.destination, access.uid, users.name
As a suggestion, you should not use the (old) implicit join syntax based on WHERE, but the explicit JOIN syntax.
Use aggregation:
select
a.uid,
u.name,
d.destination,
count(*)
from
access a
inner join users u on u.uid = a.uid
inner join data d on d.source = a.source
group by
a.uid,
u.name,
d.destination
All you need to get the user's name is to join your query to the table users:
SELECT u.uid, u.name, t.destination, t.count
FROM users u INNER JOIN (
SELECT a.uid, d.destination, COUNT(*) AS count
FROM data d
INNER JOIN access a ON a.source = d.source
GROUP BY d.destination, a.uid
) t ON u.uid = t.uid
ORDER BY u.uid, t.destination
See the demo.
Results:
| uid | name | destination | count |
| --- | ---- | ----------- | ----- |
| 1 | john | SERVER_1 | 1 |
| 1 | john | SERVER_2 | 2 |
| 2 | mark | SERVER_2 | 1 |

MySQL - Join if no duplicate

I want to join two tables, with the condition that I only join rows that have exactly one matching row. E.g.:
books:
id | name | price
1 | book1 | 19
2 | book2 | 19
3 | book3 | 30
price_offer:
id | offer | price
1 | offer1 | 19
2 | offer2 | 30
So now if I run a select query on these tables:
SELECT * FROM price_offer
JOIN books ON price_offer.price = books.price
I only want to join the book with id 3, as it is the only book that matches its price_offer row.
You could join books to a derived table on books that keeps only the prices matched by a single book:
select po.*, b1.*
from price_offer po
join books b1 on po.price = b1.price
join (
select price,max(id) id
from books
group by price
having count(*) = 1
) b2 on b1.id = b2.id
Demo
Try the following query:
Sample data:
create table books(id int, name varchar(10), price int);
insert into books values
(1, 'book1', 19),
(2, 'book2', 19),
(3, 'book3', 30);
create table price_offer(id int, offer varchar(10), price int);
insert into price_offer values
(1, 'offer1', 19),
(2, 'offer2', 30);
Query:
select max(b.id)
from price_offer p
left join books b on b.price = p.price
where p.id is not null
group by b.price
having count(*) = 1;
If you want to avoid nested queries that rely on self-joins, you can use the window functions of MySQL 8.0.11, which are meant for exactly this kind of case.
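The answer above does not include a query, so the following is only a sketch of what a window-function version might look like, assuming MySQL 8.0 and the books and price_offer tables from the question. The column alias same_price_count is made up for illustration. A derived table is still needed, because the result of a window function cannot be filtered in the WHERE clause of the same query level.

```sql
-- Sketch: count the books sharing each price, then keep only unique matches.
SELECT po.id AS offer_id, po.offer, b.id AS book_id, b.name, b.price
FROM price_offer AS po
JOIN (
    SELECT id, name, price,
           COUNT(*) OVER (PARTITION BY price) AS same_price_count  -- hypothetical alias
    FROM books
) AS b ON b.price = po.price
WHERE b.same_price_count = 1;
```

On the sample data, only offer2 joined with book3 remains, since two books share the price 19.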

How to get multiple columns on subquery or group by

I have two tables in MySQL; the first contains an ID and the name of some products. I have to get the cheapest combination of brand/market for each product. So, I've inserted some items into both tables:
UPDATE: Inserted new product (bed) with no 'Product_Brand_Market' to test LEFT JOIN.
UPDATE: Changed some product prices for better testing.
CREATE TABLE Product(
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL);
CREATE TABLE Product_Brand_Market(
product INT UNSIGNED,
market INT UNSIGNED, /*this will be a FOREIGN KEY*/
brand INT UNSIGNED, /*this will be a FOREIGN KEY*/
price DECIMAL(10,2) UNSIGNED NOT NULL,
PRIMARY KEY(product, market, brand),
CONSTRAINT FOREIGN KEY (product) REFERENCES Product(id));
INSERT INTO Product
(name) VALUES
('Chair'), /*will get id=1*/
('Table'), /*will get id=2*/
('Bed'); /*will get id=3*/
INSERT INTO Product_Brand_Market
(product, market, brand, price) VALUES
(1, 1, 1, 8.00), /*cheapest chair (brand=1, market=1)*/
(1, 1, 2, 8.50),
(1, 2, 1, 9.00),
(1, 2, 2, 9.50),
(2, 1, 1, 11.50),
(2, 1, 2, 11.00),
(2, 2, 1, 10.50),
(2, 2, 2, 10.00); /*cheapest table (brand=2, market=2)*/
/*no entries for bed, must return null*/
And tried the following code to get the desired values:
UPDATE: Changed INNER JOIN for LEFT JOIN.
SELECT p.id product, MIN(pbm.price) price, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON p.id = pbm.product
GROUP BY p.id;
The returned price is OK, but I'm getting the wrong keys:
| product | price | brand | market |
|---------|-------|-------|--------|
| 1 | 8 | 1 | 1 |
| 2 | 10 | 1 | 1 |
| 3 | null | null | null |
So the only way I could think to solve it is with subqueries, but I had to use two subqueries to get both brand and market:
SELECT
p.id product,
(
SELECT pbm.brand
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as brand,
(
SELECT pbm.market
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as market
FROM Product p;
It returns the desired table:
| product | brand | market |
|---------|-------|--------|
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | null | null |
But I want to know whether I really should use these two similar subqueries, or whether there is a better way to do this in MySQL. Any ideas?
Use a correlated subquery with LIMIT 1 in the WHERE clause:
SELECT product, brand, market
FROM Product_Brand_Market pbm
WHERE (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
)
This will return only one row per product, even if there are two or many of them with the same lowest price.
Demo: http://rextester.com/UIC44628
Update:
To get all products even if they have no entries in the Product_Brand_Market table, you will need a LEFT JOIN. Note that the condition should be moved to the ON clause.
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON pbm.product = p.id
AND (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
);
Demo: http://rextester.com/MGXN36725
The following query might make better use of your PK for the JOIN:
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON (pbm.product, pbm.market, pbm.brand) = (
SELECT pbm1.product, pbm1.market, pbm1.brand
FROM Product_Brand_Market pbm1
WHERE pbm1.product = p.id
ORDER BY pbm1.price ASC
LIMIT 1
);
An index on Product_Brand_Market(product, price) should also help to improve the performance of the subquery.
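For comparison, here is a sketch of the same "cheapest row per product" lookup using ROW_NUMBER(), which assumes MySQL 8.0 or later and is not part of the answer above. The alias rn and the tie-break on market and brand are assumptions chosen for illustration; the question does not say how ties on price should be resolved.

```sql
-- Sketch: rank the rows of each product by price and keep the first one.
SELECT p.id AS product, x.brand, x.market, x.price
FROM Product AS p
LEFT JOIN (
    SELECT product, brand, market, price,
           ROW_NUMBER() OVER (
               PARTITION BY product
               ORDER BY price, market, brand   -- assumed tie-break
           ) AS rn
    FROM Product_Brand_Market
) AS x
  ON x.product = p.id
 AND x.rn = 1;                                 -- cheapest row only
```

With the sample data this returns brand 1 / market 1 for the chair, brand 2 / market 2 for the table, and NULLs for the bed, because the LEFT JOIN keeps products without any entries.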

MySQL complex semi-join without group by

Summary
I am looking for a semi-join(ish) query that selects a number of customers and joins their most recent data from other tables.
At a later time, I wish to directly append conditions to the end of the query: WHERE c.id IN (1,2,3)
Problem
As far as I am aware, my requirement rules out GROUP BY:
SELECT * FROM customer c
LEFT JOIN customer_address ca ON ca.customer_id = c.id
GROUP BY c.id
# PROBLEM: Cannot append conditions *after* GROUP BY!
With most subquery-based attempts, my problem is the same.
As an additional challenge, I cannot strictly use a semi-join, because I allow at least two types of phone numbers (mobile and landline), which come from the same table. As such, from the phone table I may be joining multiple records per customer, i.e. this is no longer a semi-join. My current solution below illustrates this.
Questions
The EXPLAIN result at the bottom looks performant to me. Am I correct? Is each of the subqueries executed only once? Update: It appears that a DEPENDENT SUBQUERY is executed once for each row in the outer query. It would be great if we could avoid this.
Is there a better solution to what I am doing?
DDLs
DROP TABLE IF EXISTS customer;
CREATE TABLE `customer` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
);
DROP TABLE IF EXISTS customer_address;
CREATE TABLE `customer_address` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`customer_id` bigint(20) unsigned NOT NULL,
`street` varchar(85) DEFAULT NULL,
`house_number` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
);
DROP TABLE IF EXISTS customer_phone;
CREATE TABLE `customer_phone` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`customer_id` bigint(20) unsigned NOT NULL,
`phone` varchar(32) DEFAULT NULL,
`type` tinyint(3) unsigned NOT NULL COMMENT '1=mobile,2=landline',
PRIMARY KEY (`id`)
);
insert ignore customer values (1);
insert ignore customer_address values (1, 1, "OldStreet", 1),(2, 1, "NewStreet", 1);
insert ignore customer_phone values (1, 1, "12345-M", 1),(2, 1, "12345-L-Old", 2),(3, 1, "12345-L-New", 2);
SELECT * FROM customer;
+----+
| id |
+----+
| 1 |
+----+
SELECT * FROM customer_address;
+----+-------------+-----------+--------------+
| id | customer_id | street | house_number |
+----+-------------+-----------+--------------+
| 1 | 1 | OldStreet | 1 |
| 2 | 1 | NewStreet | 1 |
+----+-------------+-----------+--------------+
SELECT * FROM customer_phone;
+----+-------------+-------------+------+
| id | customer_id | phone | type |
+----+-------------+-------------+------+
| 1 | 1 | 12345-M | 1 |
| 2 | 1 | 12345-L-Old | 2 |
| 3 | 1 | 12345-L-New | 2 |
+----+-------------+-------------+------+
Solution so far
SELECT *
FROM customer c
# Join the most recent address
LEFT JOIN customer_address ca ON ca.id = (SELECT MAX(ca.id) FROM customer_address ca WHERE ca.customer_id = c.id)
# Join the most recent mobile phone number
LEFT JOIN customer_phone cphm ON cphm.id = (SELECT MAX(cphm.id) FROM customer_phone cphm WHERE cphm.customer_id = c.id AND cphm.`type` = 1)
# Join the most recent landline phone number
LEFT JOIN customer_phone cphl ON cphl.id = (SELECT MAX(cphl.id) FROM customer_phone cphl WHERE cphl.customer_id = c.id AND cphl.`type` = 2)
# Yay conditions appended at the end
WHERE c.id IN (1,2,3)
Fiddle
This fiddle gives the appropriate result set using the given solution. See my questions above.
http://sqlfiddle.com/#!9/98c57/3
I would avoid those dependent subqueries, instead try this:
SELECT
*
FROM customer c
LEFT JOIN (
SELECT
customer_id
, MAX(id) AS currid
FROM customer_phone
WHERE type = 1
GROUP BY
customer_id
) gm ON c.id = gm.customer_id
LEFT JOIN customer_phone mobis ON gm.currid = mobis.id
LEFT JOIN (
SELECT
customer_id
, MAX(id) AS currid
FROM customer_phone
WHERE type = 2
GROUP BY
customer_id
) gl ON c.id = gl.customer_id
LEFT JOIN customer_phone lands ON gl.currid = lands.id
WHERE c.id IN (1, 2, 3)
;
or, perhaps:
SELECT
*
FROM customer c
LEFT JOIN (
SELECT
customer_id
, MAX(case when type = 1 then id end) AS mobid
, MAX(case when type = 2 then id end) AS lndid
FROM customer_phone
GROUP BY
customer_id
) gp ON c.id = gp.customer_id
LEFT JOIN customer_phone mobis ON gp.mobid = mobis.id
LEFT JOIN customer_phone lands ON gp.lndid = lands.id
WHERE c.id IN (1, 2, 3)
;
see: http://sqlfiddle.com/#!9/ef983/1/
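Both queries above cover only the phone numbers. A sketch of how the most recent address from the question could be added in the same style follows. The aliases ga and addrid and the explicit column list are made up for illustration, and "most recent" is assumed to mean the highest id, as in the question's own query.

```sql
-- Sketch: latest address plus latest mobile/landline number per customer.
SELECT c.id,
       ca.street,
       ca.house_number,
       mobis.phone AS mobile_phone,
       lands.phone AS landline_phone
FROM customer c
LEFT JOIN (
    SELECT customer_id, MAX(id) AS addrid       -- highest id = most recent address
    FROM customer_address
    GROUP BY customer_id
) ga ON ga.customer_id = c.id
LEFT JOIN customer_address ca ON ca.id = ga.addrid
LEFT JOIN (
    SELECT customer_id,
           MAX(CASE WHEN type = 1 THEN id END) AS mobid,
           MAX(CASE WHEN type = 2 THEN id END) AS lndid
    FROM customer_phone
    GROUP BY customer_id
) gp ON gp.customer_id = c.id
LEFT JOIN customer_phone mobis ON mobis.id = gp.mobid
LEFT JOIN customer_phone lands ON lands.id = gp.lndid
WHERE c.id IN (1, 2, 3);
```

For the sample customer this gives NewStreet, 12345-M and 12345-L-New. The derived tables are not correlated with the outer query, and conditions can still be appended at the end.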