Match only if all elements of a column are in another table - mysql

I've made a little database in SQL that has 2 tables: Product (Name, Ingredient) and Available (Ingredient):
Product
Name | Ingredient
-----------------
1 | a
1 | b
2 | a
2 | c
Available
Ingredient
----------
a
c
I want the name of a product only if ALL its ingredients are inside the Available table.
For the previous example, the result should be: Product "2"
and not Product "1", because I don't have the ingredient "b" in the Available table.
Thanks for the help

You can use a left join (to find which products are missing ingredients) together with group by + having to filter out products that have at least one missing ingredient:
select p.Name
from Product p
left join Available a on a.Ingredient = p.Ingredient
group by p.Name
having sum(a.Ingredient is null) = 0
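A minimal sketch of the query above run against the sample data from the question. It uses Python's built-in sqlite3 module purely so it can be executed anywhere; SQLite standing in for MySQL is an assumption, as are the TEXT column types. The boolean expression `a.Ingredient IS NULL` evaluates to 0 or 1 in both engines, which is what makes the SUM work as a "missing ingredient" counter.

```python
import sqlite3

# In-memory database with the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Name TEXT, Ingredient TEXT);
    CREATE TABLE Available (Ingredient TEXT);
    INSERT INTO Product VALUES ('1', 'a'), ('1', 'b'), ('2', 'a'), ('2', 'c');
    INSERT INTO Available VALUES ('a'), ('c');
""")

# Unmatched ingredients come back as NULL from the LEFT JOIN;
# a product qualifies only if it has zero such rows.
query = """
    SELECT p.Name
    FROM Product p
    LEFT JOIN Available a ON a.Ingredient = p.Ingredient
    GROUP BY p.Name
    HAVING SUM(a.Ingredient IS NULL) = 0
"""
complete_products = [row[0] for row in conn.execute(query)]
print(complete_products)  # ['2']
```

Product "1" is dropped because ingredient "b" has no match in Available, so its SUM is 1.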

You can also try something like this:
WITH TEMP_PRODUCTS AS
(
SELECT NAME, COUNT(1) AS NUMBER_OF_INGREDIENTS
FROM PRODUCT
GROUP BY NAME
)
SELECT PRD.NAME, COUNT(1) AS NUMBER_OF_AVAILABLE_INGREDIENTS
FROM PRODUCT PRD
JOIN TEMP_PRODUCTS TMP ON PRD.NAME = TMP.NAME
WHERE EXISTS (SELECT 1
FROM AVAILABLE AVL
WHERE AVL.INGREDIENT = PRD.INGREDIENT)
GROUP BY PRD.NAME, TMP.NUMBER_OF_INGREDIENTS
HAVING COUNT(1) = TMP.NUMBER_OF_INGREDIENTS;

Related

MYSQL - Group Contact rows with records NOT IN

My case looks simple, but I can't get it right.
I have 4 tables: User, Macros, Categories, and another one that relates users to categories. One Macro has many Categories.
What I need is a query that, for a given Macro, returns the users and the Categories each user is NOT in.
Example: I have a macro named VEICULES, with categories CAR, TRUCK and Motorcycle. User José is in category CAR and user Julio is in categories CAR and TRUCK, so my query should return:
José | TRUCK,Motorcycle
Julio | Motorcycle
Tables:
prd_users
id | name | Email
---------------------------
1 | José | jose#email.com
2 | Júlio | julio#email.com
3 | André | andre#email.com
cat_macros
macro_id | macro_name
-----------------------
1 | Veicules |
cat_categories
category_id | category_name | macro_id
---------------------------------------
1 | Cars | 1
2 | Trucks | 1
3 | Motorcycles | 1
prd_tr_rabbit_catg
id | category_id | tasker_user_id
---------------------------------------
1 | 1 | 1
2 | 1 | 2
3 | 2 | 2
I'm stuck on just getting the categories the user is already in:
SELECT prd_users.id, prd_users.name,
prd_users.email,cat_macros.macro_name as macro,
GROUP_CONCAT(cat_categories.category_name SEPARATOR ', ') as in_categories
FROM prd_users
INNER JOIN prd_tr_rabbit_catg ON prd_tr_rabbit_catg.tasker_user_id = prd_users.id
INNER JOIN cat_categories ON cat_categories.category_id = prd_tr_rabbit_catg.category_id
INNER JOIN cat_macros ON cat_macros.macro_id = cat_categories.macro_id
WHERE cat_macros.macro_id = '45'
GROUP BY prd_users.id;
To solve this problem it's necessary to create a list of all users joined with all categories for the given macro category. This can be done with a CROSS JOIN:
SELECT *
FROM prd_users u
CROSS JOIN (SELECT m.macro_id, m.macro_name, c.category_name, c.category_id
FROM cat_macros m
JOIN cat_categories c ON c.macro_id = m.macro_id) c
This can then be LEFT JOINed to the prd_tr_rabbit_catg table and by selecting those rows where there is no matching entry in the prd_tr_rabbit_catg table, we can find the users who don't have an entry for the given category:
SELECT c.macro_name, u.id AS user_id, u.name, u.Email, GROUP_CONCAT(c.category_name) AS missing_cats
FROM prd_users u
CROSS JOIN (SELECT m.macro_id, m.macro_name, c.category_name, c.category_id
FROM cat_macros m
JOIN cat_categories c ON c.macro_id = m.macro_id) c
LEFT JOIN prd_tr_rabbit_catg x ON x.tasker_user_id = u.id AND x.category_id = c.category_id
WHERE x.id IS NULL
AND c.macro_id = 1
GROUP BY c.macro_name, u.id
For your sample data, this gives:
macro_name user_id name Email missing_cats
Veicules 1 José jose#email.com Motorcycles,Trucks
Veicules 2 Júlio julio#email.com Motorcycles
Veicules 3 André andre#email.com Cars,Motorcycles,Trucks
Update
To exclude users who don't have any of the categories, add a HAVING clause:
HAVING COUNT(*) < (SELECT COUNT(*) FROM cat_categories WHERE macro_id = 1)
Demo on SQLFiddle
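As a rough, runnable illustration of the CROSS JOIN + LEFT JOIN approach with the HAVING filter applied, here is a sketch using Python's sqlite3 module. SQLite in place of MySQL is an assumption, the Email column and the join to cat_macros are left out for brevity, and the macro id is passed as a parameter. GROUP_CONCAT does not guarantee an order, so the sketch sorts the names in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prd_users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cat_categories (category_id INTEGER PRIMARY KEY,
                                 category_name TEXT, macro_id INTEGER);
    CREATE TABLE prd_tr_rabbit_catg (id INTEGER PRIMARY KEY,
                                     category_id INTEGER, tasker_user_id INTEGER);
    INSERT INTO prd_users VALUES (1, 'José'), (2, 'Júlio'), (3, 'André');
    INSERT INTO cat_categories VALUES (1, 'Cars', 1), (2, 'Trucks', 1), (3, 'Motorcycles', 1);
    INSERT INTO prd_tr_rabbit_catg VALUES (1, 1, 1), (2, 1, 2), (3, 2, 2);
""")

# Every user is paired with every category of the macro; pairs without a
# row in prd_tr_rabbit_catg are the categories the user is missing.
query = """
    SELECT u.name, GROUP_CONCAT(c.category_name) AS missing_cats
    FROM prd_users u
    CROSS JOIN cat_categories c
    LEFT JOIN prd_tr_rabbit_catg x
           ON x.tasker_user_id = u.id AND x.category_id = c.category_id
    WHERE x.id IS NULL AND c.macro_id = ?
    GROUP BY u.id, u.name
    HAVING COUNT(*) < (SELECT COUNT(*) FROM cat_categories WHERE macro_id = ?)
"""
macro_id = 1
missing = {name: sorted(cats.split(","))
           for name, cats in conn.execute(query, (macro_id, macro_id))}
print(missing)  # {'José': ['Motorcycles', 'Trucks'], 'Júlio': ['Motorcycles']}
```

André is excluded by the HAVING clause because he is missing all three categories of the macro.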

Count frequency displaying also "0" values

I want to count and order the frequency of one main category (see SQLFiddle for table structures). But I want to display also "0" values, so if a categoryId isn't assigned by a product, this categoryId should have a frequency of "0".
My current SQL looks like this.
SELECT
category.categoryId,
category.name,
COUNT(*) AS frequency
FROM
Categories category
LEFT JOIN
Product entry ON entry.categoryId = category.categoryId
WHERE
category.parentId = 1
GROUP BY category.categoryId
ORDER BY COUNT(*) DESC
Result
| categoryId | name | frequency |
|------------|------------------|-----------|
| 2 | Sub Category 1-2 | 3 |
| 4 | Sub Category 1-4 | 1 |
| 3 | Sub Category 1-3 | 1 |
If I make a RIGHT JOIN the category, which hasn't been assigned, will not be displayed at all (but I need it in the result).
The result I need should look like this:
| categoryId | name | frequency |
|------------|------------------|-----------|
| 2 | Sub Category 1-2 | 3 |
| 4 | Sub Category 1-4 | 1 |
| 3 | Sub Category 1-3 | 0 |
Is there a way to display "0" frequency like in the result above?
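You need to count a column from the joined table, COUNT(e.categoryId), instead of COUNT(*). COUNT(*) also counts the single row the LEFT JOIN produces for a category without products, while COUNT(e.categoryId) skips it because that column is NULL there: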
SELECT
c.categoryId,
c.name,
COUNT(e.categoryId) AS frequency
FROM
Categories c
LEFT JOIN
Product e ON e.categoryId = c.categoryId
WHERE
c.parentId = 1
GROUP BY c.categoryId
ORDER BY frequency DESC
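The difference between the two counts can be seen side by side in the sketch below, run with Python's sqlite3 module. The table definitions and rows are assumptions (the real structures are only in the linked fiddle); they were chosen to reproduce the frequencies shown in the question, and the productId column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and data, chosen to match the question's output.
    CREATE TABLE Categories (categoryId INTEGER PRIMARY KEY, name TEXT, parentId INTEGER);
    CREATE TABLE Product (productId INTEGER PRIMARY KEY, categoryId INTEGER);
    INSERT INTO Categories VALUES
        (1, 'Main Category 1', NULL),
        (2, 'Sub Category 1-2', 1),
        (3, 'Sub Category 1-3', 1),
        (4, 'Sub Category 1-4', 1);
    INSERT INTO Product VALUES (1, 2), (2, 2), (3, 2), (4, 4);
""")

# all_rows counts the NULL-extended row for category 3 as 1;
# frequency counts only rows where a product actually matched.
query = """
    SELECT c.categoryId,
           COUNT(*)            AS all_rows,
           COUNT(e.categoryId) AS frequency
    FROM Categories c
    LEFT JOIN Product e ON e.categoryId = c.categoryId
    WHERE c.parentId = 1
    GROUP BY c.categoryId
    ORDER BY frequency DESC
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(2, 3, 3), (4, 1, 1), (3, 1, 0)]
```

Category 3 shows 1 under COUNT(*) and 0 under COUNT(e.categoryId), which is the behaviour the question asks for.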
I think this is the query you need:
SELECT
category.categoryId,
category.name,
COUNT(entry.categoryId) AS frequency
FROM
Categories category
LEFT JOIN
Product entry ON entry.categoryId = category.categoryId
WHERE
category.parentId = 1 or category.parentId is null
GROUP BY category.categoryId
ORDER BY frequency DESC

Many To Many join with additional where

I think I have a somewhat trivial question but I can't figure out how this works. I have the following Companies and Products tables with a simple Many-To-Many relationship.
How would I have to extend this query so that the result contains only, say, the companies that have products with id 1 AND 2?
I tried adding WHERE and HAVING clauses wherever I could imagine, but all I could get was all companies that have a product with id x (without the additional AND).
Companies Table
id | name
-----------------
1 | Company 1
2 | Company 2
3 | Company 3
Companies_Products Table
id | product_id | company_id
----------------------------
1 | 1 | 1
2 | 2 | 1
3 | 3 | 1
4 | 1 | 2
5 | 1 | 3
6 | 2 | 3
Products Table
id | name
-----------------
1 | Product A
2 | Product B
3 | Product C
Statement
SELECT companies.name,
companies.id AS company_id,
products.id AS product_id
FROM companies
LEFT JOIN company_products
ON companies.id = company_products.company_id
INNER JOIN products
ON company_products.product_id = products.id
If you want ALL companies with associated products 1 and 2, you can write this query:
SELECT c.name,
c.id AS company_id
FROM companies c
WHERE (SELECT COUNT(*)
FROM company_products cp
WHERE cp.company_id = c.id
AND cp.product_id in ('1', '2')
) = 2
Go to Sql Fiddle
If you also want information about the associated products in the main query, add a join to the query above.
Maybe you could use the following subquery in your query:
SELECT company_id, count(*) as no_products
FROM Companies_Products
WHERE product_id IN (1, 2)
GROUP BY company_id
HAVING count(*) = 2
(This assumes a company and a product are coupled only once.) It returns all the company_ids that have both product 1 and product 2.
There is always some discussion about subqueries and performance, but I don't think you will notice a difference here.
You could make this flexible by using an array.
pseudo code:
$parameter = array(1, 2);
...
WHERE product_id IN $parameter
GROUP BY company_id
HAVING count(*) = count($parameter)
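One way the array idea could look in practice is sketched below with Python's sqlite3 module. The helper name `companies_with_all` is hypothetical, SQLite standing in for MySQL is an assumption, and COUNT(DISTINCT product_id) is used instead of COUNT(*) so that a duplicated company/product pair would not inflate the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Companies_Products (id INTEGER PRIMARY KEY,
                                     product_id INTEGER, company_id INTEGER);
    INSERT INTO Companies_Products VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 1, 3), (6, 2, 3);
""")

def companies_with_all(conn, product_ids):
    """Return ids of companies linked to every product id in product_ids."""
    wanted = sorted(set(product_ids))
    # One "?" placeholder per wanted product id, so values are bound, not spliced.
    placeholders = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT company_id
        FROM Companies_Products
        WHERE product_id IN ({placeholders})
        GROUP BY company_id
        HAVING COUNT(DISTINCT product_id) = ?
        ORDER BY company_id
    """
    return [row[0] for row in conn.execute(query, (*wanted, len(wanted)))]

print(companies_with_all(conn, [1, 2]))     # [1, 3]
print(companies_with_all(conn, [1, 2, 3]))  # [1]
```

Only the placeholders are interpolated into the SQL text; the product ids themselves are passed as bound parameters.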
Please say so if you need more help.

MySQL Join table row based on lowest cell value

I have two tables in a MySQL database like this:
PRODUCT:
product_id | product_name
-----------+-------------
1 | shirt
2 | pants
3 | socks
PRODUCT_SUPPLIER: (id is primary key)
id | supplier_id | product_id | part_no | cost
----+---------------+--------------+-----------+--------
1 | 1 | 1 | s1p1 | 5.00
2 | 1 | 2 | s1p2 | 15.00
3 | 1 | 3 | s1p3 | 25.00
4 | 2 | 1 | s2p1 | 50.00
5 | 2 | 2 | s2p2 | 10.00
6 | 2 | 3 | s2p3 | 5.00
My goal is a query that joins the tables and outputs a single row for each product joined with all fields from the corresponding supplier row with the lowest cost like this:
product_id | product_name | supplier_id | part_no | cost
-----------+---------------+---------------+------------+---------
1 | shirt | 1 | s1p1 | 5.00
2 | pants | 2 | s2p2 | 10.00
3 | socks | 2 | s2p3 | 5.00
At present I do have the following query written which seems to work but I'd like to know from any of the more experienced SQL users if there is a cleaner, more efficient or otherwise better solution? Or if there is anything essentially wrong with the code I have?
SELECT p.product_id, p.product_name, s.supplier_id, s.part_no, s.cost
FROM product p
LEFT JOIN product_supplier s ON
(s.id = (SELECT s2.id
FROM product_supplier s2
WHERE s2.product_id = p.product_id
ORDER BY s2.cost LIMIT 1));
I would run:
select p.product_id, p.product_name, s.supplier_id, s.part_no, s.cost
from product p
join product_supplier s
on p.product_id = s.product_id
join (select product_id, min(cost) as min_cost
from product_supplier
group by product_id) v
on s.product_id = v.product_id
and s.cost = v.min_cost
I don't see the point of an outer join. Is every product in the product_supplier table? If not, the outer join makes sense (change the join to the inline view aliased as v above to a left join if that is the case).
The above may run a little faster than your query because the subquery is not run for each row. Your current subquery is correlated: it is evaluated again for each row of product.
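To check the derived-table version against the question's data, here is a small sketch using Python's sqlite3 module (SQLite instead of MySQL is an assumption, as are the column types). An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE product_supplier (id INTEGER PRIMARY KEY, supplier_id INTEGER,
                                   product_id INTEGER, part_no TEXT, cost REAL);
    INSERT INTO product VALUES (1, 'shirt'), (2, 'pants'), (3, 'socks');
    INSERT INTO product_supplier VALUES
        (1, 1, 1, 's1p1', 5.00), (2, 1, 2, 's1p2', 15.00), (3, 1, 3, 's1p3', 25.00),
        (4, 2, 1, 's2p1', 50.00), (5, 2, 2, 's2p2', 10.00), (6, 2, 3, 's2p3', 5.00);
""")

# v holds the lowest cost per product and is computed once;
# joining back on (product_id, cost) recovers the full supplier row.
query = """
    SELECT p.product_id, p.product_name, s.supplier_id, s.part_no, s.cost
    FROM product p
    JOIN product_supplier s ON p.product_id = s.product_id
    JOIN (SELECT product_id, MIN(cost) AS min_cost
          FROM product_supplier
          GROUP BY product_id) v
      ON s.product_id = v.product_id AND s.cost = v.min_cost
    ORDER BY p.product_id
"""
cheapest = conn.execute(query).fetchall()
for row in cheapest:
    print(row)
```

With this data there are no ties, so exactly one row per product comes back.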
If you want to eliminate ties and don't care about doing so arbitrarily you can add a random number to the end of the results, put the query into an inline view, and then select the lowest/highest/etc. random number for each group. Here is an example:
select product_id, product_name, supplier_id, part_no, cost, min(rnd)
from (select p.product_id,
p.product_name,
s.supplier_id,
s.part_no,
s.cost,
rand() as rnd
from product p
join product_supplier s
on p.product_id = s.product_id
join (select product_id, min(cost) as min_cost
from product_supplier
group by product_id) v
on s.product_id = v.product_id
and s.cost = v.min_cost) x
group by product_id, product_name, supplier_id, part_no, cost
If for some reason you don't want the random # to come back in output, you can put the whole query above into an inline view, and select all columns but the random # from it.
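If an arbitrary but repeatable winner is acceptable, a deterministic tie-break can stand in for the random number. The sketch below is not the random-number approach described above: it swaps in "lowest id wins" by ordering on cost and then id inside a correlated subquery, much like the query in the question. It runs on Python's sqlite3 module (an assumption), and the tied rows are hypothetical data added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_supplier (id INTEGER PRIMARY KEY, supplier_id INTEGER,
                                   product_id INTEGER, part_no TEXT, cost REAL);
    -- Hypothetical data: suppliers 1 and 2 tie on the lowest cost for product 1.
    INSERT INTO product_supplier VALUES
        (1, 1, 1, 's1p1', 5.00), (2, 2, 1, 's2p1', 5.00), (3, 3, 1, 's3p1', 9.00);
""")

# For each product keep the single row that sorts first by (cost, id),
# so equal costs are resolved by the smaller primary key.
query = """
    SELECT s.product_id, s.supplier_id, s.part_no, s.cost
    FROM product_supplier s
    WHERE s.id = (SELECT s2.id
                  FROM product_supplier s2
                  WHERE s2.product_id = s.product_id
                  ORDER BY s2.cost, s2.id
                  LIMIT 1)
"""
winners = conn.execute(query).fetchall()
print(winners)  # [(1, 1, 's1p1', 5.0)]
```

The trade-off is the one already mentioned: the subquery is evaluated per row, so this favours predictability over speed.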

Select values from different rows in a mysql join

I have two tables, products and categories, and a join table products_categories.
Categories are nested (via categories.parent_id). So, for any given product, it can belong to many, potentially nested categories.
The categories are structured like this:
department (depth = 0)
category (depth = 1)
class (depth = 2)
Each product will belong to one "department", one "category", and one "class".
So, a product like "Rad Widget", could belong to the "Electronics" department, the "Miscellaneous" category, and the "Widgets" class. My schema would look like this:
# products
id | name
---------------
1 | Rad Widget
# categories
id | parent_id | depth | name
--------------------------------------
1 | null | 0 | Electronics
2 | 1 | 1 | Miscellaneous
3 | 2 | 2 | Widgets
# products_categories
product_id | category_id
------------------------
1 | 1
1 | 2
1 | 3
I'd like to run a query that lists all of a product's departments into a single row, like this:
product.id | product.name | department | category | class
-----------------------------------------------------------------
1 | Rad Widget | Electronics | Miscellaneous | Widgets
I can't think of a way to do this, so I'm considering denormalizing my data, but I want to make certain I'm not missing something first.
Since each category (and by the way, you might want to rename either the table or the level so that "category" doesn't mean two different things) has a singular known parent, but an indeterminate number of unknown children, you need to "walk up" from the most specific (at depth = 2) to the most general category, performing a self-join on the category table for each additional value you want to insert.
If you're impatient, skip to the SQL Fiddle link at the bottom of the post. If you'd rather be walked through it, continue reading - it's really not that different from any other case where you have a surrogate ID that you want to replace with data from the corresponding table.
You could start by looking at all the information:
SELECT * FROM products AS P
JOIN
products_categories AS PC ON P.id = PC.product_id
JOIN
categories AS C ON PC.category_id = C.id
WHERE P.id = 1 AND C.depth = 2;
+----+------------+------------+-------------+----+-----------+-------+---------+
| id | name | product_id | category_id | id | parent_id | depth | name |
+----+------------+------------+-------------+----+-----------+-------+---------+
| 1 | Rad Widget | 1 | 3 | 3 | 2 | 2 | Widgets |
+----+------------+------------+-------------+----+-----------+-------+---------+
First thing you have to do is recognize which information is useful and which is not. You don't want to be SELECT *-ing all day here. You have the first two columns you want, and the last column (recognize this as your "class"); you need parent_id to find the next column you want, and let's hold onto depth just for illustration. Forget the rest, they're clutter.
So replace that * with specific column names, alias "class", and go after the data represented by parent_id. This information is stored in the category table - you might be thinking, but I already joined that table! Don't care; do it again, only give it a new alias. Remember that your ON condition is a bit different - the products_categories has done its job already, now you want the row that matches C.parent_id - and that you only need certain columns to find the next parent:
SELECT
P.id,
P.name,
C1.parent_id,
C1.depth,
C1.name,
C.name AS 'class'
FROM
products AS P
JOIN
products_categories AS PC ON P.id = PC.product_id
JOIN
categories AS C ON PC.category_id = C.id
JOIN
categories AS C1 ON C.parent_id = C1.id
WHERE
P.id = 1
AND C.depth = 2;
+----+------------+-----------+-------+---------------+---------+
| id | name | parent_id | depth | name | class |
+----+------------+-----------+-------+---------------+---------+
| 1 | Rad Widget | 1 | 1 | Miscellaneous | Widgets |
+----+------------+-----------+-------+---------------+---------+
Repeat the process one more time, aliasing the column you just added and using the new C1.parent_id in your next join condition:
SELECT
P.id,
P.name,
C2.parent_id,
C2.depth,
C2.name,
C1.name AS 'category',
C.name AS 'class'
FROM
products AS P
JOIN
products_categories AS PC ON P.id = PC.product_id
JOIN
categories AS C ON PC.category_id = C.id
JOIN
categories AS C1 ON C.parent_id = C1.id
JOIN
categories AS C2 ON C1.parent_id = C2.id
WHERE
P.id = 1
AND C.depth = 2;
+----+------------+-----------+-------+-------------+---------------+---------+
| id | name | parent_id | depth | name | category | class |
+----+------------+-----------+-------+-------------+---------------+---------+
| 1 | Rad Widget | NULL | 0 | Electronics | Miscellaneous | Widgets |
+----+------------+-----------+-------+-------------+---------------+---------+
Now we're clearly done; we can't join another copy on C2.parent_id = NULL and we also see that depth = 0, so all that's left to do is get rid of the columns we don't want to display and double check our aliases. Here it is in action on SQL Fiddle.
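For reference, the cleaned-up query described in the last step might look like the sketch below, run on the question's sample data with Python's sqlite3 module (SQLite in place of MySQL is an assumption). The helper columns parent_id and depth are dropped, and C2.name is aliased as department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER,
                             depth INTEGER, name TEXT);
    CREATE TABLE products_categories (product_id INTEGER, category_id INTEGER);
    INSERT INTO products VALUES (1, 'Rad Widget');
    INSERT INTO categories VALUES
        (1, NULL, 0, 'Electronics'), (2, 1, 1, 'Miscellaneous'), (3, 2, 2, 'Widgets');
    INSERT INTO products_categories VALUES (1, 1), (1, 2), (1, 3);
""")

# Start at the most specific level (depth = 2) and walk up twice:
# C is the class, C1 its parent (category), C2 the grandparent (department).
query = """
    SELECT P.id, P.name,
           C2.name AS department,
           C1.name AS category,
           C.name  AS class
    FROM products AS P
    JOIN products_categories AS PC ON P.id = PC.product_id
    JOIN categories AS C  ON PC.category_id = C.id
    JOIN categories AS C1 ON C.parent_id = C1.id
    JOIN categories AS C2 ON C1.parent_id = C2.id
    WHERE P.id = 1 AND C.depth = 2
"""
row = conn.execute(query).fetchone()
print(row)  # (1, 'Rad Widget', 'Electronics', 'Miscellaneous', 'Widgets')
```

Filtering on C.depth = 2 is what keeps the result to one row even though the product is linked to all three levels in products_categories.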
If you want a list of all the categories, you can simply do a
Select Distinct p.category_id, c.name
From products_categories p Join
categories c On p.category_id = c.id
Where p.product_id = 1
The problem is you are putting Classes and Departments into your Category table. Technically you'd be correctly normalizing your data by moving each of these to their own tables. I know the overhead of creating more tables is a pain but it'll simplify your queries (saving processing power and potentially bandwidth).