MySQL SELECT query with a many to many relationship - mysql

I'm having trouble making a SELECT/WHERE query using a many to many relationship type. A user inputs ingredients, and I want to find which recipes use all the ingredients provided among the other ingredients (if any). (Think: use up the last ingredients I have in my fridge)
My DB is currently designed like this:
recipes_ingredients looks like this
For example, if I give id_ingredient IN (22, 23), I want recipe #16497 only, not #16631 (since it only has 22 and not 23).
I've come up with something that does the opposite of what I described
SELECT DISTINCT recipes.*
FROM recipes_ingredients
JOIN recipes ON recipes_ingredients.id_recipe = recipes.id
WHERE id_ingredient IN ( 96, 13196 )

If you want recipes that have both of these ingredients (not just one of them), you can use aggregation with a filter:
SELECT r.*
FROM recipes_ingredients i
JOIN recipes r ON i.id_recipe = r.id
WHERE i.id_ingredient IN ( 96, 13196 )
GROUP BY r.id
HAVING COUNT(DISTINCT i.id_ingredient ) = 2
OR
SELECT r.*
FROM recipes_ingredients i
JOIN recipes r ON i.id_recipe = r.id
GROUP BY r.id
HAVING SUM(i.id_ingredient = 96)
AND SUM(i.id_ingredient = 13196)

Assuming that you need recipes that contain all the ingredients given as input, you can use one JOIN per ingredient:
SELECT recipes.* FROM recipes
JOIN recipes_ingredients r1 ON recipes.id = r1.id_recipe AND r1.id_ingredient = 96
JOIN recipes_ingredients r2 ON recipes.id = r2.id_recipe AND r2.id_ingredient = 13196
Unfortunately, MySQL had no INTERSECT operator before version 8.0.31, which would make this simpler.

SELECT count(*) matches, id_recipe FROM `recipes_ingredients`
WHERE `id_ingredient` in ('23',...)
Group By `id_recipe`
HAVING matches = (
SELECT count(*) FROM ingredients where id in ('23',...)
);
This provides the count of matching ingredients per recipe, then compares counts to the exact number of parameters passed in. Or, since you are using phpmyadmin (and as such: PHP), you can pass in a count of the parameters (using PHP's count() if they start in an array, for example), and skip the subquery.
You can then join this list outwards to get any further information.
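The count-matching idea from the answers above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are hypothetical, modelled on the question (recipe 16497 has ingredients 22 and 23, recipe 16631 has only 22). The helper name recipes_with_all is made up for illustration.

```python
import sqlite3

# Hypothetical sample data modelled on the question:
# recipe 16497 uses ingredients 22 and 23 (plus one more), 16631 uses 22 only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE recipes_ingredients (id_recipe INTEGER, id_ingredient INTEGER);
    INSERT INTO recipes VALUES (16497, 'recipe A'), (16631, 'recipe B');
    INSERT INTO recipes_ingredients VALUES
        (16497, 22), (16497, 23), (16497, 40), (16631, 22);
""")

def recipes_with_all(conn, ingredient_ids):
    """Return ids of recipes that contain every ingredient in ingredient_ids."""
    wanted = sorted(set(ingredient_ids))  # duplicates would break the count check
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT r.id
        FROM recipes_ingredients i
        JOIN recipes r ON i.id_recipe = r.id
        WHERE i.id_ingredient IN ({placeholders})
        GROUP BY r.id
        HAVING COUNT(DISTINCT i.id_ingredient) = ?
        ORDER BY r.id
    """
    # The last parameter is the number of distinct ingredients requested.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

matches = recipes_with_all(conn, [22, 23])
print(matches)
```

Passing the parameter count from the application side corresponds to the "skip the subquery" variant mentioned above; only the placeholders are built dynamically, while the values themselves stay bound parameters.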

Related

How to select resource with where condition enforcing having two relations in joined table

For example, let's say I have two tables, resource and item, where one resource can have many items and an item can be assigned to many resources.
Now I need to select a resource which has two specific items. How can I do it in the simplest way possible?
SELECT
r.name,
GROUP_CONCAT(DISTINCT CONCAT(i.name)) AS itemNames
FROM
resources r
LEFT JOIN resources_items ri ON ri.resourceId = r.id
LEFT JOIN items i ON i.id = ri.itemId
WHERE i.id= '1' AND i.id = '2'
GROUP BY r.id
Is this a good direction?
Later on I'd like to select many resources based on many items, for example all resources that have an id in a given array or have an item with id 1 or 2.
Another approach would be using IN() and COUNT()
SELECT
r.name,
GROUP_CONCAT(DISTINCT CONCAT(i.name)) AS itemNames
FROM
resources r
LEFT JOIN resources_items ri ON ri.resourceId = r.id
LEFT JOIN items i ON i.id = ri.itemId
WHERE i.id IN(1,2)
GROUP BY r.id
HAVING COUNT(DISTINCT i.id) = 2
For more elements, you just need to update your IN list, and the count needs to equal the length of your given array.
The following approach is MySQL-specific: it uses SUM() over a boolean expression to check that each filter is matched at least once, like
SELECT
r.name,
GROUP_CONCAT(DISTINCT CONCAT(i.name)) AS itemNames
FROM
resources r
LEFT JOIN resources_items ri ON ri.resourceId = r.id
LEFT JOIN items i ON i.id = ri.itemId
WHERE i.id IN(1,2)
GROUP BY r.id
HAVING SUM(i.id = 1) > 0
AND SUM(i.id = 2) > 0
Repeat the SUM(i.id = #itemInput) clause for each of your input items. Compared to the first approach, this one is more complex, like yours: the number of conditions (or joins) grows with the number of inputs, whereas in the first approach you only need to count your inputs and match that count in the HAVING clause.
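To compare the two HAVING styles side by side, here is a minimal sketch. It assumes Python's built-in sqlite3 module in place of MySQL (SQLite also evaluates a comparison to 0 or 1, so SUM over it behaves the same way here), and the three sample resources are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resources (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE resources_items (resourceId INTEGER, itemId INTEGER);
    -- Hypothetical data: r1 has items 1 and 2, r2 has item 1 only,
    -- r3 has items 1, 2 and 3.
    INSERT INTO resources VALUES (1, 'r1'), (2, 'r2'), (3, 'r3');
    INSERT INTO resources_items VALUES
        (1, 1), (1, 2), (2, 1), (3, 1), (3, 2), (3, 3);
""")

# Approach 1: filter with IN, then require as many distinct matches as inputs.
count_sql = """
    SELECT r.name FROM resources r
    JOIN resources_items ri ON ri.resourceId = r.id
    WHERE ri.itemId IN (1, 2)
    GROUP BY r.id
    HAVING COUNT(DISTINCT ri.itemId) = 2
    ORDER BY r.name
"""

# Approach 2: one conditional SUM per required item, each matched at least once.
sum_sql = """
    SELECT r.name FROM resources r
    JOIN resources_items ri ON ri.resourceId = r.id
    GROUP BY r.id
    HAVING SUM(ri.itemId = 1) > 0 AND SUM(ri.itemId = 2) > 0
    ORDER BY r.name
"""

by_count = [row[0] for row in conn.execute(count_sql)]
by_sum = [row[0] for row in conn.execute(sum_sql)]
print(by_count, by_sum)
```

Both queries should select the same resources; the first only needs its IN list and count updated when the input grows, while the second needs one extra SUM condition per item.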

SQL -- many-to-many query

I'm trying to select all recipes which contain a given number of ingredients. I'm able to find all recipes based on one given recipe_id
with:
SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
WHERE recipe_ingredient.recipe_id = ?
But I'm having trouble figuring out what the query should look like when I'm looking for recipes which contain more than one specific ingredient, for example Ingredient A and Ingredient B.
My tables look like this:
ingredient
-ingredient_id
-name
recipe_ingredient
-recipe_ingredient
-ingredient_id
-recipe_id
recipe
-recipe_id
-name
I would be very happy about any ideas on how to solve that problem!
Thanks.
Your query will look something like this
SELECT name, count(*)
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
GROUP BY name
HAVING count(*) > 1
If you are looking for specific ingredients, you could do a pre-query that UNIONs all the ingredients you are interested in, join it to the ingredients of each recipe, and make sure that all ingredients are accounted for. This is handled by the GROUP BY and by HAVING a count equal to the number of ingredients you are looking for.
I did this example based on the name of the ingredient. If you know the actual ingredient IDs (which would be more accurate, e.g. in a web-based application where the user picks the IDs), just change the join condition to the ingredient ID instead of the ingredient name.
SELECT
r.name,
r.recipe_id
FROM
( SELECT 'Milk' AS Ingredient
UNION SELECT 'Eggs'
UNION SELECT 'Butter' ) AS Wanted
JOIN ingredient i
ON Wanted.Ingredient = i.name
JOIN recipe_ingredient ri
ON ri.ingredient_id = i.ingredient_id
JOIN recipe r
ON ri.recipe_id = r.recipe_id
GROUP BY
r.recipe_id,
r.name
HAVING
COUNT(*) = 3
In this case, Milk, Butter, Eggs and a final count = 3.
In order to match all elements of an IN clause, you need to make sure you select only Recipes that have a count which matches the total number of Ingredients in your list:
SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
WHERE recipe_ingredient.ingredient_id IN (ID1, ID2, ID3) -- list of ingredient IDs
GROUP BY name
HAVING COUNT(*) = 3 -- number of ingredients you have chosen
Good luck finding which recipe will work with the ingredients you have available
Here is a functional example
Just use OR for your where.
Like this
$query = "SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
WHERE recipe_ingredient.recipe_id = ? OR recipe_ingredient.recipe_id = ?";
Use group by and having clause
SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
GROUP BY name
HAVING count(1) > 1

Incomprehensible query behaviour

I have multiple tables, related by multiple foreign keys as in the following example:
Recipes(id_recipe,name,calories,category) - id_recipe as PK.
Ingredients(id_ingredient,name,type) - id_ingredient as PK.
Contains(id_ingredient,id_recipe,quantity,unit) - (id_ingredient,id_recipe) as PK, and as Foreign Keys for Recipes(id_recipe) and Ingredients(id_ingredient).
You can see this relations represented in this image.
So basically Contains is a bridge between Recipes and Ingredients.
The query I am trying to write is supposed to return the names of the recipes that have an ingredient of type "bovine" but none of type "lactic".
My attempt:
SELECT DISTINCT Recipes.name
FROM Ingredients JOIN Contains USING(id_ingredient) JOIN Recipes USING (id_recipe)
WHERE Ingredients.type = "bovin"
AND Ingredients.type <> "lactic";
The problem is it still shows me recipes that have at least one lactic ingredient.
I would appreciate any help!
This is the general form of the kind of query you need:
SELECT *
FROM tableA
WHERE tableA.ID NOT IN (
SELECT table_ID
FROM ...
)
;
-- EXAMPLE BELOW --
The subquery gives the id values of all recipes that the "lactic" ingredient is used in, the outer query says "give me all the recipes not in that list".
SELECT DISTINCT Recipes.name
FROM Recipes
WHERE id_recipe NOT IN (
SELECT DISTINCT id_recipe
FROM `Ingredients` AS `i`
INNER JOIN `Contains` AS `c` USING (id_ingredient)
WHERE `i`.`type` = "lactic"
)
;
Alternatively, using your original query:
You could have changed the second join to a LEFT JOIN, changed its USING to an ON, included AND type = "lactic" there instead, and ended the query with HAVING Ingredients.type IS NULL (or WHERE, I just prefer HAVING for "final result" filtering). This would tell you which items could not be joined to the "lactic" ingredient.
A common solution of this type of question (checking conditions over a set of rows) utilizes aggregate + CASE.
SELECT R.Name
FROM Recipes R
INNER JOIN Contains C
on R.ID_Recipe = C.ID_Recipe
INNER JOIN Ingredients I
on C.ID_Ingredient = I.ID_Ingredient
GROUP BY R.name
having -- no 'lactic' ingredient
sum(case when type = 'lactic' then 1 else 0 end) = 0
and -- at least one 'bovin' ingredient
sum(case when type = 'bovin' then 1 else 0 end) > 0
It's easy to extend to any number of ingredients and any kind of question.
Hijacked the fiddle of xQbert
SELECT R.NAME
FROM CONTAINS C
INNER JOIN INGREDIENTS I
ON I.ID_INGREDIENT = C.ID_INGREDIENT AND I.TYPE = 'bovine' AND I.TYPE <> "lactic"
INNER JOIN RECIPES R
ON R.ID_RECIPE = C.ID_RECIPE
GROUP BY R.NAME
That should work, maybe you need to escape 'contains'. It could be recognized as a SQL function.
SQL Fiddle
In my example burgers and pasta have 'Bovin' and thus show up. So do cookies but cookies also have 'lactic' which is why they get excluded.
SELECT R.Name
FROM Recipes R
INNER JOIN Contains C
on R.ID_Recipe = C.ID_Recipe
INNER JOIN Ingredients I
on C.ID_Ingredient = I.ID_Ingredient
LEFT JOIN (SELECT R2.ID_Recipe
FROM Ingredients I2
INNER JOIN Contains C2
on C2.ID_Ingredient = I2.ID_Ingredient
INNER JOIN Recipes R2
on R2.ID_Recipe = C2.ID_Recipe
WHERE Type = 'lactic'
GROUP BY R2.ID_Recipe) T3
on T3.ID_Recipe = R.ID_Recipe
WHERE T3.ID_Recipe is null
and I.Type = 'Bovin'
GROUP BY R.name
There is likely a more elegant way of doing this. I really wanted to write this as a CTE and join it to itself, but there is no CTE in MySQL (before 8.0). There is likely a way to do this using EXISTS too. I'm not a big fan of IN clauses, as the performance generally suffers: EXISTS is fastest, joins are second fastest, IN is slowest (generally speaking).
The inline view (sub query) returns the ID_recipe of those you don't want to include.
The outer query returns the Name of the recipes with ingredients you want.
By joining these two together using an outer join, we return all recipes, with T3.ID_Recipe filled in only for those that have the undesired ingredient. We then limit the results to rows where that ID is NULL (undesired ingredient not found), so you get only the recipes that have the desired ingredient and none of the undesired ones.
You can use NOT EXISTS for this.
Try this:
SELECT DISTINCT Recipes.`name`
FROM Recipes JOIN Contains AS C1 USING (id_recipe) JOIN Ingredients USING(id_ingredient)
WHERE Ingredients.type = "bovin"
AND NOT EXISTS (
SELECT 1
FROM Contains AS C2 JOIN Ingredients USING(id_ingredient)
WHERE C1.id_recipe = C2.id_recipe
AND Ingredients.type = "lactic"
)
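The difference between the original attempt and the NOT EXISTS form can be checked on a few rows. This is a minimal sketch that assumes Python's built-in sqlite3 module rather than MySQL; the sample recipes are hypothetical, loosely following the fiddle described above (burgers and pasta contain a 'bovin' ingredient, cookies contain both 'bovin' and 'lactic').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Recipes (id_recipe INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Ingredients (id_ingredient INTEGER PRIMARY KEY, name TEXT, type TEXT);
    -- "Contains" is quoted so it cannot be mistaken for a keyword.
    CREATE TABLE "Contains" (id_ingredient INTEGER, id_recipe INTEGER,
                             PRIMARY KEY (id_ingredient, id_recipe));
    INSERT INTO Recipes VALUES
        (1, 'Burgers'), (2, 'Pasta'), (3, 'Cookies'), (4, 'Salad');
    INSERT INTO Ingredients VALUES
        (1, 'beef', 'bovin'), (2, 'milk', 'lactic'), (3, 'lettuce', 'vegetable');
    INSERT INTO "Contains" VALUES (1, 1), (1, 2), (3, 2), (1, 3), (2, 3), (3, 4);
""")

# Row-level filter, as in the question: it only tests each ingredient row on
# its own, so a recipe with a lactic ingredient elsewhere still gets through.
naive_sql = """
    SELECT DISTINCT R.name
    FROM Recipes R
    JOIN "Contains" C ON C.id_recipe = R.id_recipe
    JOIN Ingredients I ON I.id_ingredient = C.id_ingredient
    WHERE I.type = 'bovin' AND I.type <> 'lactic'
    ORDER BY R.name
"""

# Recipe-level exclusion: the subquery looks at all ingredients of the recipe.
not_exists_sql = """
    SELECT DISTINCT R.name
    FROM Recipes R
    JOIN "Contains" C1 ON C1.id_recipe = R.id_recipe
    JOIN Ingredients I1 ON I1.id_ingredient = C1.id_ingredient
    WHERE I1.type = 'bovin'
      AND NOT EXISTS (
          SELECT 1
          FROM "Contains" C2
          JOIN Ingredients I2 ON I2.id_ingredient = C2.id_ingredient
          WHERE C2.id_recipe = R.id_recipe AND I2.type = 'lactic')
    ORDER BY R.name
"""

naive = [row[0] for row in conn.execute(naive_sql)]
filtered = [row[0] for row in conn.execute(not_exists_sql)]
print(naive, filtered)
```

On this data the naive query still lists the cookies, while the NOT EXISTS version drops them because the exclusion is evaluated per recipe rather than per ingredient row.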

mysql query where not with Hierarchical tables

So basically I'm joining 3 tables together. The main table is recipe, then it goes to ingredients list then ingredient.
So I need to have a query which has only recipes which contain NO chicken. The problem I am having is that because recipes have many ingredients when I use where != that just removes the ingredients with that meat but leaves the others.....how can i account for the multiple ingredients.
select Recipe.name as "No chicken" from Recipe inner join IngredientList on Recipe.recipeId=IngredientList.recipeId inner join Ingredients on IngredientList.IngredientId=Ingredients.ingredientId where type!="chcicken" group by Recipe.name;
Your original statement has a GROUP BY with no aggregate function. That doesn't make sense. It should be an ORDER BY if you're trying to sort.
Try something like this:
SELECT `Recipe`.`name` AS "No chicken"
FROM `Recipe`
WHERE `Recipe`.`RecipeId` NOT IN (
SELECT DISTINCT `IngredientList`.`RecipeId` AS `RecipeID`
FROM `IngredientList`
INNER JOIN `Ingredients` ON `IngredientList`.`IngredientId` = `Ingredients`.`IngredientId`
WHERE `Ingredients`.`Type` = 'chicken'
)
ORDER BY `Recipe`.`name`
Depending on your schema, you may need to use SELECT DISTINCT in the main select statement if you're getting duplicate recipe names.
The above have some typos, but Amirshk has a logically correct answer.
However, I recommend one avoid the IN() and NOT IN() clauses in MySQL as they are very, very slow on a set of tables as big as a large recipe database would get. IN and NOT IN can be re-written as joins to cut the runtime to 1/100th the time in MySQL 5.0. Even with MySQL 5.5's great improvements, the equivalent JOIN query benchmarks 1/5th the time on large tables.
Here is the revised query:
SELECT
Recipe.name AS "No Chicken"
FROM Recipe LEFT JOIN
(
SELECT IngredientList.recipeId, Ingredients.ingredientId
FROM IngredientList JOIN Ingredients USING (IngredientId)
WHERE Ingredients.type = 'chicken'
) WithChicken
ON Recipe.recipeId = WithChicken.recipeId
WHERE WithChicken.recipeId IS NULL;
This is pretty obtuse, so here is simplified SQL that provides the key concept of the NOT IN(...) equivalent exclusion join:
SELECT whatever FROM x
WHERE x.id NOT IN (
SELECT id FROM y
);
becomes
SELECT whatever FROM x
LEFT JOIN y ON x.id = y.id
WHERE y.id IS NULL;
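The NOT IN to exclusion-join rewrite can be verified on a toy data set. The sketch below assumes Python's built-in sqlite3 module instead of MySQL and uses hypothetical rows in the Recipe, IngredientList and Ingredients tables from the question; it says nothing about the relative speed of the two forms, only that they select the same recipes here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Recipe (recipeId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Ingredients (ingredientId INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE IngredientList (recipeId INTEGER, ingredientId INTEGER);
    -- Hypothetical data: only 'Chicken soup' contains a chicken ingredient.
    INSERT INTO Recipe VALUES (1, 'Chicken soup'), (2, 'Salad'), (3, 'Toast');
    INSERT INTO Ingredients VALUES (1, 'chicken'), (2, 'vegetable'), (3, 'bread');
    INSERT INTO IngredientList VALUES (1, 1), (1, 2), (2, 2), (3, 3);
""")

# Form 1: exclude recipe ids returned by a subquery.
not_in_sql = """
    SELECT name FROM Recipe
    WHERE recipeId NOT IN (
        SELECT il.recipeId
        FROM IngredientList il
        JOIN Ingredients i ON i.ingredientId = il.ingredientId
        WHERE i.type = 'chicken')
    ORDER BY name
"""

# Form 2: LEFT JOIN the same subquery and keep only rows without a match.
anti_join_sql = """
    SELECT r.name
    FROM Recipe r
    LEFT JOIN (
        SELECT il.recipeId
        FROM IngredientList il
        JOIN Ingredients i ON i.ingredientId = il.ingredientId
        WHERE i.type = 'chicken') WithChicken
      ON r.recipeId = WithChicken.recipeId
    WHERE WithChicken.recipeId IS NULL
    ORDER BY r.name
"""

via_not_in = [row[0] for row in conn.execute(not_in_sql)]
via_anti_join = [row[0] for row in conn.execute(anti_join_sql)]
print(via_not_in, via_anti_join)
```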
Use an inner query to filter recipes with chicken, then select all the recipes without them.
As so:
select
Recipe.name as "No chicken"
from Recipe
inner join IngredientList on Recipe.recipeId=IngredientList.recipeId
inner join Ingredients on IngredientList.IngredientId=Ingredients.ingredientId
where Recipe.recipeId NOT IN (
select
Recipe.recipeId
from Recipe
inner join IngredientList on Recipe.recipeId=IngredientList.recipeId
inner join Ingredients on IngredientList.IngredientId=Ingredients.ingredientId
where Ingredients.type = "chicken" group by Recipe.recipeId)

Find rows without many-to-many children meeting a certain condition

Here's a generic version of what I'm trying to do:
The table recipes has fields id and name. The table ingredients has fields id, name, and sweetness, describing how sweet that ingredient is on a scale of 1-10. Recipes have many ingredients and ingredients are in many recipes, so the two are related in a recipes_ingredients table, with fields ingredient_id and recipe_id.
It's easy to find recipes that contain an ingredient with sweetness of 10.
SELECT DISTINCT recipes.* FROM recipes
INNER JOIN recipes_ingredients ri ON ri.recipe_id = recipes.id
INNER JOIN ingredients ON ingredients.id = ri.ingredient_id
WHERE ingredients.sweetness = 10
However, I'm having trouble with negating that query to find recipes with no ingredients with sweetness 10. My first thought was this:
SELECT DISTINCT recipes.* FROM recipes
INNER JOIN recipes_ingredients ri ON ri.recipe_id = recipes.id
INNER JOIN ingredients ON ingredients.id = ri.ingredient_id
WHERE ingredients.sweetness != 10
However, that finds recipes that contain any non-sweetness-10 ingredients.
My next attempt was the following, which seems to work:
SELECT * FROM recipes WHERE
(
SELECT count(*) FROM ingredients INNER JOIN recipes_ingredients ri ON
ri.ingredient_id = ingredients.id WHERE ingredients.sweetness = 10 AND
ri.recipe_id = recipes.id
) = 0
However, my general experience is that dependent subqueries run slowly compared to equivalent, well-crafted JOINs. I played around with joining, grouping, etc. but couldn't quite wrap my head around it, especially since, though it seems like LEFT JOIN and IS NULL were the proper tools, having two joins already made things nasty. Great SQL wizards, what query can I run to get the best results? Thanks!
Try this:
SELECT DISTINCT r.*
FROM recipes r LEFT JOIN
(SELECT ri.recipe_id
FROM recipes_ingredients ri
INNER JOIN ingredients ON ingredients.id = ri.ingredient_id
WHERE ingredients.sweetness = 10) i on i.recipe_id=r.id
WHERE i.recipe_id is null
Try:
select
r.*
from
recipes r
where
not exists (
select
1
from
recipes_ingredients ri
join ingredients i on i.id = ri.ingredient_id
where
ri.recipe_id = r.id
and i.sweetness = 10
)
It's still a correlated subquery, but exists and not exists have some optimizations that should make them perform better than your original query.
For a direct join solution, this should work:
select
r.*
from
recipes r
left join (
recipes_ingredients ri
join ingredients i on i.id = ri.ingredient_id and i.sweetness = 10
) on ri.recipe_id = r.id
where
ri.recipe_id is null
Depending on indexing, the not exists solution could be faster as not exists returns immediately upon figuring out if any rows satisfy the given conditions without looking at any more of the table than necessary. For example, if it finds a single row of sweetness 10, it stops looking at the table and returns false.
I played around with the answers given me here (which I've since upvoted), and, from their inspiration, have come up with a query that seems to do the job with surprisingly outstanding performance:
SELECT r.* FROM recipes r
LEFT JOIN recipes_ingredients ri ON ri.recipe_id = r.id
LEFT JOIN ingredients i ON i.id = ri.ingredient_id AND i.sweetness = 10
GROUP BY r.id HAVING MAX(i.id) IS NULL
The joins with the condition inside (inspired by @Donnie) bring out recipe-ingredient combinations, with NULL ingredient columns wherever the ingredient is not of sweetness 10. We then group by recipe ID and select the "max" ingredient ID. (The MAX function returns NULL if and only if there are no non-NULL IDs to select, i.e., there are no sweetness-10 ingredients associated with this recipe.) Therefore, rows HAVING a NULL MAX(i.id) are exactly the recipes with no sweetness-10 ingredient.
I ran both the NOT EXISTS version of the query and the above version of the query a number of times with the query cache disabled. Against about 400 recipes, the NOT EXISTS query would consistently take about 1.0 seconds to complete, whereas this query's runtime was usually around 0.1 seconds. Against about 5000 recipes, the NOT EXISTS query took about 30 seconds, whereas the above query usually still took 0.1 seconds, and was almost always under 1.0.
It's worth noting that, checking EXPLAINs on each, the query listed here is able to run almost entirely on the indices I've given these tables, which probably explains why it is able to do all sorts of joining and grouping without batting an eye. The NOT EXISTS query, on the other hand, has to do dependent subqueries. The two might perform more equally if these indices weren't in place, but that query optimizer is pretty darn powerful when given the chance to use raw joins, it would seem.
Moral of the story: well-formed JOINs are super-duper powerful :) Thanks, all!
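For readers who want to see the GROUP BY / HAVING MAX(...) IS NULL trick next to NOT EXISTS, here is a minimal sketch. It assumes Python's built-in sqlite3 module rather than MySQL, a join-table column named recipe_id, and hypothetical sample rows; it checks only that the two queries agree, not the timings reported above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredients (id INTEGER PRIMARY KEY, name TEXT, sweetness INTEGER);
    CREATE TABLE recipes_ingredients (recipe_id INTEGER, ingredient_id INTEGER);
    -- Hypothetical data: sugar is the only sweetness-10 ingredient,
    -- and 'Water' has no ingredients at all.
    INSERT INTO recipes VALUES (1, 'Fudge'), (2, 'Bread'), (3, 'Candy'), (4, 'Water');
    INSERT INTO ingredients VALUES (1, 'sugar', 10), (2, 'butter', 2), (3, 'flour', 1);
    INSERT INTO recipes_ingredients VALUES (1, 1), (1, 2), (2, 3), (2, 2), (3, 1);
""")

# Only sweetness-10 ingredients survive the second join; every other row gets
# a NULL i.id, so MAX(i.id) is NULL exactly when the recipe has none of them.
max_null_sql = """
    SELECT r.name FROM recipes r
    LEFT JOIN recipes_ingredients ri ON ri.recipe_id = r.id
    LEFT JOIN ingredients i ON i.id = ri.ingredient_id AND i.sweetness = 10
    GROUP BY r.id
    HAVING MAX(i.id) IS NULL
    ORDER BY r.name
"""

not_exists_sql = """
    SELECT r.name FROM recipes r
    WHERE NOT EXISTS (
        SELECT 1
        FROM recipes_ingredients ri
        JOIN ingredients i ON i.id = ri.ingredient_id
        WHERE ri.recipe_id = r.id AND i.sweetness = 10)
    ORDER BY r.name
"""

via_max = [row[0] for row in conn.execute(max_null_sql)]
via_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
print(via_max, via_not_exists)
```

Because both joins are LEFT JOINs, a recipe without any ingredients is kept as well, which matches what NOT EXISTS returns for it.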