Select if another select returns rows - mysql

I'm trying to write a query that selects the cooking recipes you can make with the items you have.
I have a table named ingredientsOwn with the following structure:
idType (int) amount (int)
Another table named recipes with this structure:
idRecipe (int) name (varchar)
And another table named recipeIngredients
idRecipe (int) idType (int) amount (int)
I would like to show the recipes you can make with the ingredients you have. How could I do this?
I'm trying to implement it in a single query because I really don't know how to iterate through an array in Node.js.
Thanks

The way I would approach this is to compute, for each recipe, the number of ingredients it needs, compare that with the number of those ingredients you own in a sufficient amount, and if the two numbers match, you have a candidate recipe.
So, to get the number of ingredients a recipe needs, you'll have to do something like this (this is closer to SQL Server syntax, so please focus on the concepts rather than the syntax):
select idRecipe, count(*) as neededIngredientsCount
from recipeIngredients
group by idRecipe
To get the number of available ingredients for each recipe, you have to join ingredientsOwn with recipeIngredients, so you can tell how many matching ingredients you have for each recipe.
select recipeIngredients.idRecipe, count(*) as matchingIngredientsCount
from ingredientsOwn inner join recipeIngredients
on ingredientsOwn.idType = recipeIngredients.idType
where ingredientsOwn.amount >= recipeIngredients.amount
group by recipeIngredients.idRecipe
Now you join the previous two queries to get the idRecipe values that you have enough ingredients for, and join them with the recipes table to get the recipe name.
select r.idRecipe, r.name from
(select idRecipe, count(*) as neededIngredientsCount
from recipeIngredients
group by idRecipe) as needed
inner join
(select recipeIngredients.idRecipe, count(*) as matchingIngredientsCount
from ingredientsOwn inner join recipeIngredients
on ingredientsOwn.idType = recipeIngredients.idType
where ingredientsOwn.amount >= recipeIngredients.amount
group by recipeIngredients.idRecipe) as io
on needed.idRecipe = io.idRecipe
and needed.neededIngredientsCount = io.matchingIngredientsCount
inner join recipes as r
on r.idRecipe = needed.idRecipe
Hope this helps, and sorry for not being able to provide valid mysql syntax.
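A compact variant of the counting idea above, as a minimal sketch rather than tested MySQL. It uses Python's sqlite3 module with an in-memory database so it can be run anywhere. The sample rows are made up, and it assumes ingredientsOwn holds at most one row per idType. A LEFT JOIN keeps every required ingredient, and a recipe qualifies when the number of required rows equals the number of rows that found a sufficient owned amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ingredientsOwn (idType INTEGER, amount INTEGER);
CREATE TABLE recipes (idRecipe INTEGER, name TEXT);
CREATE TABLE recipeIngredients (idRecipe INTEGER, idType INTEGER, amount INTEGER);
-- Hypothetical sample data: we own 5 of type 1 and 1 of type 2.
INSERT INTO ingredientsOwn VALUES (1, 5), (2, 1);
INSERT INTO recipes VALUES (10, 'Omelette'), (20, 'Cake');
INSERT INTO recipeIngredients VALUES (10, 1, 2), (10, 2, 1), (20, 1, 3), (20, 3, 1);
""")

query = """
SELECT r.idRecipe, r.name
FROM recipes AS r
JOIN recipeIngredients AS ri ON ri.idRecipe = r.idRecipe
LEFT JOIN ingredientsOwn AS io
       ON io.idType = ri.idType AND io.amount >= ri.amount
GROUP BY r.idRecipe, r.name
-- COUNT(*) counts required ingredients, COUNT(io.idType) only the satisfied ones
HAVING COUNT(*) = COUNT(io.idType)
"""
makeable = conn.execute(query).fetchall()
print(makeable)  # Cake is excluded because ingredient type 3 is not owned
```

The SELECT uses only standard joins and aggregates, so it should carry over to MySQL with little or no change, but that has not been verified here.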

SELECT * FROM recipes INNER JOIN (
select idRecipe from recipeIngredients
WHERE recipeIngredients.idType IN (
SELECT ingredientsOwn.idType from ingredientsOwn
)
) as a ON a.idRecipe = recipes.idRecipe

Related

Why does this sql query error show up? Is there maybe another way to write this?

I am using the Chinook database for a project and I have two difficult queries to execute, but both provide errors.
I am looking for all the orders (invoice) that were sent to 'New York' and contain tracks that belong to more than one genre. [InvoiceId, amount of products, total1, total2]. Total1 should be unitprice*quantity and total2 is total. It should show only 2 rows.
So far I have come up with this. I have also tried switching up with left join, full outer join, etc
CREATE TEMPORARY TABLE temp AS
SELECT *
FROM track join invoiceline USING (TrackId)
WHERE (select * from track t1 where EXISTS (select * from track t2 where t1.GenreId <> t2.GenreId));
SELECT invoice.InvoiceId, invoiceline.Quantity, invoiceline.UnitPrice*invoiceline.Quantity, invoice.Total
FROM (SELECT * FROM invoice JOIN invoiceline
WHERE invoice.BillingCity LIKE '%New York%') JOIN temp cc ON invoiceline.TrackId
GROUP BY invoiceline.InvoiceId;
DROP TABLE temp;
It provides the error:
Operand should contain 1 column(s)
I am looking for clients (in couples) that have bought more than two of the same tracks. It should provide 14 rows.
Until now I have come up with this.
SELECT CONCAT(FIRSTNAME,',', LASTNAME) AS name1 FROM customer
JOIN invoice ON customer.CustomerId = invoice.CustomerId
JOIN invoiceline ON invoice.InvoiceId = invoiceline.InvoiceId
JOIN track ON invoiceline.TrackId = track.TrackId
UNION
(
SELECT CONCAT(FIRSTNAME,',', LASTNAME) AS name2 FROM customer
JOIN invoice ON customer.CustomerId = invoice.CustomerId
JOIN invoiceline ON invoice.InvoiceId = invoiceline.InvoiceId
JOIN track ON invoiceline.TrackId = track.TrackId
);
So A) Does anybody know why it provides that error?
B) Could anyone give any tips or suggest a better way to write these queries?
Here are two helpful schemas: an ER diagram and a relational diagram.
Answer to your first question:
The error comes up because the subquery used as the WHERE condition selects every column (select *), while MySQL expects a single-column operand there. This method is also very redundant.
You should count the distinct genre IDs and keep the track IDs with a count greater than 1, as shown below:
CREATE TEMPORARY TABLE temp AS
SELECT *
FROM track join invoiceline USING (TrackId)
WHERE TrackId in
(select TrackId from (select TrackId, count(distinct GenreId) as genres from track group by 1 having genres>1) as multi_genre);
SELECT invoice.InvoiceId, invoiceline.Quantity, invoiceline.UnitPrice*invoiceline.Quantity, invoice.Total
FROM (SELECT * FROM invoice JOIN invoiceline
WHERE invoice.BillingCity LIKE '%New York%') JOIN temp cc ON invoiceline.TrackId
GROUP BY invoiceline.InvoiceId;
DROP TABLE temp;
I have assumed that track id is the primary key here.
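For comparison, here is a minimal sketch of a single-query alternative that avoids the temporary table: group the invoice lines per invoice and keep the groups with more than one distinct genre. It runs on SQLite through Python's sqlite3 module, with tiny made-up tables that carry only the Chinook columns needed here. Treating "amount of products" as the summed Quantity is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (InvoiceId INTEGER PRIMARY KEY, BillingCity TEXT, Total REAL);
CREATE TABLE track (TrackId INTEGER PRIMARY KEY, GenreId INTEGER);
CREATE TABLE invoiceline (InvoiceId INTEGER, TrackId INTEGER, UnitPrice REAL, Quantity INTEGER);
-- Hypothetical sample data
INSERT INTO invoice VALUES (1, 'New York', 5.0), (2, 'New York', 2.0), (3, 'Boston', 3.0);
INSERT INTO track VALUES (1, 1), (2, 2), (3, 1);
INSERT INTO invoiceline VALUES
  (1, 1, 1.0, 1), (1, 2, 2.0, 2),   -- invoice 1: genres 1 and 2
  (2, 1, 1.0, 1), (2, 3, 1.0, 1),   -- invoice 2: genre 1 only
  (3, 1, 1.0, 1), (3, 2, 2.0, 1);   -- invoice 3: two genres, wrong city
""")

query = """
SELECT i.InvoiceId,
       SUM(il.Quantity)                AS products,
       SUM(il.UnitPrice * il.Quantity) AS total1,
       i.Total                         AS total2
FROM invoice AS i
JOIN invoiceline AS il ON il.InvoiceId = i.InvoiceId
JOIN track AS t ON t.TrackId = il.TrackId
WHERE i.BillingCity = 'New York'
GROUP BY i.InvoiceId, i.Total
HAVING COUNT(DISTINCT t.GenreId) > 1
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only invoice 1 survives in this sample: invoice 2 has a single genre and invoice 3 was billed to another city.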
For the second question, I assume that you want to find customers buying the same records. You can use a query like the one below:
SELECT invoiceline.TrackId, group_concat(customer.CustomerId) as customers FROM customer
JOIN invoice ON customer.CustomerId = invoice.CustomerId
JOIN invoiceline ON invoice.InvoiceId = invoiceline.InvoiceId
JOIN track ON invoiceline.TrackId = track.TrackId
group by 1
This will give you the comma-separated IDs of the customers who have bought the same track. Also, use the customer ID instead of first name and last name, since some customers can have the same name. Using the primary key is best.
Since you mentioned that you want customers buying the same records in couples, I would suggest reading up on market basket analysis or association analysis using the apriori algorithm. You can import your dataset into R or Python, whichever you are comfortable with, and build a visualization. In my experience, Python is faster and can handle more data, but its visualizations are weaker; R is a bit slow at handling large amounts of data but has good visualizations for the apriori algorithm.
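If the goal is just the pairs of customers and not a full association analysis, a self-join may be enough. The sketch below assumes "in couples" means pairs of distinct customers who share more than two purchased tracks. It uses Python's sqlite3 module and made-up rows, with only the Chinook columns it needs. The CTE named purchases is an illustration; on a MySQL version without CTE support it could be inlined as a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER);
CREATE TABLE invoiceline (InvoiceId INTEGER, TrackId INTEGER);
-- Hypothetical sample data: customers 1, 2 and 3 with one invoice each.
INSERT INTO invoice VALUES (100, 1), (101, 2), (102, 3);
INSERT INTO invoiceline VALUES
  (100, 1), (100, 2), (100, 3), (100, 4),
  (101, 1), (101, 2), (101, 3),
  (102, 1), (102, 2);
""")

query = """
WITH purchases AS (
    -- one row per (customer, track), however often the track was bought
    SELECT DISTINCT i.CustomerId, il.TrackId
    FROM invoice AS i
    JOIN invoiceline AS il ON il.InvoiceId = i.InvoiceId
)
SELECT a.CustomerId, b.CustomerId, COUNT(*) AS shared_tracks
FROM purchases AS a
JOIN purchases AS b
  ON a.TrackId = b.TrackId
 AND a.CustomerId < b.CustomerId   -- each pair once, no self-pairs
GROUP BY a.CustomerId, b.CustomerId
HAVING COUNT(*) > 2
ORDER BY a.CustomerId, b.CustomerId
"""
pairs = conn.execute(query).fetchall()
print(pairs)
```

In this sample only customers 1 and 2 share three tracks; the other pairs share two and are filtered out.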

SQL -- many-to-many query

I'm trying to select all recipes which contain a given number of ingredients. I'm able to find all recipes based on one given recipe_id with:
SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
WHERE recipe_ingredient.recipe_id = ?
But I'm having trouble figuring out what the query should look like when I'm looking for recipes which contain more than one specific ingredient, for example Ingredient A and Ingredient B.
My tables look like this:
ingredient
-ingredient_id
-name
recipe_ingredient
-recipe_ingredient
-ingredient_id
-recipe_id
recipe
-recipe_id
-name
I would be very happy about any ideas on how to solve that problem!
Thanks.
Your query will look something like this
SELECT name, count(*)
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
GROUP BY name
HAVING count(*) > 1
If you are looking for specific ingredients, you could do a pre-query that unions all the ingredients you are interested in. Join that to the ingredients of each recipe and make sure that all ingredients are accounted for. This is handled by the GROUP BY and by HAVING count = the number of ingredients you are looking for.
I did this example based on the name of the ingredient. If you know the actual ingredient ID (which would be more accurate, for example in a web app where the user picks the IDs), just join on the ingredient ID directly instead of on the name of the ingredient.
SELECT
r.name,
r.recipe_id
from
( SELECT 'Milk' as Ingredient
UNION select 'Eggs'
UNION select 'Butter' ) as Wanted
JOIN ingredient i
ON Wanted.Ingredient = i.name
JOIN recipe_ingredient ri
ON i.ingredient_id = ri.ingredient_id
JOIN recipe r
ON ri.recipe_id = r.recipe_id
group by
r.recipe_id, r.name
having
COUNT(*) = 3
In this case, Milk, Butter, Eggs and a final count = 3.
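A runnable sketch of the same "count must equal the number of wanted ingredients" idea, using Python's sqlite3 module. It is a variant, not the query above: the wanted names are passed as bound parameters in an IN list instead of a UNION derived table. The helper name recipes_with_all and the sample rows are made up, and ingredient names are assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ingredient (ingredient_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE recipe (recipe_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE recipe_ingredient (recipe_id INTEGER, ingredient_id INTEGER);
-- Hypothetical sample data
INSERT INTO ingredient VALUES (1, 'Milk'), (2, 'Eggs'), (3, 'Butter'), (4, 'Flour');
INSERT INTO recipe VALUES (1, 'Pancakes'), (2, 'Scrambled eggs'), (3, 'Custard');
INSERT INTO recipe_ingredient VALUES
  (1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (3, 1), (3, 2);
""")

def recipes_with_all(conn, wanted):
    """Return names of recipes that contain every ingredient name in `wanted`."""
    wanted = sorted(set(wanted))          # duplicates would break the count check
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT r.name
        FROM recipe AS r
        JOIN recipe_ingredient AS ri ON ri.recipe_id = r.recipe_id
        JOIN ingredient AS i ON i.ingredient_id = ri.ingredient_id
        WHERE i.name IN ({placeholders})
        GROUP BY r.recipe_id, r.name
        HAVING COUNT(DISTINCT i.ingredient_id) = ?
        ORDER BY r.name
    """
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]

print(recipes_with_all(conn, ["Milk", "Eggs", "Butter"]))
print(recipes_with_all(conn, ["Eggs", "Butter"]))
```

Only the placeholders are interpolated into the SQL text; the ingredient names themselves are always bound as parameters.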
In order to match all elements of an IN clause, you need to make sure you select only Recipes that have a count which matches the total number of Ingredients in your list:
SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
WHERE recipe_ingredient.ingredient_id IN (ID1, ID2, ID3) -- list of ingredient IDs
GROUP BY name
HAVING COUNT(*) = 3 -- number of ingredients you have chosen
Good luck finding which recipe will work with the ingredients you have available
Here is a functional example
Just use OR for your where.
Like this
$query = "SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
WHERE recipe_ingredient.recipe_id = ? OR recipe_ingredient.recipe_id = ?";
Use group by and having clause
SELECT name
FROM recipe
INNER JOIN recipe_ingredient
ON recipe.recipe_id = recipe_ingredient.recipe_id
GROUP BY name
HAVING count(1) > 1

Incomprehensible query behaviour

I have multiple tables, related by multiple foreign keys as in the following example:
Recipes(id_recipe,name,calories,category) - id_recipe as PK.
Ingredients(id_ingredient,name,type) - id_ingredient as PK.
Contains(id_ingredient,id_recipe,quantity,unit) - (id_ingredient,id_recipe) as PK, and as Foreign Keys for Recipes(id_recipe) and Ingredients(id_ingredient).
You can see these relations represented in this image.
So basically Contains is a bridge between Recipes and Ingredients.
The query I'm trying to write is supposed to return the names of the recipes that have an ingredient of type "bovine" but no ingredient of type "lactic".
My attempt:
SELECT DISTINCT Recipes.name
FROM Ingredients JOIN Contains USING(id_ingredient) JOIN Recipes USING (id_recipe)
WHERE Ingredients.type = "bovin"
AND Ingredients.type <> "lactic";
The problem is it still shows me recipes that have at least one lactic ingredient.
I would appreciate any help!
This is the general form of the kind of query you need:
SELECT *
FROM tableA
WHERE tableA.ID NOT IN (
SELECT table_ID
FROM ...
)
;
-- EXAMPLE BELOW --
The subquery gives the id values of all recipes that the "lactic" ingredient is used in, the outer query says "give me all the recipes not in that list".
SELECT DISTINCT Recipes.name
FROM Recipes
WHERE id_recipe NOT IN (
SELECT DISTINCT id_recipe
FROM `Ingredients` AS `i`
INNER JOIN `Contains` AS `c` USING (id_ingredient)
WHERE `i`.`type` = "lactic"
)
;
Alternatively, using your original query:
You could have changed the second join to a LEFT JOIN, changed its USING to an ON, included AND type = "lactic" there instead, and ended the query with HAVING Ingredients.type IS NULL (or WHERE; I just prefer HAVING for "final result" filtering). This would tell you which items could not be joined to the "lactic" ingredient.
A common solution of this type of question (checking conditions over a set of rows) utilizes aggregate + CASE.
SELECT R.Name
FROM Recipes R
INNER JOIN Contains C
on R.ID_Recipe = C.ID_Recipe
INNER JOIN Ingredients I
on C.ID_Ingredient = I.ID_Ingredient
GROUP BY R.name
having -- no 'lactic' ingredient
sum(case when type = 'lactic' then 1 else 0 end) = 0
and -- at least one 'bovin' ingredient
sum(case when type = 'bovin' then 1 else 0 end) > 0
It's easy to extend to any number of ingredients and any kind of question.
Hijacked the fiddle of xQbert
SELECT R.NAME
FROM CONTAINS C
INNER JOIN INGREDIENTS I
ON I.ID_INGREDIENT = C.ID_INGREDIENT AND I.TYPE = 'bovine' AND I.TYPE <> 'lactic'
INNER JOIN RECIPES R
ON R.ID_RECIPE = C.ID_RECIPE
GROUP BY R.NAME
That should work, maybe you need to escape 'contains'. It could be recognized as a SQL function.
SQL Fiddle
In my example burgers and pasta have 'Bovin' and thus show up. So do cookies but cookies also have 'lactic' which is why they get excluded.
SELECT R.Name
FROM Recipes R
INNER JOIN Contains C
on R.ID_Recipe = C.ID_Recipe
INNER JOIN Ingredients I
on C.ID_Ingredient = I.ID_Ingredient
LEFT JOIN (SELECT R2.ID_Recipe
FROM Ingredients I2
INNER JOIN Contains C2
on C2.ID_Ingredient = I2.ID_Ingredient
INNER JOIN Recipes R2
on R2.ID_Recipe = C2.ID_Recipe
WHERE Type = 'lactic'
GROUP BY R2.ID_Recipe) T3
on T3.ID_Recipe = R.ID_Recipe
WHERE T3.ID_Recipe is null
and I.Type = 'Bovin'
GROUP BY R.name
There is likely a more elegant way of doing this. I really wanted to use a CTE and join it to itself, but MySQL has no CTEs (before version 8.0). There is likely a way to do this using EXISTS too. I'm not a big fan of IN clauses, as performance generally suffers: EXISTS is fastest, joins second fastest, IN slowest (generally speaking).
The inline view (sub query) returns the ID_recipe of those you don't want to include.
The outer query returns the Name of the recipes with ingredients you want.
By joining these two with an outer join, we return all recipes, with the subquery's recipe ID populated only for those that contain the undesired ingredient. We then keep only the rows where that ID is null (undesired ingredient not found), so you get only the recipes that have the desired ingredient and none of the undesired one.
You can use NOT EXISTS for this.
Try this:
SELECT DISTINCT Recipes.`name`
FROM Recipes JOIN Contains AS C1 USING (id_recipe) JOIN Ingredients USING(id_ingredient)
WHERE Ingredients.type = "bovin"
AND NOT EXISTS (
SELECT 1
FROM Contains AS C2 JOIN Ingredients USING(id_ingredient)
WHERE C1.id_recipe = C2.id_recipe
AND Ingredients.type = "lactic"
)
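To check that the conditional-aggregation and NOT EXISTS approaches agree, here is a small sketch using Python's sqlite3 module. The rows are made up (loosely following the burgers, pasta and cookies example above), only the columns the queries need are created, and the table name is quoted as "Contains" in case it collides with a keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Recipes (id_recipe INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Ingredients (id_ingredient INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE "Contains" (id_ingredient INTEGER, id_recipe INTEGER);
-- Hypothetical sample data
INSERT INTO Recipes VALUES (1, 'Burgers'), (2, 'Pasta'), (3, 'Cookies'), (4, 'Salad');
INSERT INTO Ingredients VALUES
  (1, 'beef', 'bovin'), (2, 'butter', 'lactic'), (3, 'lettuce', 'vegetable');
INSERT INTO "Contains" VALUES (1, 1), (3, 1), (1, 2), (1, 3), (2, 3), (3, 4);
""")

aggregate_sql = """
SELECT R.name
FROM Recipes AS R
JOIN "Contains" AS C ON C.id_recipe = R.id_recipe
JOIN Ingredients AS I ON I.id_ingredient = C.id_ingredient
GROUP BY R.id_recipe, R.name
HAVING SUM(CASE WHEN I.type = 'lactic' THEN 1 ELSE 0 END) = 0   -- no lactic
   AND SUM(CASE WHEN I.type = 'bovin'  THEN 1 ELSE 0 END) > 0   -- some bovin
ORDER BY R.name
"""

not_exists_sql = """
SELECT DISTINCT R.name
FROM Recipes AS R
JOIN "Contains" AS C1 ON C1.id_recipe = R.id_recipe
JOIN Ingredients AS I1 ON I1.id_ingredient = C1.id_ingredient
WHERE I1.type = 'bovin'
  AND NOT EXISTS (
      SELECT 1
      FROM "Contains" AS C2
      JOIN Ingredients AS I2 ON I2.id_ingredient = C2.id_ingredient
      WHERE C2.id_recipe = R.id_recipe AND I2.type = 'lactic')
ORDER BY R.name
"""

by_aggregate = [row[0] for row in conn.execute(aggregate_sql)]
by_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
print(by_aggregate, by_not_exists)
```

Cookies drop out because they contain a lactic ingredient, and Salad drops out because it has no bovin ingredient at all.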

"Join" admins of different tables into a string

The real issue
Involved tables and their columns
accounts [id,name]
rooms [id,name,topic,owner]
room_admins [account_id,room_id]
Q: Get all rooms with their admin- and owner ids.
Where "all" of course has a condition to it (above: WHERE name LIKE ...)
Admins and owners should be returned in one column just called "admins". I tried to concatenate them above into one string.
What I tried
I came up with a solution, but it requires an ominous external variable ":room_id" that would have to change for each row of the outer SELECT, and therefore makes no sense at all.
SELECT id,name,topic,
(SELECT GROUP_CONCAT(admins.account_id) AS owner
FROM
(SELECT account_id
FROM `room_admins`
WHERE room_id=:room_id
UNION
SELECT owner FROM `rooms` WHERE id=:room_id) admins) AS owner
FROM `rooms`
WHERE name LIKE "%htm%" OR topic LIKE "%htm%" LIMIT 20
Well, I haven't given this deep thought, but I've just come up with this (sample data would have been useful for testing, so this is just a blind answer).
select id, name, topic, group_concat(owner_admin) from (
select id, name, topic, owner owner_admin from rooms
union
select id, name, topic, account_id from rooms
left join room_admins on id = room_id
) s
where name like "%htm%" or topic like "%htm%"
group by id, name, topic
Basically I'm just generating a derived table with owner and admins mixed in one column. Then performing the grouping on that mixed column.
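Since the answer above is untested, here is a minimal runnable sketch of the same derived-table idea using Python's sqlite3 module. The sample rows are made up, and the UNION is written against (room_id, account_id) pairs, which is a slight restructuring of the query above. The order of values inside GROUP_CONCAT is not guaranteed, so the result is split and sorted in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rooms (id INTEGER PRIMARY KEY, name TEXT, topic TEXT, owner INTEGER);
CREATE TABLE room_admins (account_id INTEGER, room_id INTEGER);
-- Hypothetical sample data; account 7 is both owner and admin of room 1.
INSERT INTO rooms VALUES
  (1, 'html help', 'markup', 7), (2, 'cooking', 'food', 5), (3, 'css', 'htm and css', 4);
INSERT INTO room_admins VALUES (8, 1), (9, 1), (7, 1), (6, 2);
""")

query = """
SELECT r.id, r.name, GROUP_CONCAT(a.account_id) AS admins
FROM rooms AS r
JOIN (
    SELECT room_id, account_id FROM room_admins
    UNION                              -- UNION removes the duplicate owner/admin row
    SELECT id, owner FROM rooms
) AS a ON a.room_id = r.id
WHERE r.name LIKE '%htm%' OR r.topic LIKE '%htm%'
GROUP BY r.id, r.name
"""

admins_by_room = {
    room_id: sorted(int(x) for x in admins.split(","))
    for room_id, _name, admins in conn.execute(query)
}
print(admins_by_room)
```

Room 2 is filtered out by the LIKE condition, and room 3 still appears even though it has no rows in room_admins, because its owner comes from the second branch of the UNION.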
Most of the time, when you want to select and display dependent data, you want to use a JOIN. In this case, you want to join the rooms with their admins, so basically:
SELECT r.id, r.name, r.topic, a.id
FROM rooms r
LEFT JOIN admins a
ON r.id = a.room_id
WHERE :condition
Since you have one additional admin not in the admins table (the room owner), you have to (self) join a second time:
SELECT r.id, r.name, r.topic, a.id
FROM rooms r
LEFT JOIN admins a
ON r.id = a.room_id
LEFT JOIN rooms o
ON r.id = o.id
WHERE :condition
This doesn't give us any new information, but your question states that you want to return the list of admins in a single field. So, finally, putting it all together:
SELECT r.id, r.name, r.topic, GROUP_CONCAT(a.id)
FROM rooms r
LEFT JOIN
(
SELECT id, room_id FROM admins
UNION SELECT rooms.owner AS id, rooms.id AS room_id FROM rooms
) a
ON r.id = a.room_id
WHERE :condition
GROUP BY r.id
But to avoid this ugly sub-select-union clause, I'd advise you to put the room owner into your admin table too.

SQL query finding best categories match

I have categories and multiple categorizations for my Items. How can I find, for a specific Item, other Items that have the same categories, ordered by the number of matching categories (i.e. best match)?
My table structure is roughly:
Item Table
ID
Name
...
Category Table
ID
Name
...
Categorization Table
ID
Item_ID
Category_ID
...
To find all Items having similar categories, for example, I use
SELECT `items`.*
FROM `items`
INNER JOIN `categorizations` c1
ON c1.`item_id` = `items`.`id`
INNER JOIN `categorizations` c2
ON c2.`item_id` = <Item_ID>
WHERE c1.`category_id` = c2.`category_id`
This should produce a table of counts of category matches between each pair of items that share at least one category.
select i1.id,i2.id,count(1)
from items i1
join categorizations c1 on c1.item_id=i1.id
join categorizations c2 on c2.category_id=c1.category_id
join items i2 on c2.item_id=i2.id
where i1.id <> i2.id
group by i1.id,i2.id
order by count(1)
I suspect that it may be a bit slow, though. I don't have an instance of MySQL at the moment to try it out.
Something like:
select item_id, count(id)
from item_category ic
where exists(
select category_id
from item_category ic2
where ic2.item_id = @item_id
and ic2.category_id = ic.category_id )
and item_id <> @item_id
group by item_id
order by count(item_id) desc
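A minimal runnable sketch of the best-match ranking for one item, using Python's sqlite3 module and a self-join on the categorization table. The helper name best_matches and the sample rows are made up, and it assumes there are no duplicate (item_id, category_id) rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categorizations (id INTEGER PRIMARY KEY, item_id INTEGER, category_id INTEGER);
-- Hypothetical sample data: item 1 is in categories 1, 2 and 3.
INSERT INTO categorizations (item_id, category_id) VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 1), (2, 2), (2, 3),
  (3, 1),
  (4, 9),
  (5, 2), (5, 3);
""")

def best_matches(conn, item_id):
    """Return (other_item_id, shared_category_count), best match first."""
    sql = """
        SELECT other.item_id, COUNT(*) AS shared
        FROM categorizations AS mine
        JOIN categorizations AS other
          ON other.category_id = mine.category_id
        WHERE mine.item_id = ? AND other.item_id <> ?
        GROUP BY other.item_id
        ORDER BY shared DESC, other.item_id
    """
    return conn.execute(sql, (item_id, item_id)).fetchall()

print(best_matches(conn, 1))
```

Items that share no category with the given item (item 4 in this sample) do not appear in the result at all.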
An alternative method, which I have just implemented to solve this problem, uses bitwise operators to speed things up. In MySQL this method only works if you have 64 or fewer categories, as the bit functions are 64-bit.
1) Assign each category a unique integer value which is a power of 2.
2) For each item sum the category values that the item is in to create a 64 bit int representing all of the categories that the item is in.
3) To compare an item to another do something like:
SELECT id, BIT_COUNT(item1categories & item2categories) AS numMatchedCats FROM tablename HAVING numMatchedCats > 0 ORDER BY numMatchedCats DESC
The BIT_COUNT() function might be MySQL specific so an alternative may well be required for any other DB.
MySQL bit functions used are explained here:
http://dev.mysql.com/doc/refman/5.0/en/bit-functions.html
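The three steps above can be sketched in plain Python to show the arithmetic. This is only an illustration of the bitmask idea, not the MySQL implementation: the function names are made up, and categories are assumed to be numbered 0 to 63 so that each one maps to a single bit.

```python
def category_mask(category_numbers):
    """Pack category numbers (0..63) into one integer, one bit per category."""
    mask = 0
    for number in category_numbers:
        if not 0 <= number < 64:
            raise ValueError("category number must be between 0 and 63")
        mask |= 1 << number          # 1 << number is the power of two for this category
    return mask

def shared_categories(mask_a, mask_b):
    """Count the bits set in both masks, like BIT_COUNT(a & b) in MySQL."""
    return bin(mask_a & mask_b).count("1")

item1 = category_mask([0, 2, 5])     # 1 + 4 + 32 = 37
item2 = category_mask([2, 5, 7])
item3 = category_mask([1])

print(shared_categories(item1, item2))
print(shared_categories(item1, item3))
```

Items 1 and 2 share categories 2 and 5, while items 1 and 3 share none, which is the case the HAVING numMatchedCats > 0 filter removes.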