New to MySQL looking for Many-to-Many relationship query

New to MySQL looking for Many-to-Many relationship query - mysql

Some background I have a set of data that represents the alchemy ingredients and their effects from Skyrim. If you're unfamiliar with this you can combine 2-4 ingredients to make a potion. Each ingredient has 4 effects. If any effects between ingredients are the same it will make that type of potion. I've identified this as a many-to-many relationship and I set up my tables like so:
ingredients: ing_id (key), ing_name, (other supplemental info)
effects: eff_id (key), eff_name
ing_eff_xref: eff_id, ing_id
I would like to input 2 or more available ingredients and return possible combinations without knowing what the effects are. My sql experience is pretty much limited to phpmyadmin and simple select queries. I guess my questions are: is this the right way to structure the tables for this type of relationship, do I need to set foreign keys if I don't plan on updating the tables, and is there a query that can take a set of ing_names and return only eff_names that intersect?
Here is the mysqldump of the db if anyone is interested: http://dl.dropbox.com/u/59699040/alchemy_db.sql

is this the right way to structure the tables for this type of relationship?
Yes, but then you don't need to have effect1 through effect4 on the ingredient table.
do I need to set foreign keys if I don't plan on updating the tables?
Yes. The only way for you to get the data that you're after is by JOINing three tables together. Without foreign keys (or more specifically, appropriate indexes), that may not perform well on queries. Of course you do have a small number of rows overall, but using foreign keys is a good practice to follow in this type of scenario.
is there a query that can take a set of ing_names and return only
eff_names that intersect?
I think you're after something like this:
SELECT e.eff_name
FROM ingredients i
INNER JOIN ing_eff_xref ie ON ie.ing_id = i.ing_id
INNER JOIN effects e ON e.eff_id = ie.eff_id
WHERE i.ing_name = 'Abecean Longfin ';
If you need to see effects for multiple ingredients, you could adjust your WHERE clause, like this:
WHERE i.ing_name IN ('Abecean Longfin ','Eye of Sabre Cat ','Bear Claws ');
You'll probably not want duplicate effects, so you could do a SELECT DISTINCT to eliminate those.
Can potion effects stack in Skyrim? If they in can stack, then you can do a GROUP BY query with a COUNT to get the stacked value of each effect:
SELECT e.eff_name, count(*) as value
FROM ingredients i
INNER JOIN ing_eff_xref ie ON ie.ing_id = i.ing_id
INNER JOIN effects e ON e.eff_id = ie.eff_id
WHERE i.ing_name IN ('Eye of Sabre Cat ','Bear Claws ')
GROUP BY e.eff_name;
This query will list 6 effects with a value of 1, and "Restore Stamina" will have a value of 2. Not sure if Skyrim potions work this way or not, but it was just an extra thought.

Related

Whats the best way to implement a database with multivalued attributes?

i am trying to implement a database which has multi valued attributes and create a filter based search. For example i want my people_table to contain id, name, address, hobbies, interests (hobbies and interests are multi-valued). The user will be able to check many attributes and sql will return only those who have all of them.
I made my study and i found some ways to implement this but i can't decide which one is the best.
The first one is to have one table with the basic info of people (id, name, address), two more for the multi-valued attributes and one more which contains only the keys of the other tables (i understand how to create this tables, i don't know yet how to implement the search).
The second one is to have one table with the basic info and then one for each attribute. So i will have 20 or more tables (football, paint, golf, music, hiking etc.) which they only contain the ids of the people. Then when the user checks the hobbies and the activities i am going to get the desired results with the use of the JOIN feature (i am not sure about the complexity, so i don't know how fast is going to be if the user do many checks).
The last one is an implementation that i didn't find on internet (and i know there is a reason :) ) but in my mind is the easiest to implement and the fastest in terms of complexity. Use only one table which will have the basic infos as normal and also all the attributes as boolean variables. So if i have 1000 people in my table there are going to be only 1000 loops and which i imagine with the use of AND condition are going to be fast enough.
So my question is: can i use the the third implementation or there is a big disadvantage that i don't get? And also which one of the first two ways do you suggest me to use?

That is a typical n to m relation. It works like this
persons table
------------
id
name
address
interests table
---------------
id
name
person_interests table
----------------------
person_id
interest_id
person_interests contains a record for each interest of a person. To get the interests of a person do:
select i.name
from interests i
join person_interests pi on pi.interest_id = i.id
join persons p on pi.person_id = p.id
where p.name = 'peter'
You could create also tables for hobbies. To get the hobbies do the same in a separate query. To get both in one query you can do something like this
select p.id, p.name,
i.name as interest,
h.name as hobby
from persons p
left join person_interests pi on pi.person_id = p.id
left join interests i on pi.interest_id = i.id
left join person_hobbies ph on ph.person_id = p.id
left join hobbies h on ph.hobby_id = h.id
where p.name = 'peter'

The basic way to deal with this is with a many-to-many join table. Each user can have many hobbies. Each hobby can have many users. That's basic stuff you can find information about anywhere, and #juergend already covered that.
The harder part is tracking different information about various hobbies and interests. Like if their hobby is "baseball" you might want to track what position they play, but if their hobby is "travel" you might want to track their favorite countries. Doing this with typical SQL relationships will lead to a rapid proliferation of tables and columns.
A hybrid approach is to use the new JSON data type to store some unstructured data. To expand on #juergend's example, you might add a field to Person_Interests which can store some of those details about that person's interest.
create table Person_Interests (
InterestID integer references Interests(ID),
PersonID integer references Persons(ID),
Details JSON
);
And now you could add that Person 45 has Interest 12 (travel), their favorite country is Djibouti, and they've been to 45 countries.
insert into person_interests
(InterestID, PersonID, Details)
(12, 45, '{"favorite_country": "Djibouti", "countries_visited": 45}');
And you can use JSON search functions to find, for example, everyone whose favorite country is Djibouti.
select p.id, p.name
from person_interests pi
join persons p on p.id = pi.personid
where pi.details->"$.favorite_country" = "Djibouti"
The advantage here is flexibility: interests and their attributes aren't limited by your database schema.
The disadvantages is performance. The JSON data type isn't the most efficient, and indexing a JSON column in MySQL is complicated. Good indexing is critical to good SQL performance. So as you figure out common patterns you might want to turn commonly used attributes into real columns in real tables.
The other option would be to use table inheritance. This is a feature of Postgres, not MySQL, and I'd recommend considering switching. Postgres also has better and more mature JSON support and JSON columns are easier to index.
With table inheritance, rather than having to write a completely new table for every different interest, you can make specific tables which inherit from a more generic one.
create table person_interests_travel (
FavoriteCountry text,
CountriesVisited text[]
) inherits(person_interests);
This still has InterestID, PersonID, and Details, but it's added some specific columns for tracking their favorite country and countries they've visited.
Note that text[]. Postgresql also supports arrays so you can store real lists without having to create another join table. You can also do this in MySQL with a JSON field, but arrays offer type constraints that JSON does not.

SQL Inner Join With Multiple Columns

I've got 2 tables - dishes and ingredients:
in Dishes, I've got a list of pizza dishes, ordered as such:
In Ingredients, I've got a list of all the different ingredients for all the dishes, ordered as such:
I want to be able to list all the names of all the ingredients of each dish alongside each dish's name.
I've written this query that does not replace the ingredient ids with names as it should, instead opting to return an empty set - please explain what it that I'm doing wrong:
SELECT dishes.name, ingredients.name, ingredients.id
FROM dishes
INNER JOIN ingredients
ON dishes.ingredient_1=ingredients.id,dishes.ingredient_2=ingredients.id,dishes.ingredient_3=ingredients.id,dishes.ingredient_4=ingredients.id,dishes.ingredient_5=ingredients.id,dishes.ingredient_6=ingredients.id, dishes.ingredient_7=ingredients.id,dishes.ingredient_8=ingredients.id;
It would be great if you could refer to:
The logic of the DB structuring - am I doing it correctly?
The logic behind the SQL query - if the DB is built in the right fashion, then why upon executing the query I get the empty set?
If you've encountered such a problem before - one that requires a single-to-many relationship - how did you solved it in a way different than this, using PHP & MySQL?
Disregard The Text In Hebrew - Treat It As Your Own Language.

It seems to me that a better Database Structure would have a Dishes_Ingredients_Rel table, rather than having a bunch of columns for Ingredients.
DISHES_INGREDIENTS_REL
DishesID
IngredientID
Then, you could just do a much simpler JOIN.
SELECT Ingredients.Name
FROM Dishes_Ingredients_Rel
INNER JOIN Ingredients
ON Dishes_Ingredients.IngredientID = Ingredients.IngredientID
WHERE Dishes_Ingredients_Rel.DishesID = #DishesID

1. The logic of the DB structuring - am I doing it correctly?
This is denormalized data. To normalize it, you would restructure your database into three tables:
Pizza
PizzaIngredients
Ingredients
Pizza would have ID, name, and type where ID is the primary key.
PizzaIngredients would have PizzaId and IngredientId (this is a many-many table where the primary key is a composite key of PizzaId and IngredientID)
Ingredients has ID and name where ID is the primary key.
2. List all the names of all the ingredients of each dish alongside each dish's name. Something like this in MySQL (untested):
SELECT p.ID, p.name, GROUP_CONCAT(i.name) AS ingredients
FROM pizza p
INNER JOIN pizzaingredients pi ON p.ID = pi.PizzaID
INNER JOIN ingredients i ON pi.IngredientID = i.ID
GROUP BY p.id
3. If you've encountered such a problem before - one that requires a single-to-many relationship - how did you solved it in a way different than this, using PHP & MySQL?
Using a many-many relationship, since that what your example truly is. You have many pizzas which can have many ingredients. And many ingredients belong to many different pizzas.

The reason you are getting an empty result is because you are setting a join condition that never gets satisfied. During the INNER join execution the database engine compares each record of the first table with each record of the second one trying to find a match where the id of the ingredient table record being evaluated is equal to ingredient1 AND ingredient2 AND so on. It would return some result if you create a record in the first table with the same ingredient in all 8 columns (testing purposes only).
Regarding the database structure, you choose a denormalized one creating 8 columns for each ingredient. There are a lot of considerations possible on this data structure (performance, maintainability, or just think if you are asked to insert a dish with 9 ingredients for example) and I would personally go for a normalized data structure instead.
But if you want to keep this, you should write something like:
SELECT dishes.name, ingredients1.name, ingredients1.id, ingredients2.name, ingredients2.id, ...
FROM dishes
LEFT JOIN ingredients AS ingredients1 ON dishes.ingredient_1=ingredients1.id
LEFT JOIN ingredients AS ingredients2 ON dishes.ingredient_2=ingredients2.id
LEFT JOIN ingredients AS ingredients3 ON dishes.ingredient_3=ingredients3.id
...
The LEFT join is required to get a result for unmatched ingredients (0 value when no ingredient is set reading your example)

Efficiently returning objects joining ALL tags

I'm running a basic tagging-style system and am wondering how efficient my queries are.
My specific use case involves tagging recipe objects with ingredients through a requirement object, which has a recipe_id and an ingredient_id.
Recipes, ingredients and requirements are all completely siloed by user.
I want to be able to return a user's recipes that include ALL ingredients in a given set.
The way I'm doing this, given a list of ingredient_ids (1,2) and user_id of 1, is like this:
SELECT `recipes`.* FROM `recipes`
WHERE `recipes`.`id` IN (
SELECT `requirements`.`recipe_id`
FROM `requirements`
WHERE `requirements`.`ingredient_id` IN (1, 2)
AND `requirements`.`user_id` = 1
GROUP BY `requirements`.`recipe_id`
HAVING COUNT(`requirements`.`recipe_id`) = 2)
This is returning the data I need but I'm worried about its performance. The sub-query doesn't look good because it is grabbing all requirements with ingredient_id 1 or 2, grouping them by recipe and then counting them to match the given array size, simply to create an array against which to further query recipe ids.
But the requirements table could be massive, as each entry manages one of potentially an n-squared number of bi-directional relationships between recipes and ingredients. So it doesn't make sense to me to query the whole table in this way.
Am I missing something?
I've often heard that IN and NULL equality comparisons are so much faster than JOINs, but surely not when the complexity of the subquery negates the speed saving?
It seems like a very simple problem that I'm over-engineering, how would you improve it?

I don't have your database to test, sorry, So I am not sure if this will yield the result you want. But maybe try join to the requirements table instead of using a subquery, it will avoid the potential performance loss and just make for generally cleaner code. Here is what I hope will work for you:
SELECT `recipes`.`recipe_id`
FROM `recipes` AS rec
JOIN `requirements` AS req ON rec.`recipe_id` = req.`recipe_id`
WHERE `requirements`.`ingredient_id` IN (1, 2)
AND `requirements`.`user_id` = 1
GROUP BY `recipes`.`recipe_id`
HAVING COUNT(`requirements`.`ingredient_id`) = 2
If you have questions let me know

MYSQL join tables based on column data and table name

I'm wondering if this its even posible.
I want to join 2 tables based on the data of table 1.
Example table 1 has column food with its data beeing "hotdog".
And I have a table called hotdog.
IS it possible to do a JOIN like.
SELECT * FROM table1 t join t.food on id = foodid
I know it doesnt work but, its even posible, is there a work arround?.
Thanks in advance.

No, you can't join to a different table per row in table1, not even with dynamic SQL as #Cade Roux suggests.
You could join to the hotdog table for rows where food is 'hotdog' and join to other tables for other specific values of food.
SELECT * FROM table1 JOIN hotdog ON id = foodid WHERE food = 'hotdog'
UNION
SELECT * FROM table1 JOIN apples ON id = foodid WHERE food = 'apples'
UNION
SELECT * FROM table1 JOIN soups ON id = foodid WHERE food = 'soup'
UNION
...
This requires that you know all the distinct values of food, and that all the respective food tables have compatible columns so you can UNION them together.
What you're doing is called polymorphic associations. That is, the foreign key in table1 references rows in multiple "parent" tables, depending on the value in another column of table1. This is a common design mistake of relational database programmers.
For alternative solutions, see my answers to:
Possible to do a MySQL foreign key to one of two possible tables?
Why can you not have a foreign key in a polymorphic association?
I also cover solutions for polymorphic associations in my presentation Practical Object Oriented Models In SQL, and in my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.

Only with dynamic SQL. It is also possible to left join many different tables and use CASE based on type, but the tables would be all have to be known in advance.
It would be easier to recommend an appropriate design if we knew more about what you are trying to achieve, what your design currently looks like and why you've chosen that particular table design in the first place.
-- Say you have a table of foods:
id INT
foodtype VARCHAR(50) (right now it just contains 'hotdog' or 'hamburger')
name VARCHAR(50)
-- Then hotdogs:
id INT
length INT
width INT
-- Then hamburgers:
id INT
radius INT
thickness INT
Normally I would recommend some system for constraining only one auxiliary table to exist, but for simplicity, I'm leaving that out.
SELECT f.*, hd.length, hd.width, hb.radius, hb.thickness
FROM foods f
LEFT JOIN hotdogs hd
ON hd.id = f.id
AND f.foodtype = 'hotdog'
LEFT JOIN hamburgers hb
ON hb.id = f.id
AND f.foodtype = 'hamburger'
Now you will see that such a thing can be code generated (or even for a very slow prototype dynamic SQL on the fly) from SELECT DISTINCT foodtype FROM foods given certain assumptions about table names and access to the table metadata.
The problem is that ultimately whoever consumes the result of this query will have to be aware of new columns showing up whenever a new table is added.
So the question moves back to your client/consumer of the data - how is it going to handle the different types? And what does it mean for different types to be in the same set? And if it needs to be aware of the different types, what's the drawback of just writing different queries for each type or changing a manual query when new types are added given the relative impact of such a change anyway?

How do I make the rows of a lookup table into the columns of a query?

I have three tables: students, interests, and interest_lookup.
Students has the cols student_id and name.
Interests has the cols interest_id and interest_name.
Interest_lookup has the cols student_id and interest_id.
To find out what interests a student has I do
select interests.interest_name from `students`
inner join `interest_lookup`
on interest_lookup.student_id = students.student_id
inner join `interests`
on interests.interest_id = interest_lookup.interest_id
What I want to do is get a result set like
student_id | students.name | interest_a | interest_b | ...
where the column name 'interest_a' is a value in interests.name and
the interest_ columns are 0 or 1 such that the value is 1 when
there is a record in interest_lookup for the given
student_id and interest_id and 0 when there is not.
Each entry in the interests table must appear as a column name.
I can do this with subselects (which is super slow) or by making a bunch of joins, but both of these really require that I first select all the records from interests and write out a dynamic query.

You're doing an operation called a pivot. #Slider345 linked to (prior to editing his answer) another SO post about doing it in Microsoft SQL Server. Microsoft has its own special syntax to do this, but MySQL does not.
You can do something like this:
SELECT s.student_id, s.name,
SUM(i.name = 'a') AS interest_a,
SUM(i.name = 'b') AS interest_b,
SUM(i.name = 'c') AS interest_c
FROM students s
INNER JOIN interest_lookup l USING (student_id)
INNER JOIN interests i USING (interest_id)
GROUP BY s.student_id;
What you cannot do, in MySQL or Microsoft or anything else, is automatically populate columns so that the presence of data expands the number of columns.
Columns of an SQL query must be fixed and hard-coded at the time you prepare the query.
If you don't know the list of interests at the time you code the query, or you need it to adapt to changing lists of interest, you'll have to fetch the interests as rows and post-process these rows in your application.

What your trying to do sounds like a pivot.
Most solutions seem to revolve around one of the following approaches:
Creating a dynamic query, as in Is there a way to pivot rows to columns in MySQL without using CASE?
Selecting all the attribute columns, as in How to pivot a MySQL entity-attribute-value schema
Or, identifying the columns and using either a CASE statement or a user defined function as in pivot in mysql queries

I don't think this is possible. Actually I think this is just a matter of data representatioin. I would try to use a component to display the data that would allow me to pivot the data (for instance, the same way you do on excel, open office's calc, etc).
To take it one step further, you should think again why you need this and probably try to solve it in the application not in the database.
I know this doesn't help much but it's the best I can think of :(

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008