database design -- many to many table - how to best represent - mysql

I need to represent a many-to-many relationship between teachers and the subjects that they teach.
For implementation, a couple of strategies come to mind:
teacher_name | subject_names
bill | math, english, science, french
sally | chemistry, english, arts & crafts
I've rejected this strategy because querying fields with comma separated values does not seem efficient, especially when I will be pulling them for search engines, iteration, etc... though I am certainly open to hearing a defense of this strategy
teacher_name | subject_name
bill | math
bill | english
bill | science
... | ...
sally | chemistry
sally | english
... | ...
I initially thought this was a better idea, but when I query for information about a teacher, the result is hard to report on. For example, it's fine that I get 5 rows for bill's subjects, but it's not fine that every one of those rows also repeats that he lives at 123 main st. I still think this is the better of the two ideas, but maybe a better one exists.
By the way, I don't really use teacher names and subject names as keys; I use numeric IDs, but I've drawn it this way for clarity.

Many-to-many relationships are best solved using a third table called a Junction Table that maps the relations. Here's a good guide that explains a little more in detail, but basically...
The new table will contain two columns of foreign keys: the unique ID of each teacher (the primary key of the teachers table) in one column, and the unique ID of each subject (the primary key of the subjects table) in the other column (and, of course, a column for the junction table's own unique ID).
Say this is your table of teachers:
----------------------------
| ID | Name | Last_Name |
----------------------------
| 0001 | JOHN | STEPHENS |
----------------------------
| 0002 | BRUCE | WAYNE |
----------------------------
And this is your table of subjects
-----------------------
| ID | Subject_name |
-----------------------
| 0101 | MATH |
-----------------------
| 0202 | BIOLOGY |
-----------------------
| 0303 | ENGLISH |
-----------------------
Then you need a junction table like this:
TeacherSubject_JunctionTable:
--------------------------------
| ID | Teacher_ID | Subject_ID |
--------------------------------
| 01 | 0001 | 0101 |
--------------------------------
| 02 | 0001 | 0202 |
--------------------------------
| 03 | 0002 | 0101 |
--------------------------------
| 04 | 0002 | 0303 |
--------------------------------
Now, whenever you need to get a list of teachers that teach math (subject_id 0101), you can simply query the junction table; something like
SELECT Teacher_ID FROM TeacherSubject_JunctionTable
WHERE Subject_ID = 0101;
Or the other way around, if you want to get the subjects that a teacher teaches.
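The query above returns only IDs. Here is a minimal sketch of joining through the junction table to get names in both directions. It assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL so that it runs on its own, and the helper names teachers_for_subject and subjects_for_teacher are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teachers (ID INTEGER PRIMARY KEY, Name TEXT, Last_Name TEXT);
CREATE TABLE subjects (ID INTEGER PRIMARY KEY, Subject_name TEXT);
CREATE TABLE TeacherSubject_JunctionTable (
    ID         INTEGER PRIMARY KEY,
    Teacher_ID INTEGER NOT NULL REFERENCES teachers (ID),
    Subject_ID INTEGER NOT NULL REFERENCES subjects (ID)
);
-- Same rows as the tables above; the zero-padded IDs are stored as plain integers.
INSERT INTO teachers VALUES (1, 'JOHN', 'STEPHENS'), (2, 'BRUCE', 'WAYNE');
INSERT INTO subjects VALUES (101, 'MATH'), (202, 'BIOLOGY'), (303, 'ENGLISH');
INSERT INTO TeacherSubject_JunctionTable VALUES
    (1, 1, 101), (2, 1, 202), (3, 2, 101), (4, 2, 303);
""")


def teachers_for_subject(subject_name):
    """Return (Name, Last_Name) of every teacher linked to the named subject."""
    return conn.execute(
        """
        SELECT t.Name, t.Last_Name
        FROM TeacherSubject_JunctionTable ts
        JOIN teachers t ON t.ID = ts.Teacher_ID
        JOIN subjects s ON s.ID = ts.Subject_ID
        WHERE s.Subject_name = ?
        ORDER BY t.ID
        """,
        (subject_name,),
    ).fetchall()


def subjects_for_teacher(teacher_id):
    """Return the subject names linked to one teacher ID."""
    rows = conn.execute(
        """
        SELECT s.Subject_name
        FROM TeacherSubject_JunctionTable ts
        JOIN subjects s ON s.ID = ts.Subject_ID
        WHERE ts.Teacher_ID = ?
        ORDER BY s.ID
        """,
        (teacher_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(teachers_for_subject("MATH"))
print(subjects_for_teacher(2))
```

Because the teacher's own details live only in the teachers table, they are stored once, no matter how many subjects the teacher has.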

You can have a table for subjects (id, subject_name), another for teachers (id, teacher_name), and a third table called teacher_subject that has (teacher_id, subject_id) as a composite key. This is the most recommended approach for many-to-many relationships because it is normalized.

Both of your solutions break normalization strategies.
The normal way to do this is with a third table, a join table.
teacher_ID | teacher_name | Other stuff about teachers
1 | bill | address, dob etc.
2 | sally | etc.
subject_ID | subject_name | Other stuff about subjects
1 | math | department, campus etc.
2 | english | etc.
teacher_ID | subject_ID | Other stuff about the relationship
1 | 4 | location etc.
1 | 2 | etc.
2 | 2 | etc.
2 | 1 | etc.
teacher_ID and subject_ID are auto-incrementing int primary keys in the first two tables, respectively.
Together, teacher_ID and subject_ID form the composite primary key of the third.
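To illustrate what the composite key buys you, here is a rough sketch, again assuming SQLite through Python's sqlite3 module in place of MySQL. The table names teacher, subject and teacher_subject and the helper try_assign are assumptions for the example. The composite primary key rejects a duplicate pair, and the foreign keys reject a pair that points at a missing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on (InnoDB does so by default).
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE teacher (
    teacher_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
    teacher_name TEXT NOT NULL
);
CREATE TABLE subject (
    subject_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
    subject_name TEXT NOT NULL
);
CREATE TABLE teacher_subject (
    teacher_ID INTEGER NOT NULL REFERENCES teacher (teacher_ID),
    subject_ID INTEGER NOT NULL REFERENCES subject (subject_ID),
    PRIMARY KEY (teacher_ID, subject_ID)   -- one row per teacher/subject pair
);
INSERT INTO teacher (teacher_name) VALUES ('bill'), ('sally');
INSERT INTO subject (subject_name) VALUES ('math'), ('english');
INSERT INTO teacher_subject VALUES (1, 2), (2, 2), (2, 1);
""")


def try_assign(teacher_id, subject_id):
    """Link a teacher to a subject; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO teacher_subject (teacher_ID, subject_ID) VALUES (?, ?)",
                (teacher_id, subject_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


added = try_assign(1, 1)      # bill + math: a new pair
duplicate = try_assign(1, 2)  # bill + english: already present
dangling = try_assign(1, 99)  # subject 99 does not exist
print(added, duplicate, dangling)
```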

Related

Concatenate values for concatenated IDs

Database: MySQL
I have two tables, one for users' assigned roles and one that contains the role information. My problem is that the assigned roles are stored in a single field, separated by commas. I need to build a report that lists the roles by name, not by ID, but still in a single field separated by commas.
I'm thinking GROUP_CONCAT might be the solution, but I've only seen it used to create a concatenated list, not to work from one that already exists.
Table 1:USERS
ID | FNAME | LNAME | ROLE_IDS
------------------------------------------
1 | Bob | Jones | 445,44,45,449,459
2 | Mark | Doe | 426,459,445
3 | Jeff | Apple | 444,45
Table 2: ROLES
ID | ROLE_NAME
------------------------------------
4 | Basic
13 | Reporting
16 | Advanced
44 | Admin
45 | Super User
426 | Accounting
444 | User
445 | Receivables
449 | Processing
459 | Research
Expected Query Results:
ID | FNAME | LNAME | ROLES
-------------------------------------------
1 | Bob | Jones | Receivables, Admin, Super User, Processing, Research
2 | Mark | Doe | Accounting, Research, Receivables
3 | Jeff | Apple | User, Super User
To get the referenced role names, you can use GROUP_CONCAT like this:
SELECT us.ID,us.FNAME,us.LNAME,
GROUP_CONCAT(ro.ROLE_NAME) ROLES_NAME
FROM USERS us
INNER JOIN ROLES ro
ON FIND_IN_SET(ro.ID, us.ROLE_IDS) > 0
GROUP BY us.ID
I've tested it in SQL Fiddle and it works fine.
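If you want to try the query without a MySQL server, here is a rough sketch using Python's sqlite3 module. SQLite has GROUP_CONCAT but no FIND_IN_SET, so the sketch registers a small Python stand-in for it; that stand-in is an approximation written for this example, not MySQL's implementation. Note that GROUP_CONCAT does not promise to keep the order in which the IDs appear in ROLE_IDS.

```python
import sqlite3


def find_in_set(needle, csv):
    """Approximation of MySQL's FIND_IN_SET: 1-based position in a comma list, 0 if absent."""
    if needle is None or csv is None:
        return None
    items = str(csv).split(",")
    needle = str(needle)
    return items.index(needle) + 1 if needle in items else 0


conn = sqlite3.connect(":memory:")
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.executescript("""
CREATE TABLE USERS (ID INTEGER PRIMARY KEY, FNAME TEXT, LNAME TEXT, ROLE_IDS TEXT);
CREATE TABLE ROLES (ID INTEGER PRIMARY KEY, ROLE_NAME TEXT);
INSERT INTO USERS VALUES
    (1, 'Bob', 'Jones', '445,44,45,449,459'),
    (2, 'Mark', 'Doe', '426,459,445'),
    (3, 'Jeff', 'Apple', '444,45');
INSERT INTO ROLES VALUES
    (4, 'Basic'), (13, 'Reporting'), (16, 'Advanced'), (44, 'Admin'),
    (45, 'Super User'), (426, 'Accounting'), (444, 'User'),
    (445, 'Receivables'), (449, 'Processing'), (459, 'Research');
""")

rows = conn.execute("""
    SELECT us.ID, us.FNAME, us.LNAME,
           GROUP_CONCAT(ro.ROLE_NAME, ', ') AS ROLES
    FROM USERS us
    INNER JOIN ROLES ro
            ON FIND_IN_SET(ro.ID, us.ROLE_IDS) > 0
    GROUP BY us.ID
    ORDER BY us.ID
""").fetchall()

# Compare as sets because the order inside each concatenated list is not guaranteed.
roles_by_user = {row[0]: set(row[3].split(", ")) for row in rows}
for row in rows:
    print(row)
```

Matching whole list items is what keeps role 44 (Admin) from being confused with 445 or 449, which a plain LIKE '%44%' test would get wrong.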

MySQL - Update table values from another table

I have two tables -
subjects and questions.
Structure and rows of table subjects are like :-
----------------------------
subject_id | subject_name
----------------------------
1 | physics
2 | chemistry
3 | biology
Structure and rows of table questions are like :-
question_id | subject_id | subject_name | question
---------------------------------------------------------
1 | 0 | physics | demo_question_1
2 | 0 | physics | demo_question_2
3 | 0 | chemistry | demo_question_3
4 | 0 | biology | demo_question_4
I added the subject_id column to the questions table after I had already inserted some rows. I want to bulk update subject_id in the questions table to match the subjects table. I can update the rows individually using WHERE, but I was hoping a single query could do this work.
Not sure what you mean by a single query versus a WHERE, but I think this will suit your needs.
UPDATE questions q, subjects s
SET q.subject_id = s.subject_id
WHERE q.subject_name = s.subject_name;
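For a quick check of the idea without MySQL, here is a sketch that runs against SQLite through Python's sqlite3 module. The multi-table UPDATE syntax above is MySQL-specific, so the sketch swaps in a correlated subquery, which should give the same result here; treat it as an equivalent form under that assumption rather than the same statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subjects (subject_id INTEGER PRIMARY KEY, subject_name TEXT);
CREATE TABLE questions (
    question_id  INTEGER PRIMARY KEY,
    subject_id   INTEGER NOT NULL DEFAULT 0,
    subject_name TEXT,
    question     TEXT
);
INSERT INTO subjects VALUES (1, 'physics'), (2, 'chemistry'), (3, 'biology');
INSERT INTO questions VALUES
    (1, 0, 'physics',   'demo_question_1'),
    (2, 0, 'physics',   'demo_question_2'),
    (3, 0, 'chemistry', 'demo_question_3'),
    (4, 0, 'biology',   'demo_question_4');
""")

with conn:
    # One statement updates every row; the EXISTS guard leaves rows alone
    # when their subject_name has no match in subjects.
    conn.execute("""
        UPDATE questions
        SET subject_id = (
            SELECT s.subject_id
            FROM subjects s
            WHERE s.subject_name = questions.subject_name
        )
        WHERE EXISTS (
            SELECT 1
            FROM subjects s
            WHERE s.subject_name = questions.subject_name
        )
    """)

updated = conn.execute(
    "SELECT question_id, subject_id FROM questions ORDER BY question_id"
).fetchall()
print(updated)
```

Once subject_id is filled in, the subject_name column in questions is redundant and could be dropped.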

How To Design A Database for a "Check In" Social Service

I want to build a "check in" service like FourSquare or Untappd.
How do I design a suitable database schema for storing check-ins?
For example, suppose I'm developing "CheeseSquare" to help people keep track of the delicious cheeses they've tried.
The table for the items into which one can check in is fairly simple and would look like
+----+---------+---------+-------------+--------+
| ID | Name | Country | Style | Colour |
+----+---------+---------+-------------+--------+
| 1 | Brie | France | Soft | White |
| 2 | Cheddar | UK | Traditional | Yellow |
+----+---------+---------+-------------+--------+
I would also have a table for the users, say
+-----+------+---------------+----------------+
| ID | Name | Twitter Token | Facebook Token |
+-----+------+---------------+----------------+
| 345 | Anne | qwerty | poiuyt |
| 678 | Bob | asdfg | mnbvc |
+-----+------+---------------+----------------+
What's the best way of recording that a user has checked in to a particular cheese?
For example, I want to record how many French cheeses Anne has checked in to, which cheeses Bob has checked in to, whether Cersei has eaten Camembert more than 5 times, etc.
Am I best putting this information in the user's table? E.g.
+-----+------+------+--------+------+------+---------+---------+
| ID | Name | Blue | Yellow | Soft | Brie | Cheddar | Stilton |
+-----+------+------+--------+------+------+---------+---------+
| 345 | Anne | 1 | 0 | 2 | 1 | 0 | 5 |
| 678 | Bob | 3 | 1 | 1 | 1 | 1 | 2 |
+-----+------+------+--------+------+------+---------+---------+
That looks rather ungainly and hard to maintain. So should I have a separate table for recording check-ins?
No, don't put it into the users table. That information is better stored in a join table which represents a many-to-many relationship between users and cheeses.
The join table (we'll call cheeses_users) must have at least two columns (user_ID, cheese_ID), but a third (a timestamp) would be useful too. If you default the timestamp column to CURRENT_TIMESTAMP, you need only insert the user_ID, cheese_ID into the table to log a checkin.
cheeses (ID) ⇒ (cheese_ID) cheeses_users (user_ID) ⇐ users (ID)
Created as:
CREATE TABLE cheeses_users (
  cheese_ID INT NOT NULL,
  user_ID INT NOT NULL,
  -- timestamp defaults to current time
  checkin_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  -- (add any other column *specific to* this checkin (user+cheese+time))
  -- The primary key is the combination of all 3
  -- It becomes impossible for the same user to log the same cheese
  -- at the same second in time...
  PRIMARY KEY (cheese_ID, user_ID, checkin_time),
  -- FOREIGN KEYs to your other tables
  FOREIGN KEY (cheese_ID) REFERENCES cheeses (ID),
  FOREIGN KEY (user_ID) REFERENCES users (ID)
) ENGINE=InnoDB; -- InnoDB is necessary for the FK's to be honored and useful
To log a checkin for Bob & Cheddar, insert with:
INSERT INTO cheeses_users (cheese_ID, user_ID) VALUES (2, 678);
To query them, you join through this table. For example, to see the number of each cheese type for each user, you might use:
SELECT
u.Name AS username,
c.Name AS cheesename,
COUNT(*) AS num_checkins
FROM
users u
JOIN cheeses_users cu ON u.ID = cu.user_ID
JOIN cheeses c ON cu.cheese_ID = c.ID
GROUP BY
u.Name,
c.Name
To get the 5 most recent checkins for a given user, something like:
SELECT
c.Name AS cheesename,
cu.checkin_time
FROM
cheeses_users cu
JOIN cheeses c ON cu.cheese_ID = c.ID
WHERE
-- Limit to Anne's checkins...
cu.user_ID = 345
ORDER BY checkin_time DESC
LIMIT 5
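The two specific questions from the post (how many French cheeses a user has tried, and who has checked in to a cheese more than N times) can be sketched the same way. This assumes SQLite through Python's sqlite3 module in place of MySQL; the Camembert row, the timestamps, and the function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cheeses (ID INTEGER PRIMARY KEY, Name TEXT, Country TEXT);
CREATE TABLE users (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE cheeses_users (
    cheese_ID    INTEGER NOT NULL REFERENCES cheeses (ID),
    user_ID      INTEGER NOT NULL REFERENCES users (ID),
    checkin_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (cheese_ID, user_ID, checkin_time)
);
-- Camembert (ID 3) and all timestamps are invented sample data.
INSERT INTO cheeses VALUES
    (1, 'Brie', 'France'), (2, 'Cheddar', 'UK'), (3, 'Camembert', 'France');
INSERT INTO users VALUES (345, 'Anne'), (678, 'Bob');
INSERT INTO cheeses_users VALUES
    (1, 345, '2014-05-01 12:00:00'), (1, 345, '2014-05-02 12:00:00'),
    (3, 345, '2014-05-03 12:00:00'), (2, 345, '2014-05-04 12:00:00'),
    (2, 678, '2014-05-05 12:00:00'), (2, 678, '2014-05-06 12:00:00'),
    (2, 678, '2014-05-07 12:00:00');
""")


def french_cheeses_tried(user_id):
    """Count the distinct French cheeses one user has checked in to."""
    return conn.execute(
        """
        SELECT COUNT(DISTINCT c.ID)
        FROM cheeses_users cu
        JOIN cheeses c ON c.ID = cu.cheese_ID
        WHERE cu.user_ID = ? AND c.Country = 'France'
        """,
        (user_id,),
    ).fetchone()[0]


def frequent_checkins(min_times):
    """List (user, cheese, count) where the user checked in more than min_times."""
    return conn.execute(
        """
        SELECT u.Name, c.Name, COUNT(*) AS num_checkins
        FROM cheeses_users cu
        JOIN users u ON u.ID = cu.user_ID
        JOIN cheeses c ON c.ID = cu.cheese_ID
        GROUP BY u.ID, c.ID
        HAVING COUNT(*) > ?
        ORDER BY u.Name, c.Name
        """,
        (min_times,),
    ).fetchall()


print(french_cheeses_tried(345))
print(frequent_checkins(2))
```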
Let's define more clearly, so you can tell me if I'm wrong:
Cheese instances exist and aren't divisible ("Cheddar/UK/Traditional/Yellow" is a valid checkinable cheese, but "Cheddar" isn't, nor is "Yellow" or "Cheddar/France/...").
Users check into a single cheese instance at a given time.
Users can re-check into the same cheese instance at a later date.
If this is the case, then to store fully normalized data, and to be able to retrieve that data's history, you need a third relational table linking the two existing tables.
+-----+------------+---------------------+
| uid | cheese_id | timestamp |
+-----+------------+---------------------+
| 345 | 1 | 2014-05-04 19:04:38 |
| 345 | 2 | 2014-05-08 19:04:38 |
| 678 | 1 | 2014-05-09 19:04:38 |
+-----+------------+---------------------+
etc. You can add extra columns to correspond to the cheese data, but strictly speaking you don't need to.
By putting all this in a third table, you potentially improve both performance and flexibility. You can always reconstruct the additions to the users table you mooted, using aggregate queries.
If you really decide you don't need the timestamps, then you'd replace them with basically the equivalent of a COUNT(*) field:
+-----+------------+--------------+
| uid | cheese_id | num_checkins |
+-----+------------+--------------+
| 345 | 1 | 15 |
| 345 | 2 | 3 |
| 678 | 1 | 8 |
+-----+------------+--------------+
That would dramatically reduce the size of your joining table, although obviously there's less of a "paper trail", should you need to reconstruct your data (and possibly say to a user "oh, yeah, we forgot to record your checkin on such-a-date.")
The entities 'User' and 'Cheese' have a many-to-many relationship. A user can have multiple cheeses he checked into, and a cheese can have multiple people that checked into it.
The only right way to design this in a relational database is to store it into a separate table. There are many reasons why storing it into the user table for instance, is a very bad idea. Read up on normalizing databases for more info on this.
Your table should look something like this:
CheckIns(CheeseId, UserId, (etc...))
Other useful columns might include date or rating, or whatever you want to store about a particular relationship between a user and a cheese.

Optimising table design or optimising the query

I am trying to decide which one is better: to design a table that wastes a lot of space and has a simple query OR to write a very tight table but then the process of finding what I am looking for would be very processing intense.
The actual problem is this:
Imagine you have a very simple table: the 1st column is the ID number, the 2nd is a list of names, and the 3rd is a list of names too. The people in the 2nd column owe the people in the 3rd column.
The search should do the following:
I search for a name in the 3rd column and see who owes this person in the 2nd column. A name or multiple names come up, then I want to see who owes them, again a bunch of names come up, and so on to level 5.
Maybe this is a well known scheme for which there is a well known simple answer in table design or MySQL circles. Could anybody suggest a MySQL query or perhaps an appropriate table design where I can use a simple query?
Example
ID | owes | owed to
1 | Peter | John
2 | John | George
3 | Abdul | George
4 | George | Anna
So I could design a wasteful table like this
ID | 1 | 2 | 3 | 4 | 5
1 | Anna | George | Abdul
2 | Anna | George | John | Peter
3 | George | Abdul
4 | George | John | Peter
5 | John | Peter
But this would be very wasteful and a bad design, even though it would make it very easy to access the data along with the hierarchy and the owing chain.
Something like this seems suitable:
people
+----+--------+
| id | name |
+----+--------+
| 1 | Marty |
| 2 | Steven |
| 3 | John |
+----+--------+
With the table building the relationships between people owed and owing:
loans
+-----------+-------------+
| lender_id | borrower_id |
+-----------+-------------+
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
+-----------+-------------+
You could get all the people owing a given lender with something as simple as:
SELECT people.id, people.name
FROM loans
INNER JOIN people ON people.id = loans.borrower_id
WHERE loans.lender_id = X
Here X is the id of the lender. For example, a lender_id of 1 (Marty) would yield:
+----+--------+
| id | name |
+----+--------+
| 2 | Steven |
| 3 | John |
+----+--------+
You can repeat this process for each of the resulting people until there are no results (no one being owed).
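Instead of repeating the query by hand, the whole chain can be walked in one statement with a recursive common table expression. MySQL supports WITH RECURSIVE from version 8.0 on; older versions would need the loop in application code. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in, loads the example data from the question into the people and loans tables, and caps the walk at 5 levels. The function name debt_chain and the depth column are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE loans (
    lender_id   INTEGER NOT NULL REFERENCES people (id),
    borrower_id INTEGER NOT NULL REFERENCES people (id),
    PRIMARY KEY (lender_id, borrower_id)
);
INSERT INTO people VALUES
    (1, 'Peter'), (2, 'John'), (3, 'Abdul'), (4, 'George'), (5, 'Anna');
-- The borrower owes the lender: Peter owes John, John owes George, and so on.
INSERT INTO loans VALUES (2, 1), (4, 2), (4, 3), (5, 4);
""")


def debt_chain(lender_id, max_depth=5):
    """Return (name, depth) for everyone who owes the lender, directly or through a chain."""
    return conn.execute(
        """
        WITH RECURSIVE chain (person_id, depth) AS (
            -- depth 1: people who owe the starting lender directly
            SELECT borrower_id, 1
            FROM loans
            WHERE lender_id = ?
            UNION ALL
            -- next depth: people who owe someone already in the chain
            SELECT l.borrower_id, c.depth + 1
            FROM loans l
            JOIN chain c ON l.lender_id = c.person_id
            WHERE c.depth < ?
        )
        SELECT p.name, c.depth
        FROM chain c
        JOIN people p ON p.id = c.person_id
        ORDER BY c.depth, p.name
        """,
        (lender_id, max_depth),
    ).fetchall()


print(debt_chain(5))  # everyone in Anna's chain
```

The depth limit also stops the recursion if two people owe each other, which would otherwise loop forever.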

How to store family tree data in a mysql database

I have a family tree. I would like to store it in a MySQL database. I have a table with a column called "family members," but I don't know how to arrange these family members. For example, I am under my dad and my son is under me. So how can I store this type of tree in a database?
So, you said you have a table with a column called "family members". For me, that's just inappropriate because it doesn't respect normalization :) First of all I would call it "familyTreeId". Now, let's move to the FamilyTree table.
This table would be something like this:
FamilyTree(id, motherId, fatherId, etc) --> etc: if you have additional data
id will be the primary key of the table
motherId will link to the row in the FamilyTree table that belongs to the mother
fatherId will link to the row in the FamilyTree table that belongs to the father
So the rows will be:
+--------+--------------+--------------+
| id | motherId | fatherId |
+--------+--------------+--------------+
| son1 | yourwife | you |
| son2 | yourwife | you |
| you | mother | father |
| mother | grandmother1 | grandfather1 |
| father | grandmother2 | grandfather2 |
+--------+--------------+--------------+
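With rows like these, all ancestors of one person can be fetched with a recursive query. What follows is a hedged sketch, not part of the original design: it assumes SQLite through Python's sqlite3 module as a stand-in for MySQL (which supports WITH RECURSIVE from 8.0), uses the names from the rows above as text IDs, and the function name ancestors_of is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FamilyTree (id TEXT PRIMARY KEY, motherId TEXT, fatherId TEXT);
INSERT INTO FamilyTree VALUES
    ('son1',   'yourwife',     'you'),
    ('son2',   'yourwife',     'you'),
    ('you',    'mother',       'father'),
    ('mother', 'grandmother1', 'grandfather1'),
    ('father', 'grandmother2', 'grandfather2');
""")


def ancestors_of(person_id):
    """Return (ancestor id, generations above the person) for every known ancestor."""
    return conn.execute(
        """
        WITH RECURSIVE
        -- Flatten the two parent columns into one child -> parent edge list.
        parent_of (child, parent) AS (
            SELECT id, motherId FROM FamilyTree
            UNION ALL
            SELECT id, fatherId FROM FamilyTree
        ),
        ancestors (id, generation) AS (
            SELECT parent, 1
            FROM parent_of
            WHERE child = ? AND parent IS NOT NULL
            UNION ALL
            SELECT p.parent, a.generation + 1
            FROM ancestors a
            JOIN parent_of p ON p.child = a.id
            WHERE p.parent IS NOT NULL
        )
        SELECT id, generation
        FROM ancestors
        ORDER BY generation, id
        """,
        (person_id,),
    ).fetchall()


print(ancestors_of("son1"))
```

The walk stops by itself at people who have no row of their own, such as the grandparents here.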
Other option would be to store the couples
FamilyTreeParents(id, motherId, fatherId)
FamilyTreeNodes(id, familyTreeParentsId)
id will be the primary keys of the tables
familyTreeParentsId will be a foreign key to a FamilyTreeParents table
motherId will be a foreign key to a row in the FamilyTreeNodes table that belongs to the mother
fatherId will be a foreign key to a row in the FamilyTreeNodes table that belongs to the father
So the rows will be:
FamilyTreeParents
+----+--------------+--------------+
| id | motherId | fatherId |
+----+--------------+--------------+
| 1 | yourwife | you |
| 2 | mother | father |
| 3 | grandmother1 | grandfather1 |
| 4 | grandmother2 | grandfather2 |
+----+--------------+--------------+
FamilyTreeNodes
+--------+---------------------+
| id | familyTreeParentsId |
+--------+---------------------+
| son1 | 1 |
| son2 | 1 |
| you | 2 |
| mother | 3 |
| father | 4 |
+--------+---------------------+
Data is more normalized this way because you are not repeating information (like you and yourwife for son1 and son2, as I did in the other solution). However, this solution might be less efficient in terms of speed because more joins will be needed.
I would keep two tables, one with persons, other with relations.
The question here is whether you should keep the relation in one record (e.g. husband - wife) or also from the other person's point of view (1: husband - wife, 2: wife - husband).
The advantage of the second approach is quicker searches, and so faster rendering of e.g. a layout, but it also means a larger table, more writes when data changes, and more room for errors.
I would take the first approach and use some index to make the searches quicker.
So with a minimum of connections you could write out the following family
grandfather louis(id1)
x grandmother clothild(id2)
father francois(id3)
x mother diana(id4)
me peter(id5)
x my first wife fabienne(id6)
my son laurent(id9)
x my second wife jane(id7)
my son tristan(id10)
my brother hans(id8)
as
1x2
3x4
5x6
5x7
1>3
2>3
3>5
4>5
3>8
4>8
6>9
5>9
5>10
7>10
or shorter
1x2>3
3x4>5
3x4>8
5x6>9
5x7>10
So in a database table this gives
id_partner1 id_partner2 id_child
1 2 3
3 4 5
3 4 8
5 6 9
5 7 10
You can have schema like this
Family(Parent_name, Child_name). The tuple (Parent_name, Child_name) is the key of your table, assuming no duplicate (Parent_name, Child_name) pair exists in your family tree. If you have anything like a Social Security Number to uniquely identify a person in the family tree, then you should use Parent_ssn and Child_ssn instead of names, and have a separate table to store the relation between ssn and name, whose key would be ssn.
Items in this table can be:
[Your dad, you]
[Your mum, you]
[you, your son]
[you, your 2nd son]
[your wife, your son]
Hope this helps
The schema can be this:
id | person | related_person | relation | comments
1 | Mac | Mac's Brother | Brother-Brother |
2 | Mac's mother | Mac | Mother-Son |
3 | Mac | Mac's mother | Son-Mother | actually same as 2
This supports more relationships, even ex-wife and ex-husband.
It is also cost-saving: only one row is required between any two people, because their relationship can be reversed.
Note: this is feasible for a small amount of data.