Lets say we have quite a few tables (T1, T2... T50), and we would like to have n to n relations between all of them.
What would be a propper way of implementig that.
Having a relations table for each pair of Tx and Ty would not be practical if the number of tables goes up to 100 or more.
The current solution I have is
relationships_table
id_x, table_name_x, id_y, table_name_y
for storing all the relationships. This way adding new tables is trivial, but what are the disadvantages?
1) What is a better way of supporting such a use case, if we're limited to sql?
2) How to efficiently solve this if we're not limited to sql?
The solution you proposed is the most reasonable solution to the stated problem. But the problem seems somewhat unreasonable.
If you need a graph, then you only need two tables, one for the nodes and another one for the edges.
If some nodes are of specific types then you can have extra specialization tables for them.
Add only the essential Relation tables. tblA relates to tblB, and tblB relates to tblC. So, usually that implies that you can get from A to C via
FROM tblA
JOIN tblB ON ...
JOIN tblC ON ...
Won't this do? And need not much more than 50 extra tables? And be a lot cleaner?
I run into the same problem and I had a sligthly different approach. I added a table called relationable, only storing an id and all tables appearing in the graph have a reference to this table. I make sure on my own that only one element references an relationable entry in the whole database (This is actually what boters me the most, but in practice it is not such a problem just not looking nice). and then a relation table for the n to n relationship between relationable.
To make my point I add an example i MADE IN MySQL.
CREATE TABLE relationable
(
relationable_id INT AUTO_INCREMENT PRIMARY KEY
) ENGINE=INNODB;
in the relation table I added a name, because my vertices have a name, there might even be multiple vertices between two nodes with different names.
CREATE TABLE relation
(
from_id INT NOT NULL,
to_id INT NOT NULL,
name VARCHAR(255) NOT NULL,
FOREIGN KEY (from_id) REFERENCES relationable(relationable_id) ON DELETE CASCADE,
FOREIGN KEY (to_id) REFERENCES relationable(relationable_id) ON DELETE CASCADE
)ENGINE=INNODB;
finally a table which appears in the graph would look like the following
CREATE TABLE place
(
place_id INT NOT NULL,
name VARCAHR(255),
FOREIGN KEY (PLACE_ID) REFERENCES relationable(relationable_id)
ON DELETE CASCADE
) ENGINE=INNODB;
Now obviously this has pros and cons,
cons
You need to make sure yourself that a relationable is only referenced once. Inside one table this is taken care of by PRIMARY KEY but over all tables this is not done.
You might need a huge int for the id of relationable.
The table relation might get quite big.
pros
To errase an entry and all its relations deleting the relationable entry suffices, all entrys in relation and the respective table will be deleted.
When joining two tables there is no need for the relationable table.
Related
Sorry, not sure if question title is reflects the real question, but here goes:
I designing system which have standard orders table but with additional previous and next columns.
The question is which approach for foreign keys is better
Here I have basic table with following columns (previous, next) which are self referencing foreign keys. The problem with this table is that the first placed order doesn't have previous and next fields, so they left out empty, so if I have say 10 000 records 30% of them have those columns empty that's 3000 rows which is quite a lot I think, and also I expect numbers to grow. so in a let's say a year time period it can come to 30000 rows with empty columns, and I am not sure if it's ok.
The solution I've have came with is to main table with other 2 tables which have foreign keys to that table. In this case those 2 additional tables are identifying tables and nothing more, and there's no longer rows with empty columns.
So the question is which solution is better when considering query speed, table optimization, and common good practices, or maybe there's one even better that I don't know? (P.s. I am using mysql with InnoDB engine).
If your aim is to do order sets, you could simply add a new table for that, and just have a single column as a foreign key to that table in the order table.
The orders could also include a rank column to indicate in which order orders belonging to the same set come.
create table order_sets (
id not null auto_increment,
-- customer related data, etc...
primary key(id)
);
create table orders (
id int not null auto_increment,
name varchar,
quantity int,
set_id foreign key (order_set),
set_rank int,
primary key(id)
);
Then inserting a new order means updating the rank of all other orders which come after in the same set, if any.
Likewise, for grouping queries, things are way easier than having to follow prev and next links. I'm pretty sure you will need these queries, and the performances will be much better that way.
I am in a situation where i have to store key -> value pairs in a table which signifies users who have voted certain products.
UserId ProductID
1 2345
1 1786
6 657
2 1254
1 2187
As you can see that userId keeps on repeating and so can productId. I wanted to know what can be the best way to represent this data. Also is there a necessity of using primary key in here. I've searched a lot but am not able to find the exact specification about my problem. Any help would be appreciated. Thank you.
If you want to enforce that a given user can vote for a given product at most once, create a unique constraint over both columns:
ALTER TABLE mytable ADD UNIQUE INDEX (UserId, ProductID);
Although you can use these two columns together as a key, your app code is often simpler if you define a separate, typically auto increment, key column, but the decision to do this depends on which app code language/library you use.
If you have any tables that hold a foreign key reference to this table, and you intend to use referential integrity, those tables and the SQL used to define the relationship will also be simpler if you create a separate key column - you just end up carting multiple columns around instead of just one.
I'm currently designing a database structure for our team's project. I have this very question in mind currently: Is it possible to have a foreign key act as a primary key on another table?
Here are some of the tables of our system's database design:
user_accounts
students
guidance_counselors
What I wanted to happen is that the user_accounts table should contain the IDs (supposedly the login credential to the system) and passwords of both the student users and guidance counselor users. In short, the primary keys of both the students and guidance_counselors table are also the foreign key from the user_accounts table. But I am not sure if it is allowed.
Another question is: a student_rec table also exists, which requires a student_number (which is the user_id in the user_accounts table) and a guidance_counsellor_id (which is also the user_id in the user_accounts) for each of its record. If both the IDs of a student and guidance counselor come from the user_accounts table, how would I design the student_rec table? And for future reference, how do I manually write it as an SQL code?
This has been bugging me and I can't find any specific or sure answer to my questions.
Of course. This is a common technique known as supertyping tables. As in your example, the idea is that one table contains a superset of entities and has common attributes describing a general entity, and other tables contain subsets of those entities with specific attributes. It's not unlike a simple class hierarchy in object-oriented design.
For your second question, one table can have two columns which are separately foreign keys to the same other table. When the database builds the query, it joins that other table twice. To illustrate in a SQL query (not sure about MySQL syntax, I haven't used it in a long time, so this is MS SQL syntax specifically), you would give that table two distinct aliases when selecting data. Something like this:
SELECT
student_accounts.name AS student_name,
counselor_accounts.name AS counselor_name
FROM
student_rec
INNER JOIN user_accounts AS student_accounts
ON student_rec.student_number = student_accounts.user_id
INNER JOIN user_accounts AS counselor_accounts
ON student_rec.guidance_counselor_id = counselor_accounts.user_id
This essentially takes the student_rec table and combines it with the user_accounts table twice, once on each column, and assigns two different aliases when combining them so as to tell them apart.
Yes, there should be no problem. Foreign keys and primary keys are orthogonal to each other, it's fine for a column or a set of columns to be both the primary key for that table (which requires them to be unique) and also to be associated with a primary key / unique constraint in another table.
If I have two tables - table beer and table distributor, each one have a primary key and a third table that have the foreign keys and calls beer_distributor
Is it adequate a new field (primary key) in this table? The other way is with joins, correct? To obtain for example DUVEL De vroliijke drinker?
You've definitely got the right idea. Your beer_distributor table is what's known as a junction table. JOINs and keys/indexes are used together. The database system uses keys to make JOINs work quickly and efficiently. You use this junction table by JOINing both beer and distributor tables to it.
And, your junction table should have a primary key that spans both columns (a multiple-column index / "composite index"), which it looks like it does if I understand that diagram correctly. In that case, it looks good to me. Nicely done.
I would put a primary key in the join table beer_distributor, not a dual primary key of the two foreign keys. IMO, it makes life easier when maintaining the relationship.
UPDATE
To emphasize this point, consider having to change the distributor ACOO9 for beer 163. With the dual primary key, you'd have to remove then reinsert OR know both existing values to update the record. With a separate primary key, you'd simply update the record using this value. Comes in handy when building applications on top this data. If this is strictly a data warehouse, then a dual primary key might make more sense from the DBA perspective.
UPDATE beer_distributor SET distributor_id = XXXXX WHERE beer_id = 163 AND distributor_id = AC009
versus
UPDATE beer_distributor SET distributor_id = XXXXX WHERE id = 1234
Been reading the tutorial How to handle a Many-to-Many relationship with PHP and MySQL .
In this question I refer to the "Database schema" section which states the following rules:
This new table must be constructed to
allow the following:
* It must have a column which links back to table 'A'.
* It must have a column which links back to table 'B'.
* It must allow no more than one row to exist for any combination of rows from table 'A' and table 'B'.
* It must have a primary key.
Now it's crystal clear so far.
The only problem I'm having is with the 3rd rule ("It must allow no more than one row to exist for any combination").
I want this to be applied as well, but it doesn't seem to work this way.
On my test instance of mysql (5.XX) I'm able to add two rows which reflect the same relationship!
For example, if I make this relation (by adding a row):
A to B
It also allows me to make this relation as well:
B to A
So the question is two questions actually:
1) How do I enfore the 3rd rule which will not allow to do the above? Have only one unique relation regardless of the combination.
2) When I'll want to search for all the relations of 'A', how would the SQL query look like?
Note #1: Basically my final goal is to create a "friendship" system, and as far as I understand the solution is a many-to-many table. Suggest otherwise if possible.
Note #2: The users table is on a different database from the relations (call it friendships) table. Therefore I cannot use foreign keys.
For the first question:
Create a unique constraint on both
columns
Make sure you always sort the columns. So if your table has the
colummns a and b than make sure
that a is less than or equal to
b
For the second question:
SELECT
*
FROM
many_to_many_table
WHERE
a = A or b = A
It sounds like you want a composite primary key.
CREATE TABLE relationship (
A_id INTEGER UNSIGNED NOT NULL,
B_id INTEGER UNSIGNED NOT NULL,
PRIMARY KEY (A_id, B_id)
);
This is how you setup a table so that there can only ever be one row that defines tables A and B as related. It works because a primary key has to be unique in a table so therefore the database will allow only one row with any specific pair of values. You can create composite keys that aren't a primary key and they don't have to be unique (but you can create a unique non-primary key, composite or not), but your specification requested a primary key, so that's what I suggested.
You can, of course, add other columns to store information about this specific relationship.
Ok WoLpH was faster, I basically agree (note that you have to create a single constraint on both columns at the same time!). And just to explain why you collide with the rules you mentioned: Typically, A and B are different tables. So the typical example for n:m relations would allow entries (1,0) and (0,1) because they'd be refering to different pairs. Having table A=table B is a different situation (you use A and B as users, but in the example they're tables).