Is there any sense in using two foreign key to the same parent table, to avoid inner join?
table: user_profile
id1, userid, username, firstname
table: user_hobby1
id2, userid(fk), hobby, movies
table: user_hobby2
id3, userid(fk), firstname(fk), hobby, movies
I want to select all firstname and hobby from the above table. I am not sure if user_hobby1 or user_hobby2 is the best design in terms of performance? One adds extra foreign key and another requires join.
Query1 :
Select firstname, hobby
from user_hobby2;
Query2 :
Select p.firstname, h.hobby
from
user_profile p
inner join user_hobby1 h on u.userid=h.userid;
Copying the value of an attribute from the user table into the hobby table isn't a "foreign key", that's redundancy.
Our performance objectives are not usually met with an approach of avoiding JOIN operations, which are a normal part of how relational databases operate.
I'd go with the normalized design as a first cut. Each attribute should be dependent on the key, the whole key, and nothing but the key. The "firstname" attribute is dependent on the id of the user, not the hobby.
Sometimes, we do gain performance benefits by introducing redundancy into the database. We have to do that in a controlled way, and make sure that we don't get update anomalies. (Consider what changes we want to apply if the value of "firstname" attribute is updated... do we make that change to the user table, the user_hobby table, or both.
Likely, "firstname" is not unique in the user table, so we definitely don't want a foreign key referencing that column; we want foreign keys that reference the user table to reference the PRIMARY KEY of the table.
There's no point in having two foreign keys defined between user_hobby and user, if a user_hobby is related to exactly one user. We only need one foreign key... we just store the id from the user table in the user_hobby table.
if you have two FK in user_hobby2 then you can only ensure that userid and username exist in user_profile, but you have no way to ensure which userid goes with a given username.
if you make (userid, username) a composite FK, then you'll guarantee the consistency of each tuple, but composite FK are generally more complicate to deal with. Depending on the behavior for update and delete cascades I've seen mysql triggering them both and refusing to delete from the parent.
Besides... what's the point of keeping that composite FK? It will only help you when you update or delete from user_profile, but won't help you copy the data when you insert new users or new hobbies for a user.
The join you are trying to avoid is very cheap. Just go with the first approach. It's easier to maintain and will help you keep your data consistent and normalized.
Related
I've just realized that one of my tables, "pclass", has multiple instances of several foreign keys. In the Structure tab, #2-5 are the foreign keys. I have no idea why multiple instances are being generated.
Could they be generated by the JOINS? Please let me know if I need to provide other information.
$brother_id = htmlspecialchars($_GET["brother_id"]);
$selected = $brother_id;
$query_brotherId = "SELECT b.id, b.firstname, b.lastname, b.pname, b.country, b.street01, b.street02, b.city, usStates.abv AS us_state, b.intl_state, b.postalcode, b.zipcode, b.phone, b.email, pclass.id AS pclass_id, greekAlphabet.name AS pclass01, prepclass.name AS prepclass, pclassSuffix.name AS pclass02, semester.name AS pclass_sem, pclass.year AS pclass_year, b.bigbrother_id AS bbID, bb.firstname AS bbFirst, bb.lastname AS bbLast, b.status, b.comments
FROM brothers AS b
LEFT JOIN pclass ON b.pclass_id = pclass.id
LEFT JOIN prepclass ON pclass.prepclass_id = prepclass.id
LEFT JOIN greekAlphabet ON pclass.greekAlphabet_id = greekAlphabet.id
LEFT JOIN pclassSuffix ON pclass.suffix_id = pclassSuffix.id
LEFT JOIN semester ON pclass.semester_id = semester.id
LEFT JOIN usStates ON b.us_state = usStates.id
LEFT JOIN brothers AS bb ON b.bigbrother_id = bb.id
WHERE b.id = $brother_id";
$result_brotherId = mysqli_query($link, $query_brotherId);
First your question:
Could they be generated by the JOINS?
No. Foreign Keys are generated by data definition statements like CREATE TABLE, ALTER TABLE and so on.
I have no idea why multiple instances are being generated.
The person who created the database must have thought they will be useful. Or if you created the database via some sql-tool (don't know) the tool created the foreign keys because it got told there is a relation between those fields.
Why it is probably not bad to have the keys:
Foreign Keys are created to display the relations between your different table.
Also they enforce a specific behaviour when you are doing actions which could disrupt the integrity of your data. You can change this behaviour in your last screenshot.
For each foreign key you can give a name which will be shown in error messages when you try to act against the constraing. And you can define how the foreign key acts if you change or delete the parent field.
For example
You have the following tables displaying which tool belongs to which person.
persons
personid
firstname
lastname
...
tools
toolid
personid (foreign key to persons)
name
....
So in the tools table you have a foreign key to the persons table, this field defines the owner of the tool.
Now let's define some use cases
Assumption: For some reason Peter is no longer able to wield any tools, so he no longer fits into the database.
What should happen to his tools? It depends what your database displays!
your database displays anyone who ever owned a tool.
This means, even if the person actually doesn't even exist anymore, the data should still remain. You would actually enforce this behaviour otherwise, but it would work in our current case to show what the foreign key can do.
So the action we choose for ON DELETE is RESTICT. (It also is the default action)
Now let's try to call: DELETE FROM persons WHERE firstname = 'Peter'
Result: the foreign key constraint will prompt you an error message. There are relations which depend on this entry in the persons table.
The database displays persons and some tools, tools don't have to have an owner
In this case we again want to delete the person Peter. His tools can remain in the database, instead of the personid they will get a null value into this field.
So we choose the action ON DELETE: SET NULL
This one is pretty straight forward. Important: the field with the foreign key must not have a NOT NULL constraint.
The database displays the people and the tools in a building or something..
So if Peter and his tools leave the building, we don't care about them anymore.
The action for ON DELETE: CASCADE.
If you now enter the DELETE-statement, the foreign key will take care of deleting all the other entries (the tools) connected to Peter.
I'm currently designing a database structure for our team's project. I have this very question in mind currently: Is it possible to have a foreign key act as a primary key on another table?
Here are some of the tables of our system's database design:
user_accounts
students
guidance_counselors
What I wanted to happen is that the user_accounts table should contain the IDs (supposedly the login credential to the system) and passwords of both the student users and guidance counselor users. In short, the primary keys of both the students and guidance_counselors table are also the foreign key from the user_accounts table. But I am not sure if it is allowed.
Another question is: a student_rec table also exists, which requires a student_number (which is the user_id in the user_accounts table) and a guidance_counsellor_id (which is also the user_id in the user_accounts) for each of its record. If both the IDs of a student and guidance counselor come from the user_accounts table, how would I design the student_rec table? And for future reference, how do I manually write it as an SQL code?
This has been bugging me and I can't find any specific or sure answer to my questions.
Of course. This is a common technique known as supertyping tables. As in your example, the idea is that one table contains a superset of entities and has common attributes describing a general entity, and other tables contain subsets of those entities with specific attributes. It's not unlike a simple class hierarchy in object-oriented design.
For your second question, one table can have two columns which are separately foreign keys to the same other table. When the database builds the query, it joins that other table twice. To illustrate in a SQL query (not sure about MySQL syntax, I haven't used it in a long time, so this is MS SQL syntax specifically), you would give that table two distinct aliases when selecting data. Something like this:
SELECT
student_accounts.name AS student_name,
counselor_accounts.name AS counselor_name
FROM
student_rec
INNER JOIN user_accounts AS student_accounts
ON student_rec.student_number = student_accounts.user_id
INNER JOIN user_accounts AS counselor_accounts
ON student_rec.guidance_counselor_id = counselor_accounts.user_id
This essentially takes the student_rec table and combines it with the user_accounts table twice, once on each column, and assigns two different aliases when combining them so as to tell them apart.
Yes, there should be no problem. Foreign keys and primary keys are orthogonal to each other, it's fine for a column or a set of columns to be both the primary key for that table (which requires them to be unique) and also to be associated with a primary key / unique constraint in another table.
I'm designing a db table that will save a list of user's favorited food items.
I created favorite table with the following schema
id, user_id, food_id
user_id and food_id will be foreign key linking to another table.
Im just wondering if this is efficient and scalable cause if user has multiple favorite things then it would need multiple rows of data.
i.e. user has 5 favorited food items, then it will consist of five rows to save the list for that user.
Is this efficient? and scalable? Whats the best way to optimize this schema?
thnx in advance!!!
tldr; This is called a "join table" and is the correct and scalable approach to model M-M relationships in a relational database. (Depending upon the constraints used it can also model 1-M/1-1 relationships in a "no NULL FK" schema.)
However, I contend that the id column should be omitted here so that the table is only user_id, food_id. The PK will be (user_id, food_id) in this case.
Unlike other tables, where surrogate (aka auto-increment) PKs are sometimes argued for, a surrogate PK generally only adds clutter in a join table as it has a very natural compound PK.
While the PK itself is compound in this case, each "joined" table only relates back by part of the PK. Depending upon queries performed it might also be beneficial to add covering indices on food_id or (food_id, user_id).
Eliminate Surrogate Key: Unless you have a specific reason for the surrogate key id, exclude it from the table.
Fine-tune Indexing: A this point, you just have a composite primary key that is the combination of the two foreign keys. In which order should the PK fields be?
If your application(s) predominantly execute queries such as: "for given user, give me foods", then PK should be {user_id, food_id}.
If the predominant query is "for given food, give me users", then the PK should be {food_id, user_id}.
If both query "directions" are common, add a UNIQUE INDEX that has the same fields as PK, but in opposite directions. So you'll have PK on {user_id, food_id} and index on {food_id, user_id}.
Note that InnoDB tables are clustered, which eliminates (in this case "unnecessary") table heap. Yet, the secondary index discussed above will not cause a double-lookup (since it fully covers the query), nor have a hidden overhead of PK fields (since it indexes the same fields as PK, just in opposite order).
For more on designing a junction table, take a look at this post.
To my opinion, you can optimize your table in the following ways:
As a relation table with 2 foreighkeys you don't have to use "id" field.
use "innodb" engine to your table
name your relation table "user_2_food", which will make it more clear.
try to use datatype as small as possible, i.e. "smallint" is better than "int", and don't forget "UNSIGNED" attribute.
Creating the below three Tables will result in an efficient design.
users : userId, username, userdesc
foods : foodId, foodname, fooddesc
userfoodmapping : ufid, userid, foodid, rowstate
The significance of rowstate is, if the user in future doesn't like that food, its state will become -1
You have 2 options in my opnion:
Get rid of the ID field, but in that case, make both your other keys (combined) your primary key
Keep your ID key as the primary key for your table.
In either case, I think this is a proper approach. Once you get into a problem of inefficiency, then you will look at probably how to load part of the table or any other technique. This would do for now.
I'm developing a helpdesk-like system, and I want to employ foreign keys, to make sure the DB structure is decent, but I don't know if I should use them at all, and how to employ them properly.
Are there any good tutorials on how (and when) to use Foreign keys ?
edit The part where I'm the most confused at is the ON DELETE .. ON UPDATE .. part, let's say I have the following tables
table 'users'
id int PK auto_increment
department_id int FK (departments.department_id) NULL
name varchar
table 'departments'
id int PK auto_increment
name
users.department_id is a foreign key from departments.department_id, how does the ON UPDATE and ON DELETE functions work here when i want to delete the department or the user?
ON DELETE and ON UPDATE refer to how changes you make in the key table propagate to the dependent table. UPDATE means that the key values get changed in the dependent table to maintain the relation, and DELETE means that dependent records get deleted to maintain the integrity.
Example: Say you have
Users: Name = Bob, Department = 1
Users: Name = Jim, Department = 1
Users: Name = Roy, Department = 2
and
Departments: id = 1, Name = Sales
Departments: id = 2, Name = Bales
Now if you change the deparments table to modify the first record to read id = 5, Name = Sales, then with "UPDATE" you would also change the first two records to read Department = 5 -- and without "UPDATE" you wouldn't be allowed to make the change!
Similarly, if you deleted Department 2, then with "DELETE" you would also delete the record for Roy! And without "DELETE" you wouldn't be allowed to remove the department without first removing Roy.
You will need foreign keys if you are splitting your database into tables and you are working with a DBMS (e.g. MySQL, Oracle and others). I assume from your tags you are using MySQL.
If you don't use foreign keys your database will become hard to manage and maintain. The process of normalisation ensures data consistency, which uses foreign keys.
See here for foreign keys. See here for why foreign keys are important in a relational database here.
Although denormalization is often used when efficiency is the main factor in the design. If this is the case you may want to move away from what I have told you.
Hope this helps.
I have read a number of solutions for a mysql Facebook friendship table and have decided on a fairly simple table with two fields user_a and user_b. I would then using a query with a UNION to get a list of all of a users friends (as they could be in user_a or user_b). My question now is... is it better to have a auto incrementing unique id or a compound id?
table 1)
user_a, user_b
table 2)
unique_id, user_a, user_b
My comments:
either approach for the key is fine. I would prefer a compound key over surrogate key to save space and avoid additional indexes
you may require a surrogate key though - some DALs do not work with compound keys
Update:
You may consider that friendship is a two-way street. Just because UserA has friended UserB does not mean that UserB has friended UserA. If you track both sides, it makes your queries easier. In that case you do:
Friend
-------
UserID
FriendUserID
So, you are only matching on the UserID column to get the list of the user's friends. If two users friend each other, you put two rows in the table. If one user unfriends another, you remove that one row.
While it is true that the compound key solution seems to be more elegant from a design perspective and less space-consuming at first glance, there are circumstances in which I'd personnaly go for an auto incremented numeric id instead.
If the friendship is referenced elsewhere, it will save more space on the long run to have a single numeric ID as a foreign key in the referencing table than a compound ID. Plus, an index on a single id will be (slightly) shorter and faster than a composite index if you query often on the friendship ID.