I'm having trouble understanding the difference between partial keys/weak entities and foreign keys. I feel like an idiot for not being able to understand this stuff.
As I understand it:
Weak Entity: An entity that is dependent on another entity.
Partial Key: Specifies a key that that is only partially unique. Used for weak entities.
vs
Foreign Key: A key that is used to establish and enforce a relation between data in different tables.
These don't seem like they're the same thing, but I'm having trouble distinguishing their uses.
Take the [very] simple example:
We have employees specified by an empid. We also have children specified by name. A
child is uniquely specified by name when the parent (employee) is known.
Would the child entity be a weak identity where the partial key is the name (partially unique)? Or should I be using a foreign key because I'm trying to establish and enforce a relation between employee and child? I feel like I can justify both, but I also feel like I'm missing something here. Any insight is appreciated, and I apologize for the stupid questions.
The problem is not you, it is that the ancient textbook or whatever you are using is pure excreta, the "definitions" are not clear, and there have been standard definitions for Relational Databases in use for over 30 years, which are much more clear. The "definitions" you have posted are in fact quite the opposite, non-intuitive, and it is no surprise that people would get confused.
A Foreign Key in a child row, is the value that references its parent Primary Key (in the parent table).
Using IDEF1X terminology. An Identifying Relation is one in which the FK (the parent Pk in the child) is also used to form the child PK. It is unique in the parent, but not unique in the child, you need to add some column to make it unique. Hence the stupid term "Partial Key". Either it is a Key (unique) or it is not a Key; the concept of a "partial Key" is too stupid to contemplate.
In a properly Normalised and standard-compliant database, there will be very few Independent entities. All the rest will be Dependent on some Independent entity. Such entities are not "weak", except in the sense that they cannot exist without the entity that they are Dependent upon.
The use of Identifying Relations (as opposed to Non-identifying) is actually strong; it gives the Dependent ("weak") entities their Identifier. So silly terms like "weak" and "strong" should not be used in a science that demands precision.
Use standard terms.
But to answer your explicit question:
assuming that Employee is "strong" and has a Primary Key (EmployeeId)
then the "weak" EmployeeChild table would need a FK (EmployeeId) to identify the Employee
which would be the perfect first component of the EmployeeChild table, the adorable "partial key"
to which you might add ChildNo, in order to make an ordinary Relational Primary Key
but it is not really "partial" because it is the full Primary Key of the Parent.
Readers who are unfamiliar with the Standard for Modelling Relational Databases may find â–¶IDEF1X Notationâ—€ useful.
A weak entity type is one whose primary key includes some attribute(s) that reference another entity. In other words a foreign key is a subset of the primary key. Therefore the entity cannot exist without its parent.
A partial key means just part of a key - some proper subset of the key attributes.
In your example if the primary key of a Child was (Empid, ChildName) with Empid as a foreign key referencing the Employee then Child is a weak entity. If Empid was not part of the primary key then Child would be a strong entity.
It's worth bearing in mind that the weak/strong distinction is purely an ER modelling concept. In relational database terms it doesn't make much difference. In particular the relational model doesn't make any distinction between primary keys and other candidate keys so for all practical purposes it doesn't make any difference to single out primary key attributes as being a "special" case when they reference other tables.
Suppose there is a relation between two entity Employees and Dependents. Employees is strong entity and Dependents is weak entity. Dependents have attributes Name, Age, Relation and Employees have attributes E_Id (primary key) and E_Name.
Then to satisfy relation we use foreign key E_Id in Dependents table which refers to the E_Id of Employees table.
But by using only foregin key we can't identify the tuples uniquely in Dependents table we require Name(partial key) attribute also to identify the tuples uniquely.
Example : suppose Dependents table has values in Name are Rahul, Akshat, Rahul then it will not unique and when it combine with E_Id then we can identify it uniquely.
E_Id with Name acts as primary key in Dependents table.
Related
In database terms, when i add a new foreign key, insert a record for that foreign key and update the existing record, what is the process called? My goal is to be able to find answers more effectively.
//create temporary linking key
alter table example add column example_foreign_key int unsigned null;
//contains more fields
insert into example_referenced_table (example_id, ...)
select id, ...
from example
join ...;
//link with the table
update example join example_referenced_table on example_id = example.id
set example.example_foreign_key = example_referenced_table.id;
//drop linking key
alter table example_referenced_table drop column example_id;
It looks like you're substituting one surrogate identifier for another. Introducing a surrogate key is sometimes (incorrectly) called normalization, so you may get some hits on that term.
In rare cases, normalization requires the introduction of a surrogate key, but in most cases, it simply decomposes a relation (table) into two or more, in such a way that no information is lost.
Surrogate keys are generally used when a natural or candidate key doesn't exist, isn't convenient, or not supported (e.g. composite keys are often a problem for object-relational mappers). For criteria on picking a good primary key, see: What are the design criteria for primary keys
There's little value in substituting one surrogate identifier for another, so the procedure you demonstrate has no proper name as far as I know, at least in the relational model.
If you mean to introduce a surrogate key as an identifier of a new entity set to which the original attribute is transferred, that's close to what Peter Chen called shifting a value set from the lower conceptual domain to the upper conceptual domain. You can find more information in his paper "The entity-relationship model - A basis for the enterprise view of data".
As for your question's title, it's not wrong to say that you're adding a relationship to a table (though that wording mixes conceptual and physical terms), but note that in the entity-relationship model, a relationship is represented by a two or more entity keys in a table (e.g. (id, example_foreign_key) in the example table) and not by a foreign key constraint between tables. The association of relationships with foreign key constraints came from the network data model, which is older than both the relational and entity-relationship models.
I am new to database design and am trying to understand the best practice for using foreign keys. I know that when you have a 1:m relationship, we don't have to create a relation for the relationship; instead we could add a foreign key to the m-side of the relationship(which corresponds to the primary key on the 1-side) so as to preserve referential integrity. My question however is: Under what other circumstances could we do the same? Can we do the same when we have a 0..1 to 1 or 1-1 relationship as well? What is the best practice for this type of situation when referential integrity is as important as the computational cost?
There are three possible approaches when we are mapping 1:1 relation to Relational Model:
Foreign Key approach: Choose one of the relations-say S-and include a
foreign key in S the primary key of T. It is better to choose an entity type
with total participation in R in the role of S.
Merged relation option: An alternate mapping of a 1:1 relationship type
is possible by merging the two entity types and the relationship into a
single relation. This may be appropriate when both participations are
total.
Cross-reference or relationship relation option: The third alternative
is to set up a third relation R for the purpose of cross-referencing the
primary keys of the two relations S and T representing the entity types.
For more details you can look into this home.iitj.ac.in/~ramana/ch7-mapping-ER-EER-relations.pdf
Foreign key constraints are restrictions, not references. A relation exists implicitly wherever different columns represent the same domain, and their tables can be joined, with or without foreign keys. The constraint just ensures that the values/entities stored in one column exists in another column. They're appropriate to use wherever a relation is dependent on another, regardless of the cardinality of either. The typical use case here is subtyping of entity sets.
The main benefits of foreign key constraints is the performance gains (reduced calls across the wire as well as development time) enabled by cascading updates/deletes from a single statement instead of having to query for an ID before executing multiple update/delete statements wrapped in a transaction. Not to mention repair time if changes weren't propagated properly.
the M-to-M relationship is equivalent to two 1:M relationships .we can not assign primary key of 1 side as a foreign key of other for this purpose we use a middle entitiy that resolve an M:M and that entity is tipically called "association " or intersection entity. e.g A M:M relationship between project and employee a new midlle entity can be assignment so in this way the relation will convert into 1:M and we can assign a foreign key easilly
Question
Is there a way to have a many-to-many relationship among 3 tables without the use of automatic incrementers (usually ID), or are ID's required for this?
Why I ask
I have 3 relative tables. Since one-to-one relationships seem to can't happen directly, I made a 4th to do one-to-many relationships to the other 3 tables. However, since there's still a primary key to each table, a value can only be used once in a table, which I don't want to happen.
What I have
Connectors has multiple Pockets which have multiple pins.
The 4th Table is ConnectorFullInfo
There is no requirement that a table have an "automatic incrementer" as a primary key.
But, a familiar pattern is to add a surrogate ID column as primary key on entity tables. The "ideal" primary key will be "anonymous" (carry no meaningful information), "unique" (no duplicate values), "simple" (single column, short simple native datatype), ...
There are a couple of schools of thought on whether it's a good idea to introduce a surrogate key. I will also note that there are those who have been later burned by the decision to use a natural key rather than a surrogate key. And there are those that haven't yet been burned by that decision.
In the case of "association" tables (tables introduced to resolve many-to-many relationships), the combination of the foreign keys can be used as the primary key. I often do this.
BUT, if the association table is itself turns out to be entity table, with it's own attributes, I will introduce a surrogate ID column. As an example, the association between person and club, a person can be a member of multiple clubs, and a club can have multiple members...
club +--< membership >--+ person
When we start adding attributes to membership (such as status, date_joined, office_held, etc... at that point membership isn't just an association table; it's turning into an entity. When I suspect that an association is actually an entity, so I'll add the surrogate ID column.
The other case where I will add a surrogate ID column to an association table is when we want to allow "duplicates", where we want to allow multiple associations. In that case, I will also introduce a surrogate ID column.
Yes you can but,
It is customary to represent a table row by a unique identifier which is the number, its becomes more efficient.
When using a foreign key in a table, is it good form to change the name of the key for that table to make it clear what function the key performs in the table, or is it good form to retain the original name, to make it clear that it is a foreign key?
Example:
a table keeps track of users, the primary key is user_id
a second table stores articles on the website and keeps track of the author with the foreign key user_id.
In the context of the second table it would make more sense to call the foreign key author. In the context of the whole database it would make more sense to call the foreign key user_id
Is there a general convention that deals with this situation, or is that what comments are for?
Well, if you have a movie table you wouldn't want columns called person_id and person_id, but rather producer and director, or perhaps producer_id and director_id, or maybe producer_person_id and director_person_id.
I know movies can have multiple directors and multiple producers; this was just an example. Any case in which a table has two foreign keys to the same table will show you that you cannot in principle stick completely to a convention of using only the table name in the column name. You can use both (as in the producer_person_id example) but that leads to long column names.
Don't use comments. No one reads them. Okay that was just snark, perhaps, but in general favor descriptive names to comments!
Aside from the two-foreign-key issue, I'm not really aware of any univerally accepted convention.
It is conventional to know the database schema's modelling and designing. Whatever makes sense to the database administrator. Business logic is not concerned with how the database is named, only the results. For the database administrator if it make more sense to rename the foreign key author_id to refer to user_id of another table then do so and notate it in some documents that T2.author_id must exist in T1.user_id. When transitioning from modelling to designing the database (which is where you are now) it would make sense to just keep it simple, but you can change the foreign key names so long as you can remember them (and document them as well).
I am trying to use the same column to represent a has foreign key to different columns. This is because there could be an arbitrary number of tables to be indexed using this column.
Right now, my idea is to use a small varchar() field to represent which field they are indexing and then check for them my probably sub-querying for all that match the given field, then querying based on the id?
Is this a good method that would take advantage of MySQL indexing?
Are there any other better ways to accomplish this?
I usually use Abba's solution for these sort of problems in a one-to-many relationship. Use a type field to define the table the foreign key reffers to.
If this comes up in a one-to-one relationship you may consider flipping the relationship around. Move the foreign key to the other tables. Any number of tables may link a foreign key to the single original table.
Check out the http://github.com/Theaxiom/Polymorphic2.0 Polymorhpic Behavior.
You use 2 fields to represent a connection to any other table. One field holds the ModelName of the linked Model and the other holds any arbitrary foreign_id value.
Create a "supertype" table that unifies the keys from the other tables. This example might help:
http://consultingblogs.emc.com/davidportas/archive/2007/01/08/Distributed-Keys-and-Disjoint-Subtypes.aspx
One way to represent the gen-spec design pattern is to use the same key as both a foreign key and as a primary key in the specialized tables. As a foreign key, it references the PK in the generalized table. And the PK in the generalized table references a row in one of the specialized tables, without specify which one.
This is the usual method of modeling the gen-spec pattern in the relational model.