What is adding a new relationship to an existing table called? - mysql

In database terms, when i add a new foreign key, insert a record for that foreign key and update the existing record, what is the process called? My goal is to be able to find answers more effectively.
//create temporary linking key
alter table example add column example_foreign_key int unsigned null;
//contains more fields
insert into example_referenced_table (example_id, ...)
select id, ...
from example
join ...;
//link with the table
update example join example_referenced_table on example_id = example.id
set example.example_foreign_key = example_referenced_table.id;
//drop linking key
alter table example_referenced_table drop column example_id;

It looks like you're substituting one surrogate identifier for another. Introducing a surrogate key is sometimes (incorrectly) called normalization, so you may get some hits on that term.
In rare cases, normalization requires the introduction of a surrogate key, but in most cases, it simply decomposes a relation (table) into two or more, in such a way that no information is lost.
Surrogate keys are generally used when a natural or candidate key doesn't exist, isn't convenient, or not supported (e.g. composite keys are often a problem for object-relational mappers). For criteria on picking a good primary key, see: What are the design criteria for primary keys
There's little value in substituting one surrogate identifier for another, so the procedure you demonstrate has no proper name as far as I know, at least in the relational model.
If you mean to introduce a surrogate key as an identifier of a new entity set to which the original attribute is transferred, that's close to what Peter Chen called shifting a value set from the lower conceptual domain to the upper conceptual domain. You can find more information in his paper "The entity-relationship model - A basis for the enterprise view of data".
As for your question's title, it's not wrong to say that you're adding a relationship to a table (though that wording mixes conceptual and physical terms), but note that in the entity-relationship model, a relationship is represented by a two or more entity keys in a table (e.g. (id, example_foreign_key) in the example table) and not by a foreign key constraint between tables. The association of relationships with foreign key constraints came from the network data model, which is older than both the relational and entity-relationship models.

Related

Cardinality and foreign key relationship

I am new to database design and am trying to understand the best practice for using foreign keys. I know that when you have a 1:m relationship, we don't have to create a relation for the relationship; instead we could add a foreign key to the m-side of the relationship(which corresponds to the primary key on the 1-side) so as to preserve referential integrity. My question however is: Under what other circumstances could we do the same? Can we do the same when we have a 0..1 to 1 or 1-1 relationship as well? What is the best practice for this type of situation when referential integrity is as important as the computational cost?
There are three possible approaches when we are mapping 1:1 relation to Relational Model:
Foreign Key approach: Choose one of the relations-say S-and include a
foreign key in S the primary key of T. It is better to choose an entity type
with total participation in R in the role of S.
Merged relation option: An alternate mapping of a 1:1 relationship type
is possible by merging the two entity types and the relationship into a
single relation. This may be appropriate when both participations are
total.
Cross-reference or relationship relation option: The third alternative
is to set up a third relation R for the purpose of cross-referencing the
primary keys of the two relations S and T representing the entity types.
For more details you can look into this home.iitj.ac.in/~ramana/ch7-mapping-ER-EER-relations.pdf
Foreign key constraints are restrictions, not references. A relation exists implicitly wherever different columns represent the same domain, and their tables can be joined, with or without foreign keys. The constraint just ensures that the values/entities stored in one column exists in another column. They're appropriate to use wherever a relation is dependent on another, regardless of the cardinality of either. The typical use case here is subtyping of entity sets.
The main benefits of foreign key constraints is the performance gains (reduced calls across the wire as well as development time) enabled by cascading updates/deletes from a single statement instead of having to query for an ID before executing multiple update/delete statements wrapped in a transaction. Not to mention repair time if changes weren't propagated properly.
the M-to-M relationship is equivalent to two 1:M relationships .we can not assign primary key of 1 side as a foreign key of other for this purpose we use a middle entitiy that resolve an M:M and that entity is tipically called "association " or intersection entity. e.g A M:M relationship between project and employee a new midlle entity can be assignment so in this way the relation will convert into 1:M and we can assign a foreign key easilly

Database Design - Custom attributes table - Table that "relate" entities

I'm designing a database (for use in mysql) that permits new user-defined attributes to an entity called nodes.
To accomplish this I have created 2 other tables. One customvars table that holds all custom attributes and a *nodes_customvars* that define the relationship between nodes and customvars creating a 1..n and n..1 relationship.
Here is he link to the drawed model: Sketched database model
So far so good... But I'm not able to properly handle INSERTs and UPDATEs using separate IDs for each table.
For example, if I have a custom attribute called color in the *nodes_customvars* table inserted for a specific node, if I try to "INSERT ... ON DUPLICATE KEY UPDATE" either it will always insert or always update.
I've thinked on remove the "ID" field from the *nodes_customvars* tables and make it a composite key using nodes id and customvars id, but I'm not sure if this is the best solution...
I've read this article, and the comments, as well: http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx
What is the best solution to this?
EDIT:
Complementing: I don't know the *nodes_customvars* id, only nodes id and customvars id. Analysing the *nodes_customvars* table:
1- If I make nodes id and/or customvars id UNIQUE in this table, using "INSERT ... ON DUPLICATE KEY UPDATE" will always UPDATE. Since that multiple nodes can share the same customvar, this is wrong;
2- If I don't make any UNIQUE key, "INSERT ... ON DUPLICATE KEY UPDATE" will always INSERT, since that no UNIQUE key is already found in the statement...
You have two options for solving your specific problem of the "INSERT...ON DUPLICATE KEY" either always inserting or updating as you describe.
Change the primary to be a composite key using nodeId and customvarId (as suggested by SyntaxGoonoo and in your question as a possible option).
Add a composite unique index using nodeId and customvarId.
CREATE UNIQUE INDEX IX_NODES_CUSTOMVARS ON NODES_CUSTOMVARS(nodeId, customvarId);
Both of the options would allow for the "INSERT...ON DUPLICATE KEY" functionality to work as you require (INSERT if a unique combination of nodeId and customvarId doesn't exist; update if it does).
As for the question about whether to have a composite primary key or a separate primary key column with an additional unique index, there are many things to consider in the design. There's the 1NF considerations and the physical characteristics of the database platform you're on and the preference of the ORM you happen to be using (if any). Given how InnoDB secondary indexes work (see last paragraph at: http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html), I would suggest that you keep the design as you currently have it and add in the additional unique index.
HTH,
-Dipin
You current entity design breaks 1NF. This means that your schema can erroneously store duplicate data.
nodes_customvars describes the many-to-many relationship between nodes and customvars. This type of table is sometimes referred to as an auxiliary table, because its contents are purely derived from base tables (in this case nodes and customvars).
The PK for an auxiliary table describing a many-to-many relationship should be a composite key in order to prevent duplication. Basically 1NF.
Any PK on a table is inherently UNIQUE. regardless of whether it is a single, or composite key. So in some ways your question doesn't make sense, because you are talking about turning the UNIQUE constraint on/off on id for nodes and customvars . Which you can't do if your id is actually a PK.
So what are you actually trying to achieve here???

Renaming foreign keys to fit the context of a table

When using a foreign key in a table, is it good form to change the name of the key for that table to make it clear what function the key performs in the table, or is it good form to retain the original name, to make it clear that it is a foreign key?
Example:
a table keeps track of users, the primary key is user_id
a second table stores articles on the website and keeps track of the author with the foreign key user_id.
In the context of the second table it would make more sense to call the foreign key author. In the context of the whole database it would make more sense to call the foreign key user_id
Is there a general convention that deals with this situation, or is that what comments are for?
Well, if you have a movie table you wouldn't want columns called person_id and person_id, but rather producer and director, or perhaps producer_id and director_id, or maybe producer_person_id and director_person_id.
I know movies can have multiple directors and multiple producers; this was just an example. Any case in which a table has two foreign keys to the same table will show you that you cannot in principle stick completely to a convention of using only the table name in the column name. You can use both (as in the producer_person_id example) but that leads to long column names.
Don't use comments. No one reads them. Okay that was just snark, perhaps, but in general favor descriptive names to comments!
Aside from the two-foreign-key issue, I'm not really aware of any univerally accepted convention.
It is conventional to know the database schema's modelling and designing. Whatever makes sense to the database administrator. Business logic is not concerned with how the database is named, only the results. For the database administrator if it make more sense to rename the foreign key author_id to refer to user_id of another table then do so and notate it in some documents that T2.author_id must exist in T1.user_id. When transitioning from modelling to designing the database (which is where you are now) it would make sense to just keep it simple, but you can change the foreign key names so long as you can remember them (and document them as well).

How to relate two tables without a foreign key?

Can someone give a demo?
I'm using MySQL,but the idea should be the same!
EDIT
In fact I'm asking what's the difference between Doctrine_Relation and Doctrine_Relation_ForeignKey in doctrine?
I suspect what you are looking at would be to be map columns from one db table to another db table. You can do this using some string comparison algorithm. An algo like Levenstein or Jaro-Winkler distance would let you infer the "matching" columns.
For example, if db1.tableA has a column L_Name and db2.tableB has a column LastName, a string distance match would fetch you one measure. You can extend that by comparing the values in the rows to check if there is some consistency for example if the values in both tables contains: "Smith"s, "Johnson"s etc. you have a double-win.
I recently did something similar, integrating multiple large databases (one of them in a different language - French!) and it turned out to be quite a great experience.
HTH
You should use foreign keys to relate tables in MySQL, because it does not offer other ways to create relationships (such as references or nested tables in an object-oriented database).
See:
http://lists.mysql.com/mysql/206589
EDIT:
If you are willing to use Oracle, references and nested-tables are alternate ways to create relationships between tables. References are more versatile, so here is an example.
Since references are used in object-oriented fashion, you should first create a type and a table to hold objects of that type.
Lets create an object type of employee which has a reference to his manager:
CREATE TYPE employee_type AS OBJECT (
name VARCHAR2(30),
manager REF manager_type
);
We should also create an object type for managers:
CREATE TYPE manager_type AS OBJECT (
name VARCHAR2(30),
);
Now lets create two tables, one for employees and other for managers:
CREATE TABLE employees OF employee_type;
CREATE TABLE managers OF manager_type;
We can relate this tables using references. To insert an employee in employees table, just do this:
INSERT INTO employees
SELECT employee_type('Bob Jones', REF(m))
FROM managers m
WHERE m.name = 'Larry Ellison';
More info: Introduction to Oracle Objects
Well you could get around that by taking care of relationships in a server side language. Some database abstraction layers can handle this for you (such as Zend_Db_Table for PHP) but it is recommended to use foreign keys.
MySQL has InnoDB storage engine that supports foreign keys and also transactions.
Using a foreign key is the standard way of creating a relationship. Alternatives are pretty much nonexistent, as you'd have to identify the related rows SOMEHOW.
A column (or set of columns) which links the two tables IS a foreign key - even if it doesn't have a constraint defined on it (or even an index) and isn't either of the tables' primary key (although in that case you can end up with a weird situation where you can get unintended cartesian products when joining, as you will end up with a set vs set relationship which is probably not what you want)
Not having a foreign key constraint is no barrier to using a foreign key.

unable to enforce referential integrity in Access

I've checked everything for errors: primary key, uniqueness, and type. Access just doesnt seem to be able to link the 2 fields i have in my database. can someone please take a look?
http://www.jpegtown.com/pictures/jf5WKxKRqehz.jpg
Thanks.
Your relationship diagram shows that you've made the ID fields your primary key in all your tables, but you're not using them for your joins. Thus, they serve absolutely no purpose. If you're not going to use "surrogate keys" (i.e., a meaningless ID number that is generated by the database and is unique to each record, but has absolutely no meaning in regard to the data in your table), then eliminate them. But if you're going to use "natural keys" (i.e., a primary key constructed from a set of real data fields that together are going to be unique for each record), you must have a unique compound index on those fields.
However, there are issues with both approaches:
Surrogate Keys: a surrogate PK makes each record unique. That is you could have a record for David Fenton with ID 1 and a record for David Fenton with ID 2. If it's the same David Fenton, you've got duplicate data, but as far as your database knows, they are unique.
Natural Keys: some types of entities work very well with natural keys. The best such are where there's a single field that identifies the record uniquely. An example would be "employee type," where values might be "associate, manager, etc." In that case, it's a very good candidate for using the natural key instead of adding a surrogate key. The only argument against the natural key in that case is if the data in the candidate natural key is highly volatile (i.e., it changes frequently). While every modern database engine provides "CASCADE UPDATE" functionality (i.e., if the value in the PK field changes, all the tables where that field is a Foreign Key are automatically updated), this imposes a certain amount of overhead and can be problematic. For single-column keys, it's unlikely to be an issue. Now, except for lookup tables, there are very few entities for which a natural key will be a single column. Instead, you have to create a compound index, i.e., an index that spans multiple data fields. In the index dialog in Access table design, you create a compound key by giving it a name in the first column, and then adding multiple rows in the second column (from the dropdown list of fields in your table). The drawback of this is that if any of the fields in your compound unique index are unknown, you won't get uniqueness. That is, if a field has a Null in two records, and the rest of the fields are identical, this won't be counted as a conflict of uniqueness because Null never equals Null. This is because Null doesn't mean "empty" -- it means "Unknown."
Allen Browne has explained everything you need to know about Nulls:
Nulls: Do I Need Them?
Common Errors with Null
In your graphic, you show that you are trying to link the Company table with the PManager table. The latter table has a CompanyID field, and your Company table has a unique index on its ID field, so all you need is a link from the ID field of the Company table to the CompanyID field of the PManager table. For your example to work (which would be useless, since you already have a unique index on the ID field), you'd need to create a unique compound key spanning both ID and ShortName in the Company table.
Additionally, if ShortName is a field that you want to be unique (i.e., you don't want two company records to have the same ShortName), you should add a unique index to it, whether or not you still use the ID field as your primary key. This brings me back to item #1 above, where I described a situation where a surrogate key could lead you to enter duplicate records, because uniqueness is established by the surrogate key along. Any time you choose to use a surrogate key, you must also add a unique compound index on any combination of data fields that needs to be unique (with the caveat about Null fields as outlined in item #2).
If you're thinking "surrogate keys mean more indexes" you're correct, in that you have two unique indexes on the same table (assuming you don't have the Null problem). But you do get substantial ease of use in joining tables in SQL, as well as substantially less duplication of data. Likewise, you avoid the overhead of CASCADE UPDATE. On the other hand, if you're viewing a child table with a natural foreign key, you don't need to join to the parent table to be able to identify the parent record, because the data that identifies that record is right there in the foreign-key fields. That lack of a need for a join can be a major performance gain in certain scenarios (especially for the case where you'd need an outer join because the foreign key can be Null).
This is actually quite a huge topic, and it's something of a religious argument. I'm firmly in the surrogate key camp, but I use natural keys for lookup tables where the key is a single column. I don't use natural keys for any other purpose. That said, where possible (i.e., no Null problems) I also have a unique index on the natural key.
Hope this helps.
Actually you need an index on the name fields, on both sides
However, may I suggest that you have way too many joins? In general there should only be one join from one table to the next. It is rare to have more than one join between tables, and exceedingly rare to have more than two.
Have a look at this link:
http://weblogs.asp.net/scottgu/archive/2006/07/12/Tip_2F00_Trick_3A00_-Online-Database-Schema-Samples-Library.aspx
Notice how all of the tables are joined together by a single relationship?
Each of the fields labeled PK are primary keys. These are AUTONUMBER fields. Each of the fields labeled FK are foreign keys. These are indexed Number fields of type Integer. The Primary Keys are connected to the Foreign Keys in a 1 to many relationship (in most cases).
99% of the time, you won't need any other kind of joins. The trick is to create tables with unique information. There is a lot of repeated information in your database.
A database that is reorganized in this manner is called a "normalized" database. There are lots of good examples of these at http://www.databaseanswers.org/data_models/
Just join on the CompanyID. You could also get rid of the Company field in PManager.
I did the following and the problem was solved (I face the same problem of referential integrity in access).
I exported data from both tables in Access to Excel. Table1
was containing Cust Code and basic information about the company.
Cust Code as Primary key.
Table2 was containing all information about who the
customers associated with that company.
I removed all duplicates from Table2 exported to excel.
Using Vlookup I checked and found that there are 11
customers code not present in Table1.
I added those codes in Access Table. I linked by
referential integrity and Problem was solved.
Also look for foreign key if it does not work.
You need to create an INDEX. Perhaps look for some kind of create index button and create an index on CompanyID