Indexes on many-to-many table - mysql

The current setup is:
objects (notes, reminders, files) - each in a separate table
entities (clients, projects) - each in a separate table
An object can belong to many entities, and an entity can have many objects.
The associations table looks like this:
object_type_id, object_id, entity_type_id, entity_id
How would you handle indexes on the associations table? Any comments about the setup?

I'm not that strong on databases in general, but I usually always index any field that is an id reference to another table.
So I'd probably index all the fields in your associations table, since they all refer to data in other tables (or so I assume).
You should probably also add a primary key id to the associations table, so that when you want to delete an association you can do it via a primary key reference.

With MySQL (InnoDB), if you've declared the foreign keys as actual referential-integrity constraints using FOREIGN KEY ... REFERENCES, an index is created automatically on the referencing columns when none exists. Primary keys also get an index, so you shouldn't have to define those indexes manually.
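To make the indexing advice above concrete, here is a minimal sketch of the associations table with a composite primary key plus one reverse index. It assumes lookups run in both directions (object to entities and entity to objects). It uses Python's built-in sqlite3 purely as a runnable stand-in for MySQL, and the index name idx_assoc_entity is made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the DDL is kept deliberately generic.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE associations (
    object_type_id INTEGER NOT NULL,
    object_id      INTEGER NOT NULL,
    entity_type_id INTEGER NOT NULL,
    entity_id      INTEGER NOT NULL,
    -- Composite key: blocks duplicate links and serves object -> entities lookups.
    PRIMARY KEY (object_type_id, object_id, entity_type_id, entity_id)
);
-- Hypothetical index name; serves entity -> objects lookups.
CREATE INDEX idx_assoc_entity
    ON associations (entity_type_id, entity_id, object_type_id, object_id);
""")

conn.execute("INSERT INTO associations VALUES (1, 10, 2, 20)")
try:
    # Same link a second time: the composite primary key rejects it.
    conn.execute("INSERT INTO associations VALUES (1, 10, 2, 20)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Names of all indexes on the table, including the one backing the primary key.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('associations')")]
```

With this layout a separate surrogate id is optional: an association can be deleted by naming all four columns, which the primary key index covers.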

Related

What is adding a new relationship to an existing table called?

In database terms, when I add a new foreign key, insert a record for that foreign key and update the existing record, what is the process called? My goal is to be able to find answers more effectively.
-- create temporary linking key
alter table example add column example_foreign_key int unsigned null;
-- contains more fields
insert into example_referenced_table (example_id, ...)
select id, ...
from example
join ...;
-- link with the table
update example join example_referenced_table on example_id = example.id
set example.example_foreign_key = example_referenced_table.id;
-- drop linking key
alter table example_referenced_table drop column example_id;
It looks like you're substituting one surrogate identifier for another. Introducing a surrogate key is sometimes (incorrectly) called normalization, so you may get some hits on that term.
In rare cases, normalization requires the introduction of a surrogate key, but in most cases, it simply decomposes a relation (table) into two or more, in such a way that no information is lost.
Surrogate keys are generally used when a natural or candidate key doesn't exist, isn't convenient, or isn't supported (e.g. composite keys are often a problem for object-relational mappers). For criteria on picking a good primary key, see: What are the design criteria for primary keys
There's little value in substituting one surrogate identifier for another, so the procedure you demonstrate has no proper name as far as I know, at least in the relational model.
If you mean to introduce a surrogate key as an identifier of a new entity set to which the original attribute is transferred, that's close to what Peter Chen called shifting a value set from the lower conceptual domain to the upper conceptual domain. You can find more information in his paper "The entity-relationship model - A basis for the enterprise view of data".
As for your question's title, it's not wrong to say that you're adding a relationship to a table (though that wording mixes conceptual and physical terms), but note that in the entity-relationship model, a relationship is represented by two or more entity keys in a table (e.g. (id, example_foreign_key) in the example table) and not by a foreign key constraint between tables. The association of relationships with foreign key constraints came from the network data model, which is older than both the relational and entity-relationship models.

Using enumerations in MySQL Workbench

I'm creating a database design in MySQL Workbench. I want to have an enumeration table which holds some standard values. The values of the enumeration table need to be linked to a column in my other table. So I have a table called 'club' which holds a column 'club_soort'. The column 'club_soort' needs to relate to the enumeration table.
Also, I want to use my tables (when I'm ready with my database design) in phpMyAdmin.
I understand the concept of enumeration, but I can't implement it. I hope someone can help me!
Thanks!
Rather than using enumerations, you should use what's known as a lookup or reference table. This table would contain your enumerations and be referenced as a foreign key by the parent table.
As an example, this would look like:
parent_table (club)
-------------------
id
club_soort ----------> soort
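A lookup table along these lines could be sketched as follows. The table names club and soort and the column club_soort come from the question; the name columns and the sample values are assumptions made up for illustration. Python's sqlite3 is used only so the sketch runs as-is. In MySQL the same idea would use InnoDB tables with a FOREIGN KEY constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite needs this switch; InnoDB enforces foreign keys by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Lookup table holding the allowed values (column names are hypothetical).
CREATE TABLE soort (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Parent table: club_soort must match an existing soort.id.
CREATE TABLE club (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    club_soort INTEGER NOT NULL REFERENCES soort(id)
);
INSERT INTO soort (id, name) VALUES (1, 'amateur'), (2, 'professional');
INSERT INTO club (name, club_soort) VALUES ('Example FC', 2);
""")

try:
    # 99 is not a known soort, so the foreign key rejects the row.
    conn.execute("INSERT INTO club (name, club_soort) VALUES ('Bad FC', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute(
    "SELECT club.name, soort.name FROM club "
    "JOIN soort ON soort.id = club.club_soort"
).fetchall()
```

Unlike an ENUM column, new allowed values are added with a plain INSERT into soort, without altering the club table.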
ENUM values cannot be linked to any other MySQL structures. An ENUM column can contain only static data.
Are you talking about primary keys?
Being a relational database, MySQL uses primary keys and indexes to join data in the way you want to achieve.
Keys join tables in an efficient way: a primary key (PK) in the origin or parent table and a foreign key (FK) in the related table.
When creating a table, in MySQL Workbench or phpMyAdmin, define a primary key (just one per table) and, if needed, indexes and foreign keys.
Use JOIN clauses to combine two or more tables; UNION only stacks the result rows of separate queries.
Always use numeric INT keys instead of natural, string keys. Also make them AUTO_INCREMENT and NOT NULL.
MySQL Workbench has an export tool, which lets you export each created table, including its keys, indexes and cascading rules. You can copy and paste the output to create the tables in phpMyAdmin.

Database Design - Custom attributes table - Table that "relate" entities

I'm designing a database (for use in MySQL) that permits new user-defined attributes on an entity called nodes.
To accomplish this I have created 2 other tables: a customvars table that holds all custom attributes, and a nodes_customvars table that defines the relationship between nodes and customvars, creating a 1..n and n..1 relationship.
Here is the link to the drawn model: Sketched database model
So far so good... But I'm not able to properly handle INSERTs and UPDATEs using separate IDs for each table.
For example, if I have a custom attribute called color in the nodes_customvars table inserted for a specific node, and I try to "INSERT ... ON DUPLICATE KEY UPDATE", it will either always insert or always update.
I've thought about removing the "ID" field from the nodes_customvars table and making the key a composite of nodes id and customvars id, but I'm not sure if this is the best solution...
I've read this article, and the comments, as well: http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx
What is the best solution to this?
EDIT:
Complementing: I don't know the nodes_customvars id, only the nodes id and customvars id. Analysing the nodes_customvars table:
1- If I make nodes id and/or customvars id UNIQUE in this table, "INSERT ... ON DUPLICATE KEY UPDATE" will always UPDATE. Since multiple nodes can share the same customvar, this is wrong;
2- If I don't make any UNIQUE key, "INSERT ... ON DUPLICATE KEY UPDATE" will always INSERT, since no UNIQUE key is found by the statement...
You have two options for solving your specific problem of "INSERT...ON DUPLICATE KEY" either always inserting or always updating as you describe.
1- Change the primary key to be a composite key using nodeId and customvarId (as suggested by SyntaxGoonoo and in your question as a possible option).
2- Add a composite unique index using nodeId and customvarId:
CREATE UNIQUE INDEX IX_NODES_CUSTOMVARS ON NODES_CUSTOMVARS(nodeId, customvarId);
Both of the options would allow for the "INSERT...ON DUPLICATE KEY" functionality to work as you require (INSERT if a unique combination of nodeId and customvarId doesn't exist; update if it does).
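As a rough illustration of the second option, the sketch below keeps the surrogate id, adds the composite unique index, and then upserts against it. It runs on Python's sqlite3 (SQLite 3.24 or newer), whose ON CONFLICT ... DO UPDATE clause plays the role of MySQL's ON DUPLICATE KEY UPDATE. The value column and the set_customvar helper are assumptions for the example, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes_customvars (
    id          INTEGER PRIMARY KEY,   -- surrogate key kept as in the question
    nodeId      INTEGER NOT NULL,
    customvarId INTEGER NOT NULL,
    value       TEXT                   -- hypothetical payload column
);
CREATE UNIQUE INDEX IX_NODES_CUSTOMVARS
    ON nodes_customvars (nodeId, customvarId);
""")

def set_customvar(node_id, customvar_id, value):
    """Insert the pair if it is new, otherwise update its value."""
    # MySQL form of the same statement:
    #   INSERT INTO nodes_customvars (nodeId, customvarId, value)
    #   VALUES (1, 7, 'blue')
    #   ON DUPLICATE KEY UPDATE value = VALUES(value);
    conn.execute(
        """
        INSERT INTO nodes_customvars (nodeId, customvarId, value)
        VALUES (?, ?, ?)
        ON CONFLICT (nodeId, customvarId) DO UPDATE SET value = excluded.value
        """,
        (node_id, customvar_id, value),
    )

set_customvar(1, 7, "red")    # new pair: inserted
set_customvar(2, 7, "green")  # same customvar, other node: inserted
set_customvar(1, 7, "blue")   # existing pair: updated in place

rows = conn.execute(
    "SELECT nodeId, customvarId, value FROM nodes_customvars ORDER BY nodeId"
).fetchall()
```

Because uniqueness is on the pair rather than on either column alone, several nodes can share customvar 7 while each node still holds at most one value for it.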
As for the question about whether to have a composite primary key or a separate primary key column with an additional unique index, there are many things to consider in the design: 1NF considerations, the physical characteristics of the database platform you're on, and the preferences of the ORM you happen to be using (if any). Given how InnoDB secondary indexes work (see last paragraph at: http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html), I would suggest that you keep the design as you currently have it and add the additional unique index.
HTH,
-Dipin
Your current entity design breaks 1NF. This means that your schema can erroneously store duplicate data.
nodes_customvars describes the many-to-many relationship between nodes and customvars. This type of table is sometimes referred to as an auxiliary table, because its contents are purely derived from base tables (in this case nodes and customvars).
The PK for an auxiliary table describing a many-to-many relationship should be a composite key in order to prevent duplication. Basically 1NF.
Any PK on a table is inherently UNIQUE, regardless of whether it is a single or composite key. So in some ways your question doesn't make sense, because you are talking about turning the UNIQUE constraint on and off on the id for nodes and customvars, which you can't do if your id is actually a PK.
So what are you actually trying to achieve here???

Could I use the same column to represent a foreign key to multiple tables?

I am trying to use the same column to represent a foreign key to columns in different tables. This is because there could be an arbitrary number of tables to be indexed using this column.
Right now, my idea is to use a small varchar() field to represent which field they are indexing, then check for them, probably by sub-querying for all rows that match the given field and then querying based on the id.
Is this a good method that would take advantage of MySQL indexing?
Are there any other better ways to accomplish this?
I usually use Abba's solution for this sort of problem in a one-to-many relationship. Use a type field to define the table the foreign key refers to.
If this comes up in a one-to-one relationship you may consider flipping the relationship around. Move the foreign key to the other tables. Any number of tables may link a foreign key to the single original table.
Check out the Polymorphic Behavior: http://github.com/Theaxiom/Polymorphic2.0
You use 2 fields to represent a connection to any other table. One field holds the ModelName of the linked Model and the other holds any arbitrary foreign_id value.
Create a "supertype" table that unifies the keys from the other tables. This example might help:
http://consultingblogs.emc.com/davidportas/archive/2007/01/08/Distributed-Keys-and-Disjoint-Subtypes.aspx
One way to represent the gen-spec design pattern is to use the same key as both a foreign key and a primary key in the specialized tables. As a foreign key, it references the PK in the generalized table. And the PK in the generalized table identifies a row in one of the specialized tables, without specifying which one.
This is the usual method of modeling the gen-spec pattern in the relational model.
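Here is a small sketch of that shared-key layout, under the assumption of one generalized table and two specialized ones. All table and column names (items, notes, files, tags) and the add_note helper are invented for the example. sqlite3 is used only so it runs without a server, and the same DDL idea carries over to InnoDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
-- Generalized ("supertype") table: owns the key space.
CREATE TABLE items (
    id INTEGER PRIMARY KEY
);
-- Specialized tables: the primary key is also a foreign key to items.
CREATE TABLE notes (
    id   INTEGER PRIMARY KEY REFERENCES items(id),
    body TEXT NOT NULL
);
CREATE TABLE files (
    id   INTEGER PRIMARY KEY REFERENCES items(id),
    path TEXT NOT NULL
);
-- Any other table can point at items(id) with one ordinary foreign key.
CREATE TABLE tags (
    item_id INTEGER NOT NULL REFERENCES items(id),
    label   TEXT NOT NULL
);
""")

def add_note(body):
    """Allocate a key in items, then reuse it as the note's primary key."""
    cur = conn.execute("INSERT INTO items DEFAULT VALUES")
    conn.execute("INSERT INTO notes (id, body) VALUES (?, ?)", (cur.lastrowid, body))
    return cur.lastrowid

note_id = add_note("call the client")
conn.execute("INSERT INTO tags (item_id, label) VALUES (?, 'todo')", (note_id,))

try:
    # No item 999 exists, so the single foreign key rejects the tag.
    conn.execute("INSERT INTO tags (item_id, label) VALUES (999, 'lost')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The trade-off against the type-field approach is that the database itself can enforce the reference, at the cost of one extra insert per object.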

How to relate two tables without a foreign key?

Can someone give a demo?
I'm using MySQL,but the idea should be the same!
EDIT
In fact I'm asking what's the difference between Doctrine_Relation and Doctrine_Relation_ForeignKey in doctrine?
I suspect what you are looking for is a way to map columns from one db table to another db table. You can do this using a string comparison algorithm. An algorithm like Levenshtein or Jaro-Winkler distance would let you infer the "matching" columns.
For example, if db1.tableA has a column L_Name and db2.tableB has a column LastName, a string distance match would give you one measure. You can extend that by comparing the values in the rows to check for consistency: for example, if the values in both tables contain "Smith"s, "Johnson"s etc., you have a double win.
I recently did something similar, integrating multiple large databases (one of them in a different language - French!) and it turned out to be quite a great experience.
HTH
You should use foreign keys to relate tables in MySQL, because it does not offer other ways to create relationships (such as references or nested tables in an object-oriented database).
See:
http://lists.mysql.com/mysql/206589
EDIT:
If you are willing to use Oracle, references and nested-tables are alternate ways to create relationships between tables. References are more versatile, so here is an example.
Since references are used in object-oriented fashion, you should first create a type and a table to hold objects of that type.
Because the employee type refers to the manager type, the manager type has to exist first. Let's create an object type for managers:
CREATE TYPE manager_type AS OBJECT (
name VARCHAR2(30)
);
Then let's create an object type of employee which has a reference to his manager:
CREATE TYPE employee_type AS OBJECT (
name VARCHAR2(30),
manager REF manager_type
);
Now let's create two tables, one for employees and the other for managers:
CREATE TABLE employees OF employee_type;
CREATE TABLE managers OF manager_type;
We can relate these tables using references. To insert an employee into the employees table, just do this:
INSERT INTO employees
SELECT employee_type('Bob Jones', REF(m))
FROM managers m
WHERE m.name = 'Larry Ellison';
More info: Introduction to Oracle Objects
Well you could get around that by taking care of relationships in a server side language. Some database abstraction layers can handle this for you (such as Zend_Db_Table for PHP) but it is recommended to use foreign keys.
MySQL's InnoDB storage engine supports foreign keys and also transactions.
Using a foreign key is the standard way of creating a relationship. Alternatives are pretty much nonexistent, as you'd have to identify the related rows SOMEHOW.
A column (or set of columns) which links the two tables IS a foreign key, even if it doesn't have a constraint defined on it (or even an index) and isn't either table's primary key. In that last case, though, you can end up with unintended cartesian products when joining, because you get a set-to-set relationship, which is probably not what you want.
Not having a foreign key constraint is no barrier to using a foreign key.
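A quick demo of that last point, as a sketch with made-up tables (authors, books) and no declared constraint at all. Python's sqlite3 is used so it runs as-is; the JOIN itself reads the same in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
-- author_id is a foreign key in the logical sense only: no constraint is declared.
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'Ann');
INSERT INTO books VALUES (1, 1, 'First'), (2, 1, 'Second'), (3, 42, 'Orphan');
""")

# The join relates the tables purely through matching column values.
joined = conn.execute("""
    SELECT a.name, b.title
    FROM authors a
    JOIN books b ON b.author_id = a.id
    ORDER BY b.id
""").fetchall()

# Without a constraint nothing stops dangling references, so they must be checked by hand.
orphans = conn.execute("""
    SELECT title FROM books
    WHERE author_id NOT IN (SELECT id FROM authors)
""").fetchall()
```

The join works fine, but the 'Orphan' row shows what the missing constraint costs: the database accepted a reference to an author that does not exist.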