I have a product database, and I want to attach text, images, and videos to the products. I also want each entity (text, image, or video) to have a tag, for further organisation in the application.
I thought of using this model:
Content:
content_id|content_product_id|content_type|content_tag_id|content_url|content_title|content_text
Tag
tag_id|tag_name
This means using an Entity (content_product_id) - Attribute (content_tag_id) - Value (content_url or content_title|content_text) model.
After reading a lot, I understand that this modeling pattern is a bad idea (it is described as a database antipattern that is unscalable and causes performance issues). Do you have an idea for an alternative method?
I want to use Doctrine ORM, and I would like to find a method that will be easily compatible with that data mapper.
I'd create a general table for any type of content:
CREATE TABLE ProductContents(
content_id INT AUTO_INCREMENT PRIMARY KEY,
content_type INT NOT NULL
-- other general attributes like when it was created, by whom, etc.
);
For each text, image, or video, insert one row into this table. If you use an auto-increment primary key, this table is responsible for generating the id number.
For tags, you now simply have a many-to-many relationship between ProductContents and Tags. This is represented by an intersection table.
CREATE TABLE Tags (
tag_id INT AUTO_INCREMENT PRIMARY KEY,
tag TEXT NOT NULL
);
CREATE TABLE ProductContentTagged (
content_id INT,
tag_id INT,
PRIMARY KEY (content_id, tag_id),
FOREIGN KEY (content_id) REFERENCES ProductContents(content_id),
FOREIGN KEY (tag_id) REFERENCES Tags(tag_id)
);
Then if you have any attributes specific to each type of content, create auxiliary tables for each type, with a one-to-one relationship to the content table.
CREATE TABLE ProductContentTexts (
content_id INT PRIMARY KEY,
content TEXT NOT NULL,
FOREIGN KEY (content_id) REFERENCES ProductContents(content_id)
);
CREATE TABLE ProductContentImages (
content_id INT PRIMARY KEY,
image_path TEXT NOT NULL,
FOREIGN KEY (content_id) REFERENCES ProductContents(content_id)
);
CREATE TABLE ProductContentVideos (
content_id INT PRIMARY KEY,
video_path TEXT NOT NULL,
FOREIGN KEY (content_id) REFERENCES ProductContents(content_id)
);
Note these auxiliary tables don't have an auto-increment column. They don't need to -- they will always use the value that was generated by the ProductContents table, and you're responsible for inserting that value.
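The insert order described above (parent row first, then the auxiliary row with the generated id) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database in place of MySQL, and the `add_image` helper and the `IMAGE_TYPE` value are hypothetical names that do not come from the original answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL so the sketch runs anywhere.
# INTEGER PRIMARY KEY plays the role of INT AUTO_INCREMENT PRIMARY KEY.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE ProductContents (
    content_id   INTEGER PRIMARY KEY,
    content_type INTEGER NOT NULL
);
CREATE TABLE ProductContentImages (
    content_id INTEGER PRIMARY KEY,
    image_path TEXT NOT NULL,
    FOREIGN KEY (content_id) REFERENCES ProductContents(content_id)
);
""")

IMAGE_TYPE = 2  # hypothetical type code; the answer does not define the values


def add_image(image_path):
    """Insert the parent row first, then reuse its generated id in the child table."""
    with conn:  # one transaction: both rows are committed, or neither is
        cur = conn.execute(
            "INSERT INTO ProductContents (content_type) VALUES (?)", (IMAGE_TYPE,)
        )
        content_id = cur.lastrowid  # id generated by the parent table
        conn.execute(
            "INSERT INTO ProductContentImages (content_id, image_path) VALUES (?, ?)",
            (content_id, image_path),
        )
    return content_id


image_id = add_image("/img/front.jpg")
```

Wrapping both inserts in one transaction avoids leaving a ProductContents row without its matching auxiliary row if the second insert fails.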
Bill Karwin's answer is very good.
However, since you say:
I want to use Doctrine ORM, and I would like to find a method that will be easily compatible with that data mapper
I'll relate his answer to that particular ORM.
What Bill describes is inheritance. You have a superclass of "content", represented by a table that holds all the shared data. Then you have subclasses (text, image, video) that extend that superclass by adding content-type-specific columns.
Doctrine2 will do essentially what Bill has suggested when you use class-table inheritance. Once you configure your entities properly, it will create a set of tables very similar to what Bill describes.
So, with Doctrine you have the Content entity, which is extended by Image, Text, and Video.
As far as the tagging goes, you would just create a basic Tag entity, and Content would have a ManyToMany relationship to Tag. Doctrine will handle creating the intermediate table for you.
I have an intermediate table, ArticleLanguage:
idArticleLanguage
ArticleId
LanguageId
Name
Foreign keys are:
ArticleId
LanguageId
Should I use primary keys for:
ArticleId
LanguageId
Because these fields are primary keys in related tables?
Link / Junction Tables
Assuming the linked tables are defined as:
CREATE TABLE Article
(
ArticleId INT PRIMARY KEY
-- ... other columns
);
CREATE TABLE Language
(
LanguageId INT PRIMARY KEY
-- ... other columns
);
As per @JulioPérez's Option 1, the link table could be created as:
CREATE TABLE ArticleLanguage
(
ArticleId INT NOT NULL,
LanguageId INT NOT NULL,
Name VARCHAR(50),
-- i.e. Composite Primary Key, consisting of the two foreign keys.
PRIMARY KEY(ArticleId, LanguageId),
FOREIGN KEY(ArticleId) REFERENCES Article(ArticleId),
FOREIGN KEY(LanguageId) REFERENCES Language(LanguageId)
);
i.e. with a composite primary key consisting of the two foreign keys used in the "link" relationship, and with no additional Surrogate Key (idArticleLanguage) at all.
Pros of this approach
Enforces uniqueness of the link, i.e. the same ArticleId and LanguageId cannot be linked more than once.
Saves an unnecessary additional surrogate key column on the link table.
Cons of this approach:
Any downstream tables which need to reference this link table would have to repeat both keys (ArticleId, LanguageId) as a composite foreign key, which again consumes space. Queries involving downstream tables which reference ArticleLanguage would also be able to join directly to Article and Language, potentially bypassing the link table (it is often easy to 'forget' that both keys are required in the join when using composite foreign keys).
The alternative (@JulioPérez's Option 2) would be to keep your additional surrogate PK on the link table.
CREATE TABLE ArticleLanguage
(
-- New Surrogate PK
idArticleLanguage INT NOT NULL AUTO_INCREMENT,
ArticleId INT NOT NULL,
LanguageId INT NOT NULL,
Name VARCHAR(50),
PRIMARY KEY(idArticleLanguage),
-- Can still optionally enforce uniqueness of the link
UNIQUE(ArticleId, LanguageId),
FOREIGN KEY(ArticleId) REFERENCES Article(ArticleId),
FOREIGN KEY(LanguageId) REFERENCES Language(LanguageId)
);
Pros of this Approach
The Primary Key idArticleLanguage is narrower than the composite key, which will benefit any further downstream tables referencing table ArticleLanguage. It also requires downstream tables to join through the ArticleLanguage link table in order to get ArticleId and LanguageId, for further joins to the Language and Article tables.
This approach allows for an additional use case: if it IS possible to link the same Language and Article more than once (e.g. two revisions or two reprints), then the UNIQUE key constraint can be removed.
Cons of this Approach
If only one unique link per Article and Language is possible, then the additional surrogate key is redundant.
If you're asking for an opinion, I would stick with option 1, unless you do require non-unique links in your ArticleLanguage table, or unless you have many further downstream tables which reference ArticleLanguage (this would be unusual, IMO).
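To make the uniqueness guarantee of option 1 concrete, here is a small sketch. It assumes Python's built-in sqlite3 with an in-memory database instead of MySQL, and the `link` helper and sample names are hypothetical. The second attempt to link the same Article and Language is rejected by the composite primary key.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL for the sake of a runnable sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Article  (ArticleId  INTEGER PRIMARY KEY);
CREATE TABLE Language (LanguageId INTEGER PRIMARY KEY);
CREATE TABLE ArticleLanguage (
    ArticleId  INTEGER NOT NULL,
    LanguageId INTEGER NOT NULL,
    Name       TEXT,
    PRIMARY KEY (ArticleId, LanguageId),   -- composite key of the two foreign keys
    FOREIGN KEY (ArticleId)  REFERENCES Article(ArticleId),
    FOREIGN KEY (LanguageId) REFERENCES Language(LanguageId)
);
INSERT INTO Article  (ArticleId)  VALUES (1);
INSERT INTO Language (LanguageId) VALUES (1);
""")


def link(article_id, language_id, name):
    """Try to add a link; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ArticleLanguage (ArticleId, LanguageId, Name) "
                "VALUES (?, ?, ?)",
                (article_id, language_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = link(1, 1, "Welcome")                    # accepted
second = link(1, 1, "Welcome (second edition)")  # rejected: duplicate (ArticleId, LanguageId)
link_count = conn.execute("SELECT COUNT(*) FROM ArticleLanguage").fetchone()[0]
```

With option 2, the same duplicate would be rejected by the UNIQUE constraint instead, and would be accepted once that constraint is removed.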
Table per Type / per Class Inheritance
Unrelated to OP's post, but another common occurrence where a Foreign Key can be used as a Primary Key in the referencing table is when the Table per Type approach is taken when modelling an object oriented class hierarchy with multiple subclasses. Because of the 0/1 to 1 relationship between subclass and base class tables, the base class table's primary key can also be used as the primary key for the subclass tables, for instance:
CREATE TABLE Animal
(
AnimalId INT NOT NULL AUTO_INCREMENT PRIMARY KEY
-- Common Animal fields here
);
CREATE TABLE Shark
(
AnimalId INT NOT NULL PRIMARY KEY,
-- Subclass specific columns
NumberFins INT,
FOREIGN KEY(AnimalId) REFERENCES Animal(AnimalId)
);
CREATE TABLE Ewok
(
AnimalId INT NOT NULL PRIMARY KEY,
-- Subclass specific columns
Fleas BOOL,
FOREIGN KEY(AnimalId) REFERENCES Animal(AnimalId)
);
You have 2 options:
1) Use "ArticleId + LanguageId" as the only primary key of the intermediate table (you can name this key "idArticleLanguage"). This is called a "composite" primary key because it is composed of 2 fields (in other cases more than 2), here the 2 foreign keys (PK = FK + FK).
2) Create an "idArticleLanguage" column that has no relation to the other two ids and set it as the primary key. It can be a simple auto-increment integer.
Both alternatives are acceptable. Your choice will depend on the goal you want to achieve. For example, what happens if you need to add the same Article with the same Language to this intermediate table twice ("Willkommen" in German, say) because you have 2 different editions of the article? With alternative 1 the insert will throw an error, because 2 rows would have the same composite primary key. In that case you must choose alternative 2 and create a completely separate primary key for this table.
In any other case (or for any other purpose) you can choose alternative 1 and it will work perfectly.
About the change of your question title:
When use foreign key as primary key in the same time?
I will explain it with this example:
You have 2 tables: "country" and "city". "country" has all the countries of the world, and "city" has all the cities of the world. But you need to know every capital in the world. What should you do?
You must create an "intermediate table" (named "capital") that will hold every capital in the world. We know that country has its primary key "idcountry" and city has its primary key "idcity". You need to bring both as foreign keys into the table "capital", because you will need data from the "city" and "country" tables to fill the "capital" table.
Then "capital" will have its own primary key "idcapital", which can be the composite "idcity+idcountry" or an auto-increment integer. In both cases you must have "idcity" and "idcountry" as foreign keys on your "capital" table.
If I were to create a marketplace database for something specific, let's say books, I would have a store table and a store_books table that would contain the books of each store.
Now, if there is a high probability of multiple stores selling the exact same book, would it be a good idea to keep another table, books, containing book-related information, and put prices etc. in the store_books table? I also want to add languages using another two tables, book_langs and langs,
where book_langs holds the localized information and langs contains all supported languages.
My main concern in this case is that if a store needs to add a new book, it will create the first entry in the books table as well as all the translations, which will then be used by everyone. If someone translates wrongly, misspells something, or if there are simply multiple ways of translating, let's say, a title, this would introduce a mess in the database, since everyone would start creating new entries for each book. What would be a good approach to solving this kind of problem? De-normalizing store_books to contain the title would be one approach, but is that the preferred approach?
Please take the books as an example. One could argue that a book needs to have a specific title since it was translated. Think of the book as a placeholder since I can't think of a better example at the moment.
You need to distinguish between "editions" of the same book. I would argue that translations are different editions and that stores sell editions. In your case, an "edition" could simply be a book/language combination. So, call it BookLanguageId.
Something like the following entities:
create table Books (
BookId int auto_increment primary key,
Title varchar(255),
. . .
);
create table BookLanguages (
BookLanguageId int auto_increment primary key,
BookId int not null,
LanguageId int not null,
IsPrimaryLanguage tinyint,
. . .
foreign key (BookId) references Books(BookId),
foreign key (LanguageId) references Languages(LanguageId)
);
create table StoreBookLanguages (
StoreBookLanguageId int auto_increment primary key,
StoreId int not null,
BookLanguageId int not null,
DateArrived int,
foreign key (StoreId) references Stores(StoreId),
foreign key (BookLanguageId) references BookLanguages(BookLanguageId)
);
You have the option to uncheck Mandatory in the foreign key tab of the relationship window, but that doesn't fully capture the meaning of a disjoint relationship, which is an EITHER-OR relationship between multiple relations.
Your referring to the mandatory property of the foreign key makes me believe you are either misunderstanding the meaning of a disjoint relationship, or implementing it with a relation in the wrong "direction".
Let's say we want to implement the following schema:
class: Staff Member
class: Permanent (specialises Staff Member)
class: Temporary (specialises Staff Member)
a Staff Member is either a Permanent employee or a Temporary contractor
A corresponding EER schema would be (MySQL syntax):
CREATE TABLE staff_member (
id INT PRIMARY KEY,
name VARCHAR(20) NOT NULL
);
CREATE TABLE permanent (
id INT PRIMARY KEY,
next_appraisal DATETIME NOT NULL,
FOREIGN KEY (id) REFERENCES staff_member(id)
);
CREATE TABLE temporary (
id INT PRIMARY KEY,
contract_end DATETIME NOT NULL,
FOREIGN KEY (id) REFERENCES staff_member(id)
);
Notice the foreign key is from the specialised entity to the parent entity (id being the primary key, it is also always mandatory by definition).
This still doesn't answer your question. How to model the disjoint property of this relationship? You cannot do this easily (neither can you model that a specialisation is complete, by the way).
Many RDBMS support CHECK constraints for enforcing extra conditions like these, but MySQL before 8.0.16 does not (beware: in those versions the syntax is accepted by the MySQL parser, but the declaration is ignored). However, simple workarounds exist that produce the same effect.
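One commonly used workaround, which is not necessarily the one the answer had in mind, is a discriminator column: the parent row records which subtype it belongs to, and each child table references the parent through a composite foreign key on (id, kind). The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 (where CHECK is enforced) instead of MySQL, and the `kind` column and the `try_insert` helper are hypothetical additions to the schema above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dates are stored as TEXT here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE staff_member (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('permanent', 'temporary')),
    UNIQUE (id, kind)              -- target of the composite foreign keys below
);
CREATE TABLE permanent (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'permanent' CHECK (kind = 'permanent'),
    next_appraisal TEXT NOT NULL,
    FOREIGN KEY (id, kind) REFERENCES staff_member(id, kind)
);
CREATE TABLE temporary (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'temporary' CHECK (kind = 'temporary'),
    contract_end TEXT NOT NULL,
    FOREIGN KEY (id, kind) REFERENCES staff_member(id, kind)
);
""")


def try_insert(sql, params):
    """Run one INSERT; return False if any constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    conn.execute(
        "INSERT INTO staff_member (id, name, kind) VALUES (1, 'Ann', 'permanent')"
    )

# Ann is registered as 'permanent', so only the permanent row is accepted.
ok_permanent = try_insert(
    "INSERT INTO permanent (id, next_appraisal) VALUES (?, ?)", (1, "2030-01-01")
)
ok_temporary = try_insert(
    "INSERT INTO temporary (id, contract_end) VALUES (?, ?)", (1, "2030-06-30")
)
```

This enforces disjointness only. It does not make the specialisation complete, since a staff_member row can still exist without any child row. On a MySQL version that ignores CHECK, the constant `kind` in each child table would need another mechanism, for example a foreign key to a one-row lookup table.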
Minimized background (i.e. bare pseudo-code details)
I am making a record-keeping (among other things) PHP/MySQL app for my farm. There are lots of types of animals etc. that could have pictures (or other records, such as videos), but for simplicity I'll only refer to one of each (goats and pictures). So say the tables are approximately like so:
CREATE TABLE BMD_farmrecords_goats (
goat_id INT NOT NULL AUTO_INCREMENT,
goat_name TEXT,
-- ...more columns, unimportant for the question...
PRIMARY KEY (goat_id)
);
CREATE TABLE BMD_farmrecords_pictures (
media_id INT NOT NULL AUTO_INCREMENT,
media_name TEXT,
media_description TEXT,
media_description_short TEXT,
media_date_upload DATE,
media_date_taken DATE,
media_uploader INT, -- foreign key constrained to user table but unimportant for question
media_type ENUM('jpg','gif','png'),
media_hidden BOOL,
media_category INT, -- foreign key constrained to category table but unimportant for question
PRIMARY KEY (media_id)
);
So the problem(s):
Obviously a picture could have multiple goats in it, so I can't just have one foreign key in picture to refer to goat.
There is more than one livestock table, which would also make that a poor choice, but I'm not worried about that right now.
Basically no optimization has been applied as of yet (i.e. no lengths set, using TEXT rather than VARCHAR(length), etc.); I'm not worried about that until I populate it a bunch and see exactly how long I want everything.
So the question:
What is the best way to link a picture to multiple goats, in terms of A) best performance and B) best code conformance to standards? I'm thinking I'll have to add an extra table:
CREATE TABLE BMD_farmrecords_goatpictures (
id INT NOT NULL AUTO_INCREMENT,
picture_id INT, -- foreign key to BMD_farmrecords_pictures->media_id
goat_id INT, -- foreign key to BMD_farmrecords_goats->goat_id
PRIMARY KEY (id)
);
So is there any better way to do that?
Of course with that method I'll probably have to change *_goats table to be a parent *_animals table with then a type field and reference animal_id instead but I'm not worried about that, just about whether or not the extra table referencing both tables is the best method.
thanks;
From the discussion, I'm just changing my original idea to use a composite primary key:
CREATE TABLE BMD_farmrecords_goatpictures (
picture_id INT, -- foreign key to BMD_farmrecords_pictures->media_id
goat_id INT, -- foreign key to BMD_farmrecords_goats->goat_id
PRIMARY KEY (picture_id, goat_id)
);
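To show how such a link table is read back, here is a minimal sketch of looking up all goats in one picture through it. It assumes Python's built-in sqlite3 with an in-memory database rather than MySQL; the shortened table names, the sample rows, and the `goats_in_picture` helper are hypothetical stand-ins for the BMD_farmrecords_* tables.

```python
import sqlite3

# Assumption: in-memory SQLite and shortened table names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE goats    (goat_id  INTEGER PRIMARY KEY, goat_name  TEXT);
CREATE TABLE pictures (media_id INTEGER PRIMARY KEY, media_name TEXT);
CREATE TABLE goatpictures (
    picture_id INTEGER NOT NULL REFERENCES pictures(media_id),
    goat_id    INTEGER NOT NULL REFERENCES goats(goat_id),
    PRIMARY KEY (picture_id, goat_id)
);
INSERT INTO goats    VALUES (1, 'Daisy'), (2, 'Billy');
INSERT INTO pictures VALUES (10, 'barn.jpg');
INSERT INTO goatpictures VALUES (10, 1), (10, 2);   -- one picture, two goats
""")


def goats_in_picture(media_id):
    """Return the names of all goats linked to the given picture, sorted by name."""
    rows = conn.execute(
        """
        SELECT g.goat_name
        FROM goatpictures AS gp
        JOIN goats AS g ON g.goat_id = gp.goat_id
        WHERE gp.picture_id = ?
        ORDER BY g.goat_name
        """,
        (media_id,),
    ).fetchall()
    return [name for (name,) in rows]


names = goats_in_picture(10)
```

Because the composite primary key starts with picture_id, it can also serve this lookup as an index; the reverse lookup (all pictures of one goat) would typically want a second index on goat_id.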
I have the following comments table in my app:
comments
--------
id INT
foreign_id INT
model TEXT
comment_text TEXT
...
The idea of this table is to store comments for various parts of my app. For example, it can store comments for a blog post:
1|34|blogpost|lorem ipsum...
user picture:
2|12|picture|lorem ipsum...
and so on.
Now, is there a way to enforce a FOREIGN KEY constraint on such data?
I.e. something like this in the comments table:
FOREIGN KEY (`foreign_id`) REFERENCES blogposts (`id`)
-- but only when model='blogpost'
You're attempting to do a design that is called Polymorphic Associations. That is, the foreign key may reference rows in any of several related tables.
But a foreign key constraint must reference exactly one table. You can't declare a foreign key that references different tables depending on the value in another column of your Comments table. This would violate several rules of relational database design.
A better solution is to make a sort of "supertable" that is referenced by the comments.
CREATE TABLE Commentable (
id SERIAL PRIMARY KEY
);
CREATE TABLE Comments (
comment_id SERIAL PRIMARY KEY,
foreign_id INT NOT NULL,
...
FOREIGN KEY (foreign_id) REFERENCES Commentable(id)
);
Each of your content types would be considered a subtype of this supertable. This is analogous to the object-oriented concept of an interface.
CREATE TABLE BlogPosts (
blogpost_id INT PRIMARY KEY, -- notice this is not auto-generated
...
FOREIGN KEY (blogpost_id) REFERENCES Commentable(id)
);
CREATE TABLE UserPictures (
userpicture_id INT PRIMARY KEY, -- notice this is not auto-generated
...
FOREIGN KEY (userpicture_id) REFERENCES Commentable(id)
);
Before you can insert a row into BlogPosts or UserPictures, you must insert a new row to Commentable to generate a new pseudokey id. Then you can use that generated id as you insert the content to the respective subtype table.
Once you do all that, you can rely on referential integrity constraints.
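As a rough illustration of the referential integrity this buys, the sketch below creates a blog post through the supertable and then tries one valid and one dangling comment. It assumes Python's built-in sqlite3 with an in-memory database in place of the original DBMS; the `title` column and the `create_blog_post` and `add_comment` helpers are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite; INTEGER PRIMARY KEY replaces SERIAL PRIMARY KEY.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Commentable (id INTEGER PRIMARY KEY);
CREATE TABLE BlogPosts (
    blogpost_id INTEGER PRIMARY KEY,   -- not auto-generated: comes from Commentable
    title       TEXT NOT NULL,         -- hypothetical column for the demo
    FOREIGN KEY (blogpost_id) REFERENCES Commentable(id)
);
CREATE TABLE Comments (
    comment_id   INTEGER PRIMARY KEY,
    foreign_id   INTEGER NOT NULL,
    comment_text TEXT NOT NULL,
    FOREIGN KEY (foreign_id) REFERENCES Commentable(id)
);
""")


def create_blog_post(title):
    """Generate an id in Commentable, then reuse it for the BlogPosts row."""
    with conn:
        new_id = conn.execute("INSERT INTO Commentable DEFAULT VALUES").lastrowid
        conn.execute(
            "INSERT INTO BlogPosts (blogpost_id, title) VALUES (?, ?)", (new_id, title)
        )
    return new_id


def add_comment(target_id, text):
    """Attach a comment to any commentable id; False if the target does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Comments (foreign_id, comment_text) VALUES (?, ?)",
                (target_id, text),
            )
        return True
    except sqlite3.IntegrityError:
        return False


post_id = create_blog_post("Hello")
accepted = add_comment(post_id, "lorem ipsum")         # target exists
rejected = add_comment(post_id + 100, "orphan comment")  # no such Commentable row
```

Note that the Comments table no longer needs the model column to locate its target: the id is unique across all subtypes because every subtype draws it from Commentable.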
In MySQL 5.7 you can have a single polymorphic table AND enjoy something like a polymorphic foreign key!
The caveat is that technically you will need to implement it as multiple FKs on multiple columns (one per each entity that has comments), but the implementation can be limited to the DB side (i.e. you will not need to worry about these columns in your code).
The idea is to use MySQL's Generated Columns:
CREATE TABLE comments (
id INT NOT NULL AUTO_INCREMENT,
foreign_id INT,
model TEXT,
commented_text TEXT,
generated_blogpost_id INT AS (IF(model = 'blogpost', foreign_id, NULL)) STORED,
generated_picture_id INT AS (IF(model = 'picture', foreign_id, NULL)) STORED,
PRIMARY KEY (id) ,
FOREIGN KEY (`generated_blogpost_id`) REFERENCES blogpost(id) ON DELETE CASCADE,
FOREIGN KEY (`generated_picture_id`) REFERENCES picture(id) ON DELETE CASCADE
)
You can ignore the generated_* columns; they will be populated automatically by MySQL as comments are added or modified, and the FKs defined for them will ensure data consistency as expected.
Obviously it would impact both the size requirements and performance, but for some (most?) systems it would be negligible, and a price worth paying for achieving data consistency with a simpler design.