I'm having trouble modeling a particular database structure I'm working on. In short, consider the following:
A webpage can have one or more threads on it
A thread consists of one or more comments
A comment can have one or more complaints filed against it
Complaints can also be filed against the thread as a whole
Complaints can also be filed against the page
I can't quite figure out how to model this at the DB level. The first three are easy:
webpage
----------
id
name
thread
---------
id
page_id
name
comment
--------
id
thread_id
name
But if I wanted a single table of complaints, how would one model that? I don't think you would want to do:
complaint
----------
id
page_id
thread_id
comment_id
If you ever added a new object type, like picture, you'd have to add more columns to the complaint table. Is there a better way to do this, or is this as good as it gets?
Thanks in advance,
- Anthony
I would create the complaint as an entity in its own right, then have a link table between it and each of the different things it can be associated with.
So, I'd have the following tables ...
complaint
complaint_comment_link
complaint_thread_link
complaint_page_link
This is a slightly different variation on Waleed's solution. As with all things like this, there are many ways to solve it :)
The advantage of this approach is that you can have foreign keys to maintain data integrity. The disadvantage is that whenever you need to file complaints against a new "thing" you will need a new link table, but I suppose you'd have to create a new "thing" table anyway.
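As a rough illustration of the link-table idea, here is a minimal sketch using Python's sqlite3 module (SQLite stands in for the asker's database, and the `reason` column and sample rows are assumptions made up for the example). It shows only one of the three link tables; the other two would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE comment   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE complaint (id INTEGER PRIMARY KEY, reason TEXT);  -- 'reason' is hypothetical
CREATE TABLE complaint_comment_link (
    complaint_id INTEGER NOT NULL REFERENCES complaint(id),
    comment_id   INTEGER NOT NULL REFERENCES comment(id),
    PRIMARY KEY (complaint_id, comment_id)
);
INSERT INTO comment   VALUES (1, 'first comment');
INSERT INTO complaint VALUES (1, 'spam');
INSERT INTO complaint_comment_link VALUES (1, 1);
""")

def link_to_missing_comment_fails():
    """Return True if the foreign key rejects a link to a comment that does not exist."""
    try:
        conn.execute("INSERT INTO complaint_comment_link VALUES (1, 999)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Because each link table has real foreign keys, the database itself refuses a complaint that points at a missing comment.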
One solution off the top of my head is to have a table:
ObjectType
-------------------
| id | name |
-------------------
| 1 | Webpage |
| 2 | Thread |
| 3 | Comment |
-------------------
Then your complaint table can be as follows:
----------------------------------------
| id | object_type_id | objectid |
----------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
----------------------------------------
Of course this could add additional work later on when querying the complaint table and joining with the others, but that all depends on what you want to query.
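To make the extra join work concrete, here is a small sketch of the ObjectType approach in Python's sqlite3 module (SQLite is used only so the example runs anywhere; the page and thread names are invented). Note that objectid cannot carry a foreign key, because it points at a different table depending on object_type_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object_type (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE webpage (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE thread  (id INTEGER PRIMARY KEY, page_id INTEGER, name TEXT);
CREATE TABLE complaint (
    id             INTEGER PRIMARY KEY,
    object_type_id INTEGER NOT NULL REFERENCES object_type(id),
    objectid       INTEGER NOT NULL  -- no FK possible: the target table varies
);
INSERT INTO object_type VALUES (1, 'Webpage'), (2, 'Thread'), (3, 'Comment');
INSERT INTO webpage VALUES (1, 'home'), (2, 'about');   -- hypothetical names
INSERT INTO thread  VALUES (1, 1, 'welcome');
INSERT INTO complaint VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

# Complaints filed against webpages: the type filter decides which table to join.
webpage_complaints = conn.execute("""
    SELECT c.id, w.name
    FROM complaint c
    JOIN object_type t ON t.id = c.object_type_id
    JOIN webpage w     ON w.id = c.objectid
    WHERE t.name = 'Webpage'
    ORDER BY c.id
""").fetchall()
```

Every query has to repeat the type filter, which is the additional work mentioned above.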
Another approach is to have a new entity table that has a supertype/subtype relationship with the 3 tables (webpage, thread, comment):
entity
----------
id (PK)
webpage
----------
id (PK)
name
FOREIGN KEY id REFERENCES entity(id)
thread
---------
id (PK)
page_id
name
FOREIGN KEY id REFERENCES entity(id)
comment
--------
id (PK)
thread_id
name
FOREIGN KEY id REFERENCES entity(id)
complaint
----------
id (PK)
entity_id
FOREIGN KEY entity_id REFERENCES entity(id)
This way, creating or deleting a webpage (or thread or comment) will be slightly more complicated, because you have to insert or delete a row in two tables rather than one.
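The two-table insert could look roughly like the following sketch, written with Python's sqlite3 module (SQLite here is an assumption for runnability, and `create_webpage` is a hypothetical helper name). Only the webpage subtype is shown; thread and comment would be handled the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE entity  (id INTEGER PRIMARY KEY);
CREATE TABLE webpage (id INTEGER PRIMARY KEY REFERENCES entity(id), name TEXT);
CREATE TABLE complaint (
    id        INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity(id)
);
""")

def create_webpage(name):
    """Insert the supertype row first, then the subtype row, in one transaction."""
    with conn:
        entity_id = conn.execute("INSERT INTO entity DEFAULT VALUES").lastrowid
        conn.execute("INSERT INTO webpage (id, name) VALUES (?, ?)", (entity_id, name))
    return entity_id

page_id = create_webpage("home")
with conn:
    # A complaint only needs to know the entity id, whatever the subtype is.
    conn.execute("INSERT INTO complaint (entity_id) VALUES (?)", (page_id,))
```

The complaint table keeps a single, fully enforced foreign key no matter how many object types are added later.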
Related
I'm playing around with MySQL at the moment, learning about database design, and wondered about something I couldn't find an answer to on Google.
Imagine a table named 'products' with the primary key 'id' and two additional columns named 'name' and 'primary_image_id', where 'primary_image_id' is a foreign key linking to a second table.
The second table is named 'product_images' also with the primary key 'id' and two additional columns this time called 'path' (path to the image) and 'product_id'. 'product_id' is of course a foreign key linking back to the first table.
+----+-----------+------------------+
| id | name | primary_image_id |
+----+-----------+------------------+
| 1 | product_A | 3 |
+----+-----------+------------------+
| 2 | product_B | 6 |
+----+-----------+------------------+
+----+-----------+------------------+
| id | path | product_id |
+----+-----------+------------------+
| 1 | /image_01 | 2 |
+----+-----------+------------------+
| 2 | /image_02 | 1 |
+----+-----------+------------------+
| 3 | /image_03 | 1 |
+----+-----------+------------------+
| 4 | /image_04 | 1 |
+----+-----------+------------------+
| 5 | /image_05 | 2 |
+----+-----------+------------------+
| 6 | /image_06 | 2 |
+----+-----------+------------------+
The idea is to have a table with all product images while only one image per product is the preview image (primary image). Is this type of foreign key linking even possible? And if yes, is it good database design or should I use another method?
Thank you in advance!
This is a valid use case and the table design looks good if your intention is to just read data using foreign key like "Get all image paths for product id 1" or "Get primary image of product id 1" or "Get paths of all primary images".
People tend to avoid cyclic foreign key references between tables, especially if there is a cascade dependency on delete/update events. You need to answer questions like "What should happen to images 2, 3, and 4 if product 1 is deleted?" or "What should happen to product 1 if image 3 is deleted?"
The answers will help you come up with a design that fulfills your requirements.
Just use indexes without FOREIGN KEYs.
A more typical approach would be to move the primary flag to the images table. Both of these approaches have the potential for illogical data:
Your way would allow product 1 to name image A as its primary while image A could identify product 2 as its product.
My way would allow products to have 0 or 2+ primary images if the flag wasn’t well-managed.
Depending on how worried you are about either inconsistency, you could try to manage it via triggers or constraints, although MySQL is a little lacking in these areas compared to other DBMSs.
One way to absolutely prevent a problem would be to keep the primary flag in the images table, but store it as an int (rank) rather than a Boolean, with the convention that the minimum rank is the “primary”. Create a unique index on the combination of (product ID, rank), and access this data via a stored proc or view that implements the rank convention for you, e.g. select * from images a where product_id = whatever and not exists (select 1 from images b where a.product_id = b.product_id and a.rank > b.rank).
Seems like overkill, but you need to be the judge how important potential data integrity issues are for your application.
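A minimal sketch of the rank convention, using Python's sqlite3 module instead of MySQL so it can be run as-is. The column is called `sort_rank` here (an assumed name, chosen because RANK is a keyword in recent MySQL versions), and the rank values assigned to the sample images are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_images (
    id         INTEGER PRIMARY KEY,
    path       TEXT NOT NULL,
    product_id INTEGER NOT NULL,
    sort_rank  INTEGER NOT NULL
);
-- No two images of the same product may share a rank.
CREATE UNIQUE INDEX ux_product_rank ON product_images (product_id, sort_rank);
INSERT INTO product_images VALUES
    (2, '/image_02', 1, 2),
    (3, '/image_03', 1, 1),
    (4, '/image_04', 1, 3);
""")

def primary_image(product_id):
    """Return the path of the lowest-ranked image, i.e. the 'primary' one."""
    row = conn.execute("""
        SELECT a.path FROM product_images a
        WHERE a.product_id = ?
          AND NOT EXISTS (SELECT 1 FROM product_images b
                          WHERE b.product_id = a.product_id
                            AND a.sort_rank > b.sort_rank)
    """, (product_id,)).fetchone()
    return row[0] if row else None

def duplicate_rank_rejected():
    """Return True if the unique index blocks a second image with rank 1."""
    try:
        conn.execute("INSERT INTO product_images VALUES (9, '/image_09', 1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False
```

With this layout a product with at least one image always has exactly one primary image, without any circular foreign key.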
I have 2 tables in my MySQL database.
Let's call 1st main, 2nd final.
TABLE `main` has the structure | TABLE `final` has the structure
|
`id` --> PRIMARY KEY (Auto Increment) | `id` --> PRIMARY KEY (Auto Increment)
| `id_main` --> ?? (Need help here)
|
id | name | info | id | id_main | name | info(changed)
--------------------- | ---------------------------------------
1 | Peter | 5,9 | 1 | 2 | Butters | 0.3,34
2 | Butters | 3,3 | 2 | 4 | Stewie | 1.2,4.4
3 | Stan | 2,96 | 3 | 1 | Peter | 5.7,0.9
4 | Stewie | 1,84 | 4 | 3 | Stan | 4.8,0.74
After analysing data in main the results get put into final.
As you can see final has an extra column (id_main) which points back to main.id
In reality these two tables have 100 million+ rows each, and my problem arises when performing SQL queries.
How should final (especially id and id_main) be configured so that querying from main to final is as fast as possible?
Can I do away with final.id (PRIMARY KEY, Auto Increment) & keep
final.id_main (As an UNIQUE Index?)
OR
Should I keep id AS PRIMARY KEY (AI) & final.id_main AS UNIQUE Index?
I would be making calls like:
int id_From_Main= 10000;
SELECT `id_main` FROM `final` WHERE `id`='"+id_From_Main+"'
If there's a 1:1 relation between those tables, I don't see any reason why they would need two separate auto-incremented primary keys.
I would remove the final.id column and have the final.id_main as a non-auto-incremented primary key and a foreign key to the main.id column.
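A small sketch of that layout, written against SQLite through Python's sqlite3 module purely so it is runnable (the asker uses MySQL, where the DDL would differ slightly). Only two of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE main (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    info TEXT
);
-- final.id is gone: id_main is both the primary key and the link back to main.
CREATE TABLE final (
    id_main INTEGER PRIMARY KEY REFERENCES main(id),
    name    TEXT,
    info    TEXT
);
INSERT INTO main (id, name, info) VALUES (1, 'Peter', '5,9'), (2, 'Butters', '3,3');
INSERT INTO final VALUES (2, 'Butters', '0.3,34'), (1, 'Peter', '5.7,0.9');
""")

# Going from main to final is a primary-key lookup on final.id_main.
row = conn.execute("""
    SELECT m.name, f.info
    FROM main m JOIN final f ON f.id_main = m.id
    WHERE m.id = ?
""", (1,)).fetchone()
```

The primary key on id_main also guarantees the 1:1 relation, since a second row for the same main.id would be rejected.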
In general, you can also have a table without a primary key at all. It depends on if you want to be able to select specific individual rows or not.
I don't understand your query SELECT id_main FROM final WHERE id = '"+id_From_Main+"': you're trying to select the ID from main by the ID from main. What's the purpose? Why are you trying to get a value you already have?
Anyway, you're not providing enough information for a qualified answer. You have to optimize your data structures according to the queries you'll be running.
Make sure you have indexes on the columns you use in the WHERE clause. If you're selecting by final.id_main, have an index on that column. If you're selecting by final.id_main and final.name, have a composite index on both columns, etc.
Do you really need to have the name column in both tables? It's a bad database design, unless it's some performance optimization (to avoid a join).
So, you should:
collect all queries you're currently using, set proper indexes according to them
remove any unnecessary columns (e.g. final.id, final.name)
use the EXPLAIN on your queries to get execution information (you can also use the Explain analyzer to help you interpret the results)
you can try query profiling
In MySQL, an AUTO_INCREMENT column has to be defined as a key, so if you keep id, define it as the PK. Define id_main as UNIQUE.
I have a table called branch
It looks something like.
+----------------+--------------+
| branch_id | branch_name |
+----------------+--------------+
| 1 | TestBranch1 |
| 2 | TestBranch2 |
+----------------+--------------+
I've set the branch_id as primary key.
Now my question is related to the next table called item
It looks like this.
+----------------+-----------+---------------------------+
| branch_id | item_id | item_name |
+----------------+-----------+---------------------------+
| 1 | 1 | Apple |
| 1 | 2 | Ball |
| 2 | 1 | Totally Difference Apple |
| 2 | 2 | Apple Apple 2 |
+----------------+-----------+---------------------------+
I'd like to know if I need to create a primary key for my item table?
UPDATE
They do not share the same items. Sorry for the confusion.. A branch can create a product that doesn't exist in the other branch. They are like two stores sharing the same database.
UPDATE
Sorry for the incomplete information.
These tables are actually from two local database...
I'm trying to create a database that can exist on its own but would still have no problem when merged with another. The system would just append all the item data from another branch without mixing them up. The branches don't take the item_id of the other branches into consideration when generating a unique ID for their items. All the databases, however, may share the same branch table as a reference.
Thank you guys in advance.
I'd like to know if I need to create a primary key for my item table?
You always[1] need a key[2], whether the table is involved in a relationship[3] or not. The only question is what kind of key?
Here are your options in this case:
Make {item_id} alone a key. This makes the relationship "non-identifying" and item a "strong" entity...
Which produces a slimmer key (compared to the second option), therefore any child tables that may reference it are slimmer.
Any ON UPDATE CASCADE actions are cut off at the level of the item and not propagated to children.
May play better with ORMs.
Make a composite[4] key on {branch_id, item_id}. This makes the relationship "identifying" and item a "weak" entity...
Which makes item itself slimmer (one less index).
Which may be very useful for clustering.
May help you avoid a JOIN in some cases (if there are child tables, branch_id is propagated to them).
May be necessary for correctly modelling "diamond-shaped" dependencies.
So pick your poison ;)
Of course, branch_id is a foreign key (but not a key) in both cases.
And orthogonal to all that, if item_name has to be unique per-branch (as opposed to per whole table), you need a composite key on {branch_id, item_name} as well.
[1] From the logical perspective, you always need a key, otherwise your table would be a multiset, therefore not a relation (which is a set), therefore your database would no longer be "relational". From the physical perspective, there may be some special cases for breaking this rule, but they are rare.
[2] Whether it's primary or not is immaterial from the logical standpoint, although it may be important if the DBMS ascribes a special meaning to it, as is the case with InnoDB, which uses the primary key as the clustering key.
[3] Please make a distinction between "relation" and "relationship".
[4] Aka "compound".
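The second option (a composite key on branch_id and item_id, plus a per-branch unique name) might be sketched as follows. This uses Python's sqlite3 module with SQLite as a stand-in for MySQL, and `is_rejected` is a hypothetical helper added only to demonstrate the constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE item (
    branch_id INTEGER NOT NULL REFERENCES branch(branch_id),
    item_id   INTEGER NOT NULL,
    item_name TEXT NOT NULL,
    PRIMARY KEY (branch_id, item_id),   -- identifying relationship
    UNIQUE (branch_id, item_name)       -- names unique within a branch only
);
INSERT INTO branch VALUES (1, 'TestBranch1'), (2, 'TestBranch2');
INSERT INTO item VALUES
    (1, 1, 'Apple'), (1, 2, 'Ball'),
    (2, 1, 'Totally Difference Apple'), (2, 2, 'Apple Apple 2');
""")

def is_rejected(branch_id, item_id, item_name):
    """Return True if a constraint blocks the insert; False if the row is accepted."""
    try:
        conn.execute("INSERT INTO item VALUES (?, ?, ?)", (branch_id, item_id, item_name))
    except sqlite3.IntegrityError:
        return True
    return False
```

Here item_id 1 can exist once per branch, so two branches can merge their rows without renumbering.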
According to your example data you are using n to m relations and not 1 to m. It should be like this
item table
----------
item_id | item_name
1 | Apple
2 | Ball
branch_item table
-----------------
item_id | branch_id
1 | 1
1 | 2
2 | 1
2 | 2
And your branch_item table should have a compound unique key containing branch_id and item_id to make sure no duplicate entries can be added.
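For the n-to-m reading of the data, a rough sketch in Python's sqlite3 module (SQLite assumed for runnability; the branch table is taken from the question) could be:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE item   (item_id   INTEGER PRIMARY KEY, item_name   TEXT);
CREATE TABLE branch_item (
    item_id   INTEGER NOT NULL REFERENCES item(item_id),
    branch_id INTEGER NOT NULL REFERENCES branch(branch_id),
    UNIQUE (branch_id, item_id)   -- each item listed at most once per branch
);
INSERT INTO branch VALUES (1, 'TestBranch1'), (2, 'TestBranch2');
INSERT INTO item   VALUES (1, 'Apple'), (2, 'Ball');
INSERT INTO branch_item VALUES (1, 1), (1, 2), (2, 1), (2, 2);
""")

# Which branches carry 'Apple'?
apple_branches = [r[0] for r in conn.execute("""
    SELECT b.branch_name
    FROM branch_item bi
    JOIN branch b ON b.branch_id = bi.branch_id
    JOIN item i   ON i.item_id   = bi.item_id
    WHERE i.item_name = 'Apple'
    ORDER BY b.branch_id
""")]

def duplicate_link_rejected():
    """Return True if the compound unique key blocks a repeated (item, branch) pair."""
    try:
        conn.execute("INSERT INTO branch_item VALUES (1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False
```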
Yes you do. The Primary key is what allows the many to one relationship to exist.
This requirement is already catered for by the branch_id column.
The item_id column is not required for the one-to-many relationship in your example.
I have a Users table where each user belongs to a Role and has either one Server or none (depending on the role). Should I have a separate Servers table with a user_id field, or should I put all the Server info in the Users table and leave those fields null when the role has no server?
My thinking is that if a user has at most one server, it shouldn't need a new row in a Servers table; or maybe it would be correct if the user_id field were unique. I don't know, I'm confused.
Please explain which is the best way to build this.
-- edit
These are my current tables:
Roles
id (PK) | name
1 | Administrator
Users
id (PK) | role_id | name
1 | 1 | Juliano
Servers
id (PK) | user_id (UNIQUE) | name
1 | 1 | Test
I don't know.. in servers, user_id should be UNIQUE or PK?
This is a one-to-one relation, so put the server reference in the Users table.
Users Table:
ID(Pk)
RoleID
Name
ServerID
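One possible reading of this layout is sketched below with Python's sqlite3 module (SQLite rather than the asker's database). It assumes a separate Servers table is kept and that server_id is nullable and UNIQUE; the 'Guest' role and the extra users are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Roles   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Servers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Users (
    id        INTEGER PRIMARY KEY,
    role_id   INTEGER NOT NULL REFERENCES Roles(id),
    name      TEXT,
    server_id INTEGER UNIQUE REFERENCES Servers(id)  -- NULL when the role has no server
);
INSERT INTO Roles   VALUES (1, 'Administrator'), (2, 'Guest');
INSERT INTO Servers VALUES (1, 'Test');
INSERT INTO Users   VALUES (1, 1, 'Juliano', 1), (2, 2, 'Ana', NULL), (3, 2, 'Bo', NULL);
""")

def second_owner_rejected():
    """Return True if a second user cannot claim server 1."""
    try:
        conn.execute("INSERT INTO Users VALUES (4, 1, 'Cy', 1)")
    except sqlite3.IntegrityError:
        return True
    return False
```

The UNIQUE constraint keeps the relation one-to-one, while several users can still have no server, because NULLs do not collide in a unique index.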
I have a CMS system that stores data across tables like this:
Entries Table
+----+-------+------+--------+--------+
| id | title | text | index1 | index2 |
+----+-------+------+--------+--------+
Entries META Table
+----+----------+-------+-------+
| id | entry_id | value | param |
+----+----------+-------+-------+
Files Table
+----+----------+----------+
| id | entry_id | filename |
+----+----------+----------+
Entries-to-Tags Table
+----+----------+--------+
| id | entry_id | tag_id |
+----+----------+--------+
Tags Table
+----+-----+
| id | tag |
+----+-----+
I am trying to implement a revision system, a bit like the one SO has. If I were doing it just for the Entries Table, I would simply keep a copy of all changes to that table in a separate table. As I have to do it for at least 4 tables (the TAGS table doesn't need revisions), this doesn't seem like an elegant solution at all.
How would you guys do it?
Please notice that the Meta Tables are modeled in EAV (entity-attribute-value).
Thank you in advance.
Hi, I am currently working on a solution to a similar problem. I am solving it by splitting my tables into two: a control table and a data table. The control table contains a primary key and a reference into the data table; the data table contains an auto-increment revision key and the control table's primary key as a foreign key.
Taking your entries table as an example:
Entries Table
+----+-------+------+--------+--------+
| id | title | text | index1 | index2 |
+----+-------+------+--------+--------+
becomes
entries entries_data
+----+----------+ +----------+----+--------+------+--------+--------+
| id | revision | | revision | id | title | text | index1 | index2 |
+----+----------+ +----------+----+--------+------+--------+--------+
To query:
select * from entries join entries_data on entries.revision = entries_data.revision;
Instead of updating the entries_data table, you insert a new row into it and then update the entries table's revision with the new revision from the entries_data table.
The advantage of this system is that you can move to different revisions simply by changing the revision property within the entries table. The disadvantage is that you need to update your queries. I am currently integrating this into an ORM layer so the developers don't have to worry about writing SQL anyway. Another idea I am toying with is a centralised revision table which all the data tables use. This would allow you to describe the state of the database with a single revision number, similar to how Subversion revision numbers work.
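The control/data split could be sketched like this, using Python's sqlite3 module (SQLite is an assumption so the example runs anywhere). The index1 and index2 columns are left out for brevity, and `save_revision` and `current` are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (id INTEGER PRIMARY KEY, revision INTEGER);
CREATE TABLE entries_data (
    revision INTEGER PRIMARY KEY AUTOINCREMENT,
    id       INTEGER NOT NULL REFERENCES entries(id),
    title    TEXT,
    text     TEXT
);
""")

def save_revision(entry_id, title, text):
    """Insert a new data row and point the control row at it, in one transaction."""
    with conn:
        rev = conn.execute(
            "INSERT INTO entries_data (id, title, text) VALUES (?, ?, ?)",
            (entry_id, title, text)).lastrowid
        conn.execute("UPDATE entries SET revision = ? WHERE id = ?", (rev, entry_id))
    return rev

def current(entry_id):
    """Read the revision the control table currently points at."""
    return conn.execute("""
        SELECT d.title, d.text
        FROM entries e JOIN entries_data d ON e.revision = d.revision
        WHERE e.id = ?
    """, (entry_id,)).fetchone()

with conn:
    conn.execute("INSERT INTO entries (id) VALUES (1)")
first = save_revision(1, "Draft", "v1")
second = save_revision(1, "Final", "v2")
```

Rolling back is then a single statement such as UPDATE entries SET revision = 1 WHERE id = 1; no data rows are ever overwritten.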
Have a look at this question: How to version control a record in a database
Why not have a separate history table for each table (as per the accepted answer on the linked question)? Each one simply has a compound primary key made of the original table's PK and the revision number. You will still need to store the data somewhere, after all.
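A history table of that shape might look like the sketch below, again in Python's sqlite3 module with SQLite standing in for the real database. The name `entries_history`, the helper `update_entry`, and the choice to copy the old row before each update are assumptions for illustration; index1 and index2 are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, text TEXT);
CREATE TABLE entries_history (
    id       INTEGER NOT NULL,
    revision INTEGER NOT NULL,
    title    TEXT,
    text     TEXT,
    PRIMARY KEY (id, revision)   -- original PK plus revision number
);
INSERT INTO entries VALUES (1, 'Draft', 'v1');
""")

def update_entry(entry_id, title, text):
    """Copy the current row into the history table, then overwrite it."""
    with conn:
        next_rev = conn.execute(
            "SELECT COALESCE(MAX(revision), 0) + 1 FROM entries_history WHERE id = ?",
            (entry_id,)).fetchone()[0]
        conn.execute("""
            INSERT INTO entries_history (id, revision, title, text)
            SELECT id, ?, title, text FROM entries WHERE id = ?
        """, (next_rev, entry_id))
        conn.execute("UPDATE entries SET title = ?, text = ? WHERE id = ?",
                     (title, text, entry_id))
    return next_rev

archived_revision = update_entry(1, "Final", "v2")
```

Existing queries against entries keep working unchanged, since the live table still holds only the latest version.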
For one of our projects we went the following way:
Entries Table
+----+-----------+---------+
| id | date_from | date_to |
+----+-----------+---------+
EntryProperties Table
+----------+-----------+-------+------+--------+--------+
| entry_id | date_from | title | text | index1 | index2 |
+----------+-----------+-------+------+--------+--------+
It's fairly complicated, but it lets you keep track of an object's full lifecycle. To query the active entries we used:
SELECT
entry_id, title, text, index1, index2
FROM
Entries INNER JOIN EntryProperties
ON Entries.id = EntryProperties.entry_id
AND Entries.date_to IS NULL
AND EntryProperties.date_to IS NULL
The only concern was the situation where an entry is removed (so we set its date_to) and then restored by an admin. With the given scheme there's no way to track that kind of change.
The overall downside of any approach like this is obvious: you have to write tons of T-SQL where a non-versioned DB would get by with something like select A join B.