Designing complex relations tables - mysql

I have some problems with database design.
I have some item types, let's say those are "news", "articles", "files", etc.
I also have a "categories" table that stores the categories for each type. The type is specified in a special field, "category_type", whose values are constants defined in my application code: news = 0, articles = 1, files = 2, etc.
Now the question is: what is the correct way to build the linking tables?
One way: I can create a separate table for each type: "news2categories", "articles2categories", "files2categories", etc.
The second option is to build one global table with 3 fields: "item_id (int), item_type (int), category_id (int)". I already have one global table for categories, with the type distinguished by a single field. But is it the right way? I don't want to spawn dozens of identical tables, but on the other hand relations through one table with multiple types seem too abstract and complex. Please advise.
The DB will be used mainly through the Yii framework, if that matters for solving this problem.

What is the difference between articles, news, and files? Can each of these have multiple categories (many-to-many)? Assuming yes on the many-to-many need of multiple categories for each type (article, news, etc...) and that they are each sufficiently different from each other, I'd have the following tables:
category
news
category_news
article
article_category
file
category_file
Each category linking table would have 2 columns: category_id and item_id (news_id, article_id, etc...)
I use the db coding standard of naming all db tables in the singular (feel free to change to plural -ack- if Yii does otherwise) and naming each linking table after the two tables it links, separated by an underscore (names in alphabetical order).
Whatever you do, be consistent.
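As a rough illustration of the per-type linking tables described above, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for MySQL so it runs anywhere; the title and name columns and the sample rows are assumptions, not part of the answer.

```python
import sqlite3

# SQLite stands in for MySQL; table names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE news     (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE article  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- One linking table per item type, two columns each.
CREATE TABLE category_news (
    category_id INTEGER NOT NULL REFERENCES category(id),
    news_id     INTEGER NOT NULL REFERENCES news(id),
    PRIMARY KEY (category_id, news_id)
);
CREATE TABLE article_category (
    article_id  INTEGER NOT NULL REFERENCES article(id),
    category_id INTEGER NOT NULL REFERENCES category(id),
    PRIMARY KEY (article_id, category_id)
);

-- Sample data (made up for the example).
INSERT INTO category VALUES (1, 'Sports'), (2, 'Local');
INSERT INTO news VALUES (10, 'Cup final tonight');
INSERT INTO category_news VALUES (1, 10), (2, 10);
""")

# All categories attached to one news item.
news_categories = [row[0] for row in conn.execute("""
    SELECT c.name FROM category c
    JOIN category_news cn ON cn.category_id = c.id
    WHERE cn.news_id = ? ORDER BY c.name""", (10,))]
print(news_categories)
```

Because each linking table points at exactly one item table, the database can enforce both foreign keys, which a single table with an item_type column cannot do.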

The first way seems problematic, since you might want your design to stay flexible as types are added or removed. If you add a new type, you will have to add a new table, which I feel is not a good design.
How about this design.
Table : Category,
Columns :
CategoryID (PK)
TypeID (FK to Types table.)
Table : Types, Columns: TypeID (PK), TypeValue
Store the types (news, article, etc.) in the Types table. This makes the design more flexible when adding or removing types and categories.
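A small sketch of the Types/Category layout just described, again using Python's sqlite3 in place of MySQL. The Name column on Category, the 'videos' type and the sample categories are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Types (
    TypeID    INTEGER PRIMARY KEY,
    TypeValue TEXT NOT NULL UNIQUE
);
CREATE TABLE Category (
    CategoryID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,   -- assumed column, not named in the answer
    TypeID     INTEGER NOT NULL REFERENCES Types(TypeID)
);
INSERT INTO Types VALUES (0, 'news'), (1, 'articles'), (2, 'files');
INSERT INTO Category VALUES (1, 'Sports', 0), (2, 'Tutorials', 1);
""")

# Adding a type later is a row insert, not a schema change.
# 'videos' is a hypothetical new type.
with conn:
    conn.execute("INSERT INTO Types VALUES (3, 'videos')")
    conn.execute("INSERT INTO Category VALUES (3, 'Trailers', 3)")

categories_by_type = conn.execute("""
    SELECT t.TypeValue, c.Name
    FROM Category c JOIN Types t ON t.TypeID = c.TypeID
    ORDER BY c.CategoryID""").fetchall()
print(categories_by_type)
```

Compared with constants kept only in application code, the Types table lets the database reject a category whose type does not exist.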

Related

Structure of database

I'm creating a gaming website that will have a database of items found in a game (several thousand) and I'm wondering what best practice would be for the database structure.
There are 5 item types: weapons, armor, potions, quest items and junk. Each of these has unique attributes that don't apply to other items. For example, weapons can be different types (i.e. two-handed, main-hand, off-hand), armor can be different weights (plate, leather), etc.
Everything is an 'item', but I'm wondering if I should have a single table of items, i.e.:
item id, item_type ... item_weapon_type_id, item_armor_weight_id
Thus having all possible attributes in one table.
Or separate tables for each item type, i.e.:
weapon_id, weapon_type, weapon_name
I am wondering this since each item type has a number of unique attributes, which means a single table of items would have a lot of columns, many of which don't apply to 80% of the items in the table.
Thanks in advance for any help.
I think it would be useful to have a base table for item types, called Item-Base-Type, containing the properties shared by all item types.
For each item type (like shield) there will be a separate table (like shield-type) with a one-to-one relationship to Item-Base-Type.
The Item table, which contains all items of the game, will have a foreign key to Item-Base-Type.
I think you will also need to add some helper tables, like Effect, to the model.
is-quest-item will be a property of the Item table.
Having a look at the wiki may help.
Have two tables:
Weapons
- WeaponId
- Description
WeaponType
- WeaponTypeId
- WeaponId
- Description (one-handed, two-handed, etc.)
This way you can have the same weapon but equipped to one hand or two hands.
Separating out the tables will normalize the database and also improve performance. For example, you are not selecting repeating rows when you want a basic list of Weapons available.
You can make two tables:
items
id - PK
type
name
some_attribute
some_other_attribute
item_attributes
item_id - PK
attribute_name - PK
attribute_value
Put the common attributes (e.g. price,weight...) in the items table, and the rest of them in the attributes table.
Also, don't worry too much if some of the fields in items are not used for all of the types. 2-3 enum fields should be sufficient to describe the types, armor 'heavy-ness', weapon types, etc... of the items (at least in the games I have seen).
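The two-table layout above (common columns in items, everything else in item_attributes) could look roughly like this. It is a sketch using Python's sqlite3 rather than MySQL; the price and weight columns follow the examples in the answer, while the helper load_item and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE items (
    id     INTEGER PRIMARY KEY,
    type   TEXT NOT NULL,
    name   TEXT NOT NULL,
    price  INTEGER,            -- common attributes shared by every item
    weight REAL
);
CREATE TABLE item_attributes (
    item_id         INTEGER NOT NULL REFERENCES items(id),
    attribute_name  TEXT NOT NULL,
    attribute_value TEXT NOT NULL,
    PRIMARY KEY (item_id, attribute_name)   -- composite key, as in the answer
);
-- Sample rows (made up for the example).
INSERT INTO items VALUES (1, 'weapon', 'Rusty Sword', 12, 3.5),
                         (2, 'armor',  'Leather Vest', 20, 4.0);
INSERT INTO item_attributes VALUES (1, 'weapon_type',  'main-hand'),
                                   (2, 'armor_weight', 'leather');
""")

def load_item(item_id):
    """Return the common columns plus type-specific attributes as one dict."""
    row = conn.execute(
        "SELECT id, type, name, price, weight FROM items WHERE id = ?",
        (item_id,)).fetchone()
    if row is None:
        return None
    item = dict(zip(("id", "type", "name", "price", "weight"), row))
    # Each attribute row is a (name, value) pair, so dict() can consume it directly.
    item["attributes"] = dict(conn.execute(
        "SELECT attribute_name, attribute_value FROM item_attributes "
        "WHERE item_id = ?", (item_id,)))
    return item

sword = load_item(1)
print(sword)
```

The trade-off is that attribute values are stored as text, so any type checking on them has to happen in the application.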

Database Normalization - I think?

We have a J2EE content management and e-commerce system, and in this system – for sake of a simple example – let’s say that we have 100 objects. All of these objects extend the same base class, and all share many of the same fields.
Let’s take two objects as an example: a news item that would be posted on a website, and a product that would be sold on a website. Both of these share common properties:
IDs: id, client ID, parent ID (long)
Flags: deleted, archived, inactive (boolean)
Dates: created, modified, deleted (datetime)
Content: name, description
And of course they have some properties that are different:
News item: author, posting date
Product: price, tax
So (finally) here is my question. Let’s say we have 100 objects in our system, and they all follow this pattern. They have many fields that overlap, and some unique fields. In terms of a relational database, would we be better off with:
Option One: Less Tables, Common Tables
table_id: id, client ID, parent ID (long) (id is the primary key, a GUID for all objects)
table_flag: id, deleted, archived, inactive (boolean)
table_date: id, created, modified, deleted (datetime)
table_content: id, name, description
table_news: id, author, posting date
table_product: id, price, tax
Option Two: More Tables, Common Fields Repeated
table_news: id, client ID, parent ID, deleted, archived, inactive, name, description, author, posting date
table_product: id, client ID, parent ID, deleted, archived, inactive, name, description, price, tax
For full disclosure – I am a developer and not a DBA, and because of that I prefer option one. But there is another team member that prefers option two, and I think he makes valid points.
Option One: Pros and Cons
Pro: Encapsulates common fields into common tables.
Pro: Need to change a common field? Change it in one place.
Pro: Only creates new fields/tables when they are needed.
Pro: Easier to create the queries dynamically, less repetitive code
Con: More joining to create objects (not sure of DB impact on that)
Con: More complex queries to store objects (not sure of DB impact on that)
Con: Common tables will become huge over time
Option Two: Pros and Cons
Pro: Perhaps it is better to distribute the load of all objects across tables?
Pro: Could index the news table on the client ID, and index the product table on the parent ID.
Pro: More readable to human eye: easy to see all the fields for an object in one table.
My Two Cents
For me, I much prefer the elegance of the first option – but maybe that is me trying to force object oriented patterns on a relational database. If all things were equal, I would go with option one UNLESS a DB expert told me that when we have millions of objects in the system, option one is going to create a performance problem.
Apologies for the long winded question. I am not great with DB lingo, so I probably could have summarized this more succinctly if I better understood terms like normalization. I tried to search for answers on this topic, and while I found many that were close (I suspect this is a common DB issue) I could not find any that answered all my questions. I read through this article on normalization:
But I did not totally understand it. On the one hand it was saying that you should remove any redundancies. But on the other hand, it was saying that each attribute should define only one object.
Thanks,
John
You should read Patterns of Enterprise Application Architecture by Martin Fowler. He writes about several options for the scenario you describe:
Single Table Inheritance: One table for all object subtypes. Stores all attributes, setting them NULL where they are inapplicable to the row's object subtype.
Class Table Inheritance: One table for column common to all subtypes, then one table for each subtype to store subtype-specific columns.
Concrete Table Inheritance: One table for each subtype, storing both subtype-specific columns and columns common to all subtypes.
Serialized LOB: One table for all object subtypes. Store common attributes as conventional columns, but combine optional or subtype-specific columns as fields in a BLOB that stores XML or JSON or whatever format you want.
Each one of these designs has pros and cons, so choose a solution depending on the most common way you access your data.
However, notice I use the word subtype above. I would use these designs only if the different object types are subtypes of a common base class. I'm assuming that News item and Product don't actually share a logical base class (besides Object); they are not subtypes of a common superclass.
So for the sake of OO design, I would choose Concrete Table Inheritance. This avoids any inappropriate coupling between these subtypes. There are columns the two tables have in common, but they basically amount to bookkeeping, not anything to do with the function of the class and hence the table.
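To make the Concrete Table Inheritance choice more tangible, here is a minimal sketch using Python's sqlite3 in place of the poster's actual database. Only a few of the bookkeeping columns from the question are included, and all sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Bookkeeping columns are repeated in every table; nothing is shared.
CREATE TABLE news (
    id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0, created TEXT NOT NULL,
    name TEXT NOT NULL, description TEXT,
    author TEXT, posting_date TEXT
);
CREATE TABLE product (
    id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0, created TEXT NOT NULL,
    name TEXT NOT NULL, description TEXT,
    price REAL, tax REAL
);
-- Sample rows (made up for the example).
INSERT INTO news (client_id, created, name, author, posting_date)
    VALUES (7, '2024-01-02', 'Store opening', 'John', '2024-01-03');
INSERT INTO product (client_id, created, name, price, tax)
    VALUES (7, '2024-01-05', 'Widget', 9.99, 0.80);
""")

# Loading one object is a single-table read with no joins.
widget = conn.execute(
    "SELECT name, price, tax FROM product WHERE name = ?",
    ("Widget",)).fetchone()

# The cost shows up when a query spans types: it needs a UNION.
recent = conn.execute("""
    SELECT 'news' AS kind, name, created FROM news WHERE client_id = ?
    UNION ALL
    SELECT 'product', name, created FROM product WHERE client_id = ?
    ORDER BY created""", (7, 7)).fetchall()
print(widget, recent)
```

If cross-type queries like the second one turn out to be common, that is an argument for one of the other patterns in the list.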

Database structure for a list of items ( very different )

I came to the following problem:
I have a list of users that can create lists
I have tables that contains entities like:
- movies - with different properties
- pictures - with different properties
- people profiles - with different properties
- X type - with different properties ( a to do list).
A user can create a list (to which other users can add elements) containing any of the above types (for example, a list with movies, actors, and picture galleries).
How can I store that list efficiently and without many problems :)
Thanks in advance
If I understand your question correctly, you should create a separate table Entity that would be a base type for all your movies, pictures, etc., a separate table ListsItems containing the columns (list_id, entity_id), and a table UsersLists of (user_id, list_id).
UsersLists would contain the mapping of users to lists (every user may have many lists), ListsItems would contain the mapping of lists to entities (every list may have many entities), and Entity would contain the entity type (movie, picture, whatever) and the specific entity ID that points into its native table (Movie, Picture, etc).
Aufziehvogel asked you about the number of types because it determines how the entity type field should be designed. If you have a finite, predefined number of types, you may make the entity_type column an enumeration, but if users should be able to create their own types, that's a more complicated problem, and a table SpecifiedEntity should replace the specific tables (Movies, Pictures, etc).
Reading about normalization of relational databases will help you understand this issue in more detail.
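The Entity / ListsItems / UsersLists layout from this answer might be sketched as follows, with Python's sqlite3 standing in for the real database. The column name native_id, the two native tables shown, and the sample rows are assumptions; the answer does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Movie   (id INTEGER PRIMARY KEY, title   TEXT NOT NULL);
CREATE TABLE Picture (id INTEGER PRIMARY KEY, caption TEXT NOT NULL);

CREATE TABLE Entity (
    entity_id   INTEGER PRIMARY KEY,
    -- A fixed set of types, playing the role of an enumeration.
    entity_type TEXT NOT NULL CHECK (entity_type IN ('movie', 'picture')),
    native_id   INTEGER NOT NULL,  -- id in Movie or Picture; not FK-enforced
    UNIQUE (entity_type, native_id)
);
CREATE TABLE UsersLists (
    user_id INTEGER NOT NULL,
    list_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, list_id)
);
CREATE TABLE ListsItems (
    list_id   INTEGER NOT NULL,
    entity_id INTEGER NOT NULL REFERENCES Entity(entity_id),
    PRIMARY KEY (list_id, entity_id)
);

-- Sample rows (made up for the example).
INSERT INTO Movie   VALUES (1, 'Alien');
INSERT INTO Picture VALUES (1, 'Sunset');
INSERT INTO Entity  VALUES (100, 'movie', 1), (101, 'picture', 1);
INSERT INTO UsersLists VALUES (5, 50);
INSERT INTO ListsItems VALUES (50, 100), (50, 101);
""")

# Resolve every entry of list 50 to a label from its native table.
list_contents = conn.execute("""
    SELECT e.entity_type, COALESCE(m.title, p.caption) AS label
    FROM ListsItems li
    JOIN Entity e ON e.entity_id = li.entity_id
    LEFT JOIN Movie   m ON e.entity_type = 'movie'   AND m.id = e.native_id
    LEFT JOIN Picture p ON e.entity_type = 'picture' AND p.id = e.native_id
    WHERE li.list_id = ?
    ORDER BY e.entity_id""", (50,)).fetchall()
print(list_contents)
```

Note that native_id cannot be a real foreign key because it points into different tables depending on entity_type, so keeping it consistent is left to the application in this sketch.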
The solution you're looking for (and that Mkaz started describing) is a database pattern called Disjoint Subtypes.
I battled with the same problem a while ago, here's what I found:
Object-oriented-like structures in relational databases
Polymorphism in SQL database tables?

Shared Primary Key

I would guess this is a semi-common question but I can't find it in the list of past questions. I have a set of tables for products which need to share a primary key index. Assume something like the following:
product1_table:
id,
name,
category,
...other fields
product2_table:
id,
name,
category,
...other fields
product_to_category_table:
product_id,
category_id
Clearly it would be useful to have a shared index between the two product tables. Note, the idea of keeping them separate is because they have largely different sets of fields beyond the basics, however they share a common categorization.
UPDATE:
A lot of people have suggested table inheritance (or gen-spec). This is an option I'm aware of, but since other database systems let me share a sequence between tables, I was hoping MySQL had a similar solution. Based on the responses, I shall assume it doesn't. I guess I'll have to go with table inheritance... Thank you all.
It's not really common, no. There is no native way to share a primary key. What I might do in your situation is this:
product_table
id
name
category
general_fields...
product_type1_table:
id
product_id
product_type1_fields...
product_type2_table:
id
product_id
product_type2_fields...
product_to_category_table:
product_id
category_id
That is, there is one master product table that has entries for all products and has the fields that generalize between the types, and type-specified tables with foreign keys into the master product table, which have the type-specific data.
A better design is to put the common columns in one products table, and the special columns in two separate tables. Use the product_id as the primary key in all three tables, but in the two special tables it is, in addition, a foreign key back to the main products table.
This simplifies the basic product search for ids and names by category.
Note, also that your design allows each product to be in one category at most.
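A sketch of the design in this answer, where product_id is the primary key of all three tables and also a foreign key in the two special tables. Python's sqlite3 is used in place of MySQL; the table names product_type1 and product_type2, the voltage and fabric columns, and the helper add_type1 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    category   TEXT NOT NULL
);
-- product_id is both PK and FK, so each row extends exactly one product.
CREATE TABLE product_type1 (
    product_id INTEGER PRIMARY KEY REFERENCES products(product_id),
    voltage    INTEGER             -- hypothetical type-specific column
);
CREATE TABLE product_type2 (
    product_id INTEGER PRIMARY KEY REFERENCES products(product_id),
    fabric     TEXT                -- hypothetical type-specific column
);
""")

def add_type1(name, category, voltage):
    """Insert the common row first, then reuse its id in the special table."""
    with conn:  # both inserts commit or roll back together
        cur = conn.execute(
            "INSERT INTO products (name, category) VALUES (?, ?)",
            (name, category))
        product_id = cur.lastrowid
        conn.execute("INSERT INTO product_type1 VALUES (?, ?)",
                     (product_id, voltage))
    return product_id

drill_id = add_type1("Drill", "tools", 18)

# The basic search by category touches only the common table.
names_in_tools = [r[0] for r in conn.execute(
    "SELECT name FROM products WHERE category = ? ORDER BY name", ("tools",))]
print(drill_id, names_in_tools)
```

Since ids are generated only by the products table, they are automatically unique across both product types.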
It seems you are looking for table inheritance.
You could use a common table product with attributes common to both product1 and product2, plus a type attribute which could be either "product2" or "product1"
Then tables product1 and product2 would have all their specific attributes and a reference to the parent table product.
product:
id,
name,
category,
type
product1_table:
id,
#product_id,
product1_specific_fields
product2_table:
id,
#product_id,
product2_specific_fields
First let me state that I agree with everything that Chaos, Larry and Phil have said.
But if you insist on another way...
There are two reasons for wanting a shared PK: one, uniqueness across the two tables, and two, complete referential integrity.
I'm not sure exactly what "sequence" features auto_increment columns support. It seems there is a system setting that defines the increment value, but nothing per column.
What I would do in Oracle is just share the same sequence between the two tables. Another technique would be to set a STEP value of 2 in the auto_increment and start one at 1 and the other at 2. Either way, you're generating unique values between them.
You could create a third table that has nothing but the PK Column. This column could also provide the Autonumbering if there's no way of creating a skipping autonumber within one server. Then on each of your data tables you'd add CRUD triggers. An insert into either data table would first initiate an insert into the pseudo index table (and return the ID for use in the local table). Likewise a delete from the local table would initiate a delete from the pseudo index table. Any children tables which need to point to a parent point to this pseudo index table.
Note this will need to be a per row trigger and will slow down crud on these tables. But tables like "product" tend NOT to have a very high rate of DML in the first place. Anyone who complains about the "performance impact" is not considering scale.
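The key-table idea can be approximated as below. This is a sketch in Python's sqlite3, not MySQL, and it departs from the answer in one respect: the id is allocated in application code (the hypothetical helper insert_product) rather than by an insert trigger, because a SQLite trigger cannot hand a generated id back to the row being inserted. Only the delete side uses per-row triggers. Table and trigger names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- The third table holds nothing but the shared key.
CREATE TABLE product_key (id INTEGER PRIMARY KEY AUTOINCREMENT);

CREATE TABLE product1 (
    id   INTEGER PRIMARY KEY REFERENCES product_key(id),
    name TEXT NOT NULL
);
CREATE TABLE product2 (
    id   INTEGER PRIMARY KEY REFERENCES product_key(id),
    name TEXT NOT NULL
);

-- Deleting a data row releases its entry in the key table.
CREATE TRIGGER product1_release AFTER DELETE ON product1
BEGIN
    DELETE FROM product_key WHERE id = OLD.id;
END;
CREATE TRIGGER product2_release AFTER DELETE ON product2
BEGIN
    DELETE FROM product_key WHERE id = OLD.id;
END;
""")

def insert_product(table, name):
    """Allocate an id from product_key, then insert into the data table."""
    if table not in ("product1", "product2"):
        raise ValueError(f"unknown table: {table}")
    with conn:  # key row and data row are written in one transaction
        new_id = conn.execute(
            "INSERT INTO product_key DEFAULT VALUES").lastrowid
        conn.execute(f"INSERT INTO {table} (id, name) VALUES (?, ?)",
                     (new_id, name))
    return new_id

ids = [insert_product("product1", "Drill"),
       insert_product("product2", "Shirt"),
       insert_product("product1", "Saw")]

with conn:
    conn.execute("DELETE FROM product1 WHERE id = ?", (ids[0],))

remaining_keys = [r[0] for r in conn.execute(
    "SELECT id FROM product_key ORDER BY id")]
print(ids, remaining_keys)
```

Child tables that need to reference "any product" would point at product_key, which is what gives this arrangement its referential integrity.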
Please note, this is provided as a functioning alternative and not my recommendation as the best way.
You can't "share" a primary key.
Without knowing all the details, my best advice is to combine the tables into a single product table. Having optional fields that are populated for some products and not others is not necessarily a bad design.
Another option is to have a sort of inheritance model, where you have a single product table, and then two product "subtype" tables, which reference the main product table and have their own specialized set of fields. Querying this model is more painful than a single table IMHO, which is why I see it as the less-desirable option.
Your explanation is a little vague but, from my basic understanding I would be tempted to do this
The product table contains common fields
product
-------
product_id
name
...
The product_extra1 and product_extra2 tables contain the differing fields.
These tables have a one-to-one relationship with product, enforced between product.product_id and
product_extra1.product_id, etc. Enforce the one-to-one relationship by putting a unique constraint on product_id in the foreign key tables (product_extra1, etc.).
You will need to decide on the business rules for how this data is populated.
product_extra1
---------------
product_id
extra_field1
extra_field2
....
product_extra2
---------------
product_id
different_extra_field1
different_extra_field2
....
Based on what you have above the product_category table is an intersecting table (1 to many - many to 1) which would imply that each product can be related to many categories
This can now stay the same.
This is yet another case of gen-spec.
See previous discussion

Super general database structure

Say I have a store that sells products that fall under various categories... and each category has associated properties... like a drill bit might have coating, diameter, helix angle, or whatever. The issue is that I'd like the user to be able to edit these properties. If I wasn't interested in having the user change the properties, and I was building the store for a certain set of categories, I'd have one table for drill bits, etc. Alternatively, I could just modify the schema online but that doesn't seem to be done very often (unless we're talking phpmyadmin or something), and plus that doesn't fit in well at all with the way models are coupled to tables.
In general, I'm interested in implementing a multi-table database structure with various datatypes (because diameter might be a decimal, coating would be a string/index into a table, etc), within mysql. Any idea how this might be done?
If I understand correctly what you're asking, an admittedly hacky solution would be to have a products table with two related tables, product_properties and product_properties_lookup (or some better name), where product_properties_lookup has an entry for every possible property a product can have and product_properties contains the value of a property as a string, along with the ID of the property and the ID of the product. You could then coerce the property value into whatever type you wanted. Not ideal, but I'm not sure what else to do short of adding individual columns to the DB for property types.
Just use the database. It does all of this already. For free. And fast. How is having a table of products point to a table of properties with data types any different from a table with columns? It's not. Except that if you use the DB's tables, you get to use SQL to query them in all sorts of neat and efficient ways compared to your own (crosstabs suck in SQL DBs).
Get a new product, make a new table. No big deal. Get a new property, alter the table. If you have 1M products in that table, yea, it may be a slow update (depends on the DB). Do you have 1M products? I don't think WalMart has 1M products.
Building Databases on top of Databases is a silly thing. Just use the one that's there. It is putty in your hands. Mold it to your whim.
Create a Property table first. This will contain all properties. It should have (at minimum) a Name column and a Type column ('string', 'boolean', 'decimal', etc.). Note: Primary keys are implied for all these tables.
Next, create a CategoryProperty table. Here you will be able to assign properties to a category. It should have these columns: CategoryID, PropertyID. Both foreign keys.
Then, create a Category table. This describes the categories. It should have a Name column and possibly some other columns like Description.
Then, create a ProductCategory table. Here, you will assign the categories for each product. It should have these columns: CategoryID, ProductID. Both foreign keys.
Next, create a PropertyValue table. Here, you will "instantiate" the properties and give them values. Columns include ProductID, PropertyID, and PropertyValue. The primary key can consist of ProductID and PropertyID.
Finally, create a Product table that just describes each product with columns like Name, Price, etc.
Note how for each relationship there is a separate table. If you only want one category for each product, you can do away with the ProductCategory table and just put a CategoryID field in the Product table. Similarly, if you want each property to belong to only one category, you can put a PropertyID column in the Category table and get rid of the CategoryProperty table.
Lastly, you will not be able to verify the data type for each property since each property has a different type (and they are rows, not columns). So just make the PropertyValue column a string and then perform your validation either as a trigger, or in your application, by checking the Type column of the Property table for that property.
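A cut-down sketch of this design with the validation done in the application, using Python's sqlite3 instead of MySQL. The Category, CategoryProperty and ProductCategory tables are left out to keep it short; the helpers is_valid and set_property, the accepted boolean spellings, and the sample data are assumptions.

```python
import sqlite3
from decimal import Decimal, InvalidOperation

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Property (
    PropertyID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,
    Type       TEXT NOT NULL       -- 'string', 'boolean' or 'decimal'
);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE PropertyValue (
    ProductID     INTEGER NOT NULL REFERENCES Product(ProductID),
    PropertyID    INTEGER NOT NULL REFERENCES Property(PropertyID),
    PropertyValue TEXT NOT NULL,   -- always a string, validated in the app
    PRIMARY KEY (ProductID, PropertyID)
);
INSERT INTO Property VALUES (1, 'diameter', 'decimal'), (2, 'coating', 'string');
INSERT INTO Product  VALUES (1, 'Drill bit');
""")

def is_valid(value, type_name):
    """Check a string value against the Type recorded for its property."""
    if type_name == "string":
        return True
    if type_name == "boolean":
        return value in ("true", "false")
    if type_name == "decimal":
        try:
            return Decimal(value).is_finite()
        except InvalidOperation:
            return False
    return False  # unknown type names are rejected

def set_property(product_id, property_id, value):
    row = conn.execute("SELECT Type FROM Property WHERE PropertyID = ?",
                       (property_id,)).fetchone()
    if row is None or not is_valid(value, row[0]):
        raise ValueError(f"invalid value {value!r} for property {property_id}")
    with conn:
        # INSERT OR REPLACE is SQLite syntax; MySQL would use REPLACE INTO
        # or INSERT ... ON DUPLICATE KEY UPDATE.
        conn.execute("INSERT OR REPLACE INTO PropertyValue VALUES (?, ?, ?)",
                     (product_id, property_id, value))

set_property(1, 1, "6.35")
set_property(1, 2, "titanium nitride")
try:
    set_property(1, 1, "six")
    rejected = False
except ValueError:
    rejected = True

stored = conn.execute(
    "SELECT PropertyID, PropertyValue FROM PropertyValue "
    "ORDER BY PropertyID").fetchall()
print(stored, rejected)
```

The same check could live in a trigger instead, as the answer mentions; doing it in the application just keeps the rules next to the code that knows what each type means.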
If you're using a recentish version of mysql (5.1.5 or greater) you can store your data as XML in the database. You can then query that data using things like this.
Suppose I have a table that contains some items, and one kind of item is a widgetpack that contains numerous widgets. I can get my total number of widgets:
SELECT SUM( EXTRACTVALUE( infoxml, '/info/widget_count/text()' ) ) AS widget_count
FROM items -- table name assumed; the original query omitted the FROM clause
WHERE product_type = 'widgetpack'
assuming the table has an infoxml column and each widgetpack's infoxml column contains XML that looks like this
<info>
<widget_count>10</widget_count>
<!-- Any other unstructured info can go in here too -->
</info>
DB purists will cringe at this, and it is kinda hacky. But often it's easier to keep all your unstructured data in one place.
Have a look at this database schema on DatabaseAnswers.org:
http://www.databaseanswers.org/data_models/products_and_generic_characteristics/index.htm
Maybe consider an Entity-Attribute-Value (EAV) approach (not for the whole model of course!).
Related questions
Entity Attribute Value Database vs. strict Relational Model Ecommerce question
Approach to generic database design
How do you build extensible data model