One to One Master-Detail Relationship in Database Design - mysql

I am designing a MySQL schema for a website that lets visitors first view a list of products (by browsing or searching) and then view each product in detail. Only certain fields are displayed in the listing view, and a lot more information is displayed in the detail view, including some fairly big columns (e.g., the product description). The design is quite common. I am wondering what is a good way to store the product information.
One choice is to have just one PRODUCT table. The list view will select a subset of the fields, say name, price, and the main product picture, but not other fields like description, which could be fairly big, say a VARCHAR(2000). My question is: if I do a select name, price, main_pic from product where ..., will the description field also get loaded into memory by the MySQL engine and thus consume more space?
If MySQL does load the unselected fields into memory, or if I simply want to keep the details in another table, there could be a PRODUCT_DETAIL table. Would this be a good design? I feel a little weird since both PRODUCT and PRODUCT_DETAIL would then have the same primary key, and because of the master-detail relationship between the two, the primary key of PRODUCT_DETAIL would also be a foreign key referencing the primary key of PRODUCT! Does anyone actually use this kind of design?

This similar question may help you:
select * vs select column
And yes, master-detail relationships like this are commonly used and are a good way to design.
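As a rough illustration of the PRODUCT / PRODUCT_DETAIL idea from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the column names and sample data are assumptions, and the MySQL DDL would differ in types and syntax.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    price    REAL NOT NULL,
    main_pic TEXT
);
-- The detail table's primary key is also a foreign key to product.id,
-- so each product can have at most one detail row.
CREATE TABLE product_detail (
    product_id  INTEGER PRIMARY KEY REFERENCES product(id),
    description TEXT
);
""")
conn.execute("INSERT INTO product VALUES (1, 'Lamp', 19.5, 'lamp.jpg')")
conn.execute("INSERT INTO product_detail VALUES (1, 'A long description...')")

# Listing view: reads only the narrow table.
listing = conn.execute("SELECT name, price, main_pic FROM product").fetchall()

# Detail view: join on the shared key.
detail = conn.execute(
    "SELECT p.name, d.description FROM product p "
    "JOIN product_detail d ON d.product_id = p.id WHERE p.id = ?",
    (1,),
).fetchone()
```

The shared key is what makes the relationship one-to-one: a second detail row for the same product would violate the primary key.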

Related

Optimising a database with two separate category tables

I have a database for a website that provides all the data storage capabilities of the website. It stores articles in a knowledge base, and services for internal and end-user access.
Both articles and services are stored in categories, which can have an indefinite number of parent categories by self-referencing. It is possible to add multiple categories to either via the connecting table.
It needs to be possible to find the categories of a service or an article, including all the way up the category-parent tree. It also needs to be possible to find the services or articles of a category. Of course, a category can't have both.
Is this an optimal way of doing this? It doesn't feel right, and I'd welcome alternate ideas.
EDIT: Does this way usually work? The categories all have roughly the same content, just a name and description and perhaps an image.
The primary keys of category_service and category_article should include both fields in the respective tables (if a category can have more than one service or article). Also, do you really need a VARCHAR(45) type indicator? I recommend a short ENUM instead.
Otherwise, the basic design in the second diagram looks good. I suggest you add a closure table for efficiently querying recursive hierarchies.
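A closure table stores one row for every ancestor/descendant pair, so the whole parent chain of a category can be read with a single join instead of a recursive walk. The following is only a sketch under assumed table and column names (category, category_closure, depth), again using sqlite3 in place of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES category(id)
);
-- One row per (ancestor, descendant) pair, including each node with itself.
CREATE TABLE category_closure (
    ancestor_id   INTEGER NOT NULL REFERENCES category(id),
    descendant_id INTEGER NOT NULL REFERENCES category(id),
    depth         INTEGER NOT NULL,
    PRIMARY KEY (ancestor_id, descendant_id)
);
""")

def add_category(cat_id, name, parent_id=None):
    conn.execute("INSERT INTO category VALUES (?, ?, ?)", (cat_id, name, parent_id))
    # Every node is its own ancestor at depth 0.
    conn.execute("INSERT INTO category_closure VALUES (?, ?, 0)", (cat_id, cat_id))
    if parent_id is not None:
        # Copy every ancestor of the parent, one level deeper.
        conn.execute(
            "INSERT INTO category_closure (ancestor_id, descendant_id, depth) "
            "SELECT ancestor_id, ?, depth + 1 FROM category_closure "
            "WHERE descendant_id = ?",
            (cat_id, parent_id),
        )

def ancestors(cat_id):
    # Nearest ancestor first; no recursion needed.
    rows = conn.execute(
        "SELECT c.name FROM category_closure cc "
        "JOIN category c ON c.id = cc.ancestor_id "
        "WHERE cc.descendant_id = ? AND cc.depth > 0 ORDER BY cc.depth",
        (cat_id,),
    )
    return [r[0] for r in rows]

add_category(1, "Hardware")
add_category(2, "Printers", parent_id=1)
add_category(3, "Laser printers", parent_id=2)
```

The cost is extra rows and extra work on insert or move; the benefit is that "all ancestors" and "all descendants" become plain indexed lookups.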
If you want to enforce consistency between the category type and records in category_article/category_service, you can duplicate the type indicator in those tables and include it in the foreign key constraint. Yes, doing so feels redundant, but it's effective. Resist the temptation to combine these two tables: mixing values from different domains in a single column usually leads to more difficulties.
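Duplicating the type indicator and putting it into a composite foreign key might look like the sketch below. The names and the two type values are assumptions, sqlite3 stands in for MySQL (where the type column could be an ENUM), and the foreign key from article_id to an articles table is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE category (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL CHECK (type IN ('article', 'service')),
    UNIQUE (id, type)          -- target of the composite foreign key
);
CREATE TABLE category_article (
    category_id INTEGER NOT NULL,
    article_id  INTEGER NOT NULL,
    -- Duplicated type indicator, pinned to 'article' in this table.
    type TEXT NOT NULL DEFAULT 'article' CHECK (type = 'article'),
    PRIMARY KEY (category_id, article_id),
    FOREIGN KEY (category_id, type) REFERENCES category (id, type)
);
""")
conn.execute("INSERT INTO category VALUES (1, 'How-tos', 'article')")
conn.execute("INSERT INTO category VALUES (2, 'Hosting', 'service')")

def try_link_article(category_id, article_id):
    """Return True if the link was stored, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO category_article (category_id, article_id) VALUES (?, ?)",
            (category_id, article_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Linking an article to category 1 succeeds, while linking one to category 2 fails, because no (2, 'article') row exists in category. A category_service table would mirror this with type pinned to 'service'.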

What's the preferred method for providing customizable categories to a database?

For example, I have a contacts database that has some basic information about a contact. First name, last name, phone number, etc.
Although my options can encompass some of the obvious contact details, it would need to have a method by which users could add their own custom fields, like 'website', 'widget_1', and so on.
My first thought was to add some miscellaneous columns to the contacts table after first_name, last_name, etc., and make them a large varchar data type so somebody could store any information there. That seems sloppy, and even then we couldn't expand contact details past the number of miscellaneous fields.
Optimally I'd like a user to click something like 'add a custom field', then populate it with data. What is an intelligent method of doing this without muddying the database?
The easiest way to do this is to have another table that looks something like this:
contact_id INT PK,
field_name VARCHAR(64) PK,
field_value TEXT
You can have the contact_id field be a foreign key to your contacts table, then join on this table whenever you want to read in the custom fields.
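A minimal sketch of that table in use follows. The table name contact_custom_fields, the helper functions, and the sample data are assumptions; sqlite3 stands in for MySQL, so the upsert syntax in particular would differ there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE contacts (
    id         INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name  TEXT
);
CREATE TABLE contact_custom_fields (
    contact_id  INTEGER NOT NULL REFERENCES contacts(id),
    field_name  TEXT NOT NULL,
    field_value TEXT,
    PRIMARY KEY (contact_id, field_name)   -- one value per field per contact
);
""")
conn.execute("INSERT INTO contacts VALUES (1, 'Ada', 'Lovelace')")

def set_field(contact_id, name, value):
    # INSERT OR REPLACE is SQLite syntax; MySQL has its own upsert forms.
    conn.execute(
        "INSERT OR REPLACE INTO contact_custom_fields VALUES (?, ?, ?)",
        (contact_id, name, value),
    )

def custom_fields(contact_id):
    rows = conn.execute(
        "SELECT field_name, field_value FROM contact_custom_fields "
        "WHERE contact_id = ?",
        (contact_id,),
    )
    return dict(rows.fetchall())

set_field(1, "website", "https://example.org")
set_field(1, "widget_1", "blue")
set_field(1, "widget_1", "green")   # overwrites the earlier value
```

"Add a custom field" in the UI then becomes a plain insert, with no schema change.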
There's no good solution that really fits the relational database paradigm. Allowing each user to pick his or her custom fields to supplement the conventional columns fundamentally breaks the definition of a relation.
Nevertheless, what you describe is a common requirement of data-driven applications. I did a presentation showing options, and their pros and cons: Extensible Data Modeling
You may also like to read my answer to Product table, many kinds of product, each product has many parameters, which is a similar data management problem.
All of the solutions seem more or less clumsy in SQL, because SQL wasn't designed for this task. That's why some people are attracted by NoSQL solutions. But by doing so, they give up some of the good advantages that SQL has, for instance table headings, constraints, and proper data types.

How bad are duplicate tables or fields left NULL when designing a database?

I am currently designing a DB for a website. It's quite simple, but I will have many entries and I don't want to end up with a weak design. Basically, I have to choose between creating two tables with the same structure and creating just one table in which some text fields are left NULL.
I have a table area, but I also need to create another one named sub-area. Both will have their own set of images, but only some data will be shared between area and sub-area, and a sub-area might have many long text fields that an area won't have.
So what I did was create a table named area with a boolean field that tells me whether the row is a sub-area, plus a nullable foreign key to itself that points to the parent area when the row is a sub-area. In the images table I created a foreign key to area (because both areas and sub-areas can have many images).
My problem now is that I have an area-information table (it is going to have quite a lot of fields that I often won't use, so I don't want to load them for nothing). That table has a one-to-one relation to the area table, but some of its fields are specific to sub-areas. Since I don't have a sub-area-only table, I thought about leaving them as NULL in the schema. The fields are TEXT, and I don't know whether this is a big mistake or a sound decision, considering that I don't want to overload the server with queries (there will be plenty of data, and so plenty of traffic).
Any idea? Thanks.
Here is one of several ways you could start to approach this. Note that, despite people running around talking about Sixth Normal Form, practical database design is as much art as science.
Based on what I could glean from your question:
InfoGroups InfoGroupID (PK)
InfoType, (Unique Key with InfoGroupID ?)
Info
Areas AreaID (PK),
any other attributes solely down to area,
InfoGroupID (FK to InfoGroups)
SubAreas SubAreaID (PK)
any other attributes solely down to sub area,
AreaID (FK to Areas),
InfoGroupID (FK to InfoGroups)
You could go further if Infos are common to areas/subareas: make InfoGroups a many-to-many and have an Info table...
Whether InfoType is a magic number, an enum, a string, or an FK to an InfoTypes table is another set of options.
If the only difference between Areas and SubAreas were the link, you could go for a self-referential table, though I personally wouldn't unless SubAreas had further SubAreas.
I'm not seeing this being too expensive off the bat, but I don't know your needs. Data-wise it's simplish, neat, and efficient, and it's way better than a shedload of ambiguous null columns.

Doctrine ClassTable vs SingleTable inheritance (specific for a project)

I'm developing an art website where users can publish different types of art: images, literature, fonts, etc. My question is about the database structure for the Work table.
Each work has basically the same fields (id, owner, name, description) but also some unique fields:
Image: image_path, album_id (relation)
Literature: text, book_id (relation)
Fonts: file_path
What will be the best table structure? Please keep in mind that I'll have Comments and other relational tables pointing to Work
Single Table Inheritance
Pros:
easy to manage and use in relation. No JOINS are required.
Cons:
no separation of the unique fields (FontWork will have book_id, album_id, etc.)
Class Table Inheritance
Pros:
each table will have only its unique fields.
Cons:
Performance. multiple JOINS for about every query executed.
I would like to hear your opinion about it and also get new implementation ideas!
Thanks :)
I would recommend that you start with simple Single Table inheritance because your classes have few fields, along with a type attribute (a tiny integer) to separate the different entities.
This also means that your comments and work tables will join to one table. As the numbers grow you can partition your table to improve performance.
Bottom line is start simple and make more complex as your needs change.
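At the table level, single-table inheritance with a type attribute could look roughly like this. It is a plain-SQL sketch rather than Doctrine mapping code; the numeric type codes, the comment table, and the sample rows are assumptions, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE work (
    id          INTEGER PRIMARY KEY,
    type        INTEGER NOT NULL,  -- assumed codes: 1 = image, 2 = literature, 3 = font
    owner       TEXT NOT NULL,
    name        TEXT NOT NULL,
    description TEXT,
    -- Type-specific columns; NULL for rows of other types.
    image_path  TEXT,
    album_id    INTEGER,
    text        TEXT,
    book_id     INTEGER,
    file_path   TEXT
);
CREATE TABLE comment (
    id      INTEGER PRIMARY KEY,
    work_id INTEGER NOT NULL REFERENCES work(id),
    body    TEXT NOT NULL
);
""")
conn.execute(
    "INSERT INTO work (id, type, owner, name, image_path, album_id) "
    "VALUES (1, 1, 'ann', 'Sunset', '/img/sunset.png', 7)"
)
conn.execute(
    "INSERT INTO work (id, type, owner, name, file_path) "
    "VALUES (2, 3, 'bob', 'Grotesk', '/fonts/grotesk.otf')"
)
conn.execute(
    "INSERT INTO comment (work_id, body) "
    "VALUES (1, 'Lovely colours'), (2, 'Nice kerning')"
)

# Comments reach every kind of work through a single join.
rows = conn.execute(
    "SELECT w.name, w.type, c.body FROM comment c "
    "JOIN work w ON w.id = c.work_id ORDER BY w.id"
).fetchall()

# The font row carries unused columns from the other types.
font_book_id = conn.execute("SELECT book_id FROM work WHERE id = 2").fetchone()[0]
```

The unused NULL columns on each row are the "no separation" drawback listed in the question; the single join is the payoff.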

DB design to store different products for each customer order

I'm building a simple way to insert customer orders into the db.
We have several products, each one needs different properties.
I've started designing the following tables:
CUSTOMER -> Order (FK to CUSTOMER) -> OrderItem (FK to Order)
Now I'm wondering how I could link product-specific tables to OrderItem.
Suppose I have two products: product1 (room_name, width, height, color) and product2 (number, width, height, type, optionals). I'd create two different tables and link them with OrderItem to get the specific options. Am I wrong? (Of course there will be more than just two products.)
How can I do this?
I'd have one Product table with a one-to-many relationship between OrderItem and Product. Put a FOREIGN KEY in the OrderItem table that points to its associated Product.
A design like yours would mean you'd have to add a table every time there was a new product. That would not do. You want to add products by inserting new rows.
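A minimal sketch of that layout, with sqlite3 standing in for MySQL, is shown below. The table is named customer_order because ORDER is a reserved word; that name, the quantity column, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer_order (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);
CREATE TABLE order_item (
    id         INTEGER PRIMARY KEY,
    order_id   INTEGER NOT NULL REFERENCES customer_order(id),
    product_id INTEGER NOT NULL REFERENCES product(id),
    quantity   INTEGER NOT NULL DEFAULT 1
);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer_order VALUES (10, 1)")

# Adding a product is an INSERT, not a CREATE TABLE.
conn.executemany(
    "INSERT INTO product VALUES (?, ?)", [(1, "product1"), (2, "product2")]
)
conn.execute(
    "INSERT INTO order_item (order_id, product_id, quantity) VALUES (10, 2, 3)"
)

items = conn.execute(
    "SELECT p.name, oi.quantity FROM order_item oi "
    "JOIN product p ON p.id = oi.product_id WHERE oi.order_id = ?",
    (10,),
).fetchall()
```

This sketch leaves open where the product-specific properties live.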
No approach can resolve all of the issues you may be dealing with; the choice you make depends on which factor is most important to you.
Most people shy away from having multiple tables. One reason is that you don't know how many tables you may end up with in the future. Another is that your queries may also bloat by having to join to multiple tables. And it may become a maintenance headache, with multiple queries to update every time you add a table. Finally, adding a table is not even remotely as friendly as adding a record (do you really want your app to be able to create tables?).
One option is just to add more and more fields to the Product table. By making the property fields NULLable, different products can use different fields.
But... you may then need to add logic to ensure that ProductX -always- has a value in FieldA, but that ProductY always has a value in FieldB, etc. You will probably also need some meta-data about each product type so that your application knows which fields to use for which products. You still may need to add new fields, which is possibly tidier than adding new tables, but it is still something you probably don't want the application doing.
An option that totally avoids using DDL to add a product is to further normalise your data, and have the product-specific-properties in an Entity-Attribute-Value table. This is initially very attractive to many people as it is so generic and flexible.
Product(id, name, another-global-property, etc)
Product_Properties(product_id, property_id, property_value)
You'll probably have some meta-data and extra logic to ensure all the correct properties are used. But now you just add records to a generic structure whenever you create a new product.
But what type should "property value" be? It may need to hold strings, dates, numbers, anything. You could make it a string and use the meta-data to know how to CAST the value. Or you may have several value fields, one of each type, and a "field_type_id" or something to indicate which value-field should be read from.
It's also less friendly for certain searches. If you know a product_id, finding the properties is easy. If you want all products where the expiry date is in the past, you need to be careful about how you structure the data and indexes to make the query efficient. But if you want (expiry < today AND cost > 50) then you get a much different query from what you are used to - Each value is in a different ROW instead of a different FIELD.
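To make the "different ROW instead of different FIELD" point concrete, here is a sketch of the (expiry < today AND cost > 50) search against the Product / Product_Properties layout above. For brevity the property_id column holds a plain name rather than an id into a meta-data table, values are stored as strings, and the sample data is invented; sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product_properties (
    product_id     INTEGER NOT NULL REFERENCES product(id),
    property_id    TEXT NOT NULL,   -- a plain name here, for brevity
    property_value TEXT,            -- everything stored as a string
    PRIMARY KEY (product_id, property_id)
);
""")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "Milk"), (2, "Cheese"), (3, "Wine")],
)
conn.executemany(
    "INSERT INTO product_properties VALUES (?, ?, ?)",
    [
        (1, "expiry", "2020-01-01"), (1, "cost", "2"),
        (2, "expiry", "2020-06-01"), (2, "cost", "80"),
        (3, "expiry", "2999-01-01"), (3, "cost", "120"),
    ],
)

def expired_and_expensive(today, min_cost):
    # Each condition needs its own join to the property table,
    # because each value lives in a separate row.
    rows = conn.execute(
        """
        SELECT p.name
        FROM product p
        JOIN product_properties e
          ON e.product_id = p.id AND e.property_id = 'expiry'
        JOIN product_properties c
          ON c.product_id = p.id AND c.property_id = 'cost'
        WHERE e.property_value < ?                    -- ISO dates sort as text
          AND CAST(c.property_value AS REAL) > ?      -- cast the stored string
        ORDER BY p.id
        """,
        (today, min_cost),
    )
    return [r[0] for r in rows]
```

Every additional condition adds another self-join and another cast, which is where the query shape departs from an ordinary WHERE clause over columns.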
Search performance really does begin to shrink as query complexity increases and design considerations become more technical.
Which way you go depends on the application's functional requirements, architecture and design decisions, and a healthy dash of 'taste'.
You have tagged the question as django, so you should read this recent post:
Coding an inventory system, with polymorphic items and manageable item types
In this post @ThibaultJ explains how to accomplish this with Django model utils.
The idea is that you have a 'product' model and you inherit product1 and product2 from this model, adding specific information for both. @ThibaultJ has posted interesting samples.
I will notify @ThibaultJ about this question. If @ThibaultJ writes an answer, I will remove my post.
Here are some options
IMHO I would choose an Inheritance pattern, i.e. a new table called "ProductBase" with a unique Surrogate. ProductBase would have a classification, e.g. "ProductType", which would then allow you to join into the appropriate 'subclass' Product table. OrderItem would reference just the Surrogate. Referential Integrity is enforceable, and it gives the opportunity for extending to additional forms of products. It does however require the use of a common unique surrogate amongst all Product table types. If there are other tables (other than OrderItem) referencing Product, it would also avoid having to FK to composite keys.
Nullable Foreign Keys in OrderItem, i.e. OrderItem would have nullable FK to both (all) types of Product Tables, although only one of them would be present on each row.
Inner joining OrderItem to the appropriate Product tables would eliminate the 'wrong' product joins based on the NULLs. RI can still be enforced.
If you have the SAME type of Primary Key on all your Product subclass tables, then you could also add a single Product "Foreign" Key and a "ProductType" "Switch" on OrderItem. The problem here is that you can't enforce RI.
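The first option (the ProductBase inheritance pattern) could be sketched as follows. The column lists for product1 and product2 come from the question; the snake_case table names, the sample rows, and the use of sqlite3 in place of MySQL are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product_base (
    id           INTEGER PRIMARY KEY,   -- surrogate shared by all subclasses
    product_type TEXT NOT NULL
);
CREATE TABLE product1 (
    id        INTEGER PRIMARY KEY REFERENCES product_base(id),
    room_name TEXT, width INTEGER, height INTEGER, color TEXT
);
CREATE TABLE product2 (
    id        INTEGER PRIMARY KEY REFERENCES product_base(id),
    number    INTEGER, width INTEGER, height INTEGER,
    type      TEXT, optionals TEXT
);
CREATE TABLE order_item (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product_base(id)
);
""")
conn.execute("INSERT INTO product_base VALUES (1, 'product1')")
conn.execute("INSERT INTO product1 VALUES (1, 'Kitchen', 120, 80, 'white')")
conn.execute("INSERT INTO product_base VALUES (2, 'product2')")
conn.execute("INSERT INTO product2 VALUES (2, 5, 60, 40, 'sliding', 'handle')")
conn.execute("INSERT INTO order_item (product_id) VALUES (1), (2)")

# order_item references only the surrogate; the subclass rows hang off it.
rows = conn.execute(
    """
    SELECT oi.id, pb.product_type, p1.room_name, p2.type
    FROM order_item oi
    JOIN product_base pb ON pb.id = oi.product_id
    LEFT JOIN product1 p1 ON p1.id = pb.id
    LEFT JOIN product2 p2 ON p2.id = pb.id
    ORDER BY oi.id
    """
).fetchall()
```

Because order_item points only at product_base, a new subclass table can be added later without touching order_item or its foreign key.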
That said, I really wouldn't be creating a new table for each and every product - surely there are some broad 'categories' of Product which can be modelled in a uniform manner.
No doubt if you sell Aircraft and Groceries you would probably need an AircraftProduct and a GroceryProduct, but surely Airbus A300, Boeing 747 and Cessna Skyhawk would fit as rows inside AircraftProduct, even if there are a few 'optional' nullable fields in each table not applicable to all products in this 'category'?
Edit : First see Dems and Duffmo's posts to see if you can avoid the requirement for having multiple Product tables at all, by using EAV / Multivalue / Metadata patterns to model Product.