I'm no MySQL expert and I have to design a rather complex db for my level.
The problem I'm facing now consist in having a supplier-customer relationship within the same table (macro categories of companies):
Macro table
id name mega_id macro_customer_id
------------------------------------------
1 Furniture 2 2,4,5,35
I want to represent the fact that macro entry with id 1 has other macro companies (which are their customers) described within the same table.
Which is the best way to represent this?
Thanks!
It depends: We all used to use the normalization forms (as #1000111 indicated), however depending on the use of the data, you can choose to look different at certain parts of this normalization discussion:
The normal model for this would be:
Table userData(id,name)
- 1:N table linkTable(id,macro_customer_id)
- N:1 table metaData(macro_customer_id,value)
Or:
Table userData(id,name)
- 1:N table linkTable(macro_customer_id,id)
The big question is however in how the data is used. If data is just for this user and not queried in any other way (no where, or group by), then storing it as a serialized String is a completely valid approach.
The relationship between entities in an RDBMS should be stored in a relational way. Ask yourself if you care about this relationship in your database - will you need to write queries that will link macro.id to table/rows represented by IDs in macro.macro_customer_id? If yes, then you must store this relationship in a (one-to-many or many-to-many) separate table.
Related
I would like know the best way of designing a table structure for dependent column values.
If i have a scenario like this this
if the status of the field is alive nothing to do
if the status is died some other column values are stored somehow.
What is the best way to handle this situation
whether to create table containing all columns ie 'Died in the hospital','Cause of death','Date of Death' and 'Please narrate the event' and let it be null when status is alive
or
to use seperate table for storing all the other attributes using Entity-attribute-value (EVA) concepts
in the above scenario signs and symptoms may be single, multiple or others with specification. how to store this .
what is the best way for performance and querying
either to provide 15 columns in single table and store null if no value or to store foreign key of symptoms in another table (in this strategy how to store other symptom description column).
In general, if you know what the columns are, you should include those in the table. So, a table with columns such as: died_in_hospital, cause_of_death, and so on seems like a reasonable solution.
Entity-attribute-value models are useful under two circumstances:
The attributes are not known and new ones are added over time.
The number of attributes is so large and sparsely populated that most columns would be NULL.
In your case, you know the attributes, so you should put them into a table as columns.
Entity-attribute-value models is the best method, it will be helpful in data filtering/searching. Keeping the columns in the base table itself is against Normalization rules.
I am working on creating an application with multiple parent accounts each of which has different multiple users. Each account consists of a set of data of similar type but needs t be maintained separately. eg. inventory of each organization which their respective users can view.
What is the best practice:
1: Create different database tables for each organization
2: Create a common table and have an extra column for the organization it belongs to.
As mentioned, do a one table for organization, one for equipment, one for persons and so on. It is step 1 - separate table for separate entity.
After that connect them with relationships: primary key in main entity to foreign key in sub entity. Other words every row in equipment table would have column with id of organization it belongs to. And so on.
There are many other circumstances, including subdividing entities to such called normal forms, you can study it if it needed, to reduce data consistency supply costs. But it could also negatively affect performance.
Anyway: same class entities commonly should be stored in one table.
The best practice in OLTP (transaction processing) is to create a common table and to implement a subtyping in some way, for example "have extra tables with columns for the organization subtype". In OLAP (analytical processing) warehousing it is still a good practice but the mapping of subtypes can be implemented differently. In OLAP datamarts the solution "one table per organization" can be a good practice.
You may have a look on the book "Programming with databases" which covers these topics: subtype/subclass mapping, OLTP vs OLAP, denormalization etc.
We have a J2EE content management and e-commerce system, and in this system – for sake of a simple example – let’s say that we have 100 objects. All of these objects extend the same base class, and all share many of the same fields.
Let’s take two objects as an example: a news item that would be posted on a website, and a product that would be sold on a website. Both of these share common properties:
IDs: id, client ID, parent ID (long)
Flags: deleted, archived, inactive (boolean)
Dates: created, modified, deleted (datetime)
Content: name, description
And of course they have some properties that are different:
News item: author, posting date
Product: price, tax
So (finally) here is my question. Let’s say we have 100 objects in our system, and they all follow this pattern. They have many fields that overlap, and some unique fields. In terms of a relational database, would we be better off with:
Option One: Less Tables, Common Tables
table_id: id, client ID, parent ID (long) (id is the primary key, a GUID for all objects)
table_flag: id, deleted, archived, inactive (boolean)
table_date: id, created, modified, deleted (datetime)
table_content: id, name, description
table_news: id, author, posting date
table_product: id, price, tax
Option Two: More Tables, Common Fields Repeated
table_news: id, client ID, parent ID, deleted, archived, inactive, name, description, author, posting date
table_product: id, client ID, parent ID, deleted, archived, inactive, name, description, price, tax
For full disclosure – I am a developer and not a DBA, and because of that I prefer option one. But there is another team member that prefers option two, and I think he makes valid points.
Option One: Pros and Cons
Pro: Encapsulates common fields into common tables.
Pro: Need to change a common field? Change it in one place.
Pro: Only creates new fields/tables when they are needed.
Pro: Easier to create the queries dynamically, less repetitive code
Con: More joining to create objects (not sure of DB impact on that)
Con: More complex queries to store objects (not sure of DB impact on that)
Con: Common tables will become huge over time
Option Two: Pros and Cons
Pro: Perhaps it is better to distribute the load of all objects across tables?
Pro: Could index the news table on the client ID, and index the product table on the parent ID.
Pro: More readable to human eye: easy to see all the fields for an object in one table.
My Two Cents
For me, I much prefer the elegance of the first option – but maybe that is me trying to force object oriented patterns on a relational database. If all things were equal, I would go with option one UNLESS a DB expert told me that when we have millions of objects in the system, option one is going to create a performance problem.
Apologies for the long winded question. I am not great with DB lingo, so I probably could have summarized this more succinctly if I better understood terms like normalization. I tried to search for answers on this topic, and while I found many that were close (I suspect this is a common DB issue) I could not find any that answered all my questions. I read through this article on normalization:
But I did not totally understand it. On the one hand it was saying that you should remove any redundancies. But on the other hand, it was saying that each attribute should define only one object.
Thanks,
John
You should read Patterns of Enterprise Application Architecture by Martin Fowler. He writes about several options for the scenario you describe:
Single Table Inheritance: One table for all object subtypes. Stores all attributes, setting them NULL where they are inapplicable to the row's object subtype.
Class Table Inheritance: One table for column common to all subtypes, then one table for each subtype to store subtype-specific columns.
Concrete Table Inheritance: One table for each subtype, storing both subtype-specific columns and columns common to all subtypes.
Serialized LOB: One table for all object subtypes. Store common attributes as conventional columns, but combine optional or subtype-specific columns as fields in a BLOB that stores XML or JSON or whatever format you want.
Each one of these designs has pros and cons, so choose a solution depending on the most common way you access your data.
However, notice I use the word subtype above. I would use these designs only if the different object types are subtypes of a common base class. I'm assuming that News item and Product don't actually share a logical base class (besides Object); they are not subtypes of a common superclass.
So for the sake of OO design, I would choose Concrete Table Inheritance. This avoids any inappropriate coupling between these subtypes. There are columns the two tables have in common, but they basically amount to bookkeeping, not anything to do with the function of the class and hence the table.
I do have a datbase with multiple tables.
this multiple table is related to single name for example..
Table 1 contains name of the person, joined date,position,salary..etc
Table2 contains name of the person,current projects,finished,assigned...etc
Table 3 contains name of the person,time sheets,in,out,etc...
Table 4 contains name of the person,personal details,skill set,previous experiance,...etc
All table contains morethan 50000 names, and their details.
so my question is all tables contains information related to a name say Jose20856 this name is unique index of all 4 tables. when I search for Jose20856 all four table will give result and output to a front end software/html.
so do I need to keep multiple table or combined to a single table??
If so
CASE 1
Single table -> what are the advantages? will result will be faster? what about the system resource usage?
CASE 2
Multiple table ->what are the advantages? will result will be faster? what about the system resource usage?
As I am new to MySQL I would like to have your valuable opinion to move ahead
You can combine these into a single table but only if it makes sense. It's hard to tell if the relationships in your tables are one-to-one or one-to-many but seem to be one-to-many. e.g. A single employee from table 1 should be able to have multiple projects, skills, time sheets in the other tables. These are all one-to-many relationships.
So, keep the multiple table design. You also should consider using an integer-based primary key for the employee rather than the name. Use this pkey as the fkey in your other tables and you'll see performance improvement. (Also consider the amount of work you need to do if and when you want to change the name. You have to change all the names in all the tables. If you use a surrogate key, the int pkey, as suggested above, you only have to update a single row.)
Read on the web about database normalization.
E.g. http://en.wikipedia.org/wiki/Database_normalization
I think you can even add more tables to it. It all depends on the data and the relations.
Table1 = users incl. userdata
Table2 = Projects (if multiple users work on the same project)
Table3 = Linking user to projects (if multiple users work on the same project)
Table4 = Time spent? Contains the links to the user and to the project.
I think your table 4 can be merged into table 1 cause it also contains data specific to 1 user.
There is probably more you can do but as already stated it all depends and the relations.
What we're talking about here is vertical table partitioning (as opposed to horizontal table partitioning). It is a valid database design pattern, which can be useful in these cases:
There are too many columns to fit into one table. That's pretty obvious.
There are columns which are accessed relatively often, and some that are accessed relatively rarely. For example, if you very often need to display columns joined date,position,salary and columns personal details,skill set,previous experiance very rarely, then it makes sense to move these columns to separate a table, as it will (probably) improve performance in accessing those most commonly used. In MySQL this is especially true in case of TEXT and BLOB columns, since they're stored apart from the rest of the fileds, so accessing them takes more time.
There are NULLable columns, where majority of rows are NULL. Once again, if it's mostly null, moving it to a separate table will let you reduce size of your 'mani' table and improve performance. The new table should not allow null values and have entries only for rows where value is set. This way you reduce amount of storeage/memory resources as well.
MySQL specific - You might want tom move some of your columns from nnoDB table to MyISAM, so that you can use full text indexing, while still being able to use some of the features InnoDB provides. It's not a good design gnerally speaking though - it's better to use a full text search engine like Sphinx.
Last but not least. I'd suggest using a numeric field as a key joining all these tables, not a string.
Additional reading aboout MySQL partitioning (a bit outdated, since MySQL 5.5 added some new features)
I am designing a relational database of products where there are products that are copies/bootlegs of each other, and I'd like to be able to show that through the system.
So initially during my first draft, I had a field in products called " isacopyof " thinking to just list a comma delimited list of productIDs that are copies of the current product.
Obviously once I started implementing, that wasn't going to work out.
So far, most many-to-many relationship solutions revolve around an associative table listing related id from table A and related id from table B. That works, but my situation involves related items from the SAME table of products...
How can I build a solution around that ? Or maybe I am thinking in the wrong direction ?
You're overthinking.
If you have a products table with a productid key, you can have a clones table with productid1 and productid2 fields mapping from products to products and a multi-key on both fields. No issue, and it's still 3NF.
Because something is a copy, that means you have a parent and child relationship... Hierarchical data.
You're on the right track for the data you want to model. Rather than have a separate table to hold the relationship, you can add a column to the existing table to hold the parent_id value--the primary key value indicating the parent to the current record. This is an excellent read about handling hierarchical data in MySQL...
Sadly, MySQL doesn't have hierarchical query syntax, which for things like these I highly recommend looking at those that do:
PostgreSQL (free)
SQL Server (Express is free)
Oracle (Express is also free)
There's no reason you can't have links to the same product table in your 'links' table.
There are a few ways to do this, but a basic design might simply be 2 columns:
ProductID1, ProductID2
Where both these columns link back to ProductID in your product table. If you know which is the 'real' product and which is the copy, you might have logic/constraints which place the 'real' productID in ProductID1 and the 'copy' productID in ProductID2.