Best database table design for a table with dependent column values - mysql

I would like to know the best way to design a table structure for dependent column values.
Suppose I have a scenario like this:
if the status of the field is alive, there is nothing more to do;
if the status is died, some other column values have to be stored somehow.
What is the best way to handle this situation:
create a table containing all the columns, i.e. 'Died in the hospital', 'Cause of death', 'Date of Death' and 'Please narrate the event', and leave them NULL when the status is alive,
or
use a separate table for storing all the other attributes using Entity-attribute-value (EAV) concepts?
In the above scenario, signs and symptoms may be single, multiple, or "others" with a specification. How should this be stored?
What is the best way for performance and querying:
either provide 15 columns in a single table and store NULL if there is no value, or store a foreign key to the symptoms in another table (in this strategy, how should the "other symptom" description column be stored)?

In general, if you know what the columns are, you should include those in the table. So, a table with columns such as: died_in_hospital, cause_of_death, and so on seems like a reasonable solution.
Entity-attribute-value models are useful under two circumstances:
The attributes are not known and new ones are added over time.
The number of attributes is so large and sparsely populated that most columns would be NULL.
In your case, you know the attributes, so you should put them into a table as columns.
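As a rough sketch of the "known attributes as nullable columns" approach, the snippet below uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name patient, the column names, and the sample values are all assumptions made for illustration. The CHECK constraint that keeps the death columns empty for living patients is an optional extra; MySQL only enforces CHECK from version 8.0.16 onwards.

```python
import sqlite3

# Hypothetical schema: table, column names and sample rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        patient_id       INTEGER PRIMARY KEY,
        status           TEXT NOT NULL CHECK (status IN ('alive', 'died')),
        died_in_hospital INTEGER,   -- NULL while the patient is alive
        cause_of_death   TEXT,
        date_of_death    TEXT,
        event_narrative  TEXT,
        -- Death details may only be filled in when status = 'died'.
        CHECK (status = 'died' OR (died_in_hospital IS NULL
                                   AND cause_of_death IS NULL
                                   AND date_of_death IS NULL
                                   AND event_narrative IS NULL))
    )
""")
conn.execute("INSERT INTO patient (patient_id, status) VALUES (1, 'alive')")
conn.execute(
    "INSERT INTO patient VALUES (2, 'died', 1, 'sepsis', '2020-01-15', 'notes')"
)

# An 'alive' row carrying a cause of death violates the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO patient (patient_id, status, cause_of_death) "
        "VALUES (3, 'alive', 'sepsis')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Querying the dependent columns needs no join.
died_rows = conn.execute(
    "SELECT patient_id, cause_of_death FROM patient WHERE status = 'died'"
).fetchall()
```

The point of the sketch is that the dependent values are ordinary typed columns, so filtering on them is a plain WHERE clause rather than a pivot over attribute rows.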

Entity-attribute-value models are the best method; they will be helpful in data filtering/searching. Keeping the columns in the base table itself is against normalization rules.

Related

A more efficient way to store data in MySQL using more than one table

I had one single table that had lots of problems. I was saving comma-separated data in some fields, and afterwards I wasn't able to search them. Then, after searching the web and finding a lot of solutions, I decided to split it into several tables.
That one table became 5 tables.
The first table is called agendamentos_diarios; this is the table where I'm going to store the schedules.
The second table is called tecnicos, and it stores the technicians' names. Two fields: id (primary key) and the name (varchar).
The third table is called agendamento_tecnico. This is the (link) table where I'm going to store the ids from the first and second tables. That's because some schedules are going to be attended by one or more technicians.
The fourth table is called veiculos (vehicles). It holds the id and the name of the vehicle (two fields).
The fifth table is the link between the first table and the vehicles table. Same thing: I'm going to store the schedule id and the vehicle id.
I had an image that can explain it better than I can in words.
Am I doing it correctly? Is there a better way of storing data in MySQL?
I agree with @Strawberry about the ids, but normally it is the Hibernate mapping type that does this. If you are not using Hibernate to design your tables, you should take the ID out of agendamento_tecnico and agendamento_veiculos. That way you guarantee uniqueness. If you don't want to do that, create a unique key on the FK fields in those tables.
I notice that you separate the vehicles table from your technicians. In your model the same vehicle can be in two different schedules at the same time (which doesn't make sense). It would be better if the vehicle were linked in the agendamento_tecnico table, which would then become agendamento_tecnico_veiculo.
Looking at your table I note (I'm Brazilian) that you have a column called "servico", which means service. Your schedule table is designed for only one service. What if the same schedule has more than one service? To solve this you can create a services table and an m-n relationship with the schedule. It will make reports easier to write and keep the services well separated in your database.
There is also a nome_cliente field, which holds the client for that schedule. It would be better to have a cliente (client) table and link the schedule to it with an FK.
As said before, there is no right answer. You have to think about your problem and how it might grow. Modelling a database properly will avoid a lot of headaches later.
Better is subjective, there's no right answer.
My natural instinct would be to break that schedule table up even more.
Looks like data about the technician and the client is duplicated.
Then again, you might have made a decision to de-normalise for perfectly valid reasons.
Doubt you'll find anyone on here who disagrees with you not having comma separated fields though.
Where you call a halt to the changes depends on your circumstances now. Comma-separated fields caused you an issue, so you got rid of them. So which part of where you are now is causing you an issue?
Looks OK, especially for a first try.
One comment: I would name PKs/FKs (ids) the same in all tables and not use 'id' as the name. (Additionally, we use '#' or '_' as the end character of primary/foreign keys; for example, technicos.technico_, and agendamento_tecnico has the fields agend_tech_ and technico_.) But this is not a common convention. It makes queries a bit more complex (because you must fully qualify the fields), but it makes the database schema more readable (you can see at once which PK belongs to which FK).
Other comment: the two associative (I never wrote that word before!) tables, such as agendamento_tecnico, which joins the technicians to the schedules, have their own ID field. They do not need it, because the two (primary/unique) keys of the tables they join are unique themselves, so you can use them together as the PK for these tables, like:
CREATE TABLE agendamento_tecnico (
    technico_   INT NOT NULL,
    agend_tech_ INT NOT NULL,
    PRIMARY KEY (technico_, agend_tech_)
);
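To see what the composite primary key buys you, here is a minimal sketch using Python's sqlite3 module in place of MySQL. The column names (agendamento_id, tecnico_id, name, servico) and the sample rows are assumptions, not the asker's actual schema. It shows that the link table rejects a duplicate schedule/technician pair without needing its own id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE agendamentos_diarios (id INTEGER PRIMARY KEY, servico TEXT);
    CREATE TABLE tecnicos (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Link table: the pair of foreign keys is the primary key.
    CREATE TABLE agendamento_tecnico (
        agendamento_id INTEGER NOT NULL REFERENCES agendamentos_diarios (id),
        tecnico_id     INTEGER NOT NULL REFERENCES tecnicos (id),
        PRIMARY KEY (agendamento_id, tecnico_id)
    );
    INSERT INTO agendamentos_diarios VALUES (1, 'installation');
    INSERT INTO tecnicos VALUES (1, 'Ana'), (2, 'Bruno');
    INSERT INTO agendamento_tecnico VALUES (1, 1), (1, 2);
""")

# Assigning the same technician to the same schedule twice is rejected.
try:
    conn.execute("INSERT INTO agendamento_tecnico VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# All technicians attending schedule 1.
technicians = [row[0] for row in conn.execute("""
    SELECT t.name
    FROM tecnicos t
    JOIN agendamento_tecnico lk ON lk.tecnico_id = t.id
    WHERE lk.agendamento_id = 1
    ORDER BY t.name
""")]
```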

Integer values for status fields

Often I find myself creating 'status' fields for database tables. I set these up as TINYINT(1), as more often than not I only need a handful of status values. I cross-reference these values to array lookups in my code; an example is as follows:
0 - Pending
1 - Active
2 - Denied
3 - On Hold
This all works very well, except I'm now trying to create better database structures and realise that from a database point of view, these integer values don't actually mean anything.
Now a solution to this may be to create separate tables for statuses - but there could be several status columns across the database and to have separate tables for each status column seems a bit of overkill? (I'd like each status to start from zero - so having one status table for all statuses wouldn't be ideal for me).
Another option is to use the ENUM data type - but there are mixed opinions on this. I see many people not recommending to use ENUM fields.
So what would be the way to go? Do I absolutely need to be putting this data in to its own table?
I think the best approach is to have a single status table for each kind of status. For example, order_status ("placed", "paid", "processing", "completed") is qualitatively different from contact_status ("received", "replied", "resolved"), but the latter might work just as well for customer contacts as for supplier contacts.
This is probably already what you're doing — it's just that your "tables" are in-memory arrays rather than database tables.
I agree with "ruakh" about creating another table structured as (id, statusName). I would add that for such a table you can still use TINYINT for the id field: a signed TINYINT holds values from -128 to 127 (0 to 255 if UNSIGNED), and the (1) in TINYINT(1) is only a display width, so it covers all the status values you are likely to need.
Can you add (or remove) a status value without changing code?
If yes, then consider a separate lookup table for each status "type". You are already treating this data in a generic way in your code, so you should have a generic data structure for it.
If no, then keep the ENUM (or well-documented integer). You are treating each value in a special way, so there isn't much purpose in trying to generalize the data model.
"(I'd like each status to start from zero - so having one status table for all statuses wouldn't be ideal for me)"
You should never mix several distinct sets of values within the same lookup table (regardless of your "zero issue"). Reasons:
A simple FOREIGN KEY alone won't be able to prevent referencing a value from the wrong set.
All values are forced into the same type, which may not always be desirable.
That's such a common anti-pattern that it even has a name: "one true lookup table".
Instead, keep each lookup "type" within a separate table. That way, FKs work predictably and you can tweak datatypes as necessary.
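A small sketch of the "one lookup table per status type" idea, run through Python's sqlite3 module as a stand-in for MySQL. The orders table and its columns are hypothetical; the status names come from the earlier answer. Each lookup table can start its ids at zero, and the foreign key stops an order from pointing at a value that is not an order status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific; InnoDB enforces FKs by default
conn.executescript("""
    -- One lookup table per kind of status; ids start at zero in each.
    CREATE TABLE order_status   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE contact_status (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    INSERT INTO order_status VALUES
        (0, 'placed'), (1, 'paid'), (2, 'processing'), (3, 'completed');
    INSERT INTO contact_status VALUES
        (0, 'received'), (1, 'replied'), (2, 'resolved');

    -- Hypothetical table that uses one of the lookups.
    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES order_status (id)
    );
    INSERT INTO orders VALUES (100, 1);
""")

# status_id 9 does not exist in order_status, so the FK rejects it.
try:
    conn.execute("INSERT INTO orders VALUES (101, 9)")
    bad_status_rejected = False
except sqlite3.IntegrityError:
    bad_status_rejected = True

# The integer now means something inside the database itself.
status_name = conn.execute("""
    SELECT s.name
    FROM orders o
    JOIN order_status s ON s.id = o.status_id
    WHERE o.id = 100
""").fetchone()[0]
```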

Database design - which would be better?

I have multiple tables.
They all have the following fields in them:
item_title | item_description | item_thumbnail | item_keywords
Would I be better off having a single items_table with an extra item_type field and then joining with the respective table, or just keep them all in separate tables?
Depends on the context. If your items have very little differentiation and you’re certain you’re not going to have a scenario in 6 months, 12 months, 2 years where you need items separated, then go the route of one generic “items” table. If a particular item type does have specific requirements, then you can create a separate table that contains this data and create a LEFT JOIN when querying to include the extra data.
I’d also suggest looking at other database types. Judging from your scenario (lots of item types with little variance in the data stored) I think you may benefit from a document-based database engine like MongoDB rather than a relational data-based database engine like MySQL.
OK, so the tables share fields. Do they also share constraints [1]?
If yes, then go ahead and merge them together.
If not, you may keep them separate, or you may merge them together, depending on what kind of tradeoff you are willing to make.
For example, if the tables have separate foreign keys, you may keep them separate, or you may merge them into a single table but keep the FKs separate:
item_title
item_description
item_thumbnail
item_keywords
table1_id REFERENCES table1 (table1_id)
table2_id REFERENCES table2 (table2_id)
...
CHECK (
    (table1_id IS NOT NULL AND table2_id IS NULL ...)
    OR (table1_id IS NULL AND table2_id IS NOT NULL ...)
    ...
)
(NOTE: MySQL versions before 8.0.16 parse CHECK but don't enforce it, so on those versions you'll need to do the equivalent enforcement from a trigger or client code, or use a different DBMS if you can.)
I'd need to know more about your database to figure out which is better.
"with an extra item_type field and then joining with the respective table"
Never enforce FKs in code, if you can help it. Even if you merge the tables together, don't merge the FKs; instead do something like the above. Enforcing FKs in code in a concurrent environment (where multiple clients can try to modify the same data at the same time) is difficult to do correctly and with good performance. It's much better to let the DBMS do it for you.
BTW, what is item_keywords? If it's a comma-separated list of keywords (or similar), you'll need to normalize further and extract the keywords into their own separate table.
[1] Domain (data type and CHECK), key (PRIMARY KEY and UNIQUE) and referential (FOREIGN KEY) constraints.
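The merged table with separate foreign keys described in this answer can be sketched as follows, using Python's sqlite3 module in place of MySQL. The items table name, the item_id column, and the sample rows are assumptions; only two parent tables are shown. The CHECK makes exactly one of the two FKs non-NULL, and the FKs themselves are enforced by the database rather than by application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table1 (table1_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (table2_id INTEGER PRIMARY KEY);
    CREATE TABLE items (
        item_id    INTEGER PRIMARY KEY,
        item_title TEXT NOT NULL,
        table1_id  INTEGER REFERENCES table1 (table1_id),
        table2_id  INTEGER REFERENCES table2 (table2_id),
        -- Exactly one parent must be set.
        CHECK (
            (table1_id IS NOT NULL AND table2_id IS NULL)
            OR (table1_id IS NULL AND table2_id IS NOT NULL)
        )
    );
    INSERT INTO table1 VALUES (1);
    INSERT INTO table2 VALUES (1);
    INSERT INTO items VALUES (1, 'belongs to table1', 1, NULL);
    INSERT INTO items VALUES (2, 'belongs to table2', NULL, 1);
""")


def try_insert(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


both_parents = try_insert("INSERT INTO items VALUES (3, 'both', 1, 1)")
no_parent = try_insert("INSERT INTO items VALUES (4, 'neither', NULL, NULL)")
missing_parent = try_insert("INSERT INTO items VALUES (5, 'dangling', 99, NULL)")
```

The first two inserts fail the CHECK and the third fails the foreign key, so none of the three invalid rows is stored.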
I believe that it is good to have as few tables as possible; they are easier to maintain. Imagine you had 3,000 values of item_type: you would then have 3,000 different tables. So a single table seems like a good idea to me in your case. In the future, if you run into a situation where you need to separate the table, you can easily do so.
So the short answer, YES.
If I understand well, you only need to normalize your schema:
items:
    item_id
    item_name
    item_description
items_types:
    item_id
    type_id
types:
    type_id
    item_file_name
This way you can have any number of items with any number of types.
Is this what you want to do?
I would suggest you to use one table for item and one table for type for the following reasons (assume there are 10 types).
I am not sure which programming language you are using. As a Java developer, I would have to create an entity class for each type if I had multiple tables. So I would rather have only one class and have the type as an attribute.
When you have to display all of the types on the same page, you will have to execute a select query against all 10 tables for the 10 types.
When you introduce a new type, you have to write the code for the CRUD and business-specific operations. The developer will keep on adding code for every new type.
Basically, if you have one table for item and one table for type, you won't have to change the database schema and code for each new type you introduce. But if you are sure that the number of types is small and won't change, you can consider using multiple tables.
Create two separate tables and join them as required for your output, i.e.:
1. First table (master table: item_type)
item_type(item_type_id, item_type_name, status)
2. Second table (child table: item_details)
item_details(item_id, item_type_id, item_title, item_description, item_thumbnail, item_keywords)
I feel a single table would be more suitable. Compared with multiple tables, it avoids extra joins, complications in the program code, and errors. It is also better from a management point of view, e.g. for DB clustering.
If you have many tables which need the same repeated columns, then yes, it is a good idea to create a separate table for the common fields. This is more efficient if these repeated columns are not fixed and can change, for example by adding one more column to the list of common default columns.
So how could you do that?
The idea is to create a separate table and put the common default columns there.
This table is like a dummy table, i.e. the columns can be added/deleted as needed.
For example-
Table - DefaultFields
Columns - item_title | item_description | item_thumbnail | item_keywords
You can then insert the values into the DefaultFields table dynamically in a loop, like this (concatenating values into SQL is open to SQL injection, so use bound parameters in real code):
"INSERT INTO DefaultFields (item_table, item_title, item_description, item_thumbnail, item_keywords) VALUES ('" + field.item_table + "','" + field.item_title + "','" + field.item_description + "','" + field.item_thumbnail + "','" + field.item_keywords + "')"
NOTE: field is the object that holds the values in a table wise loop.
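For comparison, here is a hedged sketch of the same insert loop written with bound parameters instead of string concatenation, using Python's sqlite3 module. The DefaultFields columns follow the statement above; the sample rows and the variable name fields are made up for illustration. Values containing quotes are stored as-is, and nothing in them can alter the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DefaultFields (
        item_table       TEXT NOT NULL,
        item_title       TEXT,
        item_description TEXT,
        item_thumbnail   TEXT,
        item_keywords    TEXT
    )
""")

# Illustrative rows only; note the apostrophe in the second title.
fields = [
    {"item_table": "pages", "item_title": "Home", "item_description": "Start page",
     "item_thumbnail": "home.png", "item_keywords": "home"},
    {"item_table": "products", "item_title": "Owner's manual", "item_description": "PDF",
     "item_thumbnail": "manual.png", "item_keywords": "manual"},
]

# Named placeholders are bound by the driver, so no manual quoting is needed.
conn.executemany(
    "INSERT INTO DefaultFields "
    "(item_table, item_title, item_description, item_thumbnail, item_keywords) "
    "VALUES (:item_table, :item_title, :item_description, :item_thumbnail, :item_keywords)",
    fields,
)

titles = [row[0] for row in conn.execute(
    "SELECT item_title FROM DefaultFields ORDER BY item_table"
)]
```

Note that placeholders only work for values. Identifiers such as the table and column names in the ALTER TABLE step cannot be bound this way and would need to be checked against a known list.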
Then further you can alter your tables to create these default fields from DefaultFields table like:
"ALTER TABLE " + item_table+ " ADD COLUMN [" + field.item_title + "] Text"
This can be repeated for each table to alter it as needed.
In this design pattern, even if you want to:
1) add one more column or
2) delete pre existing column or
3) change pre existing column name
Then you can do so in the dummy table and the rest is updated by the ALTER table command in corresponding tables.
In my opinion... I would say no, never.
There are two reasons for that:
You really want to preserve the logical meaning of your database. For now it's pretty obvious to you how it's organised. But in two months (or a year), will it still be so evident? If somebody joins the project, isn't it easier for them to understand if the different logical blocks of your app are separated? I mean... it's true that a human and a cat are both animals. Is it still logical to store both of them in the same box?
Performance. The shorter the table, the faster your request will be. The data will still take as much space on your disk. And I'm not even talking about the comparison needed to know which type of item you are looking for. I mean, if you want to select all the pages of your application, just compare the two requests:
Multiple tables:
Select * from pages_tbl;
Single table:
Select * from item_tbl where type = 'page';
What will you gain from this design? No performance, no disk space, no readability. I really don't see a good reason for it.

Will multiple table reduce the speed of the result?

I have a database with multiple tables.
These tables are all related to a single name, for example:
Table 1 contains the name of the person, joined date, position, salary, etc.
Table 2 contains the name of the person, current projects, finished, assigned, etc.
Table 3 contains the name of the person, time sheets, in, out, etc.
Table 4 contains the name of the person, personal details, skill set, previous experience, etc.
Every table contains more than 50000 names and their details.
So my question is: all the tables contain information related to a name, say Jose20856, and this name is the unique index of all 4 tables. When I search for Jose20856, all four tables return a result, which is output to front-end software/HTML.
So do I need to keep multiple tables, or combine them into a single table?
If so:
CASE 1
Single table -> what are the advantages? Will the result be faster? What about system resource usage?
CASE 2
Multiple tables -> what are the advantages? Will the result be faster? What about system resource usage?
As I am new to MySQL, I would like to have your valuable opinion before moving ahead.
You can combine these into a single table but only if it makes sense. It's hard to tell if the relationships in your tables are one-to-one or one-to-many but seem to be one-to-many. e.g. A single employee from table 1 should be able to have multiple projects, skills, time sheets in the other tables. These are all one-to-many relationships.
So, keep the multiple table design. You also should consider using an integer-based primary key for the employee rather than the name. Use this pkey as the fkey in your other tables and you'll see performance improvement. (Also consider the amount of work you need to do if and when you want to change the name. You have to change all the names in all the tables. If you use a surrogate key, the int pkey, as suggested above, you only have to update a single row.)
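The surrogate-key suggestion can be sketched like this, with Python's sqlite3 module standing in for MySQL. The table and column names (employee, timesheet, employee_id and so on) and the sample rows are assumptions. The name lives in one row only, the child table refers to the integer key, and a rename touches a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,      -- surrogate integer key
        name        TEXT NOT NULL UNIQUE,     -- still unique, but stored once
        position    TEXT
    );
    CREATE TABLE timesheet (
        timesheet_id INTEGER PRIMARY KEY,
        employee_id  INTEGER NOT NULL REFERENCES employee (employee_id),
        time_in      TEXT,
        time_out     TEXT
    );
    CREATE INDEX idx_timesheet_employee ON timesheet (employee_id);

    INSERT INTO employee VALUES (1, 'Jose20856', 'developer');
    INSERT INTO timesheet VALUES (1, 1, '09:00', '17:00'), (2, 1, '09:30', '18:00');
""")

# Renaming updates one row in employee; the timesheet rows are untouched.
cur = conn.execute("UPDATE employee SET name = 'Jose20857' WHERE employee_id = 1")
rows_updated = cur.rowcount

# The one-to-many rows are still reachable through the integer key.
sheets = conn.execute("""
    SELECT COUNT(*)
    FROM employee e
    JOIN timesheet t ON t.employee_id = e.employee_id
    WHERE e.name = 'Jose20857'
""").fetchone()[0]
```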
Read on the web about database normalization.
E.g. http://en.wikipedia.org/wiki/Database_normalization
I think you can even add more tables to it. It all depends on the data and the relations.
Table1 = users incl. userdata
Table2 = Projects (if multiple users work on the same project)
Table3 = Linking user to projects (if multiple users work on the same project)
Table4 = Time spent? Contains the links to the user and to the project.
I think your table 4 can be merged into table 1, because it also contains data specific to one user.
There is probably more you can do but, as already stated, it all depends on the data and the relations.
What we're talking about here is vertical table partitioning (as opposed to horizontal table partitioning). It is a valid database design pattern, which can be useful in these cases:
There are too many columns to fit into one table. That's pretty obvious.
There are columns which are accessed relatively often, and some that are accessed relatively rarely. For example, if you very often need to display the columns joined date, position, salary, and only very rarely the columns personal details, skill set, previous experience, then it makes sense to move the latter to a separate table, as it will (probably) improve performance when accessing the most commonly used ones. In MySQL this is especially true for TEXT and BLOB columns, since they're stored apart from the rest of the fields, so accessing them takes more time.
There are NULLable columns where the majority of rows are NULL. Once again, if a column is mostly NULL, moving it to a separate table lets you reduce the size of your 'main' table and improve performance. The new table should not allow NULL values and should have entries only for rows where the value is set. This way you reduce the amount of storage/memory resources as well.
MySQL specific - You might want to move some of your columns from an InnoDB table to MyISAM, so that you can use full-text indexing while still being able to use some of the features InnoDB provides (this applies to older versions; InnoDB gained full-text indexes in MySQL 5.6). It's not a good design generally speaking, though; it's better to use a full-text search engine like Sphinx.
Last but not least, I'd suggest using a numeric field as the key joining all these tables, not a string.
Additional reading about MySQL partitioning (a bit outdated, since MySQL 5.5 added some new features)
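A minimal sketch of the vertical split described in the list above, using Python's sqlite3 module instead of MySQL. The table names employee and employee_profile, the columns, and the sample rows are assumptions. Frequently read columns stay in the main table; rarely read, mostly empty columns move to a side table that shares the same numeric key and only has rows where values exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Frequently accessed columns.
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        joined_date TEXT,
        position    TEXT,
        salary      INTEGER
    );
    -- Rarely accessed columns; one optional row per employee, no NULLs allowed.
    CREATE TABLE employee_profile (
        employee_id      INTEGER PRIMARY KEY REFERENCES employee (employee_id),
        personal_details TEXT NOT NULL,
        skill_set        TEXT NOT NULL
    );
    INSERT INTO employee VALUES
        (1, 'Jose20856',  '2012-03-01', 'developer', 50000),
        (2, 'Maria10412', '2013-07-15', 'analyst',   48000);
    INSERT INTO employee_profile VALUES (1, 'long free-text biography', 'SQL, PHP');
""")

# The common query never touches the wide side table.
hot = conn.execute(
    "SELECT name, position FROM employee ORDER BY employee_id"
).fetchall()

# The rare query joins it in; employees without a profile come back with NULL.
full = conn.execute("""
    SELECT e.name, p.skill_set
    FROM employee e
    LEFT JOIN employee_profile p ON p.employee_id = e.employee_id
    ORDER BY e.employee_id
""").fetchall()
```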

Super general database structure

Say I have a store that sells products that fall under various categories... and each category has associated properties... like a drill bit might have coating, diameter, helix angle, or whatever. The issue is that I'd like the user to be able to edit these properties. If I wasn't interested in having the user change the properties, and I was building the store for a certain set of categories, I'd have one table for drill bits, etc. Alternatively, I could just modify the schema online but that doesn't seem to be done very often (unless we're talking phpmyadmin or something), and plus that doesn't fit in well at all with the way models are coupled to tables.
In general, I'm interested in implementing a multi-table database structure with various datatypes (because diameter might be a decimal, coating would be a string/index into a table, etc), within mysql. Any idea how this might be done?
If I understand correctly what you're asking, an admittedly hacky solution would be to have a products table that has two related tables, product_properties and product_properties_lookup (or some better name), where product_properties_lookup has an entry for every possible property a product can have and product_properties contains the value of a property as a string, together with the ID of the property and the ID of the product. You could then coerce the property value into whatever type you wanted. Not ideal, but I'm not sure what else to do short of adding individual columns to the DB for property types.
Just use the database. It does all of this already. For free. And fast. How is having a table of products point to a table of properties with data types any different from a table with columns? It's not. Save if you use the DBs tables you get to use SQL to query it in all sorts of neat, and efficient ways compared to your own (crosstabs suck in SQL dbs).
Get a new product, make a new table. No big deal. Get a new property, alter the table. If you have 1M products in that table, yea, it may be a slow update (depends on the DB). Do you have 1M products? I don't think WalMart has 1M products.
Building Databases on top of Databases is a silly thing. Just use the one that's there. It is putty in your hands. Mold it to your whim.
Create a Property table first. This will contain all properties. It should have (at minimum) a Name column and a Type column ('string', 'boolean', 'decimal', etc.). Note: Primary keys are implied for all these tables.
Next, create a CategoryProperty table. Here you will be able to assign properties to a category. It should have these columns: CategoryID, PropertyID. Both foreign keys.
Then, create a Category table. This describes the categories. It should have a Name column and possibly some other columns like Description.
Then, create a ProductCategory table. Here, you will assign the categories for each product. It should have these columns: CategoryID, ProductID. Both foreign keys.
Next, create a PropertyValue table. Here, you will "instantiate" the properties and give them values. Columns include ProductID, PropertyID, and PropertyValue. The primary key can consist of ProductID and PropertyID.
Finally, create a Product table that just describes each product with columns like Name, Price, etc.
Note how for each relationship there is a separate table. If you only want one category for each product, you can do away with the ProductCategory table and just put a CategoryID field in the Product table. Similarly, if you want each property to belong to only one category, you can put a PropertyID column in the Category table and get rid of the CategoryProperty table.
Lastly, you will not be able to verify the data type for each property since each property has a different type (and they are rows, not columns). So just make the PropertyValue column a string and then perform your validation either as a trigger, or in your application, by checking the Type column of the Property table for that property.
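The core of the design above can be sketched with Python's sqlite3 module standing in for MySQL. Only Product, Property and PropertyValue are shown (the category tables are left out for brevity). The helper names is_valid and set_property, the set of allowed Type values, and the sample data are all assumptions. Values are stored as strings and validated in the application against the Type column, as the answer suggests.

```python
import sqlite3
from decimal import Decimal, InvalidOperation

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Product  (ProductID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Property (
        PropertyID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Type       TEXT NOT NULL CHECK (Type IN ('string', 'boolean', 'decimal'))
    );
    CREATE TABLE PropertyValue (
        ProductID     INTEGER NOT NULL REFERENCES Product (ProductID),
        PropertyID    INTEGER NOT NULL REFERENCES Property (PropertyID),
        PropertyValue TEXT NOT NULL,          -- always stored as a string
        PRIMARY KEY (ProductID, PropertyID)
    );
    INSERT INTO Product  VALUES (1, 'drill bit');
    INSERT INTO Property VALUES (1, 'coating', 'string'), (2, 'diameter', 'decimal');
""")


def is_valid(type_name, raw):
    """Hypothetical application-side check of a raw string against a Type."""
    if type_name == "decimal":
        try:
            return Decimal(raw).is_finite()
        except InvalidOperation:
            return False
    if type_name == "boolean":
        return raw in ("true", "false")
    return True  # any string is acceptable for 'string'


def set_property(product_id, property_id, raw):
    """Validate raw against the property's declared Type, then store it."""
    row = conn.execute(
        "SELECT Type FROM Property WHERE PropertyID = ?", (property_id,)
    ).fetchone()
    if row is None or not is_valid(row[0], raw):
        raise ValueError(f"invalid value {raw!r} for property {property_id}")
    conn.execute(
        "INSERT INTO PropertyValue VALUES (?, ?, ?)", (product_id, property_id, raw)
    )


set_property(1, 1, "titanium nitride")
set_property(1, 2, "6.5")
try:
    set_property(1, 2, "large")   # not a decimal, so it is refused
    bad_value_rejected = False
except ValueError:
    bad_value_rejected = True

values = conn.execute("""
    SELECT p.Name, v.PropertyValue
    FROM PropertyValue v
    JOIN Property p ON p.PropertyID = v.PropertyID
    WHERE v.ProductID = 1
    ORDER BY p.Name
""").fetchall()
```

Users can add a new property by inserting a row into Property, with no schema change, which is the trade-off this design makes against typed columns.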
If you're using a recent-ish version of MySQL (5.1.5 or greater), you can store your data as XML in the database. You can then query that data using things like this.
Suppose I have a table that contains some items and I have a widgetpack that contains numerous
widgets. I can get my total number of widgets:
SELECT SUM( EXTRACTVALUE( infoxml, '/info/widget_count/text()' ) ) AS widget_count
FROM items -- placeholder: the original query omitted the table name
WHERE product_type = 'widgetpack';
assuming the table has an infoxml column and each widgetpack's infoxml column contains XML that looks like this
<info>
<widget_count>10</widget_count>
<!-- Any other unstructured info can go in here too -->
</info>
DB purists will cringe at this, and it is kinda hacky. But often it's easier to keep all your unstructured data in one place.
Have a look at this database schema on DatabaseAnswers.org:
http://www.databaseanswers.org/data_models/products_and_generic_characteristics/index.htm
Maybe consider an Entity-Attribute-Value (EAV) approach (not for the whole model of course!).
Related questions
Entity Attribute Value Database vs. strict Relational Model Ecommerce question
Approach to generic database design
How do you build extensible data model