I'm new to database normalization and I want to know whether I've structured the table I created correctly.
I have a table called "RoomRates" with the following columns:
RoomRateId(int), RoomType(int), Season(int), Monday(decimal), Tuesday(decimal), Wednesday(decimal), Thursday(decimal), Friday(decimal), Saturday(decimal), Sunday(decimal)
As far as I know, this table structure breaks First Normal Form.
I have a foreign key on RoomType and Season.
Should I change the table's structure to:
RoomRateId(int), RoomType(int), Season(int), DayOfTheWeek(int), Rate(decimal)
in order not to break First Normal Form?
Here are some quick tips. I don't recommend walking through the normal forms one step at a time, because you will end up redoing your previous work (and your team's work), unless you are just practicing.
As a general rule when designing databases, each field should exist only once in the whole database. You should also group semantically related data into tables and link those tables with foreign keys.
As for your question, I would redo the tables completely. You probably don't need a column for each day of the week. Have you considered using a single column instead of several? The foreign key should be as light as possible (one column is recommended, unless there is no other way to express the link). Always consider whether your design will hold up with a huge amount of data, in terms of both querying and storage.
I recommend taking a look at this link: A beginner's guide to SQL database design
Hope this helps.
To me, your table is not breaking First Normal Form.
1NF has the following characteristics: there is no ordering, neither left-to-right for the columns nor top-to-bottom for the rows, and every row-and-column intersection holds exactly one value from the relevant domain and nothing else. But the most important characteristic of a relation in first normal form, I think, is simply that all of its attributes have only atomic values. Attributes with non-atomic values should be decomposed to reach 1NF.
I think your table's columns will hold atomic values only.
That said, a single column for the day of the week is still a better idea than several columns, which lead to more storage space and more processing time.
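Whichever side of the 1NF argument you take, the one-row-per-day layout from the question could look roughly like this. It's a minimal sketch that uses Python's built-in sqlite3 as a stand-in for MySQL; the foreign keys to the RoomType and Season tables are left out to keep it short, and the sample rates and the rate_for helper are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the DDL is kept deliberately generic.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RoomRates (
    RoomRateId   INTEGER PRIMARY KEY,
    RoomType     INTEGER NOT NULL,
    Season       INTEGER NOT NULL,
    DayOfTheWeek INTEGER NOT NULL CHECK (DayOfTheWeek BETWEEN 1 AND 7),
    Rate         DECIMAL(10, 2) NOT NULL,
    -- Exactly one rate per room type, season and day.
    UNIQUE (RoomType, Season, DayOfTheWeek)
);
""")

# Hypothetical data: room type 1, season 1, days 1-5 at 100, days 6-7 at 130.
rates = [(1, 1, day, 130.0 if day >= 6 else 100.0) for day in range(1, 8)]
conn.executemany(
    "INSERT INTO RoomRates (RoomType, Season, DayOfTheWeek, Rate) "
    "VALUES (?, ?, ?, ?)",
    rates,
)


def rate_for(room_type, season, day):
    """Return the rate for one day, or None if no row exists."""
    row = conn.execute(
        "SELECT Rate FROM RoomRates "
        "WHERE RoomType = ? AND Season = ? AND DayOfTheWeek = ?",
        (room_type, season, day),
    ).fetchone()
    return row[0] if row else None


# An aggregate is a plain query instead of arithmetic over seven columns.
weekly_total = conn.execute(
    "SELECT SUM(Rate) FROM RoomRates WHERE RoomType = 1 AND Season = 1"
).fetchone()[0]
```

The UNIQUE constraint does the job that the seven fixed columns did implicitly: it stops a second rate from being entered for the same room type, season and day.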
Related
We are looking to extend MySQL by adding extra columns to each "column". Right now you have the following.
Field, Type, Null, Key, Default, Extra
We want to be able to add to the "column" definition an extra column like, Attributes. Our system has certain design specifications that we need to describe more data per "column". How can we accomplish this in MySQL?
The query that returns all of the columns is as follows.
SHOW COLUMNS FROM MyDB.MyTable;
EDIT 1
I should have added this to begin with, and I apologize for not doing so. We are currently describing attributes in the Comments section for each column type, and we understand that this is a very dirty solution, but it was the only one we could think of at the time. We have built a code generator that revolves around the DB structure, and that is where this initiative really stems from. We want to describe code attributes for a column so the code generator can pick up the changes and refresh the code base on each change or run.
First, terminology: "field" and "column" are basically synonyms in this context. There is no distinction between fields and columns. Some MySQL commands even allow you to use these two words interchangeably (e.g. SHOW FIELDS FROM MyDB.MyTable).
We want to assign attributes to each column in a table. Adding "field_foo" for "field" would repeat the same data over and over again for each row.
Simple answer:
If you want more attributes that pertain to a given column foo, then you should create another table, where foo is its primary key, so each distinct value gets exactly one row. This is part of the process of database normalization. This allows you to have attributes to describe a given value of foo without repeating data, even when you use that value many times in your original table.
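To make that concrete, here is a rough sketch using Python's built-in sqlite3 in place of MySQL. The table names (items, foo_attributes), the description column and the sample values are all hypothetical; only the idea of a second table keyed on foo comes from the paragraph above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
-- One row per distinct value of foo; its attributes are stored here, once.
CREATE TABLE foo_attributes (
    foo         TEXT PRIMARY KEY,
    description TEXT NOT NULL
);
-- The original table only references the value.
CREATE TABLE items (
    id  INTEGER PRIMARY KEY,
    foo TEXT NOT NULL REFERENCES foo_attributes (foo)
);
INSERT INTO foo_attributes VALUES ('red', 'warm colour'), ('blue', 'cool colour');
INSERT INTO items (foo) VALUES ('red'), ('red'), ('blue'), ('red');
""")

# The description is looked up through a join, never copied into items.
rows = conn.execute(
    "SELECT i.id, i.foo, a.description "
    "FROM items AS i JOIN foo_attributes AS a ON a.foo = i.foo "
    "ORDER BY i.id"
).fetchall()

attribute_rows = conn.execute("SELECT COUNT(*) FROM foo_attributes").fetchone()[0]
```

Here 'red' is used three times in items, but its description exists in exactly one row of foo_attributes.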
It sounds like you might also need to allow for extensibility and you want to allow new columns at some future time, but you don't know which columns or how many right now. This is a pretty common project requirement.
You might be interested in my presentation Extensible Data Modeling, in which I give an overview of different solutions in SQL for this type of problem.
Extra Columns
Entity-Attribute-Value
Class Table Inheritance
Serialized LOB
Inverted Indexes
Online Schema Changes
Non-Relational Databases
None of these solutions is foolproof; each has its strengths and weaknesses. So it is worth learning about all of them and then deciding which ones have strengths that matter to your specific project and weaknesses that don't inconvenience you too much (that's the decision process for many software design choices).
We are currently describing attributes in the Comments section for each column type
So you're using something like the "Serialized LOB" solution.
I am designing a database which holds a lot of information for a user. Currently I need to store 20 different values, but over time I could be adding more and more.
I have looked around StackOverflow for similar questions, but it usually ends up with the asker just not designing his table correctly.
So based on what I have seen around StackOverflow, should I:
Create a table with many null columns and use them when needed (this seems terrible to me)
Create a users table and a information table where information is a key-value pair: [user_id, key, value]
Anything else you can suggest?
Keep in mind this is for a MySQL database, so I understand the dislike of a Key-Value table on a relational database.
Thanks.
Hmm, I am a bit confused by the question, but it sounds like you want to have lots of attributes for one user, right? And in the future you want to add more?
Well, isn't that just a matter of having a customer_attribute_ref reference table of some sort? You can then easily add more attributes by inserting into the ref table, and in the customer table you have at least three columns: 1. customer ID 2. customer attribute ID 3. customer attribute value.
Maybe I misunderstood your question. Can you clarify?
I'd suggest 3. A hybrid of 1 and 2. That is, put your core fields, which are already known, and you know you'll be querying frequently, into the main table. Then add the key-value table for more obscure or expanded properties. I think this approach balances competing objectives of keeping your table width relatively narrow, and minimizing the number of joins needed for basic queries.
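A rough sketch of that hybrid, using Python's built-in sqlite3 as a stand-in for MySQL. The table and column names, the sample data and the load_user helper are assumptions made for illustration. The key column is called attr_key rather than key because KEY is a reserved word in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Core fields that are known up front and queried often.
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT NOT NULL,
    name    TEXT NOT NULL
);
-- Key-value table for the more obscure or later-added properties.
CREATE TABLE user_properties (
    user_id    INTEGER NOT NULL REFERENCES users (user_id),
    attr_key   TEXT NOT NULL,
    attr_value TEXT,
    PRIMARY KEY (user_id, attr_key)
);
INSERT INTO users VALUES (1, 'ann@example.com', 'Ann');
INSERT INTO user_properties VALUES
    (1, 'favourite_colour', 'green'),
    (1, 'newsletter', 'yes');
""")


def load_user(user_id):
    """Return core fields plus a dict of extra properties, or None."""
    core = conn.execute(
        "SELECT user_id, email, name FROM users WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    if core is None:
        return None
    extras = dict(
        conn.execute(
            "SELECT attr_key, attr_value FROM user_properties WHERE user_id = ?",
            (user_id,),
        )
    )
    return {"user_id": core[0], "email": core[1], "name": core[2], "extras": extras}
```

Basic queries on email or name never touch user_properties, so they need no join; the key-value table is only read when the extra properties are actually wanted.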
Another approach you could consider instead of or in combination with the above is an ETL process of some kind. Maybe you define a key-value table as a convenient way for your applications to add data; then set up replication, triggers, and/or a nightly/hourly stored procedure to transform the data into a form more suitable for querying and reporting purposes.
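As one possible shape for such a transform step, the sketch below pivots a key-value table into a wide reporting table with conditional aggregation. It uses Python's built-in sqlite3 instead of MySQL; the table names, the two keys ('city' and 'tier') and the rebuild_report function are hypothetical, and in practice the rebuild would be driven by whatever scheduler or trigger mechanism you choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_properties (
    user_id    INTEGER NOT NULL,
    attr_key   TEXT NOT NULL,
    attr_value TEXT,
    PRIMARY KEY (user_id, attr_key)
);
INSERT INTO user_properties VALUES
    (1, 'city', 'Oslo'), (1, 'tier', 'pro'),
    (2, 'city', 'Lima');
""")


def rebuild_report(connection):
    """Rebuild the wide reporting table from the key-value rows."""
    connection.executescript("""
    DROP TABLE IF EXISTS user_report;
    CREATE TABLE user_report AS
    SELECT user_id,
           -- One output column per known key; NULL when the key is absent.
           MAX(CASE WHEN attr_key = 'city' THEN attr_value END) AS city,
           MAX(CASE WHEN attr_key = 'tier' THEN attr_value END) AS tier
    FROM user_properties
    GROUP BY user_id;
    """)


rebuild_report(conn)
report = conn.execute(
    "SELECT user_id, city, tier FROM user_report ORDER BY user_id"
).fetchall()
```

Applications keep writing to the flexible key-value table, while reports read one row per user from user_report.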
The exact best approach should be determined by careful planning and consideration of the entire architecture of your application.
When should one use one to one relationships? When should you add new fields and when should you separate them into a new table?
It seems to me that you'd use it whenever you're grouping fields and/or that group tends to be optional. Yes?
I'm trying to create the tables for an object, but grouping/separating everything would require about 20 joins, some of them even 4 levels deep.
Am I doing something wrong? How can I improve?
First, I highly recommend reading about Normal Forms.
A normalized relational database is extremely useful, and doing this properly is the reason tools such as Hibernate exist - to help manage the difference between objects-represented-as-relational-mappings and objects-as-programmatic-entities.
Anything that has a one-to-one mapping should probably be in the same table. A Person has only one first name, one last name. Those should logically be in the same table. Having a reference to a table of names isn't necessary - in particular because little additional data can be stored about a name. Obviously, this isn't always true (an etymology database might want to do exactly that), but for most uses, you don't care about where a name comes from - indeed all you want is the name.
Therefore, think of the objects being represented. A person has some singular data points, and some one-to-many relationships (addresses they have lived, for instance). One to many and many to many will almost always require a separate table (or two, to have many to many). Following those two guidelines, you can get a normalized database pretty fast.
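Those two guidelines could be sketched as follows, with Python's built-in sqlite3 standing in for whichever database you use. The person and address tables follow the example above; the club and club_member tables and all of the sample data are invented just to show a many-to-many link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Singular data points stay together in one table.
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL
);
-- One-to-many: each address row points back at one person.
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES person (person_id),
    street     TEXT NOT NULL
);
-- Many-to-many: a second entity plus a junction table.
CREATE TABLE club (
    club_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE club_member (
    person_id INTEGER NOT NULL REFERENCES person (person_id),
    club_id   INTEGER NOT NULL REFERENCES club (club_id),
    PRIMARY KEY (person_id, club_id)
);
INSERT INTO person VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO address (person_id, street) VALUES (1, '1 First St'), (1, '2 Second St');
INSERT INTO club VALUES (10, 'Chess'), (20, 'Rowing');
INSERT INTO club_member VALUES (1, 10), (1, 20), (2, 10);
""")

addresses_of_ada = [
    row[0]
    for row in conn.execute(
        "SELECT street FROM address WHERE person_id = 1 ORDER BY address_id"
    )
]
chess_members = [
    row[0]
    for row in conn.execute(
        "SELECT p.first_name FROM person AS p "
        "JOIN club_member AS m ON m.person_id = p.person_id "
        "WHERE m.club_id = 10 ORDER BY p.person_id"
    )
]
```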
Note that optional fields should be avoided if at all possible. Usually this is a case of having a separate table holding the field with a reference back to the original table. Try to keep your tables lean. If a field isn't likely to have something, it probably should be a row in its own table. Many such properties suggest a 'Property' table that can hold arbitrary optional properties of a particular type (i.e., as applied to a 'Person').
I am designing a database for a project. I have a table that has 10 columns, most of them are used whenever the table is accessed, and I need to add 3 more columns;
View Count
Thumbs Up (count)
Thumbs Down (Count)
which will be used in 90% of the queries when the table is accessed. So my question is whether it is better to break the table up and create a new table which will have these 3 columns + Foreign ID, or just make it 13 columns and use no joins?
Since these columns will be used frequently, I guess adding 3 more columns is better, but if I need to create 10 more columns which will be used 90% of the time, should I add them as well, or create a new table and use joins?
I am not sure when to break the table if the columns are used very frequently. Do you have any suggestions?
Since it's such a high share of the usage cases (90%) and the fields are only numbers (not text), I would certainly be inclined to just add the fields to the existing table.
Edit: only break tables apart if the information is large and/or infrequently accessed. There's no fixed rule; you might just have to run tests if you're unsure about the benefits.
Space is not a big deal these days - I'd say that the decision to add columns to a table should be based on "are the columns directly related to the table", not "how often will the columns be used".
So basically, yes, add them to the table. For further considerations on mainstream database design, see 3NF.
The frequency of usage should be of no concern for your table layout, at least not until you start with huge tables (in number of rows or columns)
The question to answer is: is it still normalized with the additional columns? Google it; there are plenty of resources about it (of varying quality, though).
Ditto some earlier posters. 95% of the time, you should design your tables based on logical entities. If you have 13 data elements that all describe the same "thing", then they all belong in one table. Don't break them into multiple tables based on how often you expect them to be used or to be used together. This usually creates more problems than it solves.
If you end up with a table that has some huge number of very large fields, and only a few of them are normally used, and it's causing a performance problem, then you might consider breaking it up. But you should only do that when you see that it really is causing a performance problem. Pre-emptive strikes in this area are almost always a mistake.
In my experience, the only time breaking up a table for performance reasons has shown any value is when there are some rarely-used, very large text fields. Like when there's a "Miscellaneous Extra Comments" field or "Text of the novel this customer is writing".
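For illustration, the layout this advice points to might look like the sketch below: the three small counters sit in the main table, and only a rarely-used, large text field is split off. It uses Python's built-in sqlite3 rather than a production database, and the posts and post_notes tables, their columns and the record_view helper are all assumed names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The counters describe the post itself, so they live with it.
CREATE TABLE posts (
    post_id     INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    view_count  INTEGER NOT NULL DEFAULT 0,
    thumbs_up   INTEGER NOT NULL DEFAULT 0,
    thumbs_down INTEGER NOT NULL DEFAULT 0
);
-- Only the rarely-read, potentially huge text gets its own table.
CREATE TABLE post_notes (
    post_id INTEGER PRIMARY KEY REFERENCES posts (post_id),
    notes   TEXT NOT NULL
);
INSERT INTO posts (post_id, title) VALUES (1, 'Hello');
INSERT INTO post_notes VALUES (1, 'miscellaneous extra comments');
""")


def record_view(post_id):
    """Bump the view counter in place; no join is involved."""
    conn.execute(
        "UPDATE posts SET view_count = view_count + 1 WHERE post_id = ?",
        (post_id,),
    )


record_view(1)
record_view(1)
counters = conn.execute(
    "SELECT view_count, thumbs_up, thumbs_down FROM posts WHERE post_id = 1"
).fetchone()
```

Whether the side table is worth having at all is exactly the kind of thing to decide only after measuring a real performance problem.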
My advice is the same as cedo's: go with the 13 columns.
Adding another table to the DB, with another index, might just eat up the space you saved, and it will result in slower and more complicated queries.
Try looking into Database Normalization for some clearly outlined guidelines for planning database structures.
I have a MySQL-InnoDB table with 350,000+ rows, containing a couple of things like id, otherId, shortTitle and so on. Now I need a Bool/Bit field for perhaps a few hundred or a few thousand of those rows. Should I just add that bool field to the table, or would it be better to create a new table referencing the IDs of the old table -- thereby not risking performance issues in all the existing functions that access the first table?
(Side info: I'm never using "SELECT * ...". The main table has lots of reading, rarely writing.)
Adding a field can indeed hamper performance a little, since your table rows grow larger, but it's hardly a problem for a BIT field.
Most probably, you will have exactly the same row count per page, which means no performance decrease at all.
On the other hand, using an extra JOIN to access the row value in another table will be much slower.
I'd add the column right into the table.
What does the new column denote?
From the data modelling perspective, if the column belongs with the data under whichever normal form is in use, then put it with the data; performance impact be damned. If the column doesn't directly belong to the table, then put it in a second table with a foreign key.
Realistically, the performance impact of adding a new column to a table with ~350,000 rows isn't going to be particularly huge. Have you tried issuing the ALTER TABLE statement against a copy, perhaps on a local workstation?
I don't know why people insist on calling 350K-row tables big. In the mainframe world, that's how big the DBMS configuration tables are :-).
That said, you should be designing your tables in third normal form. If, and only if, you have performance problems, then should you consider de-normalizing.
If you have a column that will apply only to certain of the rows, it's (probably) not going to be 3NF to put it in the same table. You should have a separate table with a foreign key into your 'primary' table.
Keep in mind that's if the boolean field actually doesn't apply to some of the rows. That's a different situation to the field applying to all rows but not being known for some. In that case, a nullable column in the primary table would be better. But that doesn't sound like what you're describing.
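A small sketch of the separate-table variant, written with Python's built-in sqlite3 as a stand-in for MySQL/InnoDB. The items table borrows id and shortTitle from the question; the item_flags table, its columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id         INTEGER PRIMARY KEY,
    shortTitle TEXT NOT NULL
);
-- A row exists here only for items the flag actually applies to.
CREATE TABLE item_flags (
    item_id INTEGER PRIMARY KEY REFERENCES items (id),
    flag    INTEGER NOT NULL CHECK (flag IN (0, 1))
);
INSERT INTO items VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO item_flags VALUES (2, 1);
""")

# A LEFT JOIN keeps every item; the flag is NULL where it does not apply.
rows = conn.execute(
    "SELECT i.id, f.flag "
    "FROM items AS i LEFT JOIN item_flags AS f ON f.item_id = i.id "
    "ORDER BY i.id"
).fetchall()
```

In this layout a missing row means "does not apply", while an "applies but unknown" case would be the nullable column in the primary table described above.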
Requiring a bit field for the next entries only sounds like you want to implement inheritance. If that is the case, I would add it to a new table to keep things readable. Otherwise, it doesn't matter if you add it to the main table or not, unless your queries are not using indexes, in which case I would change that first before making any other decisions regarding performance.