How many field is possible in one table,
Shall i maintain 150 field in one table is good way ,
OR
Maintain relation ship with other tables,
Thanks
Bharanikumar
In the vast majority of cases having 150 columns in a single table is symptomatic of a badly denormalized database.
You might want to read this and re-evaluate your db design.
To put it in your terms, go with "maintain relationship with other tables"
http://dev.mysql.com/doc/refman/5.0/en/column-count-limit.html
If you have a business need to have 150 columns, then it's a "good way". I've never seen such a business need, but that doesn't mean one doesn't exist. I have seen very wide tables used in olap type cases, so if that's what you're doing, there's a good chance that you're on the right track. If you're using this table for more otap functionality, then you're probably going down the wrong road. Perhaps if you provided a bit more info about what you're trying to accomplish, we could provide some advice (instead of "do that" or "do it a different way").
150 bit type fields might be OK, but you also have to consider the maximum length of the record your database will allow you to store. With varchar fields, most databases will let you create a table that would in theory violate the max if all the fields were filled to their max length. However, it won't let you actually add records which are too long,. This is the kind of trap that can go along fine for years until someone puts just one character too many into a potential insert and then blow up and it takes a long time generally to find an fix such a problem. It is best to avoid ever designing a table where the total legnth of the columns is bigger than the length of the maximum record bytes.
Less wide tables can also tend to be faster to query.
Additonally 150 columns is usually a sign that you really need to look at the design and see if a related table would be better. For instance if you have phone1, phone2, phone3, then you need a related phone table.
If you genuiniely need all 150 columns, consider which are likely to be queried together most often. Put those inteh parent table. Then add the less often queried (or columns related only toa particular function) to the related table. There is no reason not to have a 1-1 relationship between tables, just use the id from the parante table sa the PK inthe child table as well as the FK to the parent table.
Related
I got one table called Table1, it has around 20 columns. Half of these columns are string values, and the rest are integer. My question is so simple: what's better, have all the columns into only one table, or have it distributed into 2, 3 or even 4 tables? If so, I'd have to join them using LEFT JOIN.
What's the best choice?
Thanks
The question of "best" depends on how the table is being used. So, there is no real answer to the question. I can say that 20 columns is not a lot and many very reasonable tables have more than 20 columns of mixed types.
First observation: If you are asking such a question, you have some knowledge of SQL but not in-depth knowledge. One table is almost certainly the way to go.
What might change this advice? If many of the integer columns are NULL -- say 90% of the records have all of them as NULL -- then those NULL values are probably just wasting space on the data page. By eliminating those rows and storing the values in another table, you would reduce the size of the data.
The same is true of the string values, but with a caveat. Whereas the integers occupy at least 4 bytes, variable length strings might be even smaller (depends on the exact way that the database stores them).
Another reason would be on how the data is typically used. If the queries are usually using just a handful of columns, then storing each column in a separate table could be beneficial. To be honest, the overhead of the key column generally overwhelms any savings. And, such a data structure is really lousy for updates, inserts, and deletes.
However, this becomes quite practical in a columnar database such as Paraccel, Amazon Redshift, or Vertica. Such databases have built-in support for this type of splitting and it can have some very remarkable effects on performance.
Answering this with an example for users table -
1) `users` - id, name, dob, city, zipcode etc.
2) `users_products` - id, user_id(FK), product_name, product_validity,...
3) `users_billing_details` - id, user_id(FK to `users`), billing_name, billing_address..
4) `users_friends` - id, user_id(FK to `users`), friend_id(FK to same table `users`)
Hence if have many relations, use MANY-to-MANY relationship. If few relationship go with using the same table. All depends upon your structure and requirements.
SUGGESTION - Many-to-Many makes your data structure more flexible.
You can have 20 columns in 1 table. Nothing wrong with that. But then are you sure you are designing the structure properly?
Could some of these data change significantly in the future?
Is the table trying to encapsulate a single activity or entity?
Does the table have a singular meaning with respect to the domain or does it encapsulate multiple entities?
Could the structure be simplified into smaller tables having singular meaning for each table and then "Relationships" added via primary key/foreign keys?
These are some of the questions you take into consideration while designing a database.
If you find answer to these questions, you will know yourself whether you should have a single table or multiple tables?
How to store records that don't contain any fact? For example, let's say that a shop wants to count how many people have entered inside a store (and that they take info on every person that goes inside the shop). In warehouse, I guess there would be dimension table "Person" with different attributes, but how would fact table look like? Would it contain only foreign keys?
As you described it, that would be just a fact table. Actually, there is name for this -- factless fact table; fact table without any measures.
It is quite common for recoding events. Essentially anything that records: who, what, where, when and why? would be fact-measure-less table. If you add how much? then that goes into a measure.
You can think of it as the fact table containing an implicit count column, with the number of people entering, which is always "1" if you store data on individual person level, so that makes a fact table containing only FKs to the dimensions.
This of course only enabled analysis on the number of people entering, filtered for the various dimensions, but it seams like a realistic use case to me. I think you're on the right way.
Ok, I am creating a game, I have one table where I save a lot of information about a member, so I have many field in it. How many fields is normal to have in one table? Does it matter? Maybe I should split that info into two-three-four tables? What do you think?
Normalize the Database
If you feel you have too many columns, you probably have repeating groups, which suggests you should normalize the database. See an example here: Description of the database normalization basics
Hard MySQL Limits
MySQL 5.5 Column Count Limit
Every table has a maximum row size of 65,535 bytes.
There is a hard limit of 4096 columns
per table
Splitting of data into tables should generally not be dictated by the number of columns, but by the nature of the data. The process of splitting a large table into smaller ones is called normalization.
The only other reason I can think of to split a table is, if you may need data in clusters, i.e. you often need columns A-D together or columns E-L, but never all columns or columns D-F, then you can split the table into two tables, one containing columns A-D and the primary key, the other one containing columns E-L and the primary key.
Speaking about limits, MySQL says it's 4096 (source).
Yet I haven't seen so big tables yet, even those huge data mining tables don't come close.
You shouldn't be concerned about it as long as your database is normalized. As soon as you can spot same data being stored twice (for example, player table might have player_type column storing some fixed values), it's worth moving such data to separate table, instead od duplicating information in other tables and hence reducing columns count (but that's only "side effect" of normalization, not something that drives it).
I've never personally encountered one with more than 500 columns in it, but short of the maximum sizes there's no reason to have any fewer than the design demands. Beware of doing SELECT * FROM it though.
"information about a member" - umm always difficult, but I always separate identifiable information into another table and use a salt key to link the 2 together. That way it is not as easy to "hijack" usernames and passwords etc. And you can always use the SALT as a session variable rather than username/password/userId or whatever.
Typically I only store a ID, salt and joining date in 1 table. As I said, the rest I try to "hide" so that they cannot be "linked/hijacked".
Hope helps
Is there a performance cost to having large numbers of columns in a table, aside from the increase in the total amount of data? If so, would splitting the table into a few smaller ones help the situation?
I don't agree with all these posts saying 30 columns smells like bad code. If you've never worked on a system that had an entity that had 30+ legitimate attributes, then you probably don't have much experience.
The answer provided by HLGEM is actually the best one of the bunch. I particularly like his question of "is there a natural split....frequently used vs. not frequently used" are very good questions to ask yourself, and you may be able to break up the table in a natural way (if things get out of hand).
My comment would be, if your performance is currently acceptable, don't look to reinvent a solution unless you need it.
I'm going to weigh in on this even though you've already selected an answer. Yes, tables that are too wide could cause performance problems (and data problems as well) and should be separated out into tables with one-one relationships. This is due to how the database stores the data (well at least in SQL Server not sure about MySQL but it is worth doing some reading in the documentation about how the database stores and accesses the data).
Thirty columns might be too wide and might not, it depends on how wide the columns are. If you add up the total number of bytes that your 30 columns will take up, is it wider than the maximum number of bytes that can be stored in a record?
Are some of the columns ones you will need less often than others (in other words is there a natural split between required and frequently used info and other stuff that may appear in only one place not everywhere else), then consider splitting up the table.
If some of your columns are things like phone1, phone2, phone3 - then it doesn't matter how many columns you have you need a related table with a one-to-many relationship instead.
In general, though 30 columns are not unusually big and will probably be OK.
If you really need all those columns (that is, it's not just a sign that you have a poorly designed table) then by all means keep them.
It's not a performance problem, as long as you
use appropriate indexes on columns you need to use to select rows
don't retrieve columns you don't need in SELECT operations
If you have 30, or even 200 columns it's no problem to the database. You're just making it work a little harder if you want to retrieve all those columns at once.
But having a lot of columns is a bad code smell; I can't think of any legitimate reason a well-designed table would have this many columns and you may instead be needing a one-many relationship with some other, much simpler, table.
Technically speaking, 30 columns is absolutely fine. However, tables with many columns are often a sign that your database isn't properly normalized, that is, it can contain redundant and / or inconsistent data.
30 doesn't seem too many to me. In addition to necessary indexes and proper SELECT queries, for wide tables, 2 basic tips apply well:
Define your column as small as possible.
Avoid using dynamic columns such as VARCHAR or TEXT as much as possible when you have large number of columns per table. Try using fixed length columns such as CHAR. This is to trade off disk storage for performance.
For instance, for columns 'name', 'gender', 'age', 'bio' in 'person' table with as many as 100 or even more columns, to maximize performance, they are best to be defined as:
name - CHAR(70)
gender - TINYINT(1)
age - TINYINT(2)
bio - TEXT
The idea is to define columns as small as possible and in fixed length where reasonably possible. Dynamic columns should be to the end of the table structure so fixed length columns are ALL before them.
It goes without saying this would introduce tremendous disk storage wasted with large amount of rows, but as you want performance I guess that would be the cost.
Another tip is as you go along you would find columns that are much more frequently used (selected or updated) than the others, you should separate them into another table to form a one to one relationship to the other table that contains infrequent used columns and perform the queries with less columns involved.
Should be fine, unless you have select * from yourHugeTable all over the place. Always only select the columns you need.
30 columns would not normally be considered an excessive number.
Three thousand columns, on the other hand...
How would you implement a very wide "table"?
Beyond performance, DataBase normalization is a need for databases with too many tables and relations. Normalization gives you easy access to your models and flexible relations to execute diffrent sql queries.
As it is shown in here, there are eight forms of normalization. But for many systems, applying first, second and third normal forms is enough.
So, instead of selecting related columns and write long sql queries, a good normalized database tables would be better.
Usage wise, it's appropriate in some situations, for example where tables serve more than one application that share some columns but not others, and where reporting requires a real-time single data pool for all, no data transitions. If a 200 column table enables that analytic power and flexibility, then I'd say "go long." Of course in most situations normalization offers efficiency and is best practice, but do what works for your need.
I am designing a database for a project. I have a table that has 10 columns, most of them are used whenever the table is accessed, and I need to add 3 more columns;
View Count
Thumbs Up (count)
Thumbs Down (Count)
which will be used on %90 of the queries when the table is accessed. So, my question is that whether it is better to break the table up and create new table which will have these 3 columns + Foreign ID, or just make it 13 columns and use no joins?
Since these columns will be used frequently, I guess adding 3 more columns is better, but if I need to create 10 more columns which will be used %90 of the time, should I add them as well, or create a new table and use joins?
I am not sure when to break the table if the columns are used very frequently. Do you have any suggestions?
since it's such a high number of usage cases (90%) and the fields are only numbers (not text) then i would certainly be inclined to just add the fields to the existing table.
edit: only break tables apart if the information is large and/or infrequently accessed. there's no fixed rule, you might just have to run tests if you're unsure as to the benefits.
Space is not a big deal these days - I'd say that the decision to add columns to a table should be based on "are the columns directly related to the table", not "how often will the columns be used".
So basically, yes, add them to the table. For further considerations on mainstream database design, see 3NF.
The frequency of usage should be of no concern for your table layout, at least not until you start with huge tables (in number of rows or columns)
The question to answer is: Is it normalized with the additional columns. Google it, there a plenty of resources about it (with varying quality though)
Ditto some earlier posters. 95% of the time, you should design your tables based on logical entities. If you have 13 data elements that all describe the same "thing", than they all belong in one table. Don't break them into multiple tables based on how often you expect them to be used or to be used together. This usually creates more problems than it solves.
If you end up with a table that has some huge number of very large fields, and only a few of them are normally used, and it's causing a performance problem, then you might consider breaking it up. But you should only do that when you see that it really is causing a performance problem. Pre-emptive strikes in this area are almost always a mistake.
In my experience, the only time breaking up a table for performance reasons has shown any value is when there are some rarely-used, very large text fields. Like when there's a "Miscellaneous Extra Comments" field or "Text of the novel this customer is writing".
My advice is the same as cedo's:go with the 13 columns.
Adding another table to the DB, with another Index might just eat up the space you saved but will result in slower and more complicated queries.
Try looking into Database Normalization for some clearly outlined guidelines for planning database structures.