Problem 1: What is the maximum number of columns we can have in a table?
Problem 2: What is the maximum number of columns we should have in a table?
Answer 1: Probably more than you have, but not more than you will grow to have.
Answer 2: Fewer than you have.
Asking these questions usually indicates that you haven't designed the table well. You are probably practicing the Metadata Tribbles antipattern: the columns tend to accumulate over time, creating an unbounded set of columns that all store basically the same type of data, e.g. subtotal1, subtotal2, subtotal3, and so on.
Instead, I'm guessing you should create an additional dependent table, so your many columns become many rows. This is part of designing a proper normalized database.
CREATE TABLE Subtotals (
entity_id INT NOT NULL,
year_quarter SMALLINT NOT NULL, -- e.g. 20094
subtotal NUMERIC(9,2) NOT NULL,
PRIMARY KEY (entity_id, year_quarter),
FOREIGN KEY (entity_id) REFERENCES Entities (entity_id)
);
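With this design, a new quarter becomes a new row instead of a new column. A minimal usage sketch (assuming the Entities table referenced above exists; the values are made up):
INSERT INTO Subtotals (entity_id, year_quarter, subtotal)
VALUES (1, 20101, 1234.56); -- adding a quarter adds a row, not a column

SELECT entity_id, SUM(subtotal) AS grand_total
FROM Subtotals
GROUP BY entity_id;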
My former colleague also wrote a blog about this:
Understanding the maximum number of columns in a MySQL table
The answer is not so straightforward as you might think.
SQL Server 2000: 1024
SQL Server 2005: 1024
SQL Server 2008: 1024 for a non-wide table, 30,000 for a wide table.
Wide tables are for when you use the sparse column feature introduced in SQL Server 2008, which is designed for tables with a large number of columns that are usually empty.
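For illustration, a minimal T-SQL sketch of a sparse column; the table and column names here are made up, and SPARSE requires SQL Server 2008 or later:
CREATE TABLE Documents (
    doc_id INT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    legacy_code INT SPARSE NULL -- NULL values consume no storage; non-NULL values cost slightly more
);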
Just because these limits are available does not mean you should be using them, however. I would start by designing the tables based on the requirements, and then check whether vertically partitioning one table into two smaller tables is required, etc.
1)
http://msdn.microsoft.com/en-us/library/aa933149%28SQL.80%29.aspx
1024 seems to be the limit.
2)
Much less than 1024 :). Seriously, it depends on how normalized you want your DB to be. Generally, the fewer columns a table has, the easier it will be for someone to understand. For a person table, for example, you might want to store the person's address in another table (person_address, say). It's best to break your data up into entities that make sense for your business model, and go from there.
2) There are plenty of guidelines out there, in particular regarding database normalization. The overarching principle is always to be able to adapt. As with classes, tables with a large number of columns are not very flexible. Some of the questions you should ask yourself:
Does Column A describe an attribute of the entity (table) that could or should be grouped with Column B?
Data updates. Keep in mind that most RDBMSs perform a row lock when updating values. This means that if one process is constantly updating Column A while another process is updating Column B, they will fight for the row, and this will create contention.
Database design is an art more than a science. While guidelines and technical limitations will get you in the right direction, there are no hard rules that will make your system work or fail 100%.
I think it's 4096 in MySQL; for SQL Server I don't know.
I asked the same question a few months ago in a special scenario, maybe the answers help you decide. Usually, as few as possible I would say.
Related
I have a big table with over 100 million rows. I have been trimming it down for months, getting rid of bad data (row-wise), trying to keep it small. I already have 9 columns on this table, and I want to add a new boolean column to it.
This table started off small, and now it's getting pretty wide. Yet again, I am tasked with adding more information per row. This time it's a new boolean field. I expect this field to be low-volume, meaning fewer than 10% of rows will have it set to true. I know I can make it default NULL, and it is a boolean column, which should be small.
However, I wanted to get some advice. This table cannot get infinitely wide, and I will need to work around this. Under these circumstances, does it make more sense to create another table and foreign-key reference the record when I have additional data to add? How do the pros handle this in database design?
The best situation for usability is to have all the data on the record, so any query can read or calculate from the table itself without joins. I just do not have confidence that it will scale to 1 BILLION rows (insert meme).
At my job I support MySQL instances that have multi-billion row tables. At that scale, care must be taken to optimize queries properly. You don't want to do a table-scan at that scale.
But that's about rows, not columns. You asked first about columns.
The way to choose between adding a column versus adding another table is to follow rules of database normalization. If the new column is for an attribute of the same entity as your current table, add the column to that table. If it's a multi-valued attribute or if it's really an attribute of some other entity, then add it to a different table.
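If the flag really is an attribute of the same entity, the straightforward move under that assumption is simply (hypothetical table and column names):
ALTER TABLE big_table
    ADD COLUMN is_flagged BOOLEAN NULL DEFAULT NULL; -- in MySQL, BOOLEAN is a synonym for TINYINT(1)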
Very, very rarely is it the right choice to make another table solely for the sake of having too many columns. A given MySQL table can have dozens of columns pretty easily, and hundreds if you're careful.
In theory, there is no limit to the number of columns that might be appropriate to put in the same table with respect to normalization. But there are limitations due to the code to store those columns in a given implementation (e.g. InnoDB storage engine in MySQL).
See https://www.percona.com/blog/2013/04/08/understanding-the-maximum-number-of-columns-in-a-mysql-table/
So the maximum number of columns for a table in MySQL is somewhere between 191 and 2829, depending on a number of factors.
In the comments on that blog, I was able to design a table that failed to be created at 59 columns. Read the blog for details.
I've got one table called Table1, and it has around 20 columns. Half of these columns are string values, and the rest are integers. My question is simple: what's better, having all the columns in only one table, or distributing them across 2, 3 or even 4 tables? If I split them up, I'd have to join them using LEFT JOIN.
What's the best choice?
Thanks
The question of "best" depends on how the table is being used. So, there is no real answer to the question. I can say that 20 columns is not a lot and many very reasonable tables have more than 20 columns of mixed types.
First observation: If you are asking such a question, you have some knowledge of SQL but not in-depth knowledge. One table is almost certainly the way to go.
What might change this advice? If many of the integer columns are NULL (say 90% of the records have all of them as NULL), then those NULL values are probably just wasting space on the data page. By eliminating those columns and storing the non-NULL values in another table, you would reduce the size of the data.
The same is true of the string values, but with a caveat: whereas the integers occupy at least 4 bytes, variable-length strings might be even smaller (it depends on exactly how the database stores them).
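As a sketch of that kind of split (hypothetical names; the side table only gets a row when a value is actually present):
CREATE TABLE table1_rare_values (
    table1_id INT NOT NULL PRIMARY KEY,
    rare_int INT NOT NULL, -- one of the mostly-NULL integer columns, moved out
    FOREIGN KEY (table1_id) REFERENCES Table1 (id) -- assumes Table1 has an INT id
);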
Another consideration is how the data is typically used. If the queries usually touch just a handful of columns, then storing each column in a separate table could be beneficial. To be honest, though, the overhead of the key column generally overwhelms any savings, and such a data structure is really lousy for updates, inserts, and deletes.
However, this becomes quite practical in a columnar database such as Paraccel, Amazon Redshift, or Vertica. Such databases have built-in support for this type of splitting and it can have some very remarkable effects on performance.
Answering this with an example for a users table:
1) `users` - id, name, dob, city, zipcode etc.
2) `users_products` - id, user_id(FK), product_name, product_validity,...
3) `users_billing_details` - id, user_id(FK to `users`), billing_name, billing_address..
4) `users_friends` - id, user_id(FK to `users`), friend_id(FK to same table `users`)
Hence, if you have many relationships, use a MANY-to-MANY relationship; if there are only a few, go with using the same table. It all depends upon your structure and requirements.
SUGGESTION - Many-to-many makes your data structure more flexible.
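As a concrete sketch of table 4 above, the self-referencing many-to-many (column names are illustrative, and it assumes the users table has an INT id):
CREATE TABLE users_friends (
    id INT AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    friend_id INT NOT NULL,
    UNIQUE (user_id, friend_id), -- avoid duplicate friendship rows
    FOREIGN KEY (user_id) REFERENCES users (id),
    FOREIGN KEY (friend_id) REFERENCES users (id)
);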
You can have 20 columns in one table; nothing wrong with that. But then, are you sure you are designing the structure properly?
Could some of these data change significantly in the future?
Is the table trying to encapsulate a single activity or entity?
Does the table have a singular meaning with respect to the domain or does it encapsulate multiple entities?
Could the structure be simplified into smaller tables having singular meaning for each table and then "Relationships" added via primary key/foreign keys?
These are some of the questions you take into consideration while designing a database.
If you find answers to these questions, you will know yourself whether you should have a single table or multiple tables.
OK, I am creating a game. I have one table where I save a lot of information about a member, so it has many fields. How many fields is it normal to have in one table? Does it matter? Maybe I should split that info into two, three or four tables? What do you think?
Normalize the Database
If you feel you have too many columns, you probably have repeating groups, which suggests you should normalize the database. See an example here: Description of the database normalization basics
Hard MySQL Limits
MySQL 5.5 Column Count Limits:
Every table has a maximum row size of 65,535 bytes.
There is a hard limit of 4096 columns per table.
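To see the row-size limit in action, here is a made-up table that MySQL will refuse to create: with utf8mb4, each character can take up to 4 bytes, so the declared maximums below add up to far more than 65,535 bytes:
CREATE TABLE too_wide (
    a VARCHAR(10000),
    b VARCHAR(10000)
) CHARACTER SET utf8mb4; -- 2 x 10000 chars x 4 bytes = 80,000 bytes > 65,535: fails with 'Row size too large'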
Splitting of data into tables should generally not be dictated by the number of columns, but by the nature of the data. The process of splitting a large table into smaller ones is called normalization.
The only other reason I can think of to split a table is if you need the data in clusters, i.e. you often need columns A-D together or columns E-L, but never all the columns or columns D-F. Then you can split the table into two tables, one containing columns A-D and the primary key, the other containing columns E-L and the primary key, as sketched below.
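A sketch of that vertical split, with hypothetical names (both tables share the primary key):
CREATE TABLE widgets_ad (
    widget_id INT PRIMARY KEY,
    col_a INT, col_b INT, col_c INT, col_d INT
);

CREATE TABLE widgets_el (
    widget_id INT PRIMARY KEY, -- same key value as the matching widgets_ad row
    col_e INT, col_f INT, col_g INT, col_h INT,
    FOREIGN KEY (widget_id) REFERENCES widgets_ad (widget_id)
);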
Speaking about limits, MySQL says it's 4096 (source).
Yet I haven't seen tables that big; even the huge data-mining tables don't come close.
You shouldn't be concerned about it as long as your database is normalized. As soon as you spot the same data being stored twice (for example, a player table might have a player_type column storing some fixed values), it's worth moving such data to a separate table instead of duplicating the information in other tables, hence reducing the column count (but that's only a side effect of normalization, not something that drives it).
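For instance, a sketch of pulling those fixed player_type values into their own table (names are illustrative):
CREATE TABLE player_types (
    player_type_id TINYINT PRIMARY KEY,
    type_name VARCHAR(30) NOT NULL UNIQUE -- e.g. 'warrior', 'mage'
);

CREATE TABLE players (
    player_id INT AUTO_INCREMENT PRIMARY KEY,
    player_type_id TINYINT NOT NULL,
    FOREIGN KEY (player_type_id) REFERENCES player_types (player_type_id)
);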
I've never personally encountered one with more than 500 columns in it, but short of the maximum sizes there's no reason to have any fewer than the design demands. Beware of doing SELECT * FROM it, though.
"information about a member" - umm always difficult, but I always separate identifiable information into another table and use a salt key to link the 2 together. That way it is not as easy to "hijack" usernames and passwords etc. And you can always use the SALT as a session variable rather than username/password/userId or whatever.
Typically I only store a ID, salt and joining date in 1 table. As I said, the rest I try to "hide" so that they cannot be "linked/hijacked".
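A rough sketch of that layout, as I read it (purely illustrative names; this is one interpretation of the salt-link idea, not a vetted security scheme):
CREATE TABLE members_core (
    member_id INT AUTO_INCREMENT PRIMARY KEY,
    salt CHAR(32) NOT NULL UNIQUE,
    joined_on DATE NOT NULL
);

CREATE TABLE members_private (
    salt CHAR(32) PRIMARY KEY, -- links back to members_core without exposing member_id
    username VARCHAR(50) NOT NULL,
    password_hash CHAR(60) NOT NULL,
    FOREIGN KEY (salt) REFERENCES members_core (salt)
);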
Hope that helps.
How many fields are possible in one table?
Is maintaining 150 fields in one table a good way,
OR
should I maintain relationships with other tables?
Thanks
Bharanikumar
In the vast majority of cases, having 150 columns in a single table is symptomatic of a badly designed, unnormalized database.
You might want to read this and re-evaluate your db design.
To put it in your terms, go with "maintain relationships with other tables".
http://dev.mysql.com/doc/refman/5.0/en/column-count-limit.html
If you have a business need to have 150 columns, then it's a "good way". I've never seen such a business need, but that doesn't mean one doesn't exist. I have seen very wide tables used in OLAP-type cases, so if that's what you're doing, there's a good chance you're on the right track. If you're using this table for more OLTP-style functionality, then you're probably going down the wrong road. Perhaps if you provided a bit more info about what you're trying to accomplish, we could give some advice (instead of "do that" or "do it a different way").
150 bit-type fields might be OK, but you also have to consider the maximum record length your database will allow you to store. With varchar fields, most databases will let you create a table that would, in theory, violate the maximum if all the fields were filled to their maximum length. However, it won't let you actually add records which are too long. This is the kind of trap that can go along fine for years until someone puts just one character too many into a potential insert; then it blows up, and it generally takes a long time to find and fix such a problem. It is best to avoid ever designing a table where the total length of the columns is bigger than the maximum record length in bytes.
Narrower tables also tend to be faster to query.
Additionally, 150 columns is usually a sign that you really need to look at the design and see whether a related table would be better. For instance, if you have phone1, phone2 and phone3, then you need a related phone table.
If you genuinely need all 150 columns, consider which are likely to be queried together most often. Put those in the parent table. Then add the less often queried columns (or columns related only to a particular function) to the related table. There is no reason not to have a 1-1 relationship between tables; just use the id from the parent table as the PK in the child table as well as the FK to the parent table, as in the sketch below.
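A sketch of that 1-1 split, with hypothetical names; the child's PK doubles as the FK:
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
    -- ...the frequently queried columns...
);

CREATE TABLE customer_details (
    customer_id INT PRIMARY KEY, -- same value as the parent row's key
    fax VARCHAR(20),
    -- ...the rarely queried columns...
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);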
How much data should be in a table so that reading is optimal? Assuming that I have 3 fields varchar(25). This is in MySQL.
I would suggest that you consider the following in optimizing your database design:
Consider what you want to accomplish with the database. Will you be performing a lot of inserts to a single table at very high rates? Or will you be performing reporting and analytical functions with the data?
Once you've determined the purpose of the database, define what data you need to store to perform whatever functions are necessary.
Normalize till it hurts. If you're performing transaction processing (the most common function for a database) then you'll want a highly normalized database structure. If you're performing analytical functions, then you'll want a more denormalized structure that doesn't have to rely on joins to generate report results.
Typically, if you've really normalized the structure till it hurts then you need to take your normalization back a step or two to have a data structure that will be both normalized and functional.
A normalized database is mostly pointless if you fail to use keys. Make certain that each table has a primary key defined. Don't use surrogate keys just because it's what you always see; consider what natural keys might exist in any given table. Once you are certain that you have the right primary key for each table, you need to define your foreign key references. Establishing explicit foreign key relationships rather than relying on implicit definitions will give you a performance boost, provide integrity for your data, and self-document the database structure.
Look for other indexes that exist within your tables. Do you have a column or set of columns that you will search against frequently like a username and password field? Indexes can be on a single column or multiple columns so think about how you'll be querying for data and create indexes as necessary for values you'll query against.
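For example, if logins look up users by name, a composite index like this might fit (hypothetical table and column names):
CREATE INDEX idx_users_login ON users (username, password_hash); -- serves WHERE username = ? AND password_hash = ?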
The number of rows should not matter. Make sure the fields you're searching on are indexed properly. If you only have 3 varchar(25) fields, then you probably need to add a primary key that is not a varchar.
Agree that you should ensure that your data is properly indexed.
Apart from that, if you are worried about table size, you can always implement some type of data archival strategy later down the line.
Don't worry too much about this until you see problems cropping up, and don't optimise prematurely.
For optimal reading you should have an index. A table exists to hold the rows it was designed to contain. As the number of rows increases, the value of the index comes into play and reading remains brisk.
Phrased as such, I don't know how to answer this question. An indexed table of 100,000 records is faster than an unindexed table of 1,000.
What are your requirements? How much data do you have? Once you know the answer to these questions you can make decisions about indexing and/or partitioning.
This is a very loose question, so a very loose answer :-)
In general, if you do the basics - reasonable normalization, a sensible primary key and run-of-the-mill queries - then on today's hardware you'll get away with most things on a small to medium sized database, i.e. one where the largest table has fewer than 50,000 records.
However, once you get past 50k-100k rows, which roughly corresponds to the point when the RDBMS is likely to become memory constrained, then unless you have your access paths set up correctly (i.e. indexes), performance will start to fall off catastrophically. That is in the mathematical sense: in such scenarios it's not unusual to see performance deteriorate by an order of magnitude or two for a doubling in table size.
Obviously, therefore, the critical table size at which you need to pay attention will vary depending upon row size, machine memory, activity and other environmental issues, so there is no single answer; but it is well to be aware that performance generally does not degrade gracefully with table size, so plan accordingly.
I have to disagree with Cruachan about "50k - 100k rows .... roughly correspond(ing) to the point when the rdbms is likely to be memory constrained". This blanket statement is just misleading without two additional data points: the approximate size of a row, and the available memory. I'm currently developing a database to find the longest common subsequence (a la bio-informatics) of lines within source code files, and I reached millions of rows in one table, even with a VARCHAR field of close to 1000, before it became memory constrained. So, with proper indexing and sufficient RAM (a gig or two), as regards the original question, with rows of 75 bytes at most, there is no reason why the proposed table couldn't hold tens of millions of records.
The proper amount of data is a function of your application, not of the database. There are very few cases where a MySQL problem is solved by breaking a table into multiple subtables, if that's the intent of your question.
If you have a particular situation where queries are slow, it would probably be more useful to discuss how to improve that situation by modifying the query or the table design.