According to this SO post, the maximum URL length accepted by IE is about 2048 characters. However, this seems far too big for my varchar field in MySQL, as most URLs are much shorter, typically around 200 characters. Is this field meant to be set to the maximum or the average?
Don't worry -- you can still set the max size to 2048. This is just a maximum -- if a URL only takes 200 characters then that's all the DB engine will use.
If you use varchar, it won't matter. Check out the example from the MySQL docs.
If you decide to go with char, on the other hand, you will be storing a constant amount of data. Then you may wish to only store the domain -- a domain name is allowed to be up to 253 characters long. I suppose if you wanted to draw a line in the sand, that would probably be a reasonable one. You need to let the user know all of this BTW. Otherwise things could get bad.
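To make the CHAR versus VARCHAR trade-off above concrete, here is a rough Python sketch. It is not MySQL's storage code: it assumes a single-byte character set such as latin1, a 1-byte VARCHAR length prefix, and a made-up list of domain names. The function names are hypothetical.

```python
def char_bytes(declared_len: int) -> int:
    """CHAR(n) is padded to its full width (single-byte charset assumed)."""
    return declared_len


def varchar_bytes(value: str, prefix_bytes: int = 1) -> int:
    """VARCHAR stores the actual data plus a short length prefix."""
    return len(value.encode("latin-1")) + prefix_bytes


# Hypothetical sample data, purely for illustration.
domains = ["example.com", "dev.mysql.com", "stackoverflow.com"]

fixed = sum(char_bytes(253) for _ in domains)
variable = sum(varchar_bytes(d) for d in domains)

print(f"CHAR(253):    {fixed} bytes")     # 3 * 253 = 759
print(f"VARCHAR(253): {variable} bytes")  # 12 + 14 + 18 = 44
```

Under these assumptions the fixed-width column costs the same for every row, while the variable-width column tracks the data actually stored.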
I have learnt that VARCHAR occupies only the memory that is required, unlike CHAR, which always occupies the same amount of memory whether it is needed or not.
My question: Suppose I have a field VARCHAR(50). I know that if it needs to store 30 characters, it will only occupy 30 bytes and no more than that (assuming 1 char takes one byte). So why should I even mention 50 or 30 or any upper limit, since it will only take the memory that is required?
UPDATE: Why do I have to mention the upper limit since there will be no useless memory occupied?
If you are sanitizing your inputs with something like final_value = left(provided_value, 30) then it's a non-issue for your database. You can set it to varchar(255) if you like.
The idea of putting the max limit is to ensure you don't mistakenly send more chars than what you actually plan for.
It would be a pain for future code maintenance to have to recall the data size limit of every column of every table. You need to know those limits anyway, but your table definitions can serve as the single source of that information.
Will a table be written to (insert/update) from only one piece of code in your app or website? If there is another interface to the database, say a REST API listener, and you don't re-implement the same limits there, you'll end up with non-uniform data, which is exactly what databases are able to prevent.
If a coding error (or hack) bypasses your app/website controls for data (size limits, or worse) then at least your db will still be maintaining the data correctly.
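The application-side sanitizing mentioned above, left(provided_value, 30), could look roughly like this in Python. This is only a sketch: MAX_LEN and both function names are hypothetical, and the second function shows the stricter alternative of rejecting the input instead of silently cutting it.

```python
MAX_LEN = 30  # hypothetical limit, matching a VARCHAR(30) column


def truncate(provided_value: str, limit: int = MAX_LEN) -> str:
    """Mirror left(provided_value, 30): keep only the first `limit` characters."""
    return provided_value[:limit]


def validate(provided_value: str, limit: int = MAX_LEN) -> str:
    """Reject over-long input instead of silently dropping data."""
    if len(provided_value) > limit:
        raise ValueError(
            f"value has {len(provided_value)} characters, limit is {limit}"
        )
    return provided_value


final_value = truncate("x" * 40)
print(len(final_value))  # 30
```

Either way, the column definition remains the backstop for any code path that skips these checks.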
You wouldn't. You would make it VARCHAR(30). It's about the maximum number of characters allowed. So why would you even make a column that takes 30 characters accept anything up to 50?
To keep things flexible you use VARCHAR(50), because the string size may grow in future and you know the maximum will be 50. For constant-length values you can use CHAR(30), which means the stored string always has length 30: MySQL pads shorter values with spaces and, depending on the SQL mode, truncates or rejects longer ones.
take a look
http://dev.mysql.com/doc/refman/5.0/en/char.html
When memory usage is your only concern, you can give any large number to varchar. But if you want to make sure that an upper limit is kept, then you give that as the maximum to varchar.
You can take VARCHAR(50) or VARCHAR(30). It's not a problem, but if the data is dynamic we can't tell the limit in advance.
In that case we take the maximum limit.
Here is something that troubles me as I am creating the columns of a database table. Each of them has a data type, which has its length. For example, say one of the columns is a file path, and I assume this file path will never be longer than 100 characters, so obviously I specify this as
filepath Varchar(100)
However, this still takes the same amount of memory space as, say, varchar(255), which is 1 byte. Given this, what is the benefit of specifying the length as 100? Taking an outlier example, if my filepath exceeds varchar(100), does the database reject or trim the filepath value to fit it into 100? Or does it allow it to exceed 100, since the allotted memory space is still around 1 byte?
Essentially, my question is: should one try to be very specific about the expected maximum length for a table column? Or just play it safe and specify the upper limit of the expected length depending on the memory requirement?
Thanks much !
Parijat
In non-strict SQL mode MySQL will auto-truncate the value down to 100 characters (with a warning); in strict mode the insert is rejected with an error instead. The number in the brackets for text/char fields is the MAXIMUM length. Note that this is a CHARACTER limit. If you've got a multibyte character set on that field, you can store more than 100 bytes in the field, but only 100 characters' worth of text.
This is different from saying int(10), where the bracketed number is for display purposes only. An int is an int internally and takes up 32 bits (4 bytes), regardless of how many digits you specify with the (#); the number only affects how values are padded for display (with ZEROFILL), not what can be stored.
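The character-versus-byte distinction can be checked quickly in Python. This sketch assumes UTF-8 as the multibyte encoding; `fits_varchar` is a hypothetical helper that models only the character-count rule, nothing else about MySQL.

```python
def fits_varchar(value: str, max_chars: int) -> bool:
    """A VARCHAR(n) limit counts characters, not bytes."""
    return len(value) <= max_chars


text = "\u00e9" * 100  # 100 copies of 'é', which takes 2 bytes in UTF-8

chars = len(text)
utf8_bytes = len(text.encode("utf-8"))

print(chars)                    # 100
print(utf8_bytes)               # 200
print(fits_varchar(text, 100))  # True
```

So a 100-character value can legitimately occupy more than 100 bytes and still fit the column.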
very specific about the expected maximum length for a table column? Or just play it safe
If one would make a table containing addresses, you undoubtedly know that there will be some kind of limit to the length of the address. It would be useless to allow longer fields in the database.
You should play it safe, and be very careful.
I'm working with some database abstraction layers, and most of them use attributes like "String", which is VARCHAR 250, or INTEGER, which has a length of 11 digits. But, for example, I have something that will be less than 250 characters long. Should I go and make it smaller? Does it really make any valuable difference?
Thanks in advance!
INT length does nothing. All INTs are 4 bytes. The number you can set, is only used for zerofill (and who uses that!?).
VARCHAR length does more. It's the max length of the field. VARCHAR is saved so that only the actual data is stored, so the declared length doesn't matter for storage. These days, you can have VARCHARs bigger than 255 bytes (up to 256^2-1 = 65,535 bytes). The difference is in the bytes that are used to record the field length. VARCHAR(100) and VARCHAR(8) and VARCHAR(255) use 1 byte to save the field length. VARCHAR(1000) uses 2.
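The length-prefix rule can be sketched in a few lines of Python. This is an approximation for illustration, not MySQL's implementation: it assumes the prefix is 1 byte when the column can never exceed 255 bytes and 2 bytes otherwise, and it ignores row and page overhead. Both function names and the `max_bytes_per_char` parameter are hypothetical.

```python
def length_prefix_bytes(declared_chars: int, max_bytes_per_char: int = 1) -> int:
    """1 prefix byte if the column can never exceed 255 bytes, otherwise 2."""
    return 1 if declared_chars * max_bytes_per_char <= 255 else 2


def stored_bytes(value: str, declared_chars: int,
                 encoding: str = "latin-1", max_bytes_per_char: int = 1) -> int:
    """Actual data bytes plus the length prefix (other overhead ignored)."""
    data = len(value.encode(encoding))
    return data + length_prefix_bytes(declared_chars, max_bytes_per_char)


for n in (8, 100, 255, 1000):
    print(f"VARCHAR({n}): {stored_bytes('Mario', n)} bytes")
# VARCHAR(8), VARCHAR(100) and VARCHAR(255) give 6 bytes; VARCHAR(1000) gives 7
```

Note that with a multibyte character set the threshold is reached sooner: under this rule a VARCHAR(100) column at up to 4 bytes per character would already need the 2-byte prefix.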
Hope that helps =)
edit
I almost always make my VARCHARs 250 long. Actual length should be checked in the app anyway. For bigger fields I use TEXT (and those are stored differently, so can be much much longer).
edit
I don't know how current this is, but it used to help me (understand): http://help.scibit.com/Mascon/masconMySQL_Field_Types.html
First, remember that the database is meant to store facts and is designed to protect itself against bad data. Thus, the reason you do not want to allow a user to enter 250 characters for a first name is that a user will put all kinds of data in there that is not a first name. They'll put their whole name, their underwear size, a novel about what they did last summer and so on. Thus, you want to strive to enforce that the data is as correct as possible. It is a mistake to assume that the application is the sole protector against bad data. You want users to tell you that they had a problem stuffing War and Peace into a given column.
Thus, the most important question is, "What is the most appropriate value for the data being stored?" Ideally, you would use an int and a check constraint to ensure that the values have an appropriate range (e.g. greater than zero, less than a billion etc.). Unfortunately, this is one of MySQL's greatest weaknesses: it does not honor check constraints (versions before 8.0.16 parse them but silently ignore them). That simply means you must implement those integrity checks in triggers, which admittedly is more cumbersome.
Will choosing an int (4 bytes) over a tinyint (1 byte) make an appreciable difference? Obviously, it depends on the amount of data. If you will have no more than 10 rows, the answer is obviously "No". If you will have 10 billion rows, the answer is obviously "Yes". However, IMO, this is premature optimization. It is far better to focus on ensuring correctness first.
For text, you should ask whether your data should support Chinese, Japanese or non-ANSI values (i.e., should you use nvarchar or varchar)? Does this value represent a real world code like a currency code, or bank code which has a specific specification?
Not so sure in MySQL, but in MS SQL it only makes a difference for sufficiently large databases. Typically, I like to use smaller fields for a) the space saving (it never hurts to practice good habits) and b) for the implied validation (if you know a certain field should never be more than 10 characters, why allow eleven, let alone 250?).
I think Rudie is wrong; not all INTs are 4 bytes. In MySQL you have:
tinyint = 1 byte,
smallint = 2 bytes,
mediumint = 3 bytes,
int = 4 bytes,
bigint = 8 bytes.
I think Rudie refers to the "display width", that is, the number you put between parentheses when you are creating a column, e.g.:
age INT(3)
You're only telling the RDBMS to use a display width of 3 digits; this does not limit the values the column can store.
And VARCHARs are variable-length character strings, so if you declare, let's say, name varchar(5000) and you store a name like "Mario", you are only using 7 bytes (5 for the data and 2 for the length of the value).
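As a rough illustration of the two points above (storage size depends on the integer type, while the number in parentheses only affects display), here is a small Python sketch. The byte sizes are the ones listed above; the `zerofill` helper is hypothetical and only imitates zero-padded display of a non-negative value, it is not MySQL code.

```python
# Storage size in bytes for each MySQL integer type, as listed above.
INT_TYPE_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "MEDIUMINT": 3,
    "INT": 4,
    "BIGINT": 8,
}


def signed_range(n_bytes: int) -> tuple[int, int]:
    """Smallest and largest signed value that fits in n_bytes."""
    bits = 8 * n_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def zerofill(value: int, display_width: int) -> str:
    """Pad a non-negative value with leading zeros; never cuts digits off."""
    return str(value).zfill(display_width)


for name, size in INT_TYPE_BYTES.items():
    low, high = signed_range(size)
    print(f"{name:<9} {size} byte(s)  {low} .. {high}")

print(zerofill(7, 3))      # 007
print(zerofill(12345, 3))  # 12345 (wider values are shown in full)
```

The range is fixed by the type alone, so INT(3) and INT(11) hold exactly the same values.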
The correct field size serves to limit the bad data that can be put in. For instance suppose you have a phone number field. If you allow 250 characters, you will often end up with things like the following in the phone field (an example not taken at random):
Call the good-looking blonde secretary instead.
So first limiting the length is part of how we enforce data integrity rules. As such it is critical.
Second, there is only so much space on a datapage and while some databases will allow you to create tables where the potential record is longer than the width of the data page, they often will not allow you to actually exceed it when storing the data. This can lead to some very hard to find bugs when suddenly one record can't be saved. I don't know about MySql and whether it does this but I know SQL Server does and it is very hard to figure out what is wrong. So making data the correct size can be critical to preventing bugs.
When I add user info to MySQL through a PHP registration form, there are limits on the data fields (e.g. name is 20 chars max, email 18 chars, additional info 200, pass 12 chars, etc.).
Should I create exactly the same field lengths in the MySQL table, or should I define longer fields?
Are there any benefits of doing so rather than just creating all string fields e.g. 500 characters long?
When storing age as an integer, should I use a small int (i.e. with max 256) or not?
In general, it doesn't really matter. The important part is how you validate the information on the server side.
Make sure the entered data does not exceed the size of the column. If you don't, you can run into issues where mysql will auto-truncate the data.
Don't limit the password size. If someone wants to enter a 200 character password, let them. You should be storing it as a strong hash and not in plain text, so the exact length shouldn't make a difference.
Always store your data types properly. If you expect an integer age, store it in an integer column. There's no real reason to store it in a string column type.
As far as the rest of your limits, it's really application dependent more than anything. If you expect 200 character info limit, then store it in a VARCHAR(200). But if you're just assuming, store it in a TEXT type so that the user can enter as much as they'd like. But that's more application and use-case dependent than anything else...
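To see why password length does not affect the column size once you store a hash, consider this Python sketch. It uses PBKDF2 from the standard library purely as an example of a salted hash (the answer above does not name an algorithm); the iteration count and the `hash_password` name are assumptions chosen for illustration.

```python
import hashlib
import os


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> str:
    """Return a hex digest; its length does not depend on the password length."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return digest.hex()


salt = os.urandom(16)  # a fresh random salt per user
short_hash = hash_password("hunter2", salt)
long_hash = hash_password("x" * 200, salt)

print(len(short_hash), len(long_hash))  # 64 64
```

With this scheme the hash column can be a fixed CHAR(64), whatever length of password the user chooses.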
Suggest you be liberal with your database column lengths (for your varchar), but strict on your application in enforcing size/lengths.
The business logic may change over time. Your application tier will be the keeper and enforcer of those rules.
Your database shouldn't have to adjust often to the changing business rules regarding length. Defining a column of type varchar(100) doesn't cost you anything today. The length is variable up to 100, so your performance and storage won't suffer at all.
Application and database changes/maintenance are expensive; database storage is cheap.
Some other detailed suggestions, if you will:
don't store age. Derive it from a date (birthdate) by using math (Today-Birthdate).
passwords shouldn't be stored as such (store a hash instead), nor should they have a max length!
all your string fields -- define them as varchar(256) or 1024 and be done with them. Let your application enforce the business rules of the day.
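The suggestion to derive age from a birthdate instead of storing it could be sketched as follows in Python. The function name and the sample dates are hypothetical; the logic simply subtracts the years and takes one off if this year's birthday has not happened yet.

```python
from datetime import date


def age_on(birthdate: date, today: date) -> int:
    """Whole years elapsed between birthdate and today."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


born = date(1990, 6, 15)
print(age_on(born, date(2024, 6, 14)))  # 33 (the day before the birthday)
print(age_on(born, date(2024, 6, 15)))  # 34 (on the birthday)
```

Because the age is computed on demand, the stored row never goes stale.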
If you're using the MyISAM table type it will be a bit more efficient for queries if you keep the record length static. So if you can use char fields of a fixed size instead of varchar it's better (as long as all the fields in a record are static). However, the whole number of characters you specify will be blocked out in memory so you need to decide if memory usage is more important.
Unless your application is huge and your DB is going to be massive, this should matter very little in terms of performance. I would say that you should give yourself a bit of extra room in the MySQL fields as it is a lot easier to change the registration form max lengths than the MySQL max lengths later. Using smallint for age is fine. Or not. Generally you shouldn't allow for fields to take up more space than they are going to need, but I would give myself some padding just in case. Again, though, it shouldn't make much of a difference.
In MySQL you can choose a length for the VARCHAR field type. Possible values are 1-255.
But what are its advantages if you use VARCHAR(255) that is the maximum instead of VARCHAR(20)? As far as I know, the size of the entries depends only on the real length of the inserted string.
size (bytes) = length+1
So if you have the word "Example" in a VARCHAR(255) field, it would have 8 bytes. If you have it in a VARCHAR(20) field, it would have 8 bytes, too. What is the difference?
I hope you can help me. Thanks in advance!
Check out: Reference for Varchar
In short there isn't much difference unless you go over the size of 255 in your VARCHAR which will require another byte for the length prefix.
The length indicates more of a constraint on the data stored in the column than anything else. This inherently constrains the MAXIMUM storage size for the column as well. IMHO, the length should make sense with respect to the data. If you're storing a Social Security number, it makes no sense to set the length to 128, even though it doesn't cost you anything in storage if all you actually store is an SSN.
There are many valid reasons for choosing a value smaller than the maximum that are not related to performance. Setting a size helps indicate the type of data you are storing and can also act as a last-gasp form of validation.
For instance, if you are storing a UK postcode then you only need 8 characters. Setting this limit helps make clear the type of data you are storing. If you chose 255 characters it would just confuse matters.
I don't know about mySQL but in SQL Server it will let you define fields such that the total number of bytes used is greater than the total number of bytes that can actually be stored in a record. This is a bad thing. Sooner or later you will get a row where the limit is reached and you cannot insert the data.
It is far better to design your database structure to consider row size limits.
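A quick way to check a table design against a row size limit is to add up the worst-case size of every column. The Python sketch below is deliberately simplified: it counts only the declared maximum data bytes of VARCHAR columns and ignores length prefixes and other per-row overhead, and ROW_LIMIT is a hypothetical figure, so look up the real limit for your database engine.

```python
ROW_LIMIT = 8000  # hypothetical limit in bytes; check your engine's documentation


def worst_case_row_bytes(varchar_lengths, bytes_per_char: int = 1) -> int:
    """Sum of the maximum data bytes each VARCHAR column could ever hold."""
    return sum(n * bytes_per_char for n in varchar_lengths)


def exceeds_limit(varchar_lengths, row_limit: int = ROW_LIMIT,
                  bytes_per_char: int = 1) -> bool:
    """True if a fully populated row could not fit within row_limit."""
    return worst_case_row_bytes(varchar_lengths, bytes_per_char) > row_limit


generic = [255] * 40  # forty columns all declared VARCHAR(255)
sized = [50] * 40     # the same forty columns sized to the data

print(worst_case_row_bytes(generic), exceeds_limit(generic))  # 10200 True
print(worst_case_row_bytes(sized), exceeds_limit(sized))      # 2000 False
```

A design that passes this check cannot hit the limit at insert time, which is the hard-to-find failure described above.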
Additionally yes, you do not want people to put 200 characters in a field where the maximum value should be 10. If they do, it is almost always bad data.
You say, well I can limit that at the application level. But data does not get into the database just from one application. Sometimes multiple applications use it, sometimes data is imported and sometimes it is fixed manually from the query window (update all the records to add 10% to the price for instance). If any of these other sources of data don't know about the rules you put in your application, you will have bad, useless data in your database. Data integrity must be enforced at the database level (which doesn't stop you from also checking before you try to enter data) or you have no integrity. Plus it has been my experience that people who are too lazy to design their database are often also too lazy to actually put the limits into the application and there is no data integrity check at all.
They have a word for databases with no data integrity - useless.
There is a semantic difference (and I believe that's the only difference): if you try to put 30 non-space characters into varchar(20), it will produce an error, whereas it will succeed for varchar(255). So it is primarily an additional constraint.
Well, if you want to allow for a larger entry, or limit the entry size perhaps.
For example, you may have first_name as a VARCHAR 20, but perhaps street_address as a VARCHAR 50 since 20 may not be enough space. At the same time, you may want to control how large that value can get.
In other words, you have set a ceiling of how large a particular value can be, in theory to prevent the table (and potentially the index/index entries) from getting too large.
You could just use CHAR, which is fixed width, as well, but unlike VARCHAR, which can be smaller, CHAR pads the values (although this makes for quicker SQL access).
From a database perspective performance wise I do not believe there is going to be a difference.
However, I think a lot of the decision on the length to use comes down to what you are trying to accomplish and documenting the system to accept just the data that it needs.