At which point does MySQL start treating VARCHAR cols like TEXT cols?

I'm aware that since MySQL 5, VARCHAR can have a length of up to roughly 65,000. VARCHAR is stored inline, which means faster retrievals, as opposed to TEXT, which is stored outside of the table. That said, the documentation states that MySQL will treat LONG VARCHAR exactly like TEXT.
According to this Source:
From storage prospective BLOB, TEXT as well as long VARCHAR are handled same way by Innodb. This is why Innodb manual calls it “long columns” rather than BLOBs.
When does MySQL start treating VARCHAR like TEXT? At what character count does MySQL make this distinction and stop storing the VARCHAR inline?

Short answer: A "long" VARCHAR is a normal VARCHAR and will be inline.
MySQL won't magically start treating a plain VARCHAR as a TEXT type. It'll always be stored inline. With 5.0.3, the upper limit for VARCHAR was relaxed to 65,535 bytes. A VARCHAR also needs a length prefix: 1 byte if the column can hold at most 255 bytes, 2 bytes otherwise. The effective limit is still subject to the maximum row size of 65,535 bytes. LONG VARCHAR is actually a different type, kept for backward compatibility, which maps to MEDIUMTEXT.
See: http://dev.mysql.com/doc/refman/5.0/en/char.html and http://dev.mysql.com/doc/refman/5.0/en/blob.html

Related

How to constrain varchar in mysql 5.1?

I need to create a column in MySQL 5.1 that can store users' feedback.
It shouldn't be too long, so I think no more than 1000 UTF-8 characters.
The question is how to represent this efficiently in MySQL 5.1.
For now I have:
`description` varchar NOT NULL,
But how to constrain varchar to hold at most 1000 characters of UTF-8?
From the documentation:
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
This means that you can store up to 65,535 bytes in a VARCHAR column. However, from the String Type Overview:
MySQL interprets length specifications in character column definitions in character units. (Before MySQL 4.1, column lengths were interpreted in bytes.) This applies to CHAR, VARCHAR, and the TEXT types.
So, declare your table with a utf8 character set and set the length of the VARCHAR to 1,000 characters, and MySQL will do the work for you behind the scenes.
Since the size is apparently defined in bytes, ...
-correction- Field size is defined in 'character units'. It's a bit unclear what they mean by that, but I guess they mean 'code units'.
Removed the rest of the detailed explanation, since it wasn't (entirely) true.
Correction: in MySQL you actually define the number of characters in the field. It is still limited by the 65,535-byte boundary, though. On top of that, MySQL reserves 3 bytes per character for UTF-8, which means that you cannot have UTF-8 fields of more than 21,844 characters; declaring a field as VARCHAR(21900) will simply fail for that reason: "Column length too big for column 'field1' (max = 21845); use BLOB or TEXT instead". The number in this message is off by one, by the way. The actual maximum is 21,844: 21,845 is 1/3 of 65,535, but I guess you need to subtract the two bytes for the length prefix as well.
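The arithmetic behind that limit can be sketched in a few lines of Python. This is only a back-of-the-envelope model, assuming a table with a single NOT NULL VARCHAR column, a fixed number of bytes reserved per character, and a 2-byte length prefix; the function name `max_varchar_chars` is made up for illustration and is not part of MySQL.

```python
ROW_SIZE_LIMIT = 65535   # maximum row size in bytes, shared by all columns
LENGTH_PREFIX = 2        # length-prefix bytes for a VARCHAR that may exceed 255 bytes


def max_varchar_chars(bytes_per_char, other_columns_bytes=0):
    """Largest declarable VARCHAR length (in characters) under this simplified model."""
    available = ROW_SIZE_LIMIT - LENGTH_PREFIX - other_columns_bytes
    return available // bytes_per_char


print(max_varchar_chars(3))  # utf8, 3 bytes reserved per character: 21844
print(max_varchar_chars(1))  # latin1, 1 byte per character: 65533
```

Any bytes used by other columns in the same row can be passed as `other_columns_bytes`, which lowers the result accordingly.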
The limit of 3 bytes is weird, though. Unicode is designed so that it can be extended with extra characters. There are already supplementary characters that take 4 bytes in UTF-8, and these cannot be stored in a UTF-8 varchar(1) field, or any UTF-8 varchar field for that matter, because MySQL's utf8 character set only covers characters of up to 3 bytes: "Incorrect string value: '\xF0\xA0\x9C\x8E' for column 'field1' at row 1". So I guess you would need an actual binary/blob column to be able to store these characters in 5.1. (Later versions, from MySQL 5.5.3 on, add the utf8mb4 character set for this.)
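A quick way to see why that particular value is rejected is to decode the bytes from the error message outside of MySQL. The Python 3 sketch below does just that; it assumes nothing about MySQL beyond the 3-byte-per-character limit described above.

```python
# Bytes taken from the "Incorrect string value" error message above.
raw = b"\xF0\xA0\x9C\x8E"

char = raw.decode("utf-8")     # the four bytes decode to a single character
code_point = ord(char)

print(len(raw))                # number of UTF-8 bytes: 4
print(hex(code_point))         # 0x2070e
print(code_point > 0xFFFF)     # True: outside the Basic Multilingual Plane
```

Any character above U+FFFF needs 4 bytes in UTF-8, which is more than a column that reserves 3 bytes per character can hold.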
I think the documentation about this subject is pretty poor, but I've tried some things and came to this conclusion. You can see the fiddle here: http://sqlfiddle.com/#!2/4d938
To the question:
So for your specific situation, declaring the field as varchar(1000) will do the trick, presuming you don't want people to use the supplementary characters in their feedback.
Some things to consider though:
I think a 'feedback' field of 1000 characters is pretty small. For many folks this will be enough, but if you have to say more, it is annoying if you can't. So I would make the field bigger.
varchar fields are stored in the record and consume a part of the maximum row size of 65,535 bytes. This is an important fact. You cannot have two UTF-8 varchar(20000) fields in a row, because together they would be larger than this maximum row size.
A better alternative for large text fields would therefore be to make them TEXT or MEDIUMTEXT, which can be even larger and are stored in a different way.
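The row-size point can be checked with a small Python sketch. It is a simplified model, assuming only VARCHAR columns, 3 bytes reserved per UTF-8 character, and a 1- or 2-byte length prefix per column; it ignores NULL flags and any engine-specific overhead, and the helper names are hypothetical.

```python
ROW_SIZE_LIMIT = 65535  # bytes, shared by all columns in the row


def varchar_row_bytes(max_chars, bytes_per_char=3):
    """Worst-case bytes a VARCHAR(max_chars) column counts toward the row size."""
    data_bytes = max_chars * bytes_per_char
    prefix = 1 if data_bytes <= 255 else 2  # length prefix
    return data_bytes + prefix


def fits_in_row(column_lengths, bytes_per_char=3):
    """True if all the given VARCHAR columns fit within the row-size limit."""
    total = sum(varchar_row_bytes(n, bytes_per_char) for n in column_lengths)
    return total <= ROW_SIZE_LIMIT


print(fits_in_row([1000]))          # True
print(fits_in_row([20000, 20000]))  # False: 2 * (60000 + 2) = 120004 bytes
print(fits_in_row([20000, 1000]))   # True: 60002 + 3002 = 63004 bytes
```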
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions.
http://dev.mysql.com/doc/refman/5.0/en/char.html

MySQL TEXT or VARCHAR

We have a very large historical table that contains a column with at most 500 UTF-8 characters, and the disk space grows really fast!
We're adding at least 2 million rows a day, and we were wondering which would do a better job (mostly in storage, but in performance as well): TEXT or VARCHAR(512)?
VARCHAR is probably preferable in your case from both the storage and performance perspective. View this oft-reposted article.
This is useful information; I think in general the answer is that VARCHAR is usually the better bet.
From the MySQL manual:
In most respects, you can regard a BLOB column as a VARBINARY column that can be as large as you like. Similarly, you can regard a TEXT column as a VARCHAR column. BLOB and TEXT differ from VARBINARY and VARCHAR in the following ways:
There is no trailing-space removal for BLOB and TEXT columns when values are stored or retrieved. Before MySQL 5.0.3, this differs from VARBINARY and VARCHAR, for which trailing spaces are removed when values are stored.
On comparisons, TEXT is space extended to fit the compared object, exactly like CHAR and VARCHAR.
For indexes on BLOB and TEXT columns, you must specify an index prefix length. For CHAR and VARCHAR, a prefix length is optional. See Section 7.5.1, “Column Indexes”.
BLOB and TEXT columns cannot have DEFAULT values.
http://dev.mysql.com/doc/refman/5.0/en/blob.html

Confusion about varchar datatype

My server runs MySQL 5.0.91-community. I have to store a long string of roughly 500 characters. I thought of going with the TEXT data type, but someone told me it hurts performance, so I wanted to know more about VARCHAR and its limits.
I used to think that VARCHAR was limited to 255 characters, but then I read somewhere that it can store more than that in newer versions, i.e. >= 5.0.3. As I am using 5.0.91, what do you think I should use? If I declare it as varchar(1000), is that still valid?
Thank you.
The documentation is here.
VARCHAR has a maximum size of 65,535 in MySQL 5.0.3 and later; before 5.0.3 the limit was 255.
Note that the effective size is less:
The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
You have to specify the max size, e.g. varchar(1000). Just stating varchar isn't enough.
From The CHAR and VARCHAR Types:
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
According to the MySQL doc:
TEXT differs from VARCHAR in the following ways:
There is no trailing-space removal for TEXT columns when values are stored or retrieved. Before MySQL 5.0.3, this differs from VARCHAR, for which trailing spaces are removed when values are stored.
For indexes on TEXT columns, you must specify an index prefix length. For CHAR and VARCHAR, a prefix length is optional.
TEXT columns cannot have DEFAULT values.
Apart from these differences, using VARCHAR is like using TEXT, so size is not what should make you choose between the two, unless you really need to cap the value at 1000 characters.
In MySQL, VARCHAR accepts a declared length of up to 65,535.
You can check this yourself very easily. The MySQL documentation is openly accessible, and it says:
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions.
As for performance, it doesn't matter: it is the data relations, not the data type, that affect performance.

Why would I use VARCHAR over TEXT in MySQL?

I noticed that in MySQL, both VARCHAR and TEXT offer variable-sized data. Well, VARCHAR is a bit more efficient in data storage, but still, TEXT, MEDIUMTEXT, and LONGTEXT offer a lot more potential. So, what are the real uses of VARCHAR?
First of all, you should read section 10.4, String Types, of the MySQL manual: it'll give you all the information you are looking for:
10.4.1. The CHAR and VARCHAR Types
10.4.3. The BLOB and TEXT Types
A couple of important differences:
Difference in the amount of text they can contain:
VARCHAR has quite a small size limit; with the newest versions of MySQL, it's 64 KB for the total of all VARCHAR columns of a row, which is not that much.
The TEXT types have virtually no limit, as the largest of them (LONGTEXT) can contain something like 2^32 bytes.
There are differences in indexing and sorting, if I'm not mistaken; quoting the page about TEXT:
About sorting: "Only the first max_sort_length bytes of the column are used when sorting."
And, about performance: "Instances of BLOB or TEXT columns in the result of a query that is processed using a temporary table causes the server to use a table on disk rather than in memory"
Considering this information, if you are sure that your strings will not be too long, and that you'll always be able to store them in a VARCHAR, I would use a VARCHAR.

Which MySQL datatype is more space efficient for scalable apps? TEXT or VARCHAR?

I'm building a highly scalable app, and I need to know which data type to use for small strings (50-1000 chars). I heard that VARCHAR is fixed-size and therefore might be faster, but with TEXT, chars might be stored in a separate CLOB, with only a pointer in the row data. This would be smaller, but would it have any significant performance hit?
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions.
If you have MySQL 5.0.3 or later, and don't need more than about 65k bytes, it doesn't really matter which one you use, because both TEXT and VARCHAR have a variable size in storage. If the column's values never need more than 255 bytes, you can save one byte per value by choosing VARCHAR, because its length prefix is then 1 byte instead of the 2 bytes TEXT uses.
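The per-value overhead can be sketched in Python. This is a rough model based on the storage formulas in the MySQL manual (value bytes plus a 1- or 2-byte length prefix for VARCHAR, value bytes plus 2 for TEXT); storage engines have their own row formats, so treat the numbers as approximate. The function names and the 3-bytes-per-character assumption for utf8 are illustrative.

```python
def varchar_storage(value, max_chars, encoding="utf-8", max_bytes_per_char=3):
    """Bytes for one VARCHAR(max_chars) value: data plus a 1- or 2-byte length prefix."""
    # The prefix size depends on what the column may hold, not on this value.
    prefix = 1 if max_chars * max_bytes_per_char <= 255 else 2
    return len(value.encode(encoding)) + prefix


def text_storage(value, encoding="utf-8"):
    """Bytes for one TEXT value: data plus a 2-byte length prefix."""
    return len(value.encode(encoding)) + 2


note = "short note"  # 10 ASCII characters = 10 bytes

print(varchar_storage(note, max_chars=50))    # 11
print(varchar_storage(note, max_chars=1000))  # 12
print(text_storage(note))                     # 12
```

Under this model the one-byte saving only applies when the declared column is small enough for a 1-byte prefix; a VARCHAR(1000) costs the same per value as TEXT.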
But on a completely different note, I would always choose the data type that is most appropriate for the data. If you store text that is semantically free text and can easily exceed "standard" sizes, you should use the TEXT data type.