My table:
create table test( pk bigint primary key,
value1 varchar(255),
value2 varchar(255),
value3 varchar(255),
value4 varchar(255),
value5 varchar(255),
value6 varchar(255),
value7 varchar(255),
value8 varchar(255),
value9 varchar(255),
value10 varchar(255),
value11 varchar(255)
);
Insert Query:
insert into test values(1, '‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱
‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱','‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱‱');
My page size is 16KB, so a row in my table can contain at most 8192 bytes (i.e. 8KB).
I have created 11 VARCHAR columns (255 characters each), so these 11 columns can hold at most 255*11 = 2805 characters.
If I store 2805 3-byte characters, they take (255*11*3) = 8415 bytes, which exceeds the maximum row size (8192 bytes).
I then ran the single-row insert above, which carries 8415 bytes of data, but MySQL accepted it without throwing an error.
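As a quick sanity check of the arithmetic above, here is a small Python sketch (plain Python, not MySQL; the constant names are only illustrative). It assumes the character is encoded as UTF-8, which gives the same bytes as utf8mb3 for Basic Multilingual Plane characters such as ‱ (U+2031).

```python
# Sanity check of the byte arithmetic in the question.
# Assumption: UTF-8 encoding stands in for MySQL's utf8mb3 here.
CHAR = "\u2031"          # the per-ten-thousand sign used in the INSERT
COLUMNS = 11             # value1 .. value11
CHARS_PER_COLUMN = 255   # VARCHAR(255)
HALF_PAGE_BYTES = 8192   # half of a 16KB page, as stated in the question

bytes_per_char = len(CHAR.encode("utf-8"))
total_chars = COLUMNS * CHARS_PER_COLUMN
total_bytes = total_chars * bytes_per_char

print(bytes_per_char, total_chars, total_bytes)  # prints: 3 2805 8415
print(total_bytes > HALF_PAGE_BYTES)             # prints: True
```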
3 byte character - ‱
Row Format - DYNAMIC
Character set - utf8mb3
Collation - utf8_general_ci
MySQL - 5.7
Update: The same thing happens with CHAR columns too (I changed VARCHAR to CHAR), even though CHAR is a fixed-length type.
You misunderstood the InnoDB row size limit description. Quote from the MySQL manual (emphasis is mine):
The maximum row length, except for variable-length columns (VARBINARY, VARCHAR, BLOB and TEXT), is slightly less than half of a page for 4KB, 8KB, 16KB, and 32KB page sizes. For example, the maximum row length for the default innodb_page_size of 16KB is about 8000 bytes.
...
If a row is less than half a page long, all of it is stored locally within the page. If it exceeds half a page, variable-length columns are chosen for external off-page storage until the row fits within half a page, as described in Section 14.12.2, “File Space Management”.
Since your fields are varchar (a variable-length field type), the data above the half-page limit is simply stored in another, off-page location, so your SQL statement succeeds.
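To make the "chosen for off-page storage until the row fits" idea concrete, here is a rough Python model. It is only a sketch under stated assumptions, not InnoDB's actual algorithm: it assumes the DYNAMIC row format (a moved column leaves a 20-byte pointer in the row), picks the longest columns first, takes "about 8000 bytes" from the manual quote as the limit, and ignores record headers and length bytes. The function name is hypothetical.

```python
# Simplified model of off-page selection. Assumptions: DYNAMIC row format,
# 20-byte in-row pointer per moved column, no record header or length bytes.
POINTER_BYTES = 20

def columns_moved_off_page(column_bytes, fixed_bytes, limit):
    """Return (indexes of columns moved off-page, resulting in-row size)."""
    in_row = list(column_bytes)
    moved = []
    # Consider the longest variable-length columns first.
    longest_first = sorted(range(len(in_row)),
                           key=lambda i: column_bytes[i], reverse=True)
    for i in longest_first:
        if fixed_bytes + sum(in_row) <= limit:
            break  # the row already fits
        if in_row[i] <= POINTER_BYTES:
            break  # moving this column would not save any space
        in_row[i] = POINTER_BYTES
        moved.append(i)
    return moved, fixed_bytes + sum(in_row)

# 11 columns of 255 three-byte characters plus an 8-byte BIGINT key.
moved, size = columns_moved_off_page([765] * 11, fixed_bytes=8, limit=8000)
print(moved, size)  # prints: [0] 7678
```

In this model a single column moved off-page is enough to bring the row from 8423 bytes down to 7678, which is why the insert does not fail.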
edit
For char fields the behaviour depends on the row format and the character set used.
A char field's length is fixed in terms of number of characters, but depending on the character set, the byte length may be either fixed (e.g. latin1 is fixed 1 byte / character), or variable (e.g. utf8mb3 is variable length 1-3 bytes / character).
For the compact row format the character set is irrelevant: you get an error message when you try to create the table if the maximum possible byte length exceeds the data page limit derived from the page size configuration.
For the dynamic row format, if the character set is fixed length and the byte length exceeds the data page limit, you get an error when you try to create the table. However, if the character set is variable length, the data gets stored in overflow pages.
As SHADOW said, for VARCHAR columns:
If a row is less than half a page long, all of it is stored locally within the page. If it exceeds half a page, variable-length columns are chosen for external off-page storage until the row fits within half a page
For CHAR columns
InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
https://forums.mysql.com/read.php?24,645115,645215#msg-645215
Where mysql internally converts fixed-length fields to variable-length fields
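The 768-byte threshold from the quote above can be checked with a few lines of Python. This is only an illustration of the arithmetic; the table of maximum bytes per character and the function names are my own, not anything MySQL exposes.

```python
# Maximum bytes per character for a few character sets.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}
OFF_PAGE_THRESHOLD = 768  # bytes, from the quoted documentation

def char_max_bytes(n, charset):
    """Maximum byte length of a CHAR(n) column in the given character set."""
    return n * MAX_BYTES_PER_CHAR[charset]

def treated_as_variable_length(n, charset):
    """True if CHAR(n) reaches the 768-byte threshold from the quote."""
    return char_max_bytes(n, charset) >= OFF_PAGE_THRESHOLD

for cs in ("latin1", "utf8mb3", "utf8mb4"):
    print(cs, char_max_bytes(255, cs), treated_as_variable_length(255, cs))
# prints:
# latin1 255 False
# utf8mb3 765 False
# utf8mb4 1020 True
```

Note that CHAR(255) in utf8mb3 tops out at 765 bytes, just under the threshold, so this particular rule only kicks in for utf8mb4.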
https://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html
Internally, for variable-length character sets such as utf8mb3 and utf8mb4, InnoDB attempts to store CHAR(N) in N bytes by trimming trailing spaces. If the byte length of a CHAR(N) column value exceeds N bytes, trailing spaces are trimmed to a minimum of the column value byte length. The maximum length of a CHAR(N) column is the maximum character byte length × N.
A minimum of N bytes is reserved for CHAR(N). Reserving the minimum space N in many cases enables column updates to be done in place without causing index page fragmentation. By comparison, CHAR(N) columns occupy the maximum character byte length × N when using the REDUNDANT row format.
-- https://dev.mysql.com/doc/refman/5.6/en/innodb-row-format.html#innodb-compact-row-format-characteristics
The above refers to ROW_FORMAT = COMPACT, DYNAMIC, and COMPRESSED (before compression). And it is talking about on-page storage of CHAR.
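Read literally, the quoted rule says a CHAR(N) value occupies at least N bytes, and more only if the value (without trailing spaces) needs more. A minimal Python sketch of that reading follows; it assumes UTF-8 as a stand-in for utf8mb3, and the function names are hypothetical.

```python
def compact_char_bytes(value, n, encoding="utf-8"):
    """On-page bytes for a CHAR(n) value under the trimming rule quoted above.

    Simplified: at least n bytes are reserved; trailing spaces are trimmed
    down to the byte length of the remaining value.
    """
    trimmed = value.rstrip(" ")
    return max(n, len(trimmed.encode(encoding)))

def redundant_char_bytes(n, max_bytes_per_char):
    """REDUNDANT row format: always maximum character byte length x n."""
    return n * max_bytes_per_char

print(compact_char_bytes("abc", 10))          # prints: 10
print(compact_char_bytes("\u2031" * 10, 10))  # prints: 30
print(redundant_char_bytes(10, 3))            # prints: 30
```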
Related
Suppose a field is declared thusly:
a VARCHAR(255)
How many characters can be stored in it, is it 255 or 256? And how much space is used?
Should we use a power of 2 and then subtract 1, or it doesn’t matter?
A VARCHAR(255) can store up to 255 characters, regardless of the number of bytes per character required by the character set encoding.
The storage requirement is the length of the actual data stored (not the maximum), plus 1 or 2 bytes to store the length of the data. 1 byte is used unless the maximum possible length in bytes is greater than 255, so a VARCHAR(255) CHARACTER SET utf8mb4 uses 2 bytes to store the length, while a VARCHAR(255) COLLATE ascii_general_ci uses 1 byte. Either column can store no more than 255 characters.
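The length-prefix rule described above can be written down as a short Python sketch. The function names are made up for illustration, and UTF-8 is assumed as the encoding of the stored value.

```python
def length_prefix_bytes(max_chars, max_bytes_per_char):
    """1 byte if the column can never need more than 255 bytes, else 2."""
    return 1 if max_chars * max_bytes_per_char <= 255 else 2

def varchar_storage_bytes(value, max_chars, max_bytes_per_char,
                          encoding="utf-8"):
    """Actual data bytes plus the length prefix (no other row overhead)."""
    data = len(value.encode(encoding))
    return data + length_prefix_bytes(max_chars, max_bytes_per_char)

print(length_prefix_bytes(255, 1))              # ascii/latin1: prints 1
print(length_prefix_bytes(255, 4))              # utf8mb4: prints 2
print(varchar_storage_bytes("hello", 255, 1))   # prints: 6
print(varchar_storage_bytes("hello", 255, 4))   # prints: 7
```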
Declare the column size as appropriate for the data being stored. Using 255 is common, but usually a red flag of sloppy design, since it's rare that this particular value meaningfully represents the maximum appropriate length of a column.
By contrast, a CHAR(255) CHARACTER SET utf8mb4 always consumes 255 × 4 (the maximum possible) bytes per column per row, and 0 bytes to store the length, since the stored length does not vary. These columns are rarely appropriate, except when the column is always a known length and the character set is single-byte, such as for a UUID, which would be CHAR(36) COLLATE ascii_general_ci.
https://dev.mysql.com/doc/refman/5.7/en/storage-requirements.html#data-types-storage-reqs-strings
In MySQL, with a single-byte character set, varchar(255) would use at most 255 bytes to store the data and 1 byte to store metadata (length information) about that data, 256 (2^8) bytes in total. The 255 is a limit in characters, so how many bytes those characters occupy depends on what character set you are using.
The number of bytes needed to store a character depends on how many characters the character set contains in total.
Background
The MySQL documentation states the following:
In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
To put this to the test myself, I created two tables:
CREATE TABLE `varchar_length_test_255` (
`characters` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `varchar_length_test_256` (
`characters` varchar(256) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I then inserted 10,000 rows into each table, each row with values having the maximum length for the characters column.
Since I am using a character set that has a maximum byte length of one byte per character (latin1), I expected to see a difference of 20,000 bytes in storage size between the two tables, derived from the following:
Each row in the varchar_length_test_256 table contains an additional character than the rows in the varchar_length_test_255 table. With the latin1 character set, that adds up to 10,000 bytes, since there are 10,000 rows in each table.
Based on the MySQL documentation, VARCHAR values exceeding 255 bytes require an additional "length" byte. Since each row in the varchar_length_test_256 table contains a value in the characters column that has a length of 256, which equates to 256 bytes for each value since the latin1 character set is used, that adds up to another 10,000 bytes utilized.
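Spelled out as a small Python calculation (assuming latin1, so 1 byte per character, and counting only the data bytes and the length prefix), the expected difference looks like this:

```python
ROWS = 10_000  # rows inserted into each test table

def raw_bytes(chars_per_value):
    """Data bytes plus length prefix for ROWS maximum-length latin1 values."""
    prefix = 1 if chars_per_value <= 255 else 2
    return ROWS * (chars_per_value + prefix)

expected_difference = raw_bytes(256) - raw_bytes(255)
print(raw_bytes(255), raw_bytes(256), expected_difference)
# prints: 2560000 2580000 20000
```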
Problem
When issuing a query to retrieve the size of each table, it appears that the tables are the same size! I used the following query (based on this SO post) to determine the size of each table:
SELECT
table_name AS `Table`,
(data_length + index_length) `Size in Bytes`
FROM
information_schema.TABLES
WHERE
table_schema = "test";
which yielded this output:
+-------------------------+---------------+
| Table                   | Size in Bytes |
+-------------------------+---------------+
| varchar_length_test_255 |       4734976 |
| varchar_length_test_256 |       4734976 |
+-------------------------+---------------+
2 rows in set (0.00 sec)
What am I missing here?
Am I correctly understanding the MySQL documentation?
Is there something wrong with my test that is preventing the expected outcome?
Is the query I am using to calculate the size of the tables correct?
How could I correctly observe the information communicated in the MySQL documentation?
Check the data_free column too.
InnoDB stores data on so-called 'pages', which are 16KB in size (by default). When a page is almost full and you insert a new record that can't fit on the page, MySQL will open a new page, leaving the leftover space empty.
It is my assumption, that MySQL reports the number of pages times the page size as data/index sizes.
This is the effective size used on the OS to store the table's data, not the actual size stored on those pages.
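The figure reported in the question is at least consistent with this assumption: it is an exact multiple of the default page size. A quick Python check (the 16KB page size is assumed to be the default, since the question does not state it):

```python
PAGE_SIZE = 16 * 1024        # default InnoDB page size in bytes (assumed)
REPORTED_BYTES = 4_734_976   # data_length + index_length from the question

pages, remainder = divmod(REPORTED_BYTES, PAGE_SIZE)
print(pages, remainder)  # prints: 289 0
```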
Update: https://mariadb.com/kb/en/library/information-schema-tables-table/
On this page (it is MariaDB, but the storage engine is the same) the description of data_length is the following:
For InnoDB/XtraDB, the index size, in pages, multiplied by the page
size. For Aria and MyISAM, length of the data file, in bytes. For
MEMORY, the approximate allocated memory.
Edit (some calculations)
16 KB = 16384 B
               Storage (B)   Records per page   Pages for 10,000 rows
----------------------------------------------------------------------
varchar(255)   256           64                 156.25
varchar(256)   258           63.5 (63 whole)    158.73
As you can see, the raw data (with the length marker) fits on almost the same number of pages.
Because a page is not necessarily filled to 100% (even though innodb_fill_factor defaults to 100) and there is some overhead in the row structure, this small difference won't necessarily be visible.
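The same estimate in a few lines of Python, rounding to whole records and whole pages. It deliberately ignores page headers and per-record overhead, so treat the numbers as a lower bound rather than what InnoDB will actually allocate.

```python
import math

PAGE_SIZE = 16384  # bytes in a default 16KB page
ROWS = 10_000      # rows in each test table

def pages_needed(record_bytes):
    """Whole pages needed if only whole records fit on a page."""
    records_per_page = PAGE_SIZE // record_bytes
    return math.ceil(ROWS / records_per_page)

print(pages_needed(256), pages_needed(258))  # prints: 157 159
```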
The database files are not like a CSV file: they have to handle multiple things such as NULL values, varying row sizes, etc., which take up additional space.
More about the InnoDB Row Structure: https://dev.mysql.com/doc/refman/5.5/en/innodb-physical-record.html
I've been working with MySQL and only vaguely understand the VARCHAR(This Number Here) part. Is that number the total number of characters the column can store?
For instance, let's say I have a VARCHAR(400) latin1_general_ci column. Does the 400 mean a 400-byte limit on the string, or that the string can have 400 characters? How big a string can I store in that column?
This is the maximum string length of the field in characters, NOT bytes (see here):
The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters.
This will allow 30 characters regardless of the encoding.
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
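To see how the 65,535-byte row limit and the character set interact, here is a rough Python estimate of the largest VARCHAR you could declare. It assumes a table with that single column, a 2-byte length prefix, and one extra byte if the column is nullable; the function name is hypothetical.

```python
MAX_ROW_BYTES = 65_535  # maximum row size shared among all columns

def max_varchar_chars(max_bytes_per_char, nullable=False):
    """Largest VARCHAR(n) for a single-column table (rough estimate)."""
    overhead = 2 + (1 if nullable else 0)  # length prefix + NULL flag byte
    return (MAX_ROW_BYTES - overhead) // max_bytes_per_char

print(max_varchar_chars(1))                 # latin1, NOT NULL: 65533
print(max_varchar_chars(1, nullable=True))  # latin1, nullable: 65532
print(max_varchar_chars(3))                 # utf8mb3: 21844
print(max_varchar_chars(4))                 # utf8mb4: 16383
```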
If I create a column of type VARCHAR (50) on a table and add rows, do the rows actually have 50 characters (or 51 if there's a null-terminating character)? In other words, if I deploy my application and the user input that goes to that column ends up only being strings of no more than 10 characters, am I wasting 80% of memory?
CHARACTER SET
In addition to what is said by the others, the CHARACTER SET for the column needs factoring in.
ascii uses 1 byte for 1 character.
latin1 uses 1 byte for 1 character.
utf8 uses 1, 2, or 3 bytes for 1 character.
utf8mb4 uses 1, 2, 3, or 4 bytes for 1 character.
The number on the declaration is characters, not bytes.
CHAR(10) can hold the widest 10 characters in the given CHARACTER SET. For utf8mb4, it will always occupy 40 bytes. This is a reason to either
never use CHAR, always use VARCHAR, and/or
explicitly say CHARACTER SET ascii for things like Y/N, M/F, country code, postal code, SSN, hex strings, etc.
VARCHAR(10) CHARACTER SET utf8mb4 will handle up to 10 characters, whether it is 1-byte English characters or 3- and 4-byte Chinese characters.
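The difference between characters and bytes is easy to see outside MySQL too. A small Python illustration, assuming UTF-8 (which matches utf8mb4 byte for byte); the helper name is mine:

```python
def char_and_byte_length(text):
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))

for sample in ("hello", "你好", "😀"):
    print(sample, char_and_byte_length(sample))
# prints:
# hello (5, 5)
# 你好 (2, 6)
# 😀 (1, 4)
```

All three values fit in a VARCHAR(10) CHARACTER SET utf8mb4 column, because the declared 10 counts characters.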
Temp table in a SELECT
A SELECT that does certain things like GROUP BY, ORDER BY, or UNION may decide it needs to build a "temp" table for the intermediate processing. If it does, it first considers building the table in RAM using the MEMORY engine. If so, then it turns all VARCHARs into CHARs for the processing. It is very common to see last_name VARCHAR(255) CHARACTER SET utf8. But when one of these temp tables is used, that becomes 765 bytes per row. This is not very efficient. How often have you seen a last_name that was 255 characters long? So
Don't always use (255); make it something reasonable; and
Use ascii/latin1 when appropriate.
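To put numbers on that advice, here is the fixed width a column would take once a VARCHAR is turned into a CHAR, as a tiny Python calculation. The VARCHAR(40) alternative is just an example value I picked, not a recommendation from the text above.

```python
def fixed_width_bytes(max_chars, max_bytes_per_char):
    """Bytes per row once the column is treated as a fixed-width CHAR."""
    return max_chars * max_bytes_per_char

print(fixed_width_bytes(255, 3))  # VARCHAR(255) utf8:   prints 765
print(fixed_width_bytes(40, 3))   # VARCHAR(40) utf8:    prints 120
print(fixed_width_bytes(40, 1))   # VARCHAR(40) latin1:  prints 40
```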
The best way to answer your question is through comparison.
The CHAR and VARCHAR types are similar, but differ in the way they are stored and retrieved. As of MySQL 5.0.3, they also differ in maximum length and in whether trailing spaces are retained.
For example:
DECLARE CHARARRAY CHAR(30) = 'TEST' -- RESULT IS 'TEST..<30 - 4 SPACES>' (WITH TRAILING SPACES)
on the other hand:
DECLARE VARCHARARRAY VARCHAR(30) = 'TEST' -- RESULT IS 'TEST' (WITHOUT TRAILING SPACES)
The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters.
The length of a CHAR column is fixed to the length that you declare when you create the table. The length can be any value from 0 to 255. When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed.
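The pad-on-store, trim-on-retrieve behaviour described above can be modelled in a few lines of Python. This is only a model of the documented behaviour with made-up function names, not how MySQL implements it.

```python
def store_char(value, n):
    """Right-pad a value with spaces to the declared CHAR(n) length."""
    if len(value) > n:
        raise ValueError("value longer than CHAR(%d)" % n)
    return value.ljust(n)

def retrieve_char(stored):
    """Remove trailing spaces, as happens when a CHAR value is read back."""
    return stored.rstrip(" ")

stored = store_char("TEST", 30)
print(len(stored))                 # prints: 30
print(repr(retrieve_char(stored))) # prints: 'TEST'
# Trailing spaces in the original value are lost on the round trip:
print(repr(retrieve_char(store_char("TEST  ", 30))))  # prints: 'TEST'
```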
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
Conclusion
If you want to optimize your database, I would suggest you use VARCHAR rather than CHAR. The sizes of the fields may vary depending on how each field is used. If you are starting to design a database yourself, this link might help you.
Reference:
The CHAR and VARCHAR Types
I have a MySQL database and I am wondering about the consequences of the varchar size on my query performance.
For example, what would be the difference between a varchar(10) and a varchar(50) in terms of performance or database size?
If I have something like 10000 rows, would it affect performance a lot, or is it insignificant?
Note: I don't do any joins on this column (if that is important)
varchar(10) means that the maximum allowed length is 10 characters and varchar(50) means that the maximum allowed length is 50 characters. Basically, for the same data a varchar(10) is no different disk-wise than a varchar(128). So however you declare your columns, it won't make a difference on the storage end. But it will certainly make a difference when running a query.
From the source:
Values in VARCHAR columns are variable-length strings. The length can
be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to
65,535 in 5.0.3 and later versions. The effective maximum length of a
VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size
(65,535 bytes, which is shared among all columns) and the character
set used.
There shouldn't be any real difference in performance between a VARCHAR(10) and a VARCHAR(50).
The real difference would be between CHAR and VARCHAR.
http://dev.mysql.com/doc/refman/5.0/en/char.html
The length of a CHAR column is fixed to the length that you declare
when you create the table. The length can be any value from 0 to 255.
When CHAR values are stored, they are right-padded with spaces to the
specified length. When CHAR values are retrieved, trailing spaces are
removed.
Values in VARCHAR columns are variable-length strings. The length can
be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to
65,535 in 5.0.3 and later versions. The effective maximum length of a
VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size
(65,535 bytes, which is shared among all columns) and the character
set used. See Section E.7.4, “Limits on Table Column Count and Row
Size”.
Replacing every VARCHAR column with a CHAR column might improve performance, since rows will then have a fixed size, which reduces fragmentation and can make disk access more efficient.
That being said, if you have only 10000 rows, I doubt you would see any real difference, unless maybe if you have unusually "long" rows.