Impact of size being set to columns in db - mysql

I run a website with a MySQL database back end. I need to know the impact of choosing a column type (say MEDIUMTEXT) to store some heavy data.
e.g.:
MEDIUMTEXT | 16,777,215 (2^24 − 1) bytes = 16 MiB
From the above, MEDIUMTEXT is 16,777,215 (2^24 − 1) bytes, which means it can hold up to 16 MiB of data. Does this mean it reserves 16 MiB for every entry inserted?
I.e., if my entry is just "Hello World", how would MySQL (or, for that matter, any DB) handle writing to the disk?

No, it doesn't reserve all that space. It allocates 3 bytes per row to store the (future, variable) length of the actual data, and then only as much space is used as is needed to store the data for each row created.
L represents the actual length in bytes of a given string value.
...storage for a MEDIUMTEXT value requires L bytes to store the value plus three bytes to store the length of the value.
http://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html
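As a rough illustration of the "L plus length prefix" rule quoted above, here is a minimal Python sketch. The function name `text_storage_bytes` and the `PREFIX_BYTES` table are hypothetical helpers, not part of MySQL, and the sketch ignores per-row and per-page overhead that the storage engine adds.

```python
# Length-prefix sizes (in bytes) for MySQL's TEXT family,
# as listed on the storage-requirements manual page.
PREFIX_BYTES = {"TINYTEXT": 1, "TEXT": 2, "MEDIUMTEXT": 3, "LONGTEXT": 4}


def text_storage_bytes(column_type: str, value: str, encoding: str = "utf-8") -> int:
    """Return L + prefix, where L is the encoded byte length of `value`."""
    length = len(value.encode(encoding))  # L: bytes, not characters
    return length + PREFIX_BYTES[column_type.upper()]


# "Hello World" is 11 bytes in UTF-8, plus the 3-byte MEDIUMTEXT prefix.
print(text_storage_bytes("MEDIUMTEXT", "Hello World"))  # 14
```

So under these assumptions the value costs about 14 bytes, not 16 MiB.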

Related

Does an empty LONGTEXT string take 4 GB of disk space?

I've been reading about disk usage for different string types, and it says that LONGTEXT takes 4 GB.
Is that disk space required for a FULLY FILLED column or for a JUST CREATED (empty) one?
Thank You.
The answer is: L + 4 bytes, where L < 2^32
Variable-length string types are stored using a length prefix plus
data. The length prefix requires from one to four bytes depending on
the data type, and the value of the prefix is L (the byte length of
the string). For example, storage for a MEDIUMTEXT value requires L
bytes to store the value plus three bytes to store the length of the
value.
Source: https://dev.mysql.com/doc/refman/5.7/en/storage-requirements.html#data-types-storage-reqs-strings
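The 4 GB figure is the maximum a 4-byte length prefix can describe, not the space an empty value uses. A small Python sketch of that relationship follows; `max_length_bytes` is a hypothetical helper name used only for illustration.

```python
def max_length_bytes(prefix_bytes: int) -> int:
    """Largest value length (in bytes) that a prefix of this width can record."""
    return 2 ** (8 * prefix_bytes) - 1


# Prefix widths per type, as given in the MySQL storage-requirements page.
for name, prefix in [("TINYTEXT", 1), ("TEXT", 2), ("MEDIUMTEXT", 3), ("LONGTEXT", 4)]:
    print(f"{name}: prefix {prefix} byte(s), max L = {max_length_bytes(prefix)}")

# An empty LONGTEXT value has L = 0, so it needs only the 4-byte prefix.
empty_longtext_bytes = 0 + 4
print(empty_longtext_bytes)  # 4
```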

smallest storage of integer array in mysql?

I have a table of user entries, and for every entry I have an array of (2-byte) integers to store (15-25, sporadically even more). The array elements will be written and read all at the same time, it is never needed to update or to access them individually. Their order matters. It makes sense to think of this as an array object.
I have many millions of these user entries and want to store this with the minimum possible amount of disk space. I'm however struggling with MySQL's lack of Array datatype.
I've been considering the following options.
Do it the MySQL way. Make a table my_data with columns user_id, data_id and data_int. To make this efficient, one needs an index on user_id, totalling well over 10 bytes per integer.
Store the array in text format. This takes ~6.5 bytes per integer.
Make 35-40 columns ("enough") and have -32768 mean 'empty' (since this value cannot occur in my data). This takes 3.5-4 bytes per integer, but is somewhat ugly (as I have to impose a strict limit on the number of elements in the array).
Is there a better way to do this in MySQL? I know MySQL has an efficient varchar type, so ideally I'd store my 2-byte integers as 2-byte chars in a varchar (or a similar approach with blob), but I'm not sure how to do that. Is this possible? How should this be done?
You could store them as separate SMALLINT NULL columns.
In MyISAM this uses 2 bytes of data + 1 bit of null indicator for each value.
In InnoDB, the null indicators are encoded into the column's field start offset, so they don't take any extra space, and null values are not actually stored in the row data. If the rows are small enough that all the offsets are 1 byte, then this uses 3 bytes for every existing value (1 byte offset, 2 bytes data), and 1 byte for every nonexistent value.
Either of these would be better than using INT with a special value to indicate that it doesn't exist, since that would be 4 bytes of data for every value.
See NULL in MySQL (Performance & Storage)
The best answer was given in the comments, so I'll repost it here with some use-ready code, for further reference.
MySQL has a varbinary type that works really well for this: you can simply use PHP's pack/unpack functions to convert the integers to and from binary form, and store that binary form in the database using varbinary. Example code for the conversion (here for 24-bit integers) is below.
function pack24bit($n) { // input: 24-bit integer, output: 3-byte binary string (big-endian)
    $b1 = ($n >> 16) & 0xFF; // most significant byte
    $b2 = ($n >> 8) & 0xFF;  // middle byte
    $b3 = $n & 0xFF;         // least significant byte
    return pack('CCC', $b1, $b2, $b3);
}
function unpack24bit($packed) { // input: 3-byte binary string, output: 24-bit integer
    $arr = unpack('C3b', $packed); // yields keys b1, b2, b3
    return ($arr['b1'] << 16) | ($arr['b2'] << 8) | $arr['b3'];
}
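For the 2-byte integers in the question, the same idea can be sketched in Python with the standard `struct` module. This is an illustration of the packing technique, not the answerer's code; the function names are hypothetical, and the big-endian signed 16-bit layout is an assumption chosen for the example.

```python
import struct


def pack_int16_array(values):
    """Pack a list of signed 16-bit integers into a big-endian byte string."""
    # '>' = big-endian, 'h' = signed 16-bit; the count prefix repeats the format.
    return struct.pack(f">{len(values)}h", *values)


def unpack_int16_array(blob):
    """Inverse of pack_int16_array: bytes back to a list of integers, in order."""
    count = len(blob) // 2  # two bytes per element
    return list(struct.unpack(f">{count}h", blob))


blob = pack_int16_array([120, -5, 32767])
print(len(blob))                 # 6 bytes for 3 integers
print(unpack_int16_array(blob))  # [120, -5, 32767]
```

The resulting byte string is what would go into the varbinary column, at 2 bytes per integer plus the column's length prefix.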

Recommended way to store a string in this case?

I am storing strings and 99.5+% are less than 255 characters, so I store them in a VARCHAR(255).
The thing is, some of them can be 4 KB or so. What's the best way to store those?
Option #1: store them in another table with a pointer to the main.
Option #1.0: add an INT column with DEFAULT NULL and the pointer will be stored there
Option #1.1: the pointer will be stored in the VARCHAR(255) column, e.g 'AAAAAAAAAAA[NUMBER]AAAAAAAAAAAA'
Option #2: increase the size of VARCHAR from 255 to 32767
What's the best of the above, Option #1.0, Option #1.1 or Option #2, performance wise?
Increase the size of your field to fit the max size of your string. A VARCHAR will not use the space unless needed.
VARCHAR values are stored as a 1-byte or 2-byte length prefix plus
data. The length prefix indicates the number of bytes in the value. A
column uses one length byte if values require no more than 255 bytes,
two length bytes if values may require more than 255 bytes.
http://dev.mysql.com/doc/refman/5.0/en/char.html
The MySQL documentation says that VARCHAR(N) takes L + 1 bytes if column values require 0 – 255 bytes, and L + 2 bytes if values may require more than 255 bytes, where L is the length in bytes of the stored string.
So I guess that option #2 is quite okay, because the small strings will still take far less space than 32767 bytes.
EDIT:
Also imagine the countless problems options 1.0 and 1.1 would raise when you actually want to query a string without knowing whether it exceeds the length or not.
Option #2 is clearly best. It just adds 1 byte to the size of each value, and doesn't require any complicated joins to merge in the fields from the second table.
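To make the cost of Option #2 concrete, here is a minimal Python sketch of the VARCHAR rule quoted in the answers. `varchar_storage_bytes` is a hypothetical helper, and `max_bytes` is assumed to be the column's maximum length in bytes (for a single-byte character set this equals the declared M).

```python
def varchar_storage_bytes(max_bytes: int, value: str, encoding: str = "utf-8") -> int:
    """Return L + 1 or L + 2 depending on whether the column can exceed 255 bytes."""
    length = len(value.encode(encoding))  # L: actual byte length of this value
    if length > max_bytes:
        raise ValueError("value does not fit in the column")
    prefix = 1 if max_bytes <= 255 else 2  # width depends on the column, not the value
    return length + prefix


short = "a short string"
print(varchar_storage_bytes(255, short))    # 15
print(varchar_storage_bytes(32767, short))  # 16
```

Under these assumptions, widening the column costs one extra byte per value, and short strings stay short on disk.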

MySQL InnoDB DECIMAL - data size driven by column declaration or by actual data?

I'm using MySQL, all my tables are using InnoDB engine. I have some columns declared as DECIMAL(38, 0) and they are used extensively. According to the MySQL documentation (http://dev.mysql.com/doc/refman/5.5/en/storage-requirements.html), 38-digit value requires 17 bytes (38 = 4 * 9 + 2; 4 * 4 + 1 = 17). Okay.
But, does that mean that any value stored in this column will take 17 bytes? For example, for value 432 - will it take 4 bytes only (I really hope so...) or will it take 17 bytes anyway?
Finally, I know that in Oracle the size occupied depends on the actual values stored. But is it optimized that way in MySQL as well?
I think the answer is that it will take 17 bytes anyway. Notice that the linked manual page describes no means for the DBMS to record how "long" a DECIMAL value is. By comparison, for a VARCHAR(255) CHARACTER SET ascii column there is a single byte at the start of the value that indicates how long the value is (for values of up to 255 bytes). For a VARCHAR(1000) CHARACTER SET ascii column there are two bytes to indicate the length. Since no such length field is described for DECIMAL, I conclude that the column always takes the maximum amount of space.
DECIMAL is "fixed length", so every value requires 17 bytes.
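The 17-byte figure follows from the manual's rule that each group of nine digits takes 4 bytes and leftover digits take 0 to 4 bytes, computed separately for the integer and fractional parts. A Python sketch of that arithmetic follows; `decimal_storage_bytes` and `LEFTOVER_BYTES` are hypothetical names for illustration.

```python
# Bytes needed for leftover digits (index = number of leftover digits, 0..8),
# following the table in the MySQL storage-requirements page.
LEFTOVER_BYTES = [0, 1, 1, 2, 2, 3, 3, 4, 4]


def decimal_storage_bytes(precision: int, scale: int = 0) -> int:
    """Fixed storage size of DECIMAL(precision, scale), independent of the value."""
    def part(digits: int) -> int:
        # 4 bytes per full group of 9 digits, plus the leftover-digit bytes.
        return (digits // 9) * 4 + LEFTOVER_BYTES[digits % 9]

    return part(precision - scale) + part(scale)


print(decimal_storage_bytes(38, 0))  # 17
print(decimal_storage_bytes(20, 6))  # 10
```

The value stored (432 or a full 38-digit number) never enters the calculation, which is why the size is fixed by the declaration.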

MySQL entry and space used

I have a MySQL table with one of the columns declared as "varchar(255)". Will the database use 255 bytes of space even if that column is empty, or only if it has some data, with the amount of space used proportional to the data?
Every cell will take only the amount of space proportional to the data.
http://dev.mysql.com/doc/refman/5.0/en/char.html
Only if it has some data, and the amount of space used is proportional to the data.
VARCHAR(M) takes L + 1 bytes or more, depending on the size of the data you're adding. A blank field (empty string) still consumes 1 byte (the +1 in L + 1): this is the length prefix that records how many bytes of data follow, not a terminator.
From MySQL's website:
VARCHAR(M), VARBINARY(M)
L + 1 bytes if column values require 0 – 255 bytes,
L + 2 bytes if values may require more than 255 bytes
where L is the length of your data. In your case, you'll be consuming (data length + 1) on your VARCHAR(255) field.