Varbinary vs Blob in MySQL

I have about 2k of raw binary data that I need to store in a table, but I don't know whether to choose the VARBINARY or BLOB type. I have read through the descriptions in the MySQL docs but didn't find any compare-and-contrast discussion. I also read that VARBINARY only supports up to 255 characters, but I successfully created a VARBINARY(2048) field, so I'm a bit confused.
The binary data does not need to be indexed, nor will I need to query on it. Is there an advantage to using one type over the other from PHP?
Thanks!

VARBINARY is limited to 255 bytes on MySQL 5.0.2 and below, and to 65,535 bytes on 5.0.3 and above.
BLOB is limited to 65,535 bytes.
Ultimately, VARBINARY is virtually the same as BLOB (from the perspective of what can be stored in it), unless you want to preserve compatibility with "old" versions of MySQL. The MySQL Documentation says:
In most respects, you can regard a BLOB column as a VARBINARY column that can be as large as you like.
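
For data around 2 KB on MySQL 5.0.3 or later, either type will hold it; a minimal sketch of the two declarations (the table and column names are just placeholders):
CREATE TABLE payload_test (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  raw_vb VARBINARY(2048),  -- declared upper bound, stored inline in the row
  raw_blob BLOB            -- up to 65,535 bytes, may be stored off-page
);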

Actually, BLOB can be bigger: there are TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB (http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html), with a size limit of up to 2^32 - 1 bytes.
Also, BLOB storage grows "outside" of the row, while the maximum VARBINARY size is constrained by the amount of free space in the row (so it can actually be less than 64 KB).
There are some minor differences between the two:
1) With indexes, BLOB requires a prefix length while VARBINARY doesn't (see the sketch after this list):
http://dev.mysql.com/doc/refman/5.0/en/column-indexes.html
CREATE TABLE test (blob_col BLOB, INDEX(blob_col(10)));
2) As already mentioned, there are trailing-space issues handled differently between VARBINARY and BLOB in MySQL 5.0.x and earlier versions:
http://dev.mysql.com/doc/refman/5.0/en/blob.html
http://dev.mysql.com/doc/refman/5.0/en/binary-varbinary.html
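To illustrate the index difference from point 1, a quick sketch (table names are hypothetical; the VARBINARY column is kept at 255 bytes so the full-column index stays within older key-length limits):
CREATE TABLE t_blob (col BLOB, INDEX(col(10)));        -- prefix length required
CREATE TABLE t_vbin (col VARBINARY(255), INDEX(col));  -- no prefix needed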

One significant difference is that BLOB types are stored in secondary storage, while VARBINARY values are stored inline in the row, in the same way as VARCHAR and other "simple" types.
This can have an impact on performance in a busy system, where the additional lookup to fetch and manipulate the blob data can be expensive.

It is worth pointing out that the MEMORY storage engine does not support BLOB/TEXT columns, but it does work with VARBINARY.
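A minimal illustration (hypothetical table names):
CREATE TABLE mem_ok (payload VARBINARY(2048)) ENGINE=MEMORY;  -- works
CREATE TABLE mem_bad (payload BLOB) ENGINE=MEMORY;
-- ERROR 1163 (42000): The used table type doesn't support BLOB/TEXT columns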

I was just looking at a test app that stores around 5 KB of binary data in a column. It initially used VARBINARY, but since it was so slow I decided to try BLOB. Looking at disk write speed with atop, I can't see any difference.
The only significant difference I read in the MySQL manual is that BLOBs are unsupported by the MEMORY engine, so any temporary tables your queries create (see when MySQL uses temp tables) will be created on disk, which is much slower.
So your better bet is VARBINARY/BINARY, if the data is short enough to fit into a row (at the moment, 64 KB total for all columns).

Related

mysql json vs mongo - storage space

I am experiencing an interesting situation, and although it is not an actual problem, I can't understand why it is happening.
We had a mongo database consisting mainly of some bulk data stored in an array. Since over 90% of the team was familiar with mysql while only a few of us were familiar with mongo, and given that it is not a critical db and all queries are done over two of the fields (client or product), we decided to move the data into mysql, in a table like this:
[idProduct (bigint unsigned), idClient (bigint unsigned), data (json)]
Where data is a huge json containing hundreds of attributes and their values.
We also partitioned it into 100 partitions by a hash over idClient:
PARTITION BY HASH(idClient)
PARTITIONS 100;
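Put together, the definition looked something like this (a sketch; the primary key is my assumption, chosen so that every unique key includes the partitioning column):
CREATE TABLE product_client_data (
  idProduct BIGINT UNSIGNED NOT NULL,
  idClient BIGINT UNSIGNED NOT NULL,
  data JSON,
  PRIMARY KEY (idClient, idProduct)
)
PARTITION BY HASH(idClient)
PARTITIONS 100;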
All is working fine but I noticed an interesting fact:
The original mongo db had about 70 GB, give or take. The mysql version (actually containing less data, because we removed some duplicates that we were using as indexes in mongo) has over 400 GB.
Why does it take so much more space? In theory bson should actually be slightly larger than json (at least in most cases). Even if indexes are larger in mysql... the difference is huge (over 5x).
I did a presentation, How to Use JSON in MySQL Wrong (video), in which I imported the Stack Overflow data dump into JSON columns in MySQL. I found the data I tested with took 2x to 3x more space than importing the same data into normal tables and columns using conventional data types for each column.
JSON uses more space for the same data, for example because it stores integers and dates as strings, and also because it stores key names on every row, instead of just once in the table header.
That's comparing JSON in MySQL vs. normal columns in MySQL. I'm not sure how MongoDB stores data and why it's so much smaller. I have read that MongoDB's WiredTiger engine supports options for compression, and snappy compression is enabled by default since MongoDB 3.0. Maybe you should enable compressed format in MySQL and see if that gives you better storage efficiency.
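If you want to test that, InnoDB table compression can be turned on with a single ALTER (the table name is a placeholder; this requires innodb_file_per_table, which is the default on modern versions):
ALTER TABLE product_client_data
  ROW_FORMAT = COMPRESSED
  KEY_BLOCK_SIZE = 8;  -- smaller values compress more aggressively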
JSON in MySQL is stored like TEXT/BLOB data, in that it gets mapped into a set of 16KB pages. Pages are allocated one at a time for the first 32 pages (that is, up to 512KB). If the content is longer than that, further allocation is done in increments of 64 pages (1MB). So it's possible if a single TEXT/BLOB/JSON content is say, 513KB, it would allocate 1.5MB.
I think the main reason could be the fact that internally Mongo stores JSON as BSON (http://bsonspec.org/), and the spec stresses that this representation is lightweight.
The WiredTiger Storage Engine in MongoDB uses compression by default. I don't know the default behavior of MySQL.
Unlike MySQL, MongoDB is designed to store JSON/BSON; in fact, it does not store anything else. So this kind of "competition" might be a bit unfair for MySQL, which stores JSON like TEXT/BLOB data.
If you had relational, i.e. column-based, values then most likely MySQL would be smaller, as stated by @Bill Karwin. However, with smart bucketing in MongoDB you can reduce the data size significantly.

Best MySql data types to store variable length strings and binary data

I have a data table which has to be read often. I need to store strings and binary data of variable length in it. I could store the data as BLOB or TEXT, but the way I understand MySQL, those types are stored on the hard drive instead of in memory, and if I use them the speed of reading the table will be low.
Are there any alternative variable length types which I could use? Or, maybe, is there a way to tell MySql to hold the data in columns of those types in memory?
Is this 'data table' the only place that the strings are stored? If so, you need the 'persistence' of storing it on disk. However, MySQL will "cache" the data, so reads will almost always be from RAM.
If each element of data is not 'too' big, you could use ENGINE=MEMORY for the table; that would leave the data only in RAM. A system crash would lose the data.
But if you don't need persistence, there are many flavors of caching outside MySQL. Please describe where the data comes from, what language is using the data, how big the data is, etc.

How to compress columns in MySQL?

I have a table which stores e-mail correspondences. Every time someone replies, the whole body of the trail is also included and saved into the database (and I need it that way, because the amount of application-level changes needed to rectify that would be too high).
The size of the mail text column is 10000, but I am having trouble storing text longer than that. As I am not sure how many correspondences can occur, I don't know what a good size for the column would be.
The engine is InnoDB. Can I use some kind of columnar compression technique in MySQL to avoid increasing the size of the column?
And what if I go ahead and increase the varchar column to, say, 20000? The table has about 2 million records. Would that be a good thing to do?
You are probably looking for the MySQL COMPRESS() and UNCOMPRESS() functions, to compress data for storage and retrieval respectively.
Also look at InnoDB Compression Usage.
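In their simplest form (table and column names are hypothetical; the column must be a binary type such as BLOB or VARBINARY, because COMPRESS() returns binary data):
-- store the compressed body
INSERT INTO mails (id, body) VALUES (1, COMPRESS('...long mail text...'));
-- read it back
SELECT UNCOMPRESS(body) FROM mails WHERE id = 1;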
As long as the data doesn't need editing, you can use the archive engine.
This answer is specific to Percona.
Percona introduced a compressed column format a while ago, which you can use in CREATE or ALTER statements:
CREATE TABLE test_compressed (
id INT NOT NULL PRIMARY KEY,
value MEDIUMTEXT COLUMN_FORMAT COMPRESSED
);
Reference: https://www.percona.com/doc/percona-server/5.7/flexibility/compressed_columns.html
For me the best way to use text data compression is to use a Percona compressed column format.
ALTER TABLE `tableName` MODIFY `mail` TEXT COLUMN_FORMAT COMPRESSED NOT NULL;
I've tested compression on a table used as a cache, storing mainly HTML data; the size decreased from 620 MB to 110.6 MB.
I think you should consider using the TEXT type instead of a long VARCHAR.
Long TEXT values are stored separately from the InnoDB clustered index, and that can affect, and probably improve, the performance of your database.
You have a few different options:
1) Wait for the RFE to add column compression to MySQL (see https://bugs.mysql.com/bug.php?id=106541) - it is unlikely this will ever be done
2) Use application-level compression and decompression - much more work involved in doing this
3) Rely on MySQL's COMPRESS() and UNCOMPRESS() functions to do this for you (see https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_compress) - these are not reliable, as they depend on how MySQL was compiled (with zlib or not), and they often don't give great results
4) Don't worry about the file size, since disk space is cheap, and simply change the column type to TEXT (see https://dev.mysql.com/doc/refman/8.0/en/blob.html)
5) Often the best option, if disk space is your main concern: change the table to compressed format using ALTER TABLE t1 ROW_FORMAT = COMPRESSED; - for emails this can give very good compression, and if need be it can be tuned for even better compression for your particular workload (see https://dev.mysql.com/doc/refman/8.0/en/innodb-compression-tuning.html)

Is there a binary safe column type for MySQL memory tables?

Since the MEMORY storage engine does not support BLOB columns, what is the recommended way to store binary data?
We want to store gzip compressed strings between 600 and 10,000 characters long. We are using MySQL 5.5.32.
Is varchar(10000) a safe alternative?
Your choice here is the VARBINARY data type. However, be aware of the storage requirements: you cannot store very long strings in this type (see the version-specific manual page).
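One caveat worth knowing: MEMORY tables use fixed-length row storage, so a wide VARBINARY reserves its full declared size for every row. A sketch (the table name is hypothetical):
CREATE TABLE gz_cache (
  cache_key VARCHAR(64) NOT NULL PRIMARY KEY,
  payload VARBINARY(10000)  -- MEMORY stores this fixed-length: ~10 KB per row
) ENGINE=MEMORY;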
You can use LONGTEXT; it is a good alternative.

How to insert a file in MySQL database?

I want to insert a file into a MySQL database residing on a remote webserver, using a webservice.
My question is: what type of table column (e.g. varchar, etc.) will store a file? And will the insert statement be somewhat different in the case of a file?
File size limits by MySQL type:
TINYBLOB     255 bytes           = 0.000255 MB
BLOB         65,535 bytes        = 0.0655 MB
MEDIUMBLOB   16,777,215 bytes    = 16.78 MB
LONGBLOB     4,294,967,295 bytes = 4294.97 MB = 4.295 GB
Yet, in most cases, I would NOT recommend storing big blobs of bytes in a database, even if it supports it, because it will increase the overall database size and may cause real performance issues. You can read more on the topic here. Many databases that care about consistent performance won't even let you do such a thing; e.g. AWS DynamoDB, which is known to perform extremely well at any scale, limits a single item record to 400 KB. MongoDB allows 16 MB, which in my opinion is also already too much. MySQL allows the full 4 GB if you wish. But again, think twice before doing that. A case where you may be OK storing a big blob of data with these column types would be a small-traffic database where you just want to keep all the stuff in one place for faster development, like an internal system in a small company.
The BLOB datatype is best for storing files.
See: How to store .pdf files into MySQL as BLOBs using PHP?
The MySQL BLOB reference manual has some interesting comments
The other answers will give you a good idea of how to accomplish what you have asked for....
However
There are not many cases where this is a good idea. It is usually better to store only the filename in the database and the file itself on the file system.
That way your database is much smaller, can be moved around more easily, and, more importantly, is quicker to back up / restore.
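A sketch of the filename-only approach (all table and column names here are assumptions):
CREATE TABLE attachments (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  file_name VARCHAR(255) NOT NULL,   -- original name shown to users
  file_path VARCHAR(500) NOT NULL,   -- location on the file system
  mime_type VARCHAR(100),
  uploaded_at DATETIME NOT NULL
);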
You need to use BLOB. There's TINY, MEDIUM, LONG, and just BLOB; as with other types, choose one according to your size needs.
TINYBLOB 255
BLOB 65535
MEDIUMBLOB 16777215
LONGBLOB 4294967295
(in bytes)
The insert statement would be fairly normal. In PHP you would read the file using fread and then escape it before embedding it in the query; addslashes was the traditional suggestion, but a prepared statement is the safer way to handle binary data.
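
If the file happens to be readable by the MySQL server itself, you can also let MySQL load it directly; a sketch (the table, column, and path are placeholders, and LOAD_FILE() requires the FILE privilege and is restricted by secure_file_priv):
INSERT INTO files (name, data)
VALUES ('report.pdf', LOAD_FILE('/var/lib/mysql-files/report.pdf'));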