How can MySQL BLOBs be edited?

Is there a way to edit a BLOB with MySQL, for example to delete bytes 39 through 48, or to insert some bytes (characters) at some position? Are there any such commands?

If your blob is extremely large, entering it into the database in the first place is very difficult.
MySQL limits any single statement to max_allowed_packet bytes, which makes working with large blobs difficult.
If your blobs are big enough that you care, you probably need to create some schema where they're stored as several rows with chunks, to allow them to be created in a sensible fashion.
BUT storing very large blobs is probably not a good idea in MySQL, as it's effectively a massive waste of your innodb_buffer_pool.
NB: By "very large" I mean more than 10MB or so.
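That said, byte-level edits like the ones asked about can be done in place with MySQL's string functions, which operate on binary data too. Note this uses the INSERT() string function, not the INSERT statement; the table and column names below are hypothetical:
-- Assumed table: files(id INT, data BLOB)
-- Delete bytes 39 through 48 (10 bytes starting at position 39):
UPDATE files SET data = INSERT(data, 39, 10, '') WHERE id = 1;
-- Splice bytes in at position 39 without replacing anything (length 0):
UPDATE files SET data = INSERT(data, 39, 0, x'DEADBEEF') WHERE id = 1;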

Related

mysql json vs mongo - storage space

I am experiencing an interesting situation and although is not an actual problem, I can't understand why this is happening.
We had a mongo database, consisting mainly of some bulk data stored in an array. Due to the fact that over 90% of the team was familiar with mysql while only a few of us were familiar with mongo, combined with the fact that it is not a critical db and all queries are done over 2 of the fields (client or product), we decided to move the data to mysql, into a table like this
[idProduct (bigint unsigned), idClient (bigint unsigned), data (json)]
Where data is a huge json containing hundreds of attributes and their values.
We also partitioned the table into 100 partitions by a hash over idClient:
PARTITION BY HASH(idClient)
PARTITIONS 100;
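Put together, a sketch of what the table might look like (the table name and primary key are assumptions; the partitioning column has to be part of every unique key):
CREATE TABLE product_data (
  idProduct BIGINT UNSIGNED NOT NULL,
  idClient  BIGINT UNSIGNED NOT NULL,
  data      JSON NOT NULL,
  PRIMARY KEY (idClient, idProduct)
)
PARTITION BY HASH(idClient)
PARTITIONS 100;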
All is working fine but I noticed an interesting fact:
The original mongo db had about 70 GB, give or take. The mysql version (containing actually less data, because we removed some duplicates that we were using as indexes in mongo) has over 400 GB.
Why does it take so much more space? In theory bson should actually be slightly larger than json (at least in most cases). Even if indexes are larger in mysql... the difference is huge (over 5x).
I did a presentation, How to Use JSON in MySQL Wrong (video), in which I imported the Stack Overflow data dump into JSON columns in MySQL. I found the data I tested with took 2x to 3x more space than importing the same data into normal tables and columns using conventional data types for each column.
JSON uses more space for the same data, for example because it stores integers and dates as strings, and also because it stores key names on every row, instead of just once in the table header.
That's comparing JSON in MySQL vs. normal columns in MySQL. I'm not sure how MongoDB stores data and why it's so much smaller. I have read that MongoDB's WiredTiger engine supports options for compression, and snappy compression is enabled by default since MongoDB 3.0. Maybe you should enable compressed format in MySQL and see if that gives you better storage efficiency.
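For example, a minimal sketch of turning on InnoDB table compression, using the hypothetical table above:
-- KEY_BLOCK_SIZE is the compressed page size in KB (1, 2, 4, 8 or 16);
-- smaller values compress harder at the cost of CPU:
ALTER TABLE product_data ROW_FORMAT = COMPRESSED, KEY_BLOCK_SIZE = 8;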
JSON in MySQL is stored like TEXT/BLOB data, in that it gets mapped into a set of 16KB pages. Pages are allocated one at a time for the first 32 pages (that is, up to 512KB). If the content is longer than that, further allocation is done in increments of 64 pages (1MB). So it's possible if a single TEXT/BLOB/JSON content is say, 513KB, it would allocate 1.5MB.
I think the main reason could be that internally Mongo stores JSON as BSON ( http://bsonspec.org/ ), and the spec stresses that this representation is lightweight.
The WiredTiger Storage Engine in MongoDB uses compression by default. I don't know the default behavior of MySQL.
Unlike MySQL, MongoDB is designed to store JSON/BSON; in fact, it does not store anything else. So this kind of "competition" might be a bit unfair for MySQL, which stores JSON like TEXT/BLOB data.
If you had relational data, i.e. column-based values, then most likely MySQL would be smaller, as stated by @Bill Karwin. However, with smart bucketing in MongoDB you can reduce the data size significantly.

Best MySql data types to store variable length strings and binary data

I have a data table which has to be read often. I need to store strings and binary data of variable length in it. I could store the data as BLOB or TEXT, but the way I understand MySQL, those types are stored on the hard drive instead of in memory, and if I use them, the speed of reading the table is going to be low.
Are there any alternative variable length types which I could use? Or, maybe, is there a way to tell MySql to hold the data in columns of those types in memory?
Is this 'data table' the only place that the strings are stored? If so, you need the 'persistence' of storing it on disk. However, MySQL will "cache" the data, so reads will almost always be from RAM.
If each element of data is not 'too' big, you could use ENGINE=MEMORY for the table; that would leave the data only in RAM. A system crash would lose the data.
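For illustration, a minimal sketch of such a table (names are hypothetical; note the MEMORY engine does not support BLOB/TEXT columns, so a capped VARBINARY stands in for the binary data):
CREATE TABLE hot_data (
  id      INT UNSIGNED NOT NULL PRIMARY KEY,
  payload VARBINARY(8192) NOT NULL
) ENGINE=MEMORY;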
But if you don't need persistence, there are many flavors of caching outside MySQL. Please describe where the data comes from, what language is using the data, how big the data is, etc.

How to compress columns in MySQL?

I have a table which stores e-mail correspondences. Every time someone replies, the whole body of the trail is also included and saved into the database (and I need it that way because the amount of application level changes to rectify that are going to be too high).
The mail text column has a size of 10000, but I am having trouble storing text longer than that. Since I am not sure how many correspondences can occur, I don't know what a good number for the column would be.
The engine is InnoDB. Can I use some kind of columnar compression technique in MySQL to avoid increasing the size of the column?
And, what if I go ahead and increase the varchar column to, say, 20000. The table has about 2 million records. Will that be a good thing to do?
You are probably looking for the MySQL COMPRESS() and UNCOMPRESS() functions, to compress data for storage and retrieval respectively.
Also look at InnoDB Compression Usage.
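A sketch of the COMPRESS()/UNCOMPRESS() route (table and column names are hypothetical; the column must be a binary type such as BLOB, because COMPRESS() returns a binary string):
-- Write compressed:
INSERT INTO mails (id, body) VALUES (1, COMPRESS('full mail trail here'));
-- Read back:
SELECT UNCOMPRESS(body) FROM mails WHERE id = 1;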
As long as the data doesn't need editing, you can use the archive engine.
This answer is specific to Percona Server.
Percona introduced a compressed column format a while ago, which you can use in CREATE or ALTER statements:
CREATE TABLE test_compressed (
  id INT NOT NULL PRIMARY KEY,
  value MEDIUMTEXT COLUMN_FORMAT COMPRESSED
);
Reference: https://www.percona.com/doc/percona-server/5.7/flexibility/compressed_columns.html
For me, the best way to compress text data is to use Percona's compressed column format:
ALTER TABLE `tableName` MODIFY `mail` TEXT COLUMN_FORMAT COMPRESSED NOT NULL;
I've tested compression on a table used as a cache, storing mainly HTML data; the size decreased from 620 MB to 110.6 MB.
I think you should consider using the TEXT type instead of a long VARCHAR.
Long data fields are stored separately from the InnoDB clustered index, which can affect, and probably improve, the performance of your database.
You have a few different options:
Wait for the RFE to add column compression to MySQL (see https://bugs.mysql.com/bug.php?id=106541) - unlikely this will ever be done
Use application level compression and decompression - much more work involved in doing this
Rely on MySQL's compress and uncompress functions to do this for you (see https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_compress) - these are not reliable as they depend on how MySQL was compiled (zlib or not) - and they don't give great results a lot of the time
Don't worry about the size, as disk space is cheap, and simply change the column type to TEXT (see https://dev.mysql.com/doc/refman/8.0/en/blob.html)
Often the best option if disk space is your main concern is changing the table to be compressed using: ALTER TABLE t1 ROW_FORMAT = COMPRESSED; - for emails this can give very good compression and if need be it can be tuned for even better compression for your particular workload (see https://dev.mysql.com/doc/refman/8.0/en/innodb-compression-tuning.html)
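A sketch of that last option, including the optional tuning knob:
ALTER TABLE t1 ROW_FORMAT = COMPRESSED;
-- Optionally pick a smaller compressed page size (1, 2, 4, 8 or 16 KB)
-- for better compression at the cost of CPU:
ALTER TABLE t1 ROW_FORMAT = COMPRESSED, KEY_BLOCK_SIZE = 4;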

storing telemetry data from 10000s of nodes

I need to store telemetry data that is being generated every few minutes from over 10000 nodes (which may increase), each supplying the data over the internet to a server for logging. I'll also need to query this data from a web application.
I'm having a bit of trouble deciding what the best storage solution would be.
Each node has a unique ID, and there will be a timestamp for each packet of variables (which will probably need to be generated by the server).
The telemetry data has all of the variables in the same packet, so conceptually it could easily be stored in a single database table with a column per variable. The serial number + timestamp would suffice as a key.
The size of each telemetry packet is 64 bytes, including the device ID and timestamp. So around 100GB+ per year.
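Conceptually, such a single table might look like this (all names and variable columns are hypothetical):
CREATE TABLE telemetry (
  device_id INT UNSIGNED NOT NULL,
  ts        TIMESTAMP NOT NULL,
  var1      SMALLINT NOT NULL,
  var2      SMALLINT NOT NULL,
  -- one column per remaining packet variable, 64 bytes in total
  PRIMARY KEY (device_id, ts)
) ENGINE=InnoDB;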
I'd want to be able to query the data to get variables across time ranges and also store aggregate reports of this data so that I can draw graphs.
Now, how best to handle this? I'm pretty familiar with using MySQL, so I'm leaning towards this. If I were to go for MySQL would it make sense to have a separate table for each device ID? - Would this make queries much faster or would having 10000s of tables be a problem?
I don't think querying the variables from all devices in one go is going to be needed but it might be. Or should I just stick it all in a single table and use MySQL cluster if it gets really big?
Or is there a better solution? I've been looking around at some non relational databases but can't see anything that perfectly fits the bill or looks very mature. MongoDB for example would have quite a lot of size overhead per row and I don't know how efficient it would be at querying the value of a single variable across a large time range compared to MySQL. Also MySQL has been around for a while and is robust.
I'd also like it to be easy to replicate the data and back it up.
Any ideas? If anyone has done anything similar, your input would be greatly appreciated!
Have you looked at time-series databases? They're designed for the use case you're describing and may actually end up being more efficient in terms of space requirements due to built-in data folding and compression.
I would recommend looking into implementations using HBase or Cassandra for raw storage as it gives you proven asynchronous replication capabilities and throughput.
HBase time-series databases:
OpenTSDB
KairosDB
Axibase Time-Series Database - my affiliation
If you want to go with MySQL, keep in mind that although it will easily keep up with something like 100GB per year on modern hardware, you cannot execute schema changes afterwards on a live system. This means you'll have to have a good, complete database schema to begin with.
I don't know whether this telemetry data might grow more features, but if it does, you don't want to have to lock your database for hours to add a column or an index.
However, some tools such as http://www.percona.com/doc/percona-toolkit/pt-online-schema-change.html are available nowadays which make such changes somewhat easier. No performance problems to be expected here, as long as you stay with InnoDB.
Another option might be to go with PostgreSQL, which allows you to change schemas online and is sometimes somewhat smarter about the use of indexes. (For example, index condition pushdown, http://kb.askmonty.org/en/index-condition-pushdown, is a relatively new trick for MySQL/MariaDB, while PostgreSQL has had this kind of optimization for a long time.)
Regarding overhead: you will be storing your 64 bytes of telemetry data in an unpacked form, probably, so your records will take more than 64 bytes on disk. Any kind of structured storage will suffer from this.
If you go with an SQL solution, backups are easy: just dump the data and you can restore it afterwards.

How to insert a file in MySQL database?

I want to insert a file into a MySQL database residing on a remote webserver, using a webservice.
My question is: What type of table column (e.g. varchar, etc.) will store a file? And will the insert statement be somewhat different in case of a file?
File size by MySQL type:
TINYBLOB 255 bytes = 0.000255 MB
BLOB 65535 bytes = 0.0655 MB
MEDIUMBLOB 16777215 bytes = 16.78 MB
LONGBLOB 4294967295 bytes = 4294.97 MB = 4.295 GB
Yet, in most cases, I would NOT recommend storing big blobs of bytes in a database even if it supports it, because it will increase the overall database size and may cause real performance issues. You can read more on the topic here. Many databases that care about consistent performance won't even let you do such a thing: AWS DynamoDB, for example, which is known to perform extremely well at any scale, limits a single item record to 400KB. MongoDB does allow 16MB, which is also already too much, imo. MySQL allows all 4GB if you wish. But again, think twice before doing that. The case where you may be OK storing a big blob of data with these column types would be: you have a small-traffic database and you just want to keep all the stuff in one place for faster development, like an internal system in a small company.
The BLOB datatype is best for storing files.
See: How to store .pdf files into MySQL as BLOBs using PHP?
The MySQL BLOB reference manual has some interesting comments
The other answers will give you a good idea how to accomplish what you have asked for...
However
There are not many cases where this is a good idea. It is usually better to store only the filename in the database and the file on the file system.
That way your database is much smaller, can be moved around more easily, and, more importantly, is quicker to back up and restore.
You need to use a BLOB; there's TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB. As with other types, choose one according to your size needs.
TINYBLOB 255
BLOB 65535
MEDIUMBLOB 16777215
LONGBLOB 4294967295
(in bytes)
The insert statement would be fairly normal. In PHP, you would read the file with fread() and escape its contents before putting it into the query (a prepared statement with a bound parameter is safer than addslashes()).
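Alternatively, if the file happens to sit on the database server itself, a SQL-only sketch uses LOAD_FILE(); it requires the FILE privilege and a path permitted by secure_file_priv, and the table and column names here are hypothetical:
INSERT INTO documents (name, content)
VALUES ('report.pdf', LOAD_FILE('/var/lib/mysql-files/report.pdf'));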