Converting varbinary to longblob in MySQL

We are storing data in an InnoDB table with a VARBINARY column. However, our data size requirement has grown to over 1 MB, so I converted the column to LONGBLOB.
alter table mytable modify column d longblob;
Everything seems to be working as expected after the conversion. However, I'd like to know from people who have done this before whether anything more is required beyond converting the column as shown above, especially:
Are there any MySQL/MariaDB version-specific issues with LONGBLOB that I should take care of? There is no index on the column.
We use mysqldump to take regular backups. Do we need to change anything, since the BLOB storage mechanism seems to be different from that of VARBINARY?
Any other precautions/suggestions.
Thank you for your guidance

Related

VARCHAR(MAX) vs TEXT vs .txt file for use in MySQL database

I tried to google this, but any results I found were related to importing data from a txt file to populate the database as opposed to storing data.
To me, it seems strange that the contents of a file should be stored in a database. We're working on building an eCommerce site, and each item has a description. I assumed the standard would be to store the description in a txt file and the URL in the database, and not to store the huge contents in the database to keep the file size low and speeds high. When you need to store images in a database, you reference it using a URL instead of storing all the pixel data - why would text be any different?
That's what I thought, but everyone seems to be arguing about VARCHAR vs TEXT, so what really is the best way to store text data up to 1000 characters or so?
Thanks!
Whether you store long text data or image data in a database or in external files has no right or wrong answer. There are pros and cons on both sides—despite many developers making unequivocal claims that you should store images outside the database.
Consider that you might want the text data to:
Allow changes to be rolled back.
Support transaction isolation.
Enforce SQL access privileges to the data.
Be searchable in SQL when you create a fulltext index (see the sketch below).
Support the NOT NULL constraint, so your text is required to contain something.
Automatically be included when you create a database backup (and the version of the text is the correct version, assuring consistency with other data).
Automatically transfer the text to replica database instances.
For all of the above, you would need the text to be stored in the database. If the text is outside the database, those features won't work.
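To make the searchability point concrete, here is a minimal sketch; the table and column names are hypothetical, and InnoDB supports FULLTEXT indexes as of MySQL 5.6:

ALTER TABLE products ADD FULLTEXT INDEX ft_description (description);
SELECT id, description
FROM products
WHERE MATCH(description) AGAINST ('waterproof jacket');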
With respect to the VARCHAR vs. TEXT, I can answer for MySQL (though you mentioned VARCHAR(MAX) so you might be referring to Microsoft SQL Server).
In MySQL, both VARCHAR and TEXT max out at 64KB in bytes. If you use a multibyte character set, the max number of characters is lower.
Both VARCHAR and TEXT have a character set and collation.
VARCHAR allows a DEFAULT, but TEXT does not.
Internally, in the InnoDB storage engine, VARCHAR and TEXT are stored identically (as are VARBINARY and BLOB and all their cousins). See https://www.percona.com/blog/2010/02/09/blob-storage-in-innodb/
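To illustrate the DEFAULT difference, here is a minimal sketch; the table and column names are hypothetical, and note that a literal DEFAULT on a TEXT column is rejected (or warned about, depending on SQL mode) before MySQL 8.0.13:

CREATE TABLE demo (
  id INT PRIMARY KEY,
  short_desc VARCHAR(1000) NOT NULL DEFAULT '',  -- a DEFAULT is allowed on VARCHAR
  long_desc  TEXT NOT NULL                       -- adding DEFAULT '' here would be rejected
);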

Why does imported MySQL dump show some column value as BLOB?

I imported a MySQL table dump from my server to my local system. I am using phpMyAdmin to view my local MySQL databases. The column that holds emails shows as BLOB instead of the actual email.
When I click "edit" it shows the correct emails, but not in the listing.
I am totally confused as to why this happens. Can anyone suggest a solution?
The BLOB data type is meant to store arbitrary binary data (Binary Large OBject = BLOB), for example an image or another document. It does not make sense to show the value as-is; it would look the same as viewing an image in a text editor.
You have several options, depending on the version of phpMyAdmin you are using - which I unfortunately do not know.
Make phpMyAdmin show BLOB values by default.
Show BLOB values for a complete result set.
These two possibilities are covered by an already asked question.
But basically, this is just fighting symptoms instead of curing the disease. The question is: why did you choose the email field to be a BLOB? A VARCHAR is usually enough. I do not know the MySQL version you are running, but since MySQL 5.0.3 a VARCHAR can be as large as 65,535 bytes.
ALTER TABLE `table`
CHANGE `email` `email` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;
The query above changes the email field to a VARCHAR(255). Pay attention to the length you'd like to use, and to the character set and collation. UTF-8 should be fine in this case.
It shows the text "BLOB" as a placeholder because the BLOB datatype (which is the datatype your emails are stored in) varies vastly in size. For large BLOBs, displaying the data could take up hundreds of screens.

Calculating total data size of BLOB column in a table

I have a table with large amounts of BLOB data in a column. I am writing a utility to dump the data to the file system. But before dumping, I need to check whether the necessary space is available on the disk to export all the blob fields throughout the table.
Please suggest an efficient approach to get the size of all the blob fields in the table.
You can use the MySQL function OCTET_LENGTH(your_column_name); see the MySQL documentation for details. (OCTET_LENGTH() is a synonym for LENGTH(), which counts bytes, so the query below is equivalent.)
select sum(length(blob_column)) as total_size
from your_table
select sum(length(blob_column_name)) from desired_tablename;
Sadly, this is DB-specific at best.
To get the total size of a table with blobs in Oracle, I use the following:
https://blog.voina.org/?p=374
Sadly, this does not work in DB2; I still have to find an alternative.
The simple
select sum(length(blob_column)) as total_size
from your_table
is not a correct query, as it does not estimate the blob size correctly: only a reference to the blob is stored in your blob column. You have to get the actual allocated size on disk for the blobs from the blob repository.
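For MySQL specifically, you can also cross-check against the allocated (on-disk) size via information_schema. This is just a rough sketch: the values are estimates, especially for InnoDB, and the schema and table names are placeholders:

SELECT table_name,
       data_length + index_length AS approx_allocated_bytes
FROM information_schema.tables
WHERE table_schema = 'your_schema'
  AND table_name = 'your_table';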

What will happen to existing data if I change the collation of a column in MySQL?

I am running a production application with a MySQL database server. I forgot to change the column's collation from latin1 to utf8_unicode_ci, which results in strange data when saving multi-language data to the column.
My question is, what will happen with my existing data if I change my collation to utf8_unicode now? Will it destroy or corrupt the existing data or will the data remain, but the new data will be saved as utf8 as it should?
I will change with phpMyAdmin web client.
The article http://mysqldump.azundris.com/archives/60-Handling-character-sets.html discusses this at length and also shows what will happen.
Please note that you are mixing up a CHARACTER SET (actually an encoding) with a COLLATION.
A character set defines the physical representation of a string in bytes on disk. You can make this visible, using the HEX() function, for example SELECT HEX(str) FROM t WHERE id = 1 to see how MySQL stores the bytes of your string. What MySQL delivers to you may be different, depending on the character set of your connection, defined with SET NAMES .....
A collation is a sort order. It is dependent on the character set. For example, your data may be in the latin1 character set, but it may be ordered according to either of the two german sort orders latin1_german1_ci or latin1_german2_ci. Depending on your choice, Umlauts such as ö will either sort as oe or as o.
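You can see a collation's effect without changing the table at all. This sketch assumes a hypothetical table people with a latin1 column name:

SELECT name FROM people ORDER BY name COLLATE latin1_german1_ci; -- ö sorts like o
SELECT name FROM people ORDER BY name COLLATE latin1_german2_ci; -- ö sorts like oe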
When you are changing a character set, the data in your table needs to be rewritten. MySQL will read all data and all indexes in the table, make a hidden copy of the table which temporarily takes up disk space, then move the old table into a hidden location, move the hidden table into place, and then drop the old data, freeing up disk space. For some time in between, you will need twice the storage for that.
When you are changing a collation, the sort order of the data changes but not the data itself. If the column you are changing is not part of an index, nothing needs to be done besides rewriting the frm file, and sufficiently recent versions of MySQL should not do more.
When you are changing a collation of a column that is part of an index, the index needs to be rewritten, as an index is a sorted excerpt of a table. This will again trigger the ALTER TABLE table copy logic outlined above.
MySQL tries to preserve data doing this: As long as the data you have can be represented in the target character set, the conversion will not be lossy. Warnings will be printed if there is data truncation going on, and data which cannot be represented in the target character set will be replaced by ?
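A typical whole-table conversion looks like this; the table name and the utf8 target are placeholders, and the statement triggers the table copy described above:

ALTER TABLE t CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;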
Running a quick test in MySQL 5.1 with a VARCHAR column set to latin1_bin, I inserted some non-Latin characters:
INSERT INTO Test VALUES ('英國華僑');
I select them and get rubbish (as expected).
SELECT text from Test;
gives
text
????
I then changed the collation of the column to utf8_unicode and re-ran the SELECT; it shows the same result:
text
????
This is what I would expect: it will keep the data, and the data will remain rubbish, because when the data was inserted the column lost the extra character information and just stored a ? for each non-Latin character. There is no way for the ???? to again become 英國華僑.
Your data will stay in place but it won't be fixed.
Valid data will be properly converted:
When you change a data type using CHANGE or MODIFY, MySQL tries to convert existing column values to the new type as well as possible. Warning: This conversion may result in alteration of data.
http://dev.mysql.com/doc/refman/5.5/en/alter-table.html
... and more specifically:
To convert a binary or nonbinary string column to use a particular character set, use ALTER TABLE. For successful conversion to occur, one of the following conditions must apply: [...] If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.
http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html
So your problem is invalid data, e.g., data encoded in a different character set. I've tried the tip suggested by the documentation and it basically ruined my data, but the reason is that my data was already lost: running SELECT column, HEX(column) FROM table showed that multibyte chars had been inserted as 0x3F (i.e., the ? symbol in Latin1). My MySQL stack had been smart enough to detect that input data was not Latin1 and convert it into something "compatible". And once data is gone, you can't get it back.
To sum up:
Use HEX() to find out if you still have your data.
Make your tests in a copy of your table.
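A minimal version of that check (table and column names are placeholders):

SELECT col, HEX(col) FROM your_table LIMIT 10;
-- multi-byte sequences mean the data survived;
-- a run of 3F bytes means it was already reduced to '?' on insert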
My question is, what will happen with my existing data if I change my collation to utf8_unicode now?
Answer: If you change to utf8_unicode_ci, nothing will happen to your existing data (which is already corrupt and will remain corrupt until you modify it).
Will it destroy or corrupt the existing data or will the data remain, but the new data will be saved as utf8 as it should?
Answer: After you change to utf8_unicode_ci, existing data will not be destroyed. It will remain the same as before (something like ????). However, if you insert new data containing Unicode characters, it will be stored correctly.
I will change with phpMyAdmin web client.
Answer: Sure, you can change collation with phpMyAdmin by going to Operations > Table options
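For reference, the SQL equivalent of that phpMyAdmin operation is roughly the following; note that it only changes the table default used for new columns, and existing columns keep their collation unless you convert them explicitly (table name is a placeholder):

ALTER TABLE your_table DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;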
CAUTION! Some problems are solved via
ALTER TABLE ... CONVERT TO ...
Some are solved via a 2-step process
ALTER TABLE ... MODIFY ... VARBINARY...
ALTER TABLE ... MODIFY ... VARCHAR...
If you do the wrong one, you will have a worse mess!
Do SELECT HEX(col), col ... to see what you really have.
Study this to see what case you have: Trouble with utf8 characters; what I see is not what I stored
Perform the correct fix, based on these cases: http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases
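Spelled out, the 2-step process looks like this; the column length and target character set are assumptions you must adapt to your own schema:

ALTER TABLE your_table MODIFY col VARBINARY(255);
ALTER TABLE your_table MODIFY col VARCHAR(255) CHARACTER SET utf8;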

Data Corruption with MediumText and ASP/MySQL

I have a website written in ASP with a MySQL database. ASP uses the ODBC 5.1 driver to connect to the database. Inside the database there is a varchar(8000) column (the length started small, but the application has evolved A LOT since its conception). Anyway, it recently became evident that the varchar column should be changed into a MEDIUMTEXT column. I made the change and everything appeared alright. However, whenever I do an UPDATE statement, the data in that column for that specific row gets corrupted. Due to the nature of the website, I am unable to provide data or example queries, but the queries are not using any functions or anything; just a straight UPDATE.
Everything works fine with the varchar, but blows up when I make the field a MEDIUMTEXT. The corruption I'm talking about is as follows:
ٔڹ���������������ߘ����ߘ��������
Any ideas?
Have you checked encodings (ASP + HTML + DB)? Are you using UTF-8?
You're not using UTF-8, and that text is not English, right?
You might have a version specific bug. I searched for "mysql alter table mediumtext corruption" and there were some bugs specifically having to do with code pages and non-latin1 character sets.
Your best bet is to conduct a survey of the table, comparing it against a backup. If this is a MyISAM table, you might want to recreate the table with the CHECKSUM option enabled. What does CHECK TABLE tell you? If an ALTER TABLE isn't working for you, you could consider splitting the mediumtext field into its own table, or duplicating the table contents using a variation of an INSERT...SELECT:
CREATE TABLE b LIKE a;
ALTER TABLE b MODIFY something MEDIUMTEXT;
INSERT INTO b SELECT * FROM a LIMIT x,1000;
-- now check those 1000 rows --
By inserting a few rows at a time and then checking them, you might be able to tease out what kind of input isn't converting well.
Check dmesg and syslog output to see if you've got RAM or disk issues. I have seen table corruption occur due to ECC errors, bad RAID controllers, bad sectors and faulty network transmission. You might attempt the ALTER TABLE on a comparable machine and see if it checks out.
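For the checks mentioned above, the statements are (table name is a placeholder):

CHECK TABLE mytable;    -- looks for corruption in the table
CHECKSUM TABLE mytable; -- compare the value against a known-good copy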