Why does it take so long to rename a column in MySQL?

I have a 12 GB table full of pictures, and I'm trying to rename the blob column that holds the data; it is taking forever. Can someone give me a blow-by-blow account of why it is taking so long to rename the column? I would have thought that this operation would be pretty quick, no matter the size of the table?
EDIT: The query I ran is as follows
alter table `rails_production`.`pictures` change `data` `image_file_data` mediumblob NULL
It appears that most of the time is spent waiting for MySQL to make a temporary copy of the pictures table, which, since the table is very large, takes a while.
Changing the picture storage from the database to the filesystem is on the list of things to do.
EDIT2: MySQL server version: 5.0.51a-24+lenny2 (Debian)

I can't give you the blow-by-blow (feature request #34354 would help, except that it probably wouldn't be back-ported to MySQL 5.0), but the extra time is because an ALTER ... CHANGE may change the type of the column (and column attributes, if any), which necessitates converting the values stored in the column and performing other checks. MySQL 5.0 doesn't include optimizations for when the new type and attributes are the same as the old. From the documentation for ALTER under MySQL 5.0:
In most cases, ALTER TABLE works by making a temporary copy of the original table. The alteration is performed on the copy, and then the original table is deleted and the new one is renamed. While ALTER TABLE is executing, the original table is readable by other sessions. Updates and writes to the table are stalled until the new table is ready, and then are automatically redirected to the new table without any failed updates.
[...]
If you use any option to ALTER TABLE other than RENAME, MySQL always creates a temporary table, even if the data wouldn't strictly need to be copied (such as when you change the name of a column).
Under 5.1, ALTER has some additional optimizations:
In some cases, no temporary table is necessary:
Alterations that modify only table metadata and not table data can be made immediately by altering the table's .frm file and not touching table contents. The following changes are fast alterations that can be made this way:
Renaming a column, except for the InnoDB storage engine.
[...]
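To see where a copying ALTER is spending its time, you can watch the statement's thread state from a second session (a quick diagnostic sketch; the exact state string may vary by version):
SHOW PROCESSLIST;
-- while the 12 GB table is being rebuilt row by row, the ALTER's
-- row typically reports State: "copy to tmp table"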

Because MySQL rebuilds the entire table when you make schema changes.
In some cases that is the only way to apply the change, and doing it the same way everywhere keeps the server implementation much simpler.

Yes, MySQL makes a temporary copy of the table. I don't think there's an easy way around that. You should really think about storing the pictures on the filesystem and only keeping the paths in MySQL. That's the only way to speed it up, I guess.
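As a hedged sketch of that move (the path column name is made up for illustration):
alter table `rails_production`.`pictures` add column `image_file_path` varchar(255) NULL;
-- the application then writes each blob out to a file, fills in
-- image_file_path, and the mediumblob column can eventually be dropped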

Related

Is there a way to turn off the creation of a temp table during ALTER TABLE?

Is there a way to perform ALTER TABLE in MySQL, telling the server to skip creating a backup of the table first? I have a backup of the table already and I'm doing some tests on it (adding indexes), so I don't care if the table gets corrupted in the process. I'll just restore it from the backup. But what I do care about is for the ALTER TABLE to finish quickly, so I can see the test results.
Given that I have a big MyISAM table (700 GB), it really isn't an option to wait a couple of hours for MySQL to finish creating a backup of the original table before actually adding an index to it.
It's not doing a backup; it is building the new version. (The existing table serves as a backup in case of a crash.)
With InnoDB, there are many flavors of ALTER TABLE -- some of which take essentially zero time, regardless of the size of the table. MyISAM (mostly) does it the brute-force way: create an empty table with the new schema; copy all the data and build all the indexes; swap tables. For some alters, InnoDB must also use the brute-force way, for example changing the PRIMARY KEY.
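On MySQL 5.6 and later you can also ask the server to refuse a table copy up front instead of discovering one hours in (a sketch; these clauses don't exist in older versions, and MyISAM generally can't satisfy them):
ALTER TABLE big_table ADD INDEX idx_col (col), ALGORITHM=INPLACE, LOCK=NONE;
-- fails immediately with an error instead of silently copying
-- if the operation cannot be done in place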


Which is the better way to change the character set for huge data tables?

In my production database, the Alerts-related tables were created with a default charset of "latin", and because of this we get an error when we try to insert Japanese characters into them. We need to change the tables' and columns' default charset to UTF-8.
As these tables hold a huge amount of data, the ALTER command may take a very long time (it took 5 hrs in my local DB with the same amount of data) and lock the table, which would cause data loss. Can we plan a mechanism to change the charset to UTF-8 without data loss?
Which is the better way to change the charset for huge data tables?
I found this on mysql manual http://dev.mysql.com/doc/refman/5.1/en/alter-table.html:
In most cases, ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable by other sessions. Updates and writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table without any failed updates.
So yes -- it's tricky to minimize downtime while doing this. It depends on the usage profile of your table: are there more reads or more writes?
One approach I can think of is to use some sort of replication: create a new Alerts table that uses UTF-8, and find a way to replicate the original table to the new one without affecting availability / throughput. When the replication is complete (or close enough), switch the tables by renaming them.
Of course this is easier said than done -- you would need to look into whether it's even possible.
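At least the final switch can be made atomic: MySQL's multi-table RENAME swaps both names in one step (table names hypothetical):
RENAME TABLE alerts TO alerts_old, alerts_utf8 TO alerts;
-- both renames happen atomically, so other sessions never see
-- a moment where the alerts table is missing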
You may take a look at the Percona Toolkit online-schema-change tool:
pt-online-schema-change
It does exactly this - "alters a table’s structure without blocking reads or writes" - with some limitations (only InnoDB tables, etc.) and risks involved.
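A hedged example of what the invocation might look like for the tables in the question (database, table, and target charset are assumptions; always try --dry-run first):
pt-online-schema-change --alter "CONVERT TO CHARACTER SET utf8" D=production,t=alerts --dry-run
pt-online-schema-change --alter "CONVERT TO CHARACTER SET utf8" D=production,t=alerts --execute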
Create a replicated copy of your database on another machine or instance. Once you have set up the replication, issue a STOP SLAVE command and alter the table. If you have more than one table, between each conversion you may consider issuing START SLAVE again to synchronise the two databases. (If you do not, it may take longer to synchronise.) When you complete the conversion, the replicated copy can replace your old production database and you can remove the old one. This is the way I found to minimize downtime.
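A minimal sketch of the per-table sequence on the replica (assumes classic STOP SLAVE / START SLAVE replication as described above; the table name is hypothetical):
STOP SLAVE;
ALTER TABLE alerts CONVERT TO CHARACTER SET utf8;
START SLAVE;
-- the replica then catches up on the writes it missed before
-- you move on to the next table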

InnoDB: ALTER TABLE performance related to NULLability?

I've got a table with 10M rows, and I'm trying to ALTER TABLE to add another column (a VARCHAR(80)).
From a data-modelling perspective, that column should be NOT NULL - but the amount of time it takes to run the statement is a consideration, and the client code could be changed to deal with a NULL column if that's warranted.
Should the NULL-ability of the column I'm trying to add significantly impact the amount of time it takes to add the column either way?
More Information
The context in which I'm doing this is a Django app, with a migration generated by South - adding three separate columns, and adding an index on one of the newly-added columns. The South-generated SQL spreads this operation (adding three columns and an index) over 15 ALTER TABLE statements, which seems like it will make the operation take a whole lot longer than it should.
I've seen some references that suggest that InnoDB doesn't actually have to create a field in the on-disk file for nullable fields that are NULL, and just modifies a bitfield in the header. Would this impact the speed of the ALTER TABLE operation?
I don't think the nullability of the column has anything to do with the speed of ALTER TABLE. In most alter table operations, the whole table - with all the indexes - has to be copied (temporarily) and then the alteration is done on the copy. With 10M rows, it's kind of slow. From MySQL docs:
Storage, Performance, and Concurrency Considerations
In most cases, ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable by other sessions. Updates and writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table without any failed updates. The temporary table is created in the database directory of the new table. This can differ from the database directory of the original table for ALTER TABLE operations that rename the table to a different database.
If you want to make several changes in a table's structure, it's usually better to do them in one ALTER TABLE operation.
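Applied to the migration in the question, that means collapsing South's 15 statements into a single rebuild (column and index names are made up for illustration):
ALTER TABLE myapp_mytable
    ADD COLUMN col_a VARCHAR(80) NULL,
    ADD COLUMN col_b VARCHAR(80) NULL,
    ADD COLUMN col_c VARCHAR(80) NULL,
    ADD INDEX idx_col_a (col_a);
-- one table copy over 10M rows instead of fifteen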
Allowing client code to make changes to tables is probably not the best idea - and you have hit on one good reason not to allow it. Why do you need it? If you can't do otherwise, it would probably be better - for performance reasons - to have your client code create a new table (with the new column and the PK of the existing table) instead of adding a column.
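A sketch of that side-table idea (all names hypothetical; assumes the existing table has an integer PK named id):
CREATE TABLE myapp_mytable_extra (
    id INT NOT NULL PRIMARY KEY,
    extra_col VARCHAR(80) NOT NULL,
    FOREIGN KEY (id) REFERENCES myapp_mytable (id)
) ENGINE=InnoDB;
-- new values go here instead of rebuilding the 10M-row table;
-- read them back with a JOIN on id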

Converting a big MyISAM to InnoDB

I'm trying to convert a 10-million-row MySQL MyISAM table to InnoDB.
I tried ALTER TABLE, but it made my server get stuck, so I killed the mysql process manually. What is the recommended way to do this?
Options I've thought about:
1. Making a new table which is InnoDB and inserting parts of the data each time.
2. Dumping the table to a text file and then loading it back with LOAD DATA INFILE
3. Trying again and just keeping the server non-responsive until it finishes (I tried for 2 hours, and the server is a production server, so I prefer to keep it running)
4. Duplicating the table, removing its indexes, then converting, and then adding the indexes back
Changing the engine of the table requires a rewrite of the table, which is why the table is unavailable for so long. Removing the indexes, then converting, then adding the indexes back may speed up the initial conversion, but adding an index creates a read lock on your table, so the end effect will be the same. Making a new table and transferring the data is the way to go.
Usually this is done in two parts - first copy the records, then replay any changes that were made while the records were being copied. If you can afford to disable inserts/updates on the table while leaving reads enabled, this is not a problem. If not, there are several possible solutions. One of them is to use Facebook's online schema change tool. Another option is to set the application to write to both tables while migrating the records, then switch over to the new table only. This depends on the application code, and the crucial part is handling unique keys / duplicates: in the old table you may update a record, while in the new one you need to insert it. (Here the transaction isolation level may also play a crucial role; lower it as much as you can.)
The "classic" way is to use replication, which, as far as I know, is also done in two parts - you start replication, recording the master position, then import a dump of the database into a second server and start it as a slave to catch up with the changes.
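A minimal sketch of the copy-then-swap approach (table names hypothetical; assumes writes can be paused briefly for the final swap):
CREATE TABLE mytable_innodb LIKE mytable;
ALTER TABLE mytable_innodb ENGINE=InnoDB;  -- the empty copy converts instantly
INSERT INTO mytable_innodb SELECT * FROM mytable;
-- pause writes, replay any rows changed since the copy started, then:
RENAME TABLE mytable TO mytable_old, mytable_innodb TO mytable;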
Have you tried ordering your data by the PK first? e.g.:
ALTER TABLE tablename ORDER BY PK_column;
should speed up the conversion.