How should I convert a MyISAM table to InnoDB on a live, working site in real time?
Should I just change the "Storage Engine" setting to InnoDB in phpMyAdmin's Operations menu?
Would that lock the whole table during the conversion?
You will indeed get a write lock on the table. You will still be able to read from the table while the new table is being created and copied over.
From the documentation on ALTER TABLE:
While ALTER TABLE is executing, the original table is readable by
other sessions (with the exception noted shortly). Updates and writes
to the table that begin after the ALTER TABLE operation begins are
stalled until the new table is ready, then are automatically
redirected to the new table without any failed updates.
...
The exception referred to earlier is that ALTER TABLE blocks reads
(not just writes) at the point where it is ready to install a new
version of the table .frm file, discard the old file, and clear
outdated table structures from the table and table definition caches.
At this point, it must acquire an exclusive lock. To do so, it waits
for current readers to finish, and blocks new reads (and writes).
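For reference, the engine change itself is a single statement; the table name here is a placeholder:

ALTER TABLE mytable ENGINE=InnoDB;

On a small or lightly written table that may be perfectly acceptable; the locking behaviour described above matters mainly for large, busy tables.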
Check out pt-online-schema-change. See the example in this answer: https://dba.stackexchange.com/questions/60570/best-way-to-add-a-new-column-to-a-large-table-mysql-myisam/60576#60576
It only blocks the table for a second or so at the end, while it renames the two tables.
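As a hedged sketch (database and table names are placeholders; you can rehearse with --dry-run before committing to --execute), an invocation for an engine change could look like:

pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=mytable --execute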
While it might work, you'd better make a copy of your table first and try the storage engine change on the copy.
That would also give you an estimate of the time needed for the operation.
Even better, do that on a copy of your whole application and check extensively whether it works at all with the new storage engine.
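A minimal sketch of such a rehearsal, with placeholder names (note that the copy itself needs as much disk space as the original table):

CREATE TABLE mytable_copy LIKE mytable;
INSERT INTO mytable_copy SELECT * FROM mytable;
ALTER TABLE mytable_copy ENGINE=InnoDB;  -- time this to estimate the real run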
I was wondering what happens to the binlog when you run an alter using pt-online-schema-change or gh-ost.
For pt-online-schema-change, I have read that it copies the table and uses some triggers to apply the changes. I don't know whether it creates the table with the new schema from the beginning, or applies the alter after copying the table.
If it alters the table from the beginning, then what happens to the binlog?
Are the positions different from the previous binlog?
pt-online-schema-change copies the table structure and applies the desired ALTER TABLE to the new, zero-row table. This is virtually instantaneous. Then it creates triggers to mirror changes against the original table. Then it starts copying old data from the original table to the new table.
What happens to the binlog? It gets quite huge. The CREATE TABLE and ALTER TABLE and CREATE TRIGGER are pretty small. DDL is always statement-based in the binlog. The DML changes created by the triggers and the process of copying old data become transactions in the binlog. We prefer row-based binlogs, so these end up being pretty bulky.
gh-ost is similar, but without the triggers. gh-ost reads the binlog to find events that applied to the old table, and it applies those to the new table. Meanwhile, it also copies old data. Together these actions result in a similar volume of extra events in the binlog as occur when using pt-online-schema-change.
So you should check the amount of free disk space before you begin either of these online schema change operations. It will expand the binlogs approximately in proportion to the amount of data to be copied. And of course you need to store two copies of the whole table — the original and the altered version — temporarily, until the original table can be dropped at the end of the process.
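You can watch the binlog growth while the copy runs; SHOW BINARY LOGS lists each binlog file with its size in bytes:

SHOW BINARY LOGS;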
I have had to run pt-online-schema-change on large tables (500 GB+) when I had a disk that was close to full. It causes some tense moments. I had to PURGE BINARY LOGS periodically to get some more free space, because the schema change would have filled the disk to 100% if I hadn't! This is not a situation I recommend.
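For reference, binlogs can be purged up to a point in time or up to a named file; only do this once no replica or backup job still needs them:

PURGE BINARY LOGS BEFORE NOW() - INTERVAL 1 HOUR;
PURGE BINARY LOGS TO 'mysql-bin.000123';  -- file name is a placeholder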
In my production database, the Alerts-related tables were created with a default charset of "latin"; because of this, we get an error when we try to insert Japanese characters into them. We need to change the tables' and columns' default charset to UTF8.
As these tables hold huge amounts of data, the ALTER command might take a very long time (it took 5 hours in my local DB with the same amount of data) and lock the table, which would cause data loss for us. Can we plan a mechanism to change the charset to UTF8 without data loss?
Which is the best way to change the charset for tables with huge data?
I found this in the MySQL manual (http://dev.mysql.com/doc/refman/5.1/en/alter-table.html):
In most cases, ALTER TABLE makes a temporary copy of the original
table. MySQL waits for other operations that are modifying the table,
then proceeds. It incorporates the alteration into the copy, deletes
the original table, and renames the new one. While ALTER TABLE is
executing, the original table is readable by other sessions. Updates
and writes to the table that begin after the ALTER TABLE operation
begins are stalled until the new table is ready, then are
automatically redirected to the new table without any failed updates.
So yes -- it's tricky to minimize downtime while doing this. It depends on the usage profile of your table: are there more reads or more writes?
One approach I can think of is to use some sort of replication: create a new Alert table that uses UTF-8, and find a way to replicate the original table to the new one without affecting availability or throughput. When the replication is complete (or close enough), switch the tables by renaming them.
Of course, this is easier said than done; more research is needed to find out whether it's even possible.
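A rough sketch of that idea in SQL, with placeholder names (the hard part, keeping the copy in sync, is elided here):

CREATE TABLE alerts_utf8 LIKE alerts;
ALTER TABLE alerts_utf8 CONVERT TO CHARACTER SET utf8;  -- or utf8mb4 for the full range
-- ... copy/replicate rows from alerts into alerts_utf8 here ...
RENAME TABLE alerts TO alerts_old, alerts_utf8 TO alerts;  -- the swap is atomic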
You may take a look at the Percona Toolkit's online-schema-change tool:
pt-online-schema-change
It does exactly this - "alters a table's structure without blocking reads or writes" - with some limitations (only InnoDB tables, etc.) and risks involved.
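Under the same caveats (placeholder names; rehearse with --dry-run before --execute), a charset change might look like:

pt-online-schema-change --alter "CONVERT TO CHARACTER SET utf8" D=mydb,t=alerts --execute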
Create a replicated copy of your database on another machine or instance. Once replication is set up, issue a STOP SLAVE command and alter the table on the replica. If you have more than one table, consider issuing START SLAVE between each conversion to let the two databases resynchronize (if you don't, catching up may take longer). When you have completed the conversion, the replicated copy can replace your old production database, and you can remove the old one. This is the way I found to minimize downtime.
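A hedged sketch of the per-table step on the replica, with a placeholder table name:

STOP SLAVE;
ALTER TABLE alerts CONVERT TO CHARACTER SET utf8;  -- runs only on the replica
START SLAVE;  -- let the replica catch up before the next table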
Adding a new column or a new index can take hours or even days on large InnoDB tables in MySQL with more than 10 million rows. What is the best way to increase performance on large InnoDB tables in these two cases? More memory, tweaking the configuration (for example, increasing sort_buffer_size or innodb_buffer_pool_size), or some kind of trick? Instead of altering a table directly, one could create a new one, change it, and copy the old data to the new one, like this (which is useful for MyISAM tables and multiple changes):
CREATE TABLE tablename_tmp LIKE tablename;          -- empty copy with the same structure
ALTER TABLE tablename_tmp ADD fieldname fieldtype;  -- apply the change to the empty table
INSERT INTO tablename_tmp SELECT * FROM tablename;  -- copy all existing rows
ALTER TABLE tablename RENAME tablename_old;         -- move the original out of the way
ALTER TABLE tablename_tmp RENAME tablename;         -- put the new table in its place
Is this recommendable for InnoDB tables too, or is it just what the ALTER TABLE command does anyway?
Edit 2016: we recently (August 2016) released gh-ost; I'm modifying my answer to reflect it.
Today there are several tools that allow you to do an online ALTER TABLE for MySQL. These are:
edit 2016: gh-ost: GitHub's triggerless schema migration tool (disclaimer: I am the author of this tool)
oak-online-alter-table, as part of the openark-kit (disclaimer: I am the author of this tool)
pt-online-schema-change, as part of the Percona Toolkit
Facebook's online schema change for MySQL
Let's consider the "normal" `ALTER TABLE`:
A large table will take a long time to ALTER. innodb_buffer_pool_size is important, and so are other variables, but on a very large table they are all negligible. It just takes time.
What MySQL does to ALTER a table is create a new table with the new format, copy all the rows, then switch over. During this time the table is locked for writes.
Consider your own suggestion:
It will most probably perform worst of all the options. Why is that? Because you're using an InnoDB table, the INSERT INTO tablename_tmp SELECT * FROM tablename makes for a transaction - a huge one. It will create even more load than a normal ALTER TABLE.
Moreover, you will have to shut down your application at that time so that it does not write (INSERT, DELETE, UPDATE) to your table. If it does, your whole copy is inconsistent and the exercise is pointless.
What the online tools provide
The tools do not all work alike. However, they all share the same basics:
They create a "shadow" table with altered schema
They create and use triggers to propagate changes from the original table to the shadow table
They slowly copy all the rows from your table to shadow table. They do so in chunks: say, 1,000 rows at a time.
They do all the above while you are still able to access and manipulate the original table.
When satisfied, they swap the two, using a RENAME.
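To make that concrete, here is a much-simplified sketch of the trigger-based flow in plain SQL, with placeholder table and column names (the real tools also create UPDATE and DELETE triggers, size chunks adaptively, and handle many edge cases):

CREATE TABLE mytable_shadow LIKE mytable;
ALTER TABLE mytable_shadow ADD COLUMN new_col INT;  -- the desired change

CREATE TRIGGER mytable_osc_ins AFTER INSERT ON mytable FOR EACH ROW
REPLACE INTO mytable_shadow (id, payload) VALUES (NEW.id, NEW.payload);

INSERT IGNORE INTO mytable_shadow (id, payload)
SELECT id, payload FROM mytable WHERE id BETWEEN 1 AND 1000;  -- one chunk

RENAME TABLE mytable TO mytable_old, mytable_shadow TO mytable;  -- atomic swap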
The openark-kit tool has been in use for 3.5 years now. The Percona tool is only a few months old, but possibly better tested than the former. Facebook's tool is said to work well for Facebook, but does not provide a general solution for the average user. I haven't used it myself.
Edit 2016: gh-ost is a triggerless solution, which significantly reduces the write load on the master, decoupling the migration's write load from the normal load. It is auditable, controllable, and testable. We developed it internally at GitHub and released it as open source; we do all our production migrations via gh-ost today. See more here.
Each tool has its own limitations; look closely at the documentation.
The conservative way
The conservative way is to use an Active-Passive Master-Master replication, do the ALTER on the standby (passive) server, then switch roles and do the ALTER again on what used to be the active server, now turned passive. This is also a good option, but requires an additional server, and deeper knowledge of replication.
A rename screws up referencing tables.
If you have, say, a table_2 that is a child of tablename, then after ALTER TABLE tablename RENAME tablename_old;
table_2 will start pointing to tablename_old.
Without altering table_2, you cannot point it back to tablename. You have to keep making such alterations in every child and referencing table.
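A tiny illustration of how the rename drags the foreign key along, with assumed names:

CREATE TABLE parent (id INT PRIMARY KEY) ENGINE=InnoDB;
CREATE TABLE child (
  id INT PRIMARY KEY,
  parent_id INT,
  FOREIGN KEY (parent_id) REFERENCES parent (id)
) ENGINE=InnoDB;

ALTER TABLE parent RENAME parent_old;
SHOW CREATE TABLE child;  -- the foreign key now references parent_old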
I am using the InnoDB engine on my MySQL server.
I have a patch script to upgrade my tables like add new columns and fill in default data.
I want to make sure there is no other session using the database. So I need a way to lock the database:
The lock shouldn't kick out an existing session. If there is any other existing session, the lock should simply fail and report an error.
The lock needs to prevent other sessions from reading/writing/changing the database.
Thanks a lot everyone!
You don't need to worry about locking tables yourself. As the MySQL documentation (http://dev.mysql.com/doc/refman/5.1/en/alter-table.html) says:
In most cases, ALTER TABLE makes a temporary copy of the original
table. MySQL waits for other operations that are modifying the table,
then proceeds. It incorporates the alteration into the copy, deletes
the original table, and renames the new one. While ALTER TABLE is
executing, the original table is readable by other sessions. Updates
and writes to the table that begin after the ALTER TABLE operation
begins are stalled until the new table is ready, then are
automatically redirected to the new table without any failed updates.
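In other words, a patch script can usually just issue its statements directly; a minimal sketch with placeholder names:

ALTER TABLE mytable ADD COLUMN new_col INT NOT NULL DEFAULT 0;
UPDATE mytable SET new_col = 1 WHERE some_flag = 'x';  -- backfill non-default data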
So I have been asked to change the engine of a few tables in a production database from MyISAM to InnoDB. I am trying to figure out how that will affect usage in production (as the server can afford no downtime).
I have read some conflicting information. Some sources state that the tables are locked and will not receive updates until after the conversion completes (i.e., updates are not queued, just discarded until it completes).
In other places, I have read that while the table is locked, the inserts and updates will be queued until the operation is complete, and THEN the write actions are performed.
So what exactly is the story here?
This is directly from the manual:
In most cases, ALTER TABLE makes a temporary copy of the original
table. MySQL waits for other operations that are modifying the table,
then proceeds. It incorporates the alteration into the copy, deletes
the original table, and renames the new one. While ALTER TABLE is
executing, the original table is readable by other sessions. Updates
and writes to the table that begin after the ALTER TABLE operation
begins are stalled until the new table is ready, then are
automatically redirected to the new table without any failed updates.
So, number two wins. They're not "failed", they're "stalled".
The latter is correct. Writes (INSERT, UPDATE, DELETE) against a table that's being altered are blocked and queued until the alter completes, then processed once it finishes. Reads (SELECT) can still proceed while the copy runs; per the manual quote above, they are only blocked briefly at the very end, when the new table is swapped into place.
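If you want to see this for yourself, you can run an engine change in one session (placeholder name) and watch the queued statements from another; on MySQL 5.5 and later, stalled statements wait on the table's metadata lock:

ALTER TABLE mytable ENGINE=InnoDB;  -- session 1

SHOW PROCESSLIST;  -- session 2: queued writes show state "Waiting for table metadata lock"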