Performing ALTER TABLE on a large InnoDB table - MySQL

I've recently been thrust into the position of db admin for our server so I'm having to learn as I go. We recently found that one of our tables had maxed out the id column and needs to be migrated to bigint.
This is for an InnoDB table with roughly 301GB of data. We are running MySQL version 5.5.38. The command I'm running to migrate the table is
ALTER TABLE tb_name CHANGE id id BIGINT NOT NULL;
I kicked off the migration and we are now 18 hours in, but I'm not seeing the disk space on the server change at all, which makes me think nothing is happening. We have plenty of memory, so no concern there, but it still shows the following state when I run "show processlist;":
copy to tmp table
Does anyone have any ideas or know what I'm doing incorrectly? Please ask if you need more information.

Yes, it will take a looooong time. The disks are probably spinning as fast as they can. (SSDs employ faster hamsters.)
You can safely kill the ALTER: all it is doing is, as the state says, copying to a tmp table; only when that finishes would it rename the tmp table to be the real table and drop the old copy.
I hope you had innodb_file_per_table = ON when you started the ALTER. Else it will be expanding ibdata1, which won't shrink afterwards.
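If you do decide to abort it, and to check that tablespace setting, something along these lines should work (the thread id 12345 is just a placeholder):
SHOW VARIABLES LIKE 'innodb_file_per_table';   -- ON means the table gets its own .ibd file
SHOW PROCESSLIST;                              -- note the Id of the ALTER TABLE thread
KILL 12345;                                    -- replace 12345 with that Id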
pt-online-schema-change is an alternative. It will still take a loooooong time (with one extra 'o' because it will be slightly slower). It will do the job without blocking other activity.
This might have been a good time to check all the columns and indexes in the table:
Could some INTs be turned into MEDIUMINT or something smaller?
Are some of the INDEXes unused?
How about normalizing some of the VARCHARs?
Maybe even PARTITIONing (but not without a good reason)? Time-series is a typical use for Data Warehousing.
Summarize the data, and toss at least the older data?
If you would like further guidance, please provide SHOW CREATE TABLE.

Related

Huge wp_post table

My client has a store on WooCommerce with a 1.2 GB database. I know that a similar store (counting by products) should be approximately 700 MB.
The biggest table is wp_posts (760 MB) alone, which I think is strange. Usually the biggest table is wp_postmeta or wp_options.
I tried to optimize this database with plugins (WP-Sweep and WP-Optimize), so there are no revisions or drafts left.
I also tried SQL:
OPTIMIZE TABLE
but it is InnoDB, so it does not support it. I get this message:
Table does not support optimize, doing recreate + analyze instead
So is it done? I mean the "recreate + analyze" - or do I still need to do it? And how?
I read that with InnoDB I should dump the table and restore it, but when I do this via DBeaver I get the same size.
Any idea what I should do?
The error message is a bit misleading, because it dates back to the days when MyISAM was the default storage engine, and OPTIMIZE TABLE does a few things in MyISAM that are different from what it does in InnoDB. For example, MyISAM can't reclaim space from deleted rows until you do OPTIMIZE TABLE (whereas InnoDB does reclaim space dynamically).
InnoDB does support OPTIMIZE TABLE and it does useful things. It does basically the same as an ALTER TABLE when using the COPY algorithm. That is, it creates a new file, and copies the data row by row into the new file. This accomplishes defragmentation and rebuilding the indexes, just as if you had done a dump and restore. So you don't need to dump and restore.
After OPTIMIZE TABLE, the InnoDB table may be close to the same size it was before, if there was little fragmentation.
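You can check how much space the table uses and how much of it is reclaimable before deciding to rebuild; a rough example, assuming the schema is named wordpress:
-- Size and reclaimable space for the table
SELECT table_name,
       ROUND(data_length / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_free / 1024 / 1024) AS free_mb
FROM information_schema.tables
WHERE table_schema = 'wordpress' AND table_name = 'wp_posts';

-- Rebuild and analyze (this is the "recreate + analyze" the message refers to)
OPTIMIZE TABLE wp_posts;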
Frankly, a table 1.2GB in size is not so large by the standards of most MySQL projects I've worked on. We start to get concerned if a table is larger than 500GB, and we start alerting developers if the table is larger than 800GB, or larger than the remaining free disk space.

Delete many rows from a large Percona MySQL DB

I need a fresh opinion on the case. Any thoughts are appreciated.
Input: we have a huge Percona MySQL (5.5) database that takes up a couple of terabytes. The tables are on the InnoDB engine.
More than half (roughly 2/3) of that data should be deleted as quickly as possible.
We also have a master-slave configuration.
As the quickest way to achieve this, I am considering the following solution.
Execute the following for each table on the slave server (to avoid production downtime) - a rough SQL sketch is below the list:
Stop replication
Select the rows NOT to be deleted into an empty new table that has the same structure as the original table
Rename original table to "table_old", new table - to correct name
Drop the original table "table_old"
Start replication
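To make it concrete, per table I have something like this in mind (the table name and the WHERE condition that selects the rows to keep are just examples):
STOP SLAVE;

-- Copy only the rows to keep into an empty table with the same structure
CREATE TABLE mytable_new LIKE mytable;
INSERT INTO mytable_new SELECT * FROM mytable WHERE created_at >= '2014-01-01';

-- Swap the tables, then drop the old data
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;
DROP TABLE mytable_old;

START SLAVE;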
The problem is that we have a lot of FK constraints. Also I am afraid to break the replication during this process.
Questions:
1) What potential problems could there be with the FK constraints in this solution?
2) How do I avoid breaking replication?
3) Opinions? Alternative solutions?
Thank you in advance.
If you can take the DB offline (i.e. no one is accessing it except you) for a while, you can go with your solution, but you need to drop the FKs involved beforehand and recreate them afterwards. You should also check for AUTO_INCREMENT columns, whose counters will change with the copy operation.
The FKs are needed if you want the DB to stay online. I had a similar problem with some huge log tables; any attempt to delete all the records at once will probably lock the database or corrupt the table.
So I went for a slow approach: I made a procedure that deletes batches of rows from the tables using the clustered primary key, and then I scheduled it to run every n seconds.
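A rough sketch of that kind of procedure (the table, condition, batch size and interval are only examples, and the event scheduler has to be enabled):
DELIMITER //
CREATE PROCEDURE purge_one_batch()
BEGIN
  -- Delete a small chunk at a time so each statement only locks a few rows
  DELETE FROM log_table
  WHERE created_at < '2014-01-01'
  ORDER BY id
  LIMIT 5000;
END //
DELIMITER ;

-- Requires event_scheduler = ON; spreads the load out instead of one huge DELETE
CREATE EVENT purge_old_rows
  ON SCHEDULE EVERY 30 SECOND
  DO CALL purge_one_batch();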

Converting a big MyISAM table to InnoDB

I'm trying to convert a 10-million-row MySQL MyISAM table to InnoDB.
I tried ALTER TABLE, but that made my server get stuck, so I killed the MySQL process manually. What is the recommended way to do this?
Options I've thought about:
1. Making a new table which is InnoDB and inserting parts of the data each time.
2. Dumping the table into a text file and then doing LOAD DATA INFILE
3. Trying again and just keeping the server non-responsive until it finishes (I tried for 2 hours, and the server is a production server, so I'd prefer to keep it running)
4. Duplicating the table, removing its indexes, then converting, and then adding the indexes back
Changing the engine of the table requires a rewrite of the table, and that's why the table is unavailable for so long. Removing the indexes, converting, and then adding the indexes back may speed up the initial conversion, but adding an index creates a read lock on the table, so the effect in the end will be the same.
Making a new table and transferring the data is the way to go. Usually this is done in two parts: first copy the records, then replay any changes that were made while copying. If you can afford to disable inserts/updates on the table while leaving reads enabled, this is not a problem. If not, there are several possible solutions. One of them is to use Facebook's online schema change tool. Another option is to set the application to write to both tables while migrating the records, then switch over to the new table only. This depends on the application code, and the crucial part is handling unique keys / duplicates, since in the old table you may update a record while in the new one you need to insert it (here the transaction isolation level may also play a crucial role; lower it as much as you can).
The "classic" way is to use replication, which, as far as I know, is also done in two parts: you start replication, recording the master position, then import a dump of the database on the second server, then start it as a slave to catch up with the changes.
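A bare-bones sketch of that copy approach, assuming an auto-increment integer PK named id (names and ranges are illustrative):
-- Create an empty InnoDB copy of the table
CREATE TABLE mytable_innodb LIKE mytable;
ALTER TABLE mytable_innodb ENGINE=InnoDB;

-- Copy rows in PK ranges so no single statement runs for hours
INSERT INTO mytable_innodb SELECT * FROM mytable WHERE id BETWEEN 1 AND 1000000;
INSERT INTO mytable_innodb SELECT * FROM mytable WHERE id BETWEEN 1000001 AND 2000000;
-- ... continue for the remaining ranges, then replay rows changed in the meantime ...

-- Swap the tables once the copy has caught up
RENAME TABLE mytable TO mytable_myisam, mytable_innodb TO mytable;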
Have you tried ordering your data by the PK first? E.g.:
ALTER TABLE tablename ORDER BY PK_column;
should speed up the conversion.

MySQL ALTER TABLE on Production Databases - Any Issues?

I have about 100 databases (all the same structure, just on different servers) with approximately a dozen tables each. Most tables are small (let's say 100MB or less). There are occasional edge cases where a table may be large (let's say 4GB+).
I need to run a series of ALTER TABLE commands on just about every table in each database. Mainly adding some columns to the structure, but also a few changes like converting a column from VARCHAR to TINYTEXT (or vice versa). Also adding a few new indexes (but indexing the new columns, not existing ones, so I'm assuming that isn't a big deal).
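For reference, the statements look roughly like this (table and column names are made up for illustration):
ALTER TABLE orders ADD COLUMN shipped_at DATETIME NULL;        -- new column
ALTER TABLE orders MODIFY COLUMN notes TINYTEXT;               -- was VARCHAR(500)
ALTER TABLE orders ADD INDEX idx_shipped_at (shipped_at);      -- index on the new column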
I am wondering how safe this is to do, and if there are any best practices to this process.
First, is there any chance I may corrupt or delete data in the tables? I suspect not, but I need to be certain.
Second, I presume for the larger tables (4GB+), this may be a several-minutes to several-hours process?
Anything and everything I should know about performing ALTER TABLE commands on a production database I am interested in learning.
If it's of any value knowing, I am planning on issuing the commands via phpMyAdmin for the most part.
Thanks -
First off, before applying any changes, make backups. There are two ways you can do it: mysqldump everything, or copy your MySQL data folder.
Secondly, you may want to use mysql from the command line. phpMyAdmin will probably time out (most PHP servers have a timeout of less than 10 minutes), or you might accidentally close the browser.
Here is my suggestion.
You can fail over the apps (make sure there are no connections to any of the DBs).
You can create indexes using CREATE INDEX statements; don't use ALTER TABLE ... ADD INDEX statements (example below).
Do all of these with a script (keep all the statements in a file and run it with SOURCE).
It looks like your table sizes are very small, so it won't create any headaches.
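For example (table, column and index names are placeholders):
-- The form suggested above
CREATE INDEX idx_customer_id ON orders (customer_id);

-- Instead of
ALTER TABLE orders ADD INDEX idx_customer_id (customer_id);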

MySQL ALTER TABLE on very large table - is it safe to run it?

I have a MySQL database with a MyISAM table with 4 million rows. I update this table about once a week with about 2000 new rows. After updating, I then alter the table like this:
ALTER TABLE x ORDER BY PK DESC
I order the table by the primary key field in descending order. This has not given me any problems on my development machine (Windows with 3 GB of memory). Three times I have run it successfully on the production Linux server (with 512 MB of RAM), getting the sorted table in about 6 minutes each time; the last time I tried it, I had to stop the query after about 30 minutes and rebuild the database from a backup.
Can a 512MB server cope with that alter statement on such a large table? I have read that a temporary table is created to perform the ALTER TABLE command.
Question: Can this alter command be safely run? What should be the expected time for the alteration of the table?
As I have just read, the ALTER TABLE ... ORDER BY ... query is useful for improving performance in certain scenarios. I am surprised that the PK index does not help with this, but from the MySQL docs it seems that InnoDB does use the index. However, InnoDB tends to be slower than MyISAM. That said, with InnoDB you wouldn't need to re-order the table, but you would lose the blazing speed of MyISAM. It still may be worth a shot.
The way you explain the problems, it seems that there is too much data being loaded into memory (maybe there is even swapping going on?). You could easily check that by monitoring your memory usage. It's hard to say, as I do not know MySQL all that well.
On the other hand, I think your problem lies in a very different place: you are using a machine with only 512 MB of RAM as a database server, with a table containing more than 4 million rows... and you are performing a very memory-heavy operation on the whole table on that machine. It seems that 512 MB will not nearly be enough for that.
A much more fundamental issue I am seeing here: you are doing development (and quite likely testing as well) in an environment that is very different from the production environment. The kind of problem you are describing is to be expected. Your development machine has six times as much memory as your production machine, and I believe I can safely say that the processor is much faster as well. In that case, I suggest you create a virtual machine mimicking your production site. That way you can easily test your project without disrupting the production site.
What you're asking it to do is rebuild the entire table and all its indexes; this is an expensive operation. It will complete, but it will be vastly slower if the data doesn't fit in RAM, particularly if you have lots of indexes.
I question your judgement when choosing to run a machine with such tiny memory in production. Anyway:
Is this ALTER TABLE really necessary? What specific query are you trying to speed up, and have you tried it without?
Have you considered making your development machine more like production? I mean, using a dev box with MORE memory is never a good idea, and using a different OS is definitely not either.
There is probably also some tuning you can do to try to help; it largely depends on your schema (indexes in particular). 4M rows is not very many (for a machine with a normal amount of RAM).
Is the primary key AUTO_INCREMENT? If so, then doing ALTER TABLE ... ORDER BY isn't going to improve anything, since everything is inserted in order anyway
(unless you have lots of deletes).
I'd probably create a view instead that is ordered by the PK value, so that, for one thing, you don't need to lock up that huge table while the ALTER is being performed.
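A minimal sketch, using the table and column names from the question:
-- Presents the rows newest-first without rebuilding the table
CREATE VIEW x_newest_first AS
SELECT * FROM x
ORDER BY PK DESC;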
If you're using InnoDB, you shouldn't have to explicitly perform the ORDER BY either post-insert or at query time. According to the MySQL 5.0 manual, InnoDB already defaults to primary key ordering for query results:
http://dev.mysql.com/doc/refman/5.0/en/alter-table.html#id4052480
MyISAM tables return records in insertion order by default, instead, which may work as well if you only ever append to the table, rather than using an UPDATE query to modify any rows in-place.