Move very large MySQL table

I have a few very large MySQL tables on an Amazon standard EBS 1 TB volume (the file-per-table flag is ON and each .ibd file is about 150 GB). I need to move all these tables from database db1 to database db2. Along with this, I would also like to move the tables out to a different Amazon volume (which I think is considered a different partition/file-system, even if the file-system type is the same). The reason I am moving to another volume is so I can get another 1 TB of space.
Things I have tried:
RENAME TABLE db1.tbl1 TO db2.tbl1 does not help because it cannot move the table out to a different volume. I cannot mount a volume at db2's directory because then it is considered a different file-system and MySQL fails with an error:
"Invalid cross-device link" error 18
Created a stub db2.tbl1, stopped MySQL, deleted db2's tbl1.ibd and copied over db1's tbl1.ibd. Doesn't work (is the db information buried in the .ibd?)
I do not want to try the obvious mysqldump/import or SELECT ... INTO OUTFILE / LOAD DATA INFILE routes, because each table takes a day and a half to move even with most optimizations (foreign-key checks off, etc.). If I take the indexes out before the import, re-indexing takes a long time and the overall time is still too long.
Any suggestions would be much appreciated.

Usually what I would suggest in this case is to create an EC2 snapshot of the volume and restore that snapshot onto your new, larger volume.
You'll need to resize the partition and filesystem afterwards.
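Roughly, with the AWS CLI it might look like this (the volume ID, snapshot ID, availability zone, and device name below are placeholders, and the filesystem is assumed to be ext4):

# snapshot the existing 1 TB data volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "db1 data copy"
# create a larger volume from that snapshot, then attach and mount it on the instance
aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 --size 2000 --availability-zone us-east-1a
# grow the filesystem to use the extra space (use xfs_growfs instead for XFS)
sudo resize2fs /dev/xvdf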
As a side note, if your database is that large, EBS might be a major bottleneck. You're better off with locally attached storage, but unfortunately the process is a bit different.
You might want to use Percona xtrabackup for this:
https://www.percona.com/doc/percona-xtrabackup/LATEST/index.html
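As a rough sketch, an xtrabackup copy onto the new volume might look like this (the paths are placeholders; exact options depend on your version, so check the docs above):

# take a physical backup of the running server onto the new volume
xtrabackup --backup --target-dir=/mnt/newvol/backup
# apply the redo log so the backup is consistent
xtrabackup --prepare --target-dir=/mnt/newvol/backup
# with mysqld stopped, copy the prepared files into the new data directory
xtrabackup --copy-back --target-dir=/mnt/newvol/backup --datadir=/mnt/newvol/mysql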

Related

Free up space in MySQL 5.6.20 - InnoDB

First off, I'm not a DB guy. Here is the problem: the data drive for the database is 96% full. In my.cnf there is a line that has the following (only showing part due to space):
innodb_data_file_path=nmsdata1:4000M;nmsdata2:4000M;
going up to
nmsdata18:4000M:autoextend
So in the folder where the files are stored, files 1-17 are 4 GB in size and file 18 is 136 GB as of today.
I inherited the system and it has no vendor support or much documentation.
I can see there are a few tables that are really large
Table_name NumRows Data Length
---------- ------- -----------
pmdata 100964536 14199980032
fault 310864227 63437946880
event 385910821 107896160256
I know there are a ton of writes happening, and there should be a cron job that tells it to only keep the last 3 months of data, but I am concerned the DB is fragmented and not releasing space back for use.
So my task is to free up space in the DB so the drive does not fill up.
This is a weakness of InnoDB: tablespaces never shrink. They grow, and even if you "defragment" the tables, the rows just get rewritten internally to another part of the tablespace, leaving more of the tablespace "free" for use by other data, but the size of the file on disk does not shrink.
Even if you DROP TABLE, that doesn't free space to the drive.
This has been a sore point for InnoDB for a long time: https://bugs.mysql.com/bug.php?id=1341 (reported circa 2003).
The workaround is to use innodb_file_per_table=1 in your configuration, so each table has its own tablespace. Then when you use OPTIMIZE TABLE <tablename> it defragments by copying data to a new tablespace, in a more efficient, compact internal layout, and then drops the fragmented one.
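For a single table, the rebuild would look something like this (the database name is just an example; pmdata is one of the large tables above):

-- in my.cnf (and restart), or SET GLOBAL innodb_file_per_table=1
-- rebuild the table into its own compact .ibd file
OPTIMIZE TABLE nms.pmdata;
-- InnoDB typically reports "Table does not support optimize, doing recreate + analyze instead"; that is expected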
But there's a big problem with this in your case. Even if you were to optimize tables after setting innodb_file_per_table=1, their data would be copied into new tablespaces, but that still wouldn't shrink or drop the old multi-table tablespaces like your nmsdata1 through 18. They would still be huge, but "empty."
What I'm saying is that you're screwed. There is no way to shrink these tablespaces, and since you're full up on disk space, there's no way to refactor them either.
Here's what I would do: Build a new MySQL Server. Make sure innodb_file_per_table=1 is configured. Also configure the default for the data file path: innodb_data_file_path=ibdata1:12M:autoextend. That will make the central tablespace small from the start. We'll avoid expanding it with data.
Then export a dump of your current database server, all of it, and import it into your new MySQL server. The import will obey the file-per-table setting, and the data will create and fill a new tablespace for each table.
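A minimal sketch of the dump and reload (the file path and connection details are placeholders):

# on the old server: one consistent dump of everything
mysqldump --all-databases --single-transaction --routines --triggers > full_dump.sql
# on the new server, with innodb_file_per_table=1 already configured
mysql < full_dump.sql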
This is also an opportunity to build the new server with larger storage, given what you know about the data growth.
It will take a long time to import so much data. How long depends on your server performance specifications, but it will take many hours at least. Perhaps days. This is a problem if your original database is still taking traffic while you're importing.
The solution to that is to use replication, so your new server can "catch up" from the point where you created the dump to the current state of the database. This procedure is documented, but it may be quite a bit of learning curve for someone who is not a database pro, as you said: https://dev.mysql.com/doc/refman/8.0/en/replication-howto.html
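Very roughly, the catch-up on the new server looks like this, assuming the dump was taken with binary log coordinates (e.g. mysqldump --master-data=2) and a replication user exists; the host, credentials, and coordinates below are placeholders (newer MySQL versions use the CHANGE REPLICATION SOURCE TO / START REPLICA forms):

CHANGE MASTER TO
  MASTER_HOST='old-server',
  MASTER_USER='repl',
  MASTER_PASSWORD='repl-password',
  MASTER_LOG_FILE='mysql-bin.000123',
  MASTER_LOG_POS=4;
START SLAVE;
-- watch Seconds_Behind_Master drop to 0, then cut applications over
SHOW SLAVE STATUS\G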
You should probably get a consultant who knows how to do this work.

Reclaim disk space from failed insert

I foolishly tried to add a column to a table that I did not have enough space on disk to copy, and had to kill the query and expand my RDS instance's storage capacity to avert a site crash. I would like to do it again (this time with enough disk space) but I can't seem to get back to my pre-query free storage levels. My query was to create a table like the giant table, add a column, and then insert the entire contents of the old table (together with NULL for the new column) into the new table. I tried CALL mysql.rds_rotate_slow_log; and CALL mysql.rds_rotate_general_log; but judging by my AWS CloudWatch panel, I'm still down ~10GB from my pre-query levels. No rows were successfully inserted into the new table. Is there some "clear hdd cache" command or something like that? Since it's RDS, I don't have access to the instance that's running it, but I do have master user and RDS CLI access.
EDIT: It seems my problem may be related to giant ibdata files but since I don't have root access, I can't really execute the solutions mentioned in How to shrink/purge ibdata1 file in MySQL
The solution was to drop the new table. I didn't think that anything was stored in the new table because select count(*) from new_table; returned 0 but I guess the temporary data was tied in to the new table anyway. I'm not sure how exactly this works from a database structural point of view but fortunately it did what I wanted.
Bottom line: killed inserts still use storage space.
If somebody can explain why this is the case, it would be helpful for the future.
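For anyone hitting the same thing, the cleanup amounted to something like this (the table name is whatever your aborted copy was called). Most likely the space was held because the rolled-back rows had already been written into the new table's tablespace, and InnoDB never hands pages inside a tablespace back to the OS until the table is dropped or rebuilt:

-- see how much space the abandoned copy is holding
SELECT table_name,
       (data_length + index_length + data_free) / 1024 / 1024 AS size_mb
FROM information_schema.tables
WHERE table_name = 'new_table';
-- dropping it releases the space (with file-per-table tablespaces)
DROP TABLE new_table;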

Reducing priority of MySQL commands/jobs (add an index/other commands)?

We have a moderately large production MySQL database. Periodically, we will run commands, usually via a rails migration, that while running, bog down the database. As a specific example, we might add an index to a large table.
Is there any method that can reduce the priority MySQL gives to a specific task? A sort of "nice" within MySQL itself? I found this, which is what inspired the question:
PostgreSQL tips and tricks
Since adding an index causes the work to be done within the DB and MySQL process, lowering the priority of the Rails migration process seems like it won't help. Are there other ways we can lower the priority?
We use multiple, replicated database servers to make changes like this.
In our case, db1 is the master, replicated to db2. (db1->db2).
Start by making the change to db2. If things lock, replication will stall, but that's OK.
Move your traffic to db2. Any remnant traffic going to db1 will replicate over, so you won't lose anything.
Once there's no traffic on db1, rebuild it as a slave of db2 (db2->db1).
That's the general idea: you get very little downtime and you don't have to pull an all-nighter! We actually have three servers, so it's a little more complicated, but not much.
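As a sketch of the same sequence (the table and index names are just examples):

-- on db2, the replica: add the index; replication from db1 may stall while it runs
ALTER TABLE orders ADD INDEX idx_created_at (created_at);
-- point the application at db2; once db1 has no traffic, rebuild it as a replica of db2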
Good luck.
Unfortunately, there is no simple way to do this: commands that alter the database structure don't have a priority option.
If your tables are MyISAM, you can try this:
mysqlhotcopy to make a backup of the table
import that backup into a different database server (one that's not under load)
make the changes there
make a mysqlhotcopy backup of the altered table
import it into the live server
Note that this may or may not be faster than adding the index on the live server, depending on the time it takes you to transfer the table back and forth.
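A rough sketch of the round trip, assuming the table is MyISAM and that the names and paths below are placeholders (mysqlhotcopy has to run on the database host itself):

# on the live server: copy the table's .frm/.MYD/.MYI files while it is briefly read-locked
mysqlhotcopy --user=root --password=secret mydb./big_table/ /backup/mydb
# move those files to the idle server, add the index there, then copy the altered files back the same way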

SQL Server synchronization with cutoff

I have a production DB (running on SQL Server 2008) with some ever-growing tables (orders etc). These tables are large and keep growing, so I want to make a cutoff at some point, but naturally, I do not want to lose the history entirely. So, I thought along the lines of:
One time: Backup the entire DB to another server
Periodically:
Back up differentially / synchronize from Production DB to Backup DB
In the Production DB, delete all rows older than the cutoff period
This would not, of course, replace the regular backup plan of the production server, but rather would allow shrinking its size while keeping the historical data available off-site, where I can use it for statistics and whatnot.
Does this make sense? And if it does, could you point me towards some solution / tool which allows this, other than manually writing code for EACH of the ever-growing tables?
Any advice will be appreciated.
Micky
Maybe partitioning will help you.
It lets you split a table across different data files and filegroups, and you can back up and restore each partition independently.
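A minimal SQL Server sketch, where the partition function/scheme names, the date boundaries, and the filegroups are all illustrative (the filegroups must already exist):

-- split rows by order date into yearly ranges
CREATE PARTITION FUNCTION pf_OrderDate (datetime)
    AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01');
-- map each range to a filegroup that can be backed up and restored on its own
CREATE PARTITION SCHEME ps_OrderDate
    AS PARTITION pf_OrderDate TO (fg_history, fg_2012, fg_current);
-- then create (or rebuild) the large table ON ps_OrderDate(OrderDate)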

Periodically replacing a running MySQL database

I've got a large-ish MySQL database which contains weather information. The DB is about 3.5 million rows, and 486 MB in size.
I get a refresh of this data every 6 hours, in the form of a mysql dump file that I can import. It takes more than 2 minutes to import the data and probably a similar amount of time to create the index.
Any thoughts on how I can import this data while still keeping the DB available and not losing responsiveness? My first thought was to have two databases within the same MySQL instance. I'd be running off DB1 and would load data into DB2 and then switch. However, I'm concerned that the load process would make DB1 unresponsive (or significantly slow).
So, my next thought is to have two different MySQL instances running, on different ports. While DB instance 1 is serving queries, DB instance 2 can be loaded with the next dataset. Then on the next query, the code switches to DB2.
That seems like it would work to me, but I wanted to check with people who have tried similar things in the past to see if there were any "gotchas" I was missing.
Thoughts?
Have two databases and switch between them after the import finishes each time.
Load on one database shouldn't make the other database unresponsive. 486 MB is not too big for it all to fit in memory a couple of times over, depending I guess on whether you're on a small virtual server.
But even so, two MySQL instances on one server shouldn't perform any differently than two databases on one instance, except that two instances may actually take more memory and be more complicated to set up.
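One way to make the switch atomic with the two-database approach is to load the fresh dump into a staging schema and then swap names; the schema and table names below are placeholders:

-- after loading into weather_staging, swap old and new in one atomic statement
RENAME TABLE weather_live.observations TO weather_old.observations,
             weather_staging.observations TO weather_live.observations;
-- queries never see a missing table; weather_old can then be dropped before the next refresh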