I am trying to mirror a production database to a local environment, perform a few operations on it, and then sync it back to production. Currently I do a dump of the remote database using mysqldump, load it into my local database, work with it, and when I am finished, dump it again using the same command. I don't alter the structure, only the contents of the database. Is it possible to generate "the difference" between the two dumps, meaning the SQL commands required to apply the changes back to the production database?
A bit of context on why I need this: I use a program which performs a lot of SELECT operations on the database. It is, however, very poorly written, as it waits for every query to complete before proceeding. This works locally, as the delay isn't that big, but over an SSH connection to a remote server, the delay is just too big for this to work efficiently.
Only read operations are performed on the remote database while I edit the local cache, so lost-update errors are unlikely. I can't just drop all tables and load the dump, though, as that would make the read-only operations fail during the "downtime".
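For reference, the current round trip looks roughly like this (host, user and schema names are placeholders):

mysqldump -h prod.example.com -u myuser -p myschema > dump.sql
mysql -u myuser -p myschema < dump.sql
# ... work on the local copy ...
mysqldump -u myuser -p myschema > dump_after.sql

I suspect dumping with --skip-extended-insert (one INSERT per row) would at least make the two files comparable with a line-based diff, but I don't know whether that's the right way to go.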
I created a read replica of my MySQL database on Amazon RDS.
When executing the following command, it is super fast (half a second) on the master, but takes more like 30 seconds on the slave. Super annoying because I wanted to dump off of the slave so that I don't slow down the master.
mysqldump --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql
There are three issues to consider.
The most significant is that mysqldump does not perform well when run at a distance from the database, due to limitations in the traditional MySQL client/server wire protocol, which makes no allowance for pipelining a series of commands.
The mysqldump utility uses no magic to generate dump files -- it issues SQL statements to the server, and takes the results of those queries to generate its output.
As a result, every single object (schema, table, view, stored function/procedure, event) in the database requires at least one round trip and sometimes more than one.
For each table, mysqldump first issues SHOW CREATE TABLE t1; followed by SELECT * FROM t1; ... so with a round-trip time of 100 ms, extracting a dump of 150 tables wastes at least 150 × 2 × 0.100 = 30 seconds purely on the distance between the machine running mysqldump and the server -- and this is true even if the tables are completely empty.
This is not a recommendation, but you might take a look at mydumper, which claims to be able to create the backup using multiple database connections in parallel; this could help mitigate the time wasted as commands pass to the server and results return to the client. I don't know the quality of this code base, but something like this could help.
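For illustration, a minimal parallel dump with mydumper might look something like this (an untested sketch; flag names per the mydumper docs, values are placeholders):

mydumper --host=myDomain.com --user=dev --password=myPassword --database=mySchema --threads=4 --outputdir=/tmp/mydump

Its companion tool, myloader, restores from that same directory, also in parallel.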
Next, you almost always want to use the --compress option for mysqldump. Contrary to what you might assume, this does not compress the backup file. The generated backup file is identical when this option is used, but when this feature is activated, the server compresses the data it sends to mysqldump on the wire, and mysqldump decompresses the data again before writing it out -- so this option will almost always make for a faster process unless the machine running mysqldump and the database server are connected by a low-latency, high-bandwidth network. Because the generated file is identical, there are no compatibility concerns when using this option.
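Applied to the command from the question, it's a single extra flag:

mysqldump --compress --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql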
Finally, there's an issue with newly-created RDS servers that you need to be aware of, so that it doesn't skew your benchmarks. When you create an RDS replica, it is originally seeded with data from a snapshot of the upstream master. This is, behind the scenes, an EBS snapshot of the master's hard drive, and the new database instance is backed by an EBS volume restored from that snapshot. EBS volumes are lazily-loaded from the snapshot, so they have a documented first-touch penalty. This issue could have a substantial impact on the performance of the first complete backup, but should have no meaningful impact after that.
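If you want to keep that penalty out of your measurements, one (hedged) workaround is a throwaway first pass that forces the blocks to load before you benchmark, for example:

mysqldump --compress --set-gtid-purged=OFF -h myReplica.example.com -u dev -pmyPassword mySchema > /dev/null

Here myReplica.example.com stands in for the replica's endpoint; the output is discarded because only the read matters.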
I have a MySQL database called latest, and another database called previous, both running on the same server. Both databases have identical content. Once per day, an application runs that updates latest. Later on, towards the end of the application's execution, a comparison is made between latest and previous for certain data. Any differences found trigger certain actions, e.g. notification emails being sent. After that, a copy of latest is dumped to a file using mysqldump and restored to previous. Both databases are now in sync again and the process repeats the following day.
I would like to migrate the database(s) to AWS RDS. I'm open to using Aurora, but the MySQL engine is fine too. Is there a simpler or more efficient way of performing the restore process so that both databases are in sync using RDS? A way that avoids having to use mysqldump and feeding the result into previous?
I understand that I could create a read replica of an instance running latest to act as previous, but I think that updates the read replica as the source DB is updated (well, asynchronously anyway) which would ruin the possibility of performing a comparison between the two later on.
I don't have any particular problem with using mysqldump for the restore process, but I'm just not sure if I'm missing a trick.
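For concreteness, the nightly restore step boils down to something like this (connection options omitted):

mysqldump latest | mysql previous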
If you don't want a read replica, your mysqldump approach is fine, but you could probably combine it with mysqlimport, as suggested in the MySQL docs:
Copying MySQL Databases to Another Machine
You can also use mysqldump and mysqlimport to transfer the database. For large tables, this is much faster than simply using mysqldump.
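A sketch of what that section describes, assuming you have access to the filesystem the server writes to (note that --tab makes the server itself write the data files, so this particular form generally isn't usable against RDS):

mysqldump --tab=/tmp/dump latest        # writes table.sql (schema) and table.txt (data) per table
cat /tmp/dump/*.sql | mysql previous    # recreate the table structures
mysqlimport previous /tmp/dump/*.txt    # bulk-load each .txt into the table named after the file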
I have a Windows Server with MySQL Database Server installed.
Multiple databases exist; among them, database A contains a huge table named 'tlog', about 220 GB in size.
I would like to move over database A to another server for backup purposes.
I know I can do SQL Dump or use MySQL Workbench/SQLyog to do table copy.
But due to limited disk storage on the server (less than 50 GB), an SQL dump is not possible.
The server is serving other workloads, so CPU and RAM are limited too. As a result, a table copy that eats up CPU and RAM is not an option either.
Is there any other method for moving the huge database A over to another server?
Thanks in advance.
You have a few options:
Method 1
Dump and compress at the same time: mysqldump ... | gzip > blah.sql.gz
This method is good because the dump is plain text, which compresses well, so chances are the compressed file will come in under your 50 GB; and since you compress on the fly, the uncompressed dump never touches the disk.
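If even the compressed file is tight on local disk, you can also stream it straight to the other server so nothing is stored locally (a sketch; host and database names are placeholders, and database A must already exist on the target):

mysqldump A | gzip | ssh user@backuphost "gunzip | mysql A"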
Method 2
You can use slave replication; this method will require an initial dump of the data to seed the slave.
Method 3
You can also use xtrabackup.
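A hedged sketch of streaming a physical backup with xtrabackup, so the 220 GB never needs local staging space (flag names per Percona's docs; host and paths are placeholders):

xtrabackup --backup --stream=xbstream | ssh user@backuphost "xbstream -x -C /backups/full"
# then, on the target: xtrabackup --prepare --target-dir=/backups/full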
Method 4
You can shutdown the database, and rsync the data directory.
Note: You don't actually have to shut down the database; you can instead do multiple rsync passes, and eventually nothing will change between passes (unlikely if the database is busy, so do it during a slow period), which means the data directory will have synced over.
I've had to use this method with fairly large PostgreSQL databases (1 TB+). It takes a few rsync passes, but hey, that's the cost of zero downtime.
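The passes themselves are plain rsync (paths and host are placeholders; ideally the final pass runs while the database is stopped or quiesced so the copy is consistent):

rsync -av /var/lib/mysql/ user@newhost:/var/lib/mysql/   # first pass: bulk of the data
rsync -av /var/lib/mysql/ user@newhost:/var/lib/mysql/   # repeat: copies only what changed since last pass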
Method 5
If you're in a virtual environment you could:
Clone the disk image.
If you're in AWS you could create an AMI.
You could add another disk and just sync locally; then detach the disk, and re-attach to the new VM.
If you're worried about consuming resources during the dump or transfer you can use ionice and renice to limit the priority of the dump/transfer.
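For example, launching the dump at the lowest CPU and I/O priority (ionice -c3 is the "idle" I/O class, nice -n 19 the lowest CPU priority; this uses nice at launch rather than renice on an already-running process):

nice -n 19 ionice -c3 mysqldump A | gzip > blah.sql.gz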
I maintain a server that runs daily cron jobs to aggregate data sources and generate reports, accessible by a private Ruby on Rails application.
One of our data sources is a partial dump of one of our partner's databases. The partner runs an active application, and the MySQL DB has hundreds of tables. They have given us read-only access to a relatively underpowered read-only slave of their application DB.
Because of latency issues and performance bottlenecking on their slave DB, we have been maintaining a limited local copy of their DB. We only need about 20 tables for our reports, so I only dump those tables. We also only need the data to a daily granularity, so realtime sync is not a requirement.
For a few months, I had implemented a nightly cron which streamed the dump of the necessary tables into a local production_tmp database. Then, when all tables were imported, I dropped production and renamed production_tmp to production. This was working until the DB grew to over 25GB, and we started running into disk space limitations.
For now, I have removed the redundancy step and am just streaming the dump straight into production on our local server. This feels a bit flimsy to me, and I would like to implement a safer approach. Also, currently doing the full dump/load takes our server over 2 hours, and I'd like to implement an approach that doesn't take as long. The database will only keep growing, so I'd like to implement something future proof.
Any suggestions would be appreciated!
I take it you have never heard of, or considered, MySQL Replication?
The idea is that you do your backup & restore once, and then configure the replica to "subscribe" to a continuous stream of changes as they are made on the primary MySQL instance. Any change applied to the primary is applied automatically to the replica within seconds. You don't have to do the backup & restore procedure again, unless the replica gets damaged.
It takes some care to set up and keep working, but it's a much more efficient method of keeping two instances in sync.
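Once the replica is restored from the initial backup, pointing it at the primary is only a couple of statements (MySQL 5.x syntax; host, credentials, and the binlog coordinates are placeholders that come from your backup):

CHANGE MASTER TO
    MASTER_HOST='primary.example.com',
    MASTER_USER='repl',
    MASTER_PASSWORD='replPassword',
    MASTER_LOG_FILE='mysql-bin.000123',
    MASTER_LOG_POS=4;
START SLAVE;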
@SusannahPotts mentions hot backup and/or incremental backup. You can get both of these features for free, without paying for MySQL Enterprise, by using Percona XtraBackup.
You can also consider using MySQL Transportable Tablespaces.
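The documented flow for transportable tablespaces is roughly this (table name is illustrative; see the manual for the details):

-- on the destination: create an identical, empty table, then
ALTER TABLE t1 DISCARD TABLESPACE;
-- on the source: quiesce the table and write the .cfg metadata file
FLUSH TABLES t1 FOR EXPORT;
-- copy t1.ibd and t1.cfg from the source to the destination's database directory, then
UNLOCK TABLES;                       -- on the source: releases the export lock
ALTER TABLE t1 IMPORT TABLESPACE;    -- on the destination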
You'll need filesystem access to run either Percona XtraBackup or MySQL Enterprise Backup. It's not possible to use these physical backup tools for Amazon RDS, for example.
One alternative is to create a replication slave in the same network as the live system, and run Percona XtraBackup on that slave, where you do have filesystem access.
Another option is to stream the binary logs to another host (see https://dev.mysql.com/doc/refman/5.6/en/mysqlbinlog-backup.html) and then transfer them periodically to your local instance and replay them.
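That manual page boils down to something like this (host and credentials are placeholders; --stop-never keeps streaming as new events arrive):

mysqlbinlog --read-from-remote-server --host=primary.example.com --user=repl --password --raw --stop-never mysql-bin.000001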
Each of these solutions has pros and cons. It's hard to recommend which solution is best for you, because you aren't sharing full details about your requirements.
This was working until the DB grew to over 25GB, and we started running into disk space limitations.
Some questions here:
Why don't you just increase the available disk space for your database? 25 GB is next to nothing when it comes to disk space.
Why don't you modify your script to: download table1, import into table1_tmp, drop table1_prod, rename table1_tmp to table1_prod; rinse and repeat (a sketch of the swap follows).
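The swap can even be done atomically, so readers never see a missing table (table names are illustrative):

-- load the fresh dump into table1_tmp first, then:
RENAME TABLE table1 TO table1_old, table1_tmp TO table1;  -- one atomic swap
DROP TABLE table1_old;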
Other than that:
Why don't you ask your partner for a system with enough performance to run your reports on? I'm quite sure they would prefer that to having you download sensitive data to your "local site" every day.
Last thought (requires MySQL Enterprise Backup https://www.mysql.de/products/enterprise/backup.html):
Rather than dumping, downloading and importing 25 GB every day:
Create a full backup
Download and import
Use differential or incremental backups from then on.
From the next day onward you download (and import) only the data delta: https://dev.mysql.com/doc/mysql-enterprise-backup/4.0/en/mysqlbackup.incremental.html
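A rough sketch of that cycle with mysqlbackup (option syntax as I read the linked manual; paths are placeholders, connection options omitted):

mysqlbackup --backup-dir=/backups/full backup                                                        # once: the full backup
mysqlbackup --incremental --incremental-base=dir:/backups/full --backup-dir=/backups/inc1 backup     # daily: only the delta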
We have a very large database that we need to occasionally replicate on our dev+staging machines.
At the moment we use mysqldump and then import the database script using "mysql -u xx -p dbname < dumpscript.sql"
This would be fine if it didn't take 2 days to complete!
Is it possible to simply copy the entire database as a file from one server to another and skip the whole export/import nonsense?
Cheers
There are a couple of solutions:
Have a separate replication slave that you can stop at any time to take a file-level backup.
If you use the InnoDB engine, you can take a filesystem-level snapshot (e.g. with LVM, as sketched below) and then copy the files over to your test environment.
If you have plenty of tables/databases, you can parallelize the dumping and restoring process to speed things up.
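A sketch of the LVM variant (volume, mount, and host names are placeholders; the read lock gives the snapshot a consistent cut point):

# in a mysql session that stays open: FLUSH TABLES WITH READ LOCK;
lvcreate --snapshot --size 10G --name mysql-snap /dev/vg0/mysql-data
# back in that mysql session: UNLOCK TABLES;
mount /dev/vg0/mysql-snap /mnt/snap
rsync -a /mnt/snap/ user@testbox:/var/lib/mysql/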
I have many restrictions on where I can run scripts, how I can access sources and targets, and whether I have enough space to prepare the data for the task.
I get my zipped database dump from the hosting provider.
I split the unzipped dump so that INSERT INTO lines go into one file and all the other statements go into a second one.
Then I create the database structures from the second one.
I convert the INSERT INTO statements into per-table CSV files.
Finally, I load the CSV files in parallel (up to 50 tables concurrently); this way a 130 GB text dump is cloned in 3 hours, instead of the 17 hours it would take using the statement-by-statement method.
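A condensed sketch of that pipeline (file, table, and database names are illustrative):

zgrep '^INSERT INTO' dump.sql.gz > inserts.sql              # the data statements
zcat dump.sql.gz | grep -v '^INSERT INTO' > structure.sql   # everything else
mysql mydb < structure.sql                                  # create the structures
# convert inserts.sql into one CSV per table with sed/awk (details omitted), then:
mysqlimport --local --use-threads=8 --fields-terminated-by=',' mydb table1.csv table2.csv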
In the 3 hours, I include:
the copy over (10 minutes),
sanity check (10 minutes) and
filtering of logs (10 minutes), as the log entries need to be from the latest academic year only.
The remote zipped file is between 7 GB and 13 GB, transferred over a 40 MBps line.
The upload is to a remote server via a 40MBps line.
If your MySQL server is local, the upload can be faster.
I use scp, gzip, zgrep, sed, awk, ps, mysqlimport and mysql, plus some other utilities to speed up decompression and filtering (pv, rg, pigz) when available.
If I had direct access to the database server, LVM with volume-level snapshot capability would be the preferred solution, giving you speeds restricted only by the copy speed of the media.