Question on MySQL replication (mainly).
So I have 2 MySQL databases with identical schemas, but not connected by a network of any kind. I need one-way (only) syncing of data, i.e. Db1 always needs to be copied/dumped down and synced into Db2. There is update/insert/delete activity on both.
I have ensured that DB2 (which is always the receiver) has its sequences in a very high range, so records created on or 'owned' by DB2 won't conflict with DB1 records when synced. There is also a rule that, down on DB2, we won't edit any data that was created on DB1 (we can tell by the sequence number, but also by the kinds of data being entered on each DB).
I've already got this working via a mysqldump (from Db1), modifying the dump to use "REPLACE INTO" instead of "INSERT INTO", and running the modified dump output as a SQL script against Db2. Volumes are not too high, and it works fine.
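For reference, a minimal sketch of that transformation, assuming the dump is plain INSERT statements and that dbuser, db1, db2 and sync.sql are placeholders for your own names (recent mysqldump versions also have a --replace option that emits REPLACE statements directly):
mysqldump -u dbuser -p db1 | sed 's/^INSERT INTO/REPLACE INTO/' > sync.sql
# copy sync.sql to the DB2 machine however you currently move the dump, then:
mysql -u dbuser -p db2 < sync.sql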
Is it possible to do this (simply enough) via replication? MySQL can create snapshot dumps, and I would copy them over and run a replication command - is that feasible?
I would use auto_increment_increment and auto_increment_offset to make sure the records don't overlap. I know you said you set the records on DB2 to a high range, but eventually they may still overlap. Making DB1 use odd values and DB2 use even values ensures that will never happen.
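A sketch of the corresponding my.cnf settings on each server (the values here are only illustrative):
# On DB1: generates 1, 3, 5, ...
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 1
# On DB2: generates 2, 4, 6, ...
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 2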
You can't use replication if the two instances are not connected by a network. Replication requires that the replica is able to make a connection to its master.
But you can use binary logs, which is one part of how replication works. On DB1, enable binary logging, which will record every change made to the database. Periodically, copy these logs to the server for DB2 (I assume you have some way of doing this if you're currently using mysqldump). Use the mysqlbinlog tool to convert the logs into SQL commands to replay against your DB2 instance.
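As a rough sketch, enabling binary logging on DB1 comes down to something like this in my.cnf (the path and server-id are placeholders; the server needs a restart):
[mysqld]
server-id = 1
log-bin   = /var/log/mysql/db1-bin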
Example of replaying the copied binlogs against DB2:
mysqlbinlog binlog.000001 binlog.000002 | mysql -u root -p
It's up to you to keep track of which binlogs you have copied and executed. Just before you copy a set of binlog files, run FLUSH LOGS on DB1. This causes DB1 to close the binlog file it's currently writing to, and open a new binlog file. This way you can work with whole files at a time without worrying about partial files.
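A sketch of that rotation step on DB1 (credentials are placeholders):
# Close the binlog currently being written and open a new one
mysql -u root -p -e 'FLUSH LOGS'
# List the binlog files; everything except the last (newly opened) one is safe to copy
mysql -u root -p -e 'SHOW BINARY LOGS'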
Related
Is there a way to replicate MySQL while the master server already has a lot of data? I tried the normal way, but I had difficulty getting the MASTER_LOG_POS value. How can the slave server replicate data that already existed on the master?
Generally you start with an exact full copy of your existing database. This means copying your MySQL data directory (while the server is off), taking a (consistent) snapshot, or using a tool like Percona XtraBackup.
Only after you have two identical MySQL servers can you start replicating. Note that a plain mysqldump is generally not a good way to get a consistent snapshot.
If you have a relatively small amount of data you could use mysqldump --master-data=1 --single-transaction. This creates a snapshot that records the correct master binlog file and position. It should not be used for production environments or large amounts of data.
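For illustration, a sketch of that command and what it produces (the file names and coordinates shown are made up):
mysqldump --master-data=1 --single-transaction --all-databases -u root -p > seed.sql
# With --master-data=1 the top of seed.sql contains an active statement like:
#   CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=4567;
# so the replica is positioned automatically when you import the dump.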
I created a read replica of my MySQL database on Amazon RDS.
When executing the following command, it is super fast (half a second) on the master, but takes more like 30 seconds on the slave. Super annoying, because I wanted to dump off of the slave so that I don't slow down the master.
mysqldump --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql
There are three issues to consider.
The most significant is that mysqldump does not perform well when run at a distance from the database, due to limitations in the traditional MySQL client/server wire protocol, which makes no allowance for pipelining a series of commands.
The mysqldump utility uses no magic to generate dump files -- it issues SQL statements to the server, and takes the results of those queries to generate its output.
As a result, every single object (schema, table, view, stored function/procedure, event) in the database requires at least one round trip and sometimes more than one.
For each table, mysqldump first issues SHOW CREATE TABLE t1; followed by SELECT * FROM t1; ... so with a round-trip time of 100 ms, extracting a dump file of 150 tables wastes 150 × 2 × 0.100 = 30 seconds purely on the distance between the machine running mysqldump and the server -- and this is true even if the tables are completely empty.
This is not a recommendation, but you might take a look at mydumper, which claims to be able to create the backup using multiple database connections in parallel; by parallelizing the dump process, this could help mitigate the cycles wasted as commands pass to the server and results return to the client. I don't know the quality of this code base, but something like this could help.
Next, you almost always want to use the --compress option for mysqldump. Contrary to what you might assume, this does not compress the backup file. The generated backup file is identical when this option is used, but when this feature is activated, the server compresses the data it sends to mysqldump on the wire, and mysqldump decompresses the data again before writing it out -- so this option will almost always make for a faster process unless the machine running mysqldump and the database server are connected by a low-latency, high-bandwidth network. Because the generated file is identical, there are no compatibility concerns when using this option.
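For example, applied to the command from the question (still a sketch; credentials unchanged):
mysqldump --compress --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql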
Finally, there's an issue with newly-created RDS servers that you need to be aware of, so that it doesn't skew your benchmarks. When you create an RDS replica, it is originally seeded with data from a snapshot of the upstream master. This is, behind the scenes, an EBS snapshot of the master's hard drive, and the new database instance is backed by an EBS volume restored from that snapshot. EBS volumes are lazily-loaded from the snapshot, so they have a documented first-touch penalty. This issue could have a substantial impact on the performance of the first complete backup, but should have no meaningful impact after that.
I have a MySQL database called latest, and another database called previous, both running on the same server. Both databases have identical content. Once per day, an application runs that updates latest. Later on, towards the end of the application's execution, a comparison is made between latest and previous for certain data. Any differences that are found trigger certain actions, e.g. notification emails being sent. After that, a copy of latest is dumped to a file using mysqldump and restored into previous. Both databases are now in sync again and the process repeats the following day.
I would like to migrate the database(s) to AWS RDS. I'm open to using Aurora, but the MySQL engine is fine too. Is there a simpler or more efficient way of performing the restore step on RDS so that both databases end up in sync - one that avoids having to use mysqldump and feed the result into previous?
I understand that I could create a read replica of an instance running latest to act as previous, but I think that updates the read replica as the source DB is updated (well, asynchronously anyway) which would ruin the possibility of performing a comparison between the two later on.
I don't have any particular problem with using mysqldump for the restore process, but I'm just not sure if I'm missing a trick.
If you don't want a read replica, your mysqldump approach is fine, but you could probably combine it with mysqlimport, as suggested in the MySQL docs:
Copying MySQL Databases to Another Machine
You can also use mysqldump and mysqlimport to transfer the database. For large tables, this is much faster than simply using mysqldump.
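As a hedged sketch of that combination (paths are placeholders; note that --tab makes the server itself write the files, so it needs filesystem access and a secure_file_priv-permitted directory, which managed hosts such as RDS may not allow):
# Dump: one .sql (table definition) and one .txt (tab-delimited data) file per table
mysqldump --tab=/tmp/latest_dump latest
# Restore into previous: recreate the tables, then bulk-load the data
cat /tmp/latest_dump/*.sql | mysql previous
mysqlimport previous /tmp/latest_dump/*.txt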
I need to set up working MySQL replication from master to slave (I tried it once already).
The database is quite large (over 100 GB) and it will take some hours to make it ready for a new slave.
The database has both MyISAM and InnoDB tables, and both are being written to.
I think my only choice is to copy the data files from master to a new slave (or make a database dump, which I come back to under ROUND 2)?
Before that, do I have to shut down all the services which use the database and take a write lock on the tables, or should I shut down the whole database?
After syncing the data directory to the new replication server I started it up, and the database with its tables was there. The first error I got rid of by changing the bin-log to 007324 and the position to 0.
Error 1:
140213 4:52:07 [ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
140213 4:52:07 [Note] Slave I/O thread exiting, read up to log 'bin-log.007323', position 46774422
After that I got new problems from the database, and this error came up for every table.
Error 2:
Error 'Incorrect information in file: './database/table.frm'' on query. Default database: 'database'.
Seems that something went wrong.
ROUND 2!
After this I started to wonder whether this can be done without a long service break.
The master database has already been configured and replicates fine to another slave.
So I did some googling and this is what I came up with.
Taking a read lock on the tables:
FLUSH TABLES WITH READ LOCK;
Taking dump:
mysqldump --skip-lock-tables --single-transaction --flush-logs --master-data=2 -A > dbdump.sql
Packaging and moving:
gzip (pigz) the dbdump and move it to the slave server; after that, find the MASTER_LOG_FILE and MASTER_LOG_POS in the dump.
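Since --master-data=2 writes the coordinates as a comment near the top of the dump, a sketch of pulling them out (the values shown are made up):
zcat dbdump.sql.gz | head -n 50 | grep 'CHANGE MASTER TO'
# -- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.007325', MASTER_LOG_POS=154;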
After that I don't think I want to import the dbdump.sql because it's over 100 GB and will take time. So I think SOURCE would be an OK option for it.
On SLAVE server:
CREATE DATABASE dbdump;
USE dbdump;
SOURCE dbdump.sql;
CHANGE MASTER TO MASTER_HOST='x.x.x.x',MASTER_USER='replication',MASTER_PASSWORD='slavepass',
MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=X;
START SLAVE;
SHOW SLAVE STATUS \G
I haven't tested this yet, am I on to something?
--bp
Realize that issuing a SOURCE command is the same as running an import of the dumped SQL from shell. Either way, it is going to take a long time. Outside of that, you have the steps correct - flush table with read lock on master, make a database dump of master, make sure you note master binlog coordinates, import dump on slave, set binlog coordinates, start replication. Do not work with the raw binaries unless you REALLY know what you are doing (especially for INNODB tables).
If you have a number of large tables (i.e. not just one big one), you could consider parallelizing your dumps/imports by table (or groups of tables) to speed things along. There are actually tools out there to help you do this.
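A rough sketch of that idea with nothing but mysqldump and shell background jobs (database and table names are placeholders; note that the separate connections do not share one consistent snapshot, which matters if you need a single binlog position - purpose-built tools handle that for you):
# assumes login credentials come from ~/.my.cnf so the background jobs don't prompt
for t in orders customers events; do
  mysqldump --single-transaction mydb "$t" | gzip > "$t".sql.gz &
done
wait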
You CAN work with the raw binaries, but it is not for the faint of heart. In the past, I have used rsync to differentially update the raw binaries between master and slave (you still must use FLUSH TABLES WITH READ LOCK and gather the master binlog coordinates before doing this). For MyISAM tables this actually works pretty well. For InnoDB it can be trickier. I prefer to enable innodb_file_per_table so that InnoDB writes separate index and data files per table. You would still need to rsync the ibdata* files, and you would delete the ib_logfile* files from the slave.
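Purely to illustrate the shape of that procedure, not to recommend it, a sketch for the MyISAM case (paths and host are placeholders; the slave's mysqld should be stopped while you sync):
# Session 1 on the master - keep it open so the lock stays held:
#   FLUSH TABLES WITH READ LOCK;
#   SHOW MASTER STATUS;
# Session 2 on the master, while the lock is held:
rsync -av /var/lib/mysql/ slave-host:/var/lib/mysql/
# Back in session 1, once rsync has finished:
#   UNLOCK TABLES;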
This whole thing is a bit of a high wire act, so I would not resort to doing this unless you have no other viable options. Absolutely take a traditional SQL dump before even thinking about attempting a binary file sync, and each time until you are VERY comfortable that you actually know what you are doing.
The databases are prohibitively large (> 400 MB), so dump > SCP > source is proving to be hours and hours of work.
Is there an easier way? Can I connect to the DB directly and import from the new server?
You can simply copy the whole /data folder.
Have a look at High Performance MySQL - transferring large files
You can use ssh to pipe your data directly over the Internet. First set up SSH keys for password-less login, then try something like this:
$ mysqldump -u db_user -p some_database | gzip | ssh someuser@newserver 'gzip -d | mysql -u db_user --password=db_pass some_database'
Notes:
The basic idea is that you are just dumping standard output straight into a command on the other side, which SSH is perfect for.
If you don't need encryption then you can use netcat but it's probably not worth it
The SQL text data goes over the wire compressed!
Obviously, change db_user to your user and some_database to your database. someuser is the (Linux) system user, not the MySQL user.
You will also have to use --password the long way because having mysql prompt you will be a lot of headache.
You could set up MySQL slave replication, let MySQL copy the data, and then make the slave the new master.
400 MB is really not a large database; transferring it to another machine will only take a few minutes over a 100 Mbit network. If you do not have a 100 Mbit network between your machines, you are in big trouble!
If they are running the exact same version of MySQL and have identical (or similar ENOUGH) my.cnf and you just want a copy of the entire data, it is safe to copy the server's entire data directory across (while both instances are stopped, obviously). You'll need to delete the data directory of the target machine first of course, but you probably don't care about that.
Backup/restore is usually slowed down by the restoration having to rebuild the table structure, rather than the file copy. By copying the data files directly, you avoid this (subject to the limitations stated above).
If you are migrating a server:
The dump files can be very large, so it is better to compress them before sending, or use the -C flag of scp. Our methodology for transferring files is to create a full dump in which the incremental logs are flushed (use --master-data=2 --flush-logs; please check you don't mess up any slave hosts if you have them). Then we copy the dump over and replay it. Afterwards we flush the logs again (mysqladmin flush-logs), take the recent incremental log (which shouldn't be very large) and replay only that. Keep doing this until the last incremental log is very small, so that you can stop the database on the original machine, copy the last incremental log and replay it - that should take only a few minutes.
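A sketch of one round of that catch-up loop (file names and hosts are placeholders):
# On the old server: rotate, then copy the binlog that was just closed
mysqladmin flush-logs
scp /var/lib/mysql/mysql-bin.000042 newhost:/tmp/
# On the new server: replay just that increment
mysqlbinlog /tmp/mysql-bin.000042 | mysql -u root -p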
If you just want to copy data from one server to another:
mysqldump -C --host=oldhost --user=xxx --databases yyy -p | mysql -C --host=newhost --user=aaa -p
You will need to set the db users correctly and provide access to external hosts.
Try importing the dump on the new server using the mysql console, not an auxiliary tool.
I have no experience with doing this with MySQL, but to me it seems the bottleneck is transferring the actual data?
400 MB isn't that much. But if dump -> SCP is slow, I don't think connecting to the DB server from the remote box would be any faster.
I'd suggest dumping, compressing, then copying over the network or burning to disk and manually transferring the data.
Compressing such a dump will most likely give you quite a good compression ratio since, most likely, there's a lot of repetitive data.
If you are only copying all the databases of the server, copy the entire /data directory.
If you are just copying one or more databases and adding them to an existing mysql server:
create the empty database in the new server, set up the permissions for users etc.
copy the folder for the database in /data/databasename to the new server /data/databasename
I like to use BigDump: Staggered MySQL Dump Importer after exporting my database from the old server.
http://www.ozerov.de/bigdump/
One thing to note though: if you don't set the export options (namely the maximum length of created queries) according to the load your new server can handle, it'll just fail and you will have to try again with different parameters. Personally, I set mine to about 25,000, but that's just me. Test it out a bit and you'll get the hang of it.