(Partial) database synchronization with mysql - mysql

I'm looking for a solution that would let me synchronize, on request, a selection of tables between a live MySQL database and a local database. If there is no tool specific to this, what would be a good approach for synchronizing local and live databases?

There is always phpMyAdmin, where you can export selected tables and import them again.
This is a practical approach if you have a live DB that you need to work on from a development system. You can take a copy of the DB, work on it, drop e.g. the customer tables, then take the live site offline, back it up, drop everything except the customer tables, and import your updated DB.
The advantage of phpMyAdmin is that it shows you the queries it runs; you can assemble a script from these and reuse it the next time you update.
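Such a script might boil down to a single export/import pipeline. A minimal sketch, where the database names, user, and table list are placeholders rather than anything from the original post:

```shell
#!/bin/sh
# Hypothetical names -- substitute your own live/dev databases and tables.
LIVE_DB=live_db
DEV_DB=dev_db
TABLES="customers orders"

# Export the selected tables from the live DB, then load them into the dev DB.
DUMP_CMD="mysqldump -uUSER -p ${LIVE_DB} ${TABLES}"
LOAD_CMD="mysql -uUSER -p ${DEV_DB}"
echo "${DUMP_CMD} | ${LOAD_CMD}"   # run the echoed pipeline when ready
```

The command is only echoed here; in practice you would run the pipeline directly once the credentials are filled in.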

The desktop DB manager SQLyog has a "Job Scheduler" tool for setting up things like this. It requires a Professional or Enterprise license (cheaper than they sound), but the 30-day trial is full-featured if you just want to check it out first.

If you have the following conditions:
the live DB can be accessed by a public IP
all needed tables are InnoDB
you can perform a mysqldump of individual tables
then a shell script like the following, run from the local machine, may accomplish this:
# Source (live) connection settings
LIVEDB_IP=192.168.1.10
MYSQL_SRCUSER=whateverusername
MYSQL_SRCPASS=whateverpassword
MYSQL_SRCCONN="-h${LIVEDB_IP} -u${MYSQL_SRCUSER} -p${MYSQL_SRCPASS}"
SOURCE_DB=mydb_source
# Target (local) connection settings
MYSQL_TGTUSER=whateverusername2
MYSQL_TGTPASS=whateverpassword2
MYSQL_TGTCONN="-h127.0.0.1 -u${MYSQL_TGTUSER} -p${MYSQL_TGTPASS}"
TARGET_DB=mydb_target
TBLLIST="tb1 tb2 tb3"
# Dump each table from the source and pipe it straight into the target,
# all tables in parallel
for TB in ${TBLLIST}
do
    mysqldump ${MYSQL_SRCCONN} --single-transaction ${SOURCE_DB} ${TB} | mysql ${MYSQL_TGTCONN} -A -D${TARGET_DB} &
done
wait
The --single-transaction option takes a point-in-time snapshot of the table while still allowing INSERTs, UPDATEs, and DELETEs in the source database.
If the tables are rather big, there may be timeout issues. Try this script instead:
# Source (live) connection settings
LIVEDB_IP=192.168.1.10
MYSQL_SRCUSER=whateverusername
MYSQL_SRCPASS=whateverpassword
MYSQL_SRCCONN="-h${LIVEDB_IP} -u${MYSQL_SRCUSER} -p${MYSQL_SRCPASS}"
SOURCE_DB=mydb_source
# Target (local) connection settings
MYSQL_TGTUSER=whateverusername2
MYSQL_TGTPASS=whateverpassword2
MYSQL_TGTCONN="-h127.0.0.1 -u${MYSQL_TGTUSER} -p${MYSQL_TGTPASS}"
TARGET_DB=mydb_target
TBLLIST="tb1 tb2 tb3"
# First pass: dump and compress each table in parallel
for TB in ${TBLLIST}
do
    mysqldump ${MYSQL_SRCCONN} --single-transaction ${SOURCE_DB} ${TB} | gzip > ${TB}.sql.gz &
done
wait
# Second pass: decompress and load each table in parallel
for TB in ${TBLLIST}
do
    gzip -d < ${TB}.sql.gz | mysql ${MYSQL_TGTCONN} -A -D${TARGET_DB} &
done
wait
Give it a Try !!!
CAVEAT
If the tables needed are MyISAM, just remove --single-transaction from the mysqldump. This may cause queries at the source database to freeze until mysqldump is done with the tables those queries need. In that event, just run this script during low-traffic hours.

Related

Table files transfered between servers flags table as crashed

Work has a web site that uses large data sets, load-balanced between two MySQL 5.6.16-64.2 servers using MyISAM, running on Linux (2.6.32-358.el6.x86_64 GNU/Linux). This data is updated hourly from a text-based file set received from an MS-SQL database. To avoid disrupting reads on the web site and at the same time make sure the updates don't take too long, the following process was put in place:
Keep the data on a third Linux box (used only for data update processing), update the different data tables as needed, move a copy of the physical table files to the production servers under a temporary name, and then swap the tables with a MySQL RENAME TABLE.
But every time, the table (under the temporary name) is seen by the destination MySQL servers as crashed and requires repair. The repair takes too long, so it cannot be forced before doing the table swap.
The processing is programmed in Ruby 1.8.7, with a thread for each server (as an FYI, this also happens without threads against a single server).
The steps to perform the file copy are as follows:
Net::SFTP is used to transfer the files to a destination folder that is not the database folder (due to permissions). Below is a code example of the file transfer for the main table files (if a table also has partition files, they are transferred separately and rcpFile is assigned differently to match the temporary name). For speed the uploads run in parallel:
Net::SFTP.start(host_server, host_user, :password => host_pwd) do |sftp|
  uploads = fileList.map { |f|
    rcpFile = File.basename(f, File.extname(f)) + prcExt + File.extname(f)
    sftp.upload(f, "/home/#{host_user}/#{rcpFile}")
  }
  uploads.each { |u| u.wait }
end
Then the files are assigned to the mysql owner and group and moved into the MySQL database folder, using Net::SSH to execute sudo shell commands:
Net::SSH.start(host_server, host_user, :port => host_port.to_i, :password => host_pwd) do |ssh|
  doSSHCommand(ssh, "sudo sh -c 'chown mysql /home/#{host_user}/#{prcLocalFiles}'", host_pwd)
  doSSHCommand(ssh, "sudo sh -c 'chgrp mysql /home/#{host_user}/#{prcLocalFiles}'", host_pwd)
  doSSHCommand(ssh, "sudo sh -c 'mv /home/#{host_user}/#{prcLocalFiles} #{host_path}'", host_pwd)
end
The doSSHCommand method:
def doSSHCommand(ssh, cmd, pwd)
  result = ""
  ssh.open_channel do |channel|
    channel.request_pty do |c, success|
      raise "could not request pty" unless success
      channel.exec "#{cmd}" do |c, success|
        raise "could not execute command '#{cmd}'" unless success
        channel.on_data do |c, data|
          if (data[/\[sudo\]|Password/i]) then
            channel.send_data "#{pwd}\n"
          else
            result += data unless data.nil?
          end
        end
      end
    end
  end
  ssh.loop
  result
end
If the files are moved manually with scp, followed by the owner/group changes and the move, the table never crashes. Comparing file sizes between scp and Net::SFTP shows no difference.
Other processing methods have been tried, but in our experience they take too long compared to the method described above. Does anyone have an idea why the tables are being marked as crashed, and is there a solution that avoids the crash without having to do a table repair?
The tables are marked as crashed because you're probably hitting race conditions as you copy the files. That is, writes are still pending to the tables while your Ruby script runs, so the resulting copy is incomplete.
The safer way to copy MyISAM tables is to run the SQL commands FLUSH TABLES followed by FLUSH TABLES WITH READ LOCK first, to ensure that all pending changes are written to the table on disk, and then block any further changes until you release the table lock. Then perform your copy script, and then finally unlock the tables.
This does mean that no one can update the tables while you're copying them. That's the only way you can ensure you get uncorrupt files.
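A minimal sketch of that lock-copy-unlock sequence, with placeholder credentials and paths. One thing to note: FLUSH TABLES WITH READ LOCK is released as soon as the session that issued it disconnects, so the copy must happen while that session stays open.

```shell
#!/bin/sh
# Session 1 holds the lock; session 2 copies the files; session 1 then unlocks.
LOCK_SQL="FLUSH TABLES; FLUSH TABLES WITH READ LOCK;"
UNLOCK_SQL="UNLOCK TABLES;"

# Session 1 (keep this client connected for the whole copy; the SLEEP
# holds the session -- and therefore the lock -- open for up to 600s):
#   mysql -uUSER -p -e "${LOCK_SQL} SELECT SLEEP(600);"
# Session 2 (while the lock is held):
#   cp /var/lib/mysql/mydb/mytable.* /home/user/staging/
# Session 1 again (or let the SLEEP expire, which releases the lock):
#   mysql -uUSER -p -e "${UNLOCK_SQL}"
echo "${LOCK_SQL}"
```

The SQL is only composed and echoed here; running it requires the two live sessions described in the comments.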
But I have to comment that it seems like you're reinventing MySQL replication. Is there any reason you're not using that? It could probably work faster, better, and more efficiently, incrementally and continually updating only the parts of the tables that have changed.
The issue was found and solved:
The processing database had its table files copied from one of the production databases; the tables did not show as crashed on the processing server, and there were no issues querying or updating the data.
While searching the web, the following SO answer was found: MySQL table is marked as crashed
The guess was that when the tables were copied from production to the processing server, the header info stayed the same and interfered when the files were copied back to the production servers during processing. So the tables were repaired on the processing server and a few tests were run on our staging environment, where the issue had also been seen. Sure enough, that corrected it.
So the final solution was to repair the tables once on the processing server before running the process script hourly.
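The one-time repair can be done with mysqlcheck; a sketch where the database and table names are placeholders:

```shell
#!/bin/sh
# Repair the copied tables once on the processing server before the hourly runs.
REPAIR_CMD="mysqlcheck -uUSER -p --repair process_db tb1 tb2"
echo "${REPAIR_CMD}"
```

Equivalently, REPAIR TABLE tb1, tb2; can be run from a mysql client session.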
I see you've already found an answer, but two things struck me about this question.
One, you should look at rsync, which gives you many more options, not least a speedier transfer, and may better suit this problem. Transferring files between servers is basically why rsync exists.
Second, and I'm not trying to re-engineer your system, but you may have outgrown MySQL; it may not be the best fit for this problem. The problem may be better served by Riak, where you can have multiple nodes, or Mongo, which handles large files and also supports multiple nodes. Just two thoughts I had while reading your question.

mysqldump vs select into outfile

I use the SELECT * INTO OUTFILE option in MySQL to back up data into tab-separated text files, calling the statement once per table.
I then use LOAD DATA INFILE to import the data back into MySQL, again per table.
I have not yet used any locks or DISABLE KEYS while performing this operation.
Now I face some issues:
While the backup is running, other updates and selects get slow.
Importing data for huge tables takes too much time.
How can I improve the method to solve the above issues?
Is mysqldump an option? I see that it uses INSERT statements, so before I try it, I wanted to ask for advice.
Does using locks and DISABLE KEYS before each LOAD DATA improve import speed?
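The per-table flow described above amounts to the following pair of statements (database, table, and file path are placeholders):

```shell
#!/bin/sh
# Export one table to a tab-separated file, then load it back in.
EXPORT_SQL="SELECT * INTO OUTFILE '/tmp/tb1.txt' FROM mydb.tb1"
IMPORT_SQL="LOAD DATA INFILE '/tmp/tb1.txt' INTO TABLE mydb.tb1"

# Each would be run via the client, e.g.:
#   mysql -uUSER -p -e "${EXPORT_SQL}"
#   mysql -uUSER -p -e "${IMPORT_SQL}"
echo "${EXPORT_SQL}"
```

The statements are only composed and echoed here; they need a running server (and FILE privileges) to execute.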
If you have a lot of databases/tables, it will definitely be much easier to use mysqldump, since you only need to run it once per database (or even once for all databases, if you do a full backup of your system). It also has the advantage of backing up your table structure, something you cannot do with SELECT * INTO OUTFILE alone.
The speed is probably similar, but it would be best to test both and see which one works best in your case.
Someone here tested the options, and mysqldump proved to be faster in his case. But again, YMMV.
If you're concerned about speed, also take a look at the mysqldump/mysqlimport combination. As mentioned here, it is faster than mysqldump alone.
As for locks and disable keys, I am not sure, so I will let someone else answer that part :)
Using mysqldump is important if you want your data backup to be consistent. That is, the data dumped from all tables represents the same instant in time.
If you dump tables one by one, they are not in sync with each other, so you could have data in one table that references rows in another table that aren't included in that other table's backup. When you restore, it won't be pretty.
For performance, I'm using:
mysqldump --single-transaction --tab mydatabase
This dumps, for each table, one .sql file with the table definition and one .txt file with the data.
Then when I import, I run the .sql files to define tables:
mysqladmin create mydatabase
cat *.sql | mysql mydatabase
Then I import all the data files:
mysqlimport --local --use-threads=4 mydatabase *.txt
In general, running mysqlimport is faster than running the insert statements output by default by mysqldump. And running mysqlimport with multiple threads should be faster too, as long as you have the CPU resources to spare.
Using locks when you restore does not help performance.
The DISABLE KEYS trick defers index creation until after the data is fully loaded and keys are re-enabled, but it helps only for non-unique indexes on MyISAM tables. And you shouldn't use MyISAM tables anyway.
For more information, read:
https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html
https://dev.mysql.com/doc/refman/5.7/en/mysqlimport.html

Transfer mySQL from development to production

I need to sync the development MySQL DB with the production one.
Production db gets updated by user clicks and other data generated via web.
Development db gets updated with processing data.
What's the best practice to accomplish this?
I found some diff tools (e.g. mysqldiff), but they don't handle updated records.
I also found an application-level solution: http://www.isocra.com/2004/10/dumptosql/
but I'm not sure it's good practice, as in that case I'd need to retest my code each time I add new InnoDB-related tables.
Any ideas?
Take a look at mysqldump. It may serve you well enough for this.
Assuming your tables are all indexed with some sort of unique key, you could do a dump and have it leave out the drop/create-table bits. Run it as INSERT IGNORE and you'll get the new data without affecting the existing data.
Another option would be to use the --where part of mysqldump to dump only the new records from the production side. Again, have mysqldump leave off the drop/create bits.
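Both suggestions combined might look like this; the column, id cutoff, database, and table names are hypothetical:

```shell
#!/bin/sh
# Dump only new rows, keep existing tables, and skip duplicate keys on load:
#   --no-create-info  omits the DROP/CREATE TABLE statements
#   --insert-ignore   writes INSERT IGNORE so existing rows are left alone
#   --where           restricts the dump to rows matching the condition
DUMP_CMD='mysqldump -uUSER -p --no-create-info --insert-ignore --where="id > 12345" production_db users'
echo "${DUMP_CMD} > new_rows.sql"
```

Loading new_rows.sql on the development side then adds only the records that aren't already there.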

mysqldump without interrupting live production INSERT

I'm about to migrate our production database to another server. It's about 38GB and uses MyISAM tables. Since I have no physical access to the new server's file system, we can only use mysqldump.
I have looked through this site to see whether a mysqldump online backup would bring down our production website. This post: Run MySQLDump without Locking Tables says mysqldump will lock the DB and prevent inserts. But after a few tests, I was surprised to find it shows otherwise.
If I use
mysqldump -u root -ppassword --flush-logs testDB > /tmp/backup.sql
mysqldump will by default do --lock-tables, which takes READ LOCAL locks (see the MySQL 5.1 docs), under which concurrent inserts are still allowed. I ran a loop inserting into one of the tables every second while mysqldump took one minute to complete. A record was inserted every second during that period. Which means mysqldump does not interrupt the production server and INSERTs can still go on.
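The probe loop described above can be sketched like this; a local echo stands in for the real INSERT so the loop runs anywhere, and the table and credentials would be your own:

```shell
#!/bin/sh
# One insert per second while the dump runs; replace the echo with e.g.
#   mysql -uUSER -p testDB -e "INSERT INTO probe (ts) VALUES (NOW())"
LOG=$(mktemp)
for i in 1 2 3; do
    echo "INSERT #${i} at $(date +%s)" >> "${LOG}"
    sleep 1
done
```

If every iteration lands while the dump is in progress, concurrent inserts were not blocked.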
Is there anyone having different experience ? I want to make sure this before carry on to my production server, so would be glad to know if I have done anything wrong that make my test incorrect.
[My version of mysql-server is 5.1.52, and mysqldump is 10.13]
Now, you may have a database with disjoint tables, or a data warehouse where nothing is normalized at all and there are no links whatsoever between the tables. In that case, any dump would work.
I ASSUME that a production database containing 38G of data contains graphics in some form (BLOBs), and then, ubiquitously, you have links from other tables. Right?
Therefore, as far as I can see, you are at risk of losing important links between tables (usually primary/foreign key pairs): you may capture one table at the point of being updated/inserted into while its dependent table (which uses that table as its primary source) has not been updated yet. Thus you will lose the so-called integrity of your database.
More often than not, it is extremely cumbersome to re-establish integrity, usually because the system using/generating/maintaining the database was not built as a transaction-oriented system, so relationships in the database cannot be traced except via the primary/foreign key relations.
Thus, you may well get away with copying your tables without locks, as many of the proposals above suggest, but you are at risk of burning your fingers; depending on how sensitive the system's operations are, you may burn yourself severely or just get a surface scratch.
Example: If your database is a critical mission database system, containing recommended heart beat rate for life support devices in an ICU, I would think more than twice, before I make the migration.
If, however, the database contains pictures from Facebook or a similar site, you may be able to live with the consequences of anything from 0 up to 129,388 lost links :-).
Now - so much for analysis. Solution:
YOU WOULD HAVE to create software that does the dump for you with full integrity, table set by table set, tuple by tuple. You need to identify the cluster of data that can be copied from your current online 24/7/365 base to your new base, copy it, then mark it as copied.
If changes then occur to records you have already copied, you will need to copy those again, which can be a tricky affair.
If you are running a more advanced version of MySQL, you can create another site, a replica, or a distributed database, and get away with it that way.
If you have a window of, let's say, 10 minutes, which you can create if you need it, then you can also just copy the physical files on the drive (for MyISAM, the .frm, .MYD, and .MYI files): shut down the server for a few minutes, then copy.
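A sketch of that copy-during-a-window approach; the service name and paths are placeholders and would depend on the distribution:

```shell
#!/bin/sh
# Stop MySQL, copy the data directory for one database, restart --
# all of it inside the maintenance window.
STOP_CMD="service mysql stop"
COPY_CMD="cp -a /var/lib/mysql/mydb /backup/mydb"
START_CMD="service mysql start"
echo "${STOP_CMD} && ${COPY_CMD} && ${START_CMD}"
```

The -a flag preserves ownership and timestamps, which matters when the files are dropped back into a MySQL data directory.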
Now to a cardinal question:
You need to do maintenance on your machines from time to time. Doesn't your system have room for that kind of operation? If not, then what will you do when the hard disk crashes? Pay attention to the 'when', not 'if'.
1) Use of --opt is the same as specifying --add-drop-table, --add-locks, --create-options, --disable-keys, --extended-insert, --lock-tables, --quick, and --set-charset. All of the options that --opt stands for also are on by default because --opt is on by default.
2) mysqldump can retrieve and dump table contents row by row, or it can retrieve the entire content from a table and buffer it in memory before dumping it. Buffering in memory can be a problem if you are dumping large tables. To dump tables row by row, use the --quick option (or --opt, which enables --quick). The --opt option (and hence --quick) is enabled by default, so to enable memory buffering, use --skip-quick.
3) --single-transaction issues a BEGIN SQL statement before dumping data from the server; it applies to transactional (InnoDB) tables.
If your schema is a combination of both InnoDB and MyISAM, the following example will help you:
mysqldump -uuid -ppwd --skip-opt --single-transaction --max_allowed_packet=512M db > db.sql
I've never done it before but you could try --skip-add-locks when dumping.
Though it might take longer, you could dump in several batches, each of which would take very little time to complete. Adding --skip-add-drop-table would let you load these multiple smaller dumps into the same table without re-creating it. Using --extended-insert makes the SQL file smaller to boot.
You could try something like mysqldump -u${user} -p${password} --skip-add-drop-table --extended-insert --where='id between 0 and 20000' test_db test_table > test.sql. You would need to dump the table structure and load it first to do it this way, or remove --skip-add-drop-table for the first dump.
Note that mysqldump applies --lock-tables by default (it is part of --opt). If the locking is what worries you, try --skip-lock-tables when dumping.
Let me know if that helped.
BTW, keeping --add-locks (also on by default) will make your import faster!

Keeping integrity between two separate data stores during backups (MySQL and MongoDB)

I have an application I designed where relational data sits and fits naturally into MySQL. I have other data that has a constantly evolving schema and isn't relational, so I figured the natural way to store it would be in MongoDB as documents. My issue is that one of my documents references a MySQL primary ID. So far this has worked without any issues. My concern is that when production traffic comes in and we start working with backups, there might be inconsistency: when a document changes, it might no longer point to the correct ID in the MySQL database. The only way to guarantee this to a certain degree would be to shut down the application while taking backups, which doesn't make much sense.
There has to be other people that deploy a similar strategy. What is the best way to ensure data integrity between the two data stores, particularly during backups?
MySQL Perspective
All your MySQL data would have to use InnoDB. Then you could make a snapshot of the MySQL Data as follows:
MYSQLDUMP_OPTIONS="--single-transaction --routines --triggers"
mysqldump -u... -p... ${MYSQLDUMP_OPTIONS} --all-databases > MySQLData.sql
This will create a clean point-in-time snapshot of all MySQL Data as a single transaction.
For instance, if you start this mysqldump at midnight, all data in the mysqldump output will be from midnight. Data can still be added to MySQL (provided all your data uses the InnoDB Storage Engine) and you can have MongoDB reference any new data added to MySQL after midnight, even if it is during the backup.
If you have any MyISAM tables, you need to convert them to InnoDB. Let's cut to the chase. Here is how you make a script to convert all your MyISAM tables to InnoDB:
MYISAM_TO_INNODB_CONVERSION_SCRIPT=/root/ConvertMyISAMToInnoDB.sql
echo "SET SQL_LOG_BIN = 0;" > ${MYISAM_TO_INNODB_CONVERSION_SCRIPT}
mysql -u... -p... -AN -e"SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=InnoDB;') InnoDBConversionSQL FROM information_schema.tables WHERE engine='MyISAM' AND table_schema NOT IN ('information_schema','mysql','performance_schema') ORDER BY (data_length+index_length)" >> ${MYISAM_TO_INNODB_CONVERSION_SCRIPT}
Just run this script when you are ready to convert all user-defined MyISAM tables. Any system-related MyISAM tables are ignored and should not be touched anyway.
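Once generated, the script still has to be applied; a sketch with placeholder credentials, assuming you review the ALTER TABLE statements first:

```shell
#!/bin/sh
MYISAM_TO_INNODB_CONVERSION_SCRIPT=/root/ConvertMyISAMToInnoDB.sql
# Inspect the generated ALTER TABLE statements, then feed them to the server.
APPLY_CMD="mysql -uUSER -p < ${MYISAM_TO_INNODB_CONVERSION_SCRIPT}"
echo "${APPLY_CMD}"
```

Each ALTER TABLE ... ENGINE=InnoDB rebuilds its table, so run this during a quiet period on large databases.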
MongoDB Perspective
I cannot speak for MongoDB, for I know very little about it. Yet, on the MongoDB side, if you set up a Replica Set for the MongoDB data, you could just run mongodump against a replica. Since mongodump is not point-in-time, you would have to disconnect the replica (to stop changes coming over) and then perform the mongodump on it. Then re-establish the replica with its master. Find out from your developers or from 10gen whether mongodump can be used against a disconnected replica set.
Common Objectives
If point-in-time consistency truly matters to you, please make sure all OS clocks have the same synchronized time and timezone. If you have to perform such a synchronization, you must restart mysqld and mongod. Then your crontab jobs for mysqldump and mongodump will go off at the same time. Personally, I would delay the mongodump about 30 seconds to ensure the IDs from MySQL that you want posted in MongoDB are accounted for.
If you have mysqld and mongod running on the same server, then you do not need any MongoDB replication. Just start a mysqldump at 00:00:00 (midnight) and the mongodump at 00:30:00 (30 sec after midnight).
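The midnight scheduling could look like this in cron; the output paths are placeholders, and the sleep implements the 30-second offset mentioned above, since cron itself only has minute resolution:

```shell
# m h dom mon dow  command
0 0 * * * /usr/bin/mysqldump -u... -p... --single-transaction --routines --triggers --all-databases > /backups/MySQLData.sql
0 0 * * * sleep 30 && /usr/bin/mongodump --out /backups/mongo
```

Both lines fire at 00:00:00; the second waits 30 seconds before starting the mongodump.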
I don't think there is an easy way to do this. Mongo doesn't have complex transactions with rollback support, so it's very hard to maintain such integrity. One way to approach this is to think of it as two ledgers: record all the updates in the MySQL ledger and then replay them on the Mongo ledger to maintain integrity. Another possible solution is to handle this at the application level and stop the writes.
There really is no way to do it without some kind of outside check or enforcement.
If you really need to ensure perfect integrity between the two, one way to do it is to use timestamps for both your MySQL data (all records) and your Mongo records, then back up each one filtered by timestamp, using each system's tools to select only the records that existed right before the scheduled backup (see http://www.electrictoolbox.com/mysqldump-selectively-dump-data/ for how to use mysqldump with a WHERE clause and http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump for dumping a MongoDB collection with a query).
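On the MySQL side, that timestamp filter could be sketched like this; the column name, cutoff value, database, and table are all hypothetical:

```shell
#!/bin/sh
CUTOFF="2012-06-01 00:00:00"
# Dump only rows that existed before the scheduled backup instant.
DUMP_CMD="mysqldump -uUSER -p --where=\"created_at < '${CUTOFF}'\" mydb mytable"
echo "${DUMP_CMD} > mysql_part.sql"
```

The MongoDB side would use an equivalent query filter on its own timestamp field with mongodump's query option.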
Depending on how you're actually using each of your data stores, you may be able to do something else. For example, if you only write to your MongoDB and never update or delete, then it would be reasonable to back up your MySQL database, then back up your MongoDB (which may now have some extra records because it is backed up afterwards), and then purge the MongoDB records that do not correspond to anything in MySQL. As I said, it depends on how you're using them.
But the timestamp thing will work regardless - you just have the extra overhead of the timestamps.