mysqldump without interrupting live production INSERT - mysql

I'm about to migrate our production database to another server. It's about 38GB and uses MyISAM tables. Because I have no physical access to the new server's file system, we can only use mysqldump.
I have searched this site to see whether an online mysqldump backup will bring down our production website. According to this post: Run MySQLDump without Locking Tables , mysqldump will lock the database and prevent inserts. But after a few tests, I was surprised to find that it shows otherwise.
If I use
mysqldump -u root -ppassword --flush-logs testDB > /tmp/backup.sql
mysqldump will by default use --lock-tables, which takes READ LOCAL locks (per the MySQL 5.1 docs), so concurrent inserts are still allowed. I ran a loop that inserted into one of the tables every second while mysqldump took one minute to complete, and a record was inserted every second during that period. That suggests mysqldump does not interrupt the production server and INSERTs can still go on.
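For reference, the test loop was roughly the following (a sketch only; the table and column names here are placeholders, not my real schema):
# Insert one row per second while mysqldump runs in another terminal;
# if concurrent inserts are allowed, none of these statements block.
while true; do
    mysql -u root -ppassword testDB \
        -e "INSERT INTO insert_test (created_at) VALUES (NOW());"
    sleep 1
done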
Has anyone had a different experience? I want to be sure about this before moving on to my production server, so I would be glad to know if I have done anything wrong that makes my test invalid.
[My version of mysql-server is 5.1.52, and mysqldump is 10.13]

Now, you may have a database with disjoint tables, or a data warehouse where nothing is normalized (at all) and there are no links whatsoever between the tables. In that case, any dump would work.
I ASSUME that a production database containing 38GB of data contains graphics in some form (BLOBs), and that - ubiquitously - you have links to them from other tables. Right?
Therefore, as far as I can see, you are at risk of losing links between tables (usually primary/foreign key pairs): you may capture one table at the moment it is being updated/inserted into while its dependent table (which uses the first table as its primary source) has not been updated yet. Thus, you will lose the integrity of your database.
More often than not, it is extremely cumbersome to re-establish integrity, usually because the system that uses/generates/maintains the database was not built as a transaction-oriented system, so relationships in the database cannot be traced except via the primary/foreign key relations.
So you may well get away with copying your tables without locks, as many of the proposals above suggest - but you are at risk of burning your fingers, and depending on how sensitive your system's operations are, you may burn yourself severely or just get a surface scratch.
Example: if your database is a mission-critical system, containing recommended heart rates for life-support devices in an ICU, I would think more than twice before making the migration.
If, however, the database contains pictures from Facebook or a similar site, you may be able to live with the consequences of anything from 0 up to 129,388 lost links :-).
Now - so much for analysis. Solution:
YOU WOULD HAVE to build software that does the dump for you with full integrity, table set by table set, tuple by tuple. You need to identify the cluster of data that can be copied from your current online 24/7/365 database to your new one, copy it, then mark it as copied.
IF changes then occur to records you have already copied, you will need to copy those again, which can be a tricky affair.
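To make that concrete, a very rough sketch of the copy-and-mark idea could look like this (all table, column, and file names here are invented; it only handles new rows - rows updated after being copied would need a second pass, e.g. driven by an updated_at column):
# Dump only the rows added since the last run, then record the new high-water mark.
LAST_ID=$(cat /tmp/orders.last_id 2>/dev/null || echo 0)
MAX_ID=$(mysql -u backup_user -p"${PASSWORD}" -N -e "SELECT COALESCE(MAX(id),0) FROM shopdb.orders")
mysqldump -u backup_user -p"${PASSWORD}" --no-create-info \
    --where="id > ${LAST_ID} AND id <= ${MAX_ID}" shopdb orders >> orders_delta.sql
echo "${MAX_ID}" > /tmp/orders.last_id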
IF you are running a more advanced version of MySQL, you can actually create another site and/or a replica, or a distributed database - and get away with it that way.
IF you have a window of, let's say, 10 minutes - which you can create if you need it - then you can also just copy the physical files located on the drive (for MyISAM that means the .frm, .MYD and .MYI files, and so on): shut the server down for a few minutes, then copy.
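Assuming you can afford that window, the cold copy is essentially this (a sketch for a typical Linux install with /var/lib/mysql as the data directory; the service name and target host are placeholders):
sudo /etc/init.d/mysql stop                                # or: mysqladmin -u root -p shutdown
sudo rsync -a /var/lib/mysql/ backuphost:/var/lib/mysql/   # copies the .frm/.MYD/.MYI files
sudo /etc/init.d/mysql start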
Now to a cardinal question:
You need to do maintenance on your machines from time to time. Doesn't your system have room for that kind of operation? If not, what will you do when the hard disk crashes? Pay attention to the 'when' - not the 'if'.

1) Use of --opt is the same as specifying --add-drop-table, --add-locks, --create-options, --disable-keys, --extended-insert, --lock-tables, --quick, and --set-charset. All of the options that --opt stands for also are on by default because --opt is on by default.
2) mysqldump can retrieve and dump table contents row by row, or it can retrieve the entire content from a table and buffer it in memory before dumping it. Buffering in memory can be a problem if you are dumping large tables. To dump tables row by row, use the --quick option (or --opt, which enables --quick). The --opt option (and hence --quick) is enabled by default, so to enable memory buffering, use --skip-quick.
3) --single-transaction: this option issues a BEGIN SQL statement before dumping data from the server (for transactional tables, i.e. InnoDB).
If your schema is a combination of both InnoDB and MyISAM, the following example will help you:
mysqldump -uuid -ppwd --skip-opt --single-transaction --max_allowed_packet=512M db > db.sql

I've never done it before but you could try --skip-add-locks when dumping.
Though it might take longer, you could dump in several batches, each of which would take very little time to complete. Adding --skip-add-drop-table would allow you to load these multiple smaller dumps into the same table without re-creating it. Using --extended-insert would make the SQL file smaller to boot.
Try something like mysqldump -u ${user} -p${password} --skip-add-drop-table --extended-insert --where='id between 0 and 20000' test_db test_table > test.sql. You would need to dump the table structures and load them first in order to do it this way, or remove --skip-add-drop-table for the first dump.
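Building on that, a hedged sketch of dumping one table in id ranges might look like this (the user name, chunk size, and the 1,000,000-row upper bound are all made-up values):
# Dump rows in half-open ranges so chunk boundaries are not duplicated.
for start in $(seq 0 20000 1000000); do
    end=$((start + 20000))
    mysqldump -u backup_user -p"${password}" --skip-add-drop-table --extended-insert \
        --where="id >= ${start} and id < ${end}" test_db test_table >> test_chunks.sql
done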

mysqldump does use --lock-tables by default (it's part of --opt). Try --skip-lock-tables if you want to avoid the locking.
Let me know if it helped
BTW - you should also use --add-locks, which will make your import faster!

Related

How to compare 2 mysql dumps

I've been testing the command
mysqldump databaseName > mysqlDump1
run separately on 2 servers where I process the same data with the same software.
When I diff the output files, there are numerous differences (including the file size). I guess it's timestamps etc. that cause it, but is there a way to make both dumps come out the same?
That way I could use it to regression-test software changes where I don't expect the DB to change during processing (unless my change is supposed to affect it...rare).
Most likely the problem is that mysqldump doesn't guarantee to dump rows in a consistent order. Even two dumps on the same data taken on the same machine could theoretically come out with inserts in a different order.
mysqldump does have an option --order-by-primary, which might help, but the documentation warns that it takes longer than an ordinary dump.
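If you want to try it, the command would look something like this (a sketch; --skip-dump-date, which removes the "Dump completed on" timestamp from the dump footer, is available in reasonably recent mysqldump versions):
mysqldump --order-by-primary --skip-dump-date databaseName > mysqlDump1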

mysqldump vs select into outfile

I use the select * into outfile option in MySQL to back up data into tab-separated text files. I run this statement against each table.
And I use load data infile to import the data into MySQL for each table.
I have not done any locking or disabled keys while performing this operation.
Now I face some issues:
While the backup is running, other updates and selects get slow.
It takes too much time to import data for huge tables.
How can I improve the method to solve the above issues?
Is mysqldump an option? I see that it uses insert statements, so before I try it, I wanted to ask for advice.
Does using locks and disabling keys before each "load data" improve speed on import?
If you have a lot of databases/tables, it will definitely be much easier for you to use mysqldump, since you only need to run it once per database (or even once for all databases, if you do a full backup of your system). Also, it has the advantage that it also backs up your table structure (something you cannot do using only select *).
The speed is probably similar, but it would be best to test both and see which one works best in your case.
Someone here tested the options, and mysqldump proved to be faster in his case. But again, YMMV.
If you're concerned about speed, also take a look at the mysqldump/mysqlimport combination. As mentioned here, it is faster than mysqldump alone.
As for locks and disable keys, I am not sure, so I will let someone else answer that part :)
Using mysqldump is important if you want your data backup to be consistent. That is, the data dumped from all tables represents the same instant in time.
If you dump tables one by one, they are not in sync, so you could have data for one table that references rows in another table that aren't included in the second table's backup. When you restore, it won't be pretty.
For performance, I'm using:
mysqldump --single-transaction --tab=/path/to/dumpdir mydatabase
This dumps, for each table, one .sql file with the table definition and one .txt file with the data, into the directory given to --tab.
Then when I import, I run the .sql files to define tables:
mysqladmin create mydatabase
cat *.sql | mysql mydatabase
Then I import all the data files:
mysqlimport --local --use-threads=4 mydatabase *.txt
In general, running mysqlimport is faster than running the insert statements output by default by mysqldump. And running mysqlimport with multiple threads should be faster too, as long as you have the CPU resources to spare.
Using locks when you restore does not help performance.
DISABLE KEYS is intended to defer index creation until after the data is fully loaded and keys are re-enabled, but this helps only for non-unique indexes in MyISAM tables. And you shouldn't be using MyISAM tables anyway.
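For completeness, deferring index maintenance on a MyISAM table looks roughly like this (a sketch with placeholder table and file names; remember it only affects non-unique indexes):
mysql --local-infile=1 mydatabase <<'SQL'
ALTER TABLE my_table DISABLE KEYS;   -- skip non-unique index updates during the load
LOAD DATA LOCAL INFILE '/path/to/my_table.txt' INTO TABLE my_table;
ALTER TABLE my_table ENABLE KEYS;    -- rebuild the deferred indexes in one pass
SQL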
For more information, read:
https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html
https://dev.mysql.com/doc/refman/5.7/en/mysqlimport.html

Keeping integrity between two separate data stores during backups (MySQL and MongoDB)

I have an application I designed where relational data sits and fits naturally into MySQL. I have other data that has a constantly evolving schema and isn't relational, so I figured the natural way to store it would be in MongoDB as documents. My issue is that one of my documents references a MySQL primary ID. So far this has worked without any issues. My concern is that once production traffic comes in and we start working with backups, there might be inconsistencies: when a document changes, it might no longer point to the correct ID in the MySQL database. The only way to guarantee this to any degree would be to shut down the application while taking backups, which doesn't make much sense.
There have to be other people who deploy a similar strategy. What is the best way to ensure data integrity between the two data stores, particularly during backups?
MySQL Perspective
All your MySQL data would have to use InnoDB. Then you could make a snapshot of the MySQL Data as follows:
MYSQLDUMP_OPTIONS="--single-transaction --routines --triggers"
mysqldump -u... -p... ${MYSQLDUMP_OPTIONS} --all-databases > MySQLData.sql
This will create a clean point-in-time snapshot of all MySQL Data as a single transaction.
For instance, if you start this mysqldump at midnight, all data in the mysqldump output will be from midnight. Data can still be added to MySQL (provided all your data uses the InnoDB Storage Engine) and you can have MongoDB reference any new data added to MySQL after midnight, even if it is during the backup.
If you have any MyISAM tables, you need to convert them to InnoDB. Let's cut to the chase. Here is how you make a script to convert all your MyISAM tables to InnoDB:
MYISAM_TO_INNODB_CONVERSION_SCRIPT=/root/ConvertMyISAMToInnoDB.sql
echo "SET SQL_LOG_BIN = 0;" > ${MYISAM_TO_INNODB_CONVERSION_SCRIPT}
mysql -u... -p... -AN -e"SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=InnoDB;') InnoDBConversionSQL FROM information_schema.tables WHERE engine='MyISAM' AND table_schema NOT IN ('information_schema','mysql','performance_schema') ORDER BY (data_length+index_length)" >> ${MYISAM_TO_INNODB_CONVERSION_SCRIPT}
Just run the generated SQL file when you are ready to convert all user-defined MyISAM tables. Any system-related MyISAM tables are ignored and should not be touched anyway.
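Executing the generated file when you are ready is then simply (credentials elided, as above):
mysql -u... -p... < ${MYISAM_TO_INNODB_CONVERSION_SCRIPT}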
MongoDB Perspective
I cannot speak for MongoDB in depth, as I know very little about it. Yet, on the MongoDB side of things, if you set up a replica set for your MongoDB data, you could just run mongodump against a replica. Since mongodump is not point-in-time, you would have to disconnect the replica (to stop changes from coming over), perform the mongodump on it, and then re-establish the replica with its master. Check with your developers or with 10gen whether mongodump can be used against a disconnected replica set member.
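If dumping a (temporarily disconnected) replica turns out to be acceptable, the call itself is simply something like this (a sketch; the host, port, and output directory are placeholders):
mongodump --host replica-member.example.com --port 27017 --out /backups/mongo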
Common Objectives
If point-in-time truly matters to you, please make sure all OS clocks have the same synchronized time and timezone. If you have to perform such a synchronization, you must restart mysqld and mongod. Then your crontab jobs for mysqldump and mongodump will go off at the same time. Personally, I would delay the mongodump by about 30 seconds to ensure that the ids from MySQL that you want referenced in MongoDB are accounted for.
If you have mysqld and mongod running on the same server, then you do not need any MongoDB replication. Just start the mysqldump at 00:00:00 (midnight) and the mongodump at 00:00:30 (30 seconds after midnight).
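In crontab terms, that schedule could look like this (a sketch; cron only has minute resolution, so the 30-second offset is done with sleep, and paths and credentials are placeholders):
0 0 * * * mysqldump -u... -p... --single-transaction --routines --triggers --all-databases > /backups/MySQLData.sql
0 0 * * * sleep 30; mongodump --out /backups/mongo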
I don't think there is an easy way to do this. Mongo doesn't have complex transactions with rollback support, so it's very hard to maintain such integrity. One way to approach this would be to think of it as two ledgers: record all the updates in the MySQL ledger and then replay them against the Mongo ledger to maintain integrity. The other possible solution is to do this at the application level and stop the writes.
There really is no way to do it without some kind of outside check or enforcement.
If you really need to ensure perfect integrity between the two, one way to do it is to use timestamps on both your MySQL data (all records) and your Mongo records, then back up each store filtered by timestamp, using each one's tools to select only the records that existed right before the scheduled backup (see http://www.electrictoolbox.com/mysqldump-selectively-dump-data/ for how to use mysqldump with a WHERE clause, and http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump for how to dump a MongoDB collection with a query).
Depending on how you're actually using each of your data stores, you may be able to do something else... For example, if you only ever write to your MongoDB and never update or delete, then it would be reasonable to back up your MySQL database first, then back up your MongoDB (which may now have some extra records in it because it is backed up afterwards), and then purge the MongoDB records that do not correspond to anything in MySQL. As I said, it depends on how you're using them.
But the timestamp thing will work regardless - you just have the extra overhead of the timestamps.
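On the MySQL side, the timestamp-filtered dump would look something like this (a sketch; the updated_at column and cutoff value are invented, and the MongoDB side would use mongodump's --query option analogously):
CUTOFF="2013-01-01 00:00:00"
mysqldump -u backup_user -p"${PASSWORD}" --where="updated_at < '${CUTOFF}'" mydb mytable > mysql_part.sql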

(Partial) database synchronization with mysql

I'm looking for a solution that would let me synchronize, on request, a selection of tables between a live MySQL database and a local database. If there is no specific tool for this, what would be a good way to synchronize local and live databases?
There is always phpmyadmin where you can export selected tables and import them again.
This is a practical solution if you have a live DB that you need to work on with your development system. You can take a copy of the DB, work on it, drop e.g. the customer tables, then take the live DB offline, back it up, drop everything except the customer tables, and import your updated DB.
The advantage of phpmyadmin is that it shows you the queries; you can then put together a script from them and use it the next time you update.
The desktop DB manager SQLyog has a "Job Scheduler" tool for setting up things like this. It requires a Professional or Enterprise license (cheaper than they sound), but the 30-day trial is full-featured if you just want to check it out first.
If you have the following conditions
Live DB can be accessed by public IP
All Needed Tables are InnoDB
You can perform a mysqldump of individual tables
Here is how a shell script run from the local machine might accomplish this:
LIVEDB_IP=192.168.1.10
MYSQL_SRCUSER=whateverusername
MYSQL_SRCPASS=whateverpassword
MYSQL_SRCCONN="-h${LIVEDB_IP} -u${MYSQL_SRCUSER} -p${MYSQL_SRCPASS}"
SOURCE_DB=mydb_source
MYSQL_TGTUSER=whateverusername2
MYSQL_TGTPASS=whateverpassword2
MYSQL_TGTCONN="-h127.0.0.1 -u${MYSQL_TGTUSER} -p${MYSQL_TGTPASS}"
TARGET_DB=mydb_target
TBLLIST="tb1 tb2 tb3"
for TB in `echo "${TBLLIST}"`
do
mysqldump ${MYSQL_SRCCONN} --single-transaction ${SOURCE_DB} ${TB} | mysql ${MYSQL_TGTCONN} -A -D${TARGET_DB} &
done
wait
The --single-transaction option causes a point-in-time snapshot of the table to take place while still allowing INSERTs, UPDATEs, and DELETEs back in the source database.
If the tables are rather big, there may be timeout issues. Try this script instead:
LIVEDB_IP=192.168.1.10
MYSQL_SRCUSER=whateverusername
MYSQL_SRCPASS=whateverpassword
MYSQL_SRCCONN="-h${LIVEDB_IP} -u${MYSQL_SRCUSER} -p${MYSQL_SRCPASS}"
SOURCE_DB=mydb_source
MYSQL_TGTUSER=whateverusername2
MYSQL_TGTPASS=whateverpassword2
MYSQL_TGTCONN="-h127.0.0.1 -u${MYSQL_TGTUSER} -p${MYSQL_TGTPASS}"
TARGET_DB=mydb_target
TBLLIST="tb1 tb2 tb3"
for TB in `echo "${TBLLIST}"`
do
mysqldump ${MYSQL_SRCCONN} --single-transaction ${SOURCE_DB} ${TB} | gzip > ${TB}.sql.gz &
done
wait
for TB in `echo "${TBLLIST}"`
do
gzip -d < ${TB}.sql.gz | mysql ${MYSQL_TGTCONN} -A -D${TARGET_DB} &
done
wait
Give it a Try !!!
CAVEAT
If the tables needed are MyISAM, just remove --single-transaction from the mysqldump. This may cause queries against the source database to freeze until the mysqldump is done with the tables those queries need. In that event, just run this script during low-traffic hours.

Database dumping in mysql after certain checkpoints

I want to take a mysqldump after certain checkpoints, e.g. if I take a mysqldump now, then the next time I take a dump it should give me only the statements executed in that time interval. Is there any way to do this with mysqldump?
One more thing: how do I get the DELETE and UPDATE statements to show up in the mysqldump files?
Thanks
I don't think this is possible with mysqldump; however, that feature exists as part of MySQL core - it's called binary logging (the binlog).
The binary log contains “events” that describe database changes such as table creation operations or changes to table data. It also contains events for statements that potentially could have made changes (for example, a DELETE which matched no rows). The binary log also contains information about how long each statement took that updated data
Check this out http://dev.mysql.com/doc/refman/5.0/en/binary-log.html
Word of warning, binlogs can slow down the performance of your server.
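To actually pull out what ran between two checkpoints, you read the binlog back with mysqlbinlog; roughly like this (a sketch with made-up file names and datetimes - adjust them to your own binlog files and interval):
mysqlbinlog --start-datetime="2011-12-01 00:00:00" \
            --stop-datetime="2011-12-02 00:00:00" \
            /var/lib/mysql/mysql-bin.000042 > delta.sql
# delta.sql now contains the INSERT/UPDATE/DELETE statements (or row events)
# executed in that interval, which also answers the second question.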