How to compare 2 mysql dumps - mysql

I've been testing the command
mysqldump databaseName > mysqlDump1
Run separately on 2 servers where I process the same data with the same software.
When I diff the output files, there are numerous differences (including the file size). I guess it's timestamps etc. that cause it, but is there a way to make both dumps come out the same?
That way I could use them to regression test software changes where I don't expect any changes in the DB during processing (unless my change is supposed to affect it... which is rare).

Most likely the problem is that mysqldump doesn't guarantee to dump rows in a consistent order. Even two dumps on the same data taken on the same machine could theoretically come out with inserts in a different order.
mysqldump does have an option --order-by-primary, which might help, but the documentation warns that it takes longer than an ordinary dump.
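For example, a dump invocation along these lines tends to be much more diff-friendly (a sketch; the exact flags available depend on your mysqldump version):
# Order rows by primary key, drop the generated comments (which include the
# dump date and server version), and write one INSERT per row so that a
# plain textual diff lines up row by row.
mysqldump --order-by-primary --skip-comments --skip-extended-insert databaseName > mysqlDump1
diff mysqlDump1 mysqlDump2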

Related

Compare 2 large sql files and find differences to recover data

I have 2 large SQL files of around 8GB each. But, in the latest backup I find that one file has 300MB data missing.
I just want to compare which data is missing, so that I can check whether it was just temporary data or important data that has vanished.
On comparing both files via diff on Ubuntu 14.04 I always get a memory allocation error. I have also tried the usual tricks for allowing more memory and all that, but still no help.
I want to gather all data which exists in sql1 but is missing in sql2 into a new file, sql3.
Please help!
EDIT: I recently moved from a plain MySQL server to Percona XtraDB Cluster, and a lot of tables were converted from MyISAM to InnoDB in the process. Could that be the reason for the 300MB decrease in the mysqldump SQL files? I seriously doubt it, because SQL is SQL, but is the dumped SQL smaller for InnoDB in any case? Expert advice on this would help.
Comparing SQL dumps is quite hard when dealing with large amounts of data. I would try the following:
Import each SQL file's data into its own database
Use one of the methods indicated here to compare database content (I assume the schema is the same), e.g. Toad for MySQL
This kind of comparison should be faster, as data manipulation is much faster once the data is stored in a database, and it also has the advantage that the missing data can easily be used. E.g.
SELECT *
FROM db1.sometable
WHERE NOT EXISTS (SELECT 1
                  FROM db2.sometable
                  WHERE db1.sometable.pkcol = db2.sometable.pk2)
will return exactly the missing rows in a convenient way.
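If you then want to materialise those missing rows as their own dump (the sql3 file mentioned in the question), a sketch along these lines should work; db1, db2, sometable and pkcol are placeholders for your actual names:
-- Copy the rows present in db1 but absent from db2 into a separate table.
CREATE TABLE db1.missing_rows AS
SELECT t1.*
FROM db1.sometable AS t1
WHERE NOT EXISTS (SELECT 1
                  FROM db2.sometable AS t2
                  WHERE t1.pkcol = t2.pkcol);
That table can then be exported on its own, e.g. mysqldump db1 missing_rows > sql3.sql.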
If you export the dump you can use tools like Beyond Compare, Semantic Merge, Winmerge, Code Compare or other diff tools.
Note that some tools (e.g. Beyond Compare) have a 4096-character limit per line, which becomes a problem in the comparison (it drove me mad). It's possible to change that in Tools -> File Formats -> [choose your format, maybe it is Everything Else] -> Conversion -> 64000 characters per line (this is the maximum).
You can also try changing the file format to SQL (it might not help much though, and it will slow down your comparison).

mysqldump without interrupting live production INSERT

I'm about to migrate our production database to another server. It's about 38GB and uses MyISAM tables. Since I have no physical access to the new server's file system, we can only use mysqldump.
I have looked through this site to see whether a mysqldump online backup will bring down our production website. From this post: Run MySQLDump without Locking Tables , it says mysqldump will obviously lock the db and prevent inserts. But after a few tests, I was curious to find that it shows otherwise.
If I use
mysqldump -u root -ppassword --flush-logs testDB > /tmp/backup.sql
mysqldump will by default do a '--lock-tables', and this is a READ LOCAL lock (refer to the MySQL 5.1 docs), where concurrent inserts are still allowed. I ran a for loop inserting into one of the tables every second while mysqldump took one minute to complete. A record was inserted every second during that period. Which means mysqldump will not interrupt the production server and INSERTs can still go on.
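For reference, the test described above was roughly the following (a sketch; some_table and its column are placeholder names):
# Insert one row per second in the background while the dump runs.
while true; do
  mysql -u root -ppassword testDB -e "INSERT INTO some_table (created_at) VALUES (NOW())"
  sleep 1
done &
mysqldump -u root -ppassword --flush-logs testDB > /tmp/backup.sql
kill %1   # stop the insert loop once the dump has finished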
Is there anyone with a different experience? I want to make sure of this before carrying on to my production server, so I would be glad to know if I have done anything wrong that made my test incorrect.
[My version of mysql-server is 5.1.52, and mysqldump is 10.13]
Now, you may have a database with disjunct tables, or a data warehouse - where everything isn't normalized (at all), and where there are no links whatsoever between the tables. In that case, any dump would work.
I ASSUME that a production database containing 38G of data contains graphics in some form (BLOBs), and then - ubiquitously - you have links to them from other tables. Right?
Therefore, you are - as far as I can see - at risk of losing important links between tables (usually primary/foreign key pairs); you may capture one table at the point of being updated/inserted into, while its dependent (which uses that table as its primary source) has not been updated yet. Thus, you will lose the so-called integrity of your database.
More often than not, it is extremely cumbersome to re-establish integrity, most often because the system using/generating/maintaining the database was not built as a transaction-oriented system, so relationships in the database cannot be tracked except via the primary/foreign key relations.
Thus, you may well get away with copying your tables without locks and many of the other proposals above - but you are at risk of burning your fingers, and depending on how sensitive the operations of the system are - you may burn yourself severely or just get a surface scratch.
Example: If your database is a critical mission database system, containing recommended heart beat rate for life support devices in an ICU, I would think more than twice, before I make the migration.
If, however, the database contains pictures from facebook or similar site = you may be able to live with the consequences of anything from 0 up to 129,388 lost links :-).
Now - so much for analysis. Solution:
YOU WOULD HAVE to create a piece of software which does the dump for you with full integrity, table-set by table-set, tuple by tuple. You need to identify the cluster of data which can be copied from your current online 24/7/365 database to your new one, then do that, then mark it as copied.
IFFF changes then occur to the records you have already copied, you will need to copy those again afterwards. It can be a tricky affair to do so.
IFFF you are running a more advanced version of MySQL - you can actually create another site and/or a replica, or a distributed database - and get away with it that way.
IFFF you have a window of, let's say, 10 minutes, which you can create if you need it, then you can also just COPY the physical files located on the drive. I am talking about the .frm, .MYD and .MYI files of your MyISAM tables - then you can shut down the server for a few minutes and copy.
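A minimal sketch of that last option, assuming MyISAM tables, a Debian-style service name and the default /var/lib/mysql data directory:
# Stop MySQL briefly, copy the raw table files, then start it again.
sudo service mysql stop
sudo cp -a /var/lib/mysql/yourdb /backup/yourdb-$(date +%F)
sudo service mysql start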
Now to a cardinal question:
You need to do maintenance on your machines from time to time. Doesn't your system have room for that kind of operation? If not - then what will you do when the hard disk crashes? Pay attention to the 'when' - not the 'if'.
1) Use of --opt is the same as specifying --add-drop-table, --add-locks, --create-options, --disable-keys, --extended-insert, --lock-tables, --quick, and --set-charset. All of the options that --opt stands for also are on by default because --opt is on by default.
2) mysqldump can retrieve and dump table contents row by row, or it can retrieve the entire content from a table and buffer it in memory before dumping it. Buffering in memory can be a problem if you are dumping large tables. To dump tables row by row, use the --quick option (or --opt, which enables --quick). The --opt option (and hence --quick) is enabled by default, so to enable memory buffering, use --skip-quick.
3) --single-transaction: this option issues a BEGIN SQL statement before dumping data from the server, which gives a consistent snapshot for transactional tables (InnoDB).
If your schema is a combination of both InnoDB and MyISAM, the following example will help you:
mysqldump -uuid -ppwd --skip-opt --single-transaction --max_allowed_packet=512M db > db.sql
I've never done it before but you could try --skip-add-locks when dumping.
Though it might take longer, you could dump in several batches, each of which would take very little time to complete. Adding --skip-add-drop-table would allow you to load these multiple smaller dumps into the same table without re-creating it. Using --extended-insert would make the SQL file smaller to boot.
Possibly try something like mysqldump -u ${user} -p${password} --skip-add-drop-table --extended-insert --where='id between 0 and 20000' test_db test_table > test.sql. You would need to dump the table structure and load it first in order to do it this way, or remove the --skip-add-drop-table for the first dump.
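A sketch of what that batched approach could look like as a shell loop (the id ranges, the row count and the ${user}/${password} placeholders are assumptions to adapt to your own schema):
# Dump the table structure once, then the data in chunks of 20000 ids.
mysqldump -u ${user} -p${password} --no-data test_db test_table > test_schema.sql
for start in $(seq 0 20000 980000); do
  end=$((start + 19999))
  mysqldump -u ${user} -p${password} --no-create-info \
    --where="id between ${start} and ${end}" test_db test_table > test_${start}.sql
done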
mysqldump doesn't add --lock-tables by default. Try to use --lock-tables
Let me know if it helped
BTW - you should also use --add-locks, which will make your import faster!

how to delete some tables from sql dump?

I have got a dump of all my SQL databases.
In this dump I have "database1", "database2", "database3".
How can I extract each database from the dump into its own file? Maybe with some program or script?
Or, for example, delete only "database2" from the dump?
Depends how big it is.
If it's small (i.e. < 1G) then you can easily load it into a MySQL instance on a test box (a VM or somewhere) and then do another dump containing just the DBs you're interested in. This is definitely the most reliable way.
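For example, a sketch of that approach, assuming the full dump is in alldbs.sql and was taken with --all-databases (so it already contains the CREATE DATABASE statements):
# Load the full dump into a scratch MySQL instance on the test box...
mysql -u root -p < alldbs.sql
# ...then re-dump only the databases you are interested in,
mysqldump -u root -p --databases database1 database3 > subset.sql
# or dump each database into its own file.
mysqldump -u root -p --databases database2 > database2.sql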
If the dump is very large, say 500G, then it could be more difficult.
Applying text processing to MySQL dump files is not advisable because they aren't necessarily plain text files! They can contain arbitrary binary data, and that binary data might happen to contain the very thing you're searching for (for example, when processing the file with an awk program).
Depends on your use-case really.

Database dumping in mysql after certain checkpoints

I want to get a mysqldump after certain checkpoints, e.g. if I take a mysqldump now, then the next time I take a dump it should give me only the commands executed in the intervening time. Is there any way to get this using mysqldump?
One more thing: how do I show the DELETE and UPDATE commands in the mysqldump files?
Thanks
I don't think this is possible with mysqldump; however, that feature exists as part of MySQL core - it's called binary logging (the binlog).
The binary log contains “events” that describe database changes such as table creation operations or changes to table data. It also contains events for statements that potentially could have made changes (for example, a DELETE which matched no rows). The binary log also contains information about how long each statement took that updated data
Check this out http://dev.mysql.com/doc/refman/5.0/en/binary-log.html
Word of warning, binlogs can slow down the performance of your server.
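If binary logging is enabled, extracting just the statements executed between two checkpoints looks roughly like this (the log file path and the timestamps are placeholders):
# Write out only what happened between the two timestamps, including
# DELETE and UPDATE statements, as replayable SQL.
mysqlbinlog --start-datetime="2014-01-01 10:00:00" \
            --stop-datetime="2014-01-02 10:00:00" \
            /var/log/mysql/mysql-bin.000001 > delta.sql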

Is there a MySql binary dump format? Or anything better than plain text INSERT statements?

Is there anything better (faster or smaller) than pages of plain text CREATE TABLE and INSERT statements for dumping MySql databases? It seems awfully inefficient for large amounts of data.
I realise that the underlying database files can be copied, but I assume they will only work in the same version of MySql that they come from.
Is there a tool I don't know about, or a reason for this lack?
Not sure if this is what you're after, but I usually pipe the output of mysqldump directly to gzip or bzip2 (etc.). It tends to be considerably faster than writing the uncompressed dump to disk, and the output files are much smaller thanks to the compression.
mysqldump --all-databases (other options) | gzip > mysql_dump-2010-09-23.sql.gz
It's also possible to dump to XML with the --xml option if you're looking for "portability" at the expense of consuming (much) more disk space than the gzipped SQL...
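Restoring from such a compressed dump is just the reverse pipe (the file name matches the example above):
# Decompress on the fly and feed the SQL straight back into mysql.
gunzip < mysql_dump-2010-09-23.sql.gz | mysql -u root -p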
Sorry, there is no binary dump format for MySQL. However, the binary logs of MySQL are specifically for backup and database replication purposes: http://dev.mysql.com/doc/refman/5.5/en/binary-log.html . They are not hard to configure. Only changes such as updates and deletes are logged, so each log file (created automatically by MySQL) is also an incremental backup of the changes in the DB. This way you can save a whole snapshot of the db from time to time (once a month?), then store just the log files, and in case of a crash restore the latest snapshot and replay the logs.
It's worth noting that MySQL has a special syntax for doing bulk inserts. From the manual:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
Would insert 3 rows in a single operation. So loading this way isn't as inefficient as it might otherwise be with one statement per row, and instead of 129 bytes in 3 INSERT statements, this is 59 bytes, and that advantage only gets bigger the more rows you have.
I've never tried this, but aren't mysql tables just binary files on the hard drive? Couldn't you just copy the table files themselves? Presumably that's essentially what you are asking for.
I don't know how to stitch that together, but it seems to me a copy of /var/lib/mysql would do the trick
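That can indeed work for MyISAM tables, provided no writes happen during the copy. A sketch of one way to hold writes off without stopping the server (the database name and datadir path are assumptions):
# Session 1 (mysql client): block writes and flush table files to disk,
# then keep this session open while the copy runs:
#   FLUSH TABLES WITH READ LOCK;
# Session 2 (shell): copy the raw files while the lock is held.
cp -a /var/lib/mysql/mydb /backup/mydb
# Back in session 1: release the lock once the copy has finished:
#   UNLOCK TABLES;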