How to delete some tables from an SQL dump? - mysql

I have a dump of all SQL databases.
In this dump there are "database1", "database2", "database3".
How can I split the dump so that each database's data goes into a separate file? Maybe with some program or script?
Or, for example, how can I delete only "database2" from the dump?

Depends how big it is.
If it's small (e.g. < 1 GB) then you can easily load it into a MySQL instance on a test box (a VM or somewhere) and then do another dump containing just the DBs you're interested in. This is definitely the most reliable way.
If the dump is very large, say 500 GB, then it could be more difficult.
Applying text processing to MySQL dump files is not advisable because they aren't necessarily plain text files! They can contain arbitrary binary data (BLOB columns, for example), and that binary data might happen to contain whatever pattern you're searching for (say, if you process the dump with an awk script).
Depends on your use-case really.
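For the small-dump case, a rough sketch of the reload-and-redump approach might look like this (assuming a throwaway MySQL instance; the file names are placeholders and the database names are the ones from the question):

# load the full dump into a scratch server
mysql -u root -p < full_dump.sql
# re-dump only the databases you want to keep (everything except "database2")
mysqldump -u root -p --databases database1 database3 > without_database2.sql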

Related

How to compare 2 mysql dumps

I've been testing the command
mysqldump databaseName > mysqlDump1
run separately on 2 servers where I process the same data against the same software.
When I diff the output files, there are numerous differences (including the file size). I guess it's timestamps etc. that cause them, but is there a way to make both dumps come out the same?
That way I could use them to regression-test software changes where I don't expect any change in the DB from processing (unless my change is supposed to affect it... which is rare).
Most likely the problem is that mysqldump doesn't guarantee to dump rows in a consistent order. Even two dumps on the same data taken on the same machine could theoretically come out with inserts in a different order.
mysqldump does have an option --order-by-primary, which might help, but the documentation warns that it takes longer than an ordinary dump.
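A hedged example of flags that make the output more reproducible (--skip-dump-date and --skip-comments drop the timestamped header/footer comments, and --order-by-primary is the option mentioned above; identical data should then diff much more cleanly, though it is still not guaranteed):

mysqldump --skip-dump-date --skip-comments --order-by-primary databaseName > mysqlDump1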

Compare 2 large sql files and find differences to recover data

I have 2 large SQL files of around 8 GB each. But in the latest backup I find that one file has 300 MB of data missing.
I just want to compare the files to see which data is missing, so that I can check whether it was just temporary data or important data that has vanished.
When comparing both files with diff on Ubuntu 14.04 I always get a memory allocation error. I have also tried the usual workarounds for letting diff use more memory, but still no help.
I want to gather all data which exists in sql1 but is missing in sql2 into a new file, sql3.
Please help!
EDIT: I moved from a plain MySQL server to Percona XtraDB Cluster recently, and a lot of tables were converted from MyISAM to InnoDB in the process. So, can that be a reason for the 300 MB decrease in the mysqldump SQL files? I seriously doubt it, because SQL is SQL, but is the SQL dumped for InnoDB tables smaller in any case? Expert advice on this would help.
Comparing SQL dumps is quite hard when dealing with large amounts of data. I would try the following:
Import each SQL file's data into its own database
Use one of the methods indicated here to compare database content (I assume the schema is the same). E.g. Toad for MySql
This way of comparison should be faster, as data manipulation is much quicker once the data is stored in a database, and it also has the advantage that the missing data can easily be used. E.g.
SELECT *
FROM db1.sometable
WHERE NOT EXISTS (SELECT 1
                  FROM db2.sometable
                  WHERE db1.sometable.pkcol = db2.sometable.pk2)
will return exactly the missing rows in a convenient form.
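A minimal sketch of the import step (assuming each dump file contains a single schema without its own CREATE DATABASE/USE statements; db1/db2 match the query above and the file names are placeholders):

mysql -u root -p -e "CREATE DATABASE db1; CREATE DATABASE db2;"
mysql -u root -p db1 < sql1.sql
mysql -u root -p db2 < sql2.sql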
If you export the dump you can use tools like Beyond Compare, Semantic Merge, Winmerge, Code Compare or other diff tools.
Note that some tools (e.g. Beyond Compare) have a 4096-character limit per row, which becomes a problem in the comparison (I got mad). It's possible to change that in Tools->FileFormat->[choose your format, maybe it is EverythingElse]->Conversion->64000 characters Per Line (this is the maximum).
You can also try changing the file format to SQL (it might not help much though, and it will slow your comparison).

Fastest way to copy a large MySQL table?

What's the best way to copy a large MySQL table in terms of speed and memory use?
Option 1. Using PHP, select X rows from the old table and insert them into the new table. Proceed to the next iteration of select/insert until all entries are copied over.
Option 2. Use MySQL INSERT INTO ... SELECT without row limits.
Option 3. Use MySQL INSERT INTO ... SELECT with a limited number of rows copied over per run.
EDIT: I am not going to use mysqldump. The purpose of my question is to find the best way to write a database conversion program. Some tables have changed, some have not. I need to automate the entire copy over / conversion procedure without worrying about manually dumping any tables. So it would be helpful if you could answer which of the above options is best.
There is a program that was written specifically for this task called mysqldump.
mysqldump is a great tool in terms of simplicity and careful handling of all types of data, but it is not as fast as LOAD DATA INFILE.
If you're copying on the same database, I like this version of Option 2:
a) CREATE TABLE foo_new LIKE foo;
b) INSERT INTO foo_new SELECT * FROM foo;
I've got lots of tables with hundreds of millions of rows (like 1/2 billion), with InnoDB, several keys and constraints. They take many, many hours to read from a MySQL dump, but only an hour or so with LOAD DATA INFILE. It is correct that copying the raw files with the DB offline is even faster. It is also correct that non-ASCII characters, binary data, and NULLs need to be handled carefully in CSV (or tab-delimited) files, but fortunately I've pretty much got numbers and text :-). I might take the time to see how long steps a) and b) above take, but I think they are slower than LOAD DATA INFILE... probably because of transactions.
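For reference, a rough sketch of that OUTFILE/INFILE round trip (table and file names are just examples, and the delimiter choices are one way of keeping NULLs and odd characters intact, not the only one):

-- export the source table to a tab-delimited file on the server
SELECT * FROM foo
  INTO OUTFILE '/tmp/foo.tsv'
  FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';

-- bulk-load it into the (already created) destination table
LOAD DATA INFILE '/tmp/foo.tsv'
  INTO TABLE foo_new
  FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';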
Of the three options listed above,
I would select the second option if you have a unique constraint on at least one column, so that you don't create duplicate rows if the script has to be run multiple times to achieve its task in the event of server timeouts.
Otherwise your third option would be the way to go, while manually taking any server timeouts into account to determine your insert/select limits.
Use a stored procedure
Option two must be fastest, but it's gonna be a mighty long transaction. You should look into making a stored procedure to do the copy. That way you could offload some of the data parsing/handling from the MySQL engine.
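A hedged sketch of such a batched copy as a stored procedure (this assumes an auto-increment integer primary key named id on foo, and foo_new already created with CREATE TABLE ... LIKE; the names and batch size are just examples):

DELIMITER //
CREATE PROCEDURE copy_foo_in_batches()
BEGIN
  DECLARE batch_size INT DEFAULT 10000;
  DECLARE last_id BIGINT DEFAULT 0;
  DECLARE max_id BIGINT;
  SELECT COALESCE(MAX(id), 0) INTO max_id FROM foo;
  -- copy id ranges of batch_size rows at a time to keep each transaction short
  WHILE last_id < max_id DO
    INSERT INTO foo_new
      SELECT * FROM foo WHERE id > last_id AND id <= last_id + batch_size;
    SET last_id = last_id + batch_size;
  END WHILE;
END //
DELIMITER ;

CALL copy_foo_in_batches();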
MySQL's LOAD DATA statement is faster than almost anything else; however, it requires exporting each table to a CSV file first.
Pay particular attention to escape characters and representing NULL values/binary data/etc in the CSV to avoid data loss.
If possible, the fastest way will be to take the database offline and simply copy data files on disk.
Of course, this has some requirements:
you can stop the database while copying.
you are using a storage engine that stores each table in individual files; MyISAM does this.
you have privileged access to the database server (root login or similar)
Ah, I see you have edited your post, then I think this DBA-from-hell approach is not an option... but still, it's fast!
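A hedged sketch of what that offline copy could look like for a MyISAM table (the service name, paths, and table/schema names are typical defaults and examples only, not guaranteed to match your setup):

# stop the server so the files on disk are consistent
sudo systemctl stop mysql
# copy the table's .frm/.MYD/.MYI files into another (already existing) schema directory
sudo cp /var/lib/mysql/mydb/bigtable.frm /var/lib/mysql/mydb/bigtable.MYD /var/lib/mysql/mydb/bigtable.MYI /var/lib/mysql/mydb_copy/
sudo chown mysql:mysql /var/lib/mysql/mydb_copy/bigtable.*
sudo systemctl start mysql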
The best way I have found so far is to create dump files (.txt) using SELECT ... INTO OUTFILE and then load them back with LOAD DATA INFILE to get the same data into the other database.

mysql optimization script file

I'm looking at having someone do some optimization on a database. If I gave them a similar version of the db with different data, could they create a script file to run all the optimizations on my database (ie create indexes, etc) without them ever seeing or touching the actual database? I'm looking at MySQL but would be open to other db's if necessary. Thanks for any suggestions.
EDIT:
What if it were an identical copy with transformed data? Along with a couple sample queries that approximated what the db was used for (ie OLAP vs OLTP)? Would a script be able to contain everything or would they need hands on access to the actual db?
EDIT 2:
Could I create a copy of the db, transform the data to make it unrecognizable, create a backup file of the db, give it to the vendor, and have them give me a script file to run on my db?
Why are you concerned that they should not access the database? You will get better optimization if they have the actual data, as they can consider table sizes, which queries run the slowest, whether to denormalise where necessary, whether to put small tables completely in memory, and so on.
If it is an issue of confidentiality you can always make the data anonymous by replacing names.
If it's just adding indices, then yes. However, there are a number of things to consider when "optimizing". Which are the slowest queries in your database? How large are certain tables? How can certain things be changed/migrated to make those certain queries run faster? It could be harder to see this with sparse sample data. You might also include a query log so that this person could see how you're using the tables/what you're trying to get out of them, and how long those operations take.
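If you do go the query-log route, a hedged example of turning on the slow query log at runtime (the variable names are standard MySQL system variables; the threshold and file path are just examples and the file must be writable by the server):

-- log statements that take longer than 1 second to an example file path
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';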

Is there a MySql binary dump format? Or anything better than plain text INSERT statements?

Is there anything better (faster or smaller) than pages of plain text CREATE TABLE and INSERT statements for dumping MySql databases? It seems awfully inefficient for large amounts of data.
I realise that the underlying database files can be copied, but I assume they will only work in the same version of MySql that they come from.
Is there a tool I don't know about, or a reason for this lack?
Not sure if this is what you're after, but I usually pipe the output of mysqldump directly to gzip or bzip2 (etc.). It tends to be considerably faster than writing the uncompressed dump to disk, and the output files are much smaller thanks to the compression.
mysqldump --all-databases (other options) | gzip > mysql_dump-2010-09-23.sql.gz
It's also possible to dump to XML with the --xml option if you're looking for "portability" at the expense of consuming (much) more disk space than the gzipped SQL...
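To restore from such a compressed dump, a matching example (the file name follows the command above; add host/credential options as needed):

gunzip < mysql_dump-2010-09-23.sql.gz | mysql -u root -p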
Sorry, there is no binary dump format for MySQL. However, the binary logs of MySQL are intended specifically for backup and database replication purposes: http://dev.mysql.com/doc/refman/5.5/en/binary-log.html . They are not hard to configure. Only data-changing statements such as inserts, updates and deletes are logged, so each log file (created automatically by MySQL) is also an incremental backup of the changes to the DB. This way you can save a whole snapshot of the db from time to time (once a month?), then store just the log files, and in case of a crash restore the latest snapshot and replay the logs.
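A hedged sketch of enabling the binary log in my.cnf (the base name and retention period are examples; the option names are as documented for MySQL 5.5):

[mysqld]
# write binary logs with the given base name and purge them after 30 days
log-bin = mysql-bin
expire_logs_days = 30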
It's worth noting that MySQL has a special syntax for doing bulk inserts. From the manual:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
Would insert 3 rows in a single statement. So loading this way isn't as inefficient as it might otherwise be with one statement per row: instead of 129 bytes across 3 separate INSERT statements, this is 59 bytes, and that advantage only gets bigger the more rows you have.
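For comparison, the same three rows as one-statement-per-row INSERTs (the 129-byte form referred to above):

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3);
INSERT INTO tbl_name (a,b,c) VALUES(4,5,6);
INSERT INTO tbl_name (a,b,c) VALUES(7,8,9);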
I've never tried this, but aren't mysql tables just binary files on the hard drive? Couldn't you just copy the table files themselves? Presumably that's essentially what you are asking for.
I don't know how to stitch that together, but it seems to me a copy of /var/lib/mysql would do the trick