How to dump a single table in MySQL without locking? - mysql

When I run the following command, the output only consists of the create syntax for 'mytable', but none of the data:
mysqldump --single-transaction -umylogin -p mydatabase mytable > dump.sql
If I drop --single-transaction, I get an error as I can't lock the tables.
If I drop 'mytable' (and dump the whole DB), it looks like it's creating the INSERT statements, but the entire DB is huge.
Is there a way I can dump the table -- schema & data -- without having to lock the table?
(I also tried INTO OUTFILE, but lacked access for that as well.)

The answer might depend on the database engine that you are using for your tables. InnoDB has some support for non-locking backups. Given your comments about permissions, you might lack the permissions required for that.
The best option that comes to mind would be to create a duplicate table without the indices. Copy all of the data from the table you would like to back up over to the new table. If you write the copy query so that it can easily page through the data, you can control how long each lock is held. I have found that 10,000 rows per iteration is usually pretty darn quick. Once you have this query, you can just keep running it until all rows are copied.
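As a rough sketch of that batched copy (the copy table's name and the id column here are just illustrative):
CREATE TABLE mytable_copy LIKE mytable;
-- optionally drop the secondary indexes on mytable_copy before copying
INSERT INTO mytable_copy
  SELECT * FROM mytable
  WHERE id > 0 AND id <= 10000;
-- repeat, shifting the id range by 10000 each pass, until every row is copied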
Now that you have a duplicate, you can either drop it, truncate it, or keep it around and try to update it with the latest data and leverage it as a backup source.
Jacob

Related

write lock all tables in mysql for a moment

I need to perform some scripted actions, which may take a while (maybe a minute). At the beginning of these actions, I take some measurements from the MySQL DB, and it's important that they do not change until the actions are done. The DB has dozens of tables, since it belongs to a rather old-fashioned but huge CMS, and the CMS users have a dozen ways to modify it.
I do not even want to change anything in the DB myself while my script runs; it should simply be frozen. It's not a dump or an update. But the tables should stay open for reading by everyone, so that visitors of the connected homepage don't get errors.
It would be perfect if the database-altering actions that other CMS users perform in the meantime were simply deferred until the DB is unlocked again, but I would not mind if they failed instead.
So I thought that at the beginning of the script I'd lock the tables down with
LOCK TABLES first_table WRITE;
LOCK TABLES second_table WRITE;
...
and afterwards release them with
UNLOCK TABLES;
I think that should do exactly what I want. But can I achieve this for all tables of the DB without naming them explicitly, to make it more future-proof?
This definitely does not work:
lock tables (select TABLE_NAME from information_schema.tables
where table_schema='whatever') write;
Another question, if someone can answer this on the fly: would I have to perform the lock/unlock as a different MySQL user than the one used by the CMS? If I understood this right, then yes.
Below is the statement to lock all tables (actually it creates a single global lock):
FLUSH TABLES WITH READ LOCK;
Then release it with:
UNLOCK TABLES;
Mysqldump does this, for example, unless you are backing up only transactional tables and use the --single-transaction option.
Read http://dev.mysql.com/doc/refman/5.7/en/flush.html for more details about FLUSH TABLES.
Re your comment:
Yes, this takes a global READ LOCK on all tables. Even your own session cannot write. My apologies for overlooking this requirement of yours.
There is no equivalent global statement to give you a write lock. You'll have to lock tables by name explicitly.
There's no syntax for wildcard table names, nor is there syntax for putting a subquery in the LOCK TABLES statement.
You'll have to get a list of table names and build a dynamic SQL query.
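If it helps, you can generate that statement from information_schema with something like the following sketch (the schema name is a placeholder; for a schema with many tables you may need to raise group_concat_max_len first):
SELECT CONCAT('LOCK TABLES ',
       GROUP_CONCAT(CONCAT('`', TABLE_NAME, '` WRITE') SEPARATOR ', '),
       ';')
FROM information_schema.tables
WHERE table_schema = 'whatever';
-- copy the generated statement, run it, and finish with UNLOCK TABLES;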

Best practices for daily MySQL (partial and filtered) replication?

I have a reasonably large database with more than 40 tables. I only need to replicate a few of them (around 5), and each of those tables is also filtered.
I'm looking for best practices for replicating this data (daily is enough), where I can select just a few tables and include a WHERE clause for each of them.
I'm thinking of running mysqldump for each table (with a WHERE clause) and writing a separate .sql file per table. I can then truncate all tables on the destination DB (all data is overwritten daily) and run mysql to import each table separately.
Example:
# dump each table
mysqldump -u repl_user my_database my_table -w 'id between 1000 and 1005' > my_table.sql
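On the destination, the reload per table would then be something along these lines (user and database names are placeholders):
mysql -u repl_user destination_db -e 'TRUNCATE TABLE my_table'
mysql -u repl_user destination_db < my_table.sql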
I'm aware that I could replicate the full database and use the BLACKHOLE storage engine for the tables I don't need, but since 35 of the tables are not needed, that seems like overkill. Besides, some tables only need a filtered version, and I can't solve that with BLACKHOLE.
Any better solutions?
MySQL natively supports replication filters, but only at the database or table level. This doesn't meet your requirement to filter a subset of rows from these tables.
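For reference, table-level replication filters are configured on the replica in my.cnf with options like these (database and table names are placeholders):
[mysqld]
replicate-do-table = my_database.my_table
replicate-wild-do-table = my_database.orders%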
FlexViews is a tool to read the binary log and replay only changes that are relevant to keeping a materialized view up to date. You could define your materialized view in such a way to implement your table filtering.

mysqldump vs select into outfile

I use the SELECT * INTO OUTFILE option in MySQL to back up the data into text files in tab-separated format. I run this statement against each table.
And I use LOAD DATA INFILE to import the data back into MySQL for each table.
I have not yet used any locking or DISABLE KEYS while performing this operation.
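The per-table statements look roughly like this (file paths and the table name are just examples):
SELECT * INTO OUTFILE '/tmp/my_table.txt'
  FIELDS TERMINATED BY '\t'
  FROM my_table;
LOAD DATA INFILE '/tmp/my_table.txt'
  INTO TABLE my_table
  FIELDS TERMINATED BY '\t';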
Now I face some issues:
While the backup is running, other updates and selects get slow.
It takes too much time to import data for huge tables.
How can I improve the method to solve the above issues?
Is mysqldump an option? I see that it uses insert statements, so before I try it, I wanted to request advice.
Does using locks and DISABLE KEYS before each LOAD DATA improve import speed?
If you have a lot of databases/tables, it will definitely be much easier for you to use mysqldump, since you only need to run it once per database (or even once for all databases, if you do a full backup of your system). Also, it has the advantage that it also backs up your table structure (something you cannot do using only select *).
The speed is probably similar, but it would be best to test both and see which one works best in your case.
Someone here tested the options, and mysqldump proved to be faster in his case. But again, YMMV.
If you're concerned about speed, also take a look at the mysqldump/mysqlimport combination. As mentioned here, it is faster than mysqldump alone.
As for locks and disable keys, I am not sure, so I will let someone else answer that part :)
Using mysqldump is important if you want your data backup to be consistent. That is, the data dumped from all tables represents the same instant in time.
If you dump tables one by one, they are not in sync, so you could have data for one table that references rows in another table that aren't included in the second table's backup. When you restore, it won't be pretty.
For performance, I'm using:
mysqldump --single-transaction --tab=/path/to/dumpdir mydatabase
This writes, for each table, one .sql file with the table definition and one .txt file with the data into the given directory.
Then when I import, I run the .sql files to define tables:
mysqladmin create mydatabase
cat *.sql | mysql mydatabase
Then I import all the data files:
mysqlimport --local --use-threads=4 mydatabase *.txt
In general, running mysqlimport is faster than running the insert statements output by default by mysqldump. And running mysqlimport with multiple threads should be faster too, as long as you have the CPU resources to spare.
Using locks when you restore does not help performance.
DISABLE KEYS is intended to defer index rebuilding until after the data is fully loaded and keys are re-enabled, but this helps only for non-unique indexes in MyISAM tables. And you shouldn't be using MyISAM tables anyway.
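If you do want to apply it by hand around a manual load, the statements are simply (the table name is a placeholder):
ALTER TABLE my_table DISABLE KEYS;
-- run the LOAD DATA INFILE / inserts here
ALTER TABLE my_table ENABLE KEYS;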
For more information, read:
https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html
https://dev.mysql.com/doc/refman/5.7/en/mysqlimport.html

Which is the better way to change the character set for huge data tables?

In my production database, the Alerts-related tables were created with a default charset of latin1; because of this we get an error when we try to insert Japanese characters into them. We need to change the default charset of the tables and their columns to UTF8.
As these tables hold a huge amount of data, an ALTER command might take a very long time (it took 5 hrs in my local DB with the same amount of data) and lock the table, which will cause data loss. Can we plan a mechanism to change the charset to UTF8 without data loss?
Which is the better way to change the charset for huge data tables?
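For reference, the statement I would need to run per table is along these lines (the table name is just an example):
ALTER TABLE alerts CONVERT TO CHARACTER SET utf8;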
I found this in the MySQL manual (http://dev.mysql.com/doc/refman/5.1/en/alter-table.html):
In most cases, ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable by other sessions. Updates and writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table without any failed updates.
So yes, it's tricky to minimize downtime while doing this. It depends on the usage profile of your table: are there more reads or more writes?
One approach I can think of is to use some sort of replication. Create a new Alerts table that uses UTF-8, and find a way to replicate the original table to the new one without affecting availability or throughput. When the replication is complete (or close enough), switch the tables by renaming them.
Of course this is easier said than done; I'd need to look into whether it's even possible.
You may take a look at the Percona Toolkit online-schema-change tool:
pt-online-schema-change
It does exactly this ("alters a table’s structure without blocking reads or writes"), with some limitations (only InnoDB tables, etc.) and risks involved.
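A typical invocation looks roughly like this (connection options omitted; the database and table names are placeholders):
pt-online-schema-change --alter "CONVERT TO CHARACTER SET utf8" D=mydatabase,t=alerts --execute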
Create a replicated copy of your database on another machine or instance. Once replication is set up, issue a STOP SLAVE command and alter the table. If you have more than one table, you may consider issuing START SLAVE again between each conversion to re-synchronise the two databases (if you do not, it may take longer to synchronise). When you complete the conversion, the replicated copy can replace your old production database and you can remove the old one. This is the way I found to minimize downtime.
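In outline, the per-table steps on the replica would be something like this (the table name is a placeholder):
STOP SLAVE;
ALTER TABLE alerts CONVERT TO CHARACTER SET utf8;
START SLAVE;
-- let replication catch up before converting the next table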

Why is a mysqldump with single-transaction more consistent than a one without?

I have gone through the manual, and it mentions that this option issues a BEGIN statement before it starts taking the dump. Can someone elaborate on this in a more understandable manner?
Here is what I read:
This option issues a BEGIN SQL statement before dumping data from the server. It is useful only with transactional tables such as InnoDB and BDB, because then it dumps the consistent state of the database at the time when BEGIN was issued without blocking any applications.
Can someone elaborate on this?
Since the dump is in one transaction, you get a consistent view of all the tables in the database. This is probably best explained by a counterexample. Say you dump a database with two tables, Orders and OrderLines:
1. You start the dump without a single transaction.
2. Another process inserts a row into the Orders table.
3. Another process inserts a row into the OrderLines table.
4. The dump processes the OrderLines table.
5. Another process deletes the Orders and OrderLines records.
6. The dump processes the Orders table.
In this example, your dump would have the rows for OrderLines, but not Orders. The data would be in an inconsistent state and would fail on restore if there were a foreign key between Orders and OrderLines.
If you had done it in a single transaction, the dump would have neither the order nor the lines (but it would be consistent), since both were inserted and then deleted after the transaction began.
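For what it's worth, recent versions of mysqldump start a --single-transaction dump with statements roughly equivalent to the following, so every table is read from the same snapshot:
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
-- one SELECT per table follows, all seeing the same snapshot
COMMIT;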
I used to run into problems where mysqldump without the --single-transaction parameter would consistently fail because data was being changed during the dump. As far as I can tell, running it within a single transaction prevents changes made during the dump from causing problems. Essentially, when you pass --single-transaction, it takes a snapshot of the database at that point in time and dumps that, rather than dumping data that could be changing while the utility is running.
This can be important for backups because it means you get all the data, exactly as it is at one point in time.
So for example, imagine a simple blog database, where a typical bit of activity might be:
1. Create a new user
2. Create a new post by the user
3. Delete the user, which deletes the post
Now when you back up your database, the backup may back up the tables in this order:
1. Posts
2. Users
What happens if someone deletes a user, which is required by the posts, just after your backup reaches #1?
When you restore your data, you'll find that you have a Post, but the user doesn't exist in the backup.
Putting a transaction around the whole thing means that all the updates, inserts, and deletes that happen on the database during the backup aren't seen by the backup.