MySQL, recover database after mysqlbinlog loading error?

I am copying a MySQL DB from an initial dump file and the set of binlogs created after the dump.
The initial load from the dump is fine. Then, while loading the binlogs using mysqlbinlog, one of the files fails, for example with a "server has gone away" error.
Is there any way to recover from a failed mysqlbinlog run, or is the database copy now irreparably corrupted? I know which log has failed, but I can't just rerun that log since the error could have occurred at any query within the log.
Is there a way to handle this moving forward?
I can look into minimizing the chances that there will be an error in the first place, but it doesn't seem like much of a recovery process (or master/slave process) if any MySQL issue during the loading completely ruins the database. I feel that I must be missing something.

I'd check the configuration value for max_allowed_packet. This is pretty small by default (4MB or 64MB depending on MySQL version). You might need to increase it.
Note that you need to increase that option both in the server and in the client that is applying binlogs. The effective limit on packet size is the lesser of the server and client's configuration value.
Even if these events succeeded when they were originally written through replication, they might not succeed when you replay the binlogs, because the mysql client applying them enforces its own packet limit; you need to specify the --max-allowed-packet option on the client as well.
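A minimal sketch of raising the limit on both ends (the binlog file name and the size are illustrative):

-- on the server (1073741824 bytes = 1GB, the documented maximum)
SET GLOBAL max_allowed_packet = 1073741824;

# then replay on the client with the matching option
mysqlbinlog mysql-bin.000042 | mysql --max-allowed-packet=1073741824 -u root -p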
See https://dev.mysql.com/doc/refman/8.0/en/gone-away.html for more explanation of the error you got.
If you don't know the binlog coordinates of the last binlog event that succeeded, you'll have to start over: remove the partially-restored instance and restore from the backup again, then apply the binlog.
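If you do know the coordinates of the last event that succeeded, mysqlbinlog can resume partway through a file instead; a sketch, with a hypothetical file name and position:

mysqlbinlog --start-position=12345 mysql-bin.000042 | mysql -u root -p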

Related

AWS RDS - automatic backup vs snapshot with MyISAM tables

I have an AWS RDS MySQL 5.7 database with MyISAM tables that I would like to migrate to another RDS in a custom VPC, and once migrated, convert those MyISAM tables to InnoDB.
If I understood correctly, the only way to create a correct automatic backup is using the following procedure explained here: "Automated Backups with Unsupported MySQL Storage Engines"
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html#Overview.BackupDeviceRestrictions
Stop all activity to your MyISAM tables (that is, close all sessions).
You can close all sessions by calling the mysql.rds_kill command for each process that is returned from the SHOW FULL PROCESSLIST command.
Lock and flush each of your MyISAM tables
Create a snapshot of your DB instance. When the snapshot has completed, release the locks and resume activity on the MyISAM tables
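For reference, the lock-and-flush step presumably boils down to something like this (the table names are hypothetical):

FLUSH TABLES mytable1, mytable2 WITH READ LOCK;
-- take the DB snapshot here, then:
UNLOCK TABLES;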
Has someone done this procedure before?
How is it that the snapshots are being created successfully every night from the current RDS DB instance, even though it contains MyISAM tables?
Thanks!
The problem isn't with snapshot creation. It's what can go wrong when you actually try to use one of the snapshots.
RDS snapshots work by capturing a snapshot of your RDS instance's underlying EBS volume (you can't see this volume, but it's there -- RDS runs on EC2, with "hidden" instances and volumes).
EBS snapshots capture the entire contents of the hard drive exactly as they happened to exist at the moment in time when the snapshot process starts.
What ends up on the snapshot is essentially the same thing that you would have on a MySQL Server if you executed sudo killall -9 mysqld -- it is as if the server had halted everything, immediately, without doing any of the things it normally does to clean up for a graceful shutdown. With RDS, things are not quite that dramatic, because RDS does take some precautions, but fundamentally, this is the nature of what is happening.
When you create an RDS instance from a snapshot, the first thing that happens when the instance starts up is the same thing your hypothetical server would do when you restarted the killed MySQL Server daemon: InnoDB Crash Recovery.
InnoDB Crash Recovery
To recover from a MySQL server crash, the only requirement is to restart the MySQL server. InnoDB automatically checks the logs and performs a roll-forward of the database to the present. InnoDB automatically rolls back uncommitted transactions that were present at the time of the crash.
https://dev.mysql.com/doc/refman/5.7/en/innodb-recovery.html#innodb-crash-recovery
Crash recovery is InnoDB's mechanism for bringing its internal data structures back into harmony and ensuring that all data is intact, exactly as your application left it. It's possible because InnoDB is a transactional storage engine. That means a lot of different things, but what it specifically means in this case is that InnoDB doesn't just change table data when you change a table. It goes through a process that can be simplified to something like this:
store the proposed changes to disk¹
actually make the changes
mark the changes as complete
What this means is that until the changes are finalized, InnoDB can be interrupted and will subsequently be able to pick up where it left off, without corrupting or losing data.
MyISAM has no such mechanisms. It just writes to the data files, directly. Even if a MyISAM table isn't actively being used, it may still need to be repaired when the server comes up, to clean up its structures. In some circumstances, repairing the table can be impossible, and all or part of the data in the table will be lost.
If your MyISAM tables are flushed and locked when the snapshot occurs, they are in a quiescent state on the disk, as though the server had actually been gracefully shut down before the snapshot had occurred, so they will be stable on the snapshot.
But the snapshot process will always appear to succeed, because the snapshot is just duplicating whatever is on the disk, as it appears at the moment in time when the snapshot gets underway.
The problem is that what the snapshot captured may not be usable, and you have no way of knowing whether the snapshot will be fully viable.
¹ Note that the first step, "store the proposed changes to disk," is related to the system variable innodb_flush_log_at_trx_commit. Setting it to 1 makes the system slower but is the safest choice, because your query doesn't actually succeed until that first step is done. A setting of 2 is still reasonably safe: it still writes the changes, but returns success without requiring the operating system to confirm that they have actually been written to the hard drive... so in a crash, a transaction your application thinks was committed may or may not have survived.
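In my.cnf terms, that trade-off is a single setting; an illustrative snippet (not part of the original answer):

[mysqld]
# 1 = write and flush the redo log at every commit (safest, slowest)
# 2 = write at every commit, flush to disk about once per second
innodb_flush_log_at_trx_commit = 1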

Can I recover dropped database in MySQL?

I accidentally dropped a database which I used for my web application, and it is in MySQL. Browsing through the internet I found out that to recover you need to have binary logs enabled. I read that binary logs record only changes in tables, so what do they have to do with recovering a DB? Upon executing the command "show binary logs;" the console shows me "Error Code: 1381. You are not using binary logging". I am a newbie to MySQL, so is it possible to recover without binary logging enabled? PLUS I have not made any real backup of the DB. Going through the MySQL "my.cnf" file I found that InnoDB is enabled by default; can that help me recover?
If I cannot recover, please mention the steps to be carried out next time creating a new db to ensure that I could recover even if it has been accidentally deleted.
I don't think it can be recovered. For next time, you can do the following when creating a new DB:
Enable the binary log by adding the following to my.cnf:
log-bin=/data/mysqlbinlog/mysql-bin
binlog-format=mixed (or row)
Create a slave DB.
Good luck.
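Once mysqld is restarted with those settings, a quick sanity check that binary logging is actually active:

SHOW VARIABLES LIKE 'log_bin';
SHOW BINARY LOGS;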

SQL databases get corrupt after every backup

I have a problem with my Unix server. This started a week ago. One day, after a backup (I used to keep 3 backup files), I visited a website on the server but it wouldn't work. I restarted the server and everything seemed to be working fine except the mysql service. My attempts to restart it failed. Then I figured that was because the server was full, so I deleted one of the backups, cleaned up some space, and the mysql service restarted successfully. Then I figured out that tables in one of the databases (MyISAM tables) were corrupt. So I repaired them with the myisamchk command via ssh and all worked fine. However, the very next day they were corrupt again (even though MySQL was working fine), and this time there was no disk space problem on the server. I repaired them again. The next day the same thing happened, and this time InnoDB tables that were part of another database were corrupt as well. I've fixed them too, so now all is working well, but I guess the same thing will happen after tonight's backup.
I can't identify the problem and I don't know what logs to look into to understand the problem. Can anyone please help me out? Thanks very much in advance.
No easy answer here. My immediate thought is that the database is still busy when the backups commence, possibly corrupting indexes, interfering with caches, etc. Turn on full logging and check for problems when the backup starts. Maybe you will find something.
Look for the my.cnf file. On my CentOS box it is located at /etc/my.cnf. It will have a config setting for the location of the error log.
My strongest suspect is OOM kill by the kernel or some other issue that results from running the system out of memory. Try this:
Start top on the server and press M to sort by memory so the biggest memory user is at the top.
note the pid of mysqld
manually perform the backup as you observe the value of the RES column in the top output (resident memory size)
once the backup is over see if the pid of mysqld has changed
If the pid has changed (meaning a restart took place), and you saw the memory footprint of mysqld grow to something comparable to the total amount of system memory, then my suspicion is correct, and we need to lower some settings in my.cnf to make it use less memory, e.g. key_buffer_size and innodb_buffer_pool_size.
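Two quick checks along those lines (log locations and the values below are illustrative, to be sized to your workload):

# look for OOM-killer activity around the backup window
dmesg | grep -i -E 'out of memory|killed process'

# candidate my.cnf reductions
[mysqld]
key_buffer_size = 128M
innodb_buffer_pool_size = 512M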
EDIT - From the log you posted, there are additional issues, although it is not clear how they could be contributing to the table corruption. Your server appears to be running with --skip-innodb, and your backup script is not able to deal with the absence of the InnoDB storage engine: it prints exception error messages but nevertheless continues. It is also attempting a repair, which is failing due to the lack of system privileges (error 1 is "Operation not permitted"). It is possible that encountering those errors triggers some faulty logic in your backup script that leaves the tables corrupted.
At this point I would recommend disabling MySQL backup using the cPanel tool, and using mysqldump or some other solution (e.g. Xtrabackup (https://www.percona.com/doc/percona-xtrabackup/2.3/index.html)) from a cron job instead.
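A hypothetical cron entry for such a dump (the schedule and paths are placeholders, credentials are assumed to come from a .my.cnf file, and --lock-all-tables is used because the server appears to run without InnoDB):

0 3 * * * mysqldump --all-databases --lock-all-tables | gzip > /var/backups/mysql-$(date +\%F).sql.gz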
EDIT2 - from the test results. The manual backup does not run the system out of memory and does not crash the server. The jury is still out on the automatic one.
Don't kill mysqld; shut it down gracefully.
Switch from MyISAM to InnoDB; the latter does not suffer from that 'error'.

Making new MySQL replication

I need to set up working MySQL replication from master to slave. (I tried it once already.)
The database is quite large (over 100GB) and it will take some hours to make it ready for a new slave.
The database uses both the MyISAM and InnoDB engines, and both are being written to.
I think my only choice is to copy the data files from master to a new slave? (Or make a database dump, which I refer to later under ROUND 2.)
Before that, do I have to shut down all the services which use the database and
write-lock the tables, or should I shut down the whole database?
After syncing the data directory to the new replication server, I started it up and the database with the tables was there. The first error I got rid of by changing the bin-log to 007324 and the position to 0.
Error 1:
140213 4:52:07 [ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
140213 4:52:07 [Note] Slave I/O thread exiting, read up to log 'bin-log.007323', position 46774422
After that I got new problems from database and this error came out from every table.
Error 2:
Error 'Incorrect information in file: './database/table.frm'' on query. Default database: 'database'.
Seems that something went wrong.
ROUND 2!
After this episode I started to wonder whether this can be done without a long service break.
The master database has already been configured, and it works fine with another slave.
So I did some googling and this is what I came up with.
Making read lock to tables:
FLUSH TABLES WITH READ LOCK;
Taking dump:
mysqldump --skip-lock-tables --single-transaction --flush-logs --master-data=2 -A > dbdump.sql
Packaging and moving:
gzip (pigz) the dbdump and move it to the slave server; after that, find the MASTER_LOG_FILE and MASTER_LOG_POS in the dump.
After that I don't think I want to import the dbdump.sql from the shell, because it's over 100GB and
will take time. So I think SOURCE would be an OK option for it.
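Since the dump was taken with --master-data=2, the needed coordinates are embedded near the top of the dump as a commented-out CHANGE MASTER TO statement; one way to pull them out of the gzipped file:

zgrep -m1 'CHANGE MASTER TO' dbdump.sql.gz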
On SLAVE server:
CREATE DATABASE dbdump;
USE dbdump;
SOURCE dbdump.sql;
CHANGE MASTER TO MASTER_HOST='x.x.x.x',MASTER_USER='replication',MASTER_PASSWORD='slavepass',
MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=X;
START SLAVE;
SHOW SLAVE STATUS \G
I haven't tested this yet, am I on to something?
--bp
Realize that issuing a SOURCE command is the same as running an import of the dumped SQL from shell. Either way, it is going to take a long time. Outside of that, you have the steps correct - flush table with read lock on master, make a database dump of master, make sure you note master binlog coordinates, import dump on slave, set binlog coordinates, start replication. Do not work with the raw binaries unless you REALLY know what you are doing (especially for INNODB tables).
If you have a number of large tables (i.e. not just one big one), you could consider parallelizing your dumps/imports by table (or groups of tables) to speed things along. There are actually tools out there to help you do this.
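A rough sketch of a per-table parallel dump (database and table names are hypothetical; tools such as mydumper automate this properly, and you must still guarantee a consistent point in time across the parallel dumps, e.g. by holding the read lock):

mysqldump mydb table1 > table1.sql &
mysqldump mydb table2 > table2.sql &
wait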
You CAN work with the raw binaries, but it is not for the faint of heart. In the past, I have used rsync to differentially update the raw binaries between master and slave (you still must use FLUSH TABLES WITH READ LOCK and gather the master binlog coordinates before doing this). For MyISAM tables this actually works pretty well. For InnoDB, it can be trickier. I prefer to use the option (innodb_file_per_table) that makes InnoDB write index and data files per table. You would still need to rsync the ibdata* files, and you would delete the ib_logfile* files from the slave.
This whole thing is a bit of a high wire act, so I would not resort to doing this unless you have no other viable options. Absolutely take a traditional SQL dump before even thinking about attempting a binary file sync, and each time until you are VERY comfortable that you actually know what you are doing.
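For the record, the differential sync described above might look something like this (the host and paths are hypothetical, and the master must be locked or quiesced first, as noted):

rsync -av --delete /var/lib/mysql/ slave-host:/var/lib/mysql/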

Moving a MySQL slave to new hard drives - do I need mysqld-relay-bin logs?

I am moving a MySQL slave from one set of HDs to another. The configuration of the machine denies me the ability to have both old and new hard drives on it at the same time. So I rsync'ed the data directory to another machine.
When the new hard drives came online, I rsync'ed the data dir back. This worked fine.
However, I cannot start replication. This is the error I get.
120314 4:23:07 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=mysqld-relay-bin' to avoid this problem.
120314 4:23:07 [ERROR] Failed to open the relay log '/var/lib/mysqllogs/mysqld-relay-bin.000273' (relay_log_pos 677043943)
120314 4:23:07 [ERROR] Could not find target log during relay log initialization
120314 4:23:07 [ERROR] Failed to initialize the master info structure
I found this comment:
https://serverfault.com/questions/61471/moving-a-mysql-slave-to-a-new-host-failed-to-open-the-relay-log
If it is just complaining about the relay logs, in most cases, they
are disposable if the master still has the binary logs around. You can
just run CHANGE MASTER TO on the slave and it will flush the existing
relay logs and start anew. You don't need to make a new fresh copy.
This seems to suggest that I do not need these log files.
The host name is not changing.
My Questions:
Do I need these log files?
If not, what do I need to do to get replication started? Will it remember where it left off?
If I do need these log files, is there anything else I'm forgetting?
I don't think you need the relay bin log files to get it to work. It might remember where it left off. Did you try the command RESET SLAVE;? You should first get the position from the slave with SHOW SLAVE STATUS; to see if it still remembers, then check whether the corresponding log file on the master server still exists, because the master only keeps binlogs around for as long as your size and expiry settings allow. But try RESET SLAVE; if you haven't; it does magic. Otherwise you are probably going to have to start the whole process over by dumping the existing master data right after you lock the tables and run SHOW MASTER STATUS. I wouldn't recommend trying to salvage this process if you have the option to start replication from scratch.
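Putting that together, a sketch of the sequence (the file name and position are placeholders; take the real values from the Relay_Master_Log_File and Exec_Master_Log_Pos fields of your own SHOW SLAVE STATUS output):

STOP SLAVE;
SHOW SLAVE STATUS\G
RESET SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=4;
START SLAVE;
SHOW SLAVE STATUS\G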