Start Position for Point-In-Time recovery for AWS RDS - mysql

Using AWS RDS you can restore your database using Point-in-time recovery to get it back to a certain time.
My question is: if someone deleted a bunch of data or dropped a table and you need to recover to the bin log statement just before the deletion occurred, how do you go about doing that using AWS RDS PITR?
With MySQL installed on an EC2 instance, you would be able to export the bin log files, find the position just before the deletion happened, and restore to that point in time. However, I don't see that functionality available within AWS RDS.
From what I can see, you have to just keep guessing at the time you think the deletion happened?
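For illustration, here is a minimal sketch of the binlog search described above. It assumes you already have the text output of mysqlbinlog for the relevant period and that event headers look like "#230314  9:15:02 server id 1  end_log_pos 620 ...". The function names and the sample pattern are hypothetical. RDS point-in-time restore takes a time rather than a binlog position, so the sketch turns the offending event's timestamp into a candidate restore time one second earlier. The timestamps are in whatever time zone mysqlbinlog printed them in, so they may need converting before use.

```python
import re
from datetime import datetime, timedelta

# "# at N" marks the byte offset where the next event starts.
AT = re.compile(r"^# at (\d+)")
# Event header as printed by mysqlbinlog (assumed format, see lead-in).
HEADER = re.compile(r"^#(\d{6})\s+(\d{1,2}:\d{2}:\d{2})\s+server id")


def locate_statement(binlog_text, pattern):
    """Return (start_position, event_time) of the first event whose
    statement text matches `pattern`, or None if nothing matches."""
    wanted = re.compile(pattern, re.IGNORECASE)
    start_pos = None
    event_time = None
    for line in binlog_text.splitlines():
        m = AT.match(line)
        if m:
            start_pos = int(m.group(1))
            continue
        m = HEADER.match(line)
        if m:
            event_time = datetime.strptime(
                m.group(1) + " " + m.group(2), "%y%m%d %H:%M:%S"
            )
            continue
        # Statement lines are not comments; skip anything starting with '#'.
        if event_time is not None and not line.startswith("#") and wanted.search(line):
            return start_pos, event_time
    return None


def restore_time_before(event_time):
    """Candidate restore time: one second before the offending event."""
    return event_time - timedelta(seconds=1)
```

Calling locate_statement(text, r"drop\s+table") gives the start position and timestamp of the first matching event, and restore_time_before gives a time to try for the restore. Other writes that happened in the same second as the deletion would still need checking by hand.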

Related

AWS DMS Change Data Capture from MySQL

I have "full-load-and-cdc" migration tasks set up to track changes made in a MySQL database and replicate them to AWS S3. The cluster is InnoDB and I connect to one of the replicas. Sometimes an instance has to be detached from the cluster and then added back as a clone. In this situation all replication tasks fail with errors like this: "Could not find first log file name in binary log index file".
So it looks like the binlog file with that specific name doesn't exist anymore. What can I do in a situation like this to be sure I continue replication from the last checkpoint?

Good idea to use SQS to move thousands of databases?

We want to move from using MySQL on an EC2 instance to RDS and set up replication. Seems like a no-brainer, right? Well, I've got 30,000 databases to move (don't ask). While setting up replication seems to work well, the process of getting the 30,000 databases into RDS is a royal pain; it takes forever and something almost always happens.
The nightly backup takes about two hours. I end up with a multi-GB SQL dump file. When I try to restore it, something almost always goes wrong: the RDS instance wasn't big enough memory-wise and crashed, the localhost ran out of swap space, the network connection went flaky. Whatever! I did get it to restore once; IIRC it took 23 hours (30K MySQL DBs are a ton of file IO).
So today, I decided to use mydumper. It generated 30,000 schema files for the database in about two hours, then suddenly, the source MySQL went into uninterruptible sleep according to top, I lost my client connections, strace showed it was still trying to read files, and the mydumper process crashed. I restarted the whole process and just checked the status; mysqld restarted 2.5 hours into it for some reason.
So here's what I'm thinking and I'd like your input: I write two python scripts: firstScript.py will run mydumper on a single database, update a status table, package up the SQL, put it onto an AWS SQS queue, repeating until no more databases are found; the secondScript.py reads from the queue, runs the SQL and updates the status table, repeating until no more messages are found.
I think this can work. Do you? The main thing I'm not sure of is this: can I simply run multiple secondScript.py by Ctrl-Z-ing them into the background?
Or does someone have a better way of moving 30,000 databases?
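As a rough sketch of the two-script idea, the code below uses Python's in-process queue.Queue as a stand-in for SQS and threads as a stand-in for several copies of secondScript.py. dump_database and load_sql are hypothetical placeholders for running mydumper and executing the SQL, and the status dict stands in for the status table.

```python
import queue
import threading


def dump_database(name):
    # Hypothetical placeholder for running mydumper on one database.
    return f"CREATE DATABASE `{name}`;"


def load_sql(sql):
    # Hypothetical placeholder for executing the SQL on the target server.
    return True


def producer(db_names, q, status):
    # firstScript.py: dump each database, record status, enqueue the SQL.
    for name in db_names:
        status[name] = "dumped"
        q.put((name, dump_database(name)))


def consumer(q, status, lock):
    # secondScript.py: take work until the queue is empty.
    while True:
        try:
            name, sql = q.get_nowait()
        except queue.Empty:
            return
        ok = load_sql(sql)
        with lock:
            status[name] = "restored" if ok else "failed"
        q.task_done()


def migrate(db_names, workers=4):
    q = queue.Queue()
    status = {}
    lock = threading.Lock()
    producer(db_names, q, status)  # all work is queued before consumers start
    threads = [
        threading.Thread(target=consumer, args=(q, status, lock))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return status
```

With real SQS each consumer would be its own process. Note that Ctrl-Z only suspends a job; it keeps running in the background only after bg, or if it was started with a trailing & in the first place. SQS also caps message size, so a real version would probably queue a pointer to the dump rather than the SQL itself.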
I would not use mysqldump or mydumper to make a logical dump. Loading the resulting SQL-format dump takes too long.
Instead, use Percona XtraBackup to make a physical backup of your EC2 instance, and upload the backup to S3. Then restore to the RDS instance from S3, set up replication on the RDS instance to your EC2 instance, and let it catch up.
The feature of restoring a physical MySQL backup to RDS was announced in November 2017.
See also:
https://www.percona.com/blog/2018/04/02/migrate-to-amazon-rds-with-percona-xtrabackup/
https://aws.amazon.com/about-aws/whats-new/2017/11/easily-restore-an-amazon-rds-mysql-database-from-your-mysql-backup/
You should try it out with a smaller instance than your 30k databases just so you get some practice with the steps. See the steps in the Percona blog I linked to above.
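For the "set up replication and let it catch up" step, a small sketch follows. It assumes the backup directory contains XtraBackup's xtrabackup_binlog_info file (binlog file name and position separated by whitespace) and that the RDS instance offers the mysql.rds_set_external_master stored procedure; newer RDS versions may use a differently named procedure, so check the docs for your engine version. Host, user, and password values are made up.

```python
def parse_binlog_info(text):
    """Extract (binlog_file, position) from xtrabackup_binlog_info contents."""
    fields = text.split()
    return fields[0], int(fields[1])


def sql_quote(value):
    """Quote a string as a MySQL string literal (backslash and quote escaped)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def external_master_call(host, port, user, password, log_file, log_pos, use_ssl=False):
    """Render the CALL statement to run on the RDS instance.

    Argument order assumed: host, port, user, password,
    binlog file, binlog position, ssl flag.
    """
    return "CALL mysql.rds_set_external_master({}, {}, {}, {}, {}, {}, {});".format(
        sql_quote(host),
        int(port),
        sql_quote(user),
        sql_quote(password),
        sql_quote(log_file),
        int(log_pos),
        1 if use_ssl else 0,
    )
```

After running the generated statement on the RDS instance, replication is started with CALL mysql.rds_start_replication; and the instance catches up from the recorded position.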

Create instance from backup on google cloud sql

I have two questions about Cloud SQL backups:
1. Are backups removed together with the instance, or are they kept for some days?
2. If not, is it possible to create a new instance from a backup of an instance that is already gone?
I would expect this to be possible, but it looks like backups can only be listed under a specific instance and there is no option to start a new instance from an existing backup.
Regarding (2): It's actually possible to recover them if you are quick enough. They should still be there, even when Google says they're deleted.
If you know the name of the deleted DB, run the following command to check if they are still there:
gcloud sql backups list --instance=deleted-db-name --project your-project-name
If you can see any results, you are lucky. Restore them ASAP!
gcloud sql backups restore <backup-ID> --restore-instance=new-db-from-scratch-name --project your-project
And that's it!
Further info: https://geko.cloud/gcp-cloud-sql-how-to-recover-an-accidentally-deleted-database/
Extracted from Google Cloud SQL - Backups and recovery:
"Restoring from a backup restores to the instance from which the backup was taken."
So the answer to (1) is that they're gone, and with regard to (2), if you didn't export a copy of the DB to Cloud Storage, then no, you can't recover the content of your deleted Cloud SQL instance.
I noticed a change in this behavior recently (July 28, 2022). Part of our application update process was to run an on-demand backup on the existing deployment, tear down our stack, create a new stack, and then populate the NEW database from the contents of the backup.
Until now, this worked perfectly.
However, as of today, I'm unable to restore from the backup since the original database (dummy-db-19e2df4f) was deleted when we destroyed the old stack. Obviously the workaround is to not delete our original database until the new one has been populated, but this apparent change in behavior was unexpected.
Since the backup is listed, it seems like there are some "mixed messages" below.
List the backups for my old instance:
$ gcloud sql backups list --instance=- | grep dummy-db-19e2df4f
1659019144744 2022-07-28T14:39:04.744+00:00 - SUCCESSFUL dummy-db-19e2df4f
1658959200000 2022-07-27T22:00:00.000+00:00 - SUCCESSFUL dummy-db-19e2df4f
1658872800000 2022-07-26T22:00:00.000+00:00 - SUCCESSFUL dummy-db-19e2df4f
1658786400000 2022-07-25T22:00:00.000+00:00 - SUCCESSFUL dummy-db-19e2df4f
Attempt a restore to a new instance (that is, replacing the contents of new-db-13d63593 with that of the backup/snapshot 1659019144744). Until now this worked:
$ gcloud sql backups restore 1659019144744 --restore-instance=new-db-13d63593
All current data on the instance will be lost when the backup is
restored.
Do you want to continue (Y/n)? y
ERROR: (gcloud.sql.backups.restore) HTTPError 400: Invalid request: Backup run does not exist..
(uh oh...)
Out of curiosity, ask it to describe the backup:
$ gcloud sql backups describe 1659019144744 --instance=dummy-db-19e2df4f
ERROR: (gcloud.sql.backups.describe) HTTPError 400: Invalid request: Invalid request since instance is deleted.
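If the backup ID is picked by a script, as in an update process like this one, the lookup could be sketched as below. The column layout (ID, start time, error, status, instance) is taken from the listing output shown above, and the function name is made up. It only selects an ID and does not change the restore behavior described here.

```python
def latest_successful_backup(listing, instance):
    """Return the ID of the newest SUCCESSFUL backup for `instance`,
    or None if there is none. `listing` is the text printed by
    `gcloud sql backups list`, one backup per line."""
    best = None  # (backup_id, start_time)
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or unexpected lines
        backup_id, start, _error, status, name = parts
        if name != instance or status != "SUCCESSFUL":
            continue
        # Timestamps share one format and offset, so string order is time order.
        if best is None or start > best[1]:
            best = (backup_id, start)
    return best[0] if best else None
```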

Copy tomcat and mysql from one Amazon EBS volume to another

I launched an Amazon EC2 with Amazon Linux and Amazon-EBS as root volume. I also started tomcat7 and mysql 5.5 on this EBS volume.
Later I decided to change from Amazon Linux to Ubuntu. To do that I need to launch another Amazon EC2 instance with a new EBS root volume. Now I want to copy tomcat7 and mysql from the older EBS volume to the new one. I have tables and data in mysql which I don't want to lose, and an application running on tomcat. How do I go about it?
A couple of thoughts and suggestions.
First, if you are going to be having any kind of significant load on your database, running it on an EBS-backed volume is probably not a great idea, as EBS-backed storage is incredibly slow relative to the machine's local/ephemeral storage (/mnt). Now obviously you don't want DB data on ephemeral storage, so there is really nothing you can do about it if you want to run MySQL on EC2. So my suggestion would be to utilize an RDS instance for your DB if your infrastructure requirements allow for it.
Second, if this is a production application, you are undoubtedly going to have some down time as you make this transition. The question is whether you need to absolutely minimize the amount of downtime. If so, then you need to have an idea as to the size of your database. Is it going to take a long time to dump/load? If not, you could probably just get your new instance up and running, and tested on an older copy of your database and then just dump and load the current database at the time of cutover.
If it is a large database then perhaps you can turn on MySQL binary logging. Then make a dump of the database at a known binary log position. Then install this dump on your new instance. Then when ready to cutover, you can replay the binary logs on the new instance to bring it current. Similarly, you could just set up the DB on the new instance as a replica until the cutover, at which point you make it the master.
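To make "a dump at a known binary log position" concrete, here is a small sketch. It assumes the dump was taken with mysqldump's --master-data option, which writes a CHANGE MASTER TO line with the coordinates near the top of the dump; newer MySQL versions may word that line differently. The function names are made up.

```python
import re

# Line written near the top of the dump by mysqldump --master-data (assumed form).
COORDS = re.compile(
    r"CHANGE MASTER TO MASTER_LOG_FILE='([^']+)',\s*MASTER_LOG_POS=(\d+)"
)


def dump_coordinates(dump_head):
    """Return (binlog_file, position) recorded in the dump, or None."""
    m = COORDS.search(dump_head)
    if m is None:
        return None
    return m.group(1), int(m.group(2))


def mysqlbinlog_replay_args(log_file, start_pos):
    """Argument list for replaying the binlog from the dump's position."""
    return ["mysqlbinlog", f"--start-position={start_pos}", log_file]
```

The output of that mysqlbinlog command would be piped into the mysql client on the new instance. The same coordinates are what the new instance needs if it is set up as a replica instead.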
You may even consider just using rsync to sync the physical database files if you don't want to mess with binary logging, though this can be a problematic approach if you are not that familiar with dealing with the actual physical database files.
As far as your application goes, that should be much simpler to migrate assuming it is just a collection of files. I would not copy the Tomcat7 installation itself, but rather just install Tomcat on Ubuntu and then adjust the configuration to match current.
As far as the cutover itself goes, this should be pretty straightforward and would vary in approach depending on whether you are using an elastic IP for your server or whether it is behind a load balancer.

How to find the location of automatic S3 backups generated by Amazon RDS?

I've read that Amazon RDS automatically backs up your database to S3. I'm wondering how I can actually see these backups and their contents.
The reasons I want to see them are:
I'm paranoid, am new to the service and haven't experienced a failure to know that the backup process will actually work.
I've read that the backups don't work if some of your tables are MyISAM. That is, I need to have all my tables be InnoDB and not a mix of the two for the S3 backups to be generated.
Does anyone have more information on this?
No, you can't access the S3 bucket RDS uses. If you'd like to confirm that the backups work, you can restore to a point in time via the RDS console. This creates a new RDS instance that uses the backup for that point in time.