I have hosted my MySQL database on Amazon RDS, and it has a last automated snapshot (e.g. from yesterday at midnight). The situation is that I have accidentally deleted some records from a very important table and would like to recover them. I have no additional backup since yesterday's midnight snapshot (as mentioned earlier). How should I recover the data without taking any downtime?
How do I use point-in-time data recovery?
If anyone needs more information, let me know, and sorry for my poor explanation.
Point in time recovery allows you to create an additional RDS instance, based on the data as it existed on your instance at any specific point in time you choose between the oldest available automated backup and approximately 5 minutes ago. All you have to do is select what date and time you need.
There is no disruption or change to your running instance.
The process creates a new instance; you connect to it, collect the data you need to get your production system back the way it should be, and then destroy the new instance. Or, depending on what went wrong, you could instead switch your application to this new instance and destroy the old one, though it seems unlikely that this is what you would want to do. But you can do either.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PIT.html
I have no additional backup since yesterday midnight.
Point in time recovery doesn't care. RDS preserves the snapshots as well as a complete, timestamped log of everything that changed between the snapshots. These logs are archived in an area that is not accessible to you... but they are there. RDS will automatically load the most recent snapshot that is earlier than the point-in-time you select, and then use the logs to roll the new instance's data forward in time to the target time. When the process is complete, your new instance will contain exactly the data that was present on the old instance at the point in time that you selected.
Let's assume I have one table containing 10 records at midnight, which exist in the backup/snapshot.
Stop thinking about what is in the snapshot. It doesn't matter.
The next day I added 5 more records at 10:00. Half an hour later (at 10:30) I accidentally deleted 2 of them.
Perform a point in time recovery, selecting any point between 10:00 and 10:30 -- a point in time when the records were in your database.
Point in time recovery creates a new instance, which contains all your data exactly as it existed at the time you selected. Connect to that new instance manually, fetch the missing rows, insert them back into your live/main/production database, and then the newly-created instance can be destroyed because it is no longer needed.
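As a rough sketch of that last step (the table name, columns, and IDs below are made up for illustration): query the restored instance to pull out the rows you lost, then insert those values into the same table on your production instance.

-- Against the restored (point-in-time) instance: fetch the rows that were deleted.
SELECT id, col1, col2
FROM important_table
WHERE id IN (12, 14);

-- Against the live/production instance: put the fetched values back.
INSERT INTO important_table (id, col1, col2)
VALUES
  (12, 'value-a', 'value-b'),
  (14, 'value-c', 'value-d');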
Do not assume this process is complicated or difficult.
I want to set the binary log retention in AWS RDS for MySQL. It looks like there are 2 places I can do this:
Via a procedure call:
CALL mysql.rds_set_configuration(name,value);
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/mysql_rds_set_configuration.html
Or via the parameter group.
If the values are different, which takes precedence?
Side question: the value in the parameter group is the default one AWS RDS sets (30 days). Is there some way I can know/measure how long I would need the binary log for?
I don't know which one takes precedence, but you should be able to check that yourself easily by using mysql.rds_show_configuration(). Change the parameter group setting, and then check if that affects the value when you show configuration using the procedure.
I have an educated guess that the parameter group is like changing the persistent setting in the my.cnf file, and changing the setting using the procedure is like using SET GLOBAL, which only lasts until you restart the instance. Then it reads the persistent setting from the parameter group and forgets that you changed it with the procedure.
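One way to run that check (a sketch; the 48 hours below is just a placeholder value):

-- See the value the instance is currently using for 'binlog retention hours' (and other RDS settings).
CALL mysql.rds_show_configuration;

-- Set it explicitly through the procedure; the value is in hours.
CALL mysql.rds_set_configuration('binlog retention hours', 48);

-- Now change the corresponding setting in the parameter group from the console or CLI,
-- wait for it to apply (rebooting if required), then call rds_show_configuration again
-- to see which of the two values the instance actually reports.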
Is there some way I can know/measure how long I would need the binary log for?
You can't know for certain how much binlog retention you need.
That statement needs some explanation.
Binary logs are used for:
Point in time recovery. For this to work, you need a full backup, and enough binlogs to replay events since the backup. So you need binlog retention only to the most recent backup. The frequency of backups is up to you. If too much time passes and there isn't a continuous set of binlogs to do PITR, then no problem — just create a new full backup.
Replication. Normally, a replica downloads binlogs nearly immediately, so the binlog retention can be very low. But if a replica is offline for some time, it's good that the binlogs stay on the source instance. So the binlog retention has to be long enough so there aren't any binlogs missed when the replica comes back online. If the replica misses any binlogs because they expired, then the replica cannot catch up. It must be wiped and reinitialized from a new backup.
CDC using tools like Debezium. This follows a similar pattern to replication. If the CDC job runs periodically, there needs to be enough binlog retention to cover the interval between runs. This could be seconds or days; it's up to you to determine how often it runs.
So how can the source instance know if a replica or a CDC client has disconnected, when it might come back to download more? It can't know. Perhaps those clients were decommissioned and will never reconnect. The source instance storing the binlogs has no way of knowing.
So it's up to you to know things like how long your replica will be offline.
Or stated another way, if you have X days of binlog retention, it's up to you to make sure the replica is back up before the binlogs it needs start expiring.
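A rough way to check where you stand (a sketch; the exact statement and column names vary by MySQL version) is to compare what the replica still needs against what the source still has:

-- On the source: list the binlog files that are still available.
SHOW BINARY LOGS;

-- On the replica: see which source binlog file it is currently executing
-- (on older MySQL versions this is SHOW SLAVE STATUS).
SHOW REPLICA STATUS\G

-- On an RDS source, also check the configured retention window.
CALL mysql.rds_show_configuration;

If the binlog file the replica still needs no longer appears in the source's SHOW BINARY LOGS output, the replica cannot catch up and has to be reinitialized from a fresh backup.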
If you can't do that, then the replica needs to be reinitialized. At my last DBA job, we had so many replicas offline for days due to server failures that we had to reinitialize replicas at least once a week.
I have a situation where, during peak moments, my writer database becomes maxed out even on the largest 96-core AWS instance (due to limited-edition promotions where we process hundreds of orders per second).
I have seen that Aurora offers a multi-master setup where all nodes of the cluster are able to write - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html
In the docs they mention:
If two DB instances attempt to modify the same data page at almost the same instant, a write conflict occurs. The earliest change request is approved using a quorum voting mechanism. That change is saved to permanent storage. The DB instance whose change isn't approved rolls back the entire transaction containing the attempted change. Rolling back the transaction ensures that data is kept in a consistent state, and applications always see a predictable view of the data. Your application can detect the deadlock condition and retry the entire transaction.
I am not really sure what they mean here by "data page". I am pretty sure WordPress doesn't use transactions at all, but when thousands of orders are coming in and being pushed into the same table, will this cause write errors that make orders fail?
I have looked online and cannot find anyone talking about using WordPress with Aurora multi-master cluster. Is it compatible?
I am looking into migrating my MySQL DB to Azure Database for MySQL https://azure.microsoft.com/en-us/services/mysql/. It currently resides on a server hosted by another company. The DB is about 100 GB. (It worries me that Azure uses the term "relatively large" for 1GB.)
Is there a way to migrate the DB without any or little (a few hours, max) downtime? I obviously can't do a dump and load as the downtime could be days. Their documentation seems to be for syncing with a MySQL server that is already on a MS server.
Is there a way to export the data out of MS Azure if I later want to use something else, again without significant downtime?
Another approach: use Azure Data Factory to copy the data from your MySQL source to your Azure DB. Set up a sync procedure that updates your Azure database with new rows. Sync, take the MySQL DB offline, sync once more, and switch to the Azure DB.
See Microsoft online help
Don't underestimate the complexity of this migration.
With 100GB, it's a good guess that most rows in your tables don't get UPDATEd or DELETEd.
For my suggestion here to work, you will need a way to
SELECT * FROM table WHERE (the rows are new or updated since a certain date)
Some INSERT-only tables will have autoincrementing ID values. In this case you can figure out the ID cutoff value between old and new. Other tables may be UPDATEd. Unless those tables have timestamps saying when they were updated, you'll have a challenge figuring it out; you need to understand your data to do that. It's OK if your WHERE (new or updated) operation takes some extra rows that are older. It's NOT OK if it misses INSERTed or UPDATEd rows.
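For illustration, the two patterns look something like this (the table names, ID cutoff, and timestamp column are all hypothetical):

-- INSERT-only table with an auto-increment key: record the highest ID at mass-migration time,
-- then fetch everything above it later.
SELECT * FROM orders WHERE order_id > 1500000;

-- Table that gets UPDATEd but carries a last-modified timestamp:
SELECT * FROM customers WHERE updated_at >= '2020-06-01 00:00:00';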
Once you know how to do this for each large table, you can start migrating.
Mass migration: Keeping your old system online and active, you can use mysqldump to migrate your data to the new server. You can take as long as you require to do it. Read this for some suggestions: getting Lost connection to mysql when using mysqldump even with max_allowed_packet parameter
Then, you'll have a stale copy of the data on the new server. Make sure the indexes are correctly built. You may want to use OPTIMIZE TABLE on the newly loaded tables.
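For example, after the bulk load you might run something like this (table names are hypothetical):

-- Rebuild the tables and their indexes, and refresh the optimizer statistics.
OPTIMIZE TABLE orders;
ANALYZE TABLE customers;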
Update migration: You can then use your WHERE (the rows are new or updated) queries to migrate the rows that have changed since you migrated the whole table. Again, you can take as long as you want to do this, keeping your old system online. It should take much less time than your first migration, because it will handle far fewer rows.
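When loading those changed rows into the new server, an upsert keeps the step safe to re-run in case your WHERE clause picks up rows you already migrated (table and column names are again hypothetical):

-- Existing rows are updated in place; new rows are inserted.
INSERT INTO customers (customer_id, name, email, updated_at)
VALUES (42, 'Jane Doe', 'jane@example.com', '2020-06-02 03:04:05')
ON DUPLICATE KEY UPDATE
  name = VALUES(name),
  email = VALUES(email),
  updated_at = VALUES(updated_at);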
Final migration, offline: Finally, you can take your system offline and migrate the remaining rows, the ones that changed since your last migration. And migrate your small tables in their entirety, again. Then start your new system.
Yeah but, you say, how will I know I did it right?
For best results, you should script your migration steps, and use the scripts. That way your final migration step will go quickly.
You could rehearse this process on a local server on your premises. While 100GiB is big for a database, it's not an outrageous amount of disk space on a desktop or server-room machine.
Save the very large extracted files from your mass migration step so you can re-use them when you flub your first attempts to load them. That way you'll save the repeated extraction load on your old system.
You should stand up a staging copy of your migrated database (at your new cloud provider) and test it with a staging copy of your application. You may be able to do this with a small subset of your rows. But do test your final migration step with this copy to make sure it works.
Be prepared for a fast rollback to the old system if the new one goes wrong.
AND, maybe this is an opportunity to purge out some old data before you migrate. This kind of migration is difficult enough that you could make a business case for extracting and then deleting old rows from your old server, before you start migrating.
We have a SQL Server 2008 R2 database whose transaction logs are backed up every now and then. Today there was a big error in the database at around 12:00... I have transaction log backups up to 8:00, and then at 12:00, 16:00, etc.
My question is: can I sort of reverse-merge those transaction logs into the database, so that I return the database to its state at 8:00?
Or is my best option to restore an older full backup and then restore all transaction logs up to 8:00?
The first option would be preferable, since the full backup was performed quite a while ago and I am afraid of messing things up by restoring from it and applying the transaction logs. Am I worrying over nothing? Is it actually possible for anything bad to happen with that scenario (restoring the full backup and applying the transaction log backups)?
The fact that you don’t create regular transaction log backups doesn’t affect the success of the recovery process. As long as your database is in the Full recovery model, the transactions are stored in the online transaction log and kept in it until a transaction log backup is made. If you make a transaction log backup later than usual, it only means that the online transaction log may grow and that the backup might be bigger. It will not cause any transaction history to be lost.
With a complete chain of transaction log backups back to 8 AM, you can successfully roll back the whole database to a point in time.
As for restoring the full backup and applying the transaction log backups – nothing should go wrong, but it's always recommended to test the scenario on a test server first, and not directly in production.
To restore to a point in time:
In SSMS expand Databases
Right-click the database and select Tasks | Restore | Database
On the General tab, the available backups will be listed under Backup sets. Click Timeline
Select Specific date and time, change the Time interval to show a wider time range, and move the slider to the time you want to roll back to
You can find more detailed instructions here: How to: Restore to a Point in Time (SQL Server Management Studio)
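If you would rather script it than use the SSMS timeline, the equivalent T-SQL looks roughly like this (the database name, file paths, and STOPAT time are placeholders):

-- Restore the last full backup taken before the error, without recovering yet.
RESTORE DATABASE MyDb
  FROM DISK = N'D:\Backups\MyDb_full.bak'
  WITH NORECOVERY, REPLACE;

-- Apply each transaction log backup in sequence; STOPAT stops at the chosen point in time.
RESTORE LOG MyDb
  FROM DISK = N'D:\Backups\MyDb_log_0800.trn'
  WITH NORECOVERY, STOPAT = '2014-05-14 08:00:00';

-- Bring the database online once the last required log backup has been applied.
RESTORE DATABASE MyDb WITH RECOVERY;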
Keep in mind that this process will roll back all changes made to the database. If you want to roll back only specific changes (e.g. only recover some deleted data, or reverse wrong updates), I suggest a third party tool, such as ApexSQL Log
Reverting your SQL Server database back to a specific point in time
Restore a database to a point in time
Disclaimer: I work for ApexSQL as a Support Engineer
I've been asked for a quick turnaround on this. The group I'm assisting has a .MDB database used by offsite workers who don't have internet access all the time. Thus, way back, the team implemented an Access DB which allows for synchronization.
As their team grew bigger they started running into the following issues:
Remote syncing – when a user tries to sync from a worksite, more often than not the database will crash, either due to loss of wireless signal, the program timing out, or the inspector manually shutting it down because it is taking too long (i.e., 30 or more minutes)
Multiple syncers – we are unable to sync more than one user at a time (there are currently 34 users in 3 different territories). If someone is syncing and another person tries to sync at the same time, the second user ends up with an error message and has to shut down their DB and try to sync later.
Incomplete syncs – sometimes when a worker syncs his/her DB, not all the line items copy over to the master file, which can cause confusion during review.
Are there any workarounds or items I can look into to resolve these?
I have few resources and little time, so anything involving a new server might not work.
Thanks
It sounds as though you are mainly adding new data from different field operatives rather than everyone updating existing data. If this is the case, then that's good, and you could try the following:
Ensure all the tables use "Replication IDs" for the primary keys, as this ensures no two operatives create conflicting records.
The synchronisation process should then be amended to export a snapshot of said table(s) to a .txt file on the operative's machine, and then transfer this file back to the source machine.
Then, at the end of the day (or more often if required), the master copy should be set up to import the new data from all the text files it has received. As there will be no conflicting primary keys you should be OK; just remember to insert only those rows whose primary key is not already in the table.
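The import step on the master copy can be a simple Access append query that skips rows already present. The table and field names below are made up; the LEFT JOIN / IS NULL pattern is what ensures only new primary keys are inserted.

INSERT INTO Inspections ( InspectionID, SiteID, InspectedOn, Notes )
SELECT i.InspectionID, i.SiteID, i.InspectedOn, i.Notes
FROM Inspections_Import AS i
LEFT JOIN Inspections AS m ON i.InspectionID = m.InspectionID
WHERE m.InspectionID IS NULL;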
Hope all that makes sense : )