I am working on an application for which we need a disaster recovery plan. We currently use RDS to host the db and have 2 hourly backups running (we do not use Aurora but have plans to upgrade in future).
If the database somehow got deleted we want to make sure the backup we will be recovering from is current and therefore need some way of telling that.
One way is to save a heartbeat in the db at regular intervals; then I can check the latest heartbeat in a backup against what is expected.
I was wondering if anyone may have any other ways of solving this issue?
Assuming this MySQL database is connected to your web application, you could have a server-side thread in your application which periodically writes a heartbeat record to the database. You could create a special heartbeat table to store the heartbeats. Then you can easily examine any backup and know roughly the last time the database was "alive."
I am not an expert in AWS, and there may be another way of doing this which is easier than what I described above.
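A minimal sketch of the heartbeat idea, in case it helps. Everything named here is an assumption for illustration: the `heartbeat` table, the function names, and the use of `sqlite3` as a stand-in for MySQL so the snippet runs anywhere. With a MySQL driver the approach would be the same, though the placeholder style is usually `%s` rather than `?`.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: sqlite3 stands in for MySQL; the table name "heartbeat" is hypothetical.

def init_heartbeat(conn):
    """Create the heartbeat table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS heartbeat ("
        "id INTEGER PRIMARY KEY, beat_at TEXT NOT NULL)"
    )

def write_heartbeat(conn, now=None):
    """Record one heartbeat; call this from a periodic background task."""
    now = now or datetime.now(timezone.utc)
    # Fixed-width UTC timestamps sort correctly as plain strings.
    conn.execute(
        "INSERT INTO heartbeat (beat_at) VALUES (?)",
        (now.isoformat(timespec="seconds"),),
    )
    conn.commit()

def last_heartbeat(conn):
    """Return the newest heartbeat as a datetime, or None if there is none."""
    row = conn.execute("SELECT MAX(beat_at) FROM heartbeat").fetchone()
    return datetime.fromisoformat(row[0]) if row[0] else None

def backup_is_current(conn, backup_taken_at, max_gap):
    """True if the newest heartbeat is within max_gap of the backup time."""
    beat = last_heartbeat(conn)
    return beat is not None and backup_taken_at - beat <= max_gap
```

You would run `backup_is_current` against a restored copy of the backup, passing the time the backup was taken and a tolerance slightly larger than the heartbeat interval.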
Related
I am using a t2.large RDS instance and want to downgrade to a t2.micro to fit my current business. I have a question:
- How can I downgrade the RDS instance without losing data and without downtime?
Thanks,
You can't really do it without downtime, but you could minimize the downtime.
The easiest option is to Modify the DB instance. This will result in downtime because a new database will be provisioned, the data will be relocated and the DNS name will be changed to point to the new instance.
Seeing that you believe a t2.micro will be sufficient for your database, it would be fair to assume that there would be times when your database is not in use so that you can perform the Modify operation. It should only take a few minutes.
Officially, the best way to modify a database without downtime is to use Multi-AZ, which can update one node while traffic is still being served by another node. However, your goal seems to be to reduce cost, rather than spending more to ensure uptime.
By the way, a t2.micro is quite limited in terms of CPU and network bandwidth. You are trying to save 21c per day, at the potential cost of having a poorly-responding database.
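For what it's worth, the Modify operation can also be scripted. This is only a sketch: it assumes `rds` is a boto3 RDS client (for example `boto3.client("rds")`), and the function name and default instance class are mine, chosen for illustration.

```python
def downsize_in_place(rds, instance_id, new_class="db.t2.micro"):
    """Ask RDS to change the instance class now (expect a short outage).

    Assumption: `rds` is a boto3 RDS client, e.g. boto3.client("rds").
    """
    return rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=new_class,
        # Without this, the change waits for the next maintenance window.
        ApplyImmediately=True,
    )
```

Taking a manual snapshot before running it is a sensible precaution.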
You can consider creating a read replica (t2.micro) of the master instance (t2.large). Once the read replica is in sync with the master instance, you can promote the read replica and then point the application towards the new master instance (which is the promoted read replica).
For reference, see:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_MySQL.Replication.ReadReplicas.html
https://aws.amazon.com/blogs/aws/amazon-rds-for-mysql-promote-read-replica/
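The replica-then-promote steps could be scripted roughly as below. Treat it as a sketch under assumptions: `rds` is a boto3 RDS client, the identifiers are placeholders, and the function name is hypothetical. It does not check replication lag or repoint the application; both are left to you.

```python
def downsize_via_replica(rds, source_id, replica_id, replica_class="db.t2.micro"):
    """Create a smaller read replica, wait for it, then promote it.

    Assumption: `rds` is a boto3 RDS client, e.g. boto3.client("rds").
    """
    # 1. Create the smaller replica from the current (larger) instance.
    rds.create_db_instance_read_replica(
        DBInstanceIdentifier=replica_id,
        SourceDBInstanceIdentifier=source_id,
        DBInstanceClass=replica_class,
    )
    # 2. Block until the replica instance reports "available".
    rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=replica_id)
    # NOTE: "available" does not prove the replica has caught up. Stop writes
    # to the source and confirm replication lag is zero before promoting.
    # 3. Promote the replica to a standalone instance.
    rds.promote_read_replica(DBInstanceIdentifier=replica_id)
    return replica_id
```

After promotion the new instance has its own endpoint, so the application's connection string needs to be updated to point at it.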
I have a system with a MySQL database to which changes are made. Other machines connect to this database every ten minutes or so and re-download the tables that concern them (for example, one machine might download tables A, B, C, while another might download tables A, D, E).
Without using Debezium or Kafka, is there a way to get all MySQL changes made after a certain timestamp, so that only those changes are sent to a machine requesting updates, instead of the whole tables? For example, machine X might want all MySQL changes made since it last contacted the database, and then apply those changes to its own old data to bring it up to date.
Is there some way to do this?
MySQL can be set up to replicate databases, tables, etc. automatically. If the connection is lost, the replica will catch up when the connection is restored.
Take a look at this page MySQL V5.5 Replication, or this one MySQL V8.0 Replication
You can use Debezium as a library embedded into your application if you don't want to, or can't, deploy a Kafka cluster.
Alternatively, you could directly use the MySQL Binlog Connector (it's used by the Debezium connector underneath, too), it lets you read the binlog from given offset positions. But then you'd have to deal with many things yourself, which are handled by solutions such as Debezium already, e.g. the correct handling of schema metadata and changes (e.g. additions of columns to existing tables). Usually this involves parsing the MySQL DDL, which by itself is quite complex.
Disclaimer: I'm the lead of Debezium
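To make the "read from a given offset" idea concrete, here is a toy model. It is not the Binlog Connector or Debezium API. The `Change` record, the in-memory log, and both functions are invented for illustration, and it deliberately ignores schema changes, which are the hard part mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Change:
    position: int          # increasing log position (analogous to a binlog offset)
    table: str
    key: int
    row: Optional[dict]    # None means the row was deleted

def changes_since(log, position, tables):
    """Return changes after `position` that touch one of `tables`."""
    return [c for c in log if c.position > position and c.table in tables]

def apply_changes(local, changes, last_position):
    """Apply changes to a local copy {table: {key: row}}.

    Returns the new last-seen position, which the client should persist
    so the next poll only asks for newer changes.
    """
    for c in changes:
        rows = local.setdefault(c.table, {})
        if c.row is None:
            rows.pop(c.key, None)
        else:
            rows[c.key] = c.row
        last_position = c.position
    return last_position
```

Each machine keeps its own last position and its own table list, so machine X can ask for tables A, B, C while machine Y asks for A, D, E.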
I am looking for suggestions on the best way to sync MySQL tables (MyISAM) between 2 different databases.
Currently we use Navicat to sync tables from our production server to our test server, but we have been running into many problems. Just about every day a sync fails on some table.
We get the error below a lot of the time, not to mention that Navicat spams our e-mail with both successful and unsuccessful syncs (is there any way to receive only the unsuccessful ones?). I also know that altering a table in any way will cause the sync to fail, so any alteration must be made on the master first (this makes sense, but is there any way around it?).
-[Sync] Finished - Unsuccessful Synchronization: List index out of bounds (0)
Is there any reason not to use the Navicat sync? My boss suggested using MySQL replication instead, but my first concern is finding out why we have so many problems, because it seems we are simply misusing the sync.
Thanks.
"sync tables from our production server to our test server"
It sounds like you're trying to replicate your production environment in your test environment, right?
A common pattern to follow in this situation is using a tool like mysqldump to create a backup of the entire database, then later import the backup into the test environment. By doing a complete backup and restore, you're not only ensuring that you have at least one backup method that's known to work, you're also ensuring that the test database can never contain modifications that a sync tool might miss. (Sync tools generally require a primary or unique key on each table to operate effectively.)
The backup and reimport process should be an easy thing for you to automate. At my workplace, we perform a mysqldump-based database dump every night, and perform optional imports into each developer's personal copy of the dev environment early the following morning.
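A rough sketch of how the dump-and-reimport could be automated. The host names, user, and function names are placeholders, and it assumes credentials come from a MySQL option file (such as `~/.my.cnf`) rather than the command line.

```python
import subprocess

def dump_command(host, user, database, dump_path):
    """Build the mysqldump invocation that writes the dump to dump_path."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        f"--result-file={dump_path}",
        database,
    ]

def import_command(host, user, database):
    """Build the mysql client invocation; the dump is fed in on stdin."""
    return ["mysql", f"--host={host}", f"--user={user}", database]

def refresh_test_db(prod_host, test_host, user, database, dump_path):
    """Dump production, then load the dump into the test server."""
    subprocess.run(dump_command(prod_host, user, database, dump_path), check=True)
    with open(dump_path, "rb") as dump:
        subprocess.run(
            import_command(test_host, user, database), stdin=dump, check=True
        )
```

Scheduling `refresh_test_db` from cron (or any job scheduler) gives you the nightly refresh, and `check=True` makes a failed dump or import raise instead of passing silently.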
I've got a Java web service backed by MySQL + EC2 + EBS. For data integrity I've looked into DRBD, MySQL Cluster etc. but wonder if there isn't a simpler solution. I don't need high availability (I can handle downtime).
There are only a few operations whose data I need to preserve -- creating an account, changing password, purchase receipt. The majority of the data I can afford to recover from a stale backup.
What I am thinking is that I could pipe selected INSERT/UPDATE commands to storage (S3, SimpleDB for instance) and when required (when the db blows up) replay these commands from the point of last backup. And wouldn't it be neat if this functionality was implemented in the JDBC driver itself.
Is this too silly to work, or am I missing another obvious and robust solution?
Have you looked into moving your MySQL into Amazon Web Services as well? You can use Amazon Relational Database Service (RDS). Also see MySQL Enterprise Support.
You always have a window where total loss of a server and associated file storage will result in some amount of lost data.
When I ran a modestly busy SaaS solution in AWS, I had a MySQL Master running on a large instance and a MySQL Slave running on a small instance in a different availability zone. The replication lag was typically no more than 2 seconds, though a surge in traffic could take that up to a minute or two.
If you can't afford losing 5 minutes of data, I would suggest running a Master/Slave setup over rolling your own recovery mechanism. If you do roll your own, ensure the "stale" backups and the logged/journaled critical data are in a different availability zone. AWS has lost entire zones before.
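If you do roll your own, the journal-and-replay idea from the question might look something like this. It is a sketch only: a local JSON-lines file stands in for S3 or SimpleDB, and the function names and entry layout are assumptions, not an existing library.

```python
import json
from datetime import datetime, timezone

def journal_statement(journal_path, sql, params, at=None):
    """Append one critical statement (with its parameters) to the journal."""
    at = at or datetime.now(timezone.utc)
    entry = {"at": at.isoformat(), "sql": sql, "params": list(params)}
    with open(journal_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

def statements_since(journal_path, backup_taken_at):
    """Return (sql, params) pairs journaled after the backup was taken."""
    pending = []
    with open(journal_path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if datetime.fromisoformat(entry["at"]) > backup_taken_at:
                pending.append((entry["sql"], entry["params"]))
    return pending
```

After restoring the stale backup you would execute the returned statements in order. Note that this only works if the journaled statements are safe to replay against the restored data (for example, they do not depend on auto-generated ids that no longer match).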
I'm setting up SQL Server 2008 on a production server. What is the best way to back up its data? Should I use replication and then back up the replica? Should I just use a simple command-line script and export the data? Which replication method should I use?
The server is going to be pretty loaded, so I need an efficient method.
I have access to multiple computers that I can use.
A very simple yet good solution is to run a full backup using sqlcmd (formerly osql) locally, then copy the BAK file over the network to a NAS or other store. It's sub-optimal in terms of network/disk usage, but it's very safe because every backup is independent and given that the process is very simple it is also very robust.
Moreover, this even works in Express editions.
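As a sketch of that approach, the backup and copy can be wrapped in a small script. The server name, paths, and function names are placeholders, and Windows authentication (`-E`) is an assumption about your setup.

```python
import shutil
import subprocess

def backup_command(server, database, bak_path):
    """Build a sqlcmd call that runs a full BACKUP DATABASE to bak_path."""
    # WITH INIT overwrites any existing backup sets in the target file.
    query = f"BACKUP DATABASE [{database}] TO DISK = N'{bak_path}' WITH INIT"
    # -S server, -E Windows authentication, -b exit with an error code on
    # failure, -Q run the query and exit.
    return ["sqlcmd", "-S", server, "-E", "-b", "-Q", query]

def backup_and_copy(server, database, bak_path, remote_dir):
    """Run the full backup locally, then copy the BAK file to remote_dir."""
    subprocess.run(backup_command(server, database, bak_path), check=True)
    return shutil.copy2(bak_path, remote_dir)
```

Here `remote_dir` would be a share on the NAS; because `check=True` raises when sqlcmd fails, a broken backup is never copied over a good one.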
The "best" backup solution depends upon your recovery criteria.
If you need immediate access to the data in the event of a failure, a three server database mirroring scenario (live, mirror and witness) would seem to fit - although your application may need to be adapted to make use of automatic failover. "Log shipping" may produce similar results (although without automatic failover, or need for a witness).
If, however, there's some wiggle room in the recovery time, regular scheduled backups of the database (e.g., via SQL Agent) and its transaction logs will allow you to do point-in-time restores. The frequency of backups would be determined by database size, how frequently the data is updated, and how far you are willing to roll back the database in the event of complete failure (unless you can extract a transaction log backup out of a failed server, you can only recover to the latest backup).
If you're looking to simply roll back to known-good states after, say, user error, you can make use of database snapshots as a lightweight "backup" scenario - but these are useless in the event of server failure. They're near instantaneous to create, and only take up room as the data changes - but incur a slight performance overhead.
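To illustrate how full and log backups combine for a point-in-time restore, here is a simplified model. It assumes an unbroken log chain, treats each backup as a single completion time, and ignores differential backups; the function name and return shape are invented for the example.

```python
def restore_chain(full_backups, log_backups, target):
    """Pick the backups needed to restore to `target`.

    full_backups and log_backups are lists of datetimes (completion times).
    Returns (base_full, logs_to_restore, reachable_time), or None when no
    full backup exists at or before `target`.
    """
    fulls = [f for f in full_backups if f <= target]
    if not fulls:
        return None
    base = max(fulls)  # most recent full backup not later than the target
    chain = []
    for log in sorted(l for l in log_backups if l > base):
        chain.append(log)
        if log >= target:
            # This log backup covers `target`; restore it up to that point.
            return base, chain, target
    # No log backup reaches `target`: the best we can do is the last log
    # backup, or the full backup itself if there are no later logs.
    return base, chain, (chain[-1] if chain else base)
```

The third return value shows the point made above: if the server is lost before the next log backup is taken, the restore can only reach the latest backup you already have.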
Of course, these aren't the only backup solutions, nor are they mutually exclusive - just the ones that came to mind.