MySQL dual master replication -- is this scenario safe? - mysql

I currently have a MySQL dual master replication (A<->B) set up and everything seems to be running swimmingly. I drew on the basic ideas from here and here.
Server A is my web server (a VPS). User interaction with the application leads to updates to several fields in table X (which are replicated to server B). Server B is the heavy-lifter, where all the big calculations are done. A cron job on server B regularly adds rows to table X (which are replicated to server A).
So server A can update (but never add) rows, and server B can add rows. Server B can also update fields in X, but only after the user no longer has the ability to update that row.
What kinds of potential disasters can I expect with this scenario if I go to production with it? Or does this seem OK? I'm asking mostly because I'm ignorant about whether any simultaneous operation on the table (from either the A copy or the B copy) can cause problems or if it's just operations on the same row that get hairy.

Dual master replication is messy if you attempt to write to the same database on both masters.
One of the biggest points of contention (and high blood pressure) is the use of autoincrement keys.
As long as you remember to set auto_increment_increment and auto_increment_offset, you can look up any data you want and retrieve auto-incremented ids.
You just have to remember this rule: If you read an id from serverX, you must look up the needed data from serverX using the same id.
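To make the effect of those two settings concrete, here is a minimal sketch in Python. It is a simplified model, not MySQL itself: it assumes empty tables, two servers, auto_increment_increment = 2 on both, and auto_increment_offset = 1 on serverA and 2 on serverB. The function name is made up for illustration.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield the ids a server would hand out in this simplified model.

    Assumes an empty table and offset <= increment, so the series is
    offset, offset + increment, offset + 2 * increment, ...
    """
    next_id = offset
    while True:
        yield next_id
        next_id += increment


# Assumed settings (not taken from the question):
#   serverA: auto_increment_increment = 2, auto_increment_offset = 1
#   serverB: auto_increment_increment = 2, auto_increment_offset = 2
server_a_ids = list(islice(auto_increment_ids(offset=1, increment=2), 5))
server_b_ids = list(islice(auto_increment_ids(offset=2, increment=2), 5))

print(server_a_ids)  # [1, 3, 5, 7, 9]
print(server_b_ids)  # [2, 4, 6, 8, 10]
```

The two series never overlap, so inserts on either server cannot collide on the auto-increment key. This says nothing about conflicting updates to the same row, which these settings do not prevent.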
Here is one saving grace for using dual master replication.
Suppose you have
two databases (db1 and db2)
two DB servers (serverA and serverB)
If you impose the following restrictions
all writes of db1 to serverA
all writes of db2 to serverB
then you are not required to set auto_increment_increment and auto_increment_offset.
I hope my answer clarifies the good, the bad, and the ugly of using dual master replication.
Here is a pictorial example of 4 masters using auto increment settings
Nice article from Percona on this subject

Master-master replication can be very tricky. Are you sure that this is the best solution for you? Usually it is used for load-balancing purposes (e.g. round-robin connections to your db servers) and sometimes when you want to avoid the replication lag effect. A big known issue is the auto_increment problem, which is supposedly solved by using different offset and increment values.
I think you should modify your configuration to simple master-slave by making A the master and B the slave, unless I am mistaken about the requirements of your system.

I think you can depend on Percona XtraDB Cluster rather than regular MySQL replication (see "Percona XtraDB Cluster Feature 2: Multi-Master replication").
They promise the following:
By Multi-Master I mean the ability to write to any node in your cluster and do not worry that eventually you get out-of-sync situation, as it regularly happens with regular MySQL replication if you imprudently write to the wrong server.
With Cluster you can write to any node, and the Cluster guarantees consistency of writes. That is the write is either committed on all nodes or not committed at all.
The two important consequences of Multi-master architecture.
First: we can have several appliers working in parallel. This gives us true parallel replication. A slave can have many parallel threads, and you can tune this with the variable wsrep_slave_threads.
Second: There might be a small period of time when the slave is out of sync with the master. This happens because the master may apply an event faster than a slave. And if you do read from the slave, you may read data that has not changed yet. You can see that from the diagram. However, you can change this behavior by using the variable wsrep_causal_reads=ON. In this case the read on the slave will wait until the event is applied (this, however, will increase the response time of the read). This gap between slave and master is the reason why this replication is named “virtually synchronous replication”, not real “synchronous replication”.
The described behavior of COMMIT also has the second serious implication.
If you run write transactions to two different nodes, the cluster will use an optimistic locking model.
That means a transaction will not check for possible locking conflicts during individual queries, but rather at the COMMIT stage. And you may get an ERROR response on COMMIT. I am highlighting this, as this is one of the incompatibilities with regular InnoDB that you may experience. In InnoDB, DEADLOCK and LOCK TIMEOUT errors usually happen in response to a particular query, but not on COMMIT. Well, if you follow good practice, you still check error codes after the “COMMIT” query, but I saw many applications that do not do that.
So, if you plan to use the Multi-Master capabilities of XtraDB Cluster, and run write transactions on several nodes, you may need to make sure you handle the response to the “COMMIT” query.
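As an illustration of handling the response to COMMIT, here is a minimal retry sketch in Python. Everything in it is hypothetical: `CommitConflict` stands in for whatever error your driver raises when the cluster rejects a commit, and `flaky_transaction` is a stand-in for a function that runs your statements and then commits. It is not the XtraDB Cluster API.

```python
import time


class CommitConflict(Exception):
    """Hypothetical error: the cluster rejected the transaction at COMMIT."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction(); retry the whole transaction if COMMIT fails."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except CommitConflict:
            if attempt == max_attempts:
                raise  # give up and let the caller see the failure
            time.sleep(backoff_seconds * attempt)


# Stand-in transaction: fails at COMMIT twice, then succeeds.
attempts = []


def flaky_transaction():
    attempts.append(1)
    if len(attempts) < 3:
        raise CommitConflict("conflict reported at COMMIT")
    return "committed"


result = run_with_retry(flaky_transaction)
print(result, "after", len(attempts), "attempts")
```

The point of the sketch is that the whole transaction is replayed, not just the COMMIT statement, because a rejected commit means none of its changes were applied.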
You can find it here, along with a pictorial explanation.

From my rather extensive experience on this topic I can say you will regret writing to more than one master someday. It may be soon, it may not be for a long time, but it will happen. You will have two servers that each have some correct data and some wrong data, and you will either pick one as the authoritative source and throw the other away (probably without really knowing what you're throwing away) or you'll reconcile the two. No matter how you design it, you cannot eliminate the possibility of this happening, so it's a mathematical certainty that it will happen someday.
Percona (my employer) has handled probably several hundred cases of recovery after doing what you're attempting. Some of them take hours, some take weeks, one I helped with took a few months -- and that's with excellent tools to help.
Use a different replication technology or find a different way to do what you want to do. MMM will not help -- it will bring catastrophe sooner. You cannot do this with standard MySQL replication, with or without external tools. You need a replacement replication technology such as Continuent Tungsten or Percona XtraDB Cluster.
It's often easier to just solve the real need in some other fashion and give up multi-master writes, if you want to use vanilla MySQL replication.

Thanks for sharing my Master-Master MySQL cluster article. As Rolando clarified, this configuration is not suitable for most production environments due to the limitations of autoincrement support.
The most adequate way to get a MySQL cluster is using NDB, which requires at least 4 servers (2 management and 2 data nodes).
I have written a detailed article to get this running on two servers only, which is very similar to my previous article but using NDB instead.
http://www.hbyconsultancy.com/blog/mysql-cluster-ndb-up-and-running-7-4-and-6-3-on-ubuntu-server-trusty-14-04.html
Notice that I always recommend to analyse your needs and find out the most adequate solution, don't just look for available solutions and try to figure out if they fit with your needs or not.
-Hatem

I would highly recommend looking into a tool that will manage this for you. Multi-master replication can be very troublesome if things go wrong.
I would suggest something like Percona XtraDB Cluster. I've been following this project, and it looks very cool. I definitely think it will be a game changer in the MySQL world. It's still in beta though.

Related

MySQL Replication: Question about a fallback-system

I want to set up a complete server (apache, mysql 5.7) as a fallback of a productive server.
The synchronization on file level using rsync and cronjob is already done.
The mysql-replication is currently the problem. More precisely: the choice of the right replica method.
Multi primary group replication seemed to be the most suitable method so far.
In case of a longer production downtime, it is possible to switch to the fallback server quickly via DNS change.
Write accesses to the database are possible immediately without adjustments.
So far so good. But if the fallback-server fails, it goes into unreachable status and the production-server switches to read only, since its group no longer has a quorum. This is of course a no-go.
I thought it might be possible using different replica variables: If the fallback-server is in unreachable state for a certain time (~5 minutes), the production-server should stop the group_replication and start a new group_replication. This has to happen automatically to keep the read-only time relatively low. When the fallback-server is back online, it should be manually added to the newly started group. But if I read the various forum posts and documentation correctly, it's not possible that way. And running a Group_Replication with only two nodes is the wrong decision anyway.
https://forums.mysql.com/read.php?177,657333,657343#msg-657343
Is the master - slave replication the only one that can be considered for such a fallback system? https://dev.mysql.com/doc/refman/5.7/en/replication-solutions-switch.html
Or does Group_Replication offer possibilities after all, if you can react suitably to the quorum problem? Possibilities that I have overlooked so far?
Many thanks and best regards
Short Answer: You must have [at least] 3 nodes.
Long Answer:
Split brain with only two nodes:
Write only to the surviving node, but only if you can conclude that it is the only surviving node, else...
The network died and both Primaries are accepting writes. This leads to them disagreeing with each other. You may have no clean way to repair the mess.
Go into readonly mode with surviving node. (The only safe and sane approach.)
The problem is that the automated system cannot tell the difference between a dead Primary and a dead network.
So... You must have 3 nodes to safely avoid "split-brain" and have a good chance of an automated failover. This also implies that no two nodes should be in the same tornado path, flood range, volcano path, earthquake fault, etc.
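The majority rule behind this can be sketched in a few lines of Python. This is a simplified model of the idea, not how Group Replication or Galera implement it; `has_quorum` is a name made up for illustration, and the count of reachable nodes includes the node doing the counting.

```python
def has_quorum(reachable_nodes, total_nodes):
    """True if this node can see a strict majority of the group (itself included)."""
    return reachable_nodes > total_nodes // 2


# Two nodes: losing the partner leaves 1 of 2, which is not a majority,
# so the survivor cannot tell a dead partner from a dead network.
print(has_quorum(1, 2))  # False

# Three nodes: losing one leaves 2 of 3, so the remaining pair keeps writing.
print(has_quorum(2, 3))  # True

# Three nodes: an isolated node sees only itself and must stop accepting writes.
print(has_quorum(1, 3))  # False
```

With two nodes, any single failure drops the survivor below a majority, which is exactly the read-only behaviour described in the question.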
You picked Group Replication (InnoDB Cluster). That is an excellent offering from MySQL. Galera with MariaDB is an equally good offering -- there are a lot of differences in the details, but it boils down to needing 3, preferably dispersed, nodes.
DNS changes take some time, due to the TTL. A proxy server may help with this.
Galera can run in a "Primary + Replicas" mode, but it also allows you to run with all nodes being read-write. This leads to a slightly different set of steps that a client must take to stop writing to one node and start writing to another. There are proxies to help with this.
FailBack
Are you trying to always use a certain Primary except when it is down? Or can you accept letting any node be the 'current' Primary?
I think of "fallback" as simply a "failover" that goes back to the original Primary. That implies a second outage (possibly briefer). However, I understand geographic considerations. You may want your main Primary to be 'near' most of your customers.
I recommend using the Galera MySQL cluster with HAProxy as a load balancer and automatic failover solution. We have used it in production for a long time now and never had serious problems. The most important thing to consider is monitoring the replication sync status between nodes. Also, make sure your storage engine is InnoDB, because Galera doesn't work with MyISAM.
Check this link on how to set it up:
https://medium.com/platformer-blog/highly-available-mysql-with-galera-and-haproxy-e9b55b839fe0
But in these kinds of situations, the main problem is not the failover mechanism, because there are many solutions out of the box. Rather, you have to check your read/write ratio and transactional services and make sure replication delays won't affect them. Sometimes vertically scalable solutions with master-slave replication are more suitable for transaction-sensitive financial systems, and it really depends on the service you're providing.

What solution can I use for MySQL replication across cities?

We are looking into options for our MySQL replication architecture, the relevant details of our current setup:
We manage several branches in different cities.
Every branch has the same database structure.
Every primary key on every table is prefixed by a branch identifier.
We need a branch to keep working if it has a network outage, and it must sync with the main branch once the connection is restored.
As we have no chance of getting a duplicate index on any table, I'm thinking of something like MySQL multi master, or maybe Percona XtraDB Cluster or Tungsten, but I can't find documentation on what happens if a single node is isolated from the others and what happens with the data that it received once the connection is restored.
Is there any proven method that suit this kind of setup? I would appreciate any advice, thanks.
In the case of Tungsten, how it behaves depends on how you tell it to behave.
You seem to be describing a relaxed consistency model. But no off the shelf clustering solution is going to solve all your problems. Certainly if you ensure that each record is only ever modified at its "home" database then you shouldn't run into many problems, but this model also requires you to replicate all the data to all of the locations. Bandwidth might not be an issue and it does provide a good DR capability, but storage and scalability may become an issue.
If you do have a centralized, well managed datacentre then another approach would be to have each branch run off an asynchronous dual master - one located at the branch and one at the datacentre then roll your own scripts for consolidating the dataset.
Multi master replication would work, as long as you're not performing UPDATEs on the same data when the nodes are out of sync. In that case, it will apply the changes silently, so your data would become inconsistent.
If you're not doing that, I think it's the best solution, as MySQL takes care of the binary log pointers and handles reconnections. Just be sure that the auto_increment_offset setting is properly configured amongst the masters. Note that I've only tested this deployment with 2 master servers (7 years in production with few issues).

What is a good way to show the effect of replication in MySQL?

We have to show a difference to show the advantages of using replication. We have two computers, linked by teamviewer so we can show our class what we are doing exactly.
Is it possible to show a difference in performance? (How long it takes to execute certain queries?)
What sort of queries should we test? (In other words, where is the difference between using and not using replication the biggest?)
How should we fill our database? How much data should be there?
Thanks a lot!
I guess the answer to the above questions depends on factors such as which storage engine you are using, size of the database, as well as your chosen replication architecture.
I don't think replication will have much of an impact on query execution for simple master->slave architecture. If however, you have an architecture where there are two masters: one handling writes, replicating to another master which exclusively handles reads, and then replication to a slave which handles backups, then you are far more likely to be able to present some of the more positive scenarios. Have a read up on locks and storage engines, as this might influence your choices.
One simple way to show how replication can be positive is to demonstrate a simple backup strategy. E.g. taking hourly backups on the master server itself can bring the underlying application to a complete halt for the duration of the backup (taking backups using mysqldump locks the tables so that no read/write operations can occur), whereas replicating to a slave and then taking backups from there negates this effect.
If you want to show detailed statistics, it's probably better to look into some benchmarking/profiling tools (sysbench,mysqlslap,sql-bench to name a few). This can become quite complex though.
Also might be worth looking at the Percona Toolkit and the Percona monitoring plugins here: http://www.percona.com/software/
Replication has several advantages:
Robustness is increased with a master/slave setup. In the event of problems with the master, you can switch to the slave as a backup
Better response time for clients can be achieved by splitting the load for processing client queries between the master and slave servers
Another benefit of using replication is that you can perform database backups using a slave server without disturbing the master.
Using replication is always a safe thing to do. You should always be replicating your production server; in case of failure it will be helpful.
To demonstrate replication performance, you can show the seconds_behind_master value, which indicates how “late” the slave is. This value should not be more than 600-800 seconds, but network latency does matter here.
First make sure that the master and slave servers are configured correctly.
Then stop the slave server, let the master server process some updates/inserts (bulk inserts), and start the slave server again. You will see a larger seconds_behind_master value, which should keep decreasing until it reaches 0.
There is a tool called MONyog - MySQL Monitor and Advisor which shows Replication status in real-time.
Also what kind of replication to use whether statement based or row based has been explained here
http://dev.mysql.com/doc/refman/5.1/en/replication-sbr-rbr.html

Best implementation for MySQL replication with Rails 3?

We're looking at potentially setting up replication for our primary MySQL database, and while setting up the replication seems pretty straight-forward, the application implementation seems a bit murkier.
My first idea would be to set up a master-slave configuration and RW-splitting, with all write queries (CREATE, INSERT, UPDATE) going to master, and all read queries (SELECT) going to slave. Having read up on it, it seems that there are essentially two options for how to implement this with our app:
Using an independent middleware layer for all MySQL connections, such as MySQL proxy or DBSlayer. However, the former is in Alpha and the latter has limited documentation.
Using a Ruby-based gem/plugin, such as Octopus to achieve RW-splitting in the framework.
If we wanted to go with a master-slave setup, what would you recommend moving forward?
The other thought I've had was to use a master-master configuration, but am unsure about the implementation of such a setup.
Thoughts?
Generally you should do your R/W splitting in the framework because only it can understand the context. In PHP I do this by maintaining two connections - one for writes and one for reads and decide which you want explicitly in your code. The reason for this is that it's not as simple as splitting by query type. For example, if you start a transaction on the write connection, you want all the reads inside it to go through that as well, otherwise they will be outside the transaction and will probably get old data or get hung up with locks.
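The two-connection approach described above can be sketched as follows. This is a minimal illustration in Python rather than PHP or Ruby, and the class and method names are hypothetical; the "connections" are plain strings so the routing logic can be seen on its own.

```python
class ConnectionRouter:
    """Route reads to the replica, except inside a transaction.

    Hypothetical sketch: primary and replica are whatever connection
    objects your framework provides.
    """

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.in_transaction = False

    def begin(self):
        self.in_transaction = True

    def commit(self):
        self.in_transaction = False

    def connection_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Reads inside a transaction must see the transaction's own
        # writes, so they stay on the primary connection.
        if is_read and not self.in_transaction:
            return self.replica
        return self.primary


router = ConnectionRouter(primary="primary-conn", replica="replica-conn")
print(router.connection_for("SELECT * FROM users"))       # replica-conn
router.begin()
print(router.connection_for("UPDATE users SET name='x'"))  # primary-conn
print(router.connection_for("SELECT * FROM users"))       # primary-conn
router.commit()
print(router.connection_for("SELECT * FROM users"))       # replica-conn
```

Splitting purely on the first keyword is the naive part; the transaction flag is what the framework knows and a generic proxy does not.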
Unless your workload is really read-heavy, replication is not a scaling solution as replication lag will cause you to get out of date results. Master-master is not that special - it's just two instances of master-slave, but you should not make the mistake of trying to write to both masters as you're asking for split-brain nightmares.
The config I really like is using mmm with master-master pair. This makes failover and redundancy really easy and transparent to applications, and it works beautifully.

MySQL dual master

For my current project we are thinking of setting up a dual master replication topology for a geographically separated setup; one db on the us east coast and the other db in japan.
I am curious if anyone has tried this and what their experience has been.
Also, I am curious what my other options are for solving this problem; we are considering message queues.
Thanks!
Just a note on the technical aspects of your plan: You have to know that MySQL does not officially support multi-master replication (only MySQL Cluster provides support for synchronous replication).
But there is at least one "hack" that makes multi-master-replication possible even with a normal MySQL replication setup. Please see Patrick Galbraith's "MySQL Multi-Master Replication" for a possible solution. I don't have any experience with this setup, so I don't dare to judge on how feasible this approach would be.
There are several things to consider when replicating databases geographically. If you are doing this for performance reasons, be sure your replication model supports your data being "eventually consistent" as it can take time to bring the replication current in both, or many, locations. If your throughput or response times between locations is not good, active replication may not be the best option.
Setting up mysql as dual master does actually work fine in the right scenario done correctly. But I am not sure it fits very well in your scenario.
First of all, dual master setup in mysql is really a ring-setup. Server A is defined as master of B, while B is at the same time defined as the master of A, so both servers act as both master and slave. The replication works by shipping a binary log containing the sql statements which the slave inserts when it sees fit, which is usually right away. But if you're hammering it with local insertions, it will take a while to catch up. The slave insertions are sequential by the way, so you won't get any benefit of multiple cores etc.
The primary use of dual master mysql is to have redundancy on the server level with automatic fail-over (often using heartbeat on linux). Excluding mysql-cluster (for various reasons), this is the only usable automatic failover for mysql. The setup for basic dual master is easily found on google. The heartbeat stuff is a bit more work. But this is not really what you were asking about, since this really behaves as a single database server.
If you want the dual master setup because you always want to write to a local database (write to both of them at the same time), you'll need to write your application with this in mind. You can never have auto-incrementing values in the database, and when you have unique values, you must make sure that the two locations never write the same value. For example location A could write odd unique numbers and location B could write even unique numbers. The reason is that you're not guaranteed that the servers are in sync at any given time, so if you've inserted a unique row in A, and then an overlapping unique row in B before the second server catches up, you'll have a broken system. And if something first breaks, the entire system stops.
To sum it up: it's possible, but you'll need to tip-toe very carefully if you're building business software on top of this.
Because of the one-to-many architecture of MySQL replication, you have to have a replication ring with multiple masters: that is, each replicates from the next in a loop. For two, they replicate off each other. This has been supported from as far back as v3.23.
In a previous place I worked, we did it with v3.23 with quite a number of customers as a way of providing exactly what you're asking. We used SSH tunnels over the Internet to do the replication. It took us some time to get it reliable and several times we had to do a binary copy of one database to another (fortunately, none of them were over 2Gb nor needed 24-hour access). Also the replication in v3 was not nearly as stable as in v4 but even in v5, it will just stop if it detects any sort of error.
To accommodate the inevitable replication lag, we re-structured the application so that it didn't rely on AUTOINCREMENT fields (and removed that attribute from the tables). This was reasonably straightforward due to the data-access layer we had developed; instead of it using mysql_insert_id() for new objects, it created the new ID first and inserted it along with the rest of the row. We also implemented site IDs that we stored in the top half of the ID, because they were BIGINTs. This also meant we didn't have to change the application when we had a client who wanted the database in three locations. :-)
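A rough sketch of that ID scheme in Python, for illustration only. The 32/32 bit split and the function names are assumptions (the answer only says the site ID lived in the top half of a BIGINT), and how each site produced its local sequence number is not described, so it is simply passed in here.

```python
SEQUENCE_BITS = 32                       # assumed: lower half of a 64-bit BIGINT
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


def make_id(site_id, local_sequence):
    """Pack a site id into the top half and a per-site sequence into the bottom half."""
    if not 0 <= site_id < (1 << 31):     # stay within a signed BIGINT
        raise ValueError("site_id out of range")
    if not 0 <= local_sequence <= SEQUENCE_MASK:
        raise ValueError("local_sequence out of range")
    return (site_id << SEQUENCE_BITS) | local_sequence


def split_id(row_id):
    """Recover (site_id, local_sequence) from a packed id."""
    return row_id >> SEQUENCE_BITS, row_id & SEQUENCE_MASK


# The same local sequence number at two sites yields two different ids.
print(make_id(1, 7))            # 4294967303
print(make_id(2, 7))            # 8589934599
print(split_id(make_id(3, 42)))  # (3, 42)
```

Because the site ID is part of the key, adding a third location only needs a new site ID, which matches the remark about not having to change the application.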
It wasn't 100% robust. InnoDB was just gaining some visibility so we couldn't easily use transactions, although we considered it. So there were race conditions occasionally when two objects tried to be created with the same ID. This meant one failed and we tried to report that in the app. But it was still a significant part of someone's job to watch over the replication and fix things when it broke. Importantly, to fix it before we got too far out of sync, because in a few cases the databases were being used in both sites and would quickly become difficult to re-integrate if we had to rebuild one.
It was a good exercise to be a part of, but I wouldn't do it again. Not in MySQL.