Does MySQL stall the whole cluster during DDL statements? - mysql

Recently, I read that a Galera-based MySQL cluster uses a concept called total order isolation (https://galeracluster.com/library/documentation/schema-upgrades.html#toi) for DDL by default, which stalls writes on the whole cluster until the statement is committed on all the nodes.
How does MySQL handle DDL in native asynchronous replication?
Does it stall writes for the other schemas as well?

Native Replication sticks the DDL into the replication stream. When the command pops up in the Slave, it executes the DDL before moving on to other queries in the replication stream.
Caveat: The above statement assumes old flavor, without multi-master replication or multiple replication threads. Regardless of this caveat the table being modified is blocked on the Slave just as it was on the Master.
Galera's TOI goes to some extra effort to make sure all the nodes are in sync, even accounting for the DDL versus ordinary writes. Hence the name "Total Order Isolation".
Galera's RSU (Rolling Schema Upgrade) is, in many cases, a viable alternative. It is no more invasive than a crash of each node, one at a time (hence "Rolling"). Assuming connections can fail over to different nodes, RSU avoids other blockage.
Still, you should make a conscious choice between RSU and TOI; there are use cases for dictating one versus the other.
In a distributed system (multiple nodes, multiple clients, etc), pushing code gets tricky. I like to take this approach, even though it leads to perhaps 3 times as many pushes:
Push application code to discover whether the database change has been pushed. Have the code work either with the old schema or new. Do this "push" in a "rolling" manner.
Push the new schema (eg CREATE/ALTER TABLE).
Clean up the code. (Again, "roll" it out to the many clients.)
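The first two steps above can be sketched as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere, and the `users` table, the `email` column and the helper names are hypothetical. Against MySQL, the column check would more likely query information_schema.COLUMNS.

```python
import sqlite3

def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)

def insert_user(conn, name, email):
    # Works against either schema: writes email only once the column exists.
    if column_exists(conn, "users", "email"):
        conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
    else:
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

insert_user(conn, "alice", "alice@example.com")          # step 1: code runs on the old schema
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")  # step 2: the schema push
insert_user(conn, "bob", "bob@example.com")              # same code, new schema

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```

The third step would then remove the `else` branch, once every client runs the new code and the schema change is in place everywhere.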

Related

Are there *application*-driven reasons to prefer multi-primary topologies over clustering, or vice-versa?

I have an application that currently uses a single primary and I'm looking to do multi-primary by either setting up a reciprocal multi-primary (just two primaries with auto-increment-increment and auto-increment-offset set appropriately) or Clustering-with-a-capital-C. The database is currently MariaDB 10.3, so the clustering would be Galera.
My understanding of multi-primary is that the application would likely require no changes: the application would connect to a single database (doesn't matter which one), and any transaction that needed to obtain any locks would do so locally, any auto-increment values necessary would be generated, and once a COMMIT occurs, that engine would complete the commit and the likelihood of failure-to-replicate to the other node would be very low.
But for Clustering, a COMMIT actually requires that the other node(s) are updated to ensure success, the likelihood of failure during COMMIT (as opposed to during some INSERT/UPDATE/DELETE) is much higher, and therefore the application would really require some automated retry logic to be built into it.
Is the above accurate, or am I overestimating the likelihood of COMMIT-failure in a Clustered deployment, or perhaps even underestimating the likelihood of COMMIT-failure in a multi-primary environment?
From what I've read, it seems that Galera Cluster is a little more graceful about handling nodes leaving and re-joining the Cluster and adding new nodes. Is Galera Cluster really just multi-master with the database engine handling all the finicky setup and management, or is there some major difference between the two?
Honestly, I'm more looking for reassurance that moving to Galera Cluster isn't going to end up being an enormous headache relative to the seemingly "easier" and "safer" move to multi-primary.
By "multi-primary", do you mean that each of the Galera nodes would be accepting writes? (In other contexts, "multi-primary" has a different meaning -- and only one Replica.)
One thing to be aware of: "Critical read".
For example, when a user posts something and it writes to one node, and then that user reads from a different node, he expects his post to show up. See wsrep_sync_wait.
(Elaborating on Bill's comment.) The COMMIT on the original write waits for each other node to say "yes, I can and will store that data", but a read on the other nodes may not immediately "see" the value. Using wsrep_sync_wait just before a SELECT makes sure the write is actually visible to the read.
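A rough sketch of a "critical read" helper, assuming a DB-API style cursor connected to a Galera node. The function name, the `posts` table and the `RecordingCursor` class are made up for illustration; the fake cursor only records statements so the example runs without a database. It also assumes the session default for wsrep_sync_wait is 0 and restores that value afterwards.

```python
def critical_read(cursor, sql, params=()):
    # Ask this node to catch up with the cluster before serving the read
    # (value 1 covers read statements such as SELECT).
    cursor.execute("SET SESSION wsrep_sync_wait = 1")
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        # Assumption: the default is 0, so ordinary reads stay fast.
        cursor.execute("SET SESSION wsrep_sync_wait = 0")

class RecordingCursor:
    """Stand-in for a real cursor; it only records the statements it is given."""
    def __init__(self):
        self.statements = []

    def execute(self, sql, params=()):
        self.statements.append(sql)

    def fetchall(self):
        return []

cursor = RecordingCursor()
critical_read(cursor, "SELECT body FROM posts WHERE user_id = %s", (42,))
```

Only the reads that really need to see the latest write should go through such a helper, since each one may wait for the node to catch up.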

Is there any concept of load balancing in MySQL master-master architecture?

I am running a MySQL 5.5 Master-Slave setup. To avoid too many hits on my master server, I am thinking of having one or maybe more additional MySQL servers; incoming requests would first hit HAProxy, which would forward them either round robin or by any other scheduling algorithm defined in HAProxy. So the setup would be like -
APP -> API Gateway/Server -> HAProxy -> Master Server1/Master Server2
So what are the pros and cons of this setup?
Replication in MySQL is asynchronous by default, so you can't always assume that the replicas are in sync with their source.
If you intend to use your load-balancer to split writes over the two master instances, you could get into trouble with that because of MySQL's asynchronous replication.
Say you commit a row on master1 to a table that has a unique key. Then you commit a row with the same unique value to the same table on master2, before the change on master1 has been applied through replication. Both servers allowed the row to be committed, because as far as they knew, it did not violate the unique constraint. But then as replication tries to apply the change on each server, those changes do conflict with the row committed. This is called split-brain, and it's incredibly difficult to recover from.
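The race described above can be reproduced with a toy model. This is not how MySQL is implemented; the `Master` class is a hypothetical in-memory stand-in whose only purpose is to show that each server checks the unique key locally and that the conflict surfaces later, when replication applies the other server's change.

```python
class Master:
    def __init__(self, name):
        self.name = name
        self.rows = {}    # unique key -> value
        self.outbox = []  # committed changes waiting to be replicated

    def commit(self, key, value):
        # Local uniqueness check only: the peer's pending rows are invisible here.
        if key in self.rows:
            raise ValueError(f"{self.name}: duplicate key {key!r}")
        self.rows[key] = value
        self.outbox.append((key, value))

    def apply_from(self, peer):
        # Asynchronous replication: the peer's changes arrive after the fact.
        conflicts = []
        for key, value in peer.outbox:
            if key in self.rows and self.rows[key] != value:
                conflicts.append(key)  # replication would stop with an error here
            else:
                self.rows[key] = value
        return conflicts

master1, master2 = Master("master1"), Master("master2")
master1.commit("alice@example.com", "row written on master1")
master2.commit("alice@example.com", "row written on master2")  # also accepted

conflicts_on_master1 = master1.apply_from(master2)
conflicts_on_master2 = master2.apply_from(master1)
```

After both apply steps each server reports a conflict on the same key, and the two servers hold different values for it.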
If your load-balancer randomly sends some read queries to another instance, they might not return data that you just committed on the other instance. This is called replication lag.
This may or may not be a problem for your app, but it's likely that in your app, at least some of the queries require strong consistency, i.e. reading outdated results is not permitted. Other cases even with the same app may be more tolerant of some replication lag.
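One way to express that per-query tolerance is a small routing function. This is only a sketch: `choose_server` and its parameters are hypothetical, and the lag figure is assumed to come from something like Seconds_Behind_Master in SHOW SLAVE STATUS, which is itself only an estimate.

```python
def choose_server(max_stale_seconds, replica_lag_seconds):
    # Queries that tolerate no staleness always go to the source.
    if max_stale_seconds <= 0:
        return "source"
    # None means the lag is unknown (for example, replication is stopped).
    if replica_lag_seconds is None or replica_lag_seconds > max_stale_seconds:
        return "source"
    return "replica"

account_balance_read = choose_server(max_stale_seconds=0, replica_lag_seconds=0)
daily_report_read = choose_server(max_stale_seconds=60, replica_lag_seconds=5)
report_while_lagging = choose_server(max_stale_seconds=60, replica_lag_seconds=120)
```

Falling back to the source when the lag is unknown keeps the strict queries correct at the cost of extra load on the source.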
I wrote a presentation some years ago about splitting queries between source and replica MySQL instances: https://www.percona.com/sites/default/files/presentations/Read%20Write%20Split.pdf. The presentation goes into more details about the different types of tolerance for replication lag.
Newer versions of MySQL (5.7.17 and later, including 8.0) offer a more sophisticated solution for these problems. It's called Group Replication, and it does its best to ensure that all instances are in sync all the time, so you greatly reduce the risk of reading stale data or creating write conflicts. The downside of Group Replication is that to ensure no replication lag occurs, it may need to constrain your transaction throughput. In other words, COMMITs may be blocked until the other instances in the replication cluster respond.
Read more about Group Replication here: https://dev.mysql.com/doc/refman/8.0/en/group-replication.html
P.S.: Whichever solution you decide to pursue, I recommend you do upgrade your version of MySQL. MySQL 5.5 passed its end-of-life in 2018, so it will no longer get updates even for security flaws.

Why use GTIDs in MySQL replication?

When it comes to database replication, what is the use of global transaction identifiers? Why do we need it to prevent concurrency across the servers? How is that prevention achieved exactly?
I tried to read the documentation at
http://dev.mysql.com/doc/refman/5.7/en/replication-gtids.html but still could not understand it clearly. This may sound very basic but I would really appreciate it if someone could explain the concepts to me.
The reason for the Global Transaction ID is to allow a MySQL slave to know if it has applied a given transaction or not, to keep things in sync between Master and Slave. It can also be used for restarting a slave if a connection goes down, again to know the point in time. Without using GTIDs, replication must be controlled based on the position in a given binary transaction log file (bin log). This is much harder to manage than the GTID method.
A master is the only server that is typically written to, so that slaves merely rebuild a copy of the master by applying each transaction in sequence.
It is also important to understand that MySQL replication can run in one of 3 modes:
Statement-based: Each SQL statement is logged to the binlog and replicated as a statement to the slave. This can be in some cases ambiguous at the slave causing the data to not match exactly. (Most of the time it is fine for common uses).
Row-based: In this mode MySQL replicates the actual data changes to each table, with a "before" and "after" picture of each row, which is fully accurate. This can result in a much larger binlog, for example if you have a bulk update query, like: UPDATE t1 SET c1 = 'a' WHERE c2 = 'b'.
Mixed: In this mode, MySQL will use a mix of statement-based and row-based logging in the binlog.
I only mention the modes of replication, because it is mentioned in the doc you referenced that Row-based is the recommended option if you are using GTIDs.
There is another option called Master-Master replication, where you can write to two masters (each acting as a slave for the other), but this requires a special configuration to ensure that the data written to each master is unique. It is much trickier to manage than a typical Master/Slave setup.
Therefore, the prevention of writes to a Slave is something that you must ensure from your application for a typical replication process to function correctly. It is fine to read from a Slave, but you should not write to it. Note that the Slave can be behind the Master if you are using it for reads, so it is best to perform queries for things that can be behind the Master (like reports that are not critical up to the second or millisecond). You can ensure no writes to the Slave by making your common application user a read-only user for the Slave server, and a read-write user for the Master.
Why do we need to prevent concurrency across the servers?
If I understood the question correctly, you are talking about consistency. If so, the answer is that you need to keep a consistent state in a distributed system. For example, if my bank account information is replicated throughout several different servers, it is fundamental that they have exactly the same € balance. Now imagine that I perform multiple money transactions (deposits/spendings) and for each one I was connected to a different server: concurrency problems would cause my account balance to be different at each server, which is unacceptable.
How is that prevention achieved exactly?
Using a master/slave approach. Amongst the servers, you have one server (the master) that is responsible for handling every writing operation, meaning that modifications to the database must be handled only by this server. The database of this master server is replicated to all other servers (the slaves), which are not allowed to modify the database but can be used to read the database (e.g. SELECT operations). Knowing that there is only one server allowed to modify the database, you do not have consistency issues.
What is the use of global transaction identifiers?
Communication between servers is asynchronous, and a slave server is not required to be connected with the master at all times. Therefore, once a slave server reconnects with the master server, it may find that the master's database has been modified in the meantime, so it must update its own database. The problem now is knowing, amongst all modifications performed by the master server, which ones the slave server has already applied earlier and which ones it has not applied yet.
GTIDs address this issue: they uniquely identify each transaction performed by the master server. Now, the slave server can identify amongst all the transactions performed by the master server, which are the ones that were not seen before.
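To make that concrete, here is a small sketch of the bookkeeping in Python. It assumes the textual GTID set format of `source_uuid:first-last` intervals; the helper names are made up, and the UUID is just an example value. MySQL itself provides the GTID_SUBTRACT() function for this kind of comparison, so this code only illustrates the idea.

```python
def parse_gtid_set(text):
    # "uuid:1-5:8" -> set of (uuid, transaction_number) pairs
    gtids = set()
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single number such as "8" is an interval of length one.
            for number in range(int(first), int(last or first) + 1):
                gtids.add((uuid, number))
    return gtids

def missing_on_replica(source_executed, replica_executed):
    # Transactions the source has executed that the replica has not seen yet.
    return sorted(parse_gtid_set(source_executed) - parse_gtid_set(replica_executed))

SOURCE_UUID = "3E11FA47-71CA-11E1-9E33-C80AA9429562"  # example value only
missing = missing_on_replica(f"{SOURCE_UUID}:1-7", f"{SOURCE_UUID}:1-4")
```

Here the replica has applied transactions 1 to 4, so `missing` contains transactions 5, 6 and 7 from that source, which is exactly what it needs to fetch after reconnecting.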

MongoDB write concern sync level

I am trying to understand what exactly the limitations are of using MongoDB as the primary database for a project I am working on. It can be hard to wade through the crap online to properly understand how it compares to a more traditional database choice of, say, MySQL.
From what I understand from reading about HADR configuration of
IBM DB2 - http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.ha.doc%2Fdoc%2Fc0011724.html,
MySQL - http://dev.mysql.com/doc/refman/5.5/en/replication-semisync.html
and MongoDB - http://docs.mongodb.org/manual/core/write-concern/
It seems that Replica Acknowledged http://docs.mongodb.org/manual/core/replica-set-write-concern/ is the highest level of write concern in a replica set.
Is replica acknowledged the equivalent to the synchronous level in DB2 and Semisynchronous level in MySQL?
No they are not.
IBM DB2 provides a way to make sure that all members of a replica set are up to speed at the same time; it is the same as MySQL's own synchronous replication. It ensures full consistency at all times throughout the slave set.
Semisynchronous replication again is not replica set majority either; from the documentation page:
The master waits after commit only until at least one slave has received and logged the events.
But then:
It does not wait for all slaves to acknowledge receipt, and it requires only receipt, not that the events have been fully executed and committed on the slave side.
In other words you have no idea whether or not any slaves actually performed the command. It is the same as w:0 or "unsafe" writes in MongoDB.
With majority you have an idea that every member you send to has actually performed your command as can be seen by a cute little diagram in the documentation: http://docs.mongodb.org/manual/core/replica-set-write-concern/#verify-write-operations
and if that doesn't convince you then the quote:
The following sequence of commands creates a configuration that waits for the write operation to complete on a majority of the set members before returning:
From the next paragraph should.
So MySQL semisynchronous is similar to majority but it isn't the same. DB2 is totally different.
The IBM documentation sums up the differences in replica/slave write concern quite well:
The more strict the synchronization mode configuration parameter value, the more protection your database solution has against transaction data loss, but the slower your transaction processing performance. You must balance the need for protection against transaction loss with the need for performance.
This applies to DB2, MySQL and MongoDB alike. You must choose.

MySQL dual master replication -- is this scenario safe?

I currently have a MySQL dual master replication (A<->B) set up and everything seems to be running swimmingly. I drew on the basic ideas from here and here.
Server A is my web server (a VPS). User interaction with the application leads to updates to several fields in table X (which are replicated to server B). Server B is the heavy-lifter, where all the big calculations are done. A cron job on server B regularly adds rows to table X (which are replicated to server A).
So server A can update (but never add) rows, and server B can add rows. Server B can also update fields in X, but only after the user no longer has the ability to update that row.
What kinds of potential disasters can I expect with this scenario if I go to production with it? Or does this seem OK? I'm asking mostly because I'm ignorant about whether any simultaneous operation on the table (from either the A copy or the B copy) can cause problems or if it's just operations on the same row that get hairy.
Dual master replication is messy if you attempt to write to the same database on both masters.
One of the biggest points of contention (and high blood pressure) is the use of autoincrement keys.
As long as you remember to set auto_increment_increment and auto_increment_offset, you can look up any data you want and retrieve auto-incremented ids.
You just have to remember this rule: If you read an id from serverX, you must lookup needed data from serverX using the same id.
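As a quick illustration of why those two settings keep the servers from handing out the same id, the sequences can be generated in Python. The values (an increment of 2, offsets 1 and 2) are an assumed two-server example and the function name is made up; each server produces offset, offset + increment, offset + 2 × increment, and so on.

```python
from itertools import count, islice

def auto_increment_values(offset, increment):
    # Yields offset, offset + increment, offset + 2 * increment, ...
    return count(start=offset, step=increment)

# Assumed settings: auto_increment_increment = 2 on both servers,
# auto_increment_offset = 1 on serverA and 2 on serverB.
server_a_ids = list(islice(auto_increment_values(offset=1, increment=2), 5))
server_b_ids = list(islice(auto_increment_values(offset=2, increment=2), 5))

overlap = set(server_a_ids) & set(server_b_ids)
```

serverA only ever generates odd ids and serverB only even ones, so `overlap` is empty no matter how many rows each side inserts.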
Here is one saving grace for using dual master replication.
Suppose you have
two databases (db1 and db2)
two DB servers (serverA and serverB)
If you impose the following restrictions
all writes of db1 to serverA
all writes of db2 to serverB
then you are not required to set auto_increment_increment and auto_increment_offset.
I hope my answer clarifies the good, the bad, and the ugly of using dual master replication.
Here is a pictorial example of 4 masters using auto increment settings
Nice article from Percona on this subject
Master-master replication can be very tricky. Are you sure that this is the best solution for you? Usually it is used for load-balancing purposes (e.g. round-robin connections to your db servers) and sometimes when you want to avoid the replication lag effect. A big known issue is the auto_increment problem, which is supposedly solved by using different offset and increment values.
I think you should modify your configuration to simple master-slave by making A the master and B the slave, unless I am mistaken about the requirements of your system.
I think you can depend on Percona XtraDB Cluster (see "Percona XtraDB Cluster Feature 2: Multi-Master replication") rather than regular MySQL replication.
They promise the following:
By Multi-Master I mean the ability to write to any node in your cluster and do not worry that eventually you get out-of-sync situation, as it regularly happens with regular MySQL replication if you imprudently write to the wrong server.
With Cluster you can write to any node, and the Cluster guarantees consistency of writes. That is the write is either committed on all nodes or not committed at all.
The two important consequences of Multi-Master architecture.
First: we can have several appliers working in parallel. This gives us true parallel replication. Slave can have many parallel threads, and you can tune it by variable wsrep_slave_threads
Second: There might be a small period of time when the slave is out-of-sync from master. This happens because the master may apply an event faster than a slave. And if you do read from the slave, you may read data that has not changed yet. You can see that from the diagram. However, you can change this behavior by using variable wsrep_causal_reads=ON. In this case the read on the slave will wait until the event is applied (this however will increase the response time of the read). This gap between slave and master is the reason why this replication is named “virtually synchronous replication”, not real “synchronous replication”.
The described behavior of COMMIT also has the second serious implication.
If you run write transactions to two different nodes, the cluster will use an optimistic locking model.
That means a transaction will not check on possible locking conflicts during individual queries, but rather on the COMMIT stage. And you may get ERROR response on COMMIT. I am highlighting this, as this is one of incompatibilities with regular InnoDB, that you may experience. In InnoDB usually DEADLOCK and LOCK TIMEOUT errors happen in response on particular query, but not on COMMIT. Well, if you follow a good practice, you still check errors code after “COMMIT” query, but I saw many applications that do not do that.
So, if you plan to use Multi-Master capabilities of XtraDB Cluster, and run write transactions on several nodes, you may need to make sure you handle response on “COMMIT” query.
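A minimal sketch of what handling the COMMIT response can look like in application code. `CommitConflict` is a hypothetical stand-in for whatever error your driver raises when the commit is rejected (on Galera-based clusters this typically surfaces as a deadlock error), and `flaky_transaction` merely simulates two conflicts so that the example runs without a database.

```python
class CommitConflict(Exception):
    """Hypothetical stand-in for the driver error raised when COMMIT fails."""

def run_with_retry(transaction, attempts=3):
    # Re-run the whole transaction, not just the COMMIT: after a failed
    # commit the work has been rolled back.
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except CommitConflict:
            if attempt == attempts:
                raise  # give up and let the caller see the error

state = {"calls": 0}

def flaky_transaction():
    # Simulated transaction: the first two commits conflict, the third succeeds.
    state["calls"] += 1
    if state["calls"] < 3:
        raise CommitConflict("conflict detected at COMMIT")
    return "committed"

result = run_with_retry(flaky_transaction)
```

In a real application the callable would open the transaction, run its statements and commit, so that every retry starts from fresh reads.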
You can find it here along with a pictorial explanation.
From my rather extensive experience on this topic I can say you will regret writing to more than one master someday. It may be soon, it may not be for a long time, but it will happen. You will have two servers that each have some correct data and some wrong data, and you will either pick one as the authoritative source and throw the other away (probably without really knowing what you're throwing away) or you'll reconcile the two. No matter how you design it, you cannot eliminate the possibility of this happening, so it's a mathematical certainty that it will happen someday.
Percona (my employer) has handled probably several hundred cases of recovery after doing what you're attempting. Some of them take hours, some take weeks, one I helped with took a few months -- and that's with excellent tools to help.
Use a different replication technology or find a different way to do what you want to do. MMM will not help -- it will bring catastrophe sooner. You cannot do this with standard MySQL replication, with or without external tools. You need a replacement replication technology such as Continuent Tungsten or Percona XtraDB Cluster.
It's often easier to just solve the real need in some other fashion and give up multi-master writes, if you want to use vanilla MySQL replication.
Thanks for sharing my Master-Master MySQL cluster article. As Rolando clarified, this configuration is not suitable for most production environments due to the limitations of autoincrement support.
The most adequate way to get a MySQL cluster is using NDB, which requires at least 4 servers (2 management and 2 data nodes).
I have written a detailed article to get this running on two servers only, which is very similar to my previous article but using NDB instead.
http://www.hbyconsultancy.com/blog/mysql-cluster-ndb-up-and-running-7-4-and-6-3-on-ubuntu-server-trusty-14-04.html
Note that I always recommend analysing your needs and finding the most adequate solution; don't just look at the available solutions and try to figure out whether they fit your needs.
-Hatem
I would highly recommend looking into a tool that will manage this for you. Multi-master replication can be very troublesome if things go wrong.
I would suggest something like Percona XtraDB Cluster. I've been following this project, and it looks very cool. I definitely think it will be a game changer in the MySQL world. It's still in beta though.