Galera Mariadb Multi-Master Replication - mysql

I am trying to set up a cluster of 3 servers in 3 different locations: Dallas (US), London (UK), and Mumbai (India). At each location I have set up a web server and a DB server. On the DB servers I have configured a Galera MariaDB multi-master cluster to replicate the database among all three servers. Each web server connects over a local IP to its regional DB server. I expect that my Dallas web server will fetch DB records from the Dallas DB server, the London web server from the London DB server, and the Mumbai web server from the Mumbai DB server.
Everything is working well, but I have found that a MySQL query takes a long time, over 100s, to fetch records. I have tried MariaDB as a single instance and it fetches the data within 5s.
What am I doing wrong?

It is possible to configure Galera to be single-Master. It does not sound like you did that, but I suggest double-checking.
Given that all nodes are writable, here's a simplified view of what happens on each transaction.
1. Do all the work to store/update the data on the Master you are connected to. (Presumably, it is the local machine.)
2. At COMMIT time, make a single round-trip to each of the other nodes (probably ~200ms) to give them a chance to say "wait! that would cause a conflict".
3. Usually step 2 will come back with "it's OK". At this point, the COMMIT returns success to the client.
(Note: If you are not using BEGIN...COMMIT, but instead autocommit=ON, then there is an implicit COMMIT at the end of each DML statement.)
For a local read, the default action should return "immediately".
But maybe you are concerned about the "critical read" problem (cf. wsrep_sync_wait). In this case, you want to make sure that a write has propagated to your server. This is likely to add a 200ms delay to the read, because it waits until the "gcache" has caught up.
If you can assume that clients only read from the same server that they write to, consider setting wsrep_sync_wait=0. If anyone does a cross-datacenter write and then a read, they could hit the "critical read" problem. (This is where they write something, but may not see it on the next read.)
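The simplified view above can be put into a rough back-of-the-envelope model. This is only a sketch: the function name and the round-trip figures are hypothetical, and it assumes the other nodes are contacted in parallel, so that the slowest round trip dominates the COMMIT.

```python
def commit_latency_ms(local_work_ms, peer_rtts_ms):
    """Estimate the latency of one transaction on a multi-master node.

    Assumption: peers are contacted in parallel at COMMIT time, so the
    transaction pays for the local work plus the slowest round trip.
    A standalone server has no peers and pays only for the local work.
    """
    return local_work_ms + max(peer_rtts_ms, default=0)


# Hypothetical round-trip times from the Dallas node to London and Mumbai.
PEER_RTTS_MS = [110, 230]

clustered = commit_latency_ms(local_work_ms=5, peer_rtts_ms=PEER_RTTS_MS)
standalone = commit_latency_ms(local_work_ms=5, peer_rtts_ms=[])
print(f"clustered commit: {clustered} ms, standalone commit: {standalone} ms")
```

Under this model, a purely local read does not appear at all, which matches the point that reads should return "immediately" unless wsrep_sync_wait forces them to wait.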

Related

Galera Cluster High Ping

I have 3 game servers connected to the same database. I started using Galera Cluster to sync because MySQL remote connections are delayed by the distance between the hosts (BR, US, and FR), and my game server uses only one main thread for important queries.
This delay (lag) happens because the main thread needs to receive a callback (confirmation) before it continues running the application.
I thought that with Galera Cluster, using a local database with a ping of 0, the problem wouldn't happen anymore, but I don't know why: every time I run INSERTs and DELETEs on the database, the same lag happens.
In my application's debug output, I see that queries are sent locally in 0 ms, but it still lags.
My question is: does Galera (mysql-wsrep) need confirmation from the other nodes in the cluster?
Galera checks with all the other nodes during the COMMIT command. This is when the lag occurs. Of course, COMMIT is explicitly or implicitly (autocommit) part of any transaction, so every transaction has that lag.
This implies that the optimal use of a geographically-dispersed Galera Cluster is to put many actions in a single transaction. (On the other hand, too many things in a single transaction can lead to undoing too much if there is any failure/deadlock/etc.)
The lag between US and Europe is on the order of 100ms; is that what you are seeing?
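The point about putting many actions in a single transaction can be illustrated with some simple arithmetic. This is a sketch under stated assumptions: the function name is hypothetical, the 1 ms of local work per statement is made up, and the 100 ms commit cost is the rough US-Europe figure mentioned above.

```python
import math


def total_latency_ms(statements, work_ms, commit_rtt_ms, batch_size=1):
    """Total time to run `statements` DML statements.

    Every statement pays for its local work; every COMMIT pays for one
    cross-node round trip. batch_size=1 models autocommit, where each
    statement is its own transaction.
    """
    commits = math.ceil(statements / batch_size)
    return statements * work_ms + commits * commit_rtt_ms


autocommit = total_latency_ms(50, work_ms=1, commit_rtt_ms=100)
batches_of_10 = total_latency_ms(50, work_ms=1, commit_rtt_ms=100, batch_size=10)
one_transaction = total_latency_ms(50, work_ms=1, commit_rtt_ms=100, batch_size=50)
print(autocommit, batches_of_10, one_transaction)
```

The middle case reflects the trade-off in the answer: moderate batches recover most of the benefit while limiting how much work is undone if a transaction fails.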

AWS MySQL read replica instances keep getting Too Many Connections error

I've purchased a single VPC on AWS and launched 6 MySQL databases there, and for each one I've created a read replica, so that I can always run queries on the read replicas quickly.
Most of the day, my writing instances (the original instances) are fully loaded and their CPU usage is mostly at 99%. The read replicas, however, show only ~7-10% CPU usage, but sometimes I get a "TOO MANY CONNECTIONS" error when I run a service that connects to a read replica.
I'm not an expert with AWS, but is this happening because the writing replicas are fully loaded and they're on the same VPC?
"...is this happening because the writing replicas are fully loaded and they're on the same VPC?"
No, it isn't. This is unrelated to replication. In replication, the replica counts as exactly 1 connection on the master, but replication does not consume any connections on the replica itself. There is no impact on connections related to the intensity of the total workload from replication.
This issue simply means you have more clients connecting to the replica than are allowed by the parameter group based on your RDS instance type. Use the query SELECT @@MAX_CONNECTIONS; to see what this limit is. Use SHOW STATUS LIKE 'THREADS_CONNECTED'; to see how many connections exist currently, and use SHOW PROCESSLIST; (as the administrative user, or any user holding the PROCESS privilege) in order to see what all of these connections are doing.
If many of them show Sleep and have long values in Time (seconds spent in the current state) then the problem is that your application is somehow abandoning connections, rather than properly closing them after use or when they are otherwise no longer needed.
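As a rough illustration of that check, the sketch below filters rows shaped like SHOW PROCESSLIST output (using its Id, Command, and Time columns) for connections that have been sleeping a long time. The function name, the 300-second threshold, and the sample rows are all hypothetical; in practice the rows would come from your database client.

```python
def abandoned_connections(processlist, min_idle_seconds=300):
    """Return the Ids of connections that look abandoned.

    A connection is flagged when its Command is "Sleep" and its Time
    (seconds spent in the current state) is at least min_idle_seconds.
    """
    return [
        row["Id"]
        for row in processlist
        if row["Command"] == "Sleep" and row["Time"] >= min_idle_seconds
    ]


# Hypothetical sample shaped like SHOW PROCESSLIST rows.
SAMPLE_ROWS = [
    {"Id": 11, "Command": "Query", "Time": 0},
    {"Id": 12, "Command": "Sleep", "Time": 4200},
    {"Id": 13, "Command": "Sleep", "Time": 3},
    {"Id": 15, "Command": "Sleep", "Time": 900},
]

print(abandoned_connections(SAMPLE_ROWS))
```

A long list here points at the application leaking connections rather than at the replica's capacity.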

MySQL Replication Thousands of Writes Per Second

For my application I need my database to handle, say, 1000 updates per second at peak time. This isn't too much of a problem; I just need the right server. However, if this server goes down I need a backup with the synced data to take over. How do I sync the data to another database?
In a separate part of my application I have a master and a slave, the slave replicates the master and the slave is read only. Could I use this method for my problem? I have looked into mysql clusters but so far reading about clusters is just making me more confused.
So put simply, how can I replicate my database handling 1000 writes per second, in case of downtime?
There are two solutions: one is simple but requires manual reconfiguration in the event of the main server going down; the other is more complex but more robust.
A) Simple replication - you can configure a slave server that receives updates from the master server. Both servers must be able to handle the number of updates and queries that you foresee. In the event of the master server failing, you need to manually swap the slave into the master role. http://dev.mysql.com/doc/refman/5.0/en/replication.html
B) Clustering - I'm not very familiar with MySQL clustering, but it gives synchronous updates to all servers and automatic failover - http://www.mysql.com/products/cluster/

Writing into multiple MySQL databases async

I am using AWS RDS, so database replication between regions is impossible.
My application is written in PHP and deployed in all regions, and I am looking for a fast and reliable way to write to the databases in every region.
I am going to set the following on my MySQL connections:
SET @@auto_increment_increment = NUMBER_OF_WRITEABLE_DATABASES;
SET @@auto_increment_offset = REGION_ID;
so auto-increment primary keys will be unique across all regions.
My current plan is to keep a query log table with the fields id, queries, status, user_id. It will log all INSERT, UPDATE, and DELETE queries issued during a page load into the queries field.
Status Codes:
Status 0 => not executed
Status 1 => successfully executed on all regions
Status 2 => failed
Status 3 => failed with affected rows not match
Example Row:
id=>1
queries=>
INSERT INTO PROFILES VALUES (1,{USER_ID},'Username','Email')##SEPERATOR##AFFECTED_COUNT
UPDATE USERS SET last_modified='2012-12...' where id={USER_ID}##SEPERATOR##AFFECTED_COUNT
status=0
user_id=>{USER_ID}
There will be a daemon that reads records with status != 1 and processes them on all regions without committing; once all of them run without error it will commit, or roll back in case of error.
That is what I have thought of and am going to use.
My question: is there a more decent/tested approach to this scenario, or is there any problem with my approach?
Thanks in advance.
My initial thought is that you are going down the wrong path if you are trying to use RDS as a solution to enforce unique record ID's across multiple regions. I would think you might want to rethink your actual need for uniqueness across regions or enforce uniqueness using multiple columns (i.e. an autoincrement plus a region identifier). That could be read and put into some eventually consistent data store for read purposes.
You're making a commendable effort, but as the other commenters have stated, your solution isn't viable, for a number of reasons.
You don't really want to use auto_increment_offset and auto_increment_increment at the session level. You want to set those at the server level. If RDS won't let you do that, this is another reason why RDS is probably not the best solution.
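To see why a shared increment and a per-server offset keep keys apart, here is a small simulation of the idealized series offset + N × increment. It is a sketch only: the function name is hypothetical, three writable servers are assumed, and it models empty tables rather than MySQL's full behavior.

```python
from itertools import chain


def auto_increment_ids(offset, increment, count):
    """First `count` keys a server would hand out for the given settings."""
    return [offset + k * increment for k in range(count)]


WRITABLE_SERVERS = 3  # plays the role of auto_increment_increment

# Each server uses its own number (1..3) as auto_increment_offset.
ids_by_server = {
    server: auto_increment_ids(server, WRITABLE_SERVERS, 4)
    for server in range(1, WRITABLE_SERVERS + 1)
}

for server, ids in ids_by_server.items():
    print(f"server {server}: {ids}")

all_ids = list(chain.from_iterable(ids_by_server.values()))
print("collisions:", len(all_ids) - len(set(all_ids)))
```

The series interleave without overlap only as long as every writer uses the same increment and a distinct offset, which is why setting them per session rather than at the server level is fragile.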
If I came out and suggested that you deploy a global network of MySQL servers (EC2, not RDS) in a multi-master ring, where data replicates 1 => 2 => 3 => 4 => 1 and each server ignores incoming replication messages with its own server id, my fellow MySQL DBAs would accuse me of having lost my mind and setting you up for a difficult-to-manage situation; however, I am convinced that this would be a much easier solution than what you have proposed, because at least, then, the data would be changing around the world in pretty much the same order in which it actually changed -- which would reduce the likelihood of conflicting updates originating from multiple locations. MySQL replication is asynchronous, in the sense that server 1 does not wait for a transaction to be committed on server 2 before returning success to the client (indicating that the transaction has committed), but don't confuse that fact with the fact that it is sequential -- transactions are replicated on each server in the order in which they were committed. (New options in MySQL 5.6 allow some exceptions to this with parallel replication threads, but that isn't significant to this discussion).
Since you have devised a scheme for avoiding conflicting auto-increment values, your bigger problems are likely to come from updates and deletes. In the scenario I just described, if server 2 deleted a record and server 4 deleted the same record at the same time, then server 4 would stop replicating incoming events when it received the delete from server 2, because the "rows affected" would have been different. Your scenario would similarly fail. The difference is that with actual MySQL replication, nothing is applied after the conflicting event, so until you resolve that conflict, at least your data would not diverge any further into inconsistency, because of the sequential nature discussed above and the fact that MySQL replication completely stops whenever a conflict is encountered. In a ring of master servers, the server that has stopped replicating continues collecting a log of replication events from the upstream systems, but execution halts and the data on that server is frozen unless changed locally until the conflict is resolved and replication restarted.
Note also that in your scenario, you need to preserve "from" and "to" values for each column on updates, because you can't roll anything back unless you know what it rolls back to.
That being noted, a rollback needs to occur in real-time, not later. If I transfer money between two bank accounts, and for some reason that transfer needs to roll back, I need to see that while I'm using the bank's web site -- the bank can't roll that transaction back in the middle of the night just because one of their servers has a different balance in my bank account.
Here's a thought: In your scenario, if the account I was transferring "to" was consistent among all the servers, but the account I was transferring "from" was not, then I wonder... would your setup roll back the withdrawal from the "from" account, but leave the deposit in the "to" account? I think it might.
Keep in mind that you are limited by the CAP theorem. No system can be globally consistent, available, and tolerate isolation among the nodes. At best, you can pick any two.
With that thought, the question I have is this: why do all of the nodes in your global system need to be synchronized? If the main reason is performance, consider the possibility of deploying a single global master server, with read replicas distributed among the regions. Write your application with two pools of database connection threads so that most SELECT queries go to the local read replica, while INSERT, DELETE, UPDATE, and CALL (stored procedures that update data) are sent to the global master server. Your biggest worry, then, becomes the fact that you only have eventual consistency on the read replicas. With properly-sized servers and well-written queries, replication is very fast (subject to the laws of physics for global travel of optical and electrical signals) but it is not instantaneous. To cope with this, sessions that have recently made changes to the database may need to send their reads to the global master: if you place an order, you need to see the order immediately, so the master might be the best place to look right away. Later, looking at the local replica will work. You're still out of scope for RDS with this, because of the cross-regional issue... but MySQL on EC2 is a good fit.
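The two-pool routing described above might be sketched as follows. This is an illustration, not a definitive implementation: the function name, the pool labels, and the 2-second window during which a session keeps reading from the master after its own write are all assumptions. As a conservative default, anything that is not a plain SELECT is sent to the master.

```python
def choose_pool(sql, seconds_since_last_write=None, sticky_window_s=2.0):
    """Pick "master" or "replica" for one statement.

    - Anything other than SELECT (INSERT, UPDATE, DELETE, CALL, ...) goes
      to the global master.
    - A SELECT from a session that wrote within sticky_window_s also goes
      to the master, so the session can see its own recent changes.
    - Every other SELECT goes to the local read replica.
    """
    words = sql.split()
    keyword = words[0].upper() if words else ""
    if keyword != "SELECT":
        return "master"
    if seconds_since_last_write is not None and seconds_since_last_write < sticky_window_s:
        return "master"
    return "replica"


print(choose_pool("SELECT * FROM orders WHERE id = 42"))
print(choose_pool("INSERT INTO orders (user_id) VALUES (7)"))
print(choose_pool("SELECT * FROM orders WHERE user_id = 7", seconds_since_last_write=0.3))
```

The window should be chosen to exceed the typical replication delay to the local replica; a session that has not written recently passes None or a large value and reads locally.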
Read replicas impose a very small load on the master, but even this load can be mitigated by connecting a single read replica to the master and then connecting the downstream read replicas to that intermediate server.
Setting slave_compressed_protocol = 1 on the masters and the replicas will enable the machines to use compressed connections for transferring the replication events. I have found the compression ratio to be anywhere from 3:1 to 10:1, depending on the nature of the data being replicated, and the delay of compressing and decompressing the data seems insignificant.
Additionally, you could set up a second master, adjacent to the primary master (perhaps in a different A/Z), link those two servers with master-master replication, chain the read replicas to the 2nd master, use auto increment increment and offsets appropriately, but do not write to or read from the second master under normal conditions. Why would you do this? This way, you have a 2nd global master that could be placed into service immediately in case of failure of the primary master by redirecting your application to access it.
Of course, the nature of your application plays a large factor in how much global integration is actually required. Solving this problem will require you to rethink how the application works, to determine whether architectural changes are needed.
As a DBA, I don't like some of the restrictions and flexibility constraints that RDS imposes on me. All I really get in return for the loss-of-control is a relative ease of backups and point-in-time restoration... which I like... but, to me, these don't make up for the restrictions.
Footnote: In the 3rd paragraph, I said "transactions are replicated on each server in the order in which they were committed." But that doesn't necessarily mean in the real-world wall-clock actual-order in which they were committed... it actually means the order in which they were committed to each server relative to the other transactions being committed by that server... so a transaction on Server #1 that actually committed before a different transaction on Server #3 might arrive at server #4 after the transaction from #3 instead of before it, depending on how long the transaction took to propagate through server #2 and be committed on server #3. However, this is still "true enough" in principle, because if the transaction on #1 is perceived at server #3 as conflicting with whatever happened on #3, it will not actually replicate to #4 because #3 will stop replicating.

Does the MySQL NDB Cluster consider node distance? Will it use the replicates if they are nearer?

I'm building a very small NDB cluster with only 3 machines. This means that machine 1 will serve as MGM Server, MySQL Server, and NDB data node all at once. The database is only 7 GB so I plan to replicate each node at least once. Now, since a query might end up using data that is cached in the NDB node on machine one, even if that node isn't the primary source for that data, access would be much faster (for obvious reasons).
Does the NDB cluster work like that? Every example I see has at least 5 machines. The manual doesn't seem to mention how to handle node differences like this one.
There are a couple of questions here:
Availability / NoOfReplicas
MySQL Cluster can give high availability when data is replicated across 2 or more data node processes. This requires that the NoOfReplicas configuration parameter is set to 2 or greater. With NoOfReplicas=1, each row is stored in only one data node, and a data node failure would mean that some data is unavailable and therefore the database as a whole is unavailable.
Number of machines / hosts
For HA configurations with NoOfReplicas=2, there should be at least 3 separate hosts. 1 is needed for each of the data node processes, which has a copy of all of the data. A third is needed to act as an 'arbitrator' when communication between the 2 data node processes fails. This ensures that only one of the data nodes continues to accept write transactions, and avoids data divergence (split brain). With only two hosts, the cluster will only be resilient to the failure of one of the hosts; if the other host fails instead, the whole cluster will fail. The arbitration role is very lightweight, so this third machine can be used for almost any other task as well.
Data locality
In a 2 node configuration with NoOfReplicas=2, each data node process stores all of the data. However, this does not mean that only one data node process is used to read/write data. Both processes are involved with writes (as they must maintain copies), and generally, either process could be involved in a read.
Some work to improve read locality in a 2-node configuration is under consideration, but nothing is concrete.
This means that when MySQLD (or another NdbApi client) is colocated with one of the two data nodes, there will still be quite a lot of communication with the other data node.