I have 3 game servers connected to the same database. I started using Galera Cluster to sync because a remote MySQL connection suffers a delay due to the distance between the hosts (BR, US, and FR), and my game server uses only one main thread for important queries.
This delay (lag) happens because the main thread needs to receive a callback (confirmation) before the application can continue running.
I thought that with Galera Cluster, using a local database with ~0 ping, the problem would not happen anymore, but I don't know why: every time I run INSERTs and DELETEs on the database, the same lag occurs.
In my application's debug output, I see that queries are sent locally in 0 ms, but it is still lagging.
My question is: does Galera (mysql-wsrep) need confirmation from the other nodes?
Galera checks with all the other nodes during the COMMIT command. This is when the lag occurs. Of course, COMMIT is explicitly or implicitly (autocommit) part of any transaction, so every transaction has that lag.
This implies that the optimal use of a geographically-dispersed Galera Cluster is to put many actions in a single transaction. (On the other hand, too many things in a single transaction can lead to undoing too much if there is any failure/deadlock/etc.)
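As a minimal sketch (table and column names here are hypothetical), batching several statements into one transaction means the cross-node certification happens once, at COMMIT, instead of once per statement:

    BEGIN;
    INSERT INTO player_items (player_id, item_id) VALUES (42, 7);
    DELETE FROM player_items WHERE player_id = 42 AND item_id = 3;
    UPDATE players SET gold = gold - 100 WHERE player_id = 42;
    COMMIT;  -- Galera certifies the whole writeset here: one WAN round trip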
The lag between US and Europe is on the order of 100ms; is that what you are seeing?
As per How To Set Up Replication in MySQL,
Once the replica instance has been initialized, it creates two threaded processes. The first, called the IO thread, connects to the source MySQL instance and reads the binary log events line by line, and then copies them over to a local file on the replica's server called the relay log. The second thread, called the SQL thread, reads events from the relay log and then applies them to the replica instance as fast as possible.
Isn't it contradictory to the theory of master-slave database replication in which the master copies data to the slaves?
Reliability. (A mini-history of MySQL's efforts.)
When a write occurs on the Primary, N+1 extra actions occur:
One write to the binlog -- this is to allow for any Replicas that happen to be offline (for any reason); they can come back later and request data from this file. (Also see sync_binlog)
N network writes, one per Replica. These are to get the data to the Replicas ASAP.
Normally, if you want more than a few Replicas, you can "fan out" through several levels, thereby allowing for an unlimited number of Replicas. (10 per level would give you 1000 Replicas in 3 layers.)
The product called Orchestrator carries this to an extra level -- the binlog is replicated to an extra server and the network traffic occurs from there. This offloads the Primary. (Booking.com uses it to handle literally hundreds of replicas.)
On the Replica's side the two threads were added 20 years ago because of the following scenario:
The Replica is busy doing one query at a time.
It gets busy with some long query (say an ALTER)
Lots of activity backs up on the Primary
The Primary dies.
Now the Replica finishes the ALTER but has nothing else to work on, so it is very "behind" and will take extra time to "catch up" once the Primary comes back online.
Hence, the 2-thread Replica "helps" keep things in sync, but it is still not fully synchronous.
Later there was "semi-synchronous" replication and multiple SQL threads in the Replica (still a single I/O thread).
Finally, InnoDB Cluster and Galera became available to provide [effectively] synchronous replication. But they come with other costs.
"master-slave database replication in which the master copies data to the slaves" - it's just a concept - data from a leader is copied to followers. There are many options how this could be done. Some of those are the write ahead log replication, blocks replication, rows replication.
Another interesting approach is to use a replication system completely separate from the storage. An example of this is Bucardo, a replication system for PostgreSQL. In that case, neither the leader nor the follower actually does the replication work.
I am trying to set up a cluster of 3 servers in 3 different locations: Dallas-US, London-UK, Mumbai-India. In each location I have set up a web server and a DB server. On the DB servers I have configured a Galera MariaDB multi-master cluster to replicate the DB among all three servers. Each of my web servers connects via local IP to its regional DB server. I expect my Dallas web server to fetch DB records from the Dallas DB server, the London web server from the London DB server, and the Mumbai web server from the Mumbai DB server.
Everything is working well, but I have found that a MySQL query takes a long time, above 100s, when fetching records. I have tried MariaDB as a single instance and it fetches the data within 5s.
What am I doing wrong?
It is possible to configure Galera to be single-master. It does not sound like you did that, but I suggest double-checking.
Given that all nodes are writable, here's a simplified view of what happens on each transaction.
Do all the work to store/update the data on the Master you are connected to. (Presumably, it is the local machine.)
At COMMIT time, make a single round trip to each of the other nodes (probably ~200ms) to give them a chance to say "wait! that would cause a conflict".
Usually step 2 will come back with "it's OK". At this point, the COMMIT returns success to the client.
(Note: If you are not using BEGIN...COMMIT, but instead autocommit=ON, then there is an implicit COMMIT at the end of each DML statement.)
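As a minimal sketch of that implicit-commit behavior (the table name is hypothetical):

    SET autocommit = ON;
    -- Each statement below is its own transaction, so each one pays the
    -- certification round trip at its implicit COMMIT:
    INSERT INTO scores (player, pts) VALUES ('a', 10);
    INSERT INTO scores (player, pts) VALUES ('b', 20);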
For a local read, the default action should return "immediately".
But maybe you are concerned about the "critical read" problem (cf. wsrep_sync_wait). In this case, you want to make sure that a write has propagated to your server. This is likely to add a ~200ms delay to the read, because it waits until the node has caught up applying the pending writesets (the "gcache").
If you can assume that clients only read from the same server they write to, consider setting wsrep_sync_wait=0. If anyone does a cross-datacenter write, then a read, they could hit the "critical read" problem. (This is where they write something but may not see it on the next read.)
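For reference, a sketch of how this can be set per session (wsrep_sync_wait is a real Galera variable; 0 disables the wait, 1 enables the check before read statements):

    -- Reads return immediately with no causality check
    -- (safe when clients read from the same node they write to):
    SET SESSION wsrep_sync_wait = 0;
    -- Alternative: wait for pending writesets to be applied before each SELECT:
    -- SET SESSION wsrep_sync_wait = 1;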
I am using MySQL/MariaDB version 10.x with the InnoDB storage engine.
I want to set up a cluster with a master-slave configuration. There is an option to make slaves read-only using --innodb-read-only or --read-only.
However, in addition to the above, the client needs to read data from a slave if and only if the slave's lag is less than x seconds.
Slaves can lag behind the primary due to network congestion, low disk throughput, long-running operations, etc. A read preference with a max-allowed-staleness option would let the application specify a maximum replication lag, or "staleness", for reads from slaves. When a slave's estimated staleness exceeds that limit, the client stops using it for reads and starts reading from the master.
I would like to know: is there such an option in MySQL/InnoDB?
There's no automatic option for switching the query to the master. This is handled by application logic.
You can run a query SHOW SLAVE STATUS and one of the fields returned is Seconds_Behind_Master. You would have to write application code to check this, and if the lag is greater than your threshold, query the master instead.
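A minimal sketch of that check (the threshold is whatever value your application chooses; interpreting the output is up to the application code):

    SHOW SLAVE STATUS;
    -- Inspect the Seconds_Behind_Master column of the result:
    --   NULL         -> replication is broken; do not read from this slave
    --   > threshold  -> too stale; route this query to the master instead
    --   <= threshold -> safe to read from this slave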
You might find some type of proxy that can do this logic for you. See https://mydbops.wordpress.com/2018/02/19/proxysql-series-mysql-replication-read-write-split-up/
It's not always the best option to treat a replica with X seconds of lag as unusable. Some queries are perfectly okay regardless of the lag. I wrote a presentation about this some years ago, and it includes some example queries. Read / Write Splitting with MySQL and PHP (Percona webinar 2013)
There are many proxy products that may implement such logic.
If you automatically switch to the Master, it may get overwhelmed, thereby leading to worse system problems.
If you try to switch to another Slave, it is too easy to get into a flapping situation.
Galera has a way to deal with "critical read", if you wanted to go to a Cluster setup instead of Master + Slaves.
If part of the problem is the distance between Master and Slave, and if you switch to the Master, where is the Client? If it is near the Slave, won't the added time to reach the master cancel out some of the benefit?
Avoid long-running queries, beef up the Slave to avoid slow disks, speed up queries that are hitting the disk a lot, look into network improvements.
In summary, I don't like the idea of attempting to move a query to the Master; I would work on dealing with the underlying problem.
MariaDB MaxScale has multiple ways of dealing with replication lag.
The simplest method is to limit the maximum allowed replication lag with the max_slave_replication_lag parameter. This works exactly the way you described: if a slave is too many seconds behind the master, other slaves and, as a last resort, the master is used. This is the most common method of dealing with replication lag in MaxScale.
Another option is to use the causal_reads feature, which leverages MASTER_GTID_WAIT and other features found in MariaDB 10.2 and newer. This allows read consistency without adding load on the master, but it comes at the cost of latency: if the server is lagging several seconds behind, the read could take longer. This option is used when data consistency is critical but request latency is not as important.
The third option is to use the CCRFilter to force reads to the master after a write happens. This is a simpler approach compared to causal_reads but it provides data consistency at the cost of increased load on the master.
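As a sketch of the first method above, a MaxScale readwritesplit service definition might look like this (server names and credentials are placeholders; 10 seconds is an example threshold):

    [Split-Service]
    type=service
    router=readwritesplit
    servers=dbserv1,dbserv2,dbserv3
    user=maxscale_user
    password=maxscale_pw
    max_slave_replication_lag=10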
We created a read replica on our MySQL RDS server, and our master instance has Multi-AZ enabled. When we tried to force a failover for testing, our read replica's IO thread stopped and we got:
Error 1236: fatal error: binary logs got corrupted.
To avoid this replica failure it is recommended to set innodb_flush_log_at_trx_commit=1 and sync_binlog=1, but when we set these variables as recommended, our write performance degrades by 50%-60%.
Is there any way we can avoid this replication error without setting the values recommended above? And if setting them is necessary, can you suggest how we can improve our write performance?
This Answer applies to MySQL replication in general, not just AWS.
If you are that close to exceeding the capacity of the system, you need to do some serious research into what is going on.
The short answer is to combine (where practicable) multiple transactions into one. innodb_flush_log_at_trx_commit=1 involves an 'extra' fsync at the end of each transaction. So, fewer transactions --> less I/O --> less contention.
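To make the trade-off concrete, a sketch (the table name is hypothetical): with the fully durable settings each COMMIT costs a synchronous flush, so wrapping N single-row writes in one transaction pays that cost once instead of N times:

    SET GLOBAL innodb_flush_log_at_trx_commit = 1;  -- flush the redo log at every COMMIT
    SET GLOBAL sync_binlog = 1;                     -- sync the binlog at every commit

    BEGIN;
    INSERT INTO events (msg) VALUES ('a');
    INSERT INTO events (msg) VALUES ('b');
    INSERT INTO events (msg) VALUES ('c');
    COMMIT;  -- one synchronous flush for all three rows, instead of three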
Before I understood what was going on, I ran with sync_binlog=0. When something did crash, the binlogs were not actually "corrupt", but the Slaves would be pointing to an "impossible position". This is because the position information had been sent to the Slave before it was actually written to disk on the Master. The solution was simple: move the pointer (on the Slave) to the beginning (Pos=0 or 4) of the next binlog (on the Master).
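That repositioning looked roughly like this (a sketch; the binlog file name is a placeholder for the next file on the Master):

    STOP SLAVE;
    CHANGE MASTER TO
        MASTER_LOG_FILE = 'mysql-bin.000124',  -- the next binlog file on the Master
        MASTER_LOG_POS  = 4;                   -- the start of that file
    START SLAVE;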
I suspect (without any real evidence) that innodb_flush_log_at_trx_commit has more impact on performance than sync_binlog.
Now for some AWS-specifics. If "multi-AZ" means that every disk write is writing to machines in two different datacenters, then the issue goes beyond just the two settings you brought up. If, instead, it means that the Slave is remote from the Master, then it is acting like ordinary MySQL Replication (for this Q&A).
I've purchased a single VPC on AWS and launched 6 MySQL databases in it, and for each one I've created a read replica, so that I can always run queries on the read replicas quickly.
For most of the day my writer instances (the original instances) are fully loaded, with CPU mostly at 99%. The read replicas, however, show only ~7-10% CPU usage, yet sometimes I get a "TOO MANY CONNECTIONS" error from a service that connects to a read replica.
I'm not an expert with AWS, but is this happening because the writer instances are fully loaded and they're on the same VPC?
is this happening because the writer instances are fully loaded and they're on the same VPC?
No, it isn't. This is unrelated to replication. In replication, the replica counts as exactly one connection on the master, and replication does not consume any connections on the replica itself. The intensity of the replication workload has no effect on the replica's connection count.
This issue simply means you have more clients connecting to the replica than are allowed by the parameter group for your RDS instance type. Use the query SELECT @@MAX_CONNECTIONS; to see what this limit is. Use SHOW STATUS LIKE 'THREADS_CONNECTED'; to see how many connections currently exist, and use SHOW PROCESSLIST; (as the administrative user, or any user holding the PROCESS privilege) to see what all of these connections are doing.
If many of them show Sleep and have long values in Time (seconds spent in the current state) then the problem is that your application is somehow abandoning connections, rather than properly closing them after use or when they are otherwise no longer needed.
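For quick reference, the diagnostic sequence described above:

    SELECT @@max_connections;               -- the configured connection limit
    SHOW STATUS LIKE 'Threads_connected';   -- connections currently open
    SHOW PROCESSLIST;                       -- what each connection is doing (check Command and Time)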