MySQL Cluster Load Balancing across multiple Slaves - mysql

I'm in the process of creating a MSSS mysql setup is there a way to load balance the Slaves so that the web box simply asks for a slave connection and one is given to it that isnt being hammered.
my current plan, which isnt ideal, is to do a round robin random approach for each connection. The issue i have though, is what if one of the Slaves breaks, i am unsure how to remove this from rotation.
I'm wondering if others have created a cluster like this and how they manage/maintain it. as i'm a bit clueless.

I'm about to vote to close this as too broad but I'll give you some pointers (using an answer as a comment doesn't provide much space).
Managing clusters is hard.
Have a look at HAProxy and mysqlproxy, both of which are capable of doing this - the trick is planning how your cluster should behave -
Should server state be shared across clients - so if one client fails to connect none will try the server again. How do you deal with 'Too many connections'?
Are you trying to improve availability as well as performance? (single points of failure, preconfiguring one of the slaves as a master but directing writes to the designated master)
Will you have a requirement to fence nodes?
Is asynchronous replication a requirement? (compared with semi-synch or multi-master)
How do you direct writes to the master?

Related

MySQL Replication: Question about a fallback-system

I want to set up a complete server (apache, mysql 5.7) as a fallback of a productive server.
The synchronization on file level using rsync and cronjob is already done.
The mysql-replication is currently the problem. More precisely: the choice of the right replica method.
Multi primary group replication seemed to be the most suitable method so far.
In case of a longer production downtime, it is possible to switch to the fallback server quickly via DNS change.
Write accesses to the database are possible immediately without adjustments.
So far so good: But, if the fallback-server fails, it is in unreachable status and the production-server switches to read only, since its group no longer has the quota. This is of course a no-go.
I thought it might be possible using different replica variables: If the fallback-server is in unreachable state for a certain time (~5 minutes), the production-server should stop the group_replication and start a new group_replication. This has to happen automatically to keep the read-only time relatively low. When the fallback-server is back online, it should be manually added to the newly started group. But if I read the various forum posts and documentation correctly, it's not possible that way. And running a Group_Replication with only two nodes is the wrong decision anyway.
https://forums.mysql.com/read.php?177,657333,657343#msg-657343
Is the master - slave replication the only one that can be considered for such a fallback system? https://dev.mysql.com/doc/refman/5.7/en/replication-solutions-switch.html
Or does the Group_Replication offer possibilities after all, if you can react suitably to the quota problem? Possibilities that I have overlooked so far.
Many thanks and best regards
Short Answer: You must have [at least] 3 nodes.
Long Answer:
Split brain with only two nodes:
Write only to the surviving node, but only if you can conclude that it is the only surviving node, else...
The network died and both Primaries are accepting writes. This to them disagreeing with each other. You may have no clean way to repair the mess.
Go into readonly mode with surviving node. (The only safe and sane approach.)
The problem is that the automated system cannot tell the difference between a dead Primary and a dead network.
So... You must have 3 nodes to safely avoid "split-brain" and have a good chance of an automated failover. This also implies that no two nodes should be in the same tornado path, flood range, volcano path, earthquake fault, etc.
You picked Group Replication (InnoDB Cluster). That is an excellent offering from MySQL. Galera with MariaDB is an equally good offering -- there are a lot of differences in the details, but it boils down to needing 3, preferably dispersed, nodes.
DNS changes take some time, due to the TTL. A proxy server may help with this.
Galera can run in a "Primary + Replicas" mode, but it also allows you to run with all nodes being read-write. This leads to a slightly different set of steps necessary for a client to take to stop writing to one node and start writing to another. There are "Proxys" to help with such.
FailBack
Are you trying to always use a certain Primary except when it is down? Or can you accept letting any node be the 'current' Primary?
I think of "fallback" as simply a "failover" that goes back to the original Primary. That implies a second outage (possibly briefer). However, I understand geographic considerations. You may want your main Primary to be 'near' most of your customers.
I recommend using the Galera MySQL cluster with HAProxy as a load balancer and automatic failover solution. we have used it in production for a long time now and never had serious problems. The most important thing to consider is monitoring the replication sync status between nodes. also, make sure your storage engine is InnoDB because Galera doesn't work with MyISAM.
check this link on how to setup :
https://medium.com/platformer-blog/highly-available-mysql-with-galera-and-haproxy-e9b55b839fe0
But in these kinds of situations, the main problem is not a failover mechanism because there are many solutions out of the box, but rather you have to check your read/write ratio and transactional services and make sure replication delays won't affect them. some times vertically scalable solutions with master-slave replication are more suitable for transaction-sensitive financial systems and it really depends on the service your providing.

Should I be using MySQL Cluster, Master to Master replication, or something else?

I started just using a solo MySQL server and moved on to using Master to Slave replication. This is all inside our network. I have a cloud application where customers post orders to our system. When our ISP goes down it's a nightmare.
I'm looking to have one server on-site and one server off-site that can be in sync and if the one goes down the other one can take over and not miss a step. I have my DNS failover in place and 2 web servers, but I can't decide what I need to do for the MySQL servers.
I don't mind putting in the work to learn MySQL cluster, but I'm not sure if that is the correct solution or master to master or something else?
Scale: I have an orders table currently sitting at 150,000 rows and that could grow to 500,000 this year and possibly start getting into the millions over the next couple years.
Any advice would be greatly appreciated as I never had any formal schooling on the issue.
Thanks in advance.
Simpliest way - just place your server in some reliable datacenter. This way you can reduce failure rate to be tolerable to be handled semi-manually with a master-slave configuration.
If you need to host on-site - then look into improving your site's connectivity, like all datacenters do - have backup ISP channels, with own AS(ip autonomous system) and BGP routing - so when one ISP fails - even ips stay the same, traffic just balances to the others.
Mysql server does not support master-master replication, only mysql cluster supports multi-master, so in fact if you need fast failover - question is whether to host it yourself or use DBaaS (database-as-a-service, folks like cleardb).
DBaaS with SLA and failover is quite expensive, also adds some network delays because your own app and db servers groups most probably are in the same datacenters. But on the plus side - they are easier and faster to setup.

Mysql remote synch

We currently have an application located on a remote server, and our call center uses this application to perform customer transactions.
We plan to setup asterisk on a local server to help us with all the call routing and recording, for asterisk to work smoothly we have to move our application from the remote server to the local.
Its will be easy to mover all data to the local server and do transactions locally, but there is an option for users to do transactions online too which will hit the remote server database.
The reason we still have the remote application because of the reliable infrastructure and backup solution provided by rackspace.
If we move application to local server i am looking at a reliable solution for syncing remote and local databases so that we can handle local as well as online transactions.
Why not use mysql master-master replication and hold definitive data at both ends? (Note you'll have to do some reading on on auto_increment_increment and auto_increment_offset)
symcbean's answer is basically correct. I'd add this article as a good starting place to understand master-master replication. I'd further recommend High Performance MySQL as a good reference for a deeper understanding of the techniques and issues.
There are some issues that you will have to face doing writes to two non-colocated MySQL servers. You'll have replication lag to deal with, so the databases won't necessarily be completely in sync, but will only be "eventually consistent". Also, if you have both sides doing updates on content, you can end up with data integrity issues. If your system leans towards INSERTs more then UPDATES for the write operations, it is less likely that you'll run into issues. Also, if the subset of data that is likely to be modified tends to be localized around one or the other of the servers, you'll run into fewer issues.
Otherwise, you'll probably want to roll your own solution that is designed towards the specific use cases of your application.

What would be my best MySQL Synchronization method?

We're moving a social media service to be on separate data centers as our other hosting provider's entire data center went down. Twice.
This means that both websites need to be synchronized in some sense -- I'm less worried about the code of the pages, that's easy enough to sync, but they need to have the same database data.
From my research on SO, it seems MySQL Replication is a good option, but the MySQL manual, for scaling out, says that its best when there are far more reads then there are writes/updates:
http://dev.mysql.com/doc/refman/5.0/en/replication-solutions-scaleout.html
In our case, it's about equal. We're getting around 200-300 thousand requests a day right now, and we can grow rapidly. Every request is both a read and write request.
What would be the best method or tool to handle this?
Replication isn't instantaneous, and all writes have to be sent over the wire to the remote servers, so it takes bandwidth too. As long as this works for you and you understand the consequences, then don't worry about the read/write ratio.
However, are you sure that you need global replication? We handle millions of requests and have one location, with multiple web servers connected to two databases. One database is the live database, and the other is a replicated read only database.
We do have global fail over locations, and some people connect to these on any day, even if our main node is up because they have Internet issues. The data just trickles in though.
If the main node went down, then every body would be using the global fail over locations, in order. So, if our main node died, all customers would connect to Denver. If Denver went down, they'd all connect to Columbus.
Also, our main node is on two different Internet providers, so one ISP going down doesn't take us down.
Is the connection speed between two datacenters good enough? You can copy files to a new server and move database there. And then setup old server so that it will connect to new server's MySQL database in another DC? This will be slower of course, but depending on the nature of your queries it can be acceptable. As soon as DNS or whatever moves/finishes, you just power off the old server when there is no more requests for it.
To help you to assess your options you need to consider what your requirements are in a disaster recovery scenario (i.e. total loss of the system in one data-centre).
In particular for this scenario, how much data can you afford to lose (recovery point objective - RPO), and how quickly do you need to have the standby data-centre version of the site up and running (recovery time objective - RTO).
For example if your RPO is no transactions lost and recovery in 5 minutes, then the solution would be different than if you can afford to lose 5 mins of transactions and an hour to recover.
Another question I'd ask is if you're using SAN storage at all? This gives you options for replication at the storage level (SAN array to SAN array), rather than at the database level (e.g. MySQL replication).
Also to consider is the distance between the data-centres (e.g. timewise can you afford to perform a synchronous write to both databases, or would an asynchronous replication approach be more appropriate)

MySQL Replication - one website, many servers, different continents

Consider a reasonably large website (2M+ pageviews / m, lots of users) with 2 frontend servers: one front server in the US, and one in Europe. Two dedicated URL bring the visitors on one of the server, one in the french language, the other one in english. Both sites share exactly the same data.
What would be the most cost effective solution? (DB used at my company: MySQL)
1/ A single Master server on Amazon EC2 (US), and slaves on the frontend servers?
Advantages: no master-master rep, meaning no risk of data conflict with autoincrement and duplicates on unique columns, etc..
Drawbacks: The lag! Won't there be too much lagging for writing in the US when you are in Europe?
Another drawback could be the lack of quick n dirty solution in case the master dies. And what about having slaves on same server as front?
2/ Two Amazon EC2 instances, one in the US, one in Europe, acting as master-master replication servers. Plus two slaves on each of the frontends?
Adv: Speed, and security of data. Of course there is no load balancer, but making a hack to switch the master to the other one seems pretty trivial.
Drwbcks: Price. And the risk of corruption on the DB
3/ Any other solution ?
As it is my first time working with servers in 2 continents, I would really appreciate learning from you experience in that area, including MySQL or not, including EC2 or not.
Thanks
Marshall
As usual, what I'm about to say depends on your app, how it uses the database, etc. You need to ask yourself:
If you're using off the shelf software, what have other people done in this situation?
Does the app need to work on the entire dataset, or can you partition?
Is your app built to handle multi-master replication (usually means uses autoincrement pk's)
What are the chances of update/delete collisions? What are the costs?
What's the read:write ratio? What's the nature of the writes? Are they typically update or append operations?
I'm assuming the french server is in Europe, while the English server is in the US? If you can partition your data so that the french site uses one DB and the english site uses the other, you're better off. Even if both sites access both DB's, since you don't have to worry about collisions. You can even run two mysql instances on each master server and do multimaster replication for both.
If you can't partition, I'd probably go with #2, but I'd designate one of the machines as the 'true' master and send all the writes to it to help avoid data clobber. This way it's easy to switch in a pinch.
If you're cost sensitive and you're going to run replicas on your front end servers anyway, just run the master databases on the front end servers. You can always pull it off later. Replicas can often have higher CPU/IO costs than masters taking the same read load: they have to execute their writes in serial, which can really screw things up.
Also, don't use m1.small instances for your DB. Or at least keep an eye on your performance. m1.smalls are significantly under powered, and if you watch top, you'll notice a significant percentage of your CPU time being stolen by the hypervisor. I recommend c1.medium's.
Don't use master-master replication, ever. There is no mechanism for resolving conflicts. If you try to write to both masters at the same time (or write to one master before it has caught up with changes you previously wrote to the other one), then you will end up with a broken replication scenario. The service won't stop, they'll just drift further and further apart making reconciliation impossible.
Don't use MySQL replication without some well-designed monitoring to check that it's working ok. Don't assume that becuase you've configured it correctly initially it'll either keep working, OR stay in sync.
DO have a well-documented, well-tested procedure for recovering slaves from being out of sync or stopped. Have a similarly documented procedure for installing a new slave from scratch.
Your application may need sufficient intelligence to know that a slave is out of sync or stopped, and that it should not be used, if you care about correct or up-to-date data. You'll need some kind of feedback from your monitoring to do this.
If you have a slave in, say the US when your master is in Europe, that would normally give you the amount of latency you expect, i.e. something in the order of 150ms more than if they were co-located.
In MySQL, the slave does not start a query until the master finishes it, so it will always be behind by the length of time an update takes.
Also, the slave is single-threaded, so a single "hard" update query will delay all subsequent ones.
If you're pushing your master hard on multithreaded write-load, assuming your slaves have identical hardware, it is very unlikely that they'll be able to keep up.
We are looking at a similar scenario - after Amazon Eastcoast has completely been cut off the net twice this week - meaning not even being replicated in multiple regions and using RDB instances in kept us available.
But DRB does not allow crossing from East to West or even into Europe.
We are now reviewing the approach of Master Master in East and West or even Europe with one master acting as a failover only, and failover via dnsmadeeasy which responds extremely fast.
Advantage: quick and reliable failover, short downtime, no complex management of the failover function.
Disadvantage: One extra system running without using it - but compared to using RDB that's not more expensive
DRB is nicely managed by Amazon including point in time recovery and so on - all that is lost by switching away from it. But the fact that it is limited to replications within only one area and that area can be completely cut off make it problematic. As an alternative to RDB backup we are looking at Zmanda open source tools to take care of backup management. NOt yet tested, but based on all our stuffing around with failover and databases and hardware and so this looks like the simplest and therefore most promising approach for high availiability.
This question is old, but the solution exists now: Galera. It does MySQL (InnoDB) replication, and works well with WANs, too. http://codership.com/