MySQL HA on Kubernetes (VMware)

I have set up MySQL HA as per https://kublr.com/blog/setting-up-mysql-replication-clusters-in-kubernetes-2/. I have two nodes up and ready, am able to deploy pods on each of them, and can replicate data from master to slave within seconds.
1 Master node
2 Slave nodes
VMware ESXi setup with 3 VMs on separate subnets
I also have an NFS share set up in case it is required.
How do I perform automatic failover and scaling?

Asynchronous master-slave replication of MySQL is not the best fit for this. I would go for something like Galera replication, where all the nodes in the cluster are active, any node can act as a seed node for new joiners when you scale up, and a simple readiness probe is enough to exclude faulty nodes from, or include new ones in, the Galera cluster.
Asynchronous master-slave replication is a good choice for cases that are, e.g., geographically distributed, so that the latency does not affect your workloads.
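To illustrate the readiness probe mentioned above, here is a minimal sketch. It assumes the mysql client is available inside the pod and that the root password is exposed via an environment variable (both assumptions, not part of the original setup); in Galera, wsrep_local_state = 4 means the node is Synced:

    #!/bin/sh
    # galera-ready.sh - intended to be run by a Kubernetes exec readiness
    # probe. Only a Synced node (wsrep_local_state = 4) should receive
    # traffic or act as a seed for new joiners.
    STATE=$(mysql -u root -p"$MYSQL_ROOT_PASSWORD" -N -B \
        -e "SHOW STATUS LIKE 'wsrep_local_state'" | awk '{print $2}')
    [ "$STATE" = "4" ]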


Can I run user pods on OpenShift master or infra nodes?

I am new to OpenShift, so I am a bit confused about whether we can run user pods on master or infra nodes. We have 2 worker nodes plus one master node and one infra node, making 4 nodes in total. The reason for the change is to share the load across all 4 nodes rather than just the 2 compute nodes.
From reading some documents it seems possible to assign 2 roles to one node, but is there any security risk, or is it not best practice?
Running OpenShift version v3.11.0+d699176-406.
"if we can run user pods on master or infra nodes"
Yes, you absolutely can. The easiest way is to configure that at installation time; see https://docs.openshift.com/container-platform/3.11/install/example_inventories.html#multi-masters-using-native-ha-ai for an example.
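For illustration, a rough sketch of what the [nodes] section of an openshift-ansible 3.11 inventory could look like; the hostnames are hypothetical, and the node group names should be verified against the linked example inventories:

    # Sketch only: one node carries all three roles so user pods can land
    # on it; the other two stay pure compute. Hostnames are placeholders.
    cat >> inventory.ini <<'EOF'
    [nodes]
    master.example.com  openshift_node_group_name='node-config-all-in-one'
    node1.example.com   openshift_node_group_name='node-config-compute'
    node2.example.com   openshift_node_group_name='node-config-compute'
    EOF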
"is there any security risk or is it not best practice"
Running a single master node or a single infra node is already a risk to the high availability of your cluster. If the master fails, your cluster is basically headless; if the infra node fails, you lose your internal registry and routers, and thus external access and the ability to build new images for your imagestreams. This also applies to host OS upgrades: you will have to reboot the master and infra nodes some day. Are you okay with a guaranteed downtime during patching? And what if something goes wrong during the update?
Regarding running user workloads on master and infra nodes: if you are not running privileged SCCs (which can allow pods to run privileged, use arbitrary UIDs on the host, etc.), you are somewhat safe from a container breakout, assuming there are no known bugs in the container engine you are using. However, you should pay close attention to your resource consumption and avoid running any workloads without CPU and memory limits, because overloading a master node may result in cluster degradation. You should also monitor disk usage, since running user pods loads more images into your docker storage.
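As a minimal sketch of that last point about limits (the project name and the numbers are made up for illustration), you could enforce default requests and limits so nothing in the project runs unbounded:

    # Give every container in the project default CPU/memory requests and
    # limits, so user pods cannot starve nodes shared with master/infra.
    cat <<'EOF' | oc apply -f -
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-limits
      namespace: user-workloads
    spec:
      limits:
      - type: Container
        default:
          cpu: 500m
          memory: 512Mi
        defaultRequest:
          cpu: 100m
          memory: 128Mi
    EOF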
So basically it boils down to this:
It is better to have multiple masters (ideally 3) and a couple of infra nodes than a single point of failure for both of these roles. Having separate master and worker nodes is of course better than hosting them together, but a multi-master setup, even with user workloads, should be more resilient if you watch resource usage carefully.

How do distributed databases such as Redis and Cassandra work in a microservices architecture?

Suppose I have a microservice that updates or reads data from Redis and Cassandra. Now suppose I scale up and have 3 instances of the exact same microservice. Do I need 3 instances each of Redis and Cassandra, so that each instance of the microservice has its own instance of Redis and Cassandra? Or, since both Redis and Cassandra are cluster-based distributed databases, will I not need 3 instances of these databases, but instead share the same cluster among the 3 instances of the microservice?
What if MySQL is also being used by the microservice, will I need 3 instances of MySQL?
You can have a microservices architecture with a service scaled up to 3 instances that uses a single instance of a database (Redis/MySQL/...), or you can have a cluster of databases connected together (more than one instance of Redis/MySQL).
The idea is that you don't restrict a single replica/instance of the service to talking to a single replica/instance of the database; that's not the reason you scale up your architecture. In other words, you don't assign an instance of the database to an instance of the running service. It works this way:
Service load balancer (routes traffic to a single instance) --> cluster of service instances --> DB load balancer --> routes traffic to a master node in the DB cluster (or whatever setup you use, whether master-slave, master-master, or consistent hashing).
TL;DR: Don't assign a database replica per service replica; later on you might need to scale your services to 10 replicas while keeping a database cluster of only 3 nodes.
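To make that concrete, a small sketch: every service replica is handed the same database endpoint (the DB load balancer or cluster entry point), no matter how many replicas exist. The hostnames below are hypothetical:

    # All three replicas of the service share one DB entry point;
    # none of them gets a database of its own.
    DB_HOST=db-lb.example.internal   # the DB load balancer / cluster VIP
    for replica in app-1 app-2 app-3; do
        ssh "$replica" "mysql -h $DB_HOST -u app -p -e 'SELECT 1'"
    done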

Benchmarking a MySQL cluster using sysbench

When benchmarking a MySQL cluster using sysbench, do you have to install sysbench on every machine in the cluster to benchmark the cluster's performance? Is there a way to install sysbench on one machine and use it to benchmark other MySQL servers on different machines?
If, for example, I have HAProxy as the load balancer for the cluster, configured on its own machine separate from the cluster nodes, can I use only the HAProxy machine to benchmark the entire cluster, since the HAProxy machine will be doing the load balancing and acts as the window to all the other cluster nodes?
I am new to MySQL benchmarking and new to using sysbench.
Thanks.
Yes, you will need to install sysbench on every SQL node you intend to use and benchmark (not on the NDB data nodes).
HAProxy and ProxySQL are two different things, but you can get the best of both worlds if you really want to.
HAProxy (High Availability Proxy) is a fast and reliable open-source solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited to very high-traffic web sites (and powers quite a number of the world's most visited ones) and to HTTP load balancing, as it supports session persistence and layer-7 processing.
ProxySQL is an open-source MySQL proxy server, meaning it serves as an intermediary between a MySQL server and the applications that access its databases. ProxySQL can improve performance by distributing traffic among a pool of multiple database servers, and it can also improve availability by automatically failing over to a standby if one or more of the database servers fail.
To run the sysbench benchmarks, follow this guide: https://wiki.gentoo.org/wiki/Sysbench
To set up ProxySQL: https://www.digitalocean.com/community/tutorials/how-to-use-proxysql-as-a-load-balancer-for-mysql-on-ubuntu-16-04
To set up highly available HAProxy servers with keepalived and floating IPs: https://www.digitalocean.com/community/tutorials/how-to-set-up-highly-available-haproxy-servers-with-keepalived-and-floating-ips-on-ubuntu-14-04
To create a multi-node MySQL cluster: https://www.digitalocean.com/community/tutorials/how-to-create-a-multi-node-mysql-cluster-on-ubuntu-16-04
Set the storage engine for sysbench's test tables to NDB (ENGINE=NDBCLUSTER;) via the mysql client.
You will need to create the database and then run sysbench's prepare step before running the benchmark. Good luck!
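For reference, a minimal sketch of that flow, assuming sysbench 1.0's bundled oltp_read_write workload; the host, credentials, and table sizes are all hypothetical. Pointing --mysql-host at the HAProxy address directs the generated load through the balancer to the whole cluster from a single machine:

    # Create the target schema once, via the load balancer.
    mysql -h haproxy.example.internal -u sbtest -p -e 'CREATE DATABASE sbtest;'

    # Load the test tables (NDB engine, per the note above), then run a
    # 60-second read/write benchmark with 16 client threads.
    sysbench oltp_read_write \
        --mysql-host=haproxy.example.internal --mysql-user=sbtest \
        --mysql-password=secret --mysql-db=sbtest \
        --mysql-storage-engine=ndbcluster \
        --tables=10 --table-size=100000 prepare
    sysbench oltp_read_write \
        --mysql-host=haproxy.example.internal --mysql-user=sbtest \
        --mysql-password=secret --mysql-db=sbtest \
        --tables=10 --table-size=100000 --threads=16 --time=60 run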

MySQL Master-Master replication performance

I have the following situation:
I have to set up a high-performance server cluster with maximum availability using nginx and MySQL. The cluster consists of four web servers which are load balanced with nginx+gluster, which works just fine.
In addition, there's another server with 2 SSDs in RAID 1. On that server I intend to install 2 VMs, each with 12 GB of RAM, where I will set up the MySQL cluster with master-master replication.
But that only protects the system if the MySQL service breaks down on one of the VMs, not if the host system goes offline.
To counter that, I thought of adding 2 more nodes on other machines to the MySQL cluster as failover. Unfortunately, I don't have more machines with SSDs.
Now my question: would I have to expect performance issues because of the much slower hard drives in the failover machines? And if so, would these issues occur only when inserting data, or also when running pure SELECT queries?
Of course I'd set the load balancer to prioritize the faster nodes.
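One way to express that prioritization, purely as a sketch: assuming HAProxy fronts the MySQL nodes (an assumption; no balancer is named for the MySQL tier here, and the addresses are hypothetical), marking the HDD-backed nodes as backup keeps them out of rotation until an SSD node fails, which also limits the impact of their slower disks on normal operation:

    # Prefer the SSD-backed masters; HDD nodes serve only as failover
    # targets. Replace the addresses with your own.
    cat >> /etc/haproxy/haproxy.cfg <<'EOF'
    listen mysql
        bind *:3306
        mode tcp
        balance leastconn
        server ssd-master-1 10.0.0.11:3306 check weight 100
        server ssd-master-2 10.0.0.12:3306 check weight 100
        server hdd-node-1   10.0.0.21:3306 check backup
        server hdd-node-2   10.0.0.22:3306 check backup
    EOF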

Architecture of MySQL on EBS for scaling (Amazon Web Services)

I'm trying to understand how to architect an Amazon Web Services application.
I have an instance running off of EBS. As far as I understand, I need to mount the EBS drive so that I can store my MySQL database on it.
When I later want to scale up, how do I do so? I understand that I can add more server instances, but how will they access the database? From what I understand, the EBS volume can only be attached to one server instance.
I can't speak to this particular setup, as I don't have experience using EBS with a MySQL instance, but this type of scaling is typically accomplished by dedicating a particular instance as the master database server. Any time you spin up additional web servers, those still use the master DB's IP to connect. When the database becomes the bottleneck, you spin up a slave DB instance on one of the boxes (or on its own dedicated box). You can then configure replication either in a master-to-slave direction or as circular replication, so that you can write to the slave instance as well.
If you choose classic master-to-slave replication, then you will have to make sure your writes are performed only on the master DB instance.
You can set up something like Zeus or any other connection load balancer so that you only ever have to connect to a single database IP, which will then round-robin your read connections to your pool of servers. Otherwise you'd have to manage the connections yourself, which is definitely not trivial. Good luck.
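To make the replication step concrete, a rough sketch of the classic master-to-slave setup, assuming log-bin and distinct server-id values are already configured in each server's my.cnf; the hostnames, password, and binlog coordinates are hypothetical:

    # On the master: create a replication account and note the current
    # binlog file and position.
    mysql -h master.db.internal -u root -p -e "
        GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%' IDENTIFIED BY 'repl_pass';
        SHOW MASTER STATUS;"

    # On the slave: point it at the master using the coordinates shown
    # above, then start replicating.
    mysql -h slave.db.internal -u root -p -e "
        CHANGE MASTER TO
            MASTER_HOST='master.db.internal',
            MASTER_USER='repl',
            MASTER_PASSWORD='repl_pass',
            MASTER_LOG_FILE='mysql-bin.000001',
            MASTER_LOG_POS=4;
        START SLAVE;"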
Growing Amazon EBS Volume sizes
You can give MySQL clustering a try on your EBS-backed instances. I have a similar query, with more requirements, posted here.
EBS volume capacity can be scaled up using the snapshot -> launch-new-volume technique; alternatively, storage capacity can be scaled out using EBS striping (RAID 0).
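As a quick sketch of that snapshot technique with the AWS CLI (the IDs, size, and availability zone are hypothetical):

    # Snapshot the existing volume, then launch a larger volume from it.
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
        --description "mysql-data before resize"
    aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
        --size 200 --availability-zone us-east-1a
    # Then detach the old volume, attach the new one, and grow the
    # filesystem inside the instance (e.g. resize2fs for ext4).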
In AWS you cannot mount the same EBS volume on 2 EC2 instances simultaneously, so when scaling your application you need to scale your MySQL DB out or up, either through replication or clustering. AWS RDS is a very good option for MySQL; if your application is read-intensive, you can scale out using RDS read replicas as well. If you need write scaling, then functional partitioning or MySQL shards can be explored.
AWS has an entire product dedicated to this: RDS.
In all but the rarest and most specialized of circumstances, you're going to be better off using RDS than trying to create and tune your own EBS/EC2/MySQL infrastructure.
RDS also directly answers your question: it enables the creation of read-only databases to use as query slaves. RDS also handles backups, upgrades, and all sorts of failover infrastructure for you.
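For example, a read replica can be created with a single CLI call; the instance identifiers here are hypothetical:

    # Spin up a read-only replica of an existing RDS MySQL instance.
    aws rds create-db-instance-read-replica \
        --db-instance-identifier mydb-replica-1 \
        --source-db-instance-identifier mydb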
With EBS there's no way to attach a disk to multiple EC2 instances, so you're not going to be able to build a failover cluster with that approach. Instead, you're going to need replication or backup tools of some kind.