MySQL NDB Cluster nodes with randomly generated hostname/IP

For the past two years, I've been testing the performance of our software on Google Compute Engine (GCE), using up to 1,000 vCPUs in total across the worker VMs and the VMs for our MySQL NDB Cluster.
I have automated the creation of the worker VMs (which run our software) with templates and instance groups, but for the MySQL NDB Cluster I've always had two fixed machines that I occasionally resized manually (i.e. changed the vCPUs and RAM).
I am now trying to bring up a variable number of VMs for the NDB Cluster with different amounts of vCPUs and RAM, and I'd like to automate this process in my test suite. The main problem is that NDB expects fixed HostNames in its config. With GCE templates and instance groups, I can only bring up dynamically named instances like ndb-1ffrd, ndb-i8fhh, ....
To exemplify, here is my current ndb config (one of many) that involves fixed VMs ndb1 and ndb2:
[ndbd]
HostName=ndb1
NodeId=1
NodeGroup=0
[ndbd]
HostName=ndb2
NodeId=2
NodeGroup=0
[ndbd]
HostName=ndb1
NodeId=3
NodeGroup=1
[ndbd]
HostName=ndb2
NodeId=4
NodeGroup=1
I'd like to convert the fixed VMs ndb1/ndb2 into GCE instance groups where I can bring up an arbitrary number of such instances (typically 2 or 4, though) for a test and then destroy the VMs afterwards, all on demand and automated as part of my tests. The reasoning is to have repeatable tests with differently configured VMs: changing many parameters manually over several tests makes it a nightmare to figure out what the exact configuration was 10 tests ago, whereas this way each test would refer to a specific instance template for the NDB VMs.
However, GCE instance group members have a random suffix in their name and the NDB config expects fixed HostNames or IPs. So I'd need to either:
Have GCE generate instances from instance groups named in a deterministic way (e.g. ndb1, ndb2, ndb3, ...), so that I can rely on those names in my ndb configs, or
Somehow allow arbitrary hosts (or hosts with arbitrary suffixes) to connect as ndb nodes but still make sure that the same host isn't added to the same NodeGroup more than once -- something that is manually ensured in the above sample config.
Is there any way to achieve what I'm trying to do?

I think this can be achieved with scripts using the gcloud SDK; gcloud lets you launch resources in GCP, and there are many options/flags that can help you set up the configuration you need.
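For example, here is a minimal sketch of that approach. The template name ndb-template, the zone, and the disk/instance names are placeholders; it assumes you have already created an instance template holding the vCPU/RAM shape under test:

#!/usr/bin/env bash
set -euo pipefail

TEMPLATE=ndb-template    # placeholder: template with the machine shape for this test
ZONE=us-central1-a       # placeholder zone
COUNT=${1:-2}            # number of data-node VMs for this run

# Deterministic names (ndb1, ndb2, ...) sidestep the random suffixes
# that managed instance groups append.
NAMES=$(seq -f "ndb%g" 1 "$COUNT")

gcloud compute instances create $NAMES \
  --source-instance-template="$TEMPLATE" \
  --zone="$ZONE"

# Generate matching [ndbd] sections, mirroring the sample config above:
# each host appears once per node group.
NODEID=1
for GROUP in 0 1; do
  for NAME in $NAMES; do
    printf '[ndbd]\nHostName=%s\nNodeId=%d\nNodeGroup=%d\n' \
      "$NAME" "$NODEID" "$GROUP"
    NODEID=$((NODEID + 1))
  done
done > ndbd-sections.ini

# ... run the test, then tear everything down:
# gcloud compute instances delete $NAMES --zone="$ZONE" --quiet

This gives you option 1 (deterministic names) without a managed instance group, while the per-test instance template still records the exact machine shape, keeping runs reproducible. As for option 2: NDB also accepts [ndbd] slots with no HostName at all, which lets any host claim a free slot, but nothing then prevents two processes from the same host from landing in the same node group, so generating the config from known names, as above, is safer.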
Hope this helps.

Related

Can I run user pods on OpenShift master or infra nodes

New to OpenShift, so I am a bit confused about whether we can run user pods on master or infra nodes. We have 2 worker nodes, plus one master node and one infra node, making 4 nodes in total. The reason for the change is to share the load across all 4 nodes rather than just the 2 compute nodes.
From reading some documents it seems possible to assign 2 roles to one node, but is there any security risk, or is it simply not best practice?
Running on openshift version v3.11.0+d699176-406
if we can run user pods on master or infra nodes
Yes, you absolutely can. The easiest way is to configure that at installation time; refer to https://docs.openshift.com/container-platform/3.11/install/example_inventories.html#multi-masters-using-native-ha-ai for an example.
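If the cluster is already installed, a rough post-install sketch using 3.11's oc tooling (the hostnames are placeholders; check the labels against your cluster's defaultNodeSelector):

# Allow regular workloads to be scheduled on the master and infra nodes:
oc adm manage-node master1.example.com infra1.example.com --schedulable=true

# Label them so they also match a default node selector of
# node-role.kubernetes.io/compute=true, if your cluster uses one:
oc label node master1.example.com node-role.kubernetes.io/compute=true
oc label node infra1.example.com node-role.kubernetes.io/compute=true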
is there any security risk or is it not best practice
Running a single master node or a single infra node is already a risk to the high availability of your cluster. If the master fails, your cluster is basically headless; if the infra node fails, you lose your internal registry and routers, and with them external access and the ability to create new images for your imagestreams. This also applies to host OS upgrades: you will have to reboot the master and infra nodes some day. Are you okay with guaranteed downtime during patching? What if something goes wrong during the update?
Regarding running user workloads on master and infra nodes: if you are not running privileged SCCs (which can allow running privileged pods, using arbitrary UIDs on the system, etc.), you are somewhat safe from container breakout, assuming there are no known bugs in the container engine you are using. However, you should pay close attention to resource consumption and avoid running any workloads without CPU and memory limits, because overloading the master node may result in cluster degradation. You should also monitor disk usage, since running user pods loads more images into your Docker storage.
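For example, a quick way to attach explicit limits to an existing deployment config (the dc name and the values are placeholders):

# Set requests/limits so a runaway pod cannot starve the node:
oc set resources dc/myapp \
  --requests=cpu=100m,memory=256Mi \
  --limits=cpu=500m,memory=512Mi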
So basically it boils down to this:
It is better to have multiple masters (ideally 3) and a couple of infra nodes than a single point of failure for both of these roles. Having separate master and worker nodes is of course better than hosting them together, but a multi-master setup, even with user workloads, should be more resilient if you watch resource usage carefully.

Trying to create two MySQL pods in Kubernetes with the same volume for high availability

I am trying to deploy two MySQL pods with the same PVC, but I get a CrashLoopBackOff state when I create the second pod, with this error in the logs: "innoDB check that you do not already have another mysqld process using the same innodb log files". How do I resolve this error?
There are different options for solving high availability. If you are running Kubernetes on an infrastructure that can provision the volume to different nodes (e.g. in the cloud) and your pod/node crashes, Kubernetes will restart the database on a different node with the same volume. Aside from a short downtime, you will have the database back up and running relatively quickly.
The volume is mounted to a single running MySQL pod to prevent data corruption from concurrent access. (This is what MySQL notices in your scenario as well, since it is not designed for shared storage as an HA solution.)
If you need more than that, you can use MySQL's built-in replication to create a MySQL 'cluster' that keeps working even if one node/pod fails. Each instance of the MySQL cluster has its own individual volume in that case. Look at the Kubernetes StatefulSet example for this scenario: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
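For reference, the linked tutorial boils down to a few kubectl commands (the manifest URLs are the tutorial's own, correct at the time of writing):

kubectl apply -f https://k8s.io/examples/application/mysql/mysql-configmap.yaml
kubectl apply -f https://k8s.io/examples/application/mysql/mysql-services.yaml
kubectl apply -f https://k8s.io/examples/application/mysql/mysql-statefulset.yaml

# Each replica gets its own PersistentVolumeClaim (data-mysql-0, data-mysql-1, ...),
# which is exactly what avoids the shared-InnoDB-files crash you are seeing:
kubectl get pvc -l app=mysql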

Compute Engine Instance

I have created a Google Compute Engine instance with CentOS and added some software there, such as Apache, Webmin, ActiveCollab, Gitolite, etc.
The problem is that the VM is always running out of memory because the RAM is too low.
How do I change the assigned RAM in Google Compute Engine?
Do I have to copy the VM to another one with more RAM? If so, will it copy all the contents of my CentOS installation?
Can anyone give me some advice on how to get more RAM without having to reinstall everything?
Thanks
The recommended approach for manually managed instances is to boot from a Persistent root Disk. When your instance has been booted from Persistent Disk, you can delete the instance and immediately create a new instance from the same disk with a larger machine type. This is similar to shutting down a physical machine, installing faster processors and more RAM, and starting it back up again. This doesn't work with scratch disks because they come and go with the instance.
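Sketched with today's gcloud CLI, the delete-and-recreate flow looks roughly like this (the instance name, disk name, zone, and machine type are placeholders; the boot disk here is assumed to share the instance's name):

# Delete the instance but keep its boot Persistent Disk:
gcloud compute instances delete my-instance --zone=us-central1-a \
  --keep-disks=boot --quiet

# Recreate it from the same disk with a larger machine type:
gcloud compute instances create my-instance --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --disk=name=my-instance,boot=yes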
Using Persistent Disks also enables snapshots, which allow you to take a point-in-time snapshot of the exact state of the disk and create new disks from it. You can use them as backups. Snapshots are also global resources, so you can use them to create Persistent Disks in any zone. This makes it easy to migrate your instance between zones (to prepare for a maintenance window in your current zone, for example).
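A sketch of that snapshot route (disk and snapshot names are placeholders):

# Snapshot the disk, then materialize it in another zone:
gcloud compute disks snapshot my-disk --zone=us-central1-a \
  --snapshot-names=my-disk-snap
gcloud compute disks create my-disk-copy --zone=europe-west1-b \
  --source-snapshot=my-disk-snap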
Never store state on scratch disks. If the instance stops for any reason, you've lost that data. For manually configured instances, boot them from a Persistent Disk. For application data, store it on Persistent Disk, or consider using a managed service for state, like Google Cloud SQL or Google Cloud Datastore.

Configuring Web Apps for Distributed Database

I have read MongoDB's replication docs and MySQL's Cluster page, but I cannot figure out how to configure my web apps to connect to the database.
My apps will have connection information (database host, username, password, etc.). However, even with multi-server functionality, do I need a big master with a fixed IP that distributes the load to the servers? And how can I prevent a single point of failure? Are there any common approaches to this problem?
Features such as MongoDB's replica sets are designed to enable automatic failover and recovery. These will help avoid single points of failure at the database level if properly configured. You don't need a separate "big master" to distribute the load; that is the gist of what replica sets provide. Your application connects using a database driver and generally does not need to be aware of the status of individual replicas. For critical writes in MongoDB you can request that the driver does a "safe" commit which requires data to be confirmed written to a minimum number of replicas.
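For example, connecting through a replica-set URI is all most drivers need; they discover the members and fail over on their own (the hosts, database, and set name below are placeholders):

# The same connection string works for the mongo shell and for most drivers;
# w=majority asks for writes to be confirmed by a majority of replicas.
mongo "mongodb://db1.example.com,db2.example.com,db3.example.com/app?replicaSet=rs0&w=majority"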
To be comprehensively insulated from server failures, you still have to consider other factors such as physical failure of disks, machines, or networking equipment and provision with appropriate redundancy. For example, your replica sets should be distributed across more than one server or instance. If all of those instances are in the same physical colocation facility, your single point of failure could still be the hopefully unlikely (but possible) case where the colocation facility loses power or network.

Architecture of MySQL on EBS for scaling (Amazon Web Services)

I'm trying to understand how to architect an Amazon Web Services application.
I have an instance running off of EBS. As far as I understand, I need to mount the EBS drive so that I can store my MySQL database on it.
When I later want to scale up, how do I do so? I understand that I can add more server instances, but how will they be accessing the database? Since from what I understand, the EBS volume can only be attached to one server instance.
I can't speak to this particular setup, as I don't have experience using EBS with a MySQL instance, but this type of scaling is typically accomplished by dedicating a particular instance as the master database server. Any time you spin up additional web servers, they still use the master DB's IP to connect. Once your database becomes the bottleneck, you spin up a slave DB instance on one of the boxes (or on its own dedicated box). You can then configure replication either in a master-to-slave direction or as circular replication, so that you can write to the slave instance as well.
If you choose classic master-to-slave replication, you will have to make sure your writes are performed only on the master DB instance.
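For example, pointing a fresh slave at the master comes down to a couple of statements run on the slave (the host, credentials, and binlog coordinates are placeholders):

mysql -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST='master-db.internal',
  MASTER_USER='repl',
  MASTER_PASSWORD='secret',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;
START SLAVE;
SQL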
You can set up something like Zeus or any other connection load balancer so that you only ever have to connect to a single database IP, which then round-robins your read connections across your pool of servers. Otherwise you'd have to manage the connections yourself, which is definitely not trivial. Good luck.
Growing Amazon EBS Volume sizes
You can give MySQL clustering a try on your EBS-backed instances. I have a similar query, with more requirements, posted here.
EBS volume capacity can be scaled up using the snapshot-then-launch-new-volume technique; alternatively, storage capacity can be scaled out using EBS striping (RAID 0).
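Roughly, with the AWS CLI (volume/snapshot IDs, size, and availability zone are placeholders):

# Snapshot the volume, then restore it at a larger size:
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0
aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
  --size 200 --availability-zone us-east-1a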
In AWS you cannot mount the same EBS volume to 2 EC2 instances simultaneously, so when scaling your application you need to scale your MySQL DB up or out through either replication or clustering. AWS RDS is a very good option for MySQL; if your application is read-intensive, you can scale out with RDS read replicas as well. If you need write scaling, then functional partitioning or MySQL shards can be explored.
AWS has an entire product dedicated to this: RDS.
In all but the rarest and most specialized of circumstances you're going to be better off using RDS than trying to create and tune your own EBS/EC2/MySQL infrastructure.
RDS also directly answers your question: it enables the creation of read-only databases to use as query slaves. RDS also handles backups, upgrades, and all sorts of failover infrastructure for you.
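For instance, spinning up a read replica is a single AWS CLI call (the identifiers are placeholders):

aws rds create-db-instance-read-replica \
  --db-instance-identifier mydb-replica-1 \
  --source-db-instance-identifier mydb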
With EBS there's no way to attach a disk to multiple EC2 instances, so you're not going to be able to build a failover cluster with that approach. Instead you're going to need replication or backup tools of some kind.