I have a Kubernetes environment running multiple applications (services). Now I'm a little confused about how to set up the MySQL database instance(s).
According to different sources, each microservice should have its own database. Should I create a single MySQL StatefulSet in HA mode running multiple databases, or should I deploy a separate MySQL instance for each application (service), running one database each?
My first thought would be the first option, since what else would HA otherwise be useful for? I would like to hear some different views on this.
Slightly subjective question, but here's what we have set up. Hopefully, that will help you build a case. I'm sure someone would have a different opinion, and that might be equally valid too:
We deploy about 70 microservices, each with its own database ("schema") and its own JDBC URL (defined via a service). Each microservice has its own endpoint and credentials that we do not share between microservices. So in effect, we have kept the design completely independent across the microservices as far as the schema is concerned.
Deployment-wise, however, we have opted to go with a single database instance for hosting all databases (or "schemas"). While technically we could deploy each database on its own database instance, we chose not to do it for a few main reasons:
Cost overhead: Running separate database instances for each microservice would add a lot of "fixed" costs. This may not be directly relevant to you if you are simply starting the database as a MySQL Docker container (we use a separate database service, such as RDS or Google Cloud SQL). But even in the case of MySQL as a Docker container, you might end up with a non-trivial cost if you run, for example, 70 separate containers, one per microservice.
Administration overhead: Given that databases are usually quite involved (disk space, IOPS, backup/archiving, purge, upgrades and other administration activities), having separate database instances -- or Docker container instances -- may take a significant toll on your admin or operations teams, especially if you have a large number of microservices.
Security: Databases are usually also critical when it comes to security, as the "truth" usually goes in the DB. Keeping encryption, TLS configuration and strength of credentials aside (as they should be of utmost importance regardless of your deployment model), security considerations, reviews, audits and logging will bring significant challenges if you have too many database instances.
Ease of development: Relatively less critical in the grand scheme of things, but significant nonetheless. Unless you are thinking of coming up with a different model for development (and thus breaking "dev-prod parity"), your developers may have a hard time figuring out the database endpoints for debugging, even if they only need that information once in a while.
So, my recommendation would be to go with a single database instance (Docker or otherwise), but keep the databases/schemas completely independent and inaccessible to any microservice but the "owner" microservice.
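For illustration, a minimal sketch of what that per-schema isolation can look like in plain MySQL; the schema name, user and password below are made up, and you would repeat this once per microservice:

    # one schema and one dedicated user per microservice (names are examples)
    mysql -u root -p <<'SQL'
    CREATE DATABASE orders_service;
    CREATE USER 'orders_svc'@'%' IDENTIFIED BY 'change-me';
    -- this user can only reach its own schema, nothing else
    GRANT ALL PRIVILEGES ON orders_service.* TO 'orders_svc'@'%';
    SQL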
If you are deploying MySQL as Docker container(s), go with a StatefulSet for persistence. Define an external PVC so that you can always preserve the data, no matter what happens to your pods or even your cluster. Of course, if you run 'active-active', you will need to ensure clustering between your nodes, but we run in 'active-passive' mode, so we keep the replica count at 1, given we only use the MySQL Docker container alternative in our test environments to save the cost of an external DBaaS service where it's not required.
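To make that concrete, here is a rough single-replica sketch, not our exact manifest; the names, image tag, storage size, the referenced Secret and the headless Service called mysql are all assumptions for illustration:

    # single-replica MySQL StatefulSet backed by an externally defined PVC
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: mysql-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: mysql
    spec:
      serviceName: mysql               # assumes a headless Service named mysql
      replicas: 1                      # active-passive style: a single pod
      selector:
        matchLabels:
          app: mysql
      template:
        metadata:
          labels:
            app: mysql
        spec:
          containers:
          - name: mysql
            image: mysql:8.0
            env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:          # assumes a Secret named mysql-root exists
                  name: mysql-root
                  key: password
            ports:
            - containerPort: 3306
            volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
          volumes:
          - name: data
            persistentVolumeClaim:
              claimName: mysql-data    # the external PVC outlives the pod
    EOF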
I have three different application environments: production, demo, and dev. In each, I have an RDS instance running MySQL. I have five tables that house data that needs to be the same across all environments. I am trying to find a way to handle this.
For security purposes, it's not best to allow demo and dev to access the production database, so putting the data there seems to be a bad idea.
All environments need read/write capabilities. Is there a good solution to this?
Many thanks.
For security purposes, it's not best to allow demo and dev to access the production database, so putting the data there seems to be a bad idea.
Agreed. Do not have your demo/dev environments access data from your production environments.
I don't know your business logic, but I cannot think of a case where dev/demo data needs to be "in sync" with production data, unless the dev/demo environment is also dependent on other "production assets". If that were the case, I would suggest duplicating that data into your other environments.
Usually, the data in your database would be dependent on the environment it's contained within.
For best security and separation of concerns, keep your environments segregated as much as possible. This includes (but is not limited to):
database data,
customer data,
images and other files
If data needs to be synchronized, create a script/program to perform that synchronization completely (DB + all necessary assets). But do that as part of your normal development pipeline so it goes through dev, testing, QA, etc.
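As a hedged sketch of what such a script could look like for the five shared tables (hosts, credentials, database and table names below are placeholders; point the source at whichever environment you treat as the system of record):

    # copy the shared reference tables from the source env into demo and dev
    TABLES="countries currencies plans roles feature_flags"
    mysqldump -h source-db.internal -u sync_user -p"$SYNC_PW" \
      --single-transaction appdb $TABLES > shared_tables.sql
    for TARGET in demo-db.internal dev-db.internal; do
      mysql -h "$TARGET" -u sync_user -p"$SYNC_PW" appdb < shared_tables.sql
    done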
So the thing about RDS and database-level access is that you would still manage the user credentials like you would on premises. From an AWS perspective, all you would need to do to allow access is update the security groups of your MySQL RDS instances to allow the traffic, then give your application the credentials you have provisioned for it. I do agree it is bad practice to give production-level access to your dev or demo environments.
As far as the data being the same, you can automate a nightly snapshot of the production database and recreate new instances based on it. If your infrastructure is in CloudFormation or Terraform, you can feed in the endpoint of the instance restored from the snapshot and spin up a new DEV or DEMO environment.
Amazon RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases. You can create a DB instance by restoring from this DB snapshot. When you restore the DB instance, you provide the name of the DB snapshot to restore from, and then provide a name for the new DB instance that is created from the restore. You cannot restore from a DB snapshot to an existing DB instance; a new DB instance is created when you restore.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateSnapshot.html
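A rough sketch of that flow with the AWS CLI (instance and snapshot identifiers are placeholders):

    # snapshot prod, wait for it, then restore it as a fresh dev instance
    aws rds create-db-snapshot \
      --db-instance-identifier prod-mysql \
      --db-snapshot-identifier prod-mysql-$(date +%F)
    aws rds wait db-snapshot-available \
      --db-snapshot-identifier prod-mysql-$(date +%F)
    aws rds restore-db-instance-from-db-snapshot \
      --db-instance-identifier dev-mysql-$(date +%F) \
      --db-snapshot-identifier prod-mysql-$(date +%F)
    # the new endpoint can then be passed to CloudFormation/Terraform as a parameter
    aws rds describe-db-instances \
      --db-instance-identifier dev-mysql-$(date +%F) \
      --query 'DBInstances[0].Endpoint.Address' --output text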
I would recommend using a fan-out system at the point of data capture, along with a snapshot.
Take a point-in-time snapshot (i.e. now), spin up test/dev databases from it, and then use an SQS -> SNS -> SQS fan-out architecture to push any new changes to the data out to your other databases.
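If you went that route, the SNS-to-SQS half of the fan-out could be wired up roughly like this (topic and queue names are placeholders, and the SQS queue policies that allow SNS to deliver messages are omitted for brevity):

    # one topic, one queue per environment, each queue subscribed to the topic
    TOPIC_ARN=$(aws sns create-topic --name data-changes --query TopicArn --output text)
    for ENV in prod demo dev; do
      QUEUE_URL=$(aws sqs create-queue --queue-name "data-changes-$ENV" --query QueueUrl --output text)
      QUEUE_ARN=$(aws sqs get-queue-attributes --queue-url "$QUEUE_URL" \
        --attribute-names QueueArn --query Attributes.QueueArn --output text)
      aws sns subscribe --topic-arn "$TOPIC_ARN" --protocol sqs --notification-endpoint "$QUEUE_ARN"
    done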
I'm setting up my first production server on Docker, but I'm not sure where my MySQL database should live. Should the database live outside the container or within? I've read some articles/posts previously saying it should live outside so nothing changes if you have to fire up a new container or image, but I'm not sure if this is correct or not. Are there any speed/performance differences with having it inside or outside of the container?
These are some of the responsibilities of our Database Administrators:
Establish and maintain sound backup and recovery policies and procedures
Implement and maintain database security (create and maintain users and roles, assign privileges)
Perform database tuning and performance monitoring
Perform application tuning and performance monitoring
Setup and maintain documentation and standards
Plan growth and changes (capacity planning)
If I need any of these services I use a database outside of the container and hosted by specialists.
If the data needs to be accessed by other applications I use a database on a centralized database server outside of the container and hosted by specialists.
On performance: Docker containers use a virtual network interface by default, see Docker Advanced networking documentation. This comes with just a slight speed overhead. Still, depending on your expected load, you might want to either bind your DB container to the host network or not dockerize your DB at all.
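For example, a minimal sketch of running the DB container directly on the host network to skip the bridge/NAT hop (image tag and password are placeholders):

    docker run -d --name mysql --network host \
      -e MYSQL_ROOT_PASSWORD=change-me \
      mysql:8.0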
On data persistence: If you are using volumes or volume containers your data lives outside the container and can be mounted by any new container too. No worries here.
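A minimal sketch, assuming a named volume (the volume name and image tag are arbitrary): the data stays in the volume, so a replacement container simply mounts it again.

    docker volume create mysql-data
    docker run -d --name mysql \
      -e MYSQL_ROOT_PASSWORD=change-me \
      -v mysql-data:/var/lib/mysql \
      mysql:8.0
    # if this container is removed, a new one started with the same -v flag sees the same data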
On whether to use containers for DBs (my opinion): It is currently in vogue to containerize stateless and interchangeable applications, meaning that you can simply throw away outdated services and replace them with new containers. While this really makes sense for frequently updated microservices… do you really need this for a comparatively long-lived service like databases? Yes, Docker still helps to contain dependencies and ship stuff faster, but there are alternatives like Ansible-provisioned VMs. In the end it depends on what is easiest for your use case.
I am working with Docker containers on a Linux machine. I have to create a database Docker container, and I have chosen MySQL. I have three requirements:
Load balancing - the database contains a huge table with approx. 100 million records, so we need to shard the table across multiple servers. To cater for this I have chosen MySQL Cluster. I need to distribute the data based on the shard key. The load balancing will be done by HAProxy.
Question: please correct me if I am wrong, or provide a better solution.
Persistence - even if all the database containers die, it should be possible to recover. For this I have planned to create a data-only Docker container.
Question: if the data-only Docker container dies, will it be able to recover? Is there any change to the volume when it comes back up?
Availability - since there will be multiple MySQL servers with replication, even if one server dies another server will become primary.
Question: please correct me if I am wrong, or provide a better solution.
Once upon a time, I remember when a database table with one million records was considered "big data"...
Before assuming you need to split your dataset across multiple machines I would highly suggest that you first get comfortable with running a single database within a Docker container. Given enough resources MySQL is quite capable of scaling up to 100 million records.
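As a starting point, a single well-resourced container is often enough at that scale; a sketch, with sizes and the image tag purely illustrative:

    docker run -d --name mysql \
      -e MYSQL_ROOT_PASSWORD=change-me \
      -v mysql-data:/var/lib/mysql \
      --memory 16g \
      mysql:8.0 --innodb-buffer-pool-size=12G
    # arguments after the image name are passed through to mysqld by the official image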
Docker is designed to isolate processes from others running on the same host. This creates challenges for monolithic applications, which frequently have a software architecture involving multiple processes communicating with each other over some form of host-based IPC (inter-process communication). That does not mean they cannot be containerized, but a large multiprocess container looks and operates a lot like a virtual machine, implying that perhaps Docker is a less optimal technological fit.
Before I get too negative, it's completely possible to run clustered MySQL using Docker. A couple of examples returned by Google:
http://galeracluster.com/2015/05/getting-started-galera-with-docker-part-1/
http://severalnines.com/blog/how-deploy-galera-cluster-mysql-using-docker-containers
My warning is that you see fewer examples of running these clusters across multiple Docker hosts, implying the use cases are mostly for demo or test currently.
I am working on a project that has multiple instances of MySQL running on different ports (3306, 3307, 3308). The variation in ports was the reason usernames and passwords were being rejected. However, I'm not sure why a system administrator would choose to do this. Can someone help clarify why you would run multiple instances of MySQL, which can potentially lead to confusion about usernames and privileges on the differing instances?
Utilize existing hardware properly -
Currently, in a standard setup, MySQL queries run in a single thread (http://lists.mysql.com/internals/37589), so having multiple instances gives you the opportunity to make better use of your hardware, particularly CPU cores. If your application uses a number of databases that involve a lot of connections, then splitting the different databases over different ports allows you to utilise your hardware more efficiently. Also, regarding replication, multiple instances can be used to support slaves: "Scale-out solutions - spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves." http://dev.mysql.com/doc/refman/5.0/en/replication.html
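As an illustration of how such a setup is typically wired on a single host, here is a hedged mysqld_multi sketch for instances on 3306/3307/3308 (paths are placeholders); note that each instance has its own datadir and therefore its own users and privileges, which is exactly the confusion you ran into:

    # append per-instance groups to the server config, then start them all
    cat >> /etc/my.cnf <<'EOF'
    [mysqld_multi]
    mysqld     = /usr/bin/mysqld_safe
    mysqladmin = /usr/bin/mysqladmin

    [mysqld1]
    port    = 3306
    datadir = /var/lib/mysql1
    socket  = /var/lib/mysql1/mysql.sock

    [mysqld2]
    port    = 3307
    datadir = /var/lib/mysql2
    socket  = /var/lib/mysql2/mysql.sock

    [mysqld3]
    port    = 3308
    datadir = /var/lib/mysql3
    socket  = /var/lib/mysql3/mysql.sock
    EOF
    mysqld_multi start 1-3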
Run multiple versions -
Having multiple instances also allows you to have various versions available to a developer for testing and integration purposes. "In some cases, you might want to run multiple instances of MySQL on a single machine. You might want to test a new MySQL release while leaving an existing production setup undisturbed. Or you might want to give different users access to different mysqld servers that they manage themselves. (For example, you might be an Internet Service Provider that wants to provide independent MySQL installations for different customers.)" http://dev.mysql.com/doc/refman/5.5/en/multiple-servers.html
Reduce licensing, hardware and OS costs, shrink the data centre footprint, and generally reduce overhead - If you are concerned about licences for the hardware or the OS, then the ability to run multiple instances of an application on a single machine will appeal, as obviously you would not require additional machines and operating systems to run more versions. This also reduces the support and maintenance costs of separate machines.
Here is an excellent article on the implementation of said approach; its main points, I confess, gave this answer its structure: http://opensourcedbms.com/dbms/running-multiple-mysql-5-6-instances-on-one-server-in-centos-6rhel-6fedora/
Could be Development, Test and Production instances.
(although I would probably have just one, with development, test and production databases).
I have been doing some research into servers for a website I want to launch. I am considering a server configuration with RAID 10, backed up to a NAS that also has a RAID 10 configuration. This should keep data safe in 99.99%+ of cases.
My problem appeared when I thought about the need for a second server. If I ever require more processing power, and thus more storage for users, how can I connect a second server to my primary one and make them act as one as far as the database (MySQL) is concerned?
I mean, I don't want to replicate my first DB on the second server and load-balance the requests - I want to use just one DB (maybe external) and let both servers use it at the same time. Is this possible? And is backing up MySQL data to a NAS a viable option?
The most common configuration (once scaling up from a single box) is to put the database on its own server. In many web applications, the database is the bottleneck (rather than the web server); so the first hardware scale-up step tends to be to put the DB on its own server.
This also allows you to put additional security between the database and web server - firewalls are common; different user accounts etc. are pretty much standard.
You can then add web servers to the load balancer, all talking to the same database, as long as your database can keep up.
Having more than one web server also helps with resilience - you can have a catastrophic hardware event on one webserver and the load balancer will direct the traffic to the remaining machines.
Scaling the database server performance is a whole different story - though typically you use very beefy machines for the database, and relative lightweights for the web servers.
To add resilience to the database layer, you can introduce clustering - this is a fairly complex thing to keep running, but protects you against catastrophic failure of a single machine.
Yes, you can back up MySQL to a NAS.
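A minimal sketch of that, assuming the NAS is exposed over NFS (hostnames, paths and credentials are placeholders):

    mount -t nfs nas.local:/backups /mnt/nas-backups   # or a permanent fstab entry
    mysqldump -u backup_user -p"$BACKUP_PW" --all-databases --single-transaction \
      | gzip > /mnt/nas-backups/mysql-$(date +%F).sql.gz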