So I am using Mantl.io for our environment. Things are going very well; we are now past the POC phase and starting to think about how we are going to handle continuous delivery. Obviously automation is key. Maybe my approach or thinking is wrong, but I am trying to figure out a way to manage the JSON I will pass to Marathon to deploy the Docker containers from our registry via a Jenkins job. We have various environments (testing, perf, prod, etc.), and in each of these environments my 30+ microservices need different values set for CPU, memory, environment variables, etc.
So I am just not sure of the best approach for taking my Docker containers and linking them with what could be ten or more different configurations per microservice, depending on the environment.
Are there tools for building, managing, and versioning the links between containers, configs, and environments? I just can't seem to find anything in this realm, and that leads me to believe I am headed down the wrong path.
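In the absence of a dedicated tool, one common pattern is to keep a base Marathon app definition per microservice in version control plus a small override file per environment, and have the Jenkins job merge them before POSTing to Marathon. A minimal sketch of that merge (the service name, image, and override values below are invented for illustration; the field names follow the Marathon app JSON schema):

```python
import copy

def merge(base, override):
    """Recursively merge an environment override dict into a base Marathon app definition."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result

# Base definition shared by all environments.
base_app = {
    "id": "/orders-service",
    "container": {
        "type": "DOCKER",
        "docker": {"image": "registry.example.com/orders-service:1.0.0"},
    },
    "cpus": 0.5,
    "mem": 256,
    "env": {"LOG_LEVEL": "INFO"},
}

# Per-environment override, e.g. loaded from overrides/prod.json in the same repo.
prod_override = {
    "cpus": 2.0,
    "mem": 1024,
    "env": {"LOG_LEVEL": "WARN", "DB_HOST": "prod-db.internal"},
}

prod_app = merge(base_app, prod_override)
```

The Jenkins job would then `json.dumps(prod_app)` and POST it to Marathon's `/v2/apps` endpoint, so every environment's settings stay as small, reviewable diffs against the shared base.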
Thanks
Background:
As a backoffice service for our insurance mathematicians, a daily cronjob runs a pod.
Inside the pod, fairly complex future simulations take place.
The pod has two containers, an application server and a db server.
The process has a few variables which are fed into the pod.
This is done via ConfigMaps and container env variables.
When the pod is done after approx. 10 hours, it copies the resulting database to another database,
and then it's finished. It runs daily because market data changes daily, and we also check our new codebase daily.
Great value, high degree of standardisation, fully automated.
So far so good.
But it uses the same configuration every time it runs.
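The current daily setup can be pictured as a static ConfigMap wired into the pod's containers (the names and keys below are invented for illustration, not the real ones):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: simulation-config
data:
  SIMULATION_HORIZON_YEARS: "40"
  SCENARIO_COUNT: "10000"
---
# In the CronJob's pod template, the same values become container env variables:
# spec.template.spec.containers[0]:
#   envFrom:
#     - configMapRef:
#         name: simulation-config
```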
Now what?
Our mathematicians would like to be able to start the pod themselves, feeding their own configuration data into it,
for example via a webpage with configurable input data.
Question:
Is there an existing Kubernetes framework implementing this?
"Provide a webpage with configurable input fields which are transformed into configmaps and env variables starting the pod"?
Sure, it's not too difficult to write.
But we do cloud-native computing partly because we want to reuse solutions to general problems rather than writing them ourselves where possible.
Thanks for any hints in advance.
You can start a Kubernetes Job for one-time tasks. Apart from the Google Cloud Console UI, I'm not aware of a UI where you can configure fields for a ConfigMap. Maybe you can write a custom Python script that launches these jobs.
https://kubernetes.io/docs/concepts/workloads/controllers/job/
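A minimal sketch of that custom-script approach: take the user-supplied parameters, render them into a ConfigMap plus a Job that consumes it, and submit both manifests. The image name and keys are placeholders; submission could go through `kubectl apply -f -` or the official Kubernetes Python client.

```python
import json

def build_manifests(run_name, params):
    """Render user-supplied parameters into a ConfigMap and a Job that consumes it.
    Image name and parameter keys are hypothetical placeholders."""
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{run_name}-config"},
        "data": {k: str(v) for k, v in params.items()},
    }
    job = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": run_name},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "simulation",
                        "image": "registry.example.com/simulation:latest",
                        # Every key of the ConfigMap becomes an env variable.
                        "envFrom": [{"configMapRef": {"name": f"{run_name}-config"}}],
                    }],
                }
            }
        },
    }
    return configmap, job

cm, job = build_manifests("sim-2024-05-01", {"SCENARIO_COUNT": 10000})
# A webpage handler would then pipe json.dumps(cm) and json.dumps(job)
# to `kubectl apply -f -`, or create them via the Kubernetes Python client.
```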
I have a Kubernetes environment running multiple applications (services). Now I'm a little bit confused about how to set up the MySQL database instance(s).
According to various sources, each microservice should have its own database. Should I create a single MySQL StatefulSet in HA mode running multiple databases, or should I deploy a separate MySQL instance for each application (service), each running one database?
My first thought would be the first option; what would HA otherwise be useful for? I would like to hear some different views on this.
Slightly subjective question, but here's what we have set up. Hopefully that will help you build a case. I'm sure someone would have a different opinion, and that might be equally valid too:
We deploy about 70 microservices, each with its own database ("schema") and its own JDBC URL (defined via a service). Each microservice has its own endpoint and credentials that we do not share between microservices. So in effect, we have kept the design completely independent across the microservices as far as the schema is concerned.
Deployment-wise, however, we have opted to go with a single database instance hosting all databases (or "schemas"). While technically we could deploy each database on its own database instance, we chose not to, for a few main reasons:
Cost overhead: Running separate database instances for each microservice would add a lot of "fixed" costs. This may not be directly relevant to you if you are simply starting the database as a MySQL Docker container (we use a separate database service, such as RDS or Google Cloud SQL). But even in the case of MySQL as a Docker container, you might end up with non-trivial costs if you run, for example, 70 separate containers, one per microservice.
Administration overhead: Given that databases are usually quite involved (disk space, IOPS, backup/archiving, purging, upgrades and other administration activities), having separate database instances -- or Docker container instances -- may put a significant toll on your admin or operations teams, especially if you have a large number of microservices.
Security: Databases are usually also critical when it comes to security, as the "truth" usually lives in the DB. Keeping encryption, TLS configuration and credential strength aside (as they should be of utmost importance regardless of your deployment model), security considerations, reviews, audits and logging will bring significant challenges if you have too many database instances.
Ease of development: Relatively less critical in the grand scheme of things, but significant nonetheless. Unless you are thinking of coming up with a different model for development (and thus breaking "dev-prod parity"), your developers may have a hard time figuring out the database endpoints for debugging, even if they only need that information once in a while.
So my recommendation would be to go with a single database instance (Docker or otherwise), but keep the databases/schemas completely independent and inaccessible to any microservice but the "owner" microservice.
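As an illustration of that schema-per-service isolation on a single instance (this is a sketch, not our actual tooling; service names are examples and real passwords should come from a secret store), provisioning could be scripted along these lines:

```python
def provisioning_sql(services):
    """Emit CREATE/GRANT statements so each microservice gets its own schema
    and a user that can only touch that schema. The password placeholder
    stands in for a value fetched from a secret store."""
    statements = []
    for name in services:
        statements.append(f"CREATE DATABASE IF NOT EXISTS {name};")
        statements.append(
            f"CREATE USER IF NOT EXISTS '{name}'@'%' IDENTIFIED BY '<from-secret-store>';"
        )
        # Grant is scoped to the owner schema only: no cross-service access.
        statements.append(f"GRANT ALL PRIVILEGES ON {name}.* TO '{name}'@'%';")
    return statements

stmts = provisioning_sql(["orders", "billing"])
```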
If you are deploying MySQL as Docker container(s), go with a StatefulSet for persistence. Define an external PVC so that you can always preserve the data, no matter what happens to your pods or even your cluster. Of course, if you run 'active-active', you will need to set up clustering between your nodes; we run in 'active-passive' mode and keep the replica count at 1, since we only use the MySQL-in-Docker alternative in our test environments, to save the cost of an external DBaaS service where it's not required.
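An abbreviated sketch of that shape (names, image tag, and storage size are placeholders; credentials and resource limits are omitted for brevity):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 1            # active-passive: a single replica
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          # env (MYSQL_ROOT_PASSWORD etc. from a Secret) omitted for brevity
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:   # the PVC outlives individual pods
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```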
This is going to be a generic question.
We are a young startup faced with the inevitable problem of scaling and during our research, Apache Mesos seemed like a good fit for our architecture, which is –
Core Scala-based microservices, each responsible for dealing with a part of our database, which is mainly MySQL
Middleware microservices to deal with some other persistent data-storage systems like MongoDB, Elasticsearch, etc.
Which basically means that we can containerise all of our services and ship them to a single datacenter, which can then deploy these containers in a topology-agnostic way.
What we are currently stumped by is –
Mesos doesn't seem to have any native support for MySQL
Container based persistence seems awfully tricky and hard to manage/maintain.
We'd like to continue using MySQL/MongoDB/ElasticSearch because migrating to Cassandra etc. at this stage (we are a small team) is too much of an overhead and hence not an option.
What are the best strategies for this problem?
Mesos provides persistent volumes support for storage-like services.
If you want to run MySQL on Mesos, consider trying https://github.com/apache/incubator-cotton
After some research we decided not to try Cotton, but we're still sticking with deploying our services across a Mesos cluster.
Instead of hosting our own MySQL database, we decided to outsource it to Amazon RDS. But we're now faced with the same problem for our other databases, like MongoDB.
It seems to me that both tools are used to easily install and automatically configure applications.
However, I've only used Docker a little and haven't used Ansible at all, so I'm a little confused.
Whenever I search for a comparison between these two technologies, I find details about how to use them in combination.
There are many reasons most articles talk about using them together.
Think of Ansible as a way of installing and configuring a machine where you can go back and tweak any individual step of that install and configuration in the future. You can then scale that concept out to many machines as you are able to manage.
A key difference where Ansible has an advantage is that it can manage not just the internals of the machine, but also the other systems that surround the machine, such as networking, DNS and monitoring.
Building out many machines via Ansible takes pretty much as much time for 50 machines as for 1, as all 50 will be created step by step. If you are running a rolling deploy across multiple environments, it's this step-by-step build that takes up the time.
Now think of Docker as having built one of those individual machines: installed, configured and ready to be deployed wherever a Docker system is available (which is pretty much everywhere these days). The drawback here is that you don't get to manage all the other aspects needed to make Docker containers actually work, and tweaking them long term isn't as much fun as it sounds if you haven't automated the configuration (hence Ansible helps here).
Scaling from 1 to 50 Docker machines once you have already created the initial image is blindingly fast compared to the step-by-step approach Ansible takes, and this is most obvious during a rolling deploy of many machines in smaller groups.
Each has its drawbacks in either ability or speed. Combine them both, however, and it can be pretty awesome. As with most of the articles you have no doubt already read, I would recommend using Ansible to create (and update) your base Docker container(s) and then using Ansible to manage the rollout of however many containers you need to satisfy your application's usage.
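That combination can be sketched as a playbook like the following (illustrative only: the module names are from the community.docker collection, and the image/app names and version variable are placeholders):

```yaml
- name: Build the base image with Ansible, then roll it out
  hosts: docker_hosts
  # serial: "25%"   # uncomment to roll out in smaller groups
  tasks:
    - name: Build the application image from a Dockerfile kept in the repo
      community.docker.docker_image:
        name: myapp
        tag: "{{ app_version }}"
        source: build
        build:
          path: ./myapp

    - name: Run (or replace) the container from that image
      community.docker.docker_container:
        name: myapp
        image: "myapp:{{ app_version }}"
        state: started
        recreate: true
        published_ports:
          - "8080:8080"
```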
They are completely different things. Ansible is used to automate the configuration and management of machines/containers, and Docker is a lightweight container system for Linux.
http://www.ansible.com/home
https://www.docker.com/
I'm new to Logstash, but I like how easy it makes shipping logs and aggregating them. Basically it just works. One problem I have is that I'm not sure how to make my configurations maintainable. Do people usually have one monolithic configuration file with a bunch of conditionals, or do they separate them out into different configurations and launch an agent for each one?
We heavily use Logstash to monitor ftbpro.com. I have two notes which you might find useful:
You should run one agent (process) per machine, not more. A Logstash agent requires a fair amount of CPU and memory, especially under high load, so you don't want to run more than one on a single machine.
We manage our Logstash configurations with Chef. We have a separate template for each configuration, and Chef assembles the configuration based on the machine's roles. So the final result is one large configuration on each machine, but in our repository the configurations are separate and thus maintainable.
Hope this helps you.
I'll offer the following advice:
Send your data to Redis as a "channel" rather than a "list", based on time and date, which makes managing Redis a lot easier.
http://www.nightbluefruit.com/blog/2014/03/managing-logstash-with-the-redis-client/
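The linked post's suggestion looks roughly like this in the shipper's output config (host and key pattern are examples; the `data_type` option of the Redis output accepts "list" or "channel"):

```
output {
  redis {
    host      => "redis.example.com"
    data_type => "channel"
    key       => "logstash-%{+YYYY.MM.dd}"
  }
}
```

The indexer side would then read with the Redis input using a matching channel (or a pattern channel such as "logstash-*"), so old dated channels can be left behind instead of trimming one ever-growing list.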