How to keep a data store portable in Docker? - mysql

I am in the progress of changing my development environment to Docker and I'm pretty happy so far but I have one basic question. But first let me describe on what kind of setup I've landed.
I'm using the example of an environment for web development.
I'm organizing every service in its own container, so
PHP who talks to a mysql container and has a data container (called app) for the source.
nginx links to the PHP container and serves the files from the data container (app).
app is basically the same container as PHP (to save space) and mounts my project folder into the container. app then serves the project folder to the other containers.
then there is a mysql container who has his own data container called data
a phpmyadmin container that talks to the mysql container
and finally there is data, the data container for the DB.
I'm not sure if the benefits are clear for everyone, so there it is (because you could put everything into one container...).
Mounting the project folder from my host machine into the Docker container lets me use my favorite editor and gives me continuous development.
Decoupling the database engine from its store gives you the freedom to change the engine but keep the data. (And of cause you don't have to install any programming stuff apart from an editor and Docker.)
My goal is to have the whole setup highly portable, so having the latest version of my project code on the host system and not living inside a container is a huge plus. I am organizing the setup described above in a ´docker-compose.yml´ file inside my project folder. So I can just copy that whole project folder to a different machine, type ´docker-compose´ and be up and running.
I actually have it in my Dropbox and can switch machines just like that. Pretty sweet.
But there is one drawback. The DB store is not portable as it lies somewhere in the Virtualbox file system. I tried mounting the data store into the host OS but that doesn't really work. The files are there but I get various errors when I try to read or write to it.
I guess my question is if there is a best practice to keep the database store in sync (or highly portable) between different dev machines.

I'd nix the data containers and switched over to named volumes. Data containers haven't been needed for quite a while despite some outdated documentation indicating otherwise.
Named volumes let you select from a variety of volume drivers that make it possible to mount the data from outside sources (including NFS, gluster, and flocker). It also removes the requirement to pick a container that won't have a significant disk overhead, allows you to mount the folders in any location in each container, and separates container management from data management (so a docker rm -v $(docker ps -aq) doesn't nuke your data).
The named volume is as easy to create as giving the volume a name on the docker run, e.g. docker run -v app-data:/app myapp. And then you can list them with docker volume ls.

Related

PHP, Apache and MySQL in the same Docker container (same Dockerfile)

I need an image with all 3 items in the same container, I know it is not a good practice, I know docker is solving the scalability, redundancy and availability issues by splitting services, etc. For all my projects I separate the services but for this project in specific I need all 3 services running under the same container.
So far I can run apache or mysql but not both at the same time, some issues with the entrypoints, some issues with the mysql permissions and I still haven't been able to get them all together in the same container up and running.
Has anyone faced this issue? Maybe some documentation I have missed?
Thank you
I've solved this before for a purely-local dev testing environment (because as you stated, this isn't normally a good practice).
What I did was start my Docker image from Alpine Linux, and then installed PHP, MySQL, and Nginx on top of it. You'll also need to COPY your source files into the container and set their appropriate permissions with chmod or chown.
Alternatively, you can use a volume as well, but depending on your OS you might run into permissions issues unless you create the same user/group in your container that owns the files on your local system.
If you'd like some inspiration, you can view the Dockerfile for the container that I've mentioned in the above.

I need a foolproof checklist for hosting an already built e-commerce site on Amazon Web Services

I have built an e-commerce website on my local computer that uses Django version 2.2 and python 3.7.
The website consists of:
fancyfetish is the main project directory.
The apps, (cart, users, baseapp, products, blog) are all stored in their own directory 'apps.
Within the settings folder I have three settings files:
- production.py
- base.py
- development.py
The static file in the main directory is where I put collectstatic files.
Media is where I store externally uploaded images (product images for example)
Docs is just random bits like a hand drawn site layout.
Static files like JS and CSS are stored within baseapp, within apps.
I want to host this website on Amazon Web Services, and I assume I need to use Elastic Beanstalk. I went through the process of trying to host with free version of EB, installed the EB CLI, and after using eb create and eb deploy on the CLI my website appeared.
However, the static files didn't load properly in the first instance because I had not properly configured DJANGO_SETTINGS_MODULE. I have now done this. But before deploying I added eb migrate functionality so that I could also migrate my database.
This seems to have messed everything up. I can no longer deploy because there is a DATABASE error, which I expected. The error said 'Not able to connect to MySQL database through 'localhost'. Well, of course it cant.
So, in order to deploy my site on AWS I needed to configure the databases, because with the eb migrate functionality it will no longer deploy without trying to also connect to my database using the settings I have configured.
I have so far, whilst in development mode, connected my project to MySQL and everything is running perfectly on localhost, with my models transferring beautifully to the database just as I would like.
I worked out that I need to create a database on AWS, obviously. So I set up an RDS. I didn't link it to my deployed application because it would appear that the application doesn't have an environment that I can see when I log into my console. So where my project has been deploying to I don't know, because it doesn't look like the CLI version is connected to the online version in my console.
So I thought I'd deal with that problem later and work out how to make a database, which I managed to do. However, migrating the database I already have up and running on MySQL to my RDS database seems impossible, and there are not very good instructions. Let alone trying to then connect said database to my deployed application, which doesn't seem to sync with my local app.
So, I have ended up deleting everything because I was becoming so confused, with so many new directories (.ebextensions etc etc) and a database that wont connect, a project that won't deploy, a database that wont point to my project etc. I ended up created an EC2 folder and all sorts, getting myself massively confused with what I actually need to do to make this whole thing work.
If any part of this ramble makes any sense to anyone out there, and you yourself have managed to deploy a larger django project to AWS and keep your existing databases then please do let me know. But I have a feeling this may be a long shot.
Basically I need a step by step list of what to do to deploy:
For example:
1) Create an elastic beanstalk instance
2) Create an environment on CLI that syncs to the one in my AWS console
etc
etc
(With how to's if you possibly have the time!)
Thank you, and I am so sorry for being so confused by something that may be simple
Edited to show my process:
I have built a directory called .ebextensions with a file within it called django.config with the following content:
option_settings:
aws:elasticbeanstalk:container:python:
WSGIPath: fancyfetish/wsgi.py
I have run the following command:
eb init -p python-3.6 fancyfetish
There was no output as a result of this in the terminal, however a directory was created called .elasticbeanstalk with one file in it called config.yml
I then typed eb init to create an SSH key pair and there was no output from this command at all:
As you can see I have tried doing this several times.
Instead I created a key pair manually within AWS console and a file automatically to my computer called keyname.pem
I then typed into the console
chmon 400 path/to/key/keyname.pem
This provided no output on the terminal so I cannot know if it worked.
I moved the downloaded SSH file into the .SSH directory in the Home directory of my computer, and then in the terminal typed:
eb init -k nameofkey
The output was:
WARNING: Uploaded SSH public key for "fancyfet" into EC2 for region us-
west-2.
I then went on to type
eb create fancyfet-env
And an environment was created with the following output:
I know that this has to do with databases and connecting to MySQL.
I then typed:
eb deploy
With the following output:
So now comes the bit where I get stuck, successfully creating a database that connects to my already existing database that is populated with database in MySQL, and connecting the project to the database.
HELP!(Thank you so much!)

How best to deploy this multi-tier app?

We currently have an application that runs on one dedicated server. I'd like to move it to OpenShift. It has:
A public-facing web app written in PhP
A Java app for administrators running on Wildfly
A Mysql database
A filesystem containing lots of images and documents that must be accessible to both the Java and PhP apps. A third party ftp's a data file to the server every day, and a perl script loads that into the db and the file system.
A perl script occasionally runs ffmpeg to generate videos, reading images from and writing videos to the filesystem.
Is Openshift a good solution for this, or would it be better to use AWS directly instead (for instance because they have dedicated file system components?)
Thanks
Michael Davis
Ottawa
The shared file system will definitely be the biggest issue here. You could get around it by setting up your applications to use Amazon S3 or some other shared Cloud file system though fairly easily.
As for the rest of the application, if I were setting this up I would:
Setup a scaled PHP application, even if you set the scaling to just use 1 gear this will allow you to put the MySQL database on it's own gear, and even choose a different size for it, such as having medium web gears (that run php) and a large gear that runs the MySQL database. This will also allow your wildfly gear to access the database since it will have a FQDN (fully qualified domain name) that any of your applications on your account can reach. However, keep in mind that it will use a non-standard port instead of 3306.
Then you can setup your WildFly server as whatever size you want, but, keep in mind that the MySQL connection variables will not be there, you will have to put them into your java application manually.
As for the perl script, depending on how intensive it is, you could run it on it's own whatever sized gear with some extra storage, or you could co-locate it with either the php or java application as a cron job. You can have it store the files on Amazon S3 and pull them down/upload them as it does the ffmpeg operations on them. Since OpenShift is also hosted on Amazon (In the US-EAST region) these operations should be pretty fast, as long as you also put your S3 bucket in the US-EAST region.
Those are my thoughts, hope it helps. Feel free to ask questions if you have them. You can also visit http://help.openshift.com and under "Contact Us" click on "Submit a request" and make sure you reference this StackOverflow question so I know what you are talking about, you can ask any questions you might have and we can discuss solutions for them.

Why docker (Container-based technologies) are useful

Lately I have been looking into docker and the usefulness it can provide to a SaaS company. I have spent some time learning about how to containerize apps and learning briefly about what docker and containers are. I have some problems understanding the usefulness of this technology. I have watched some videos from dockercon and it seems like everyone is talking about how docker makes deployment easy and how you deploy in your dev environment is guaranteed to run in production. However, I have some questions:
Deploying containers directly to production from the dev environment means that developers should develop inside containers which are the same containers that will run on production. This is practically not possible because developers like to develop on their fancy MACs with IDEs. Developers will revolt if they are told to ssh into containers and develop their code inside. So how does that work in companies that currently use docker?
If we assume that the development workflow will not change. Developers will develop locally and push their code into a repo or something. So where is the "containerizing the app" fits within the workflow?
Also if developers do not develop within containers, then the "what you develop is what you deploy and is guaranteed to work" assumption is violated. If this is the case, then I can only see that the only benefit docker offers is isolation, which is the same thing virtualization offer, of course with a lower overhead. So my question here would be, is the low overhead the only advantage docker has on virtualization? or are other things I dont see?
You can write the code outside of a container and transfer it into the container in many different ways. Some examples include:
Code locally and include the source when you docker build by using an ADD or COPY statement as part of the Dockerfile
Code locally and push your code to a source code repository like GitHub and then have the build process pull the code into the container as part of docker build
Code locally and mount your local source code directory as a shared volume with the container.
The first two allow you to have exactly the same build process in production and development. The last example would not be suitable for production, but could be quickly converted to production with an ADD statement (i.e. the first example)
In the Docker workflow, Developers can create both source code (which gets stored and shared in a repository like git, mercurial, etc) and a ready-to-run container, which gets stored and shared via a repository like https://registry.hub.docker.com or a local registry.
The containerized running code that you develop and test is exactly what can go to production. That's one advantage. In addition you get isolation, interesting container-to-container networking, and tool integration with a growing class of devops tools for creating, maintaining and deploying containers.

Where do you keep the configuration files for your stack?

For the website(s) I am a developer for we have a number of different technologies which make up our stack, each with a different set of configurations etc.
This is a Rails stack, so we're running things including:
Nginx w/ Passenger
Varnish
Redis
Memcached
MySQL
MongoDB
As we're continually tweaking our configs and changing them to support our continually changing system, and if we were to 'lose' the configurations (e.g. due to a server crash or otherwise) it would be a huge pain to rebuild from memory.
Given that version control would be extremely useful I can quite easily add these files into a Git repo or similar and store them in the cloud somewhere, but what about application-specific configuration (for example, URL Rewrite config for a website on a shared server)? Should these be in this same repo as well?
Put website specific stuff in the Git repo of that website, and system-wide stuff in a "systems" git repo.
If you are not currently using Source Control (of any kind) in your development environment, stop whatever you are doing and sort that out right now. That is the most important aspect of your setup.
At a very minimum you should keep EVERYTHING that is a text file and relates to your app (yes all config files, URL rewrites).
Others suggest you can put binary files also, but at the very minimum all source code, all config etc should be in source control.
By the end of the day :)