docker commit mysql doesn't save - mysql

I am trying to create a Docker image from a MySQL container.
The problem is that the database in the new image is empty, although files and folders that I create manually in the original container before the commit are copied.
The base image is the official mysql 5.6 and Docker is 1.11.
I checked that the folder /var/lib/mysql/d1 appears when a database is created, but the new image does not keep this folder, whereas folders created under / are persisted.

Several things are happening here:
First, docker commit is a code smell. It tends to be used by those creating images with a manual process, rather than automating their builds with a Dockerfile that would allow for easy recreation. If at all possible, I recommend you transition to a Dockerfile for your image creation.
Next, a docker commit will not capture changes made to a volume. And this same issue occurs if you try to update a volume with a RUN step in a Dockerfile. Both of these capture changes to the container filesystem and store those changes as a layer in the docker image, and the volumes are not part of the container filesystem. This is also visible if you run docker diff against a container. In this case, the upstream image has defined the volume in their Dockerfile:
VOLUME /var/lib/mysql
And docker does not have a command to undo a created volume from the Dockerfile. You would need to either directly modify the image definition from outside of docker (not recommended) or build your own upstream image with that step removed (recommended).
What the mysql image does provide is the ability to inject your own database creation scripts in /docker-entrypoint-initdb.d, which you can add with your own image that extends mysql, or mount as a volume. This is where you would inject your schema, or initialize from a known backup for development.
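A minimal sketch of the "extend mysql" route, in case it helps. The file name schema.sql, its contents, and the image name my-mysql are hypothetical and not taken from the question. Keep in mind that the entrypoint only runs these scripts on the first start, when the data directory is still empty.

```shell
# Work in a scratch directory so the sketch does not touch existing files.
build_dir=$(mktemp -d)

# Hypothetical schema; replace with your own schema or a known backup.
cat > "$build_dir/schema.sql" <<'EOF'
CREATE DATABASE IF NOT EXISTS d1;
USE d1;
CREATE TABLE IF NOT EXISTS users (id INT PRIMARY KEY, name VARCHAR(64));
EOF

# Extend the official image and drop the script into the init directory.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM mysql:5.6
COPY schema.sql /docker-entrypoint-initdb.d/
EOF

echo "Build context written to $build_dir"
# With a Docker daemon available, build and run it like this:
#   docker build -t my-mysql "$build_dir"
#   docker run -d -e MYSQL_ROOT_PASSWORD=my-secret-pw my-mysql
```

Every container started from this image initializes its own database from the script, so nothing has to be committed.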
Lastly, if the goal is to have persistence, you should store your data in a volume, not by committing containers:
docker run -v mysql-data:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql
The volume allows you to recreate the container, upgrade to a newer version of mysql when patches are released (e.g. security fixes) without losing your data.
To back up the volume, this exports it to a tgz (the z flag gzips the archive so the contents match the file name):
docker run --rm -v mysql-data:/source busybox tar -czf - -C /source . >backup.tgz
And to restore a volume, this creates one from a tgz:
docker run --rm -i -v mysql-data:/target busybox tar -xzf - -C /target <backup.tgz

You can make data persist by using docker commit command like below.
docker commit CONTAINER_ID REPOSITORY:TAG
docker commit | Docker Documentation
But just as BMitch's answer said, a docker commit will not capture changes made to a volume.
And usually you should use a volume to store data permanently and let a container be ephemeral without data being stored in itself.
So I guess many people think that trying to persist data without using a volume is a bad practice.
But there are some cases where you might consider committing and freezing data into an image.
For example, an image that already contains all the tables and records is handy for automated tests in CI.
In the case of GitHub Actions, the only thing you need to do is pull the image, create the database container, and run the tests against the database.
No need to think about migration of data.

Related

Github Action Service Container from Dockerfile in same repo

I'm learning Github Actions and designing a workflow with a job that requires a Service Container.
The documentation states that configuration must specify "The Docker image to use as the service container to run the action. The value can be the Docker base image name or a public docker Hub or registry". All of the examples in the docs use publicly-available Docker images, however I want to create a Service Container from a Dockerfile contained within my repo.
Is it possible to use a local Dockerfile to create a Service Container?
Because the job depends on a Service Container, that image must exist when the job begins, and therefore the image cannot be created by an earlier step in the same job. The image could be built in a separate job, but because jobs execute in separate runners I believe that Job 2 will not have access to the image created in Job 1. If this is true, could I follow this approach, using upload/download-artifact to provide Job 1's image to Job 2?
If all else fails, I could have Job 1 create the image and upload it to Docker Hub, then have Job 2 download it from Docker Hub, but surely there is a better way.
The GitHub Actions host machine (runner) is a fully loaded Linux machine, with everything everybody needs already installed.
You can easily launch multiple containers - either your own images, or public images - by simply running docker and docker-compose commands.
My advice to you is: Describe your service(s) in a docker-compose.yml file, and in one of your GitHub Actions steps, simply do docker-compose up -d.
You can create a docker image with the Dockerfile or docker-compose.yml residing inside the repo. Refer to this public gist, it might be helpful.
Instead of building multiple docker-images, you can use docker-compose. Docker-compose is the preferred way to deal with this kind of scenario.
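A rough sketch of the docker-compose suggestion, assuming the Dockerfile sits in the repository root. The service name db and the port mapping are hypothetical, and the file is written to a scratch directory here only for demonstration; in a real repo it would live next to the Dockerfile.

```shell
# Scratch directory standing in for the repository root.
ci_dir=$(mktemp -d)

# Hypothetical compose file: one service built from the local Dockerfile.
cat > "$ci_dir/docker-compose.yml" <<'EOF'
services:
  db:
    build: .
    ports:
      - "3306:3306"
EOF

echo "Compose file written to $ci_dir/docker-compose.yml"
# A workflow step placed after the checkout step could then run:
#   docker-compose up -d --build
#   ... run the tests against localhost:3306 ...
#   docker-compose down
```

Because the image is built inside the job itself, nothing has to be passed between jobs or pushed to a registry.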

Cannot map agent.conf using Cygnus docker installation

I have a problem installing Cygnus using Docker: I cannot work out where I should map my specific agent.conf.
The image I am using is from here.
When I try to map an agent.conf which has my specific setup into the container, the container starts and runs but the file is not copied. On top of that, any change I make to the file inside the container does not stay; it returns to the previous default state.
I have no issues with grouping_rules.conf using the same approach.
I used both docker and docker compose, with the same results.
The path I try to copy to is /opt/apache-flume/conf/agent.conf:
docker run -v /home/igor/Documents/cygnus/agent.conf:/opt/apache-flume/conf/agent.conf fiware/cygnus-ngsi
Can someone who managed to run it with their own config tell me if I misunderstood the location of agent.conf or something? This is weird: I have used many Docker images and never had an issue where I was not able to copy from my machine to a Docker container.
Thanks in advance.
** EDIT **
Link of agent.conf
Did you copy the agent.conf file to your host directory before starting the container?
As you can see here, when you bind-mount a host path with the "-v" option, Docker does not copy anything: it mounts the host file or directory over the mount point inside the container, hiding whatever the image had at that path. Therefore, you must first provide the agent.conf file on your host.
The reason is that when using a "bind mounted" directory from the
host, you're telling docker that you want to take a file or directory
from your host and use it in your container. Docker should not modify
those files/directories, unless you explicitly do so. For example, you
don't want -v /home/user/:/var/lib/mysql to result in your
home-directory being replaced with a MySQL database.
If you do not have access to the agent.conf file, you can download the template in the source code from the official cygnus github repo here. You can also copy it out of a running container using the docker cp command:
docker cp <containerId>:/file/path/within/container /host/path/target
Keep in mind that you will have to edit the agent.conf file to configure it according to the database you are using. The official doc describes how to configure Cygnus to use different sinks such as MongoDB, MySQL, etc.
I hope I have been helpful.
Best regards!

'undo' or 'cancel' dockerfile VOLUME to share mysql DB in registry

I'm inheriting from the mysql Dockerfile and want to move a VOLUME (/var/lib/mysql) back inside the container so I can distribute it from a registry.
Is there a way in my downstream Dockerfile to (a) undo the VOLUME declaration or (b) replace /var/lib/mysql with a symlink?
I'm giving up on this -- seems simpler to distribute a zipped copy of the DB data directory. If you have a better option, please post.
I had the exact same problem, just with another database (arangodb).
However, I did not find a direct solution for this problem, but in my case (this should also work with mysql), I simply changed the data directory of my database to a non-volume directory in the Dockerfile.
For now, this seems like the best solution, as you can build a full image that contains your data.
As L0j1k has argued vividly, in general it is a very bad idea to have your data dir inside of the container. However, there are situations where it makes sense, such as automated tests: run a container with test data, check that everything works as expected, and throw it away. Also, on OSX and Windows, volumes aren't native mounts (because Docker runs in a VM) and they can be painfully slow. So you might be better off copying your data to and from the container, depending on your situation.
While you can't undo the VOLUME directive you can simply create a new data dir and tell Mysql to use that:
FROM mariadb:latest
# Create data dir in /var/lib/data
RUN mkdir /var/lib/data
RUN chown mysql:mysql /var/lib/data
# Change data dir from /var/lib/mysql to /var/lib/data
RUN sed -i 's/\/var\/lib\/mysql/\/var\/lib\/data/g' /etc/mysql/my.cnf
Use with caution.
DO NOT ship your database data in the same image as your database! This is an antipattern and will create bigger problems almost immediately. Ship the data separately as an archive which you then mount into your database container via bind-mount (-v /home/foo/db:/var/lib/mysql). Bind-mount volumes in your docker run statement will override any VOLUME Dockerfile directive. Alternatively, create some automation to dump the database and ship that to your containers, then restore using the dump. Whatever you do will be better than creating an image with your data in the database image. Just as one example of why this is a bad idea: What happens when you need to move the data/database mutant which now has changes? You'll probably use docker export to dump the entire container's filesystem into a new image, and now you're passing around a big blob of crap which is hard to audit. Docker containers (and microservices in general) are designed to be ephemeral and stateless, which means you can hose any one container and recreate it and it'll continue working. You can't do this if you ship your blob of data inside the database image.
With respect to the VOLUME directive in that Dockerfile: Remember that Dockerfiles are used during docker build and therefore do not (and cannot) contain host-dependent information or actions. So the VOLUME /var/lib/mysql isn't making your image impossible to distribute. What that directive does is create a generic (i.e. non-bind-mount) data volume that persists the data of that directory beyond the lifetime of the container. It is not the same thing as a bind-mount volume for example in docker run -v "/var/docker/app/data:/var/lib/mysql" .... This Dockerfile directive does not prevent you from distributing the image because it does not specify host-dependent information.
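The "dump the database and restore from the dump" automation mentioned above could look roughly like this. It is only a sketch: the function names are made up, and it assumes the container was started with MYSQL_ROOT_PASSWORD set, as the official image expects. The DOCKER variable is an addition so the commands can be previewed with DOCKER=echo.

```shell
# Override with DOCKER=echo to print the commands instead of running them.
DOCKER=${DOCKER:-docker}

# dump_db CONTAINER FILE
# Write a SQL dump of all databases in CONTAINER to FILE on the host.
dump_db() {
  "$DOCKER" exec "$1" sh -c \
    'exec mysqldump --all-databases -uroot -p"$MYSQL_ROOT_PASSWORD"' > "$2"
}

# restore_db CONTAINER FILE
# Replay the SQL dump in FILE into the server running in CONTAINER.
restore_db() {
  "$DOCKER" exec -i "$1" sh -c \
    'exec mysql -uroot -p"$MYSQL_ROOT_PASSWORD"' < "$2"
}
```

Usage would be something like `dump_db some-mysql all-databases.sql` on the source machine and `restore_db some-mysql all-databases.sql` wherever a fresh container is running. The dump is a plain file that is easy to ship and audit, unlike an image with the data baked in.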

Move Docker Containers via export/import: lost Configuration like start command

I want to move containers from one host to another. The containers have updated data in their filesystem, so I do not want to move the original images (docker save) but containers (using docker export).
So I use
docker export l4bnode > l4bnode.tar
on the old host, copy the file to new host, and import image
cat l4bnode.tar | docker import - andi/l4bnode
on the new one. But.. it looks like all the configuration data I had in the Dockerfile (and that I also could specify/had specified in the command line when running the container) is lost. I tried
docker run andi/l4bnode
and get
docker: Error response from daemon: No command specified.
Using docker inspect, I see that all data on the imported image is empty, though it is set on the exported running container. I mainly am missing startup command, working directory, environment variables and exposed ports (some of which I have to change then due to the migration and new environment).
How can I apply the original configuration on the new host, or preferably, migrate it properly?
You can commit the current container state as new image. Then use save/load on the new image.
That being said this is something you generally should try to avoid. Runtime data should be kept in volumes, any configuration changes should happen via Dockerfile rebuilds.
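For completeness, the commit plus save/load route might be sketched like this. The function names and the tag andi/l4bnode:migrated are hypothetical, and the DOCKER variable is an addition so the commands can be previewed with DOCKER=echo. Unlike export/import, commit keeps the image configuration (command, working directory, environment, exposed ports), but data stored in volumes is still not included.

```shell
# Override with DOCKER=echo to print the commands instead of running them.
DOCKER=${DOCKER:-docker}

# export_container_as_image CONTAINER IMAGE ARCHIVE
# Old host: snapshot CONTAINER as IMAGE, then write IMAGE to ARCHIVE.
export_container_as_image() {
  "$DOCKER" commit "$1" "$2" &&
  "$DOCKER" save -o "$3" "$2"
}

# import_image ARCHIVE
# New host: load the image, including its metadata, from ARCHIVE.
import_image() {
  "$DOCKER" load -i "$1"
}
```

On the old host you would call `export_container_as_image l4bnode andi/l4bnode:migrated l4bnode-image.tar`, copy the tar file over, and on the new host run `import_image l4bnode-image.tar` followed by `docker run andi/l4bnode:migrated`.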

Create Dockerfile interactive?

If you look at Dockerfiles, they often contain lines like this:
sed 's/main$/main universe/' -i /etc/apt/sources.list
I think it is difficult to set up things like this.
Is it possible to launch a default OS image, enter it interactively with a shell, make some modifications, and then print out the filesystem diff?
The diff should then be used as the Dockerfile for recreating the image.
But maybe I am missing something, since I am new to docker.
You can create docker images several ways.
I tend to have two windows open when I create a new docker image. One for my docker run -i -t centos bash, where I am writing all my commands to get it the way I want, and the other one with the Dockerfile, so I can put in whatever I do.
When it comes to config files, I put them in files/folders that match the layout on the image.
For example, if I change /etc/something/file.conf, I create the file as etc/something/file.conf in the same directory as my Dockerfile, and then use Docker's ADD command to add it whenever I do a build.
This works perfectly, since I can have all this in a git repository with a README.md containing the info I need for running/building the image.
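As a rough illustration of that layout (the config content and the image name my-image are placeholders, and a scratch directory stands in for the git repository):

```shell
# Scratch directory standing in for the repository.
ctx=$(mktemp -d)

# Mirror the path of the config file as it appears inside the image.
mkdir -p "$ctx/etc/something"
printf '# edited settings go here\n' > "$ctx/etc/something/file.conf"

# The Dockerfile adds the hand-edited file at the matching path.
cat > "$ctx/Dockerfile" <<'EOF'
FROM centos
ADD etc/something/file.conf /etc/something/file.conf
EOF

echo "Build context written to $ctx"
# With a Docker daemon available:
#   docker build -t my-image "$ctx"
```

Each further config file gets the same treatment: one file in the mirrored tree and one ADD line.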
The other thing you can do is run docker ps -a after you are done with the changes, get the ID of the container you just configured, and docker commit it to turn it into an image. You can tag this new image, or start it with docker run abc0123 bash just like you would a normal docker image.
The problem with this is that you won't be able to easily build it next time without bringing the whole image.
Dockerfiles with ADD is the way to go!
If you do not want to run sed (which is used to preserve the default file and make minimal changes to it), you can simply ADD the modified file.
For that you can docker run -it --rm thebaseimage /bin/sh (or any other shell that is provided) and edit it in place. Then just copy it outside the container (or docker export it) and use it on your build.
The downside of ADD vs RUN sed… is that, if something changes in a new version of your base image, you will overwrite those changes.
The Dockerfile is (mostly) equivalent to a series of docker run and docker commit commands. You wouldn't want to look at the docker diff to see what files changed -- you'd want to see what docker run commands had occurred. You could get these from your host shell history and process these into a Dockerfile.