I have two types of Docker containers: one with a web application (nginx/php-fpm) and one with a MySQL database. Both are connected through a network. The app container is aware of the DB container; however, the DB container doesn't know whether zero, one, or more app containers are available. Both types of containers use Supervisord.
The database container has to start up mysqld, which can take a few seconds.
The other container has to perform some startup actions, some of which require database access. As these actions depend on the DB container, I have put a loop at the top of the script that waits for the DB server to become available:
try=0
ok=0
# Poll the DB server once per second until it accepts connections, giving up after 30 tries.
until mysql -h"$dbhost" -u"$dbuser" -p"$dbpass" -e "USE $dbname" && ok=1; do
[ $((++try)) -gt 30 ] && break
sleep 1
done
if [ "$ok" -gt 0 ]; then
# DO STUFF
else
exit 1
fi
While this does work, I see two downsides: First, the script will fail if a DB container is down or takes longer than a certain timeout to start when the app container comes up. Second, the app container won’t know if there are changes on the DB server (e.g. migrations).
While I'm aware of Supervisord events, I wonder: how can I notify an arbitrary number of other containers in the same network of such events?
(NOTE: I'm not restricted to using Supervisord for this; I just feel that this is the most promising approach.)
You might want to use Docker Compose.
You can also add a healthcheck to your database container and a condition to the web server container. Something like this:
healthcheck:
  test: ["CMD-SHELL", "mysql_check.sh"]
  interval: 30s
  timeout: 30s
  retries: 3
and
depends_on:
  mysql-database:
    condition: service_healthy
Compose will wait for the database to be ready before starting the web server container.
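To see how the two snippets fit together, here is a minimal docker-compose.yml sketch. The service names, images, environment values, and the health probe command are assumptions for illustration (the answer's mysql_check.sh would go where the mysqladmin probe is); adapt them to your setup.
services:
  mysql-database:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example   # placeholder credential
    healthcheck:
      # Any command that exits 0 once the server accepts connections will do;
      # the answer's mysql_check.sh script could be used here instead.
      test: ["CMD-SHELL", "mysqladmin ping -h 127.0.0.1 --silent"]
      interval: 30s
      timeout: 30s
      retries: 3
  webapp:
    image: my-webapp:latest          # placeholder app image
    depends_on:
      mysql-database:
        condition: service_healthy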
Related
I still don't know if the issue is with Docker networking, Node, or the connection from Node to MySQL.
I have a Docker container that runs Express Gateway for API management. Every once in a while it starts giving "Operation timed out".
The error is coming from Node.js, but when it happens:
I can't see anything in the logs of the container
Running tcpdump from the server shows a call being made to the Docker API but it returns a 500 response (when running correctly, I can see the subsequent call to port 3306 to connect to the database)
Running tcpdump from inside the Docker container returns absolutely nothing (when working correctly, I can see the calls)
Calls that don't require a database connection work correctly, but I still can't see their logs in the container nor their calls in tcpdump
It's as if the server is calling another Docker container, but I searched all volumes and images; there's no duplicate.
I tried to check the following:
Resources on the same machine
Resources on the database machine
tcpdump with wireshark on both the server and the docker
Adding connection pooling to Sequelize (in case a connection to the database is sometimes causing the block)
Checking all OAuth2 routes in case something is redirecting to a localhost server
Literally adding logs everywhere just to see a log when this happens, but in vain
telnet from the server to localhost with the external port and to 172.17.0.2 with the internal port -> slight difference when I do it from localhost: after a while I receive "Connection closed by foreign host"
I don't know if it's normal for a Docker container to hang like this or if an image was not correctly deleted, but things simply worked when I created the container with another name.
I read all I could find, but documentation on this scenario is scant or unclear for podman. I have the following (contrived) ROOTLESS podman setup:
pod-1 name: pod1
Container names in pod1:
p1c1 -- This is also its assigned hostname within pod1
p1c2 -- This is also its assigned hostname within pod1
p1c3 -- This is also its assigned hostname within pod1
pod-2 name: pod2
Container names in pod2:
p2c1 -- This is also its assigned hostname within pod2
p2c2 -- This is also its assigned hostname within pod2
p2c3 -- This is also its assigned hostname within pod2
I keep certain containers in different pods specifically to avoid port conflicts and to manage containers as groups.
QUESTION:
Given the above topology, how do I communicate between, say, p1c1 and p2c1? In other words, step by step, what podman(1) commands do I issue to collect the necessary addressing information for pod1:p1c1 and pod2:p2c1, and then use that information to configure the applications in them so they can communicate with one another?
Thank you in advance!
EDIT: For searchers, additional information can be found here.
Podman doesn't have anything like the "services" concept in Swarm or Kubernetes to provide for service discovery between pods. Your options boil down to:
Run both pods on the same network, or
Expose the services by publishing them on host ports, and then access them via the host
For the first solution, we'd start by creating a network:
podman network create shared
And then creating both pods attached to the shared network:
podman pod create --name pod1 --network shared
podman pod create --name pod2 --network shared
With both pods running on the same network, containers can refer to the other pod by name. E.g., if you were running a web service in p1c1 on port 80, in p2c1 you could curl http://pod1.
For the second option, you would do something like:
podman pod create --name pod1 -p 1234:1234 ...
podman pod create --name pod2 ...
Now if p1c1 has a service listening on port 1234, you can access that from p2c1 at <some_host_address>:1234.
If I'm interpreting option 1 correctly: if the applications in p1c1 and p2c1 both use, say, port 8080, then there won't be any conflict anywhere (neither within the pods nor on the outer host), provided I publish them with something like 8080:8080 for the app in p1c1 and 8081:8080 for the app in p2c1? Is this interpretation correct?
That's correct. Each pod runs with its own network namespace (effectively, its own IP address), so services in different pods can listen on the same port.
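For illustration, here is a rough sketch of that setup with podman commands; the image name is a placeholder, and only the pod names and ports come from the discussion above.
# Each pod publishes container port 8080 on a different host port.
podman pod create --name pod1 --network shared -p 8080:8080
podman pod create --name pod2 --network shared -p 8081:8080
# Run an app listening on 8080 inside each pod (my-app-image is hypothetical).
podman run -d --pod pod1 --name p1c1 my-app-image
podman run -d --pod pod2 --name p2c1 my-app-image
# Within the shared network, containers still reach the other pod by name,
# e.g. curl http://pod1:8080 from a container in pod2.
# From the host, use the published ports: localhost:8080 and localhost:8081.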
Can the network (not ports) of a pod be reassigned once running? REASON: I'm using podman-compose(1), which creates things for you in a pod, but I may need to change things (like the network assignment) after the fact. Can this be done?
In general you cannot change the configuration of a pod or a container; you can only delete it and create a new one. Assuming that podman-compose has relatively complete support for the docker-compose.yaml format, you should be able to set up the network correctly in your docker-compose.yaml file (you would create the network manually, and then reference it as an external network in your compose file).
Here is a link to the relevant Docker documentation. I haven't tried this myself with podman.
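A minimal sketch of what that could look like in the compose file, assuming the shared network created earlier with podman network create shared; the service name and image are placeholders:
services:
  p1c1:
    image: my-app-image   # placeholder
    networks:
      - shared
networks:
  shared:
    external: true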
The accepted answer from @larsks only works for rootful containers; in other words, you have to run every podman command with a sudo prefix. (For instance, when you connect to a Postgres container from a Spring Boot application container, you will get a SocketTimeout exception.)
If the two containers run on the same host, get the IP address of the host and connect to <ipOfHost>:<port>. Example: 192.168.1.22:5432
For more information you can read this blog: https://www.redhat.com/sysadmin/container-networking-podman
Note: the above solution of creating networks only works in rootful mode; you cannot do podman network create as a rootless user.
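As a rough sketch of that host-port approach under rootless podman (the image, database, and port are placeholders; the host IP is the example from above):
# Publish the database port on the host from the rootless container.
podman run -d --name db -p 5432:5432 docker.io/library/postgres:15
# From the application container, connect via the host's IP instead of a container name,
# e.g. jdbc:postgresql://192.168.1.22:5432/mydb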
This only occurs when building the project with docker-compose.yml, but for some reason Docker doesn't wait for the port to become active before starting the next service.
My question is: is there any way to do this without using wait-for-it or similar programs?
docker-compose logs:
Edit: I have also tried this, which was unsuccessful.
Thanks in advance!
depends_on only means that Compose will wait for the container to start; it does not mean that the service inside that container is ready.
Look here to see how you have to wait until the DB is ready: https://github.com/api-platform/api-platform/blob/master/api/docker/php/docker-entrypoint.sh#L29
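The linked entrypoint boils down to a retry loop before handing off to the main process. Here is a rough sketch of the idea (the host, credentials, and variable names are placeholders, not taken from that file):
#!/bin/sh
# Block until the database accepts connections, then start the real command.
until mysql -h"$DB_HOST" -u"$DB_USER" -p"$DB_PASS" -e "SELECT 1" >/dev/null 2>&1; do
  echo "Waiting for the database..."
  sleep 2
done
exec "$@"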
I have a small Python Flask server running on OpenShift starter us-west-1. I use a MySQL container for data storage. Yesterday I scaled down the MySQL application from 1 to 0 pods. When I tried to scale it back up to 1 pod, the container creation keeps failing when trying to mount the persistent volume:
Failed to attach volume "pvc-8bcc2d2b-8d92-11e7-8d9c-06d5ca59684e" on node "ip-XXX-XXX-XXX-XXX.us-west-1.compute.internal" with: Error attaching EBS volume "vol-08b957e6975554914" to instance "i-05e81383e32bcc8ac": VolumeInUse: vol-08b957e6975554914 is already attached to an instance status code: 400, request id: 3ec76894-f611-445f-8416-2db2b1e9c5b7
I have seen some suggestions that say that the deployment strategy needs to be "Recreate", but it is already set like that. I have tried scaling down and up again multiple times. I have also tried to manually stop the pod deployment and start a new one, but it keeps giving the same errors.
Any suggestions?
After I contacted support, they fixed the volume, and they recently also updated the platform to prevent this bug in the future.
I'm setting up a PrestaShop installation on a development server, which is a GCE instance, and I'm using Cloud SQL as the database server. Everything works just fine except one thing: whenever there is a long period of inactivity on the site, the first page load after that always gives me this error:
Link to database cannot be established: SQLSTATE[HY000] [2003]
If I refresh the page, the error is gone and never appears again until I stop using the site for an hour or so. It almost looks like the database instance is going into sleep mode or something like that.
The reason I mentioned PrestaShop is that I never get this error when using Adminer or when connecting to the database from the mysql console client.
With the per-use billing model, instances are spun down after a 15-minute timeout to save you money. They then take a few seconds to be spun up when next accessed. It may be that PrestaShop is timing out on these first requests (though I have no experience with that application).
Try changing your instance to package billing, which has a 12-hour timeout, to see if this helps:
https://developers.google.com/cloud-sql/faq#how_usage_calculated
According to GCE documentation,
Once a connection has been established with an instance, traffic is permitted in both directions over that connection, until the connection times out after 10 minutes of inactivity
I suspect that might be the cause. To get around it, you can try lowering the TCP keepalive time.
Refer here: https://cloud.google.com/sql/docs/compute-engine-access
To keep long-lived unused connections alive, you can set the TCP keepalive. The following commands set the TCP keepalive value to one minute and make the configuration permanent across instance reboots.
# Display the current tcp_keepalive_time value.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
# Set tcp_keepalive_time to 60 seconds and make it permanent across reboots.
$ echo 'net.ipv4.tcp_keepalive_time = 60' | sudo tee -a /etc/sysctl.conf
# Apply the change.
$ sudo /sbin/sysctl --load=/etc/sysctl.conf
# Display the tcp_keepalive_time value to verify the change was applied.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time