I have an application that runs well with Celery locally, but when I deploy it to Elastic Beanstalk, Celery seems to shut down or not run my tasks. I am using Supervisor to run Celery.
My supervisord configuration is shown below, after the error output.
I also set a global environment variable C_FORCE_ROOT=true.
Error:
2020-12-21 04:49:56,076 INFO waiting for app, celery-worker to die
[2020-12-21 04:49:57,732: DEBUG/MainProcess] removing tasks from inqueue until task handler finished
Unrecoverable error: WorkerLostError('Could not start worker processes')
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/celery/worker/worker.py", line 208, in start
self.blueprint.start(self)
File "/usr/local/lib/python3.8/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/usr/local/lib/python3.8/site-packages/celery/bootsteps.py", line 369, in start
return self.obj.start()
File "/usr/local/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 318, in start
blueprint.start(self)
File "/usr/local/lib/python3.8/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/usr/local/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 599, in start
c.loop(*c.loop_args())
File "/usr/local/lib/python3.8/site-packages/celery/worker/loops.py", line 59, in asynloop
raise WorkerLostError('Could not start worker processes')
billiard.exceptions.WorkerLostError: Could not start worker processes
[supervisord]
nodaemon=true
[program:app]
command = gunicorn -b 0.0.0.0:5000 --worker-class gevent application.app:app
user=root
directory = /usr/src/app/restful
priority = 900
autostart=true
autorestart = true
stopsignal = TERM
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdin_open = true
tty=true
[program:celery-worker]
command= python -m celery worker -A application.libs.celery_config.celery --loglevel=DEBUG --uid=nobody --gid=nogroup
user=root
directory = /usr/src/app/restful
autostart=true
autorestart = false
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdin_open = true
tty=true
[program:celery-beat]
command= python -m celery beat -A application.libs.celery_config.celery --schedule=/tmp/celerybeat-schedule --loglevel=DEBUG
user=root
directory = /usr/src/app/restful
autostart=true
autorestart = false
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdin_open = true
tty=true
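One way to narrow this down (a diagnostic sketch, not a confirmed fix): exec into the running Elastic Beanstalk container and start the worker by hand with a pool that does not fork child processes, such as the solo pool. If that starts cleanly, the failure is specific to spawning prefork worker processes (for example the --uid=nobody/--gid=nogroup switches combined with supervisord running the program as root), rather than to the Celery app itself:
cd /usr/src/app/restful
python -m celery worker -A application.libs.celery_config.celery --loglevel=DEBUG --pool=solo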
My configuration is as follows:
I am running a Django REST Framework backend with a MySQL database. I am trying to run the Django backend in its own Docker container, as well as running MySQL in its own Docker container. It seems that Django is not able to connect to the MySQL database when my containers are running.
Database settings in Django:
DATABASES = {
    "default": {
        "ENGINE": os.environ.get("SQL_ENGINE", "django.db.backends.sqlite3"),
        "NAME": os.environ.get("SQL_DATABASE", BASE_DIR / "db.sqlite3"),
        "USER": os.environ.get("SQL_USER", "user"),
        "PASSWORD": os.environ.get("SQL_PASSWORD", "password"),
        "HOST": os.environ.get("SQL_HOST", "localhost"),
        "PORT": os.environ.get("SQL_PORT", "5432"),
    }
}
Dockerfile:
FROM python:3.10.2-slim-buster
ENV PYTHONUNBUFFERED 1
RUN mkdir /code
WORKDIR /code
RUN apt update \
&& apt install -y --no-install-recommends python3-dev \
default-libmysqlclient-dev build-essential default-mysql-client \
&& apt autoclean
RUN pip install --no-cache-dir --upgrade pip
COPY ./requirements.txt /code/
RUN pip install --no-cache-dir -r requirements.txt
COPY ./neura-dbms-backend /code/
EXPOSE 7000
requirements.txt:
Django
djangorestframework
django-cors-headers
requests
boto3
django-storages
pytest
mysqlclient==2.1.1
django-use-email-as-username
djangorestframework-simplejwt
gunicorn
docker-compose.yml:
version: "3.8"
services:
neura-dbms-backend:
build:
context: ./DBMS/neura-dbms-backend
command: [sh, -c, "python manage.py runserver 0.0.0.0:7000"]
image: neura-dbms-backend
container_name: neura-dbms-backend
volumes:
- ./DBMS/neura-dbms-backend/neura-dbms-backend:/code
ports:
- 7000:7000
networks:
- docker-network
environment:
- DEBUG=1
- SECRET_KEY=${SECRET_KEY_DBMS}
- DJANGO_ALLOWED_HOSTS=${DJANGO_ALLOWED_HOSTS}
- DJANGO_ALLOWED_ORIGINS=${DJANGO_ALLOWED_ORIGINS}
- JWT_KEY=${JWT_KEY}
- SQL_ENGINE=django.db.backends.mysql
- SQL_DATABASE=db_neura_dbms
- SQL_USER=neura_dbms_user
- SQL_PASSWORD=super_secure_password
- SQL_HOST=db_neura_dbms
- SQL_PORT=5432
depends_on:
- "db_neura_dbms"
db_neura_dbms:
image: mysql:latest
volumes:
- mysql_data_db_neura_dbms:/var/lib/mysql/
environment:
- MYSQL_DATABASE=db_neura_dbms
- MYSQL_USER=neura_dbms_user
- MYSQL_PASSWORD=super_secure_password
- MYSQL_ROOT_PASSWORD=super_secure_password
networks:
- docker-network
networks:
docker-network:
driver: bridge
volumes:
mysql_data_db_neura_dbms:
I am able to build the images for Django and the database, but when I try to run the containers, I get the following error from the Django container:
neura-dbms-backend | System check identified no issues (0 silenced).
neura-dbms-backend | Exception in thread django-main-thread:
neura-dbms-backend | Traceback (most recent call last):
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 282, in ensure_connection
neura-dbms-backend | self.connect()
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
neura-dbms-backend | return func(*args, **kwargs)
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 263, in connect
neura-dbms-backend | self.connection = self.get_new_connection(conn_params)
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
neura-dbms-backend | return func(*args, **kwargs)
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/django/db/backends/mysql/base.py", line 247, in get_new_connection
neura-dbms-backend | connection = Database.connect(**conn_params)
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/MySQLdb/__init__.py", line 123, in Connect
neura-dbms-backend | return Connection(*args, **kwargs)
neura-dbms-backend | File "/usr/local/lib/python3.10/site-packages/MySQLdb/connections.py", line 185, in __init__
neura-dbms-backend | super().__init__(*args, **kwargs2)
neura-dbms-backend | MySQLdb.OperationalError: (2002, "Can't connect to MySQL server on 'db_neura_dbms' (115)")
What am I missing? Thanks!
So I added a script so that Django waits for the mysql database to be ready before it connects:
#!/bin/bash
if [ "$SQL_HOST" = "db" ]
then
    echo "Waiting for mysql..."
    while ! </dev/tcp/$SQL_HOST/$SQL_PORT; do sleep 1; done;
    echo "MySQL started"
fi
# python manage.py migrate
exec "$@"
When I first run the Docker containers, it seems that MySQL runs through some sort of initial setup; Django then tries to connect and fails.
If I then kill the containers and run them again, the MySQL setup is finished and Django is able to connect to the database. I wonder if there is a way for Django to wait for this setup to finish as well?
depends_on only waits until the database container has started; after that, MySQL still needs some time before it is ready to accept connections.
What you can do is create a custom management command like the one below (this one is for Postgres; for MySQL you will need to catch the MySQLdb exception instead of Psycopg2Error):
import time

from psycopg2 import OperationalError as Psycopg2Error
from django.db.utils import OperationalError
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    """
    Django command to wait for database
    """

    def handle(self, *args, **options):
        """
        Command entrypoint
        """
        self.stdout.write("Checking database availability\n")
        db_up = False
        seconds_cnt = 0
        while not db_up:
            try:
                self.check(databases=['default'])
                db_up = True
                self.stdout.write(
                    self.style.WARNING(
                        "Available within {} seconds".format(seconds_cnt)))
                self.stdout.write(self.style.SUCCESS("Database available!"))
            except (Psycopg2Error, OperationalError):
                seconds_cnt += 1
                self.stdout.write(
                    self.style.WARNING(
                        "Database unavailable waiting... {} seconds"
                        .format(seconds_cnt)))
                time.sleep(1)
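For MySQL, a hedged adaptation of the same command (it assumes the mysqlclient package already listed in requirements.txt) simply swaps in the MySQLdb exception:
import time

from MySQLdb import OperationalError as MySQLdbOperationalError  # provided by mysqlclient
from django.db.utils import OperationalError
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    """Django command that blocks until the MySQL database accepts connections."""

    def handle(self, *args, **options):
        self.stdout.write("Checking database availability\n")
        db_up = False
        while not db_up:
            try:
                self.check(databases=['default'])
                db_up = True
                self.stdout.write(self.style.SUCCESS("Database available!"))
            except (MySQLdbOperationalError, OperationalError):
                self.stdout.write(self.style.WARNING("Database unavailable, waiting..."))
                time.sleep(1)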
Your docker-compose command can then be updated to:
command: >
  sh -c "python manage.py wait_for_db &&
         python manage.py runserver 0.0.0.0:7000"
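For Django to discover the command, it has to live in a management/commands package inside one of your installed apps; a minimal layout sketch (the app name here is only a placeholder):
yourapp/
    management/
        __init__.py
        commands/
            __init__.py
            wait_for_db.py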
I was checking the GitHub code of LLFF (https://github.com/Fyusion/LLFF) and Non-Rigid NeRF (https://github.com/facebookresearch/nonrigid_nerf) and followed the suggested steps to install the requirements. While running a preprocessing script that recovers poses from images via SfM using COLMAP, I got the following error on a remote server. Can anyone please help me solve this?
python preprocess.py --input data/example_sequence1/
Need to run COLMAP
qt.qpa.xcb: could not connect to display
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: eglfs, minimal, minimalegl, offscreen, vnc, webgl, xcb.
*** Aborted at 1660905461 (unix time) try "date -d @1660905461" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGABRT (@0x3e900138a9f) received by PID 1280671 (TID 0x7f5740d49000) from PID 1280671; stack trace: ***
@ 0x7f57463a2197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f574421f420 (unknown)
@ 0x7f5743bf300b gsignal
@ 0x7f5743bd2859 abort
@ 0x7f57442be35b QMessageLogger::fatal()
@ 0x7f574477c799 QGuiApplicationPrivate::createPlatformIntegration()
@ 0x7f574477cb6f QGuiApplicationPrivate::createEventDispatcher()
@ 0x7f57443dbb62 QCoreApplicationPrivate::init()
@ 0x7f574477d1e1 QGuiApplicationPrivate::init()
@ 0x7f5744c03bc5 QApplicationPrivate::init()
@ 0x562bbb634975 colmap::RunFeatureExtractor()
@ 0x562bbb61d1a0 main
@ 0x7f5743bd4083 __libc_start_main
@ 0x562bbb620e39 (unknown)
Traceback (most recent call last):
File "imgs2poses.py", line 18, in <module>
gen_poses(args.scenedir, args.match_type)
File "/data1/user_data/ashish/NeRF/LLFF/llff/poses/pose_utils.py", line 268, in gen_poses
run_colmap(basedir, match_type)
File "/data1/user_data/ashish/NeRF/LLFF/llff/poses/colmap_wrapper.py", line 35, in run_colmap
feat_output = ( subprocess.check_output(feature_extractor_args, universal_newlines=True) )
File "/home/ashish/anaconda3/envs/nrnerf/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/home/ashish/anaconda3/envs/nrnerf/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['colmap', 'feature_extractor', '--database_path', 'scenedir/database.db', '--image_path', 'scenedir/images', '--ImageReader.single_camera', '1']' died with <Signals.SIGABRT: 6>.
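The underlying failure is that COLMAP's feature_extractor initializes Qt, and there is no X display on the remote server, so the xcb platform plugin aborts. One workaround worth trying (a sketch, not verified against this exact setup) is to force Qt's offscreen platform plugin, which the error output itself lists as available, before re-running the preprocessing; running under a virtual display such as xvfb-run is another common option.
export QT_QPA_PLATFORM=offscreen
python preprocess.py --input data/example_sequence1/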
Unable to receive email on task failure, or even when using EmailOperator
Hi Guys,
I am unable to receive email from my Airflow box even after adding the required parameters to send one.
Below is what my default_args look like:
default_args = {
    'owner': 'phonrao',
    'depends_on_past': False,
    #'start_date': datetime(2019, 3, 28),
    'start_date': airflow.utils.dates.days_ago(2),
    'email': ['phonrao@gmail.com'],
    'email_on_failure': True,
    'email_on_retry': True,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    #'on_failure_callback': report_failure,
    #'end_date': datetime(2020, 4, 1),
    #'schedule_interval': '@hourly',
}
I have a few SimpleHttpOperator tasks in between; those run and succeed, but they do not send email on error (I purposely introduced an error to check whether they send any email). Below is an example of one of my tasks.
t1 = SimpleHttpOperator(
    task_id='t1',
    http_conn_id='http_waterfall',
    endpoint='/update_data',
    method='POST',
    headers={"Content-Type": "application/json"},
    xcom_push=True,
    log_response=True,
    dag=dag,
)
and this is my EmailOperator task
t2 = EmailOperator(
    dag=dag,
    task_id="send_email",
    to='phonrao@gmail.com',
    subject='Success',
    html_content="<h3>Success</h3>"
)
t2 >> t1
Below is the error from the logs:
[2019-04-02 15:28:21,305] {{base_task_runner.py:101}} INFO - Job 845: Subtask send_email [2019-04-02 15:28:21,305] {{cli.py:520}} INFO - Running <TaskInstance: schedulerDAG.send_email 2019-04-02T15:23:08.896589+00:00 [running]> on host a47cd79aa987
[2019-04-02 15:28:21,343] {{logging_mixin.py:95}} INFO - [2019-04-02 15:28:21,343] {{configuration.py:255}} WARNING - section/key [smtp/smtp_user] not found in config
[2019-04-02 15:28:21,343] {{models.py:1788}} ERROR - [Errno 99] Cannot assign requested address
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1657, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.6/site-packages/airflow/operators/email_operator.py", line 78, in execute
mime_subtype=self.mime_subtype, mime_charset=self.mime_charset)
File "/usr/local/lib/python3.6/site-packages/airflow/utils/email.py", line 55, in send_email
mime_subtype=mime_subtype, mime_charset=mime_charset, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/utils/email.py", line 101, in send_email_smtp
send_MIME_email(smtp_mail_from, recipients, msg, dryrun)
File "/usr/local/lib/python3.6/site-packages/airflow/utils/email.py", line 121, in send_MIME_email
s = smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) if SMTP_SSL else smtplib.SMTP(SMTP_HOST, SMTP_PORT)
File "/usr/local/lib/python3.6/smtplib.py", line 251, in __init__
(code, msg) = self.connect(host, port)
File "/usr/local/lib/python3.6/smtplib.py", line 336, in connect
self.sock = self._get_socket(host, port, self.timeout)
File "/usr/local/lib/python3.6/smtplib.py", line 307, in _get_socket
self.source_address)
File "/usr/local/lib/python3.6/socket.py", line 724, in create_connection
raise err
File "/usr/local/lib/python3.6/socket.py", line 713, in create_connection
sock.connect(sa)
OSError: [Errno 99] Cannot assign requested address
[2019-04-02 15:28:21,351] {{models.py:1817}} INFO - All retries failed; marking task as FAILED
Below is my airflow.cfg
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = localhost
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
# smtp_user = airflow
# smtp_password = airflow
smtp_port = 25
smtp_mail_from = airflow@example.com
Has anyone encountered this issue, and does anyone have suggestions on how I can resolve it?
If your Airflow is running on Kubernetes (installed by the Helm chart), you should take a look at the airflow-worker-0 pod and make sure the SMTP environment variables (SMTP_HOST, SMTP_USER, ...) are available in the config. For simple debugging, exec into the airflow-worker container, run python, and try these commands to make sure email sending works correctly:
import airflow
airflow.utils.email.send_email('example@gmail.com', 'Airflow TEST HERE', 'This is airflow status success')
I had the same issue; after fixing the SMTP environment variables, it now works.
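The original error (Cannot assign requested address when connecting to localhost:25) simply means nothing is listening on localhost:25 inside the container, so smtp_host has to point at a reachable SMTP server. If you prefer environment variables over airflow.cfg, Airflow also reads settings via its AIRFLOW__SECTION__KEY convention; a hedged sketch with placeholder values (adjust host, port and credentials to your mail server):
export AIRFLOW__SMTP__SMTP_HOST=smtp.example.com
export AIRFLOW__SMTP__SMTP_PORT=587
export AIRFLOW__SMTP__SMTP_STARTTLS=True
export AIRFLOW__SMTP__SMTP_USER=airflow
export AIRFLOW__SMTP__SMTP_PASSWORD=change-me
export AIRFLOW__SMTP__SMTP_MAIL_FROM=airflow@example.com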
I have a Cuba application that I want to use Sidekiq with.
This is how I set up the config.ru:
require './app'
require 'sidekiq'
require 'sidekiq/web'
environment = ENV['RACK_ENV'] || "development"
config_vars = YAML.load_file("./config.yml")[environment]
Sidekiq.configure_client do |config|
  config.redis = { :url => config_vars["redis_uri"] }
end

Sidekiq.configure_server do |config|
  config.redis = { url: config_vars["redis_uri"] }
  config.average_scheduled_poll_interval = 5
end
# run Cuba
run Rack::URLMap.new('/' => Cuba, '/sidekiq' => Sidekiq::Web)
I started Sidekiq using systemd. This is the systemd unit file, which I adapted from the sidekiq.service example on the Sidekiq site:
#
# systemd unit file for CentOS 7, Ubuntu 15.04
#
# Customize this file based on your bundler location, app directory, etc.
# Put this in /usr/lib/systemd/system (CentOS) or /lib/systemd/system (Ubuntu).
# Run:
# - systemctl enable sidekiq
# - systemctl {start,stop,restart} sidekiq
#
# This file corresponds to a single Sidekiq process. Add multiple copies
# to run multiple processes (sidekiq-1, sidekiq-2, etc).
#
# See Inspeqtor's Systemd wiki page for more detail about Systemd:
# https://github.com/mperham/inspeqtor/wiki/Systemd
#
[Unit]
Description=sidekiq
# start us only once the network and logging subsystems are available,
# consider adding redis-server.service if Redis is local and systemd-managed.
After=syslog.target network.target
# See these pages for lots of options:
# http://0pointer.de/public/systemd-man/systemd.service.html
# http://0pointer.de/public/systemd-man/systemd.exec.html
[Service]
Type=simple
Environment=RACK_ENV=development
WorkingDirectory=/media/temp/bandmanage/repos/fall_prediction_verification
# If you use rbenv:
#ExecStart=/bin/bash -lc 'pwd && bundle exec sidekiq -e production'
ExecStart=/home/froy001/.rvm/wrappers/fall_prediction/bundle exec "sidekiq -r app.rb -L log/sidekiq.log -e development"
# If you use the system's ruby:
#ExecStart=/usr/local/bin/bundle exec sidekiq -e production
User=root
Group=root
UMask=0002
# if we crash, restart
RestartSec=1
Restart=on-failure
# output goes to /var/log/syslog
StandardOutput=syslog
StandardError=syslog
# This will default to "bundler" if we don't specify it
SyslogIdentifier=sidekiq
[Install]
WantedBy=multi-user.target
The code calling the worker is:
raw_msg = JSON.parse(req.body.read, {:symbolize_names => true})
if raw_msg
  ts = raw_msg[:ts]
  waiting_period = (1000*60*3) # wait 3 min before checking
  perform_at_time = Time.at((ts + waiting_period)/1000).utc
  FallVerificationWorker.perform_at((0.5).minute.from_now, raw_msg)
  my_res = { result: "success", status: 200}.to_json
  res.status = 200
  res.write my_res
else
  my_res = { result: "not found", status: 404}.to_json
  res.status = 404
  res.write my_res
end
I am only using the default queue.
My problem is that the job is not being processed at all.
After you run systemctl enable sidekiq (so that it starts at boot) and systemctl start sidekiq (so that it starts immediately), you should have some logs to review, which will provide detail about any failure to start:
sudo journalctl -u sidekiq
Review the logs, review the systemd docs, and adjust your unit file as needed. You can find all the installed systemd documentation with apropos systemd. Some of the most useful man pages to review are systemd.service, systemd.exec and systemd.unit.
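Once the service is running, a quick way to confirm that jobs are actually reaching Redis is the Sidekiq API from an irb session in the app directory. This is only a sketch; it assumes the same config.yml and redis_uri used in config.ru, and note that jobs enqueued with perform_at sit in the scheduled set until they are due:
require 'sidekiq/api'
require 'yaml'

config_vars = YAML.load_file("./config.yml")[ENV['RACK_ENV'] || "development"]
Sidekiq.configure_client do |config|
  config.redis = { url: config_vars["redis_uri"] }
end

puts Sidekiq::Queue.new.size          # jobs waiting in the "default" queue
puts Sidekiq::ScheduledSet.new.size   # jobs enqueued via perform_at
puts Sidekiq::ProcessSet.new.size     # number of live Sidekiq worker processes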
I have a bash script that starts some Python code, running on a RasPi under the latest Raspbian. It runs fine if I run it manually with sudo, but when it auto-runs at boot I get a MySQL error. The script is called from a line in /etc/rc.local.
The script is:
#!/bin/bash
/home/control/solar/v10_1/control.py >> '/home/control/solar/logs/control_X.X_start.log' 2>&1 &
echo "control started" >> '/home/control/solar/logs/control_X.X_start.log'
echo "UID is $UID , EUID is $EUID" >> '/home/control/solar/logs/control_X.X_start.log'
The output to the log after boot-up is:
UID is 0 , EUID is 0
Traceback (most recent call last):
File "/home/control/solar/v10_1/control.py", line 42, in <module>
import variables # global variables
File "/home/control/solar/v10_1/variables.py", line 13, in <module>
gv.db = MySQLdb.connect("localhost", "solar", "solar", db='solar') # database for logging
File "/usr/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
return Connection(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2002, "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)")
I have made all the paths absolute, and the UID is 0 in both cases. What can cause the difference in behaviour?
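A likely explanation (hedged, since the log alone does not prove it): /etc/rc.local can run before the MySQL server has finished starting, so /var/run/mysqld/mysqld.sock does not exist yet, whereas by the time the script is run manually with sudo, MySQL is already up. A minimal sketch that waits for the socket before launching control.py:
#!/bin/bash
# wait up to 60 seconds for the MySQL socket to appear
for i in $(seq 1 60); do
    [ -S /var/run/mysqld/mysqld.sock ] && break
    sleep 1
done
/home/control/solar/v10_1/control.py >> '/home/control/solar/logs/control_X.X_start.log' 2>&1 &
echo "control started" >> '/home/control/solar/logs/control_X.X_start.log'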