DAG structure: We have an ETL pipeline with a number of phases. Each phase can contain child phases, recursively, to a known depth of about 3-4. The innermost layer consists of SQL statements (a highly variable number, about 30 on average) that execute in parallel, serially, or both.
For each phase, we have used SubDAG.
We are aware that using SubDAGs is considered bad practice, but given our hierarchical structure they are a natural choice. We use the Celery Executor for SubDAGs.
For the leaf nodes in the innermost layer, we have a custom operator.
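To make the layout concrete, here is a minimal Python sketch of how such a nested phase structure maps to dotted SubDAG ids like the RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ that appears in the logs further down. The PHASES dictionary, the PhaseQ and SQL_Q names, and the helper name are hypothetical, and the sketch deliberately avoids importing Airflow.

```python
# Hypothetical nested layout: a dict is a phase with child phases,
# a list is an innermost phase holding its SQL task names.
PHASES = {
    "PhaseX": {"PhaseY": {"PhaseZ": ["SQL_A", "SQL_B"]}},
    "PhaseQ": ["SQL_Q"],
}


def innermost_subdags(parent_id, phases):
    """Yield (dag_id, sql_task_names) for every innermost phase.

    Each level appends its name to the parent id with a dot, which is the
    naming pattern visible in the task instance ids in the logs.
    """
    for name, children in phases.items():
        dag_id = "{}.{}".format(parent_id, name)
        if isinstance(children, dict):
            # Phase with child phases: recurse one level deeper.
            yield from innermost_subdags(dag_id, children)
        else:
            # Innermost phase: children are the SQL leaf tasks.
            yield dag_id, list(children)


if __name__ == "__main__":
    for dag_id, sqls in innermost_subdags("RANDOM_DAG_ID", PHASES):
        print(dag_id, sqls)
```

In the real pipeline each yielded id would correspond to one SubDAG, and each SQL name to one instance of the custom operator.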
Problem: We (very) occasionally hit a deadlock, and the SubDAG task that houses the SQL nodes fails. None of the child nodes fail.
Error: sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
StackTrace:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/airflow/models/taskinstance.py", line 930, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.6/dist-packages/airflow/operators/subdag_operator.py", line 102, in execute
executor=self.executor)
File "/usr/local/lib/python3.6/dist-packages/airflow/models/dag.py", line 1284, in run
job.run()
File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/base_job.py", line 222, in run
self._execute()
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/backfill_job.py", line 769, in _execute
session=session)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/backfill_job.py", line 699, in _execute_for_run_dates
session=session)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/backfill_job.py", line 586, in _process_backfill_task_instances
_per_task_process(task, key, ti)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/backfill_job.py", line 508, in _per_task_process
session.commit()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 1036, in commit
self.transaction.commit()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 503, in commit
self._prepare_impl()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 482, in _prepare_impl
self.session.flush()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2479, in flush
self._flush(objects)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2617, in _flush
transaction.rollback(_capture_exception=True)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 153, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2577, in _flush
flush_context.execute()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
uow,
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 236, in save_obj
update,
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 996, in _emit_update_statements
statement, multiparams
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 982, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
distilled_params,
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
e, statement, parameters, cursor, context
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 152, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 209, in execute
res = self._query(query)
File "/usr/local/lib/python3.6/dist-packages/MySQLdb/cursors.py", line 315, in _query
db.query(q)
File "/usr/local/lib/python3.6/dist-packages/MySQLdb/connections.py", line 239, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
[SQL: UPDATE task_instance SET state=%s, queued_dttm=%s WHERE task_instance.task_id = %s AND task_instance.dag_id = %s AND task_instance.execution_date = %s]
[parameters: ('queued', datetime.datetime(2020, 1, 14, 10, 51, 48, 301880, tzinfo=<Timezone [UTC]>), 'SQL_W', 'RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ', datetime.datetime(2020, 1, 14, 10, 28, 36, 123853, tzinfo=<Timezone [UTC]>))]
(Background on this error at: http://sqlalche.me/e/e3q8)
We also see the following warning in the logs:
[2020-01-14 10:51:43,217] {logging_mixin.py:112} INFO - [2020-01-14 10:51:43,217] {backfill_job.py:246}
WARNING - ('RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ', 'SQL_A', datetime.datetime(2020, 1, 14, 10, 28, 36, 123853, tzinfo=<Timezone [UTC]>), 1) state success not in running=dict_values(
[<TaskInstance: RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ.SQL_B 2020-01-14 10:28:36.123853+00:00 [running]>,
<TaskInstance: RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ.SQL_C 2020-01-14 10:28:36.123853+00:00 [running]>,
<TaskInstance: RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ.SQL_D 2020-01-14 10:28:36.123853+00:00 [running]>,
<TaskInstance: RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ.SQL_E 2020-01-14 10:28:36.123853+00:00 [running]>,
<TaskInstance: RANDOM_DAG_ID.PhaseX.PhaseY.PhaseZ.SQL_F 2020-01-14 10:28:36.123853+00:00 [running]>])
Airflow Version: 1.10.5, 1.10.7
We found that AIRFLOW-2516 is the issue we are facing, with this comment pointing to the probable cause. It suggests the deadlock is caused by two queries acquiring locks on two indices in different orders.
Dropping the second index on task state may not be a good idea as it would affect Airflow Scheduler performance, which is already not great.
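The lock-ordering argument can be illustrated with a small, database-free sketch: given the order in which each transaction takes its locks, report every pair of transactions that takes two shared locks in opposite orders. The transaction and index names below are hypothetical placeholders, not the actual Airflow index names, and the check only models the ordering, not real InnoDB behaviour.

```python
from itertools import combinations


def opposite_order_pairs(lock_orders):
    """Return (txn_a, txn_b, lock_1, lock_2) tuples where txn_a takes
    lock_1 before lock_2 while txn_b takes lock_2 before lock_1.

    lock_orders maps a transaction name to the list of locks it acquires,
    in acquisition order.
    """
    conflicts = []
    for (txn_a, order_a), (txn_b, order_b) in combinations(lock_orders.items(), 2):
        # Locks both transactions need, in txn_a's acquisition order.
        shared = [lock for lock in order_a if lock in order_b]
        for lock_1, lock_2 in combinations(shared, 2):
            # txn_a takes lock_1 first; a conflict exists if txn_b takes lock_2 first.
            if order_b.index(lock_2) < order_b.index(lock_1):
                conflicts.append((txn_a, txn_b, lock_1, lock_2))
    return conflicts


if __name__ == "__main__":
    # Hypothetical names: one query reaches the row via the state index,
    # the other via the primary key.
    orders = {
        "query_via_state_index": ["state_index", "primary_key"],
        "query_via_primary_key": ["primary_key", "state_index"],
    }
    print(opposite_order_pairs(orders))
```

If both queries took their locks in the same order, the function would return an empty list, which is why consistent ordering (or removing one of the two indices) avoids this class of deadlock.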
They seem to have made a fix for the deadlock at the main DAG level, but not at the SubDAG level.
We have tried automatically retrying the same SubDAG operator after a minute, using the BaseOperator parameter retries, but in a good number of cases the deadlock recurs on the retry as well.
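One possible reason a fixed one-minute retry collides again is that the competing work repeats on a similar schedule. A common alternative is exponential backoff with random jitter, retrying only when the error is MySQL 1213. The sketch below is generic Python rather than Airflow code: the decorator name and its parameters are hypothetical, and it can only wrap code we control, not the commit inside Airflow's backfill job. On the Airflow side, BaseOperator also accepts retry_delay and retry_exponential_backoff, which may be worth trying alongside retries.

```python
import random
import time
from functools import wraps

MYSQL_DEADLOCK_ERRNO = 1213  # the error code shown in the traceback above


def is_deadlock(exc):
    """True if exc, or the driver error SQLAlchemy stores in .orig, has errno 1213."""
    inner = getattr(exc, "orig", exc)
    args = getattr(inner, "args", ())
    return bool(args) and args[0] == MYSQL_DEADLOCK_ERRNO


def retry_on_deadlock(max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped callable on deadlock with exponential backoff plus jitter."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Re-raise anything that is not a deadlock, or the final failure.
                    if not is_deadlock(exc) or attempt == max_attempts:
                        raise
                    delay = base_delay * 2 ** (attempt - 1)
                    # Jitter spreads retries out so competing writers do not re-collide.
                    sleep(delay + random.uniform(0, delay))
        return wrapper
    return decorator
```

The sleep argument is injectable only so the behaviour can be exercised without waiting; in normal use the default time.sleep applies.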
The same issue could occur even if we remove SubDAGs, which would not be a trivial exercise for us.
Any lateral solutions or workarounds are also welcome.
Please let me know if the airflow.cfg file is required.
We added a second Airflow database to support our staging Airflow instance, and now both staging and production seem to have intermittent connection issues. We recently upgraded to 2.2.4, but looking through past RDS logs, I see aborted connections (Got an error reading communication packets) from before the upgrade as well (previous version 1.10.11). Both instances have a webserver UI that pulls in past DAG runs fine, and the scheduler runs DAGs appropriately and stores dag_run data. When the service is started, however, we receive the error "MySQL server has gone away" (see below for the stack trace). airflow db check and airflow db shell both return successful connections. Some relevant Airflow config values are:
sql_alchemy_pool_enabled = True
sql_alchemy_pool_size = 5
sql_alchemy_max_overflow = 10
sql_alchemy_pool_recycle = 1800
sql_alchemy_pool_pre_ping = True
sql_alchemy_schema =
sql_alchemy_reconnect_timeout = 300
I've also tried increasing the pool size to 20, and confirmed that our database instance size should be able to handle at least 90 concurrent connections.
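As a rough sanity check on the 90-connection figure: each process that builds its own SQLAlchemy engine can hold up to pool_size + max_overflow connections, so the worst case scales with the number of such processes. The sketch below does that arithmetic from the values above using only the standard library. Placing the keys under a [core] section and the process count of 6 are assumptions for illustration, not measurements from this deployment.

```python
import configparser

# The pool settings quoted above; the [core] section name is an assumption.
CFG_TEXT = """
[core]
sql_alchemy_pool_size = 5
sql_alchemy_max_overflow = 10
"""


def max_connections_per_process(cfg_text, section="core"):
    """Upper bound on connections one engine pool can open: pool_size + max_overflow."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    settings = parser[section]
    return settings.getint("sql_alchemy_pool_size") + settings.getint("sql_alchemy_max_overflow")


def worst_case_connections(cfg_text, process_count):
    """Worst case across process_count processes, each with its own pool."""
    return max_connections_per_process(cfg_text) * process_count


if __name__ == "__main__":
    per_process = max_connections_per_process(CFG_TEXT)
    # process_count=6 is a hypothetical number of Airflow processes per instance.
    print(per_process, worst_case_connections(CFG_TEXT, process_count=6))
```

With a pool size of 20 and the same overflow, the per-process bound rises to 30, so two instances sharing one RDS server can approach the limit with only a handful of processes each.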
[2022-04-28 16:32:03,212] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create Admin
Process ForkProcess-19:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (2006, 'MySQL server has gone away')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/dag_processing/manager.py", line 287, in _run_processor_manager
processor_manager.start()
File "/usr/local/lib/python3.6/site-packages/airflow/dag_processing/manager.py", line 520, in start
return self._run_parsing_loop()
File "/usr/local/lib/python3.6/site-packages/airflow/dag_processing/manager.py", line 585, in _run_parsing_loop
self._find_zombies()
File "/usr/local/lib/python3.6/site-packages/airflow/utils/session.py", line 70, in wrapper
return func(*args, session=session, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/dag_processing/manager.py", line 1079, in _find_zombies
LJ.latest_heartbeat < limit_dttm,
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3373, in all
return list(self)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
return self._execute_and_instances(context)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
distilled_params,
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
e, statement, parameters, cursor, context
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (2006, 'MySQL server has gone away')
[SQL: SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.run_id AS task_instance_run_id, task_instance.start_date AS task_instance_start_date, task_instance.end_date AS task_instance_end_date, task_instance.duration AS task_instance_duration, task_instance.state AS task_instance_state, task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS task_instance_hostname, task_instance.unixname AS task_instance_unixname, task_instance.job_id AS task_instance_job_id, task_instance.pool AS task_instance_pool, task_instance.pool_slots AS task_instance_pool_slots, task_instance.queue AS task_instance_queue, task_instance.priority_weight AS task_instance_priority_weight, task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, task_instance.queued_by_job_id AS task_instance_queued_by_job_id, task_instance.pid AS task_instance_pid, task_instance.executor_config AS task_instance_executor_config, task_instance.external_executor_id AS task_instance_external_executor_id, task_instance.trigger_id AS task_instance_trigger_id, task_instance.trigger_timeout AS task_instance_trigger_timeout, task_instance.next_method AS task_instance_next_method, task_instance.next_kwargs AS task_instance_next_kwargs, dag.fileloc AS dag_fileloc, dag_run_1.state AS dag_run_1_state, dag_run_1.id AS dag_run_1_id, dag_run_1.dag_id AS dag_run_1_dag_id, dag_run_1.queued_at AS dag_run_1_queued_at, dag_run_1.execution_date AS dag_run_1_execution_date, dag_run_1.start_date AS dag_run_1_start_date, dag_run_1.end_date AS dag_run_1_end_date, dag_run_1.run_id AS dag_run_1_run_id, dag_run_1.creating_job_id AS dag_run_1_creating_job_id, dag_run_1.external_trigger AS dag_run_1_external_trigger, dag_run_1.run_type AS dag_run_1_run_type, dag_run_1.conf AS dag_run_1_conf, dag_run_1.data_interval_start AS dag_run_1_data_interval_start, 
dag_run_1.data_interval_end AS dag_run_1_data_interval_end, dag_run_1.last_scheduling_decision AS dag_run_1_last_scheduling_decision, dag_run_1.dag_hash AS dag_run_1_dag_hash
FROM task_instance INNER JOIN job ON task_instance.job_id = job.id AND job.job_type IN (%s) INNER JOIN dag ON task_instance.dag_id = dag.dag_id INNER JOIN dag_run AS dag_run_1 ON dag_run_1.dag_id = task_instance.dag_id AND dag_run_1.run_id = task_instance.run_id
WHERE task_instance.state = %s AND (job.state != %s OR job.latest_heartbeat < %s)]
[parameters: ('LocalTaskJob', <TaskInstanceState.RUNNING: 'running'>, <TaskInstanceState.RUNNING: 'running'>, datetime.datetime(2022, 4, 28, 16, 27, 2, 870459))]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
I'm developing a large Django project and I'm beginning to write some tests. But I ran into a problem when running them. When I run python manage.py test I get the following error:
$ project/ python manage.py test
project/env/lib/python3.7/site-packages/django/db/models/base.py:319: RuntimeWarning: Model 'accounts.manager' was already registered. Reloading models is not advised as it can lead to inconsistencies, most notably with related models.
new_class._meta.apps.register_model(new_class._meta.app_label, new_class)
Creating test database for alias 'default'...
Got an error creating the test database: (1007, "Can't create database 'test_partyadvisor'; database exists")
Type 'yes' if you would like to try deleting the test database 'test_partyadvisor', or 'no' to cancel: yes
Destroying old test database for alias 'default'...
Traceback (most recent call last):
File "project/env/lib/python3.7/site-packages/django/db/backends/utils.py", line 62, in execute
return self.cursor.execute(sql)
File "project/env/lib/python3.7/site-packages/django/db/backends/mysql/base.py", line 101, in execute
return self.cursor.execute(query, args)
File "project/env/lib/python3.7/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "project/env/lib/python3.7/site-packages/MySQLdb/cursors.py", line 312, in _query
db.query(q)
File "project/env/lib/python3.7/site-packages/MySQLdb/connections.py", line 224, in query
_mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (1170, "BLOB/TEXT column 'content' used in key specification without a key length")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "manage.py", line 22, in <module>
execute_from_command_line(sys.argv)
File "project/env/lib/python3.7/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
utility.execute()
File "project/env/lib/python3.7/site-packages/django/core/management/__init__.py", line 356, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "project/env/lib/python3.7/site-packages/django/core/management/commands/test.py", line 29, in run_from_argv
super(Command, self).run_from_argv(argv)
File "project/env/lib/python3.7/site-packages/django/core/management/base.py", line 283, in run_from_argv
self.execute(*args, **cmd_options)
File "project/env/lib/python3.7/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "project/env/lib/python3.7/site-packages/django/core/management/commands/test.py", line 62, in handle
failures = test_runner.run_tests(test_labels)
File "project/env/lib/python3.7/site-packages/django/test/runner.py", line 601, in run_tests
old_config = self.setup_databases()
File "project/env/lib/python3.7/site-packages/django/test/runner.py", line 546, in setup_databases
self.parallel, **kwargs
File "project/env/lib/python3.7/site-packages/django/test/utils.py", line 187, in setup_databases
serialize=connection.settings_dict.get('TEST', {}).get('SERIALIZE', True),
File "project/env/lib/python3.7/site-packages/django/db/backends/base/creation.py", line 69, in create_test_db
run_syncdb=True,
File "project/env/lib/python3.7/site-packages/django/core/management/__init__.py", line 131, in call_command
return command.execute(*args, **defaults)
File "project/env/lib/python3.7/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "project/env/lib/python3.7/site-packages/django/core/management/commands/migrate.py", line 204, in handle
fake_initial=fake_initial,
File "project/env/lib/python3.7/site-packages/django/db/migrations/executor.py", line 115, in migrate
state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
File "project/env/lib/python3.7/site-packages/django/db/migrations/executor.py", line 145, in _migrate_all_forwards
state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
File "project/env/lib/python3.7/site-packages/django/db/migrations/executor.py", line 244, in apply_migration
state = migration.apply(state, schema_editor)
File "project/env/lib/python3.7/site-packages/django/db/migrations/migration.py", line 129, in apply
operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
File "project/env/lib/python3.7/site-packages/django/db/migrations/operations/models.py", line 97, in database_forwards
schema_editor.create_model(model)
File "project/env/lib/python3.7/site-packages/django/db/backends/base/schema.py", line 319, in create_model
self.execute(sql, params or None)
File "project/env/lib/python3.7/site-packages/django/db/backends/base/schema.py", line 136, in execute
cursor.execute(sql, params)
File "project/env/lib/python3.7/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "project/env/lib/python3.7/site-packages/django/db/utils.py", line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "project/env/lib/python3.7/site-packages/django/utils/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "project/env/lib/python3.7/site-packages/django/db/backends/utils.py", line 62, in execute
return self.cursor.execute(sql)
File "project/env/lib/python3.7/site-packages/django/db/backends/mysql/base.py", line 101, in execute
return self.cursor.execute(query, args)
File "project/env/lib/python3.7/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "project/env/lib/python3.7/site-packages/MySQLdb/cursors.py", line 312, in _query
db.query(q)
File "project/env/lib/python3.7/site-packages/MySQLdb/connections.py", line 224, in query
_mysql.connection.query(self, query)
django.db.utils.OperationalError: (1170, "BLOB/TEXT column 'content' used in key specification without a key length")
The test database is created, but for some reason I cannot use it.
My database configuration (in the settings.py file) is:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "HOST": env.get("database").get("HOST"),
        "PORT": env.get("database").get("PORT"),
        "NAME": env.get("database").get("NAME"),
        "USER": env.get("database").get("USER"),
        "PASSWORD": env.get("database").get("PASS")
    }
}
Notes:
The user I use to access the db has all grants on it.
I tried to run the tests using the root user, and got the same error.
The normal db and the cache one work perfectly.
Also ran makemigrations and migrate.
Deleted the test db from the MySQL shell and re-ran python manage.py test, got the same error.
The --keepdb option does not work either.
System is Ubuntu 18.04 and I'm using Django 1.11 and MySQL 5.7.
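MySQL raises error 1170 when a BLOB/TEXT column is part of an index or unique constraint without a prefix length, so one way to narrow this down is to look for text fields on a column named content that request an index. The following is a minimal sketch using only the standard library. The helper name and the sample source are hypothetical, it only inspects keyword arguments on TextField(...) calls, and it does not look at Meta options such as unique_together or indexes.

```python
import ast

# Field keyword arguments that make the database create an index on the column.
INDEXING_KWARGS = {"unique", "db_index", "primary_key"}


def find_indexed_text_fields(source):
    """Return sorted (line_number, keyword) pairs for TextField(...) calls
    that pass unique=True, db_index=True or primary_key=True."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both models.TextField(...) and a bare TextField(...).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "TextField":
            continue
        for kw in node.keywords:
            if kw.arg in INDEXING_KWARGS and getattr(kw.value, "value", None) is True:
                hits.append((node.lineno, kw.arg))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical model source; in practice, read each models.py or migration file.
    sample = (
        "from django.db import models\n"
        "class Note(models.Model):\n"
        "    content = models.TextField(unique=True)\n"
        "    body = models.TextField()\n"
    )
    print(find_indexed_text_fields(sample))
```

The source is only parsed, never imported, so this can be run over migration files without a configured Django project.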
I have an Airflow instance on EC2 that is running the webserver and scheduler. I want to hook up a MySQL RDS instance as the backend metadata database in place of the default SQLite. I replaced the one line in airflow.cfg that sets the SQLAlchemy connection so that it connects to RDS with the pymysql driver:
#sql_alchemy_conn = sqlite:////home/cloud-user/airflow/airflow.db
sql_alchemy_conn = mysql+pymysql://admin:<PASSWORD>@airflow-db.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com:3306/airflow
The connection seems to work fine and I am able to get into the RDS instance and query tables via a MySQL client set up on my EC2 instance.
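For reference, a SQLAlchemy database URL has the shape dialect+driver://user:password@host:port/database, and special characters in the user name or password have to be URL-encoded so they are not mistaken for separators. A minimal sketch of building such a URL with the standard library follows. The helper name and the example credentials and host are hypothetical.

```python
from urllib.parse import quote_plus, urlsplit


def build_mysql_url(user, password, host, port, database, driver="pymysql"):
    """Build a mysql+<driver> SQLAlchemy URL, URL-encoding the credentials."""
    return "mysql+{driver}://{user}:{password}@{host}:{port}/{database}".format(
        driver=driver,
        user=quote_plus(user),
        password=quote_plus(password),  # e.g. '@' becomes %40 and '#' becomes %23
        host=host,
        port=port,
        database=database,
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    url = build_mysql_url("admin", "p@ss#1", "db.example.com", 3306, "airflow")
    print(url)
    # Parsing the result back confirms the host was not swallowed by the password.
    print(urlsplit(url).hostname, urlsplit(url).port)
```

The resulting string is what would go on the sql_alchemy_conn line.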
When I toggle a DAG on or off, I get this nasty python stack trace in my shell:
[2019-09-30 14:00:51,774] {app.py:1891} ERROR - Exception on /admin/airflow/paused [POST]
Traceback (most recent call last):
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask/app.py", line 2446, in wsgi_app
response = self.full_dispatch_request()
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask/app.py", line 1820, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
rv = self.dispatch_request()
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask/app.py", line 1935, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask_admin/base.py", line 69, in inner
return self._run_view(f, *args, **kwargs)
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask_admin/base.py", line 368, in _run_view
return fn(self, *args, **kwargs)
File "/home/cloud-user/.local/lib/python2.7/site-packages/flask_login/utils.py", line 258, in decorated_view
return func(*args, **kwargs)
File "/home/cloud-user/.local/lib/python2.7/site-packages/airflow/www/utils.py", line 279, in wrapper
session.commit()
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1027, in commit
self.transaction.commit()
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 494, in commit
self._prepare_impl()
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 473, in _prepare_impl
self.session.flush()
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2459, in flush
self._flush(objects)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2597, in _flush
transaction.rollback(_capture_exception=True)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2557, in _flush
flush_context.execute()
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
uow,
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
insert,
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 1138, in _emit_insert_statements
statement, params
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
return meth(self, multiparams, params)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
distilled_params,
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1253, in _execute_context
e, statement, parameters, cursor, context
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1473, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
cursor, statement, parameters, context
File "/home/cloud-user/.local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
cursor.execute(statement, parameters)
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/cursors.py", line 170, in execute
result = self._query(query)
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/cursors.py", line 328, in _query
conn.query(q)
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/connections.py", line 517, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/connections.py", line 732, in _read_query_result
result.read()
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/connections.py", line 1075, in read
first_packet = self.connection._read_packet()
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/connections.py", line 684, in _read_packet
packet.check_error()
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/protocol.py", line 220, in check_error
err.raise_mysql_exception(self._data)
File "/home/cloud-user/.local/lib/python2.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
raise errorclass(errno, errval)
ProgrammingError: (pymysql.err.ProgrammingError) (1064, u"You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[(\\'is_paused\\', u\\'true\\'), (\\'dag_id\\', u\\'test\\')]'')' at line 1")
[SQL: INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra) VALUES (%(dttm)s, %(dag_id)s, %(task_id)s, %(event)s, %(execution_date)s, %(owner)s, %(extra)s)]
[parameters: {'task_id': None, 'extra': "[('is_paused', u'true'), ('dag_id', u'test')]", 'execution_date': None, 'event': 'paused', 'owner': 'anonymous', 'dttm': datetime.datetime(2019, 9, 30, 18, 0, 51, 768073, tzinfo=<Timezone [UTC]>), 'dag_id': u'test'}]
(Background on this error at: http://sqlalche.me/e/f405)
From what I saw in the Airflow documentation, I'm making the change correctly, and the 'airflow initdb / resetdb' commands execute without error.
I already spent a fair amount of time googling this error, but there are no clear answers to this problem. I'm not sure whether I'm missing a prerequisite or should be using a different connector.
EDIT: I'm using Python 2.7, as seen in the stack trace. Airflow claims compatibility in the short term, but I see another SO user's problems went away after upgrading to Python 3.6: Link to other solution. I'll try this and update if it seems to work.
It looks like the solution is indeed to upgrade to Python 3.6, using a virtual environment because Python 2.x and 3.x have to coexist on Linux systems running Airflow. Specifically, I followed this guide, and my DAGs seem to be executing successfully.
In the process of upgrading to Django 1.6, I've started to get frequent OperationalError: (2006, 'MySQL server has gone away') errors on requests to the gunicorn server I use to run the Django app. These errors occur from the moment the server is started, on requests that should take only a second, which makes me doubt that it is a timeout issue. The error isn't present on the old 1.4 branch of the project, and the 1.6 branch doesn't behave this way when it is served simply through django-admin.py runserver.
I generally run gunicorn through an sv process (though it also errors if I run it manually) with the command django-admin.py run_gunicorn --workers=4 -b localhost:8000, which results in many requests, even ones for static media, returning:
[ERROR] 2015-07-29 14:30:27,931 - gunicorn.error:260 - Error handling request
Traceback (most recent call last):
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 125, in handle_request
respiter = self.wsgi(environ, resp.start_response)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 187, in __call__
self.load_middleware()
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 47, in load_middleware
mw_instance = mw_class()
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/middleware/locale.py", line 24, in __init__
for url_pattern in get_resolver(None).url_patterns:
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/core/urlresolvers.py", line 365, in url_patterns
patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/core/urlresolvers.py", line 360, in urlconf_module
self._urlconf_module = import_module(self.urlconf_name)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/utils/importlib.py", line 40, in import_module
__import__(name)
File "/opt/apps/maplecroft/versions/current/websites/maplecroft/urls.py", line 2, in <module>
from maplecroft.views import RiskAtlasesLandingView
File "/opt/apps/maplecroft/versions/current/maplecroft/views.py", line 40, in <module>
import maplecroft.search as _search
File "/opt/apps/maplecroft/versions/current/maplecroft/search.py", line 71, in <module>
class MaplecroftSearchForm(SearchForm):
File "/opt/apps/maplecroft/versions/current/maplecroft/search.py", line 111, in MaplecroftSearchForm
choices=model_choices(),
File "/opt/apps/maplecroft/versions/current/maplecroft/search.py", line 57, in model_choices
for category in reversed(Category.objects.filter(parent=None)):
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/models/query.py", line 77, in __len__
self._fetch_all()
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/models/query.py", line 857, in _fetch_all
self._result_cache = list(self.iterator())
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/models/query.py", line 220, in iterator
for row in compiler.results_iter():
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 713, in results_iter
for rows in self.execute_sql(MULTI):
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 786, in execute_sql
cursor.execute(sql, params)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/backends/util.py", line 69, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
return self.cursor.execute(sql, params)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
return self.cursor.execute(sql, params)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 124, in execute
return self.cursor.execute(query, args)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/opt/envs/maplecroft/local/lib/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
OperationalError: (2006, 'MySQL server has gone away')
However, if I drop to --workers=1 everything seems to run smoothly, so my current thinking is that it is an issue with gunicorn's multiple-worker feature.
Edit: I just tried upgrading gunicorn to the latest version (I was on 0.17.2), but that doesn't seem to have made a difference.
I am wondering whether this question is relevant: https://serverfault.com/questions/407612/error-2006-mysql-server-has-gone-away, but I'm struggling to relate it to my current issue.
It appears that it was the gunicorn version after all. I had tried upgrading gunicorn but was still starting the server with the django-admin.py run_gunicorn method. Upgrading gunicorn and switching to the non-deprecated gunicorn wsgi:application method solved it.
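For readers unfamiliar with the wsgi:application form: it names a module (wsgi) and a WSGI callable inside it (application). In a Django project that callable would normally be the one returned by django.core.wsgi.get_wsgi_application(). The sketch below is only a stand-in that uses a plain standard-library WSGI callable, so it runs without Django. The response body and the call_once helper are made up for illustration.

```python
# wsgi.py (illustrative stand-in, not the project's real file)
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable: the object gunicorn looks up as wsgi:application."""
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(app):
    """Call a WSGI app in-process with a synthetic request (hypothetical helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a valid default WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)
```

With a module like this on the import path, the equivalent of the earlier command would be something like gunicorn wsgi:application --workers=4 -b localhost:8000, assuming the module is named wsgi as in the answer.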
I'm having trouble with the db.alter_column command when changing a date field from null=True and blank=True to required by removing these two options.
When the below line is commented out, the migration runs without a problem.
db.alter_column('milestones_milestone', 'date', self.gf('django.db.models.fields.DateField')(default='2011-01-01'))
This should change the column description from:
'milestones.milestone': {
'date': ('django.db.models.fields.DateField', [], {'null': 'True', 'blank': 'True'}),
},
to
'milestones.milestone': {
'date': ('django.db.models.fields.DateField', [], {'default': '2011-01-01'}),
},
When the above line is left in the migration, I get this error:
- Migrating forwards to 0002_auto__add_field_milestone_type__chg_field_milestone_date__add_field_mi.
> milestones:0002_auto__add_field_milestone_type__chg_field_milestone_date__add_field_mi
! Error found during real run of migration! Aborting.
! Since you have a database that does not support running
! schema-altering statements in transactions, we have had
! to leave it in an interim state between migrations.
! You *might* be able to recover with: = ALTER TABLE `milestones_milestone` DROP COLUMN `type` CASCADE; []
= ALTER TABLE `milestones_milestonetemplate` DROP COLUMN `type` CASCADE; []
! The South developers regret this has happened, and would
! like to gently persuade you to consider a slightly
! easier-to-deal-with DBMS.
! NOTE: The error which caused the migration to fail is further up.
Traceback (most recent call last):
File "manage.py", line 11, in <module>
execute_manager(global_settings)
File "C:\Python26\lib\site-packages\django\core\management\__init__.py", line 438, in execute_manager
utility.execute()
File "C:\Python26\lib\site-packages\django\core\management\__init__.py", line 379, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "C:\Python26\lib\site-packages\django\core\management\base.py", line 191, in run_from_argv
self.execute(*args, **options.__dict__)
File "C:\Python26\lib\site-packages\django\core\management\base.py", line 218, in execute
output = self.handle(*args, **options)
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\management\commands\migrate.py", line 109, in handle
ignore_ghosts = ignore_ghosts,
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\migration\__init__.py", line 202, in migrate_app
success = migrator.migrate_many(target, workplan, database)
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\migration\migrators.py", line 292, in migrate_many
result = self.migrate(migration, database)
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\migration\migrators.py", line 125, in migrate
result = self.run(migration)
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\migration\migrators.py", line 99, in run
return self.run_migration(migration)
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\migration\migrators.py", line 81, in run_migration
migration_function()
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\migration\migrators.py", line 57, in <lambda>
return (lambda: direction(orm))
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\milestones\migrations\0002_auto__add_field_milestone_type__chg_field_milestone_date__add_field_mi.py", line 15, in forwards
db.alter_column('milestones_milestone', 'date', self.gf('django.db.models.fields.DateField')(default='2011-01-01'))
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\db\generic.py", line 373, in alter_column
self.execute("ALTER TABLE %s %s;" % (self.quote_name(table_name), sql), values)
File "C:\SQE_Dashboard\SQE Dashboard-mimercha\SQE Dashboard\dashboard\lib\south\db\generic.py", line 137, in execute
cursor.execute(sql, params)
File "C:\Python26\lib\site-packages\django\db\backends\util.py", line 15, in execute
return self.cursor.execute(sql, params)
File "C:\Python26\lib\site-packages\django\db\backends\mysql\base.py", line 86, in execute
return self.cursor.execute(query, args)
File "C:\Python26\lib\site-packages\MySQLdb\cursors.py", line 173, in execute
self.errorhandler(self, exc, value)
File "C:\Python26\lib\site-packages\MySQLdb\connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
django.db.utils.DatabaseError: (1265, "Data truncated for column 'date' at row 512")
I'm using:
South 0.71 Note: I tried upgrading to 0.73 and found 0.73 gave me the same error and broke my scripts when loading older fixtures.
Django 1.2.1
Python library: MySQLdb, DB API v2.0 compatible, revision 603
mysql Ver 14.14 Distrib 5.1.51, for Win32 (ia32)
InnoDB Storage Engine
I just ran into the same error. In my case I had accidentally set the column's default value to datetime.now, which caused the data truncation.
I'd recommend that you remove the default value from your model, set auto_now_add=True and regenerate the migration file.
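One way to see why a datetime.now default does not fit a date column: it yields a value with a time-of-day part, while a DateField column only holds the date. The sketch below uses only the standard library. fits_date_column is a hypothetical helper for illustration, not part of Django or South, and the fixed timestamp stands in for whatever datetime.now() would return.

```python
import datetime


def fits_date_column(value):
    """Return True if value is a pure date with no time-of-day part."""
    # datetime is a subclass of date, so it has to be ruled out explicitly.
    return isinstance(value, datetime.date) and not isinstance(value, datetime.datetime)


# Fixed stand-in for the kind of value datetime.now() produces.
stamp = datetime.datetime(2011, 1, 1, 13, 45, 30)

print(str(stamp))         # 2011-01-01 13:45:30
print(str(stamp.date()))  # 2011-01-01
print(fits_date_column(stamp))         # False
print(fits_date_column(stamp.date()))  # True
```

If a literal default is still wanted, a plain date such as datetime.date(2011, 1, 1) avoids the extra time component.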