How do I connect Airflow to SQLite locally? - mysql

I'm trying out Airflow for the very first time and want to connect it to a local SQLite database, but I can't get my head around how to actually do it.
I've read Airflow's documentation, set my executor to LocalExecutor, and set sql_alchemy_conn to sqlite:////home/myName/Programs/sqlite3/DatabaseName.db, but it doesn't work; it throws an
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 21, in <module>
from airflow import configuration
File "/usr/local/lib/python2.7/dist-packages/airflow/__init__.py", line 35, in <module>
from airflow import configuration as conf
File "/usr/local/lib/python2.7/dist-packages/airflow/configuration.py", line 520, in <module>
conf.read(AIRFLOW_CONFIG)
File "/usr/local/lib/python2.7/dist-packages/airflow/configuration.py", line 283, in read
self._validate()
File "/usr/local/lib/python2.7/dist-packages/airflow/configuration.py", line 169, in _validate
self.get('core', 'executor')))
airflow.exceptions.AirflowConfigException: error: cannot use sqlite with the LocalExecutor
error when I tried to run airflow initdb. I googled around, tried vipul sharma's solution found here, and set sql_alchemy_conn to mysql://:@localhost:3306/, but that still doesn't work; it throws an
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1045, "Access denied for user 'myName'@'localhost' (using password: NO)")
error. I know the answer should be really simple, but I really don't understand how to do it, so I hope you can guide me on what to do or read.

Use SequentialExecutor
"This executor will only run one task instance at a time, can be used for debugging. It is also the only executor that can be used with sqlite since sqlite doesn’t support multiple connections." airflow documentation
You didn't need to change to LocalExecutor. Change it back to SequentialExecutor, point sql_alchemy_conn to sqlite:////home/myName/Programs/sqlite3/DatabaseName.db, and stop the Airflow services (webserver, scheduler).
Then execute airflow initdb and start the services again.
Hopefully that works.
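To make the rule behind the error concrete, here is a minimal sketch of the kind of check that rejects the SQLite plus LocalExecutor combination. It is not Airflow's actual code: check_executor and the GOOD/BAD config strings are hypothetical names for illustration, and the sketch only assumes that the [core] section holds the executor and sql_alchemy_conn keys mentioned in the question.

```python
import configparser


def check_executor(cfg_text):
    """Return an error message if executor and database are incompatible, else None.

    Hypothetical helper: it mimics the rule quoted from the Airflow docs,
    it is not the validation code that Airflow itself runs.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    executor = parser.get("core", "executor")
    conn = parser.get("core", "sql_alchemy_conn")
    # SQLite is only allowed together with SequentialExecutor.
    if conn.startswith("sqlite") and executor != "SequentialExecutor":
        return "cannot use sqlite with the {}".format(executor)
    return None


# Settings suggested in the answer above.
GOOD = """
[core]
executor = SequentialExecutor
sql_alchemy_conn = sqlite:////home/myName/Programs/sqlite3/DatabaseName.db
"""

# Settings from the question, which trigger the error.
BAD = GOOD.replace("SequentialExecutor", "LocalExecutor")

print(check_executor(GOOD))
print(check_executor(BAD))
```

Running the same check against your own airflow.cfg contents before airflow initdb would show whether the two settings agree.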

Related

gcloud crashing due to (SSLHandshakeError): [SSL: UNKNOWN_PROTOCOL]

I have installed gcloud sdk by following the link https://cloud.google.com/sdk/docs/downloads-apt-get on Ubuntu 16.04.6 LTS. I have also done the proxy configuration using the following link https://cloud.google.com/sdk/docs/proxy-settings.
Google Cloud SDK 288.0.0
alpha 2020.04.03
beta 2020.04.03
bq 2.0.56
core 2020.04.03
gsutil 4.49
kubectl 2020.04.03
gcloud init is successful and gcloud info --run-diagnostics displays no problems. However, gcloud crashes on any other command I run. I have tried the following commands:
1. gcloud compute images list
2. gcloud services list
3. gcloud logging logs list
Here is the message I get after the crash.
ERROR: gcloud crashed (SSLHandshakeError): [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:590)
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
Can someone please help.
PS. Here is debug output.
DEBUG: Running [gcloud.compute.images.list] with arguments: [--verbosity: "debug"]
INFO: Refreshing access_token
INFO: Display format: " table(
name,
selfLink.map().scope(projects).segment(0):label=PROJECT,
family,
deprecated.state:label=DEPRECATED,
status
)"
DEBUG: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:590)
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py", line 983, in Execute
resources = calliope_command.Run(cli=self, args=args)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py", line 809, in Run
display_info=self.ai.display_info).Display()
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/display.py", line 483, in Display
self._printer.Print(self._resources)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/core/resource/resource_printer_base.py", line 275, in Print
for resource in resources:
File "/usr/lib/google-cloud-sdk/lib/surface/compute/images/list.py", line 113, in _FilterDeprecated
for image in images:
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/compute/lister.py", line 1065, in __call__
errors=errors):
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/compute/request_helper.py", line 204, in ListJson
for item in _ListCore(requests, http, batch_url, errors, _HandleJsonList):
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/compute/request_helper.py", line 134, in _ListCore
requests=requests, http=http, batch_url=batch_url)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/compute/batch_helper.py", line 106, in MakeRequests
batch_request_callback=batch_checker.BatchCheck)
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/batch.py", line 226, in Execute
batch_http_request.Execute(http)
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/batch.py", line 492, in Execute
self._Execute(http)
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/batch.py", line 449, in _Execute
response = http_wrapper.MakeRequest(http, request)
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/http_wrapper.py", line 356, in MakeRequest
max_retry_wait, total_wait_sec))
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/http_wrapper.py", line 304, in HandleExceptionsAndRebuildHttpConnections
raise retry_args.exc
SSLHandshakeError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:590)
ERROR: gcloud crashed (SSLHandshakeError): [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:590)
I got it working after some debugging. It seems the firewall was the issue in my case. I switched to Python 3.5 and then ran the same command, gcloud compute images list.
This time I got the error "Caught socket error, retrying request to url https://compute.googleapis.com/batch/compute/v1". Adding this URL to the firewall exceptions solved my issue.
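A quick way to see whether a firewall is interfering is to attempt the TLS handshake yourself, outside gcloud. The following is only a sketch: check_tls is a hypothetical helper, and it opens a direct connection, so it ignores any proxy settings configured for gcloud and may fail on a proxy-only network even when gcloud would work.

```python
import socket
import ssl


def check_tls(host, port=443, timeout=5.0):
    """Attempt a direct TLS handshake and return (ok, detail).

    Hypothetical diagnostic helper; it does not go through a proxy.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                # Handshake succeeded; report the negotiated protocol version.
                return True, tls.version()
    except OSError as exc:
        # Socket errors, timeouts and ssl.SSLError are all OSError subclasses.
        return False, "{}: {}".format(type(exc).__name__, exc)
```

Calling check_tls("compute.googleapis.com") returns a pair whose first element says whether the handshake completed; a failure here points at the network path rather than at gcloud itself.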

Windows 10 Rtree installation successful from .whl file, but error when running code

I am running Python 3.7, 64bit on Windows 10 and trying desperately to get Rtree running. I use the package Rtree-0.9.1-cp37-cp37m-win_amd64.whl from Christoph Gohlke (https://www.lfd.uci.edu/~gohlke/pythonlibs/).
I have tried for very long to get it to work, but keep on getting the following error message when running a script that uses geopandas.
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\rtree\core.py", line 90, in <module>
rt = ctypes.CDLL(os.path.join(here, 'spatialindex_c.dll'))
File "C:\Python37\lib\ctypes\__init__.py", line 364, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
The installation of the whl-package should include the libspatialindex files, but they are not found when running the code. I tried to use Python 2.7 first to run it, then installed Python 3.7. I've checked all the dependencies and checked whether the "spatialindex_c.dll" files are at the right place, but nothing helps. Would be great to get an answer on that.

Unable to import the database via PLESK

I'm trying to import my database via PLESK and still have the same errors:
ERROR 2006 (HY000) at line 1898: MySQL server has gone away
Traceback (most recent call last):
File "/usr/local/psa/admin/sbin/dbbackup", line 6, in
File "/usr/local/psa/lib/modules/python/dbbackup/dbbackup.py", line 100, in main
restore(options, password)
File "/usr/local/psa/lib/modules/python/dbbackup/dbbackup.py", line 89, in restore
raise Exception("program 'mysql' finished with non-zero exit code: %d" % p.returncode)
Exception: program 'mysql' finished with non-zero exit code: 1
I tried increasing memory_limit and upload_max_filesize in the PLESK PHP settings, but it still doesn't work. I've also checked my MySQL version (5.5.2), because I read that it might not be supported by the latest version of Plesk, but it seems to be OK.
I am a beginner in these subjects and I really don't know what to do. Please, rescue me :(
It is a timeout issue.
I would suggest increasing wait_timeout in my.cnf.
Hi Todra,
if you want to increase "wait_timeout", you have to define it in your "my.cnf" after the [mysqld] section header, for example:
[mysqld]
interactive_timeout=60
wait_timeout=60
In addition, you might consider setting the same values at runtime as well:
mysql -uadmin -p`cat /etc/psa/.psa.shadow` -e"SET GLOBAL wait_timeout=60; SET GLOBAL interactive_timeout=60"

exception when using ipy64 with pypyodbc

I'm trying to create a database with MS Access using pypyodbc on IronPython. I have this working perfectly fine on my old machine, but I must migrate to a new machine. However, on the new machine I get an exception when running the same script.
Using python I can create a database without errors.
ipy
import pypyodbc
# Raw strings keep the backslash in the Windows path literal.
pypyodbc.win_create_mdb(r'C:\database.mdb')
pypyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb)};DBQ=C:\database.mdb;')
However, if I try the same using ipy64 I get an Exception
"Access Driver is not found"
Traceback (most recent call last):
File "", line 1, in
File "C:\IronPython27\pypyodbc.py", line 2564, in win_create_mdb
Exception: Access Driver is not found.
I installed the AccessDatabaseEngine_x64 as I have 64-bit office products installed.
Thanks,
John.

Stanford Tagger in nltk not working due to JVM parameters

I am getting a weird error while running the following example code snippet:
st = StanfordTagger('bidirectional-distsim-wsj-0-18.tagger')
st.tag('What is the airspeed of an unladen swallow ?'.split())
The first line works properly, but the second line gives the following error:
Could not create the Java virtual machine.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.6/dist-packages/nltk-2.0.1rc1-py2.6.egg/nltk/tag/stanford.py", line 51, in tag
return self.batch_tag([tokens])[0]
File "/usr/local/lib/python2.6/dist-packages/nltk-2.0.1rc1-py2.6.egg/nltk/tag/stanford.py", line 77, in batch_tag
stdout=PIPE, stderr=PIPE)
File "/usr/local/lib/python2.6/dist-packages/nltk-2.0.1rc1-py2.6.egg/nltk/internals.py", line 166, in java
raise OSError('Java command failed!')
OSError: Java command failed!
I have tried adding /usr/lib/jvm to the path, but it is still not working.
It wasn't working for me either, so I tried the following and it's working perfectly.
st = POSTagger('path-to/stanford-postagger-full-2012-07-09/models/wsj-0-18-left3words.tagger','path-to/stanford-postagger-full-2012-07-09/stanford-postagger.jar')
and use nltk's tokenize method instead of Python's split()
taggedSentence= st.tag(nltk.word_tokenize(sentence))
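The difference between split() and a tokenizer shows up as soon as punctuation is attached to a word. The sketch below illustrates this without needing NLTK or the Stanford jar: simple_tokenize is a hypothetical regex-based stand-in, not nltk.word_tokenize, and it handles contractions and abbreviations far more crudely than NLTK does.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens and single punctuation tokens.

    Hypothetical stand-in for nltk.word_tokenize, for illustration only.
    """
    return re.findall(r"\w+|[^\w\s]", text)


sentence = "What is the airspeed of an unladen swallow?"

# str.split() only breaks on whitespace, so "swallow?" stays one token.
print(sentence.split())

# A tokenizer separates the question mark into its own token.
print(simple_tokenize(sentence))
```

A tagger receives one tag per token, so with split() the final token would be tagged as a single unknown word instead of a noun followed by punctuation.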
I see that this question is quite old, but I recently got the same error for no obvious reason. It gave me a lot of headaches, but I found a solution.
First, I installed Oracle Java (here are the instructions: How To Manually Install Oracle Java on a Debian or Ubuntu VPS).
After that, my Python script gave me more information about the error. It output something like:
Forking JVM: error=12, Cannot allocate memory or error=12, Not enough space
You can read more about this problem here: Forking the JVM
To avoid that error, I had to edit /etc/sysctl.conf and add the following:
vm.overcommit_memory = 1
Then restart the system for the change to take effect.
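To confirm which overcommit mode is active after the change, the current value can be read back from /proc on Linux. This is a small sketch: read_overcommit_mode and MODES are hypothetical names, and the function returns None on systems where the file does not exist.

```python
import os

# Meaning of the three values the Linux kernel accepts for this setting.
MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current vm.overcommit_memory value, or None if unavailable."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return int(handle.read().strip())


mode = read_overcommit_mode()
print(mode, MODES.get(mode, "unknown or not a Linux system"))
```

With the sysctl.conf line above in effect, the printed mode should be 1.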