Connect prestodb through SQLAlchemy

I'd like to connect to prestodb through the SQLAlchemy interface. I'm running prestodb==0.7.0 and SQLAlchemy==1.4.20, and SQLAlchemy doesn't seem to have prestodb baked in:
NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:presto
Not much luck registering prestodb either:
import os
from sqlalchemy.dialects import registry
import prestodb
from prestodb.dbapi import Connection
registry.register('presto', 'prestodb.dbapi', 'Connection')
from sqlalchemy.engine import create_engine
port = 8889
user = os.environ["USER"]
engine = create_engine(f'presto://{user}@presto:{port}/hive',
                       connect_args={'protocol': 'https', 'requests_kwargs': {'verify': False}})
db = engine.raw_connection()
# AttributeError: type object 'Connection' has no attribute 'get_dialect_cls'
Any ideas?

If you have a look at the Dialects docs you will see that Presto is an external dialect and needs to be installed separately. The Presto dialect is supported through PyHive and can be installed using pip install 'pyhive[presto]'. (The registry.register attempt fails because registry.register expects a Dialect class, not a DBAPI connection class, hence the get_dialect_cls error.)
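With PyHive installed, the engine from the question should come up without any manual registration. A minimal sketch reusing the question's host, port, and connect_args (PyHive's Presto dialect passes protocol and requests_kwargs through to its DBAPI connection):
import os
from sqlalchemy import create_engine, text

port = 8889
user = os.environ["USER"]
# Host name "presto" and the HTTPS connect_args mirror the question's setup.
engine = create_engine(
    f'presto://{user}@presto:{port}/hive',
    connect_args={'protocol': 'https', 'requests_kwargs': {'verify': False}},
)
with engine.connect() as conn:
    for row in conn.execute(text("SHOW SCHEMAS")):
        print(row)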

Related

Pyodbc error after upgrading Python from 3.8 to 3.11

I am using the following code:
import urllib.parse
import pandas as pd
import pyodbc
from sqlalchemy import create_engine

def connStrMDBV2(fname):
    driver = r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}'
    connection_string = fr"{driver};DBQ={fname};ExtendedAnsiSQL=1;"
    connection_uri = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
    engine = create_engine(connection_uri)
    return engine

fname = r'C:\Users\asd\Desktop\test11\mytest.mdb'
connStrMDBV2(fname)
I get the error:
raise exc.NoSuchModuleError(
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:access.pyodbc
I have checked that all the packages are installed, and I am not using a virtual environment. It used to work on Python 3.8; the error started after upgrading to Python 3.11.
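Like Presto above, access+pyodbc is an external dialect: it is provided by the sqlalchemy-access package, so a likely cause is that the package was not reinstalled for the new Python 3.11 interpreter. A minimal check, assuming that diagnosis:
# Reinstall the external dialect under the new interpreter:
#   pip install sqlalchemy-access
from sqlalchemy.dialects import registry

# Raises NoSuchModuleError if the dialect is still absent
# from the current environment.
registry.load("access.pyodbc")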

How to include python mysql.connector into AWS Chalice deployment?

I am trying to deploy an AWS Lambda application that I implemented with the Chalice Python framework. My app.py connects to a MySQL server and therefore has to
import mysql.connector
But on every invocation of one of my lambda functions I get an error in the log
'Unable to import module 'app': No module named mysql.connector'
I tried adding mysql.connector to the requirements.txt file in the Chalice project:
mysql_connector==2.1.6
If I do so, two additional folders containing several files appear in the AWS Lambda environment:
/mysql_connector-2.1.6.data
/mysql_connector-2.1.6.dist-info
But the error remains the same. How do I deploy Python's mysql.connector with Chalice?
This finally worked for me:
import os
import sys

lib_path = os.path.abspath(
    os.path.join(__file__, '..', 'mysql_connector-2.1.6.data', 'purelib'))
sys.path.append(lib_path)  # make the installed package importable
import mysql.connector
Putting "mysql_connector==2.1.6" into the requirements.txt file did install the MySQL connector in the Lambda environment; I then added the package path (../mysql_connector-2.1.6.data/purelib) to the system path.
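As an aside (my suggestion, not from the original answer): Chalice also supports a top-level vendor/ directory whose contents are bundled into the deployment package as-is, so unpacking the connector there can avoid the sys.path manipulation:
myproject/
├── app.py
├── requirements.txt
└── vendor/
    └── mysql/  (unpacked package)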

Can't connect to MySQL database from pyspark, getting JDBC error

I am learning pyspark and trying to connect to a MySQL database.
But I am getting a java.lang.ClassNotFoundException: com.mysql.jdbc.Driver exception while running the code. I have spent a whole day trying to fix it; any help would be appreciated :)
I am using PyCharm Community Edition with Anaconda and Python 3.6.3.
Here is my code:
from pyspark import SparkContext, SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)
df = sqlContext.read.format("jdbc").options(
    url="jdbc:mysql://192.168.0.11:3306/my_db_name",
    driver="com.mysql.jdbc.Driver",
    dbtable="billing",
    user="root",
    password="root").load()
Here is the error:
py4j.protocol.Py4JJavaError: An error occurred while calling o27.load.
: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
This got asked 9 months ago at the time of writing, but since there's no answer, here it goes. I was in the same situation, searched Stack Overflow over and over, and tried different suggestions, but the answer is absurdly simple: you just have to COPY the MySQL driver into the "jars" folder of Spark!
Download it here: https://dev.mysql.com/downloads/connector/j/5.1.html
I'm using the 5.1 version although 8.0 exists; I had some other problems running the latest version with Spark 2.3.2 (and other problems running Spark 2.4 on Windows 10).
Once downloaded, you can just copy it into your Spark folder:
E:\spark232_hadoop27\jars\ (use your own drive:\folder_name -- this is just an example)
You should have two files:
E:\spark232_hadoop27\jars\mysql-connector-java-5.1.47-bin.jar
E:\spark232_hadoop27\jars\mysql-connector-java-5.1.47.jar
After that, the following code launched through PyCharm or a Jupyter notebook should work (as long as you have a MySQL database set up, that is):
import findspark
findspark.init()

import pyspark  # only run after findspark.init()
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
dataframe_mysql = spark.read.format("jdbc").options(
    url="jdbc:mysql://localhost:3306/uoc2",
    driver="com.mysql.jdbc.Driver",
    dbtable="company",
    user="root",
    password="password").load()
dataframe_mysql.show()
Bear in mind that I'm currently working locally with my Spark setup, so there are no real clusters involved, and no "production" code that gets submitted to such a cluster. For something more elaborate this answer could help: MySQL read with PySpark
On my computer, @Kondado's solution works only if I change the driver in the options:
driver = 'com.mysql.cj.jdbc.Driver'
I am using Spark 2.4.0 on Windows. I downloaded mysql-connector-java-8.0.15.jar, the Platform Independent version, from here, and copied it to 'C:\spark-2.4.0-bin-hadoop2.7\jars\'.
My code in Pycharm looks like this:
# import findspark  # not necessary
# findspark.init()  # not necessary
from pyspark import SparkConf, SparkContext, sql
from pyspark.sql import SparkSession

sc = SparkSession.builder.getOrCreate()
sqlContext = sql.SQLContext(sc)
source_df = sqlContext.read.format('jdbc').options(
    url='jdbc:mysql://localhost:3306/database1',
    driver='com.mysql.cj.jdbc.Driver',  # instead of com.mysql.jdbc.Driver
    dbtable='table1',
    user='root',
    password='****').load()
print(source_df)
source_df.show()
I don't know how to add the jar file to the classpath (can someone tell me how?), so I put it in the SparkSession config and it works fine.
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName('test') \
    .master('local[*]') \
    .enableHiveSupport() \
    .config("spark.driver.extraClassPath", "<path to mysql-connector-java-5.1.49-bin.jar>") \
    .getOrCreate()

df = spark.read.format("jdbc") \
    .option("url", "jdbc:mysql://localhost/<database_name>") \
    .option("driver", "com.mysql.jdbc.Driver") \
    .option("dbtable", "<table_name>") \
    .option("user", "<user>") \
    .option("password", "<password>") \
    .load()
df.show()
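An alternative worth noting (my addition, not from the original answer): instead of spark.driver.extraClassPath, the jar can be supplied through the spark.jars config, which also ships it to the executors:
from pyspark.sql import SparkSession

# Assumed local path to the connector jar; adjust to where you downloaded it.
spark = SparkSession.builder \
    .appName('test') \
    .master('local[*]') \
    .config("spark.jars", "/path/to/mysql-connector-java-5.1.49-bin.jar") \
    .getOrCreate()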
This worked for me: pyspark with MSSQL.
Java version is 1.7.0_191; pyspark version is 2.1.2.
Download the following jar files:
sqljdbc41.jar
mssql-jdbc-6.2.2.jre7.jar
Paste them inside the jars folder of the virtual environment:
test_env/lib/python3.6/site-packages/pyspark/jars
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('Practise').getOrCreate()
url = 'jdbc:sqlserver://your_host_name:your_port;databaseName=YOUR_DATABASE_NAME;useNTLMV2=true;'
df = spark.read.format('jdbc') \
    .option('url', url) \
    .option('user', 'your_db_username') \
    .option('password', 'your_db_password') \
    .option('dbtable', 'YOUR_TABLE_NAME') \
    .option('driver', 'com.microsoft.sqlserver.jdbc.SQLServerDriver') \
    .load()

Celery + SQLAlchemy: DatabaseError: (DatabaseError) SSL error: decryption failed or bad record mac

The error in the title sometimes triggers when using Celery with more than one worker on a PostgreSQL database with SSL turned on.
I'm in a Flask + SQLAlchemy configuration.
As mentioned here: https://github.com/celery/celery/issues/634
the solution in the django-celery plugin was to simply dispose of all DB connections at the start of the task.
In a Flask + SQLAlchemy configuration, doing this worked for me:
from celery.signals import task_prerun

@task_prerun.connect
def on_task_init(*args, **kwargs):
    engine.dispose()
In case you don't know what "engine" is and how to get it, see here: http://flask.pocoo.org/docs/patterns/sqlalchemy/
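A related variant (my suggestion, not from the linked issue): dispose of the pool once per worker process instead of before every task, using the worker_process_init signal:
from celery.signals import worker_process_init

@worker_process_init.connect
def on_worker_init(*args, **kwargs):
    # Drop connections inherited from the parent after the worker forks;
    # "engine" is the same SQLAlchemy engine as above.
    engine.dispose()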

web2py doesn't connect to MySQL

I installed web2py from source and wanted to use the DAL without the rest of the framework.
But the DAL does not connect to MySQL:
>>> DAL('mysql://user1:user1@localhost/test_rma')
...
RuntimeError: Failure to connect, tried 5 times:
'NoneType' object has no attribute 'connect'
Whereas MySQLdb can connect to the database with the same credentials:
>>> import MySQLdb
>>> db = MySQLdb.connect(host='localhost', user='user1', passwd='user1', db='test_rma')
A similar problem with MSSQL was solved by explicitly setting the driver object. I tried the same solution:
>>> from gluon.dal import MySQLAdapter
>>> print MySQLAdapter.driver
None
>>> driver = globals().get('MySQLdb',None)
>>> print MySQLAdapter.driver
None
But still the driver is None.
OK, I found the solution to the problem. I had to write:
MySQLAdapter.driver = globals().get('MySQLdb',None)
instead of
driver = globals().get('MySQLdb',None)
I misread that line in the original question.
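Putting it together, a minimal sketch of the working sequence (assuming MySQLdb is importable in your environment):
import MySQLdb
from gluon.dal import DAL, MySQLAdapter

# Assign to the class attribute so the adapter picks up the driver;
# binding a bare local name `driver` changes nothing.
MySQLAdapter.driver = MySQLdb
db = DAL('mysql://user1:user1@localhost/test_rma')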