Access to a MySQL database via Jupyter Notebook w/ Python3 - mysql

I needed to access a MySQL database via Jupyter Notebook, on which I run Python 3.6 (Anaconda install). It's a linear workflow, extracting data from the DB and manipulating it in Python/Pandas. No need for an ORM, a simple connector should do. However, the widely referenced MySQLdb package doesn't work with Python 3.x.
What are the alternatives?

The recommended way to install Jupyter on Ubuntu is Anaconda, so the appropriate package manager is conda. Packages installed with the system pip/pip3 or with apt generally end up outside the Anaconda environment and so are not visible to the Notebook. conda makes it simple to get at least two good connectors:
pymysql works well and is easy to install:
sudo conda install pymysql
The 'official' connector:
sudo conda install mysql-connector-python
I tried pymysql first and it was fine but then switched to the second option due to the availability of extensive documentation.
If your objective is to import the data into a Pandas dataframe, the built-in pd.read_sql_table or pd.read_sql_query is convenient, as it labels the columns etc. It still requires a connector to be installed, as discussed above.
An example with MySQL-connector-python, where you need to enter the database DETAILS:
import pandas as pd
import sqlalchemy
engine = sqlalchemy.create_engine('mysql+mysqlconnector://USER:PASSWORD@HOST/DB_NAME')
example_df = pd.read_sql_table("YOUR_TABLE_NAME", engine)
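If the password contains characters such as @, / or :, it has to be percent-encoded before it goes into the URL. A minimal sketch using only the standard library; the helper name build_mysql_url, the default port 3306 and the sample credentials are assumptions for illustration, not part of pandas or SQLAlchemy:

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, db_name, port=3306,
                    driver="mysqlconnector"):
    """Return a SQLAlchemy-style MySQL URL with user and password escaped."""
    # quote_plus percent-encodes characters that would otherwise be read
    # as URL delimiters (for example '@', '/' and ':').
    return "mysql+{}://{}:{}@{}:{}/{}".format(
        driver, quote_plus(user), quote_plus(password), host, port, db_name
    )


# Hypothetical credentials, for illustration only.
url = build_mysql_url("USER", "p@ss/word", "localhost", "DB_NAME")
print(url)
```

The resulting string can be passed to sqlalchemy.create_engine in place of a hand-written URL.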

Related

SQLAlchemy NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:spanner

Problem
We're trying to connect to Cloud Spanner via SQLAlchemy version 1.3.23 and python-spanner-sqlalchemy. Using Poetry for dependency management, sqlalchemy-spanner has been added like so (this is how the project was set up):
sqlalchemy = "~1.3"
sqlalchemy-spanner = { git="https://github.com/cloudspannerecosystem/python-spanner-sqlalchemy.git", tag="v0.1.0" }
When create_engine is called with
create_engine("spanner:///projects/my-project/instances/my-instance/databases/my-db")
I get the following error
<class 'sqlalchemy.exc.NoSuchModuleError'>", "NoSuchModuleError(\"Can't load plugin: sqlalchemy.dialects:spanner\")
Attempts
Registry
I've tried adding (as seen in the conftest.py file in the python-spanner-sqlalchemy test package)
from sqlalchemy.dialects import registry
registry.register("spanner", "google.cloud.sqlalchemy_spanner", "SpannerDialect")
before create_engine is called, which leads to the following error:
<class 'ModuleNotFoundError'>", "ModuleNotFoundError(\"No module named 'google.cloud.sqlalchemy_spanner'\")
This makes me think that the plugin dialect has not been correctly added since, in line 49 of setup.py, the connection for the dialect is made:
entry_points={
"sqlalchemy.dialects": [
"spanner = google.cloud.sqlalchemy_spanner:SpannerDialect"
]
},
Installing via python setup.py install
In the README for the spanner project, it says to clone the repo and install via python setup.py install. I performed this step, but am unsure how to import this into my current project or make my project aware of this library. I've never manually installed python packages before so, if anyone can provide any help here, I'd appreciate it.
What I did try:
install the library as per above
try to add the dependency via poetry : poetry add sqlalchemy-spanner. Got Could not find a matching version of package sqlalchemy-spanner
try to locate the library via pip : pip install sqlalchemy-spanner== which usually lists available package versions.
I'm not sure that either of the last 2 bullets actually check a local installation of a package. Not even really sure what I'm talking about here.
Update
So I was able to install the local version of python-spanner-sqlalchemy by using pip install /path/to/project, which works, but still having the same issues with loading the dialect.
I added an import for SpannerDialect in the code (in the Registry section) above with from google.cloud.sqlalchemy_spanner import SpannerDialect. PyCharm auto-completed this for me which indicates to me that the package is successfully installed and available. But I receive the ModuleNotFoundError for google.cloud.sqlalchemy_spanner when running.
I ran python in my project root directory and, from the repl, imported SpannerDialect with no errors.
Solution
To clarify, the solution @larkee provided worked regarding the updated repository URL.
As a note, we recently moved the repo from cloudspannerecosystem/python-spanner-sqlalchemy to googleapis/python-spanner-sqlalchemy.
I clarified why that worked in the comments to their answer.
I have not tested the answer from @neondot42, but I have seen this brought up as well, so take a look there if you're having the same issue.
Sorry for the late response. I was recently struggling with a similar problem: SQLAlchemy was unable to load the spanner dialect. I tried uninstalling and installing different versions with pip, but nothing seemed to work.
In the end, what did the trick was specifying the "driver" part of the database URL as "spanner" too. So the final URL looked like this:
spanner+spanner:///projects/project-id/instances/instance-id/databases/database-id
I am not entirely sure of why this appeared to be the solution for me, as all the documentation I could find on Google's end just showed that you could use only the "spanner:///..." to connect. Also I am not familiar with the inner workings of sqlalchemy and how it should detect other installed dialects. However, I hope this solution can help someone else.
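One possible explanation, as far as I understand SQLAlchemy's plugin lookup (treat this as an assumption and check it against your installed version): the part of the URL before :// is turned into a plugin name by replacing + with a dot, so spanner and spanner+spanner look up two different names. Which of the two works would then depend on the name the installed sqlalchemy-spanner release registers. The function below is a hypothetical standard-library imitation of that name translation, not SQLAlchemy's own code:

```python
def dialect_entry_point_name(url):
    """Imitate how a 'dialect+driver' URL scheme maps to a plugin name."""
    # Everything before '://' is the "drivername", e.g. 'spanner+spanner'.
    drivername = url.split("://", 1)[0]
    # 'dialect+driver' becomes 'dialect.driver'; a bare dialect is unchanged.
    return drivername.replace("+", ".")


# Placeholder project, instance and database IDs.
print(dialect_entry_point_name("spanner:///projects/p/instances/i/databases/d"))
print(dialect_entry_point_name("spanner+spanner:///projects/p/instances/i/databases/d"))
```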
I was able to replicate the error you are seeing by installing sqlalchemy but not having sqlalchemy-spanner installed at all:
pip install sqlalchemy==1.3
pip uninstall sqlalchemy-spanner
>>> from sqlalchemy import create_engine
>>> engine = create_engine(<url>)
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:spanner
After I install sqlalchemy-spanner I am able to use SQLAlchemy without issue:
pip install git+git://github.com/googleapis/python-spanner-sqlalchemy.git@v0.1.0
>>> from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, inspect
>>> engine = create_engine(<url>)
>>> metadata = MetaData(bind=engine)
>>> table = Table(
... "TestTable",
... metadata,
... Column("user_id", Integer, primary_key=True),
... Column("user_name", String(16), nullable=False),
... )
>>> table.create()
>>> inspect(engine).get_table_names()
['TestTable']
>>>
Based on this, I think the issue is that sqlalchemy-spanner is not being installed. Unfortunately, I'm not familiar with Poetry for dependency management so I'm not sure exactly what is going wrong. As a note, we recently moved the repo from cloudspannerecosystem/python-spanner-sqlalchemy to googleapis/python-spanner-sqlalchemy. I am able to use either for the pip command but perhaps Poetry requires the newer one?
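One way to check whether the dialect is actually installed for the interpreter that runs your code (which may not be the one your IDE or shell uses) is to ask that interpreter directly. This is a rough diagnostic sketch using only the standard library (Python 3.8 or later); the helper names is_importable and registered_dialects are made up for illustration:

```python
import importlib.util
import sys
from importlib import metadata


def is_importable(module_name):
    """Return True if module_name can be found by this interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. 'google.cloud') is missing.
        return False


def registered_dialects(group="sqlalchemy.dialects"):
    """Map entry-point names in the given group to their targets."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):      # Python 3.10 and later
        selected = eps.select(group=group)
    else:                           # Python 3.8 / 3.9 return a plain dict
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


print("interpreter:", sys.executable)
print("dialect module importable:",
      is_importable("google.cloud.sqlalchemy_spanner"))
print("registered dialects:", registered_dialects())
```

If the module is importable from a plain python REPL but not from the application, the two are probably running in different environments (for example, a Poetry virtualenv versus the system Python).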

Cannot connect to a mysql:// instance with SQLAlchemy from my Mac

I'm trying to connect to a mySQL database through SQLAlchemy and I can't get it to work for the life of me. Here's the code I'm running and the error I'm getting.
import pandas as pd
import sqlalchemy as sql
import urllib.parse
user = 'user'
pswd = urllib.parse.quote_plus('password')
connect_string = 'mysql://{}:{}@localhost:8018/test'.format(user, pswd)
sql_engine = sql.create_engine(connect_string)
>>> ModuleNotFoundError: No module named 'MySQLdb'
After seeing this, I've tried numerous different ways of installing 'MySQLdb' ( including the obvious 'pip3 install mysqlclient') and when I run that in my terminal, I get the following.
>>> ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
I know this question has been posted here before but I've spent 2 hours now trying to figure this out with the advice on those other questions and nothing is working. Can anyone help?
According to
https://pypi.org/project/mysqlclient/#files
mysqlclient only offers pre-compiled wheel files for Windows machines running 64-bit Python 3.6, 3.7, and 3.8. All other configurations require that it be installed from source, and that can be ... complicated.
You may want to try installing PyMySQL instead, and use a connection URI of the form
mysql+pymysql://<username>:<password>@<host>/<dbname>[?<options>]
SQLAlchemy is a high-level ORM. To connect to backends it needs Python DB API driver(s). You can install SQLAlchemy with a MySQL driver using
pip install -U 'SQLAlchemy[mysql]'
See the list of extra drivers available for such installation.
Most drivers have parts written in C/C++ and thus require compilation. You need to install MySQL-related libraries to compile mysqlclient. See https://stackoverflow.com/search?q=%5Bpip%5D+install+%5Bmysql%5D+%22mysql_config%22+not+found
There are a few pure-Python drivers. They're easier to install. PyMySQL is one such driver. To install it along with SQLAlchemy:
pip install -U 'SQLAlchemy[pymysql]'
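To see which MySQL drivers the current environment already has, and therefore which URL prefix is likely to work, something like the following can help. It is a sketch using only the standard library; the function name available_mysql_prefixes is an assumption, and the list covers only the three drivers mentioned in this thread:

```python
import importlib.util

# SQLAlchemy URL prefix -> top-level module the driver provides.
MYSQL_DRIVERS = [
    ("mysql+mysqldb", "MySQLdb"),                  # mysqlclient (C extension)
    ("mysql+pymysql", "pymysql"),                  # PyMySQL (pure Python)
    ("mysql+mysqlconnector", "mysql.connector"),   # Oracle's connector
]


def available_mysql_prefixes():
    """Return the URL prefixes whose driver module can be imported here."""
    found = []
    for prefix, module_name in MYSQL_DRIVERS:
        try:
            spec = importlib.util.find_spec(module_name)
        except (ImportError, ValueError):
            # A missing parent package (e.g. 'mysql') ends up here.
            spec = None
        if spec is not None:
            found.append(prefix)
    return found


print(available_mysql_prefixes())
```

A plain mysql:// URL makes SQLAlchemy look for MySQLdb, which is why the error in the question names that module even though it was never imported explicitly.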

ModuleNotFoundError: No module named 'pyarrow' with satisfied requirements

I am trying to run this command in Jupyter Notebook: import pyarrow, but I get this error: "ModuleNotFoundError: No module named 'pyarrow'"
I have already installed it with both pip3 and brew. When I run pip3 install pyarrow, it says the requirements are already satisfied. All other libraries I have installed run with no issues from the same directory.
Thank you.
This is an odd one, for sure. I am not familiar enough with pyarrow to know why the following worked.
From the docs: if I do pip3 install pyarrow and run pip3 list, pyarrow shows up in the list, but I cannot seem to import it from the python CLI. Yet if I also run conda install -c conda-forge pyarrow, installing all of its dependencies, Jupyter Notebook can then import it properly.
Are you using a venv? Try running jupyter kernelspec list in your console and make sure your kernel is running in the environment you're expecting it to be.
https://github.com/jupyter/notebook/issues/2359 (discussion about this here)
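A quick way to see which Python the notebook kernel is really using is to run the following in a notebook cell. It is a small sketch using only the standard library; pip_install_command is a hypothetical helper that only builds the command and does not run it:

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    # 'python -m pip' installs into the environment of that python,
    # which avoids a pip3 on PATH that belongs to a different install.
    return [sys.executable, "-m", "pip", "install", package]


print("kernel interpreter:", sys.executable)
print("environment prefix:", sys.prefix)
print("command to run:", " ".join(pip_install_command("pyarrow")))
```

If the interpreter printed here differs from the one behind pip3 in your terminal, "requirement already satisfied" and "No module named" can both be true at the same time.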

Error in importing MySQL connector in Python 3.5

I am getting ImportError: No module named 'mysql' while I do the following...
>>> import mysql.connector
MySQL is installed and I am on Python 3.5. I can't figure it out. The above command runs fine in Python 2.7.
Unfortunately there is no mysql-connector available for python3.5.
So, you can use pymysql module which is a replacement for mysql-connector package
pip install pymysql
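import pymysql
# Open a connection; DictCursor returns each row as a dict keyed by column name.
Connection = pymysql.connect(host='hostname', user='username', password='password',
                             db='database', charset='utf8mb4', cursorclass=pymysql.cursors.DictCursor)
# ExecuteQuery stands for your own function that runs queries on the connection.
ExecuteQuery(Connection)
Connection.close()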
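ExecuteQuery is not defined in the snippet above. A hypothetical version that works with any DB-API connection might look like the sketch below. It is demonstrated with the standard library's sqlite3 so that it runs without a MySQL server; the function name execute_query and its parameters are assumptions, not part of PyMySQL:

```python
import sqlite3
from contextlib import closing


def execute_query(connection, sql, params=None):
    """Run one statement on a DB-API connection and return all rows."""
    # closing() also works for cursors that are not context managers.
    with closing(connection.cursor()) as cursor:
        if params is None:
            cursor.execute(sql)
        else:
            # Placeholder style depends on the driver:
            # sqlite3 uses '?', PyMySQL uses '%s'.
            cursor.execute(sql, params)
        return cursor.fetchall()


# Stand-in connection so the sketch runs anywhere; with PyMySQL you would
# pass the object returned by pymysql.connect(...) instead.
demo_connection = sqlite3.connect(":memory:")
rows = execute_query(demo_connection, "SELECT 1")
print(rows)
demo_connection.close()
```

With the DictCursor used above, PyMySQL would return each row as a dict rather than a tuple.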
For me (OS X 10.11.4, Python 3.5.1), the mysql-connector-package installed itself (by default) into Python 2.7 that was also on my system. I copied the mysql package directory from:
/Library/Python/2.7/site-packages/mysql to /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/mysql. Everything works. According to http://dev.mysql.com/doc/connector-python/en/connector-python-versions.html the connector package supports Python 3.3+.
Your specific installation path for the module may be different, but you can easily check this from the REPL using the sys.path:
https://leemendelowitz.github.io/blog/how-does-python-find-packages.html
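For example, the following prints the search path and reports where a given module would be loaded from. It is a minimal sketch using only the standard library; module_location is a name chosen for illustration:

```python
import importlib.util
import sys


def module_location(module_name):
    """Return the file a module would be loaded from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # A missing parent package (e.g. 'mysql') ends up here.
        return None
    return spec.origin if spec is not None else None


# Directories this interpreter searches, in order.
for entry in sys.path:
    print(entry)

# None means this interpreter cannot see the connector at all.
print(module_location("mysql.connector"))
```

Running this under both python2.7 and python3.5 shows which of the two installations actually contains the mysql package.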
Since we need to connect MySQL with Python, the Python command assumes that the connector is enabled. So we need to follow the steps below:
open MySQL Installer;
go to "Custom settings";
go to "Connectors";
choose the connector\Python and version based on the one you are using;
add it;
use import mysql.connector.
It should run, in my case it did!
From Can't run MySql Utilities:
MYSQL Utilities assumes that the MySQL Connector for Python has been installed. If you install it (http://dev.mysql.com/downloads/connector/python/), MySQL Utilities should run OK.
I think this link may help you to solve your problem.

Python 3 and mysql through SQLAlchemy

Currently:
SQLAlchemy installed and working (or at least import v0.8.0b2)
Mysql (v5.5.16)
Distribute (0.6.34)
Oracle mysql-python connector
Python 3.2
Windows 7 32/64 (note that I installed Python 32bits)
The problem is that MySQLdb or Oursql is required and I didn't manage to get either of them working.
Found this but didn't manage to get it working either.
Edit: If you are aware of another ORM that works with Python 3, I'm interested.
I was successful in getting Oracle's MySQL connector for python working with SQLAlchemy on Python 3.3. Your connection string needs to start with "mysql+mysqlconnector://...". After I changed my connection string everything (well, simple things) started working.
The MySQL connector docs can be found here: https://dev.mysql.com/doc/connector-python/en/
The package is up on PyPi: https://pypi.org/project/mysql-connector-python/
Here are the SQLAlchemy docs about using the Python connector: http://docs.sqlalchemy.org/en/latest/dialects/mysql.html#module-sqlalchemy.dialects.mysql.mysqlconnector
For others who arrive here, this should do it:
pip install mysql-connector==2.1.4 # version avoids Protobuf error
URI = 'mysql+mysqlconnector://$USER:$PASS@$HOST/$DB'
I tried Oracle's connection, as suggested by @Brad Campbell, but unfortunately it was extremely slow, much slower than the "real" MySQL-Python connection I had been using with SQLAlchemy on Python 2.
After checking SQLAlchemy themselves,
http://docs.sqlalchemy.org/en/latest/dialects/mysql.html#module-sqlalchemy.dialects.mysql.mysqldb
To use MySQL-Python on Python 3, they recommend a fork of it, mysqlclient,
https://github.com/PyMySQL/mysqlclient-python
It is available via pip with pip install mysqlclient, but there are almost certainly other steps you'll need to do to set it up initially. After that though, I was seeing the performance go back to what I was used to, which was about 5x faster than with Oracle's connector.
I've gotten oursql + SQLAlchemy 0.8.1 + Python 3.3 to work. Building off of LukeCarrier's port, I modified oursql.c to use the correct import levels, and it worked! Try this, and be sure to follow the readme:
https://github.com/clintron/py3k-oursql
You may also need to have the latest version of Cython.