Directly creating a pyodbc connection works but using sql.alchemyurl does not work [duplicate] - sqlalchemy

I'm using SQLAlchemy for a Python project, and I want to have a tidy connection string to access my database. So for example:
engine = create_engine('postgresql://user:pass@host/database')
The problem is my password contains a sequence of special characters that get interpreted as delimiters when I try to connect.
I realize that I could just use engine.URL.create() and then pass my credentials like this:
import sqlalchemy as sa
connection_url = sa.engine.URL.create(
    drivername="postgresql",
    username="user",
    password="p@ss",
    host="host",
    database="database",
)
print(connection_url)
# postgresql://user:p%40ss@host/database
But I'd much rather use a connection string if this is possible.
So to be clear, is it possible to encode my connection string, or the password part of the connection string - so that it can be properly parsed?

You need to URL-encode the password portion of the connection string:
from urllib.parse import quote_plus
from sqlalchemy.engine import create_engine
engine = create_engine("postgresql://user:%s@host/database" % quote_plus("p@ss"))
If you look at the implementation of the class used in SQLAlchemy to represent database connection URLs (in sqlalchemy/engine/url.py), you can see that they use the same method to escape passwords when converting the URL instances into strings.
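As a quick sanity check (a minimal sketch with placeholder credentials, assuming a SQLAlchemy version that exposes make_url from sqlalchemy.engine), you can parse the escaped string back and confirm the password round-trips:
from urllib.parse import quote_plus
from sqlalchemy.engine import make_url
# Build the URL with an escaped password, then let SQLAlchemy parse it back.
url = make_url("postgresql://user:%s@host/database" % quote_plus("p@ss"))
print(url.password)  # p@ss -- the '@' no longer collides with the host delimiter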

In Python 3.x, you need to import urllib.parse.quote:
The urllib module has been split into parts and renamed in Python 3 to
urllib.request, urllib.parse, and urllib.error.
If you are connecting to a MySQL database under Python 3 and your password contains a sequence of special characters:
user_name is your database user id
database is your database name
your_password is the password containing special characters
from urllib.parse import quote
from sqlalchemy.engine import create_engine
engine = create_engine('mysql+mysqlconnector://user_name:%s@localhost:3306/database' % quote('your_password'))
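One caveat worth noting (illustrative values only): quote() leaves '/' unescaped by default, so a password containing '/' still breaks the URL unless you pass safe='', while quote_plus() also turns spaces into '+':
from urllib.parse import quote, quote_plus
print(quote("p@ss/word"))           # p%40ss/word  -- '/' is left as-is
print(quote("p@ss/word", safe=""))  # p%40ss%2Fword
print(quote_plus("p@ss word"))      # p%40ss+word  -- space becomes '+'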

Unicode error from CommCare export tool for MySql

While using the CommCare Export Tool, the data is exported correctly to the Excel file and to the SQLite DB, but when we try to export the data to a MySQL DB the export breaks and gives us the following error:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-3: ordinal not in range(256)
The data is imported correctly into the DB until Hindi text is encountered; once Hindi text is encountered, the process breaks with the error above.
We suspect the error is caused by the Devanagari text being inserted into the DB, so we tried changing all the data columns to utf8_unicode_ci, but the problem persists.
How can we fix this?
The default mysqldb connection uses latin-1. According to the SQLAlchemy docs you can set the connection encoding directly in the connection string:
http://docs.sqlalchemy.org/en/latest/dialects/mysql.html#unicode
The CommCare HQ Export Tool Docs at
https://confluence.dimagi.com/display/commcarepublic/CommCare+Data+Export+Tool#CommCareDataExportTool-SQLURLformats
include this string, and suggest the format
mysql+pymysql://<username>:<password>@<host>/<database name>?charset=utf8
(in your case, replace pymysql with mysqldb)
Are you including the charset in that connection string? If not, adding it should correct the issue with the cursor expecting latin-1 encoded text.
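A minimal sketch of what that looks like in create_engine (host, credentials and database name are placeholders; the key part is the charset query parameter):
from sqlalchemy import create_engine
# The charset option makes the mysqldb driver use UTF-8 instead of latin-1.
engine = create_engine("mysql+mysqldb://user:password@localhost/commcare_db?charset=utf8")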

psycopg2.copy_from: Remove quotes in text when importing from CSV

I have a CSV file in which every entry is quoted, i.e. wrapped in opening and closing quotes. When I import it into the database using copy_from, the table contains the quotes as part of the data, and where there is an empty entry I get a pair of quotes only, i.e. "" values in the column.
Is there a way to tell copy_from to ignore the quotes, so that when I import the file the text doesn't have quotes around it and empty entries are converted to NULL?
Here is my code:
with open(source_file_path) as inf:
    cursor.copy_from(inf, table_name, columns=column_list, sep=',', null="None")
UPDATE:
I still haven't found a solution to the above, but to get the file imported I went ahead and wrote the raw SQL COPY statement and executed it through both a SQLAlchemy connection and a psycopg2 cursor, as below. Both removed the quotes and put NULL where there were empty entries.
sql = "COPY table_name (col1, col2, col3, col4) FROM '{}' DELIMITER ',' CSV HEADER".format(csv_file_path)
SQLAlchemy:
conn = engine.connect()
trans = conn.begin()
conn.execute(sql)
trans.commit()
conn.close()
Psycopg2:
conn = psycopg2.connect(pg_conn_string)
conn.set_isolation_level(0)
cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
cursor.execute(sql)
While I still wish the copy_from function would work, I am now wondering whether the above two approaches are as fast as copy_from and, if so, which of the two is faster?
Probably a better approach is to use the built-in csv library to read the CSV file and transfer the rows to the database. A corollary to the UNIX philosophy of "do one thing and do it well" is to use the appropriate, specialized tool for the job. What's good about the csv library is that it gives you customization options for how the CSV is read, such as quoting characters and skipping initial rows (see the documentation).
Assuming a simple CSV file with two columns: an integer "ID", and a quoted string "Country Code":
"ID", "Country Code"
1, "US"
2, "UK"
and a declarative SQLAlchemy target table:
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine("postgresql+psycopg2://<REMAINDER_OF_YOUR_ENGINE_STRING>")
Base = declarative_base(bind=engine)

class CountryTable(Base):
    __tablename__ = 'countries'
    id = Column(Integer, primary_key=True)
    country = Column(String)
you can transfer the data by:
import csv
from sqlalchemy.orm import sessionmaker
from your_model_module import engine, CountryTable

Session = sessionmaker(bind=engine)

# Open in text mode (Python 3's csv module expects str, not bytes).
with open("path_to_your.csv", newline="") as f:
    reader = csv.DictReader(f)
    session = Session()
    for row in reader:
        country_record = CountryTable(id=row["ID"], country=row["Country Code"])
        session.add(country_record)
        session.commit()
    session.close()
This solution is longer than a one-line .copy_from call, but it gives you better control without having to dig through the code or documentation of wrapper/convenience functions like .copy_from. You can specify which columns to transfer and handle exceptions at row level, since data is transferred row by row with a commit per row. Alternatively, rows can be transferred in a batch with a single commit:
with open("path_to_your.csv", "rb") as f:
reader = csv.DictReader(f)
session = Session()
session.add_all([
CountryTable(id=row["ID"], country=row["Country Code"]) for row in reader
])
session.commit()
session.close()
To compare the execution time of the different approaches to your problem, use the timeit module (or its command-line interface) that comes with Python. A word of caution, however: it's better to be correct than fast.
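As a rough illustration of wrapping the two approaches in timeit (the loader functions are hypothetical stand-ins for the COPY-based and row-by-row imports):
import timeit

def load_with_copy():
    ...  # hypothetical: run the raw COPY-based import

def load_row_by_row():
    ...  # hypothetical: run the csv + ORM import

print(timeit.timeit(load_with_copy, number=3))
print(timeit.timeit(load_row_by_row, number=3))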
EDIT:
I was trying to figure out where .copy_from is implemented, as I hadn't used it before. It turns out to be a psycopg2-specific convenience function. It does not fully support reading CSV files; it only reads file-like objects, and the only CSV-related customization it accepts is the separator. It does not understand quoting characters.
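If you do want the quote-aware behaviour of COPY ... CSV without making the file visible to the database server, psycopg2's copy_expert runs the same statement against a local file object. A minimal sketch, reusing the table, columns and variable names from the question:
import psycopg2

conn = psycopg2.connect(pg_conn_string)
with conn, conn.cursor() as cursor, open(csv_file_path) as f:
    # FROM STDIN reads from the client-side file handed to copy_expert;
    # CSV mode strips the surrounding quotes, matching the raw COPY above.
    cursor.copy_expert(
        "COPY table_name (col1, col2, col3, col4) FROM STDIN WITH (FORMAT csv, HEADER true)",
        f,
    )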

Importing a very very large file in postgres without defining the table structure

I have an insanely large CSV file (about 500 MB) which I want to import into a Postgres DB. I do not want to create the table first with its more than 1000 columns and then run the insert with the conventional COPY command. Is there any way to use the header info of the CSV (the column names) to import this data directly, without creating a table first?
I am looking for an import similar to R's import.
It is probably not the solution you're expecting, but with Python you could read the column headers and create the table from the CSV very easily:
import pandas as pd
import psycopg2
from sqlalchemy import create_engine
# Read the csv to a dataframe
df = pd.read_csv('path_to_csv_file', index_col='name_of_index_column', sep=",")
# Connect and upload
engine = create_engine('postgresql+psycopg2://db_user_name:db_password@localhost:5432/' + 'db_name', client_encoding='utf8')
df.to_sql('table_name', engine, if_exists='replace', index=True, index_label='name_of_index_column')
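For a file this size it may be worth streaming the CSV in chunks rather than reading it all into memory at once; a minimal sketch using the same placeholder names as above:
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql+psycopg2://db_user_name:db_password@localhost:5432/db_name', client_encoding='utf8')

# Read and upload in 50,000-row chunks; the first chunk creates the table,
# subsequent chunks append to it.
for i, chunk in enumerate(pd.read_csv('path_to_csv_file', chunksize=50000)):
    chunk.to_sql('table_name', engine, if_exists='replace' if i == 0 else 'append', index=False)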

Export Django model data as MySQL file?

I'm writing a Django app that does scientific calculations for a client, and he wants to be able to export data as a MySQL dump file. Is there an easy way to do this or will I need to write a custom serializer?
No need to use Django for this. MySQL includes a command to do it for you:
mysqldump -u USERNAME -pPASSWORD DATABASE_NAME > DATABASE_FILE.sql
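If the dump has to be triggered from inside the Django app (say, as a download for the client), the same command can be shelled out to. A minimal sketch with placeholder credentials and a hypothetical view name:
import subprocess
from django.http import HttpResponse

def export_dump(request):
    # Hypothetical view: runs mysqldump and returns the output as a download.
    # In real code, read the credentials from settings.DATABASES instead.
    result = subprocess.run(
        ["mysqldump", "-u", "USERNAME", "-pPASSWORD", "DATABASE_NAME"],
        capture_output=True, check=True,
    )
    response = HttpResponse(result.stdout, content_type="application/sql")
    response["Content-Disposition"] = 'attachment; filename="dump.sql"'
    return response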
If the table structure that you want to export is constant (it likely is), then it would be easy to write a template to generate it as a text output. You can use either Django templates (they're not HTML-specific), or simply string interpolation.
Something like this:
def dump(w, qs):
    for r in qs:
        w.write("insert into tablename (fieldA, fieldB, fieldC) values ('%s', '%s', %d);\n" % (
            quote(r.fieldA), quote(r.fieldB), int(r.fieldC)))
This assumes fieldA and fieldB are strings, fieldC is an integer, and quote() is a MySQL-safe escaping function. The w parameter is a file-like object (it can be a Django HttpResponse object), and qs is the queryset with the wanted data.
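The answer leaves quote() undefined; one way to provide it is a minimal, hand-rolled escaper like the sketch below (it escapes only backslashes and single quotes; for anything serious, prefer the driver's own escaping or parameterised queries):
def quote(value):
    # Hypothetical helper: escape backslashes and single quotes so the value
    # can be embedded in a MySQL string literal.
    return str(value).replace("\\", "\\\\").replace("'", "\\'")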