How to close sqlalchemy connection in MySQL - mysql

This is a sample code I'd like to run:
for i in range(1,2000):
db = create_engine('mysql://root#localhost/test_database')
conn = db.connect()
#some simple data operations
conn.close()
db.dispose()
Is there a way of running this without getting "Too many connections" errors from MySQL?
I already know I can handle the connection otherwise or have a connection pool. I'd just like to understand how to properly close a connection from sqlalchemy.

Here's how to write that code correctly:
db = create_engine('mysql://root#localhost/test_database')
for i in range(1,2000):
conn = db.connect()
#some simple data operations
conn.close()
db.dispose()
That is, the Engine is a factory for connections as well as a pool of connections, not the connection itself. When you say conn.close(), the connection is returned to the connection pool within the Engine, not actually closed.
If you do want the connection to be actually closed, that is, not pooled, disable pooling via NullPool:
from sqlalchemy.pool import NullPool
db = create_engine('mysql://root#localhost/test_database', poolclass=NullPool)
With the above Engine configuration, each call to conn.close() will close the underlying DBAPI connection.
If OTOH you actually want to connect to different databases on each call, that is, your hardcoded "localhost/test_database" is just an example and you actually have lots of different databases, then the approach using dispose() is fine; it will close out every connection that is not checked out from the pool.
In all of the above cases, the important thing is that the Connection object is closed via close(). If you're using any kind of "connectionless" execution, that is engine.execute() or statement.execute(), the ResultProxy object returned from that execute call should be fully read, or otherwise explicitly closed via close(). A Connection or ResultProxy that's still open will prohibit the NullPool or dispose() approaches from closing every last connection.

Tried to figure out a solution to disconnect from database for an unrelated problem (must disconnect before forking).
You need to invalidate the connection from the connection Pool too.
In your example:
for i in range(1,2000):
db = create_engine('mysql://root#localhost/test_database')
conn = db.connect()
# some simple data operations
# session.close() if needed
conn.invalidate()
db.dispose()

I use this one
engine = create_engine('...')
with engine.connect() as conn:
conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS...")
engine.dispose()

In my case these always works and I am able to close!
So using invalidate() before close() makes the trick. Otherwise close() sucks.
conn = engine.raw_connection()
conn.get_warnings = True
curSql = xx_tmpsql
myresults = cur.execute(curSql, multi=True)
print("Warnings: #####")
print(cur.fetchwarnings())
for curresult in myresults:
print(curresult)
if curresult.with_rows:
print(curresult.column_names)
print(curresult.fetchall())
else:
print("no rows returned")
cur.close()
conn.invalidate()
conn.close()
engine.dispose()

Related

Lock wait timeout exceeded when calling pymysql executemany after disconnection

I have a reasonably large dataset of about 6,000,000 rows X 60 columns that I am trying to insert into a database. I am chunking them, and inserting them 10,000 at a time into a mysql database using a class I've written and pymysql. The problem is, I occasionally time out the server when writing, so I've modified my executemany call to re-connect on errors. This works fine for when I lose connection once, but if I lose the error a second time, I get a pymysql.InternalException stating that lock wait timeout exceeded. I was wondering how I could modify the following code to catch that and destroy the transaction completely before attempting again.
I've tried calling rollback() on the connection, but this causes another InternalException if the connection is destroyed because there is no cursor anymore.
Any help would be greatly appreciated (I also don't understand why I am getting the timeouts to begin with, but the data is relatively large.)
class Database:
def __init__(self, **creds):
self.conn = None
self.user = creds['user']
self.password = creds['password']
self.host = creds['host']
self.port = creds['port']
self.database = creds['database']
def connect(self, type=None):
self.conn = pymysql.connect(
host = self.host,
user = self.user,
password = self.password,
port = self.port,
database = self.database
)
def executemany(self, sql, data):
while True:
try:
with self.conn.cursor() as cursor:
cursor.executemany(sql, data)
self.conn.commit()
break
except pymysql.err.OperationalError:
print('Connection error. Reconnecting to database.')
time.sleep(2)
self.connect()
continue
return cursor
and I am calling it like this:
for index, chunk in enumerate(dataframe_chunker(df), start=1):
print(f"Writing chunk\t{index}\t{timer():.2f}")
db.executemany(insert_query, chunk.values.tolist())
Take a look at what MySQL is doing. The lockwait timeouts are because the inserts cannot be done until something else finishes, which could be your own code.
SELECT * FROM `information_schema`.`innodb_locks`;
Will show the current locks.
select * from information_schema.innodb_trx where trx_id = [lock_trx_id];
Will show the involved transactions
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST where id = [trx_mysql_thread_id];
Will show the involved connection and may show the query whose lock results in the lock wait timeout. Maybe there is an uncommitted transaction.
It is likely your own code, because of the interaction with your executemany function which catches exceptions and reconnects to the database. What of the prior connection? Does the lockwait timeout kill the prior connection? That while true is going to be trouble.
For the code calling executemany on the db connection, be more defensive on the try/except with something like:
def executemany(self, sql, data):
while True:
try:
with self.conn.cursor() as cursor:
cursor.executemany(sql, data)
self.conn.commit()
break
except pymysql.err.OperationalError:
print('Connection error. Reconnecting to database.')
if self.conn.is_connected():
connection.close()
finally:
time.sleep(2)
self.connect()
But the solution here will be to not induce lockwait timeouts if there are no other database clients.

How to Close a Connection When Using Jupyter SQL Magic?

I'm using SQL Magic to connect to a db2 instance. However, I can't seem to find the syntax anywhere on how to close the connection when I'm done querying the database.
you cannot explicitly close a connection using Jupyter SQL Magic. In fact, that is one of the shortcoming of using Jupyter SQL Magic to connect to DB2. You need to close your session to close the Db2 connection. Hope this helps.
This probably isn't very useful, and to the extent it is it's probably not guaranteed to work in the future. But if you need a really hackish way to close the connection, I was able to do it this way (for a postgres db, I assume it's similar for db2):
In[87]: connections = %sql -l
Out[87]: {'postgresql://ngd#node1:5432/graph': <sql.connection.Connection at 0x7effdbcf6b38>}
In[88]: conn = connections['postgresql://ngd#node1:5432/graph']
In[89]: conn.session.close()
In[90]: %sql SELECT 1
...
StatementError: (sqlalchemy.exc.ResourceClosedError) This Connection is closed
[SQL: SELECT 1]
[parameters: [{'__name__': '__main__', '__doc__': 'Automatically created module for IPython interactive environment', '__package__': None, '__loader__': None, '__s ... (123202 characters truncated) ... stgresql://ngd#node1:5432/graph']", '_i28': "conn = connections['postgresql://ngd#node1:5432/graph']\nconn.session.close()", '_i29': '%sql SELECT 1'}]]
A big problem is--if you want to reconnect, that doesn't seem to work. Even after running %reload_ext sql, and trying to connect again, it still thinks the connection is closed when you try to use it. So unless someone knows how to fix that behavior, this is only useful for disconnecting if you don't want to re-connect again (to the same db with the same params) before restarting the kernel.
You can also restart the kernel.
This is the most simple way I've found to close all connections at the end of the session. You must restart the kernel to be able to re-establish the connection.
connections = %sql -l
[c.session.close() for c in connections.values()]
sorry for being to late but I've just started with working with SQL Magic and got annoyed with the constant errors appearing. It's a bit of a awkward patch but this helped me use it.
def multiline_qry(qry):
try:
%sql {qry}
except Exception as ex:
if str(type(ex).__name__) != 'ResourceClosedError':
template = "An exception of type {0} occurred. Arguments:\n{1!r}"
message = template.format(type(ex).__name__, ex.args)
print (message)
qry = '''DROP TABLE IF EXISTS EMPLOYEE;
CREATE TABLE EMPLOYEE(firstname varchar(50),lastname varchar(50));
INSERT INTO EMPLOYEE VALUES('Tom','Mitchell'),('Jack','Ryan');
'''
multiline_qry(qry)
log out the notebook first if you want to close the connection.

In Shiny app, how to refresh data from MySQL db every 10 minutes if any change occured

I've made dashboard using shiny, shinydashboard and RMySQL package.
Following is what I wrote in order to refresh data every 10 minutes if any change occured.
In global.R
con = dbConnect(MySQL(), host, user, pass, db)
check_func <- function() {dbGetQuery(con, check_query}
get_func <- function() {dbGetQuery(con, get_query}
In server.R
function(input, output, session) {
# check every 10 minutes for any change
data <- reactivePoll(10*60*1000, session, checkFunc = check_func, valueFunc = get_func)
session$onSessionEnded(function() {dbDisconnect(con)})
However, above code infrequently generates corrupt connection handle error from check_func.
Warning: Error in .local: internal error in RS_DBI_getConnection: corrupt connection handle
Should I put dbConnect code inside server function?
Any better ideas?
link: using session$onsessionend to disconnect rshiny app from the mysql server
"pool" package is the answer: http://shiny.rstudio.com/articles/pool-basics.html
This adds a new level of abstraction when connecting to a database: instead of directly fetching a connection from the database, you will create an object (called a pool) with a reference to that database. The pool holds a number of connections to the database. ... Each time you make a query, you are querying the pool, rather than the database. ... You never have to create or close connections directly: the pool knows when it should grow, shrink or keep steady.
I've got answer from here. -> https://stackoverflow.com/a/39661853/4672289

Matlab Database Toolbox - Warning: com.mysql.jdbc.Connection#6e544a45 is not serializable

I'm connecting to a MySQL database through the Matlab Database Toolbox in order to run the same query over and over again within 2 nested for loops. After each iteration I get this warning:
Warning: com.mathworks.toolbox.database.databaseConnect#26960369 is not serializable
In Import_Matrices_DOandT_julaugsept_inflow_nomettsed at 476
Warning: com.mysql.jdbc.Connection#6e544a45 is not serializable
In Import_Matrices_DOandT_julaugsept_inflow_nomettsed at 476
Warning: com.mathworks.toolbox.database.databaseConnect#26960369 not serializable
In Import_Matrices_DOandT_julaugsept_inflow_nomettsed at 476
Warning: com.mysql.jdbc.Connection#6e544a45 is not serializable
In Import_Matrices_DOandT_julaugsept_inflow_nomettsed at 476
My code is basically structured like this:
%Server
host =
user =
password =
dbName =
%# JDBC parameters
jdbcString = sprintf('jdbc:mysql://%s/%s', host, dbName);
jdbcDriver = 'com.mysql.jdbc.Driver';
%# Create the database connection object
conn = database(dbName, user , password, jdbcDriver, jdbcString);
setdbprefs('DataReturnFormat', 'numeric');
%Loop
for SegmentNum=3:41;
for tl=1:15;
tic;
sqlquery=['giant string'];
results = fetch(conn, sqlquery);
(some code here that saves the results into a few variables)
save('inflow.mat');
end
end
time = toc
close(conn);
clear conn
Eventually, after some iterations the code will crash with this error:
Error using database/fetch (line 37)
Query execution was interrupted
Error in Import_Matrices_DOandT_julaugsept_inflow_nomettsed (line
466)
results = fetch(conn, sqlquery);
Last night it errored after 25 iterations. I have about 600 iterations total I need to do, and I don't want to have to keep checking back on it every 25. I've heard there can be memory issues with database connection objects...is there a way to keep my code running?
Let's take this one step at a time.
Warning: com.mathworks.toolbox.database.databaseConnect#26960369 is not serializable
This comes from this line
save('inflow.mat');
You are trying to save the database connection. That doesn't work. Try specifying the variables you wish to save only, and it should work better.
There are a couple of tricks to excluding the values, but honestly, I suggest you just find the most important variables you wish to save, and save those. But if you wish, you can piece together a solution from this page.
save inflow.mat a b c d e
Try wrapping the query in a try catch block. Whenever you catch an error reset the connection to the database which should free up the object.
nQuery = 100;
while(nQuery>0)
try
query_the_database();
nQuery = nQuery - 1;
catch
reset_database_connection();
end
end
The ultimate main reason for this is that database connection objects are TCP/IP ports and multiple processes cannot access the same port. That is why database connection object are not serialized. Ports cannot be serialized.
Workaround is to create a connection with in the for loop.

Adding output converter to pyodbc connection in SQLAlchemy

using:
Python 2.7.3
SQLAlchemy 0.7.8
PyODBC 3.0.3
I have implemented my own Dialect for the EXASolution DB using PyODBC as the underlying db driver. I need to make use of PyODBC's output_converter function to translate DECIMAL(x, 0) columns to integers/longs.
The following code snippet does the trick:
pyodbc = self.dbapi
dbapi_con = connection.connection
dbapi_version = dbapi_con.getinfo(pyodbc.SQL_DRIVER_VER)
(major, minor, patch) = [int(x) for x in dbapi_version]
if major >= 3:
dbapi_con.add_output_converter(pyodbc.SQL_DECIMAL, self.decimal2int)
I have placed this code snippet in the initialize(self, connection) method of
class EXADialect_pyodbc(PyODBCConnector, EXADialect):
Code gets called, and no exception is thrown, but this is a one time initialization. Later on, other connections are created. These connections are not passed through my initialization code.
Does anyone have a hint on how connection initialization works with SQLAlchemy, and where to place my code so that it gets called for every new connection created?
This is an old question, but something I hit recently, so an updated answer may help someone else along the way. In my case, I was trying to automatically downcase mssql UNIQUEIDENTIFIER columns (guids).
You can grab the raw connection (pyodbc) through the session or engine to do this:
engine = create_engine(connection_string)
make_session = sessionmaker(engine)
...
session = make_session()
session.connection().connection.add_output_converter(SQL_DECIMAL, decimal2int)
# or
connection = engine.connect().connection
connection.add_output_converter(SQL_DECIMAL, decimal2int)