Closing active connections using RMySQL

As per my question earlier today, I suspect I have an issue with unclosed connections that is blocking data from being inserted into my MySQL database. Data is still getting into tables that are not currently in use, which is why I suspect that many open connections are preventing uploads into that particular table.
I am using RMySQL on Ubuntu servers to upload data onto a MySQL database.
I'm looking for a way to (a) determine whether connections are open and (b) close them if they are. Running exec sp_who and exec sp_who2 from the SQL command line returns a syntax error (those are SQL Server procedures, not MySQL).
Another note: I am able to connect, complete the upload process, and end the R session successfully, yet there is no data on the server (checked via the SQL command line) when I try only that table.
(By the way: if all else fails, would simply deleting the table and creating a new one with the same name fix it? It would be quite a pain, but doable.)

a. dbListConnections(dbDriver("MySQL"))
b. dbDisconnect(dbListConnections(dbDriver("MySQL"))[[index of the MySQLConnection you want to close]]). To close them all: lapply(dbListConnections(dbDriver("MySQL")), dbDisconnect)
Yes, you could just rewrite the table, though of course you would lose all existing data. Or you can specify dbWriteTable(..., overwrite = TRUE).
I would also play with the other options, like row.names, header, field.types, quote, sep, and eol. I've had a lot of weird behavior in RMySQL as well. I can't remember specifics, but it seems like I've had no error message when I had done something wrong, like forgetting to set row.names. HTH
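For example, a minimal sketch of such an upload (the connection details, table name, and data frame are placeholders, not from the question):
library(RMySQL)
con <- dbConnect(MySQL(), user = "user", password = "pass",
                 dbname = "mydb", host = "localhost")
df <- data.frame(id = 1:3, value = c("a", "b", "c"))  # stand-in for your data
# overwrite = TRUE drops and recreates the table instead of appending;
# row.names = FALSE keeps RMySQL from adding a row_names column
dbWriteTable(con, "my_table", df, overwrite = TRUE, row.names = FALSE)
dbDisconnect(con)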

Close all active connections:
dbDisconnectAll <- function(){
  ile <- length(dbListConnections(MySQL()))
  lapply(dbListConnections(MySQL()), function(x) dbDisconnect(x))
  cat(sprintf("%s connection(s) closed.\n", ile))
}
executing:
dbDisconnectAll()

Simplest:
lapply(dbListConnections(dbDriver("MySQL")), dbDisconnect)
List all connections and disconnect them with lapply.

Closing a connection
You can use dbDisconnect() together with dbListConnections() to disconnect those connections RMySQL is managing:
all_cons <- dbListConnections(MySQL())
for (con in all_cons)
  dbDisconnect(con)
Check that all connections have been closed:
dbListConnections(MySQL())
You could also kill any connection you're allowed to (not just those managed by RMySQL):
dbGetQuery(mydb, "show processlist")
where mydb is:
mydb = dbConnect(MySQL(), user = 'user_id', password = 'password',
                 dbname = 'db_name', host = 'host')
Close a particular connection
dbGetQuery(mydb, "kill 2")
dbGetQuery(mydb, "kill 5")

lapply(dbListConnections(MySQL()), dbDisconnect)

In current releases the dbListConnections function is deprecated, and DBI no longer requires drivers to maintain a list of connections. As such, the solutions above may no longer work; in RMariaDB, for example, they raise errors.
I came up with the following alternative, which uses the MySQL server's own functionality and should work with current DBI / driver versions:
### listing all open connection to a server with open connection
query <- dbSendQuery(mydb, "SHOW processlist;")
processlist <- dbFetch(query)
dbClearResult(query)
### getting the id of your current connection so that you don't close that one
query <- dbSendQuery(mydb, "SELECT CONNECTION_ID();")
current_id <- dbFetch(query)
dbClearResult(query)
### making a list with all other open processes by a particular set of users
# E.g. when you are working on Amazon Web Services you might not want to close
# the "rdsadmin" connection to the AWS console. Here e.g. I choose only "admin"
# connections that I opened myself. If you really want to kill all connections,
# just delete the "processlist$User == "admin" &" bit.
queries <- paste0("KILL ",processlist[processlist$User == "admin" & processlist$Id != current_id[1,1],"Id"],";")
### making function to kill connections
kill_connections <- function(x) {
  query <- dbSendQuery(mydb, x)
  dbClearResult(query)
}
### killing other connections
lapply(queries, kill_connections)
### killing current connection
dbDisconnect(mydb)
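As a side note, DBI's dbExecute() sends a statement and clears its result in one call, so the kill step above could be written more compactly (a sketch, assuming the same queries vector while mydb is still open):
lapply(queries, function(q) dbExecute(mydb, q))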

Related

How to Close a Connection When Using Jupyter SQL Magic?

I'm using SQL Magic to connect to a db2 instance. However, I can't seem to find the syntax anywhere on how to close the connection when I'm done querying the database.
You cannot explicitly close a connection using Jupyter SQL Magic. In fact, that is one of the shortcomings of using Jupyter SQL Magic to connect to DB2: you need to close your session to close the Db2 connection. Hope this helps.
This probably isn't very useful, and to the extent it is, it's probably not guaranteed to work in the future. But if you need a really hackish way to close the connection, I was able to do it this way (for a Postgres db; I assume it's similar for DB2):
In[87]: connections = %sql -l
Out[87]: {'postgresql://ngd@node1:5432/graph': <sql.connection.Connection at 0x7effdbcf6b38>}
In[88]: conn = connections['postgresql://ngd@node1:5432/graph']
In[89]: conn.session.close()
In[90]: %sql SELECT 1
...
StatementError: (sqlalchemy.exc.ResourceClosedError) This Connection is closed
[SQL: SELECT 1]
[parameters: [{'__name__': '__main__', '__doc__': 'Automatically created module for IPython interactive environment', '__package__': None, '__loader__': None, '__s ... (123202 characters truncated) ... stgresql://ngd@node1:5432/graph']", '_i28': "conn = connections['postgresql://ngd@node1:5432/graph']\nconn.session.close()", '_i29': '%sql SELECT 1'}]]
A big problem is: if you want to reconnect, that doesn't seem to work. Even after running %reload_ext sql and trying to connect again, it still thinks the connection is closed when you try to use it. So unless someone knows how to fix that behavior, this is only useful for disconnecting if you don't want to reconnect (to the same db with the same params) before restarting the kernel.
You can also restart the kernel.
This is the simplest way I've found to close all connections at the end of a session. You must restart the kernel to be able to re-establish the connection.
connections = %sql -l
[c.session.close() for c in connections.values()]
Sorry for being late, but I've just started working with SQL Magic and got annoyed with the constant errors appearing. It's a bit of an awkward patch, but this helped me use it.
def multiline_qry(qry):
    try:
        %sql {qry}
    except Exception as ex:
        if str(type(ex).__name__) != 'ResourceClosedError':
            template = "An exception of type {0} occurred. Arguments:\n{1!r}"
            message = template.format(type(ex).__name__, ex.args)
            print(message)
qry = '''DROP TABLE IF EXISTS EMPLOYEE;
CREATE TABLE EMPLOYEE(firstname varchar(50),lastname varchar(50));
INSERT INTO EMPLOYEE VALUES('Tom','Mitchell'),('Jack','Ryan');
'''
multiline_qry(qry)
Log out of the notebook first if you want to close the connection.

Connection lost when connect R to MySQL using collect

I would like to use dplyr and RMySQL to work with my large data. There are no issues with the dplyr code; the problem (I think) is with exporting data out of MySQL to R. My connection is dropped every time, even though I am using n = Inf in collect. In theory my data should have more than 50K rows, but I can only get around 15K back. Any suggestions are appreciated.
Approach 1
library(dplyr)
library(RMySQL)
# Connect to a database and select a table
my_db <- src_mysql(dbname='aermod_1', host = "localhost", user = "root", password = "")
my_tbl <- tbl(my_db, "db_table")
out_summary_station_raw <- select(my_tbl, -c(X, Y, AVERAGE_CONC))
out_station_mean_local <- collect(out_summary_station_raw)
Approach 2: using Pool
library(pool)
library(RMySQL)
library(dplyr)
pool <- dbPool(
  drv = RMySQL::MySQL(),
  dbname = "aermod_1",
  host = "localhost",
  username = "root",
  password = ""
)
out_summary_station_raw <- src_pool(pool) %>% tbl("aermod_final") %>% select(-c(X, Y, AVERAGE_CONC))
out_station_mean_local <- collect(out_summary_station_raw, n = Inf)
Warning message (both approaches):
Warning messages:
1: In dbFetch(res, n) : error while fetching rows
2: Only first 15,549 results retrieved. Use n = Inf to retrieve all.
Update:
Checked the log and it looks fine from the server side. For my example, the slow log says Query_time: 79.348351 Lock_time: 0.000000 Rows_sent: 15552 Rows_examined: 16449696, but collect just could not retrieve the full data. I am able to replicate the same behavior using MySQL Workbench.
As discussed in https://github.com/tidyverse/dplyr/issues/1968,
https://github.com/tidyverse/dplyr/blob/addb214812f2f45f189ad2061c96ea7920e4db7f/NEWS.md and https://github.com/tidyverse/dplyr/commit/addb214812f2f45f189ad2061c96ea7920e4db7f, this package issue seems to have been addressed.
What version of the dplyr package are you using?
After the most recent RMySQL update, I noticed that I could not collect() data from large views, and I reported it as an issue. Your problem might be related.
One thing to try is to roll back to the previous version:
devtools::install_version("RMySQL", version = "0.10.9",
                          repos = "http://cran.us.r-project.org")

In Shiny app, how to refresh data from MySQL db every 10 minutes if any change occurred

I've made a dashboard using shiny, shinydashboard, and the RMySQL package.
The following is what I wrote in order to refresh the data every 10 minutes if any change occurred.
In global.R
con = dbConnect(MySQL(), host, user, pass, db)
check_func <- function() { dbGetQuery(con, check_query) }
get_func <- function() { dbGetQuery(con, get_query) }
In server.R
function(input, output, session) {
  # check every 10 minutes for any change
  data <- reactivePoll(10*60*1000, session, checkFunc = check_func, valueFunc = get_func)
  session$onSessionEnded(function() { dbDisconnect(con) })
}
However, the above code infrequently generates a corrupt connection handle error from check_func:
Warning: Error in .local: internal error in RS_DBI_getConnection: corrupt connection handle
Should I put the dbConnect code inside the server function?
Any better ideas?
link: using session$onsessionend to disconnect rshiny app from the mysql server
"pool" package is the answer: http://shiny.rstudio.com/articles/pool-basics.html
This adds a new level of abstraction when connecting to a database: instead of directly fetching a connection from the database, you will create an object (called a pool) with a reference to that database. The pool holds a number of connections to the database. ... Each time you make a query, you are querying the pool, rather than the database. ... You never have to create or close connections directly: the pool knows when it should grow, shrink or keep steady.
I got the answer from here: https://stackoverflow.com/a/39661853/4672289
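Applied to the code above, a minimal sketch of the pool approach (the connection details are placeholders; check_query and get_query are the same query strings assumed in global.R):
library(shiny)
library(pool)
library(RMySQL)
pool <- dbPool(RMySQL::MySQL(), dbname = "db_name", host = "host",
               username = "user", password = "pass")
function(input, output, session) {
  # queries go through the pool, which manages the underlying connections
  data <- reactivePoll(10*60*1000, session,
                       checkFunc = function() dbGetQuery(pool, check_query),
                       valueFunc = function() dbGetQuery(pool, get_query))
}
# close the pool when the whole app stops, not per session
onStop(function() poolClose(pool))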

Multiple (simultaneous) users in R Shiny with access to MySQL database

I have a Shiny application (hosted on shinyapps.io) that records a user's clicks of certain actionButtons to a MySQL database. I'd love some advice on a few things:
where to put the dbConnect code (i.e. inside or outside the shinyServer function)
when to close the connection (as I was running into the problem of too many open connections)
Each addition to the database just adds a new row, so users aren't accessing and modifying the same elements. The reason I ask is that I was running into the problem of multiple users not being able to use the app at the same time (with the error "Disconnected from server"), and I wasn't sure if it came from the MySQL connections.
Thank you!
Someone in the comments posted about the pool package, which serves this exact purpose! Here are the relevant parts of my server.R code:
library(shiny)
library(RMySQL)
library(pool)
pool <- dbPool(
  drv = RMySQL::MySQL(),
  user = 'username',
  password = 'password',
  dbname = 'words',
  host = 'blahblahblah')
shinyServer(function(input, output) {
  ## function to write to the database
  writeToDB <- function(word, vote){
    query <- paste("INSERT INTO word_votes (vote, word) VALUES (",
                   vote, ", '", word, "');", sep = "")
    conn <- poolCheckout(pool)
    res <- dbSendQuery(conn, query)
    dbClearResult(res)  # free the result set before returning the connection
    poolReturn(conn)    # return the connection to the pool (no reassignment)
    ## rest of code
  }
I added the poolCheckout() and poolReturn() calls to get this running successfully and to prevent leaked connections.
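As a design note (a sketch, not part of the original answer): for one-off statements you can usually skip the manual checkout and run the statement against the pool object itself, letting pool handle checkout and return:
dbExecute(pool, query)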

Problem with querying in Amazon RDS

Hi, I have a Python script that connects to an Amazon RDS machine and checks for new entries.
My script works perfectly against localhost, but on RDS it does not detect the new entry; once I cancel the script and run it again, I get the new entry. For testing I tried it out like this:
import MySQLdb

cont = MySQLdb.connect("localhost", "root", "password", "DB")
cursor = cont.cursor()
for i in range(0, 100):
    cursor.execute("Select count(*) from box")
    A = cursor.fetchone()
    print A
During this process, when I add a new entry it is not detected, but when I close the connection and run the script again, I do get the new entry. Why is this? I checked the cache and it was at 0. What else am I missing?
I have seen this happen in MySQL command-line clients as well.
My understanding (from other people linking to this URL) is that Python's DB API often silently creates transactions: http://www.python.org/dev/peps/pep-0249/
If that is true, then your cursor is looking at a consistent snapshot of the data, even after another transaction adds rows. You could try calling rollback on the connection (cont.rollback()) inside the for loop to end the implicit transaction the SELECT is running in.
I found the solution to this: it is due to the transaction isolation level in MySQL. All I had to do was set the default transaction isolation level: transaction-isolation = READ-COMMITTED.
And since I am using Django, I had to add this to the Django database settings:
'OPTIONS': {
    "init_command": "SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED"
}