How to query multiple times and close the connection at the end? - mysql

I'd like to open a connection to a MySQL database and retrieve data with several different queries. Do I need to close the connection every time I fetch data, or is there a better way to run multiple queries and close the connection only at the end?
Currently I do this:
db = dbConnect(MySQL(), user='root', password='1234', dbname='my_db', host='localhost')
query1=dbSendQuery(db, "select * from table1")
data1 = fetch(query1, n=10000)
query2=dbSendQuery(db, "select * from table2") ##ERROR !
and I get the error message:
Error in mysqlExecStatement(conn, statement, ...) :
RS-DBI driver: (connection with pending rows, close resultSet before continuing)
Now if I clear the result with dbClearResult(query1), I have to reconnect (dbConnect...).
Is there a better, more efficient way to fetch everything first instead of opening and closing the connection every time?

Try dbGetQuery(...) instead of dbSendQuery(...) followed by fetch(), like this:
db = dbConnect(MySQL(), user='root', password='1234', dbname='my_db', host='localhost')
query1=dbGetQuery(db, "select * from table1")
query2=dbGetQuery(db, "select * from table2")
From the help page:
The function ‘dbGetQuery’ does all these in one operation (submits the statement, fetches all output records, and clears the result set).
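To tie this back to the question, here is a minimal sketch of the whole lifecycle: one connection, several queries, and a single dbDisconnect() at the end. It assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database as a stand-in so that it runs without a server; the table contents are made up. With RMySQL you would keep the dbConnect(MySQL(), ...) call from the question instead.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the MySQL database.
db <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(db, "table1", data.frame(id = 1:3))
dbWriteTable(db, "table2", data.frame(id = 4:5))

# Option 1: dbGetQuery() submits, fetches and clears in one call.
data1 <- dbGetQuery(db, "select * from table1")

# Option 2: the manual route; clear each result set before the next query.
res <- dbSendQuery(db, "select * from table2")
data2 <- dbFetch(res, n = -1)  # n = -1 fetches all remaining rows
dbClearResult(res)

# Close the connection once, after all queries are done.
dbDisconnect(db)
```

Either option leaves the connection open between queries, so nothing has to be reopened until dbDisconnect() is called.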

Values are not inserted into MySQL table using pool.apply_async in python2.7

I am trying to run the following code to populate a table in parallel for a certain application. First, the following function is defined; it is supposed to connect to my database and execute the SQL command with the given values (to insert into the table).
def dbWriter(sql, rows):
    # load cnf file
    MYSQL_CNF = os.path.abspath('.') + '/mysql.cnf'
    conn = MySQLdb.connect(db='dedupe',
                           charset='utf8',
                           read_default_file=MYSQL_CNF)
    cursor = conn.cursor()
    cursor.executemany(sql, rows)
    conn.commit()
    cursor.close()
    conn.close()
And then there is this piece:
pool = dedupe.backport.Pool(processes=2)
done = False
while not done:
    chunks = (list(itertools.islice(b_data, step)) for step in
              [step_size] * 100)
    results = []
    for chunk in chunks:
        print len(chunk)
        results.append(pool.apply_async(dbWriter,
                                        ("INSERT INTO blocking_map VALUES (%s, %s)",
                                         chunk)))
    for r in results:
        r.wait()
    if len(chunk) < step_size:
        done = True
pool.close()
Everything runs without errors, but at the end my table is empty, meaning the insertions somehow did not succeed. I have tried many things to fix this (including adding column names to the INSERT statement) after many Google searches, without success. Any suggestions would be appreciated. (I am running the code in Python 2.7 on gcloud (Ubuntu). Note that the indentation may be a bit messed up after pasting here.)
Please also note that "chunk" follows exactly the required data format.
Note. This is part of this example
Please note that the only thing I am changing in the above example (linked) is that I am separating the steps for creation of and inserting into the tables since I am running my code on gcloud platform and it enforces GTID standards.
The solution was to change the body of the dbWriter function to:
conn = MySQLdb.connect(host='...',    # host ip
                       user='...',    # username
                       passwd='...',  # password
                       db='dedupe')
cursor = conn.cursor()
cursor.executemany(sql, rows)
cursor.close()
conn.commit()
conn.close()

Looping over database connections

I have twelve MySQL database connections created using:
mydb1 = dbConnect(MySQL(), user='user', password=password, dbname='db',host='domain')
mydb2...
mydb3...
...
mydb12...
I have a script where I want to execute the same query on all 12 databases and loop through them. How do I pass the dbConnect objects successfully to a dbSendQuery?
items <- ls()[grep("mydb",ls())]
query <- dbSendQuery(items[1], "SELECT * FROM table")
gives me the error:
Error in (function (classes, fdef, mtable) : unable to find an
inherited method for function ‘dbSendQuery’ for signature
‘"character", "character"’
You cannot pass the textual representation of a connection object to the database functions. Your call is analogous to dbSendQuery("mydb1", "select * from table"), which I'm guessing you would not have typed in literally.
Ultimately you want to deal with a list of connections, which you can form manually with
conns <- list(mydb1, mydb2, ...)
but if that is difficult or you want to be more programmatic about it, try
conns <- lapply(ls()[grep("mydb",ls())], get)
and then
results <- lapply(conns, function(con) dbSendQuery(con, "select * from ..."))
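As a rough end-to-end sketch of the approach above: the connection objects are collected with mget(), the same query is run on each one with dbGetQuery() (which returns the data directly rather than a pending result set), and every connection is closed afterwards. Three in-memory SQLite databases stand in for mydb1 ... mydb12, and the table name tbl and its contents are hypothetical; this assumes the DBI and RSQLite packages are installed.

```r
library(DBI)

# Assumption: in-memory SQLite connections play the role of mydb1 ... mydb12.
mydb1 <- dbConnect(RSQLite::SQLite(), ":memory:")
mydb2 <- dbConnect(RSQLite::SQLite(), ":memory:")
mydb3 <- dbConnect(RSQLite::SQLite(), ":memory:")

# Collect the connection objects themselves, not their names.
env <- environment()
conns <- mget(ls(envir = env, pattern = "^mydb[0-9]+$"), envir = env)

# Give each database a small table so the query has something to return.
for (con in conns) dbWriteTable(con, "tbl", data.frame(x = 1:2))

# Same query on every connection; one data frame per database.
results <- lapply(conns, function(con) dbGetQuery(con, "select * from tbl"))

# Close every connection when finished.
invisible(lapply(conns, dbDisconnect))
```

Because mget() returns a named list, results is named after the connection variables, which makes it easy to tell which database each data frame came from.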

Multiple DB connection in R

I was wondering if someone could help with this annoying issue.
I'm trying to create multiple connections to different databases.
I have a data.frame with 3 connection credentials named conf - It works if I manually enter the connections variable like so:
conn <- dbConnect(MySQL(), user=conf$user, password=conf$passws, host=conf$host, dbname=conf$db)
which ends up creating a single connection.
However, what I want is to be able to refer to the connection as:
conf$conn <- dbConnect(MySQL(), user=conf$user, password=conf$passws, host=conf$host, dbname=conf$db)
here is the error message I'm getting.
Error in rep(value, length.out = nrows) :
attempt to replicate an object of type 'S4'
I think the problem is how I'm adding conf$conn
I used a combination of the pool and config packages to solve a similar problem: setting up several simultaneous PostgreSQL connections. Note that this solution needs a config.yml file with the connection properties for db1 and db2.
library(pool)
library(RPostgreSQL)
connect <- function(cfg) {
  config <- config::get(config = cfg)
  dbPool(
    drv = dbDriver("PostgreSQL", max.con = 100),
    dbname = config$dbname,
    host = config$host,
    port = config$port,
    user = config$user,
    password = config$password
  )
}
conn <- lapply(c("db1", "db2"), connect)
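If pooling is more than you need, a plain list may be enough. The error in the question appears to come from assigning a single S4 connection object to a data frame column, which R tries to replicate across the rows. The sketch below instead makes one dbConnect() call per row of conf and keeps the results in a named list. The conf contents here are hypothetical, and in-memory SQLite is used as a stand-in so the code runs without a server (assumes DBI and RSQLite are installed).

```r
library(DBI)

# Hypothetical credentials table; SQLite only needs the `db` column.
conf <- data.frame(
  name = c("first", "second", "third"),
  db = rep(":memory:", 3),
  stringsAsFactors = FALSE
)

# One dbConnect() call per row. For MySQL the call would instead pass
# user = conf$user[i], password = conf$passws[i], host = conf$host[i],
# dbname = conf$db[i], as in the question.
conns <- lapply(seq_len(nrow(conf)), function(i) {
  dbConnect(RSQLite::SQLite(), conf$db[i])
})
names(conns) <- conf$name

# Refer to a connection by name.
one <- dbGetQuery(conns[["second"]], "select 1 as ok")

# Close all connections when done.
invisible(lapply(conns, dbDisconnect))
```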

How to write entire dataframe into mySql table in R

I have a data frame containing columns 'Quarter' having values like "16/17 Q1", "16/17 Q2"... and 'Vendor' having values like "a", "b"... .
I am trying to write this data frame into database using
query <- paste("INSERT INTO cc_demo (Quarter,Vendor) VALUES(dd$FY_QUARTER,dd$VENDOR.x)")
but it throws this error:
Error in .local(conn, statement, ...) :
could not run statement: Unknown column 'dd$FY_QUARTER' in 'field list'
I am new to RMySQL. Could you suggest a way to write the entire data frame?
To write a data frame to a MySQL database, first create a connection to the database. You need to specify:
MySQL connection
User
Password
Host
Database name
library("RMySQL")
connection <- dbConnect(MySQL(), user = 'root', password = 'password', host = 'localhost', dbname = 'TheDB')
Using the connection create a table and then export data to the database
dbWriteTable(connection, "testTable", testTable)
You can overwrite an existing table like this:
dbWriteTable(connection, "testTable", testTable_2, overwrite=TRUE)
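For the situation in the question, where the table cc_demo already exists and the data frame's column names differ from the table's, a possible variant is to rename the columns and append. This is only a sketch: in-memory SQLite stands in for the MySQL connection (assumes DBI and RSQLite are installed), and the table definition and data values are made up.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the MySQL connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Existing target table, with the column names from the question.
dbExecute(con, "create table cc_demo (Quarter text, Vendor text)")

# Data frame with the question's column names (values are made up).
dd <- data.frame(FY_QUARTER = c("16/17 Q1", "16/17 Q2"),
                 VENDOR.x = c("a", "b"),
                 stringsAsFactors = FALSE)

# Rename the columns to match the table, then append the rows.
to_write <- setNames(dd[, c("FY_QUARTER", "VENDOR.x")], c("Quarter", "Vendor"))
dbWriteTable(con, "cc_demo", to_write, append = TRUE, row.names = FALSE)

# Check how many rows arrived, then close the connection.
n_rows <- dbGetQuery(con, "select count(*) as n from cc_demo")$n
dbDisconnect(con)
```

Here append = TRUE adds rows to the existing table instead of replacing it, and row.names = FALSE keeps R's row names from being written as an extra column.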
I would advise against writing sql query when you can actually use very handy functions such as dbWriteTable from the RMySQL package. But for the sake of practice, below is an example of how you should go about writing the sql query that does multiple inserts for a MySQL database:
# Set up a data.frame
dd <- data.frame(Quarter = c("16/17 Q1", "16/17 Q2"), Vendors = c("a","b"))
# Begin the query
sql_qry <- "insert into cc_demo (Quarter,Vendor) VALUES"
# Finish it with
sql_qry <- paste0(sql_qry, paste(sprintf("('%s', '%s')", dd$Quarter, dd$Vendors), collapse = ","))
You should get:
"insert into cc_demo (Quarter,Vendor) VALUES('16/17 Q1', 'a'),('16/17 Q2', 'b')"
You can provide this query to your database connection in order to run it.
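One caveat with pasting values into the statement by hand: a value containing a single quote breaks the query. A hedged variant of the same idea lets DBI do the quoting with dbQuoteString(). The sketch uses DBI::ANSI() as a dummy connection so it runs without a database; with a live connection you would pass that connection instead, so the driver can apply its own escaping rules. The vendor name O'Brien is a made-up example.

```r
library(DBI)

# Same shape as the data frame above, with one awkward (made-up) value.
dd <- data.frame(Quarter = c("16/17 Q1", "16/17 Q2"),
                 Vendors = c("a", "O'Brien"),
                 stringsAsFactors = FALSE)

# Assumption: ANSI() is a dummy connection that applies ANSI SQL quoting.
con <- ANSI()

# Quote every value, then build one "(…, …)" group per row.
rows <- sprintf("(%s, %s)",
                as.character(dbQuoteString(con, dd$Quarter)),
                as.character(dbQuoteString(con, dd$Vendors)))

sql_qry <- paste0("insert into cc_demo (Quarter,Vendor) VALUES",
                  paste(rows, collapse = ","))
```

The resulting string has the same form as the query above, with the embedded quote doubled so that it stays inside the string literal.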
I hope this helps.

Empty result for sql query in Rstudio-server

I'm trying to get data from MySQL DB into Rstudio-server. My actions are like
mydb = dbConnect(MySQL(), user='user', password='password', dbname='dbname', host='localhost')
query <- stri_paste('select sellings.updated_at AS Up_Date, concat(item_parameters.title, " ", ad_attributes.int_value) AS Class, CONCAT(geos.name, " ", geos.kind) AS place, geos.lon, geos.lat, sellings.price AS price, ((geo_routes.distance*2/1000 + 100)) AS delivery_cost FROM sellings, users, item_parameters, ad_attributes, geos, geo_routes WHERE users.encrypted_password!="" && item_parameters.title="Класс" && sellings.price IS NOT NULL && ad_attributes.int_value IS NOT NULL AND users.id=sellings.user_id AND item_parameters.id=ad_attributes.item_parameter_id AND sellings.id = ad_attributes.ad_id AND sellings.geo_guid = geos.guid AND geos.routable_guid = geo_routes.src_guid AND geo_routes.distance = (SELECT geo_routes.distance FROM geo_routes, geos WHERE geos.guid = sellings.geo_guid AND geo_routes.src_guid = geos.routable_guid AND geo_routes.dst_guid = (SELECT geos.routable_guid FROM geos WHERE geos.name = "Воронеж" && geos.kind = "г")) ORDER BY Up_Date;')
rs = dbGetQuery(mydb, query)
And I get an empty dataframe. But when I do the same with my local DB everything is OK. The query takes a pretty long time, about 3 minutes, but it works properly. Moreover the same query works right from the command line in MySQL. On the server, it takes about 4 seconds. OS of server is Debian 7, OS of local machine is Win 8. Any idea?
Sometimes, when querying from the command line, the default schema has been set by a previous command. That setting does not carry over to R, so the exact same query that works on the command line might not work in an R session. It may be worth checking the dbname.
Insert the statements below into your SQL query. Note that these are SQL Server (T-SQL) settings; MySQL does not accept them.
SET NOCOUNT ON
SET ANSI_WARNINGS OFF
It worked for me