Problem inserting data into MySQL from R with a loop - mysql

Good afternoon, and thanks for reading. I currently enter data into MySQL manually and am trying to automate the process with a loop in R, but after running the query and checking the table in MySQL, I don't see any new observations added.
Basically, this is the code I'm using to insert the observations (the values come from a data frame I'm looping over):
library(dplyr)
library(odbc)
library(DBI)
library(RMySQL)
connection <- dbConnect(RMySQL::MySQL(),
                        dbname = "xxx",
                        host = "xxx",
                        port = xxx,
                        user = "xxx",
                        password = "xxx")
payout_history <- paste0("INSERT INTO `payout_history` (`uuid`, `trip_id`, `delivery_order_id`, `amount`, `payout_type_id`, `payout_invoice_id`) ",
                         "VALUES (uuid(), '", id, "', '", order_id, "', '", amount, "', '7', '", head(payout_invoice$id, 1), "');")
dbExecute(connection, payout_history)
The query works when I run it in the MySQL manager, but in the R console the dbExecute() function doesn't add anything. Do you know if I should use another function to insert observations?
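One thing worth checking: dbExecute() returns the number of affected rows, so capture that value to confirm the insert actually reached the server, and make sure both sessions point at the same database. As a minimal sketch of an alternative (same connection and variables as above; DBI's sqlInterpolate() escapes the values for you):
library(DBI)
# Build the statement with properly escaped values, then execute it
sql <- sqlInterpolate(connection,
                      "INSERT INTO payout_history (uuid, trip_id, delivery_order_id, amount, payout_type_id, payout_invoice_id)
                       VALUES (uuid(), ?id, ?order_id, ?amount, 7, ?invoice_id)",
                      id = id, order_id = order_id, amount = amount,
                      invoice_id = head(payout_invoice$id, 1))
rows <- dbExecute(connection, sql)  # number of affected rows; should be 1
dbDisconnect(connection)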

Related

RMySQL: Error in as.character.default() :

I am trying to use the following function in R:
heritblup <- function(name) {
  library(RMySQL)
  library(DBI)
  con <- dbConnect(RMySQL::MySQL(),
                   dbname = "mydab",
                   host = "localhost",
                   port = 3306,
                   user = "root",
                   password = "")
  value1 <- 23
  rss <- paste0("INSERT INTO namestable (myvalue, person)
                 VALUES ('$value1', '", name, "')")
  rs <<- dbGetQuery(con, rss)
}
heritblup("Tommy")
But I keep getting this error:
Error in as.character.default() :
  no method for coercing this S4 class to a vector
Called from: as.character.default()
I tried to change the paste function to this:
rss <- paste0("INSERT INTO namestable (myvalue, person)
               VALUES ($value1, ", name, ")")
The error persists; I have no idea what's wrong.
Please help.
There are a couple of issues in the code. I'm not sure whether the OP is trying to insert records into the database or fetch records from it.
Based on the query, I'll assume the goal is to insert data into the database table.
The rule is that the query should be prepared in R exactly the way it will be executed in MySQL. Any value substitution must be performed in R, since the MySQL engine has no idea about variables from R; as written, '$value1' is sent to MySQL as a literal string rather than the value 23.
Hence, the query preparation step should be done as:
rss <- sprintf("INSERT INTO namestable (myvalue, person) VALUES (%d, '%s')", value1, name)
# "INSERT INTO namestable (myvalue, person) VALUES (23, 'test')"
If inserting data is the goal, then dbGetQuery is not the right option; per the R documentation, dbSendStatement() should be used for data manipulation instead. The help page states:
However, callers are strongly encouraged to use dbSendStatement() for
data manipulation statements.
Based on that, the query execution lines should be:
rs <- dbSendStatement(con, rss)
ret_val <- dbGetRowsAffected(rs)
dbClearResult(rs)
dbDisconnect(con)
return(ret_val)
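Putting the pieces together, the whole function might look like this (a sketch; same connection details as in the question, with the insert value passed in rather than hard-coded):
heritblup <- function(name, value1 = 23) {
  library(RMySQL)
  con <- dbConnect(RMySQL::MySQL(),
                   dbname = "mydab",
                   host = "localhost",
                   port = 3306,
                   user = "root",
                   password = "")
  on.exit(dbDisconnect(con))
  # Substitute the R values into the statement before sending it to MySQL
  rss <- sprintf("INSERT INTO namestable (myvalue, person) VALUES (%d, '%s')",
                 value1, name)
  rs <- dbSendStatement(con, rss)
  ret_val <- dbGetRowsAffected(rs)
  dbClearResult(rs)
  ret_val
}
heritblup("Tommy")  # should return 1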

Error in RMySQL package

I am using the RMySQL package to write (append) data to an existing table.
I am using R version 3.3.2.
My code looks like this:
library(RMySQL)
df_final <- some_data
m <- dbDriver("MySQL")
mydb <- dbConnect(m, user = 'odvjet12_mislav',
                  password = 'my_pass',
                  host = '91.234.46.219',
                  dbname = 'odvjet12_fina_pn')
dbWriteTable(mydb, value = df_final, name = "fina_pn", append = TRUE, row.names = FALSE)
This code worked fine for some time, but for the last ten days it has always returned an error:
Error in .local(conn, statement, ...) :
could not run statement: The used command is not allowed with this MySQL version
I don't understand how it is possible for the code to work for some time and now return an error.
I kindly ask for feedback on this issue.
Best,
Mislav Šagovac
You could also use dbGetQuery from the RMySQL package and iterate over the rows, which was my solution when I hit a similar error for a data frame I wanted to write to a MySQL DB:
mydb <- dbConnect(MySQL(), user = 'user', password = 'password', dbname = 'databasename', host = 'hostname')
for (i in 1:nrow(df)) {
  dbGetQuery(mydb, paste0("INSERT INTO MYTABLE (COL1, COL2) VALUES (", df$col1[i], ", ", df$col2[i], ")"))
}
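Note that this pastes the values in unquoted, which only works for numeric columns. If the columns hold strings, quote and escape them first, e.g. with DBI's dbQuoteString() (a sketch, assuming character columns):
for (i in 1:nrow(df)) {
  dbGetQuery(mydb, paste0("INSERT INTO MYTABLE (COL1, COL2) VALUES (",
                          dbQuoteString(mydb, df$col1[i]), ", ",
                          dbQuoteString(mydb, df$col2[i]), ")"))
}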

How to write entire dataframe into mySql table in R

I have a data frame containing a column 'Quarter' with values like "16/17 Q1", "16/17 Q2", ... and a column 'Vendor' with values like "a", "b", ... .
I am trying to write this data frame into database using
query <- paste("INSERT INTO cc_demo (Quarter,Vendor) VALUES(dd$FY_QUARTER,dd$VENDOR.x)")
but it is throwing an error:
Error in .local(conn, statement, ...) :
could not run statement: Unknown column 'dd$FY_QUARTER' in 'field list'
I am new to RMySQL; please suggest a way to write the entire data frame.
To write a data frame to a MySQL DB you need to:
Create a connection to your database; you need to specify:
MySQL connection
User
Password
Host
Database name
library("RMySQL")
connection <- dbConnect(MySQL(), user = 'root', password = 'password', host = 'localhost', dbname = 'TheDB')
Using the connection, create a table and then export data to the database:
dbWriteTable(connection, "testTable", testTable)
You can overwrite an existing table like this:
dbWriteTable(connection, "testTable", testTable_2, overwrite=TRUE)
I would advise against hand-writing the SQL query when you can use very handy functions such as dbWriteTable from the RMySQL package. But for the sake of practice, below is an example of how you could write a SQL query that performs multiple inserts in a single statement for a MySQL database:
# Set up a data.frame
dd <- data.frame(Quarter = c("16/17 Q1", "16/17 Q2"), Vendors = c("a","b"))
# Begin the query
sql_qry <- "insert into cc_demo (Quarter,Vendor) VALUES"
# Finish it with
sql_qry <- paste0(sql_qry, paste(sprintf("('%s', '%s')", dd$Quarter, dd$Vendors), collapse = ","))
You should get:
"insert into cc_demo (Quarter,Vendor) VALUES('16/17 Q1', 'a'),('16/17 Q2', 'b')"
You can provide this query to your database connection in order to run it.
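For example (a sketch, reusing the connection object from the answer above; dbSendStatement() is the documented route for data manipulation statements):
rs <- dbSendStatement(connection, sql_qry)
dbGetRowsAffected(rs)  # 2 for the example above
dbClearResult(rs)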
I hope this helps.

Fetching data in parallel from mysql using R doParallel or foreach

I am trying to fetch data in parallel from a MySQL database using R. The following code fetches the data one query at a time and works fine. But I want to speed up the process by sending multiple queries and saving the results into different variables. Later I will merge the time series inside the variables.
library(RMySQL)
con <- dbConnect(MySQL(), user = 'external', password = 'xxxxxxx', dbname = 'GMT_Minute_Data', host = 'xx.xx.xxx.xxx')
sqlData <- dbSendQuery(con, "select TradeTime, Open, High, Low, Close from ad where tradetime between '2014-01-01' and '2015-10-20'")
data1 <- dbFetch(sqlData, n = -1)
sqlData <- dbSendQuery(con, "select TradeTime, Open, High, Low, Close from ty where tradetime between '2014-01-01' and '2015-10-20'")
data2 <- dbFetch(sqlData, n = -1)
sqlData <- dbSendQuery(con, "select TradeTime, Open, High, Low, Close from ax where tradetime between '2014-01-01' and '2015-10-20'")
data3 <- dbFetch(sqlData, n = -1)
connections <- dbListConnections(MySQL())
for (i in connections) { dbDisconnect(i) }
I have tried to fetch the data in parallel using the following code:
library(foreach)
library(doParallel)
library(RMySQL)
fetchData <- function(nInst, inst1, inst2, inst3, inst4, inst5, startDate, endDate, con1) {
  inst <- NULL
  sqlData <- NULL
  if (nInst == 1)
    inst <- inst1
  else if (nInst == 2)
    inst <- inst2
  else if (nInst == 3)
    inst <- inst3
  else if (nInst == 4)
    inst <- inst4
  else if (nInst == 5)
    inst <- inst5
  sqlData <- dbSendQuery(con1, paste0('select TradeTime, Open, High, Low, Close from ', inst,
                                      ' where tradetime between \'', startDate, '\' and \'', endDate, '\''))
  data1 <- dbFetch(sqlData, n = -1)
  print(head(data1))
  data1
}
cluster <- makeCluster(5, type = "SOCK")
registerDoParallel(cluster)
mydb <- NULL
clusterEvalQ(cluster, {
  mydb <- dbConnect(MySQL(), user = 'external', password = 'xxxxxx', dbname = 'GMT_Minute_Data', host = 'xx.xx.xxx.xxx')
  NULL
})
allDataList <- foreach(n = 1:2, .verbose = TRUE, .packages = 'RMySQL') %dopar% {
  fetchData(n, inst1, inst2, inst3, inst4, inst5, startDate, endDate, mydb)
}
stopCluster(cluster)
on.exit(dbDisconnect(mydb))
Sometimes the code only fetches data for the first instrument but not for the rest.
Please assist if someone knows the solution.
Thanks,
I think the problem is that foreach is auto-exporting the mydb variable to the workers, thus defeating the purpose of initializing it with clusterEvalQ. Database connections can't be serialized and sent to other machines properly, which is why it's useful to initialize them manually with clusterEvalQ. The foreach .verbose=TRUE option lets you verify that mydb is not auto-exported. If it says that it is auto-exported, you need to prevent it.
In your example, you can prevent mydb from being auto-exported by simply removing the mydb <- NULL statement, but I suggest that you use the foreach .noexport='mydb' option to be certain that it's never auto-exported. Here's a stripped-down example that does that:
library(doParallel)
fetchData <- function(ignore) {
  mydb
}
cluster <- makeCluster(5, type = "SOCK")
registerDoParallel(cluster)
clusterEvalQ(cluster, {
  mydb <- sample(100, 1)  # different value for each worker
  NULL
})
r <- foreach(n = 1:2, .noexport = 'mydb', .verbose = TRUE) %dopar% {
  fetchData(n)
}
In this case, foreach analyzes the fetchData function and notices that it's using a variable named mydb. Thus, if mydb is defined on the master, it will auto-export it unless you tell it not to. That's why I suggest using .noexport='mydb' even if it's not defined in the local environment. It makes doubly sure that your function doesn't use a corrupt database connection.
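Applied to the original code, that just means dropping the mydb <- NULL line on the master and adding .noexport, so each worker resolves mydb to the connection it opened in clusterEvalQ (a sketch):
allDataList <- foreach(n = 1:2, .noexport = 'mydb', .verbose = TRUE,
                       .packages = 'RMySQL') %dopar% {
  # mydb is looked up in the worker's global environment,
  # i.e. the connection created by clusterEvalQ above
  fetchData(n, inst1, inst2, inst3, inst4, inst5, startDate, endDate, mydb)
}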

How to extract create statements from different tables of MySQL DBs?

I would like to extract all CREATE statements in my 50 MySQL databases via SHOW CREATE TABLE db.table, e.g. SHOW CREATE TABLE db1.mytable, SHOW CREATE TABLE db2.sometable, or SHOW CREATE TABLE db3.mytable1. Each of the DBs has some tables inside: db1 (table, mytable, ...), db2 (table1, sometable), and so on.
To illustrate the DBs via a example query:
SELECT *
FROM db.table1 m
LEFT JOIN db1.sometable o ON m.id = o.id
LEFT JOIN db2.sometables t ON p.id=t.id
LEFT JOIN db3.sometable s ON s.column='john'
library(RMySQL)
library(DBI)
con <- dbConnect(RMySQL::MySQL(),
                 username = "",
                 password = "",
                 host = "",
                 port = 3306,
                 dbname = mydbname)  # when using dbs <- dbGetQuery(con, "SHOW DATABASES") I have to set dbname = mydbname to get all DBs
Using dbs <- dbGetQuery(con, "SHOW DATABASES") I can extract all 50 databases in the connection as a character vector. I would like to loop over each DB in dbs and apply SHOW CREATE TABLE to each row/db. I suppose I have to parse each row/db into dbname = mydbname and dbs <- dbGetQuery(con, "SHOW CREATE TABLE ..."), but I just can't figure out how to write the loops.
I tried:
apply(dbs, 1, function(row) {
  dbname <- row[]
  for (i in 1:length(dbname)) {
    create <- dbGetQuery(con, "SHOW CREATE TABLE")
  }
})
But that doesn't seem right. I suppose I have to include con in the loop somehow. Otherwise I'll get:
Error in .local(drv, ...) : object 'dbname' not found
So I tried:
apply(dbs, 1, function(row) {
  dbname <- row[]
  for (i in 1:length(dbname)) {
    con <- dbConnect(RMySQL::MySQL(),
                     username = "",
                     password = "",
                     host = "",
                     port = 3306,
                     dbname = dbname[i])
    create <- dbGetQuery(con, "SHOW CREATE TABLE")
  }
})
I suppose this comes close to the solution, but I'm missing something:
dbs <- dbGetQuery(con, "show databases")
library(foreach)
foreach(i = 1:length(dbs)) %dopar% {
  query <- paste("SHOW CREATE TABLE", dbs[i])
  creates <- dbGetQuery(con, query)
}
Consider this approach: import a data frame of each database (leaving out the system ones, INFORMATION_SCHEMA and MYSQL) and their corresponding tables. Then, run the SHOW CREATE TABLE statements. Finally, merge the original data frame with the row-bound data frame of create statements.
Now, the one caveat is tables that repeat names across databases. To return distinct values of such combinations, aggregate() with the head function is used.
con <- dbConnect(RMySQL::MySQL(),
                 username = "****", password = "****",
                 host = "****", port = 3306,
                 dbname = "****")
dbtbls <- dbGetQuery(con, "SELECT `TABLE_SCHEMA` AS `Database`,
`TABLE_NAME` AS `Table`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_TYPE` = 'BASE TABLE'
AND `TABLE_SCHEMA` NOT LIKE '%SCHEMA%'
AND `TABLE_SCHEMA` NOT LIKE '%MYSQL%' ")
# LIST OF SQL STATEMENTS
sql <- paste0("SHOW CREATE TABLE ", dbtbls$Database, ".", dbtbls$Table)
# LIST OF DATAFRAMES
createstmts <- lapply(sql, function(x) dbGetQuery(con, x))
dbDisconnect(con)
# ROW BIND LIST INTO ONE DATAFRAME TO MERGE WITH ORIGINAL
stmtsdf <- do.call(rbind, createstmts)
finaldf <- merge(dbtbls, stmtsdf, by='Table')
# RETURN DISTINCT RECORDS
finaldf <- aggregate(.~Database+Table, finaldf, FUN=head, 1)
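If the end goal is a schema dump, you can then write the statements out to a file; in each SHOW CREATE TABLE result the second column holds the CREATE statement (a sketch):
writeLines(stmtsdf[[2]], "create_statements.sql")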
mysqldump --no-data
does exactly what you are asking for. (There may be other parameters desirable to avoid/include CREATE DATABASE, etc.)
If the requirement is to subsequently pull the CREATEs into R, then I ask whether this is a one-time task, or a recurring task. For one-time, I would suggest that, overall, the mysqldump approach might be simpler.
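For a one-time dump, something like this from R should do it (a sketch; assumes the mysqldump client is installed and on the PATH, and a user with read access to all databases):
system("mysqldump --no-data --all-databases -u user -p > create_statements.sql")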
First, you can simply use
for (i in 1:length(dbs)) { }
Or you can look into apply functions, particularly sapply. There you can parse each database name from the connection, connect, and get all tables as a list or vector. Then you can loop inside those to get the CREATE TABLE statements.
So it is basically an apply inside an apply; see the sketch after the link below.
For a good explanation of apply functions, you can look into http://www.r-bloggers.com/using-apply-sapply-lapply-in-r/
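A minimal sketch of that nested approach, assuming an open connection con with privileges on all the databases:
library(RMySQL)
dbs <- dbGetQuery(con, "SHOW DATABASES")[[1]]
# Skip the system databases
dbs <- setdiff(dbs, c("information_schema", "mysql", "performance_schema"))
creates <- lapply(dbs, function(db) {
  tables <- dbGetQuery(con, paste0("SHOW TABLES FROM `", db, "`"))[[1]]
  sapply(tables, function(tbl) {
    # second column of SHOW CREATE TABLE holds the CREATE statement
    dbGetQuery(con, paste0("SHOW CREATE TABLE `", db, "`.`", tbl, "`"))[[2]]
  })
})
names(creates) <- dbs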