How to extract create statements from different tables of MySQL DBs? - mysql

I would like to extract all Create Statements in my 50 MySQL Databases via SHOW CREATE TABLE db.table or SHOW CREATE TABLE db1.mytableor SHOW CREATE TABLE db2.sometableor SHOW CREATE TABLE db3.mytable1. Thus each of the DBs has some tables inside db1(table,mytable...) db2(table1,sometable) and so on
To illustrate the DBs via a example query:
SELECT *
FROM db.table1 m
LEFT JOIN db1.sometable o ON m.id = o.id
LEFT JOIN db2.sometables t ON p.id=t.id
LEFT JOIN db3.sometable s ON s.column='john'
library(RMySQL)
library(DBI)
con <- dbConnect(RMySQL::MySQL(),
username = "",
password = "",
host = "",
port = 3306,
dbname= mydbname)# when using dbs<-dbGetQuery(con ,"SHOW DATABASES") I have to ## dbname= mydbname## to get all DBs
Using dbs<-dbGetQuery(con ,"SHOW DATABASES")I can extract all 50 Databases in the dbConnection as character vector. I would like loop over each DB in the dbsand apply SHOW CREATE TABLE to each row/db. I suppose I have to parse the each row/db into dbname= mydbnameand dbs<-dbGetQuery(con ,"SHOW CREATE TABLE"). But I just cant figure out how to make the loops
I tried:
apply(dbs, 1, function(row) {
dbname <- row[]
for (i in 1:length(dbname)) {
create<-dbGetQuery(con,"SHOW CREATE TABLE") }
})
But that doesnt seem right. I suppose I have to include the con into the loop somehow. Otherwise I'll get:
Error in .local(drv, ...) : object 'dbname' not found
So I tried:
apply(dbs, 1, function(row) {
dbname <- row[]
for (i in 1:length(dbname)) {
con <- dbConnect(RMySQL::MySQL(),
username = "",
password = "",
host = "",
port = 3306,
dbname= [i])
create<-dbGetQuery(con,"SHOW CREATE TABLE") }})
I suppose that comes close to the solution but I miss something:
dbs<-dbGetQuery(con,"show databases")
library(foreach)
foreach(i = 1:(length(dbs))%dopar%{
query<-paste("SHOW CREATE TABLE",dbs[i])
creates<-dbGetQuery(con,query)
})

Consider this approach of importing a data frame of each database (leaving out the system ones, INFORMATION_SCHEMA and MYSQL) and their corresponding tables. Then, run SHOW CREATE TABLE statements. Finally, merge the original dataframe with binded dataframe of create statements.
Now, the one caveat is tables that repeat names across databases. To return distinct values of such combinations, the aggregate() by head function is used.
con <- dbConnect(RMySQL::MySQL(),
username = "****", password = "****",
host = "****", port = 3306,
dbname= "****")
dbtbls <- dbGetQuery(con, "SELECT `TABLE_SCHEMA` AS `Database`,
`TABLE_NAME` AS `Table`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_TYPE` = 'BASE TABLE'
AND `TABLE_SCHEMA` NOT LIKE '%SCHEMA%'
AND `TABLE_SCHEMA` NOT LIKE '%MYSQL%' ")
# LIST OF SQL STATEMENTS
sql <- paste0("SHOW CREATE TABLE ", dbtbls$Database, ".", dbtbls$Table)
# LIST OF DATAFRAMES
createstmts <- lapply(sql, function(x) dbGetQuery(con, x))
dbDisconnect(con)
# ROW BIND LIST INTO ONE DATAFRAME TO MERGE WITH ORIGINAL
stmtsdf <- do.call(rbind, createstmts)
finaldf <- merge(dbtbls, stmtsdf, by='Table')
# RETURN DISTINCT RECORDS
finaldf <- aggregate(.~Database+Table, finaldf, FUN=head, 1)

mysqldump --no-data
does exactly what you are asking for. (There may be other parameters desirable to avoid/include CREATE DATABASE, etc.)
If the requirement is to subsequently pull the CREATEs into R, then I ask whether this is a one-time task, or a recurring task. For one-time, I would suggest that, overall, the mysqldump approach might be simpler.

First, you can just simply use
for (i in 1:length(dbs)) { }
Or you can look into apply functions, particularly, sapply. There you can do parsing per dbConnection string, connect and get all tables as list or vector. Then you can loop inside those to get create table statements.
So, it is basically apply inside apply.
For a good explanation of apply functions, you can look into http://www.r-bloggers.com/using-apply-sapply-lapply-in-r/

Related

Problem inserting data into MySQL from R with a loop

good afternoon and thanks for reading. I currently have to manually enter data into SQL and am trying to automate the data using a loop in R, but when entering the query and reviewing the information in MySQL, I don't see any new observations added.
Basically, that's the code to input the observations that I'm using (using other data that I'm looping out of a data frame):
library(dplyr)
library(odbc)
library(DBI)
library(RMySQL)
connection <- dbConnect(RMySQL::MySQL(),
dbname = "xxx",
host= "xxx",
port = xxx,
user = "xxx",
password = "xxx")
payout_history <- paste0("INSERT INTO `payout_history` (`uuid`, `trip_id`, `delivery_order_id`, `amount`, `payout_type_id`, `payout_invoice_id`) VALUES (uuid(), '",id,"', '",order_id,"', '",amount,"', '7', '",head(payout_invoice$id,1),"');")
dbExecute(connection,payout_history)
The query works when I try to use it within the MySQL manager, but within the R console with the dbExecute() function it doesn't add anything, do you know if I should use any other function to be able to insert observations?

RMySQL: Error in as.character.default() :

I am trying to use the following function in R:
heritblup <- function(name) {
library(RMySQL)
library(DBI)
con <- dbConnect(RMySQL::MySQL(),
dbname ="mydab",
host = "localhost",
port = 3306,
user = "root",
password = "")
value1 <- 23;
rss<- paste0 ("INSERT INTO namestable
(myvalue, person)
VALUES ('$value1', '",name,"')")
rs <<- dbGetQuery (con, rss)
}
heritblup("Tommy")
But I keep getting this error:
Error in as.character.default()
: no method for coercing this S4 class to a vector Called from:
as.character.default()
I tried to change the paste function to this:
rss<- paste0 ("INSERT INTO namestable
(myvalue, person)
VALUES ($value1, ",name,")")
the error persists;
I have no idea whats wrong.
Please help
Couple of issues in code. I'm not sure if OP is attempting to insert records in database or fetch from database.
Assuming, based on query that he is expecting to insert data in database table.
The rule is that query should be prepared in R the way it will be executed in MySQL. Value replacement (if any) should be performed in R as MySQL engine will not have any idea about variables from R.
Hence, the query preparation steps should be done as:
rss <- sprintf("INSERT INTO namestable (myvalue, person) VALUES (%d, '%s')", value1, name)
# "INSERT INTO namestable (myvalue, person) VALUES (23, 'test')"
If data insert is goal then dbGetQuery is not right option per R documentation instead dbSendStatement() should be used for data manipulation. The reference from help suggest:
However, callers are strongly encouraged to use dbSendStatement() for
data manipulation statements.
Based on that query execution line should be:
rs <- dbSendStatement(con, rss)
ret_val <- dbGetRowsAffected(rs)
dbClearResult(rs)
dbDisconnect(con)
return(ret_val)

Is there a faster way to upload data from R to MySql?

I am using the following code to upload a new table into a mysql database.
library(RMySql)
library(RODBC)
con <- dbConnect(MySQL(),
user = 'user',
password = 'pw',
host = 'amazonaws.com',
dbname = 'db_name')
dbSendQuery(con, "CREATE TABLE table_1 (
var_1 VARCHAR(50),
var_2 VARCHAR(50),
var_3 DOUBLE,
var_4 DOUBLE);
")
channel <- odbcConnect("db name")
sqlSave(channel, dat = df, tablename = "tb_name", rownames = FALSE, append =
TRUE)
The full data set is 68 variables and 5 million rows. It is taking over 90 minutes to upload 50 thousand rows to MySql. Is there a more efficient way to upload the data to MySql. I originally tried dbWriteTable() but this would result in an error message saying the connection to the database was lost.
Consider a CSV export from R for an import into MySQL with LOAD DATA INFILE:
...
write.csv(df, "/path/to/filename.csv", row.names=FALSE)
dbSendQuery(con, "LOAD DATA LOCAL INFILE '/path/to/filename.csv'
INTO TABLE mytable
FIELDS TERMINATED by ','
ENCLOSED BY '"'
LINES TERMINATED BY '\\n'")
You could try to disable the mysql query log:
dbSendQuery(con, "SET GLOBAL general_log = 'off'")
I can't tell if your mysql user account has the appropriate permissions to do that, or if it conflicts with your business needs.
Off the top of my head: Otherwise you could try to send the data in say 1000-row batches, using a for- loop in your Rscript, and maybe option verbose = true in your call to sqlSave
If you send the data in a single batch, Mysql might try to run the INSERT as a single transaction ("all-or-nothing") and if it fails it goes into recovery or just fails after inserting some random number of rows.

How to write entire dataframe into mySql table in R

I have a data frame containing columns 'Quarter' having values like "16/17 Q1", "16/17 Q2"... and 'Vendor' having values like "a", "b"... .
I am trying to write this data frame into database using
query <- paste("INSERT INTO cc_demo (Quarter,Vendor) VALUES(dd$FY_QUARTER,dd$VENDOR.x)")
but it is throwing error :
Error in .local(conn, statement, ...) :
could not run statement: Unknown column 'dd$FY_QUARTER' in 'field list'
I am new to Rmysql, Please provide me some solution to write entire dataframe?
To write a data frame to mySQL DB you need to:
Create a connection to your database, you need to specify:
MySQL connection
User
Password
Host
Database name
library("RMySQL")
connection <- dbConnect(MySQL(), user = 'root', password = 'password', host = 'localhost', dbname = 'TheDB')
Using the connection create a table and then export data to the database
dbWriteTable(connection, "testTable", testTable)
You can overwrite an existing table like this:
dbWriteTable(connection, "testTable", testTable_2, overwrite=TRUE)
I would advise against writing sql query when you can actually use very handy functions such as dbWriteTable from the RMySQL package. But for the sake of practice, below is an example of how you should go about writing the sql query that does multiple inserts for a MySQL database:
# Set up a data.frame
dd <- data.frame(Quarter = c("16/17 Q1", "16/17 Q2"), Vendors = c("a","b"))
# Begin the query
sql_qry <- "insert into cc_demo (Quarter,Vendor) VALUES"
# Finish it with
sql_qry <- paste0(sql_qry, paste(sprintf("('%s', '%s')", dd$Quarter, dd$Vendors), collapse = ","))
You should get:
"insert into cc_demo (Quarter,Vendor) VALUES('16/17 Q1', 'a'),('16/17 Q2', 'b')"
You can provide this query to your database connection in order to run it.
I hope this helps.

How to add appending data to MySQL through R?

I am running an R script that adds data to mySQL database. I usually format the data and add it as as it comes in after couple of hours (data string is not continuous). My first set of data was added properly in MySQL database. The second string of data can not be added properly.
con = dbConnect(MySQL(), user='root', password='xxxxxx', dbname='test', host='localhost')
dbWriteTable(con, 'Tables', value = parseTweets(filterStream(file.name= "", track=c("lebron"), timeout=10, oauth=my_oauth)))
When I rerun the last code (dbWriteTable) again, it gives me following error
Error: Error in .local(conn, statement, ...) :
could not run statement: Table 'tables' already exists
I also used
dbWriteTable(con, 'Tables', value = parseTweets(filterStream(file.name= "", track=c("lebron"), timeout=10, oauth=my_oauth)), append = TRUE)
but it provides the same error
For some reason giving setting append to TRUE is not working. Instead give it a number. See the code below for better understanding
dbWriteTable(con, 'Tables', value = parseTweets(filterStream(file.name= "", track=c("lebron"), timeout=100, oauth=my_oauth)), overwrite = 0, row.names = 0, append = 1)