Generating pyodbc SQL syntax within a loop - ms-access

Can you do such a thing? I have the following, but cursor.execute does not like the syntax of selectSQL. Ultimately I'm looking to iterate through all tables in a .accdb and insert records from each table into another .accdb with the same tables and fields. The reason: I'm bringing over new records from field data collection on TabletPCs to the master database on the server.
import pyodbc

connOtherDB = pyodbc.connect("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=path to my dbase;")
otherDBtbls = connOtherDB.cursor().tables()
for t in otherDBtbls:
    if t.table_name.startswith("tbl"):  # ignores MS sys tables
        cursor = connOtherDB.cursor()
        selectSQL = '"SELECT * from ' + str(t.table_name) + '"'
        cursor.execute("SELECT * from tblDatabaseComments")  # this works if I refer to a table by name
        cursor.execute(selectSQL)  # this does not work; throws a SQL syntax error
        row = cursor.fetchone()
        if row:
            print t.table_name
            print row

Use str.format() to ease building of SQL statements:
import pyodbc

connOtherDB = pyodbc.connect("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=path to my dbase;")
otherDBtbls = connOtherDB.cursor().tables()
for t in otherDBtbls:
    if t.table_name.startswith("tbl"):  # ignores MS sys tables
        cursor = connOtherDB.cursor()
        selectSQL = 'SELECT * FROM {}'.format(t.table_name)
        cursor.execute(selectSQL)
        row = cursor.fetchone()
        if row:
            print t.table_name
            print row
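If a table name contains spaces or other characters that are legal in Access but not in a bare SQL identifier, the statement above will still fail. A minimal sketch of the same loop body with the name bracket-quoted (square brackets escape identifiers in Access SQL); the bracketing is the only change:

selectSQL = 'SELECT * FROM [{}]'.format(t.table_name)  # [brackets] quote identifiers in Access SQL
cursor.execute(selectSQL)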
As an aside, take a look at PEP 8 -- Style Guide for Python Code for guidance on maximum line length and variable naming, among other coding conventions.

Related

df.to_sql with AS400

I want to write a pandas DataFrame to an IBM i Series / AS400. I have already researched a lot, but now I am stuck.
I have already run many queries using pyodbc. For df.to_sql() I should, as I read in other posts, use SQLAlchemy with the ibm_db_sa dialect.
My current code is:
import urllib.parse
from sqlalchemy import create_engine

CONNECTION_STRING = (
    "driver={iSeries Access ODBC Driver};"
    "System=111.111.111.111;"
    "database=TESTDB;"
    "uid=USER;"
    "pwd=PASSW;"
)
quoted = urllib.parse.quote_plus(CONNECTION_STRING)
engine = create_engine('ibm_db_sa+pyodbc:///?odbc_connect={}'.format(quoted))
create_statement = df.to_sql("TABLETEST", engine, if_exists="append")  # df is an existing DataFrame
The following packages are installed:
python 3.9
ibm-db 3.1.3
ibm-db-sa 0.3.7
ibm-db-sa-py3 0.3.1.post1
pandas 1.3.5
pip 22.0.4
setuptools 57.0.0
SQLAlchemy 1.4.39
When I run it, I get the following error:
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42S02', '[42S02] [IBM][System i Access ODBC Driver][DB2 for i5/OS]SQL0204 - COLUMNS in SYSCAT type *FILE not found. (-204) (SQLPrepare)')
[SQL: SELECT "SYSCAT"."COLUMNS"."COLNAME", "SYSCAT"."COLUMNS"."TYPENAME", "SYSCAT"."COLUMNS"."DEFAULT", "SYSCAT"."COLUMNS"."NULLS", "SYSCAT"."COLUMNS"."LENGTH", "SYSCAT"."COLUMNS"."SCALE", "SYSCAT"."COLUMNS"."IDENTITY", "SYSCAT"."COLUMNS"."GENERATED"
FROM "SYSCAT"."COLUMNS"
WHERE "SYSCAT"."COLUMNS"."TABSCHEMA" = ? AND "SYSCAT"."COLUMNS"."TABNAME" = ? ORDER BY "SYSCAT"."COLUMNS"."COLNO"]
[parameters: ('USER', 'TABLETEST')]
(Background on this error at: https://sqlalche.me/e/14/f405)
I think the dialect could be wrong, because the parameters are the username and the table name for the ODBC connection?
Also: I am not really sure what the difference is between ibm_db_sa and ibm_db.
I tried for a few days; before anyone spends that time trying to do this via SQLAlchemy, I suggest doing it via pyodbc.
Here is my working example, with the df_to_sql_bulk_insert function taken from this (and I am now using my system DSN):
import re

import pandas as pd
import pyodbc

def df_to_sql_bulk_insert(df: pd.DataFrame, table: str, **kwargs) -> str:
    # Render the whole DataFrame as one multi-row INSERT statement
    df = df.copy().assign(**kwargs)
    columns = ", ".join(df.columns)
    tuples = map(str, df.itertuples(index=False, name=None))
    values = re.sub(r"(?<=\W)(nan|None)(?=\W)", "NULL", (",\n" + " " * 7).join(tuples))
    return f"INSERT INTO {table} ({columns})\nVALUES {values}"

cnxn = pyodbc.connect("DSN=XXX")
cursor = cnxn.cursor()
sqlstr = df_to_sql_bulk_insert(df, "DBXXX.TBLXXX")
cursor.execute(sqlstr)
cnxn.commit()
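If building the VALUES clause by string concatenation is a concern (quoting, NULL handling, injection), a parameterized executemany() is an alternative. This is a minimal sketch under the same DSN and table-name assumptions as above, not part of the tested answer:

import pandas as pd
import pyodbc

def insert_df_parameterized(df: pd.DataFrame, table: str, cnxn) -> None:
    # One ? placeholder per column; the driver handles quoting and NULLs.
    cols = ", ".join(df.columns)
    marks = ", ".join("?" for _ in df.columns)
    sql = f"INSERT INTO {table} ({cols}) VALUES ({marks})"
    # NaN must become None so the driver sends NULL
    rows = df.where(pd.notnull(df), None).values.tolist()
    cursor = cnxn.cursor()
    # cursor.fast_executemany = True can speed this up, but support varies by driver
    cursor.executemany(sql, rows)
    cnxn.commit()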

Saving dbplyr query (tbl_sql object) to MySQL without saving data locally

This question expands on this question.
Here, I'm using the custom function created by @Simon.S.A. shown in the answer to that question. I'm attempting to save a tbl_sql object in R to MySQL as a new table without first saving it locally. The database and schema in my MySQL instance are both named "test". The tbl_sql object in R is my_data, and I want to save it as a new table in MySQL named "car_data".
library(DBI)
library(tidyverse)
library(dbplyr)

# establish connection and import data from MySQL
con <- DBI::dbConnect(RMariaDB::MariaDB(),
                      dbname = "test",
                      host = "127.0.0.1",
                      user = "user",
                      password = "password")
my_data <- tbl(con, "mtcars")
my_data <- my_data %>% filter(mpg >= 22)

# write function to save tbl_sql as a new table in SQL
write_to_database <- function(input_tbl, db, schema, tbl_name){
  # connection
  tbl_connection <- input_tbl$src$con
  # SQL query
  sql_query <- glue::glue(
    "SELECT *\n",
    "INTO {db}.{schema}.{tbl_name}\n",
    "FROM (\n",
    dbplyr::sql_render(input_tbl),
    "\n) AS sub_query"
  )
  result <- dbExecute(tbl_connection, as.character(sql_query))
}

# execute function
write_to_database(my_data, "test", "test", "car_data")
After running the final line, I get the following error, and I'm not sure how to fix it.
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.test.car_data
FROM (
SELECT *
FROM `mtcars`
WHERE (`mpg` >= 22.0)
) AS sub_quer' at line 2 [1064]
12. stop(structure(list(message = "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.test.car_data\nFROM (\nSELECT *\nFROM `mtcars`\nWHERE (`mpg` >= 22.0)\n) AS sub_quer' at line 2 [1064]", call = NULL, cppstack = NULL), class = c("Rcpp::exception", "C++Error", "error", "condition")))
11. result_create(conn@ptr, statement, is_statement)
10. initialize(value, ...)
9. initialize(value, ...)
8. new("MariaDBResult", sql = statement, ptr = result_create(conn@ptr, statement, is_statement), bigint = conn@bigint, conn = conn)
7. dbSend(conn, statement, params, is_statement = TRUE)
6. .local(conn, statement, ...)
5. dbSendStatement(conn, statement, ...)
4. dbSendStatement(conn, statement, ...)
3. dbExecute(tbl_connection, as.character(sql_query))
2. dbExecute(tbl_connection, as.character(sql_query))
1. write_to_database(my_data, "test", "test", "car_data")
Creating a table with SELECT ... INTO is SQL Server (and MS Access) specific syntax and is not supported in MySQL. Instead, consider the counterpart statement: CREATE TABLE ... SELECT. Also, schema semantics differ between RDBMSs; in MySQL, database is synonymous with schema, so the {schema} component is dropped.
Therefore, consider this adjusted version of the SQL build:
sql_query <- glue::glue(
  "CREATE TABLE {db}.{tbl_name}\n AS \n",
  "SELECT * \n",
  "FROM (\n",
  dbplyr::sql_render(input_tbl),
  "\n) AS sub_query"
)

Is there a faster way to upload data from R to MySql?

I am using the following code to upload a new table into a MySQL database.
library(RMySQL)
library(RODBC)

con <- dbConnect(MySQL(),
                 user = 'user',
                 password = 'pw',
                 host = 'amazonaws.com',
                 dbname = 'db_name')

dbSendQuery(con, "CREATE TABLE table_1 (
  var_1 VARCHAR(50),
  var_2 VARCHAR(50),
  var_3 DOUBLE,
  var_4 DOUBLE);
")

channel <- odbcConnect("db name")
sqlSave(channel, dat = df, tablename = "tb_name", rownames = FALSE, append = TRUE)
The full data set is 68 variables and 5 million rows. It is taking over 90 minutes to upload 50 thousand rows to MySQL. Is there a more efficient way to upload the data? I originally tried dbWriteTable(), but it would result in an error message saying the connection to the database was lost.
Consider a CSV export from R, then an import into MySQL with LOAD DATA INFILE:

...
write.csv(df, "/path/to/filename.csv", row.names = FALSE)

dbSendQuery(con, "LOAD DATA LOCAL INFILE '/path/to/filename.csv'
                  INTO TABLE mytable
                  FIELDS TERMINATED BY ','
                  ENCLOSED BY '\"'
                  LINES TERMINATED BY '\\n'")
You could try to disable the MySQL query log:
dbSendQuery(con, "SET GLOBAL general_log = 'off'")
I can't tell whether your MySQL user account has the appropriate permissions to do that, or whether it conflicts with your business needs.
Off the top of my head: otherwise you could try to send the data in, say, 1000-row batches, using a for loop in your R script, perhaps with verbose = TRUE in your call to sqlSave.
If you send the data in a single batch, MySQL might try to run the INSERT as a single transaction ("all-or-nothing"), and if it fails it goes into recovery or simply fails after inserting some random number of rows.

How to write entire dataframe into mySql table in R

I have a data frame containing the column 'Quarter', with values like "16/17 Q1", "16/17 Q2", ..., and the column 'Vendor', with values like "a", "b", ....
I am trying to write this data frame into the database using:
query <- paste("INSERT INTO cc_demo (Quarter,Vendor) VALUES(dd$FY_QUARTER,dd$VENDOR.x)")
but it is throwing this error:
Error in .local(conn, statement, ...) :
could not run statement: Unknown column 'dd$FY_QUARTER' in 'field list'
I am new to RMySQL. Can someone show me how to write the entire data frame?
To write a data frame to a MySQL database you need to:
Create a connection to your database; you need to specify:
MySQL connection
user
password
host
database name

library("RMySQL")
connection <- dbConnect(MySQL(), user = 'root', password = 'password', host = 'localhost', dbname = 'TheDB')

Using the connection, create a table and then export the data to the database:

dbWriteTable(connection, "testTable", testTable)

You can overwrite an existing table like this:

dbWriteTable(connection, "testTable", testTable_2, overwrite = TRUE)
I would advise against writing the SQL query by hand when you can use very handy functions such as dbWriteTable from the RMySQL package. But for the sake of practice, below is an example of how you would write an SQL query that does multiple inserts for a MySQL database:
# Set up a data.frame
dd <- data.frame(Quarter = c("16/17 Q1", "16/17 Q2"), Vendors = c("a","b"))
# Begin the query
sql_qry <- "insert into cc_demo (Quarter,Vendor) VALUES"
# Finish it with
sql_qry <- paste0(sql_qry, paste(sprintf("('%s', '%s')", dd$Quarter, dd$Vendors), collapse = ","))
You should get:
"insert into cc_demo (Quarter,Vendor) VALUES('16/17 Q1', 'a'),('16/17 Q2', 'b')"
You can provide this query to your database connection in order to run it.
I hope this helps.

Inserting multiple values against multiple column names in mysql using Python3 and pymysql

I'm looking for a clean way to add multiple values in one row, corresponding to a list of columns in MySQL.
Essentially I have two lists:

cols = ['Col_A', 'Col_B', 'Col_E', 'Col_H', 'Col_n', ...]
vals = ['1', '56', 'HEX 00 A0 DB 00', 'Pass', '87', ...]

The lists can be 100+ items long. Both lists will always be the same length, so each cols item has a corresponding vals item.
I am using pymysql to connect to an SQL database on a network storage device running MariaDB.
Here's a snippet of my non-working function attempt at passing the two lists:
import pymysql

def add_to_database(cols, vals):
    connection = pymysql.connect(host='11.22.33.44',
                                 user='usr',
                                 password='pass',
                                 db='my_db',
                                 charset='utf8mb4',
                                 cursorclass=pymysql.cursors.DictCursor,
                                 autocommit=True)
    cursor = connection.cursor()
    try:
        cursor.execute("CREATE TABLE data_tbl (%s)" % 'id INTEGER PRIMARY KEY')
    except:
        pass
    # Add Column and Values lists to database here
    for item in cols:
        try:
            # First add columns if not already present
            cursor.execute("ALTER TABLE data_tbl ADD COLUMN " + str(item))
        except:
            # Pass column add if already present
            pass
    cursor.execute("INSERT INTO data_tbl '{0}' VALUES '{1}';".format(cols, vals))  # fails: see error below
    connection.close()
    return
I'm still new to SQL and I've also been playing around with the SQL syntax, so apologies if the code looks a bit odd now.
The common error I get is below:
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''('Col_A', 'Col_B', 'Col_E', 'Col_H', 'Col_...' at line 1")
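A minimal sketch of how the failing INSERT could be built instead, with the column list formatted into the statement and the values passed as parameters; the backtick quoting and the reuse of cursor, cols, and vals from the function above are assumptions:

# Backticks quote column names for MariaDB/MySQL; %s placeholders let
# pymysql escape the values, instead of formatting the raw lists into SQL.
col_list = ", ".join("`{}`".format(c) for c in cols)
placeholders = ", ".join(["%s"] * len(vals))
sql = "INSERT INTO data_tbl ({}) VALUES ({})".format(col_list, placeholders)
cursor.execute(sql, vals)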