I'm using SQLAlchemy with pyodbc to restore an MSSQL ".bak" file. I've followed advice from previous posts about getting around transactions, but it doesn't seem to change anything. Any help would be appreciated.
from urllib.parse import quote_plus
from sqlalchemy import create_engine

params = quote_plus("Driver={SQL Server Native Client 11.0};"
                    "Server=Computer\\SQLEXPRESS;"
                    "Database=master;"
                    "Trusted_Connection=yes;")
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
connection = engine.raw_connection()

db_path = r"C:\Path\to\OutputDB.bak"
move_path = r"C:\Path\to\backup\db.mdf"
move_log_path = r"C:\Path\to\backup\db_Log.ldf"

sql_cmd = f"""
RESTORE DATABASE [db]
FROM DISK = N'{db_path}'
WITH FILE = 1,
MOVE N'db'
TO N'{move_path}',
MOVE N'test_log'
TO N'{move_log_path}',
RECOVERY,
NOUNLOAD,
REPLACE,
STATS = 5
"""

connection.autocommit = True
cursor = connection.cursor()
cursor.execute(sql_cmd)
while cursor.nextset():
    pass
connection.autocommit = False
I get the below error message:
ProgrammingError: ('42000', '[42000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Cannot perform a backup or restore operation within a transaction. (3021) (SQLExecDirectW); [42000] [Microsoft][SQL Server Native Client 11.0][SQL Server]RESTORE DATABASE is terminating abnormally. (3013)')
I managed to fix this by passing connect_args={'autocommit': True} to create_engine. Neither cursor.execute(sql_cmd).execution_options(autocommit=True) nor connection.autocommit = True appeared to work.
from urllib.parse import quote_plus
from sqlalchemy import create_engine

params = quote_plus("Driver={SQL Server Native Client 11.0};"
                    "Server=Computer\\SQLEXPRESS;"
                    "Database=master;"
                    "Trusted_Connection=yes;")
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params,
                       connect_args={'autocommit': True})
connection = engine.raw_connection()

db_path = r"C:\Path\to\OutputDB.bak"
move_path = r"C:\Path\to\backup\db.mdf"
move_log_path = r"C:\Path\to\backup\db_Log.ldf"

sql_cmd = f"""
RESTORE DATABASE [db]
FROM DISK = N'{db_path}'
WITH FILE = 1,
MOVE N'db'
TO N'{move_path}',
MOVE N'test_log'
TO N'{move_log_path}',
RECOVERY,
NOUNLOAD,
REPLACE,
STATS = 5
"""

cursor = connection.cursor()
cursor.execute(sql_cmd)
while cursor.nextset():
    pass
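Another option that should achieve the same thing on reasonably recent SQLAlchemy versions (a sketch, not tested against this exact setup) is to request autocommit through the engine's isolation_level instead of the DBAPI connect_args:

from urllib.parse import quote_plus
from sqlalchemy import create_engine

params = quote_plus("Driver={SQL Server Native Client 11.0};"
                    "Server=Computer\\SQLEXPRESS;"
                    "Database=master;"
                    "Trusted_Connection=yes;")

# isolation_level="AUTOCOMMIT" makes connections from this engine run outside
# an explicit transaction, which RESTORE DATABASE requires
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params,
                       isolation_level="AUTOCOMMIT")

connection = engine.raw_connection()
cursor = connection.cursor()
cursor.execute(sql_cmd)   # same RESTORE statement as above
while cursor.nextset():   # drain the STATS progress messages
    pass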
In my DAG I need to use pandas .to_sql.
It works when I build the SQLAlchemy engine directly, but that is not safe because the credentials end up hard-coded in the DAG.
When I use a connection from Airflow instead, it errors. Why?
import pandas as pd
from sqlalchemy import create_engine
from airflow.providers.postgres.hooks.postgres import PostgresHook  # airflow.hooks.postgres_hook on older Airflow

# This works, but it is not safe (credentials hard-coded in the DAG):
conn = create_engine('postgresql+psycopg2://login:pass@localhost:5432/de')

# Safer: use the connection defined in Airflow, PG_PROJECT_CONN
dwh_hook = PostgresHook('PG_PROJECT_CONN')
conn = dwh_hook.get_conn()

# Works OK:
df_ = pd.read_sql('select * from stg.restaurants;', conn)

# Doesn't work: WARNING - /usr/local/lib/python3.8/dist-packages/pandas/io/sql.py:761
# UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database
# string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested, please
# consider using SQLAlchemy.
df_restaurants.to_sql('restaurants', conn, if_exists='append', schema='stg', index=False, chunksize=100)
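One way around the warning (a sketch, assuming an Airflow version whose hooks expose get_sqlalchemy_engine(); check yours) is to hand pandas a SQLAlchemy engine built from the Airflow connection rather than a raw DBAPI connection:

import pandas as pd
from airflow.providers.postgres.hooks.postgres import PostgresHook  # airflow.hooks.postgres_hook on older Airflow

dwh_hook = PostgresHook('PG_PROJECT_CONN')

# builds an engine from the credentials stored in the Airflow connection,
# so nothing is hard-coded in the DAG
engine = dwh_hook.get_sqlalchemy_engine()

df_ = pd.read_sql('select * from stg.restaurants;', engine)
df_restaurants.to_sql('restaurants', engine, if_exists='append',
                      schema='stg', index=False, chunksize=100)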
I have Airflow installed on Ubuntu running under WSL on Windows.
I am trying to load a delimited file stored on my C drive into a MySQL database using the code below:
import datetime
import logging
import os
import csv

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.mysql_operator import MySqlOperator
from airflow.hooks.mysql_hook import MySqlHook


def bulk_load_sql(table_name, **kwargs):
    local_filepath = 'some c drive path'
    conn = MySqlHook(conn_name_attr='mysql_default')
    conn.bulk_load(table_name, local_filepath)
    return table_name


dag = DAG(
    "dag_name",
    start_date=datetime.datetime.now() - datetime.timedelta(days=1),
    schedule_interval=None)

t1 = PythonOperator(
    task_id='csv_to_stgtbl',
    provide_context=True,
    python_callable=bulk_load_sql,
    op_kwargs={'table_name': 'mysqltablnm'},
    dag=dag
)
It gives the following exception:
MySQLdb._exceptions.OperationalError: (2068, 'LOAD DATA LOCAL INFILE file request rejected due to restrictions on access.')
I have checked the following setting on MySQL and it is ON:
SHOW GLOBAL VARIABLES LIKE 'local_infile'
Could someone please provide some pointers on how to fix this?
Is there any other way I can load a delimited file into MySQL using Airflow?
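One thing worth checking first (this is an assumption on my part, so verify it against your Airflow version): error 2068 is raised by the client side, so the server's local_infile being ON is not enough; the MySQL client that Airflow uses also has to allow LOAD DATA LOCAL INFILE. As far as I know, MySqlHook enables that flag when the Airflow connection's Extra JSON contains "local_infile": true, and the hook's keyword argument is normally mysql_conn_id rather than conn_name_attr. A sketch:

# Assumption: your MySqlHook version reads "local_infile" from the
# connection's Extra JSON, e.g. set the mysql_default connection's
# Extra field in the Airflow UI to:
#   {"local_infile": true}

from airflow.hooks.mysql_hook import MySqlHook

def bulk_load_sql(table_name, **kwargs):
    local_filepath = 'some c drive path'
    # mysql_conn_id (not conn_name_attr) selects the Airflow connection
    hook = MySqlHook(mysql_conn_id='mysql_default')
    hook.bulk_load(table_name, local_filepath)  # runs LOAD DATA LOCAL INFILE
    return table_name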
For now, I have implemented a work around as follows:
def load_staging():
    mysqlHook = MySqlHook(conn_name_attr='mysql_default')
    conn = mysqlHook.get_conn()
    cursor = conn.cursor()
    csv_data = csv.reader(open('c drive file path'))
    header = next(csv_data)
    logging.info('Importing the CSV Files')
    for row in csv_data:
        cursor.execute("INSERT INTO table_name (col1,col2,col3) VALUES (%s, %s, %s)",
                       row)
    conn.commit()
    cursor.close()

t1 = PythonOperator(
    task_id='csv_to_stgtbl',
    python_callable=load_staging,
    dag=dag
)
However, it would have been great if LOAD DATA LOCAL INFILE had worked.
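As a side note, if the row-by-row loop turns out to be slow for bigger files, cursor.executemany can send the same INSERT as one batch. A sketch based on the workaround above (same placeholder table, columns, and file path):

def load_staging():
    mysqlHook = MySqlHook(conn_name_attr='mysql_default')
    conn = mysqlHook.get_conn()
    cursor = conn.cursor()
    with open('c drive file path') as f:
        csv_data = csv.reader(f)
        next(csv_data)            # skip the header row
        rows = list(csv_data)
    # executemany batches the inserts instead of one round trip per row
    cursor.executemany(
        "INSERT INTO table_name (col1, col2, col3) VALUES (%s, %s, %s)",
        rows)
    conn.commit()
    cursor.close()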
I am writing a Python 3 script that listens to a message queue and saves or updates data in MySQL depending on the message received.
I ran into some weird behaviour: I made a simple connection to MySQL using the pymysql module and everything worked fine, but after some time this simple connection closes.
So I decided to implement a connection pool for MySQL, and that is where the problem arises. Nothing errors, but the issue is the following:
My line
cursor = yield self._pool.execute(query, list(filters.values()))
gives a cursor result of tornado_mysql.pools.Pool object at 0x0000019DE5D71F98
and the script just gets stuck there, not doing anything more.
If I remove the yield, it gets past that line, but the next line
response = yield c.fetchall()
throws the error
AttributeError: 'Future' object has no attribute 'fetchall'
How can I fix the MySQL pool connection so that it works properly?
What I tried:
I tried a few modules for the pool connection; they all run into the same issue.
I switched back to a simple pymysql connection and it worked again.
Below is my code:
python script file
import pika
from tornado.gen import coroutine

from model import SyncModel

_model = SyncModel(conf, _server_id)


@coroutine
def main():
    credentials = pika.PlainCredentials('user', 'password')
    try:
        cp = pika.ConnectionParameters(
            host='127.0.0.1',
            port=5671,
            credentials=credentials,
            ssl=False,
        )
        connection = pika.BlockingConnection(cp)
        channel = connection.channel()

        @coroutine
        def callback(ch, method, properties, body):
            if 'messageType' in properties.headers:
                message_type = properties.headers['messageType']
                if message_type in allowed_message_types:
                    result = ptoto_file._reflection.ParseMessage(descriptors[message_type], body)
                    if result:
                        result = protobuf_to_dict(result)
                        if message_type == 'MyMessage':
                            yield _model.message_event(data=result)
                else:
                    print('Message type not in allowed list = ' + str(message_type))
                    print('continue listening...')

        channel.basic_consume(callback, queue='queue', no_ack=True)
        print(' [*] Waiting for messages. To exit press CTRL+C')
        channel.start_consuming()
    except Exception as e:
        print('Could not connect to host 127.0.0.1 on port 5671')
        print(str(e))


if __name__ == '__main__':
    main()
SyncModel
from tornado_mysql import pools
from tornado.gen import coroutine, Return
from tornado_mysql.cursors import DictCursor


class SyncModel(object):
    def __init__(self, conf, server_id):
        self.conf = conf
        servers = [i for i in conf.mysql.servers]
        for s in servers:
            if s['server_id'] == server_id:
                # s holds all connection data: host, user, port, autocommit, charset, db, password
                s['cursorclass'] = DictCursor
                self._pool = pools.Pool(s, max_idle_connections=1, max_recycle_sec=3)

    @coroutine
    def message_event(self, data):
        table_name = 'table_name'
        query = ''
        data = data['message']
        filters = {
            'id': data['id']
        }
        # here the connection fails as described above
        response = yield self.query_select(table_name, self._pool, filters=filters)

    @coroutine
    def query_select(self, table_name, _pool, filters=None):
        if filters is None:
            filters = {}
        combined_filters = ['`%s` = %%s' % i for i in filters.keys()]
        where = 'WHERE ' + ' AND '.join(combined_filters) if combined_filters else ''
        query = """SELECT * FROM `%s` %s""" % (table_name, where)
        c = self._pool.execute(query, list(filters.values()))
        response = yield c.fetchall()
        raise Return({response})
All of the code was working with just a simple connection to the database; after I started using the pool, the example stopped working. I would appreciate any help with this issue.
This is a standalone script.
The pool connection was not working, so I switched back to pymysql, this time double-checking the connection.
I would like to post the answer that worked; it was the only solution that worked for me:
before talking to MySQL, check whether the connection is open, and if it is not, reconnect:
if not self.mysql.open:
    self.mysql.ping(reconnect=True)
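For context, here is a minimal sketch of how that check can sit in front of every query, assuming self.mysql is a pymysql connection created elsewhere (class and method names are placeholders):

import pymysql


class Db(object):
    def __init__(self, **connect_kwargs):
        self.connect_kwargs = connect_kwargs
        self.mysql = pymysql.connect(**connect_kwargs)

    def _ensure_connection(self):
        # pymysql connections expose .open; ping(reconnect=True) re-establishes
        # the connection if the server closed it after an idle timeout
        if not self.mysql.open:
            self.mysql.ping(reconnect=True)

    def query(self, sql, params=None):
        self._ensure_connection()
        with self.mysql.cursor() as cursor:
            cursor.execute(sql, params or ())
            return cursor.fetchall()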
As part of hosting my website I have a MySQL server on pythonanywhere.com with some data collected from the site. I need to aggregate some of the information into a new table stored in the same database. If I use the code below I can create a new table, as confirmed by the SHOW TABLES query. However, I cannot see that table in the Django online interface provided alongside the SQL server.
Why is that the case? How can I make the new table visible in the Django interface so I can browse its content and modify it?
from __future__ import print_function
from mysql.connector import connect as sql_connect
import sshtunnel
from sshtunnel import SSHTunnelForwarder
from copy import deepcopy

sshtunnel.SSH_TIMEOUT = 5.0
sshtunnel.TUNNEL_TIMEOUT = 5.0


def try_query(query):
    try:
        cursor.execute(query)
        connection.commit()
    except Exception:
        connection.rollback()
        raise


if __name__ == '__main__':
    remote_bind_address = ('{}.mysql.pythonanywhere-services.com'.format(SSH_USERNAME), 3306)
    tunnel = SSHTunnelForwarder(('ssh.pythonanywhere.com'),
                                ssh_username=SSH_USERNAME, ssh_password=SSH_PASSWORD,
                                remote_bind_address=remote_bind_address)
    tunnel.start()
    connection = sql_connect(user=SSH_USERNAME, password=DATABASE_PASSWORD,
                             host='127.0.0.1', port=tunnel.local_bind_port,
                             database=DATABASE_NAME)
    print("Connection successful!")
    cursor = connection.cursor()  # get the cursor
    cursor.execute("USE {}".format(DATABASE_NAME))  # select the database
    cursor.execute("SHOW TABLES")
    prev_tables = deepcopy(cursor.fetchall())
    try_query("CREATE TABLE IF NOT EXISTS TestTable(TestName VARCHAR(255) PRIMARY KEY, SupplInfo VARCHAR(255))")
    print("Created table.")
    cursor.execute("SHOW TABLES")
    new_tables = deepcopy(cursor.fetchall())
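For what it's worth, the Django admin only shows tables that have a model registered with it, so a table created directly over SQL stays invisible until it is described to Django. A minimal sketch (app, model, and field names here are hypothetical and must be adapted to the real table):

# myapp/models.py
from django.db import models


class TestTable(models.Model):
    testname = models.CharField(max_length=255, primary_key=True, db_column='TestName')
    supplinfo = models.CharField(max_length=255, db_column='SupplInfo')

    class Meta:
        managed = False        # Django will not try to create or migrate this table
        db_table = 'TestTable'


# myapp/admin.py
from django.contrib import admin
from .models import TestTable

admin.site.register(TestTable)

Alternatively, python manage.py inspectdb can generate a starting model definition from the existing table.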
I'm using Python to export a large matrix (shape around 3000 * 3000) into MySQL.
Right now I'm using MySQLdb to insert those values but it's too troublesome and too inefficient. Here is my code:
# -*- coding:utf-8 -*-
import MySQLdb
import numpy as np
import pandas as pd
import time


def feature_to_sql_format(df):
    df = df.fillna(value='')
    columns = list(df.columns)
    index = list(df.index)
    index_sort = np.reshape([[int(i)] * len(columns) for i in index], (-1)).tolist()
    columns_sort = (columns * len(index))
    values_sort = df.values.reshape(-1).tolist()
    return str(zip(index_sort, columns_sort, values_sort))[1: -1].replace("'NULL'", 'NULL')


if __name__ == '__main__':
    t1 = time.clock()
    df = pd.read_csv('C:\\test.csv', header=0, index_col=0)
    output_string = feature_to_sql_format(df)
    sql_CreateTable = 'USE derivative_pool;DROP TABLE IF exists test1;' \
                      'CREATE TABLE test1(date INT NOT NULL, code VARCHAR(12) NOT NULL, value FLOAT NULL);'
    sql_Insert = 'INSERT INTO test (date,code,value) VALUES ' + output_string + ';'
    con = MySQLdb.connect(......)
    cur = con.cursor()
    cur.execute(sql_CreateTable)
    cur.close()
    cur = con.cursor()
    cur.execute(sql_Insert)
    cur.close()
    con.commit()
    con.close()
    t2 = time.clock()
    print t2 - t1
It takes around 274 seconds in total.
I was wondering if there is a simpler way to do this. I thought of exporting the matrix to CSV and then using LOAD DATA INFILE to import it, but that also seems too complicated.
I noticed in the pandas documentation that a DataFrame has a to_sql function, and in version 0.14 you could set the 'flavor' to 'mysql', that is:
df.to_sql(con=con, name=name, flavor='mysql')
But my pandas version is now 0.19.2 and the flavor has been reduced to only 'sqlite'. I still tried to use
df.to_sql(con=con, name=name, flavor='sqlite')
and it gives me an error.
Is there any convenient way to do this?
Later pandas versions support SQLAlchemy connectables instead of flavor='mysql'.
First, install dependencies:
pip install mysql-connector-python-rf==2.2.2
pip install MySQL-python==1.2.5
pip install SQLAlchemy==1.1.1
Then create the engine:
from sqlalchemy import create_engine
connection_string = "mysql+mysqlconnector://root:@localhost/MyDatabase"
engine = create_engine(connection_string)
Then you can use df.to_sql(...):
df.to_sql('MyTable', engine)
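Since the goal here is speed, it can also help to tune the to_sql call itself; these are standard pandas parameters, and the chunk size below is just a starting point rather than a tuned value:

# if_exists='replace' recreates the table, index=False skips writing the
# DataFrame index, and chunksize batches the INSERTs
df.to_sql('MyTable', engine, if_exists='replace', index=False, chunksize=1000)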
Here are some things you can do in MySQL to speed up your data load:
SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
SET SESSION tx_isolation='READ-UNCOMMITTED';
SET sql_log_bin = 0;
#LOAD DATA LOCAL INFILE....
SET UNIQUE_CHECKS = 1;
SET FOREIGN_KEY_CHECKS = 1;
SET SESSION tx_isolation='REPEATABLE-READ';