I'm playing with Python, trying to create a basic repository class (I normally use C++/C# for work), and am having an issue.
The following code bombs out on a = officesRepo(conn), saying "Too many positional arguments for constructor call", but it's being given the only argument specified, the MySQL connection object.
I'm coding in VS Code on Linux using Python 3.8. I'm wondering if pylint is expecting me to pass in "self", when I don't think it's needed.
Any help/advice/tips gratefully received. Flame away if you like, as long as it teaches me something! ;-)
import pymysql.cursors
import Pocos

class officesRepo:
    def __init__(conn):
        self.conn = conn

    def create(office):
        pass

    def getAll():
        cursor = conn.cursor()
        SQL = "SELECT `officeCode`, `city`, `phone`, `addressLine1`, `addressLine2`, `state`, `country`, `postalCode`, `territory` "
        SQL += "FROM `offices`"
        cursor.execute(SQL)
        #result = cursor.fetchone()
        ret = []
        for val in cursor:
            ret.append(ret.append(val["officeCode"], val["city"], val["phone"], val["addressLine1"], val["addressLine2"], val["state"], val["country"], val["postalCode"], val["territory"]))
        return ret

    def getById(id):
        pass

conn = pymysql.connect(host='localhost',
                       user='user',
                       password='password',
                       db='classicmodel',
                       charset='utf8mb4',
                       cursorclass=pymysql.cursors.DictCursor)
a = officesRepo(conn)
b = a.getAll()
print(b)
The first parameter of an instance method is self. You don't need to pass it explicitly when calling the method, but you do need to include it in the parameter list. Right now, the conn parameter is acting as self, and there are no other parameters after it (thus the error).
You'd need
def __init__(self, conn):
    . . .
and similarly for the other methods. All instance methods require an explicit self parameter.
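To make the fix concrete, here is a minimal sketch of a repository class with self added to every method. FakeConnection and FakeCursor are hypothetical stand-ins (not part of pymysql) so the example runs without a database; the class name OfficesRepo, the method name get_all and the shortened column list are also just for illustration.

```python
class FakeCursor:
    """Hypothetical stand-in for a pymysql DictCursor."""

    def __init__(self, rows):
        self._rows = rows

    def execute(self, sql):
        # A real cursor would send the SQL to the server; here we only record it.
        self.last_sql = sql

    def __iter__(self):
        return iter(self._rows)


class FakeConnection:
    """Hypothetical stand-in for the object returned by pymysql.connect()."""

    def __init__(self, rows):
        self._rows = rows

    def cursor(self):
        return FakeCursor(self._rows)


class OfficesRepo:
    def __init__(self, conn):
        # self is supplied by Python; the caller passes only conn.
        self.conn = conn

    def get_all(self):
        # The connection is reached through self, not through a global.
        cursor = self.conn.cursor()
        cursor.execute("SELECT `officeCode`, `city` FROM `offices`")
        return [(row["officeCode"], row["city"]) for row in cursor]


repo = OfficesRepo(FakeConnection([{"officeCode": "1", "city": "Boston"}]))
offices = repo.get_all()
print(offices)
```

With a real connection, the only change should be passing the result of pymysql.connect(...) instead of FakeConnection(...).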
Related
I am working on a personal project and was wondering if my solution for inserting data to a MySQL database would be considered "pythonic" and efficient.
I have written a separate class for that, which will be called from an object which holds a dataframe. From there I am calling my save() function to write the dataframe to the database.
The script will be running once a day where I scrape some data from some websites and save it to my database. So it is important that it really runs through completely even when I have bad data or temporary connection issues (script and database run on different machines).
import mysql.connector
# custom logger
from myLog import logger
# custom class for formatting the data, a lot of potential errors are handled here
from myFormat import myFormat
# insert strings to mysql are stored and referenced here
import sqlStrings

class saveSQL:
    def __init__(self):
        self.frmt = myFormat()
        self.host = 'XXX.XXX.XXX.XXX'
        self.user = 'XXXXXXXX'
        self.password = 'XXXXXXXX'
        self.database = 'XXXXXXXX'

    def save(self, payload, type):
        match type:
            case 'First':
                return self.__first(payload)
            case 'Second':
                ...
            case _:
                logger.error('Undefined Input for Type!')

    def __first(self, payload):
        try:
            self.mydb = mysql.connector.connect(host=self.host,user=self.user,password=self.password,database=self.database)
            mycursor = self.mydb.cursor()
        except mysql.connector.Error as err:
            logger.error('Couldn\'t establish connection to DB!')
        try:
            tmpList = payload.values.tolist()
        except ValueError:
            logger.error('Value error in converting dataframe to list: %s' % payload)
        try:
            mycursor.executemany(sqlStrings.First, tmpList)
            self.mydb.commit()
            dbWrite = mycursor.rowcount
        except mysql.connector.Error as err:
            logger.error('Error in writing to database: %s' % err)
            # fall back to inserting row by row
            for ele in tmpList:
                dbWrite = 0
                try:
                    mycursor.execute(sqlStrings.First, ele)
                    self.mydb.commit()
                    dbWrite = dbWrite + mycursor.rowcount
                except mysql.connector.Error as err:
                    logger.error('Error in writing to database: %s \n ele: %s' % (err, ele))
                    continue
                pass
        mycursor.close()
        return dbWrite
Things I am wondering about:
Is the match case a good option to distinguish between writing to different tables depending on the data?
Are the different try/except blocks really necessary or are there easier ways of handling potential errors?
Do I really need the pass command at the end of the for-loop?
Trying to implement a context manager for a MySQL connection, I get an error message.
I have used the mysql.connector module, including its connection option to connect to a database, and the PyMySQL module, always getting the same result.
import pymysql

class MySQLConnector:
    def __init__(self, con_dict):
        self.cnx = None
        self.con_dict = con_dict

    def __enter__(self):
        self.cnx = pymysql.connect(**self.con_dict)
        return self.cnx

    def __exit__(self):
        self.cnx.close()

class ReceiveBroke(QDialog):
    def __init__(self, db, config):
        super().__init__()
        with MySQLConnector(config) as
            cur = self.cnx.cnx.cursor()
            qry = "SELECT * FROM horses
            cur.execute(qry)
            result = cur.fetchall()
            print(result)
        self.setUI()
        self.conTest()

    def conTest(self):
        if self.cnx.ope
            print("y")
I hope to obtain a working context manager that closes the database connection on finishing the with block.
Result: an error message, always after executing the last line within the "with" block: "TypeError exit takes one positional argument but four were given", at which point the program crashes.
I hope you have found the error already. If not, I think the error message is quite clear:
TypeError exit takes one positional argument but four were given
The __exit__ function needs 4 parameters: self, type, value, traceback. The last 3 describe any exception that may have been raised in the with block.
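As a rough sketch of the corrected signature: the class below accepts any connection factory, so it runs here with a hypothetical DummyConnection instead of a real pymysql connection. The names ManagedConnection, DummyConnection and the connect parameter are assumptions made for the example.

```python
class ManagedConnection:
    def __init__(self, connect, **con_kwargs):
        # `connect` is any callable returning an object with close(),
        # e.g. pymysql.connect in real code (assumed, not exercised here).
        self._connect = connect
        self._con_kwargs = con_kwargs
        self.cnx = None

    def __enter__(self):
        self.cnx = self._connect(**self._con_kwargs)
        return self.cnx

    def __exit__(self, exc_type, exc_value, traceback):
        # The three extra parameters are None when the block ended normally.
        self.cnx.close()
        return False  # do not suppress exceptions raised inside the block


class DummyConnection:
    """Hypothetical stand-in that only records whether close() was called."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.closed = False

    def close(self):
        self.closed = True


with ManagedConnection(DummyConnection, host="localhost") as cnx:
    closed_inside = cnx.closed

closed_after = cnx.closed
print(closed_inside, closed_after)
```

Returning False (or None) from __exit__ lets any exception from the with block propagate after the connection is closed.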
I am creating a test Flask API, and have created a Database class that I use from my main app. I am using pymysql to access my MySQL DB but I am having trouble figuring out when to close the cursor and connection. Right now I have
import pymysql

class Database:
    def __init__(self):
        host = '127.0.0.1'
        user = 'root'
        password = ''
        db = 'API'
        self.con = pymysql.connect(host=host, user=user, password=password, db=db, cursorclass=pymysql.cursors.DictCursor, autocommit=True)
        self.cur = self.con.cursor()

    def getUser(self, id):
        sql = 'SELECT * from users where id = %d'
        self.cur.execute(sql, (id))
        result = self.cur.fetchall()
        return result

    def getAllUsers(self):
        sql = 'SELECT * from users'
        self.cur.execute(sql)
        result = self.cur.fetchall()
        return result

    def AddUser(self, firstName, lastName, email):
        sql = "INSERT INTO `users` (`firstName`, `lastName`, `email`) VALUES (%s, %s, %s)"
        self.cur.execute(sql, (firstName, lastName, email))
I have tried adding self.cur.close() and self.con.close() after each execution of the cursor in the functions but then I get an error the next time I call a function saying the cursor is closed, or after I do an insert statement it won't show the new value even though it was inserted correctly into MySQL. How do I know when to close the cursor, and how to start it back up properly with each call to a method?
This sounds like a great use case for a Python context manager. Context managers let you manage resources such as a database connection by specifying how the resource's set-up and tear-down should work. You can create your own custom context manager in one of two ways: first, by wrapping your database class and implementing the required methods for the context manager: __init__(), __enter__(), and __exit__(); second, by using the @contextmanager decorator on a function definition and creating a generator for your database resource within that function. I will show both approaches and let you decide which one you prefer.
The __init__() method is the initialization method for your custom context manager, similar to the initialization method used for custom Python classes. The __enter__() method is the setup code for your custom context manager. Lastly, the __exit__() method is the teardown code. Both approaches rely on these methods; the main difference is that the first approach states them explicitly in your class definition, whereas in the second approach all the code up to your generator's yield statement is your initialization and setup code, and all the code after the yield statement is your teardown code.
I would also consider extracting your user-related database actions into a user model class. Something along the lines of:
custom context manager (class-based approach):
import pymysql

class MyDatabase():
    def __init__(self):
        self.host = '127.0.0.1'
        self.user = 'root'
        self.password = ''
        self.db = 'API'
        self.con = None
        self.cur = None

    def __enter__(self):
        # connect to database
        self.con = pymysql.connect(host=self.host, user=self.user, password=self.password, db=self.db, cursorclass=pymysql.cursors.DictCursor, autocommit=True)
        self.cur = self.con.cursor()
        return self.cur

    def __exit__(self, exc_type, exc_val, traceback):
        # params after self are for dealing with exceptions
        self.con.close()
user.py (refactored):
# import your custom context manager created from the step above
# if you called your custom context manager file my_database.py: from my_database import MyDatabase
import <custom_context_manager>

class User:
    def getUser(self, id):
        # PyMySQL uses %s as the placeholder for every type, and the
        # arguments must be a sequence, hence the one-element tuple (id,)
        sql = 'SELECT * from users where id = %s'
        with MyDatabase() as db:
            db.execute(sql, (id,))
            result = db.fetchall()
        return result

    def getAllUsers(self):
        sql = 'SELECT * from users'
        with MyDatabase() as db:
            db.execute(sql)
            result = db.fetchall()
        return result

    def AddUser(self, firstName, lastName, email):
        sql = "INSERT INTO `users` (`firstName`, `lastName`, `email`) VALUES (%s, %s, %s)"
        with MyDatabase() as db:
            db.execute(sql, (firstName, lastName, email))
context manager (decorator approach):
from contextlib import contextmanager
import pymysql

@contextmanager
def my_database():
    host = '127.0.0.1'
    user = 'root'
    password = ''
    db = 'API'
    # connect before the try block, so that finally never refers to an
    # undefined con when the connection attempt itself fails
    con = pymysql.connect(host=host, user=user, password=password, db=db, cursorclass=pymysql.cursors.DictCursor, autocommit=True)
    try:
        cur = con.cursor()
        yield cur
    finally:
        con.close()
Then within your User class you could use the context manager by first importing the file and then using it much as before:
with my_database() as db:
    sql = <whatever sql stmt you wish to execute>
    # db action
    db.execute(sql)
Hopefully that helps!
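For a version of the decorator approach that can be run without a MySQL server, here is a small sketch that swaps in the standard-library sqlite3 module. That swap is an assumption made purely for illustration: sqlite3 uses ? placeholders where PyMySQL uses %s, and the table and column names are made up.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def my_database(path=":memory:"):
    # Setup: everything before the yield runs on entering the with block.
    con = sqlite3.connect(path)
    try:
        yield con.cursor()
        # Reached only if the with block finished without an exception.
        con.commit()
    finally:
        # Teardown: runs whether or not the block raised.
        con.close()


with my_database() as db:
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, firstName TEXT)")
    db.execute("INSERT INTO users (firstName) VALUES (?)", ("Ada",))
    db.execute("SELECT firstName FROM users WHERE id = ?", (1,))
    rows = db.fetchall()

print(rows)
```

Each with block opens and closes its own connection, which avoids the "cursor is closed" errors from the question at the cost of reconnecting on every call.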
I've tried to implement this pipeline in my spider.
After installing the necessary dependencies I am able to run the spider without any errors but for some reason it doesn't write to my database.
I'm pretty sure something is going wrong with connecting to the database. When I enter a wrong password, I still don't get any error.
After the spider has scraped all the data, it takes a few minutes before it starts dumping the stats.
2017-08-31 13:17:12 [scrapy] INFO: Closing spider (finished)
2017-08-31 13:17:12 [scrapy] INFO: Stored csv feed (27 items) in: test.csv
2017-08-31 13:24:46 [scrapy] INFO: Dumping Scrapy stats:
Pipeline:
import MySQLdb.cursors
from twisted.enterprise import adbapi
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
from scrapy.utils.project import get_project_settings
from scrapy import log

SETTINGS = {}
SETTINGS['DB_HOST'] = 'mysql.domain.com'
SETTINGS['DB_USER'] = 'username'
SETTINGS['DB_PASSWD'] = 'password'
SETTINGS['DB_PORT'] = 3306
SETTINGS['DB_DB'] = 'database_name'

class MySQLPipeline(object):
    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.stats)

    def __init__(self, stats):
        print "init"
        #Instantiate DB
        self.dbpool = adbapi.ConnectionPool ('MySQLdb',
            host=SETTINGS['DB_HOST'],
            user=SETTINGS['DB_USER'],
            passwd=SETTINGS['DB_PASSWD'],
            port=SETTINGS['DB_PORT'],
            db=SETTINGS['DB_DB'],
            charset='utf8',
            use_unicode = True,
            cursorclass=MySQLdb.cursors.DictCursor
        )
        self.stats = stats
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_closed(self, spider):
        print "close"
        """ Cleanup function, called after crawling has finished to close open
        objects.
        Close ConnectionPool. """
        self.dbpool.close()

    def process_item(self, item, spider):
        print "process"
        query = self.dbpool.runInteraction(self._insert_record, item)
        query.addErrback(self._handle_error)
        return item

    def _insert_record(self, tx, item):
        print "insert"
        result = tx.execute(
            " INSERT INTO matches(type,home,away,home_score,away_score) VALUES (soccer,"+item["home"]+","+item["away"]+","+item["score"].explode("-")[0]+","+item["score"].explode("-")[1]+")"
        )
        if result > 0:
            self.stats.inc_value('database/items_added')

    def _handle_error(self, e):
        print "error"
        log.err(e)
Spider:
import scrapy
import dateparser
from crawling.items import KNVBItem

class KNVBspider(scrapy.Spider):
    name = "knvb"
    start_urls = [
        'http://www.knvb.nl/competities/eredivisie/uitslagen',
    ]
    custom_settings = {
        'ITEM_PIPELINES': {
            'crawling.pipelines.MySQLPipeline': 301,
        }
    }

    def parse(self, response):
        # www.knvb.nl/competities/eredivisie/uitslagen
        for row in response.xpath('//div[@class="table"]'):
            for div in row.xpath('./div[@class="row"]'):
                match = KNVBItem()
                match['home'] = div.xpath('./div[@class="value home"]/div[@class="team"]/text()').extract_first()
                match['away'] = div.xpath('./div[@class="value away"]/div[@class="team"]/text()').extract_first()
                match['score'] = div.xpath('./div[@class="value center"]/text()').extract_first()
                match['date'] = dateparser.parse(div.xpath('./preceding-sibling::div[@class="header"]/span/span/text()').extract_first(), languages=['nl']).strftime("%d-%m-%Y")
                yield match
If there are better pipelines available to do what I'm trying to achieve that'd be welcome as well. Thanks!
Update:
With the link provided in the accepted answer I eventually got to this function that's working (and thus solved my problem):
def process_item(self, item, spider):
    print "process"
    query = self.dbpool.runInteraction(self._insert_record, item)
    query.addErrback(self._handle_error)
    query.addBoth(lambda _: item)
    return query
Take a look at this for how to use adbapi with MySQL for saving scraped items. Note the difference between your process_item and their process_item implementation. While you return the item immediately, they return a Deferred object, which is the result of the runInteraction method and which returns the item upon its completion. I think this is the reason your _insert_record never gets called.
If you can see the insert in your output that's already a good sign.
I'd rewrite the insert function this way:
def _insert_record(self, tx, item):
    print "insert"
    raw_sql = "INSERT INTO matches(type,home,away,home_score,away_score) VALUES ('%s', '%s', '%s', '%s', '%s')"
    # Python strings have no explode() method; split() is the equivalent
    sql = raw_sql % ('soccer', item['home'], item['away'], item['score'].split('-')[0], item['score'].split('-')[1])
    print sql
    result = tx.execute(sql)
    if result > 0:
        self.stats.inc_value('database/items_added')
It allows you to debug the SQL you're using. In your version you're not wrapping the strings in ', which is a syntax error in MySQL.
I'm not sure about your last values (score), so I treated them as strings.
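Once the statement is known to work, the quoting can be left to the driver by passing the values as bind parameters instead of interpolating them into the string. The sketch below assumes the score looks like "2-1" and that the transaction object forwards execute(sql, params) to the underlying DB-API cursor; RecordingTx is a hypothetical stand-in so the example runs without Twisted or MySQL.

```python
def insert_record(tx, item):
    """Insert one match; tx only needs an execute(sql, params) method."""
    # Assumption: the scraped score has the form "home-away", e.g. "2-1".
    home_score, away_score = item["score"].split("-")
    sql = (
        "INSERT INTO matches (type, home, away, home_score, away_score) "
        "VALUES (%s, %s, %s, %s, %s)"
    )
    params = (
        "soccer",
        item["home"],
        item["away"],
        home_score.strip(),
        away_score.strip(),
    )
    # The driver quotes and escapes each value; no manual ' wrapping needed.
    return tx.execute(sql, params)


class RecordingTx:
    """Hypothetical stand-in that records what would be sent to the database."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params):
        self.calls.append((sql, params))
        return 1


tx = RecordingTx()
insert_record(tx, {"home": "Ajax", "away": "PSV", "score": "2-1"})
print(tx.calls[0][1])
```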
How can I escape the input to a MySQL db in Python3?
I'm using PyMySQL and it works fine, but when I try to do something like:
cursor.execute("SELECT * FROM `Codes` WHERE `ShortCode` = '{}'".format(request[1]))
it won't work if the string has ' or ". I also tried:
cursor.execute("SELECT * FROM `Codes` WHERE `ShortCode` = %s",request[1])
The problem with this is that the library (PyMySQL) uses the formatting syntax for Python2.x, %, that doesn't work anymore.
I also found this possible solution
conn.escape_string()
in here, but I don't know where to add this code.
This is all I got:
import pymysql
import sys
conn = pymysql.connect( host = "localhost",
user = "test",
passwd = "",
db = "test")
cursor = conn.cursor()
cursor.execute("SELECT * FROM `Codes` WHERE `ShortCode` = {}".format(request[1]))
result = cursor.fetchall()
cursor.close()
conn.close()
Edit: I solved it! In PyMySQL the right way is like this:
import pymysql
import sys
conn = pymysql.connect(host="localhost",
user="test",
passwd="",
db="test")
cursor = conn.cursor()
text = conn.escape(request[1])
cursor.execute("SELECT * FROM `Codes` WHERE `ShortCode` = {}".format(text))
cursor.close()
conn.close()
Where the text = conn.escape(request[1]) line is what escapes the code. Found it inside PyMySQL code. There, request[1] is the input.
Although the "solved" answer works, it is not best practice. When using a library conforming to the Python DB-API, you should use bind variables rather than formatting a string and passing it to execute. Building the query by string formatting leaves it open to SQL injection whenever a value slips through unescaped.
Therefore, this is the right way to do it:
cursor.execute("SELECT * FROM `Codes` WHERE `ShortCode` = %s", (request[1],))
Note that this is not a format string but a bind variable passed to the executing cursor, so pass the raw input rather than the result of conn.escape().
For details: Python DB-API PEP (PEP 249)
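To see bind variables coping with quotes, here is a small runnable sketch. It uses the standard-library sqlite3 module in place of PyMySQL, which is an assumption made so that no MySQL server is needed; sqlite3 writes the placeholder as ? where PyMySQL writes %s, but the idea is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE Codes (ShortCode TEXT)")

# A value containing both kinds of quote, passed as-is with no manual escaping.
tricky = "it's \"quoted\""
cursor.execute("INSERT INTO Codes (ShortCode) VALUES (?)", (tricky,))

# The value travels separately from the SQL text, so it cannot alter the query.
cursor.execute("SELECT ShortCode FROM Codes WHERE ShortCode = ?", (tricky,))
result = cursor.fetchall()

cursor.close()
conn.close()
print(result)
```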
Ready-to-use helper function:
def mysql_insert(conn, table, row):
    cols = ', '.join('`{}`'.format(col) for col in row.keys())
    vals = ', '.join('%({})s'.format(col) for col in row.keys())
    sql = 'INSERT INTO `{0}` ({1}) VALUES ({2})'.format(table, cols, vals)
    conn.cursor().execute(sql, row)
    conn.commit()
Usage example:
mysql_insert(conn, 'people', {
    'firstname': 'John',
    'lastname': 'Doe',
    'age': 18,
})
Reference: https://github.com/PyMySQL/PyMySQL/blob/master/pymysql/cursors.py#L157-L158
def execute(self, query, args=None):
    If args is a list or tuple, %s can be used as a placeholder in the query.
    If args is a dict, %(name)s can be used as a placeholder in the query.
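To show what the helper actually hands to execute, the sketch below isolates the string-building step in a hypothetical function, build_insert_sql, and prints the resulting statement. Only the values go through placeholders; the table and column names are formatted straight into the SQL, so they should come from trusted code and never from user input.

```python
def build_insert_sql(table, row):
    # Column names become backtick-quoted identifiers.
    cols = ", ".join("`{}`".format(col) for col in row)
    # Each value becomes a named placeholder, filled in later by the driver.
    vals = ", ".join("%({})s".format(col) for col in row)
    return "INSERT INTO `{0}` ({1}) VALUES ({2})".format(table, cols, vals)


sql = build_insert_sql("people", {"firstname": "John", "lastname": "Doe", "age": 18})
print(sql)
```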