Lambda (Python 3.6) PyMySQL EXISTS query always returns 1 - mysql

I am trying to get a PyMySQL query in Lambda (Python 3.6) to return whether a user exists or not. I pass my Slack user ID into the query; this is what I want to check in MySQL. I can run the same query directly in MySQL and it returns 0, but for some reason, every time I call this query through Lambda, it tells me the user exists (my database is empty). My query function is this:
def userExists(user):
    statement = f"SELECT EXISTS(SELECT 1 FROM slackDB.Assets WHERE userID LIKE '%{user}%')Assets"
    tempBool = cursor.execute(statement, args=None)
    conn.commit()
    return tempBool
Here is the full code I am working with:
################################
# Slack Lambda handler.
################################
import sys
import logging
import os
import pymysql
import urllib

# Grab data from the environment.
BOT_TOKEN = os.environ["BOT_TOKEN"]
ASSET_TABLE = os.environ["ASSET_TABLE"]
REGION_NAME = os.getenv('REGION_NAME', 'us-east-2')

DB_NAME = "admin"
DB_PASSWORD = "somepassword"
DB_DATABASE = "someDB"
RDS_HOST = "myslackdb.somepseudourl.us-east-2.rds.amazonaws.com"
port = 3306

logger = logging.getLogger()
logger.setLevel(logging.INFO)

try:
    conn = pymysql.connect(RDS_HOST, user=DB_NAME, passwd=DB_PASSWORD, db=DB_DATABASE, connect_timeout=5)
    cursor = conn.cursor()
except:
    logger.error("ERROR: Unexpected error: Could not connect to MySql instance.")
    sys.exit()

# Define the URL of the targeted Slack API resource.
SLACK_URL = "https://slack.com/api/chat.postMessage"

def userExists(user):
    statement = f"SELECT EXISTS(SELECT 1 FROM slackDB.Assets WHERE userID LIKE '%{user}%')Assets"
    tempBool = cursor.execute(statement, args=None)
    conn.commit()
    return tempBool

def addUser(user):
    statement = f"INSERT INTO `slackDB`.`Assets` (`userID`, `money`) VALUES ('{user}', '1000')"
    tempBool = cursor.execute(statement, args=None)
    conn.commit()
    return tempBool

def lambda_handler(data, context):
    # Slack challenge answer.
    if "challenge" in data:
        return data["challenge"]

    # Grab the Slack channel data.
    slack_event = data['event']
    slack_userID = slack_event["user"]
    slack_text = slack_event["text"]
    channel_id = slack_event["channel"]
    slack_reply = ""

    # Ignore bot messages.
    if "bot_id" in slack_event:
        slack_reply = ""
    else:
        # Start data sift.
        if slack_text.startswith("!networth"):
            slack_reply = "Your networth is: "
        elif slack_text.startswith("!price"):
            command, asset = text.split()
            slack_reply = f"The price of a(n) {asset} is: "
        elif slack_text.startswith("!addme"):
            if userExists(slack_userID):
                slack_reply = f"User {slack_userID} already exists"
            else:
                slack_reply = f"Adding user {slack_userID}"
                addUser(slack_userID)

    # We need to send back three pieces of information:
    data = urllib.parse.urlencode(
        (
            ("token", BOT_TOKEN),
            ("channel", channel_id),
            ("text", slack_reply)
        )
    )
    data = data.encode("ascii")

    # Construct the HTTP request that will be sent to the Slack API.
    request = urllib.request.Request(
        SLACK_URL,
        data=data,
        method="POST"
    )
    # Add a header mentioning that the text is URL-encoded.
    request.add_header(
        "Content-Type",
        "application/x-www-form-urlencoded"
    )

    # Fire off the request!
    urllib.request.urlopen(request).read()

    # Everything went fine.
    return "200 OK"
I am typing '!addme' in Slack and it always tells me the user exists. I have printed out my query statement and it is inserting my Slack ID correctly. I have checked my table, and it is completely empty. I have run the query in MySQL and it returns 0.
Does anyone have any ideas? Am I just derping this up on something easy? Any help or hints are much appreciated.
Thanks,

I don't see a fetch from the cursor, just the execute.
The return value from execute is the number of rows affected. For DML operations (INSERT/UPDATE/DELETE) that makes sense, but I wouldn't rely on the affected-rows count for a SELECT.
In this case, the SELECT EXISTS query is going to either return a row or throw an error. But the fact that the query returns a row doesn't tell us anything about the value of the Assets column.
From the query, it looks like we want to fetch a row, and then determine whether the Assets column contains a 0 or a 1 (or NULL).
After the query execution, try cursor.fetchone() to retrieve the row.
We could also execute a simpler query, and then use a fetch to determine whether a row is returned or not.
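A minimal sketch of that approach, keeping the question's table and column names (slackDB.Assets, userID) and the module-level cursor, but reading the 0/1 from the fetched row rather than from execute()'s return value (the switch to a placeholder parameter is my adjustment, not part of the original code):

def userExists(user):
    # EXISTS(...) always returns exactly one row with a single 0/1 column.
    statement = "SELECT EXISTS(SELECT 1 FROM slackDB.Assets WHERE userID LIKE %s) AS Assets"
    cursor.execute(statement, ('%' + user + '%',))
    row = cursor.fetchone()      # e.g. (0,) or (1,)
    return bool(row[0]) if row else False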

Related

Inserting to MySQL with mysql.connector - good practice/efficiency

I am working on a personal project and was wondering if my solution for inserting data to a MySQL database would be considered "pythonic" and efficient.
I have written a separate class for that, which will be called from an object which holds a dataframe. From there I am calling my save() function to write the dataframe to the database.
The script will run once a day: I scrape some data from some websites and save it to my database. So it is important that it really runs through completely, even when there is bad data or a temporary connection issue (the script and database run on different machines).
import mysql.connector
# custom logger
from myLog import logger
# custom class for formatting the data, a lot of potential errors are handled here
from myFormat import myFormat
# insert strings to mysql are stored and referenced here
import sqlStrings

class saveSQL:
    def __init__(self):
        self.frmt = myFormat()
        self.host = 'XXX.XXX.XXX.XXX'
        self.user = 'XXXXXXXX'
        self.password = 'XXXXXXXX'
        self.database = 'XXXXXXXX'

    def save(self, payload, type):
        match type:
            case 'First':
                return self.__first(payload)
            case 'Second':
                ...
            case _:
                logger.error('Undefined Input for Type!')

    def __first(self, payload):
        try:
            self.mydb = mysql.connector.connect(host=self.host, user=self.user, password=self.password, database=self.database)
            mycursor = self.mydb.cursor()
        except mysql.connector.Error as err:
            logger.error('Couldn\'t establish connection to DB!')

        try:
            tmpList = payload.values.tolist()
        except ValueError:
            logger.error('Value error in converting dataframe to list: ' % payload)

        try:
            mycursor.executemany(sqlStrings.First, tmpList)
            self.mydb.commit()
            dbWrite = mycursor.rowcount
        except mysql.connector.Error as err:
            logger.error('Error in writing to database: %s' % err)
            for ele in myList:
                dbWrite = 0
                try:
                    mycursor.execute(sqlStrings.First, ele)
                    self.mydb.commit()
                    dbWrite = dbWrite + mycursor.rowcount
                except mysql.connector.Error as err:
                    logger.error('Error in writing to database: %s \n ele: %s' % [err, ele])
                    continue
                pass

        mycursor.close()
        return dbWrite
Things I am wondering about:
Is the match case a good option to distinguish between writing to different tables depending on the data?
Are the different try/except blocks really necessary or are there easier ways of handling potential errors?
Do I really need the pass command at the end of the for-loop?
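For reference, here is one possible restructuring of the bulk-insert-with-fallback step (my own sketch, not reviewed code): it keeps sqlStrings.First, logger, and the connection from the class above, drops the trailing pass and the continue, and iterates over tmpList rather than the undefined myList:

def __write_rows(self, mycursor, tmpList):
    # Try the bulk insert first; fall back to row-by-row inserts if it fails.
    try:
        mycursor.executemany(sqlStrings.First, tmpList)
        self.mydb.commit()
        return mycursor.rowcount
    except mysql.connector.Error as err:
        logger.error('Bulk insert failed, retrying row by row: %s' % err)

    written = 0
    for ele in tmpList:
        try:
            mycursor.execute(sqlStrings.First, ele)
            self.mydb.commit()
            written += mycursor.rowcount
        except mysql.connector.Error as err:
            logger.error('Error in writing to database: %s, ele: %s' % (err, ele))
    return written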

Draw a graph from a SQL request using Dash

I have been searching, but I didn't find a simple way to draw a graph from a SQL request.
For example, I have this code, and I want to make a bar chart from the result of the request:
import pymysql as sql
from dash import dcc

DB = ...
HOST = ...
USER = ...
PASSWORD = ...

connection = sql.connect(host=HOST,
                         port=x,
                         user=USER,
                         password=PASSWORD,
                         database=DB,
                         cursorclass=sql.cursors.DictCursor)

with connection.cursor() as cursor:
    # Read a single record
    sql = 'SELECT COUNT(*) AS count FROM table'
    cursor.execute(sql)
    result = cursor.fetchone()
    print(result)
Also, I would like to update the chart regularly.
Thank you
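For what it's worth, the usual pattern (a sketch under my own assumptions, not code from the question) is to build a Plotly figure from the fetched rows inside a Dash callback and drive it with dcc.Interval so the chart re-queries on a timer. The GROUP BY query and the category/count column names below are placeholders; DB, HOST, USER and PASSWORD are the settings defined in the question's code:

import pymysql as sql
import plotly.graph_objects as go
from dash import Dash, dcc, html
from dash.dependencies import Input, Output

app = Dash(__name__)
app.layout = html.Div([
    dcc.Graph(id="count-bar"),
    dcc.Interval(id="refresh", interval=60 * 1000),  # re-run the query every minute
])

@app.callback(Output("count-bar", "figure"), Input("refresh", "n_intervals"))
def update_chart(_):
    # Re-open the connection on each refresh so a stale connection can't break the chart.
    connection = sql.connect(host=HOST, user=USER, password=PASSWORD,
                             database=DB, cursorclass=sql.cursors.DictCursor)
    with connection.cursor() as cursor:
        # Hypothetical query: one row per category with a count column.
        cursor.execute("SELECT category, COUNT(*) AS count FROM table GROUP BY category")
        rows = cursor.fetchall()
    connection.close()
    return go.Figure(go.Bar(x=[r["category"] for r in rows],
                            y=[r["count"] for r in rows]))

if __name__ == "__main__":
    app.run_server(debug=True)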

FastAPI not running all the functions to return the right values from database

I am trying to make a Twitter points program. Basically, you get points based on the number of likes, retweets and replies your post with a specified hashtag gets. I made an API to get these points from a database, but FastAPI is not running all the functions specified to return the correct values.
API code:
DATABASE_URL = "mysql+mysqlconnector://root:password@localhost:3306/twitterdb"
database = Database(DATABASE_URL)
metadata_obj = MetaData()
engine = create_engine(
    DATABASE_URL, connect_args={"check_same_thread": False}
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
metadata = sqlalchemy.MetaData()
Base = declarative_base()

user_points = sqlalchemy.Table(
    "points",
    metadata_obj,
    sqlalchemy.Column("username", sqlalchemy.String,),
    sqlalchemy.Column("rt_points", sqlalchemy.Integer,),
    sqlalchemy.Column("reply_points", sqlalchemy.Integer),
    sqlalchemy.Column("like_points", sqlalchemy.Integer),
    sqlalchemy.Column("total_points", sqlalchemy.Integer)
)

engine = sqlalchemy.create_engine(
    DATABASE_URL
)
metadata.create_all(engine)

app = FastAPI()

@app.on_event("startup")
async def connect():
    await database.connect()

@app.on_event("shutdown")
async def shutdown():
    await database.disconnect()

class UserName(BaseModel):
    rt_points: int
    reply_points: int
    like_points: int
    total_points: int

@app.get('/userdata/', response_model=UserName)
async def get_points(user: str):
    username = user
    metrics.clear()
    tweets_list = tweet_id(username)
    tweets_list.get_tweet_ids(str(username))
    metrics.main()
    summing = summer(username)
    summing.sum_fun(str(username))
    query = user_points.select().where(user_points.c.username == username)
    user = await database.fetch_one(query)
    return {**user}

if __name__ == "__main__":
    uvicorn.run("main:app", reload=True, host="127.0.0.1", port=5000, log_level="info")
code for metrics.py:
ids = []

class tweet_id:
    def __init__(self, name):
        self.name = name

    def get_tweet_ids(self, name):
        try:
            connection = mysql.connector.connect(host='localhost',
                                                 database='twitterdb',
                                                 user='root',
                                                 password='password')
            cursor = connection.cursor()
            query = "truncate table twitterdb.points"
            query1 = "truncate table twitterdb.Metrics"
            sql_select_query = """SELECT tweetid FROM twitterdb.StreamData WHERE username = %s"""
            # set variable in query
            cursor.execute(query)
            cursor.execute(query1)
            cursor.execute(sql_select_query, (name,))
            # fetch result
            record = cursor.fetchall()
            for row in record:
                ids.append(int(row[0]))
        except mysql.connector.Error as error:
            print("Failed to get record from MySQL table: {}".format(error))
        finally:
            if connection.is_connected():
                cursor.close()
                connection.close()

def create_url():
    tweet_fields = "tweet.fields=public_metrics"
    converted_list = [str(element) for element in ids]
    id_list = ",".join(converted_list)
    url = "https://api.twitter.com/2/tweets?ids={}&{}".format(id_list, tweet_fields)
    return url

# curl 'https://api.twitter.com/2/tweets?ids=1459764778088337413&tweet.fields=public_metrics&expansions=attachments.media_keys&media.fields=public_metrics' --header 'Authorization: Bearer $Bearer

def bearer_oauth(r):
    """
    Method required by bearer token authentication.
    """
    r.headers["Authorization"] = f"Bearer {bearer_token}"
    return r

def connect_to_endpoint(url):
    response = requests.request("GET", url, auth=bearer_oauth)
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            "Request returned an error: {} {} {}".format(
                response.status_code, response.text, ids
            )
        )
        return url
    return response.json()

def main():
    def append_to_database(json_response):
        # Loop through each tweet
        for tweet in json_response['data']:
            # Tweet ID
            tweetid = tweet['id']
            # Tweet metrics
            retweet_count = tweet['public_metrics']['retweet_count']
            reply_count = tweet['public_metrics']['reply_count']
            like_count = tweet['public_metrics']['like_count']
            quote_count = tweet['public_metrics']['quote_count']
            connect(tweetid, retweet_count, reply_count, like_count, quote_count)

    def connect(tweetid, retweet_count, reply_count, like_count, quote_count):
        """
        connect to MySQL database and insert twitter data
        """
        try:
            con = mysql.connector.connect(host='localhost',
                                          database='twitterdb', user='root', password='passsword', charset='utf8')
            if con.is_connected():
                """
                Insert twitter data
                """
                cursor = con.cursor(buffered=True)
                # twitter, golf
                delete_previous_data_query = "truncate table Metrics"
                query = "INSERT INTO Metrics (tweetid,retweet_count,reply_count,like_count,quote_count) VALUES (%s, %s, %s, %s, %s)"
                cursor.execute(delete_previous_data_query)
                cursor.execute(query, (tweetid, retweet_count, reply_count, like_count, quote_count))
                con.commit()
        except Error as e:
            print(e)
        cursor.close()
        con.close()
        return

    url = create_url()
    json_response = connect_to_endpoint(url)
    append_to_database(json_response)

# Function to calculate sum of points and display it
class summer:
    def __init__(self, name):
        self.name = name

    def sum_fun(self, name):
        try:
            con = mysql.connector.connect(host='localhost',
                                          database='twitterdb', user='root', password='password', charset='utf8')
            if con.is_connected():
                cursor = con.cursor(buffered=True)

                def create_points_table():
                    query = ("INSERT INTO twitterdb.points(username, rt_points,reply_points,like_points,total_points) (SELECT %s, SUM(quote_count + retweet_count) * 150, SUM(reply_count) * 50, SUM(like_count) * 10, SUM(quote_count + retweet_count) * 150 + SUM(reply_count) * 50 + SUM(like_count) * 10 FROM twitterdb.Metrics)")
                    cursor.execute(query, (name,))
                    con.commit()

                create_points_table()
        except Error as e:
            print(e)
        cursor.close()
        con.close()

def clear():
    """
    connect to MySQL database and insert twitter data
    """
    try:
        con = mysql.connector.connect(host='localhost',
                                      database='twitterdb', user='root', password='password', charset='utf8')
        if con.is_connected():
            cursor = con.cursor(buffered=True)
            clear_points = ("truncate table twitterdb.points")
            cursor.execute(clear_points)
    except Error as e:
        print(e)
    cursor.close()
    con.close()
    return
What happens here is that there's a database named twitterdb with the tables StreamData, Metrics, and points.
StreamData contains the tweet IDs and usernames of the posts that were tweeted with the specified hashtag, and it is built with the Streaming API.
The issue is this: suppose I have the usernames mark and ramon in the StreamData table. When I input the username mark via the API, no issues happen and it returns the correct points for mark. But if I then enter something like mark1, or any random value, it returns the points for mark again. Then if I enter ramon it gives the right points for ramon, but if I enter the random values again, I get the same points for ramon.
Furthermore, the first time we start the API and enter a random value, it returns the error that is raised in the connect_to_endpoint function.
The code logic here is:
We enter a username via the API, and the get_tweet_ids function looks for that username in the StreamData table, selects all the tweet IDs corresponding to that username, and saves them to a list, ids. This list of IDs is given to the Twitter metrics API endpoint, and the required values from the response are saved to the table Metrics.
Then, sum_fun is called to select the sum of the values of likes, RTs, and replies from the Metrics table, multiply them by the specified points, and save the result to the table points along with the username.
Finally, the API returns the values in the table points matching the username.
How can I get it to stop returning values for random data? If invalid data is given, it should raise the exception in the connect_to_endpoint function, but instead it just returns whatever value was in the table points previously.
I tried multiple approaches to this, like clearing the values of points before all the other functions run, and checking to return only the values corresponding to the username in the points table. Neither of them worked. When the username was checked in the points table after running it with random values, it contained the random value but with the points of the previous valid username.
NOTE: The table points is a temporary table and values are assigned only when an API call is made.
I am a complete beginner to all this and this is more of a pet project I have been working on, so please help out. Any and all help and guidance regarding my logic and design, and a fix for this, will be of much use. Thanks.
If the code that you have provided for metrics.py is correct, your problem comes from how you declare the variable ids.
In your code you have declared it as a global, so it will not be reset on every function call or class instance creation.
What you should do is declare it inside get_tweet_ids():
class tweet_id:
    def __init__(self, name):
        self.name = name

    def get_tweet_ids(self, name):
        ids = []  # modification here
        try:
            connection = mysql.connector.connect(host='localhost',
                                                 database='twitterdb',
                                                 user='root',
                                                 password='password')
            cursor = connection.cursor()
            query = "truncate table twitterdb.points"
            query1 = "truncate table twitterdb.Metrics"
            sql_select_query = """SELECT tweetid FROM twitterdb.StreamData WHERE username = %s"""
            # set variable in query
            cursor.execute(query)
            cursor.execute(query1)
            cursor.execute(sql_select_query, (name,))
            # fetch result
            record = cursor.fetchall()
            for row in record:
                ids.append(int(row[0]))
            return ids  # modification here
        except mysql.connector.Error as error:
            print("Failed to get record from MySQL table: {}".format(error))
        finally:
            if connection.is_connected():
                cursor.close()
                connection.close()
With this you will have a new instance of ids on every get_tweet_ids call.
You will have to change the rest of your code to match this return statement; see the sketch below.
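For example (my own adaptation of the question's functions; passing ids around as an argument instead of reading the module-level list is the only change):

def create_url(ids):
    # Build the lookup URL from the ids returned by get_tweet_ids.
    tweet_fields = "tweet.fields=public_metrics"
    id_list = ",".join(str(element) for element in ids)
    return "https://api.twitter.com/2/tweets?ids={}&{}".format(id_list, tweet_fields)

def main(ids):
    # append_to_database / connect stay nested here exactly as in the question.
    url = create_url(ids)
    json_response = connect_to_endpoint(url)
    append_to_database(json_response)

# In the FastAPI handler:
# ids = tweets_list.get_tweet_ids(str(username))
# metrics.main(ids)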

Python script DB connection as Pool not working, but simple connection is working

I am writing a script in Python 3 that listens to a tunnel and saves and updates data in MySQL depending on the message received.
I ran into weird behavior: I made a simple connection to MySQL using the pymysql module and everything worked fine, but after some time this simple connection closes.
So I decided to implement a pool connection to MySQL, and here the problem arises. Something happens, no errors, but the issue is the following:
My cursor = yield self._pool.execute(query, list(filters.values()))
The cursor result is a tornado_mysql.pools.Pool object at 0x0000019DE5D71F98,
and it gets stuck like that, not doing anything more.
If I remove yield from the cursor line, it gets past that line and the next line throws an error:
response = yield c.fetchall()
AttributeError: 'Future' object has no attribute 'fetchall'
How can I fix the MySQL pool connection to work properly?
What I tried:
I used a few modules for the pool connection; all run into the same issue.
I went back to a simple connection with pymysql and it worked again.
Below is my code:
python script file
import pika
from model import SyncModel

_model = SyncModel(conf, _server_id)

@coroutine
def main():
    credentials = pika.PlainCredentials('user', 'password')
    try:
        cp = pika.ConnectionParameters(
            host='127.0.0.1',
            port=5671,
            credentials=credentials,
            ssl=False,
        )
        connection = pika.BlockingConnection(cp)
        channel = connection.channel()

        @coroutine
        def callback(ch, method, properties, body):
            if 'messageType' in properties.headers:
                message_type = properties.headers['messageType']
                if message_type in allowed_message_types:
                    result = ptoto_file._reflection.ParseMessage(descriptors[message_type], body)
                    if result:
                        result = protobuf_to_dict(result)
                        if message_type == 'MyMessage':
                            yield _model.message_event(data=result)
                else:
                    print('Message type not in allowed list = ' + str(message_type))
                    print('continue listening...')

        channel.basic_consume(callback, queue='queue', no_ack=True)
        print(' [*] Waiting for messages. To exit press CTRL+C')
        channel.start_consuming()
    except Exception as e:
        print('Could not connect to host 127.0.0.1 on port 5671')
        print(str(e))

if __name__ == '__main__':
    main()
SyncModel
from tornado_mysql import pools
from tornado.gen import coroutine, Return
from tornado_mysql.cursors import DictCursor

class SyncModel(object):
    def __init__(self, conf, server_id):
        self.conf = conf
        servers = [i for i in conf.mysql.servers]
        for s in servers:
            if s['server_id'] == server_id:
                # s holds all connection data: host, user, port, autocommit, charset, db, password
                s['cursorclass'] = DictCursor
                self._pool = pools.Pool(s, max_idle_connections=1, max_recycle_sec=3)

    @coroutine
    def message_event(self, data):
        table_name = 'table_name'
        query = ''
        data = data['message']
        filters = {
            'id': data['id']
        }
        # here the connection fails as described above
        response = yield self.query_select(table_name, self._pool, filters=filters)

    @coroutine
    def query_select(self, table_name, _pool, filters=None):
        if filters is None:
            filters = {}
        combined_filters = ['`%s` = %%s' % i for i in filters.keys()]
        where = 'WHERE ' + ' AND '.join(combined_filters) if combined_filters else ''
        query = """SELECT * FROM `%s` %s""" % (table_name, where)
        c = self._pool.execute(query, list(filters.values()))
        response = yield c.fetchall()
        raise Return({response})
All the code was working with just a simple connection to the database; after I started to use the pool, the example stopped working. I will appreciate any help with this issue.
This is a standalone script.
The pool connection was not working, so I switched back to pymysql, double-checking the connection.
I would like to post my answer that worked; only this solution worked for me.
Before querying MySQL, check whether the connection is still open, and if not, reconnect:
if not self.mysql.open:
    self.mysql.ping(reconnect=True)
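A minimal, self-contained sketch of that pattern with pymysql (the wrapper class and helper name here are illustrative, not from the answer above):

import pymysql

class Db:
    def __init__(self, **params):
        self.params = params
        self.mysql = pymysql.connect(**params)

    def cursor(self):
        # Connection.open is False once the server has dropped the connection;
        # ping(reconnect=True) transparently re-establishes it.
        if not self.mysql.open:
            self.mysql.ping(reconnect=True)
        return self.mysql.cursor()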

Pipeline doesn't write to MySQL but also gives no error

I've tried to implement this pipeline in my spider.
After installing the necessary dependencies, I am able to run the spider without any errors, but for some reason it doesn't write to my database.
I'm pretty sure there is something going wrong with connecting to the database. When I give it a wrong password, I still don't get any error.
When the spider has scraped all the data, it takes a few minutes before it starts dumping the stats.
2017-08-31 13:17:12 [scrapy] INFO: Closing spider (finished)
2017-08-31 13:17:12 [scrapy] INFO: Stored csv feed (27 items) in: test.csv
2017-08-31 13:24:46 [scrapy] INFO: Dumping Scrapy stats:
Pipeline:
import MySQLdb.cursors
from twisted.enterprise import adbapi
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
from scrapy.utils.project import get_project_settings
from scrapy import log

SETTINGS = {}
SETTINGS['DB_HOST'] = 'mysql.domain.com'
SETTINGS['DB_USER'] = 'username'
SETTINGS['DB_PASSWD'] = 'password'
SETTINGS['DB_PORT'] = 3306
SETTINGS['DB_DB'] = 'database_name'

class MySQLPipeline(object):

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.stats)

    def __init__(self, stats):
        print "init"
        # Instantiate DB
        self.dbpool = adbapi.ConnectionPool('MySQLdb',
                                            host=SETTINGS['DB_HOST'],
                                            user=SETTINGS['DB_USER'],
                                            passwd=SETTINGS['DB_PASSWD'],
                                            port=SETTINGS['DB_PORT'],
                                            db=SETTINGS['DB_DB'],
                                            charset='utf8',
                                            use_unicode=True,
                                            cursorclass=MySQLdb.cursors.DictCursor
                                            )
        self.stats = stats
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_closed(self, spider):
        print "close"
        """ Cleanup function, called after crawling has finished to close open
        objects.
        Close ConnectionPool. """
        self.dbpool.close()

    def process_item(self, item, spider):
        print "process"
        query = self.dbpool.runInteraction(self._insert_record, item)
        query.addErrback(self._handle_error)
        return item

    def _insert_record(self, tx, item):
        print "insert"
        result = tx.execute(
            " INSERT INTO matches(type,home,away,home_score,away_score) VALUES (soccer," + item["home"] + "," + item["away"] + "," + item["score"].explode("-")[0] + "," + item["score"].explode("-")[1] + ")"
        )
        if result > 0:
            self.stats.inc_value('database/items_added')

    def _handle_error(self, e):
        print "error"
        log.err(e)
Spider:
import scrapy
import dateparser
from crawling.items import KNVBItem

class KNVBspider(scrapy.Spider):
    name = "knvb"
    start_urls = [
        'http://www.knvb.nl/competities/eredivisie/uitslagen',
    ]
    custom_settings = {
        'ITEM_PIPELINES': {
            'crawling.pipelines.MySQLPipeline': 301,
        }
    }

    def parse(self, response):
        # www.knvb.nl/competities/eredivisie/uitslagen
        for row in response.xpath('//div[@class="table"]'):
            for div in row.xpath('./div[@class="row"]'):
                match = KNVBItem()
                match['home'] = div.xpath('./div[@class="value home"]/div[@class="team"]/text()').extract_first()
                match['away'] = div.xpath('./div[@class="value away"]/div[@class="team"]/text()').extract_first()
                match['score'] = div.xpath('./div[@class="value center"]/text()').extract_first()
                match['date'] = dateparser.parse(div.xpath('./preceding-sibling::div[@class="header"]/span/span/text()').extract_first(), languages=['nl']).strftime("%d-%m-%Y")
                yield match
If there are better pipelines available to do what I'm trying to achieve that'd be welcome as well. Thanks!
Update:
With the link provided in the accepted answer I eventually got to this function that's working (and thus solved my problem):
def process_item(self, item, spider):
    print "process"
    query = self.dbpool.runInteraction(self._insert_record, item)
    query.addErrback(self._handle_error)
    query.addBoth(lambda _: item)
    return query
Take a look at this for how to use adbapi with MySQL for saving scraped items. Note the difference between your process_item and their process_item method implementation. While you return the item immediately, they return the Deferred object that is the result of the runInteraction method, and which returns the item upon its completion. I think this is the reason your _insert_record never gets called.
If you can see the insert in your output that's already a good sign.
I'd rewrite the insert function this way:
def _insert_record(self, tx, item):
    print "insert"
    raw_sql = "INSERT INTO matches(type,home,away,home_score,away_score) VALUES ('%s', '%s', '%s', '%s', '%s')"
    sql = raw_sql % ('soccer', item['home'], item['away'], item['score'].explode('-')[0], item['score'].explode('-')[1])
    print sql
    result = tx.execute(sql)
    if result > 0:
        self.stats.inc_value('database/items_added')
It allows you to debug the SQL you're using. In your version you're not wrapping the string values in ' quotes, which is a syntax error in MySQL.
I'm not sure about your last values (score) so I treated them as strings.
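As a follow-up (not part of the original answer): tx.execute also accepts a parameters tuple, which sidesteps the quoting problem and SQL injection entirely. A sketch, assuming the score looks like "2-1" and using Python's str.split in place of the explode call from the question:

def _insert_record(self, tx, item):
    home_score, away_score = item['score'].split('-')  # assumption: score is e.g. "2-1"
    result = tx.execute(
        "INSERT INTO matches(type,home,away,home_score,away_score) VALUES (%s, %s, %s, %s, %s)",
        ('soccer', item['home'], item['away'], home_score, away_score)
    )
    if result > 0:
        self.stats.inc_value('database/items_added')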