New SQLAlchemy Add Record Error - sqlalchemy

Newbie to SQLAlchemy.
I'm having trouble adding a record. I modeled the add after the tutorial, which passes multiple values (albeit hard-coded values). Attached is the routine and the error.
StackOverflow thinks my 'explanation to code' ratio is off, so I'm adding additional explanation so I can submit my query.
import pdb
from table import wrl
from sqlalchemy import or_, and_, desc, asc
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
rs = create_engine('credentials', echo=True)
aws = create_engine('credentials', echo=True)
rs_session = sessionmaker(bind=rs)
aws_session = sessionmaker(bind=aws)
rs = rs_session()
aws = aws_session()
# pdb.set_trace()
y = rs.query(wrl).order_by(wrl.UUID_PK).first()
cat = y.Added_Timestamp  # now we have the oldest record's timestamp value
query_string = cat[:8] + "%"  # now we have the oldest record's date, i.e. substring(20111215_121212;1;8)
move_me = rs.query(wrl).filter(wrl.Added_Timestamp.like(query_string)).limit(10)
pdb.set_trace()
for x in move_me:
    # pdb.set_trace()
    wrl_rec = wrl(x.UUID_PK,
                  x.Web_Request_Headers,
                  x.Web_Request_Body,
                  x.Current_Machine,
                  x.Current_Machine,
                  x.ResponseBody,
                  x.Full_Log_Message,
                  x.Remote_Address,
                  x.basic_auth_username,
                  x.Request_Method,
                  x.Request_URI,
                  x.Request_Protocol,
                  x.Time_To_Process_Request,
                  x.User_ID,
                  x.Error,
                  x.Added_Timestamp,
                  x.Processing_Time_Milliseconds,
                  x.mysql_timestamp)
    aws.add(wrl_rec)
    aws.commit()
    print 'added %s ' % x.UUID_PK
Traceback (most recent call last):
  File "migrate.py", line 47, in <module>
    x.mysql_timestamp)
TypeError: __init__() takes exactly 1 argument (19 given)
Any suggestions appreciated.

The problem is not really SQLAlchemy-related. My conjecture is that your constructor (wrl.__init__(self, ...)) is either not defined or does not take any positional arguments, yet you are passing positional arguments when creating the object for wrl_rec.
So basically, the error message is pointing directly at your problem.
On a side note: does order_by(wrl.UUID_PK) really return the oldest record by the timestamp, as your comment a few lines below indicates? Somehow I highly doubt that.
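To illustrate the two common fixes, here is a minimal sketch, assuming wrl is a declarative model (the table name and the reduced column list below are illustrative, not from the question):
from sqlalchemy import Column, String
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class wrl(Base):
    __tablename__ = 'web_request_log'  # assumed table name
    UUID_PK = Column(String(36), primary_key=True)
    Added_Timestamp = Column(String(15))
    # An explicit constructor that accepts positional arguments,
    # which is what the loop in the question is trying to use.
    def __init__(self, uuid_pk, added_timestamp):
        self.UUID_PK = uuid_pk
        self.Added_Timestamp = added_timestamp
Alternatively, keep the default declarative constructor (which only accepts keyword arguments) and pass keywords instead:
wrl_rec = wrl(UUID_PK=x.UUID_PK, Added_Timestamp=x.Added_Timestamp)  # ...and so on for the remaining columns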

Related

Error message when importing .csv files into MySQL using Python

I am a novice when it comes to Python and I am trying to import a .csv file into an already existing MySQL table. I have tried it several different ways but I cannot get anything to work. Below is my latest attempt (not the best syntax I'm sure). I originally tried using ‘%s’ instead of ‘?’, but that did not work. Then I saw an example of the question mark but that clearly isn’t working either. What am I doing wrong?
import mysql.connector
import pandas as pd
db = mysql.connector.connect(**Login Info**)
mycursor = db.cursor()
df = pd.read_csv("CSV_Test_5.csv")
insert_data = (
    "INSERT INTO company_calculations.bs_import_test(ticker, date_updated, bs_section, yr_0, yr_1, yr_2, yr_3, yr_4, yr_5, yr_6, yr_7, yr_8, yr_9, yr_10, yr_11, yr_12, yr_13, yr_14, yr_15)"
    "VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
)
for row in df.itertuples():
    data_inputs = (row.ticker, row.date_updated, row.bs_section, row.yr_0, row.yr_1, row.yr_2, row.yr_3, row.yr_4, row.yr_5, row.yr_6, row.yr_7, row.yr_8, row.yr_9, row.yr_10, row.yr_11, row.yr_12, row.yr_13, row.yr_14, row.yr_15)
    mycursor.execute(insert_data, data_inputs)
db.commit()
Error Message:
> Traceback (most recent call last): File
> "C:\...\Python_Test\Excel_Test_v1.py",
> line 33, in <module>
> mycursor.execute(insert_data, data_inputs) File "C:\...\mysql\connector\cursor_cext.py",
> line 325, in execute
> raise ProgrammingError( mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement
MySQL Connector/Python supports named parameters (which also includes printf-style (format) parameters).
>>> import mysql.connector
>>> mysql.connector.paramstyle
'pyformat'
According to PEP-249 (DB API level 2.0) the definition of pyformat is:
pyformat: Python extended format codes, e.g. ...WHERE name=%(name)s
Example:
>>> cursor.execute("SELECT %s", ("foo", ))
>>> cursor.fetchall()
[('foo',)]
>>> cursor.execute("SELECT %(var)s", {"var" : "foo"})
>>> cursor.fetchall()
[('foo',)]
AFAIK the qmark paramstyle (using a question mark as the placeholder) is only supported by MariaDB Connector/Python.
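Applied to the code in the question, a minimal sketch of the fix is simply to switch the placeholders from ? to %s (column names and loop taken from the question, everything else unchanged):
insert_data = (
    "INSERT INTO company_calculations.bs_import_test"
    "(ticker, date_updated, bs_section, yr_0, yr_1, yr_2, yr_3, yr_4, yr_5, yr_6, yr_7,"
    " yr_8, yr_9, yr_10, yr_11, yr_12, yr_13, yr_14, yr_15) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
)
for row in df.itertuples():
    data_inputs = (row.ticker, row.date_updated, row.bs_section, row.yr_0, row.yr_1,
                   row.yr_2, row.yr_3, row.yr_4, row.yr_5, row.yr_6, row.yr_7, row.yr_8,
                   row.yr_9, row.yr_10, row.yr_11, row.yr_12, row.yr_13, row.yr_14, row.yr_15)
    mycursor.execute(insert_data, data_inputs)
db.commit()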

Iterating through describe_instances() to print key & value boto3

I am currently working on a python script to print pieces of information on running EC2 instances on AWS using Boto3. I am trying to print the InstanceID, InstanceType, and PublicIp. I looked through Boto3's documentation and example scripts so this is what I am using:
import boto3
ec2client = boto3.client('ec2')
response = ec2client.describe_instances()
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        instance_id = instance["InstanceId"]
        instance_type = instance["InstanceType"]
        instance_ip = instance["NetworkInterfaces"][0]["Association"]
        print(instance)
        print(instance_id)
        print(instance_type)
        print(instance_ip)
When I run this, "instance" prints one large block of JSON, plus my instance ID and type. But I have been getting an error since adding NetworkInterfaces.
instance_ip = instance["NetworkInterfaces"][0]["Association"]
returns:
Traceback (most recent call last):
  File "/Users/me/AWS/describeInstances.py", line 12, in <module>
    instance_ip = instance["NetworkInterfaces"][0]["Association"]
KeyError: 'Association'
What am I doing wrong while trying to print the PublicIp?
The structure of NetworkInterfaces, and the full response syntax for reference, can be found here (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.describe_instances).
Association may not always be present. Also, an instance may have more than one interface. So a working loop could be:
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        instance_id = instance["InstanceId"]
        instance_type = instance["InstanceType"]
        # print(instance)
        print(instance_id, instance_type)
        for network_interface in instance["NetworkInterfaces"]:
            instance_ip = network_interface.get("Association", "no-association")
            print(' -', instance_ip)
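Since Association is itself a dict, here is a small follow-on sketch to print only the public IP rather than the whole Association block (the PublicIp key comes from the describe_instances response syntax; the "no-public-ip" fallback is just illustrative):
for network_interface in instance["NetworkInterfaces"]:
    association = network_interface.get("Association", {})
    public_ip = association.get("PublicIp", "no-public-ip")
    print(' -', public_ip)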

too many values to unpack (expected 2) lda

I received the error "too many values to unpack (expected 2)" when running the code below. Can anyone help me? I have added more details.
import gensim
import gensim.corpora as corpora
dictionary = corpora.Dictionary(doc_clean)
doc_term_matrix = [dictionary.doc2bow(doc) for doc in doc_clean]
Lda = gensim.models.ldamodel.LdaModel
ldamodel = Lda(doc_term_matrix, num_topics=3, id2word=dictionary, passes=50, per_word_topics=True, eval_every=1)
print(ldamodel.print_topics(num_topics=3, num_words=20))
for i in range(0, 46):
    for index, score in sorted(ldamodel[doc_term_matrix[i]], key=lambda tup: -1*tup[1]):
        print("subject", i)
        print("\n")
        print("Score: {}\t \nTopic: {}".format(score, ldamodel.print_topic(index, 6)))
Focus on the loop, since this is where the error is being raised. Let's take it one iteration at a time.
>>> import numpy as np # just so we can use np.shape()
>>> i = 0 # value in first loop
>>> x = sorted( ldamodel[doc_term_matrix[i]], key=lambda tup: -1*tup[1] )
>>> np.shape(x)
(3, 3, 2)
>>> for index, score in x:
... pass
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 2)
Here is where your error is coming from. You are expecting each returned item to unpack into 2 values; however, the result is a nested structure with no simple, inferable way to unpack it into (index, score) pairs. I do not personally have enough experience with this subject material to infer what you mean to be doing; I can only show you where the problem is coming from. Hope this helps!
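One likely culprit, offered as an assumption rather than a certainty: with per_word_topics=True, indexing the model returns more than just the per-document topic distribution. A minimal sketch of a loop that asks only for the topic distribution, via gensim's get_document_topics, would be:
# Sketch only: get_document_topics returns a flat list of (topic_id, probability)
# pairs, which unpacks cleanly into (index, score).
for i, bow in enumerate(doc_term_matrix):
    topics = ldamodel.get_document_topics(bow)
    for index, score in sorted(topics, key=lambda tup: -1 * tup[1]):
        print("subject", i)
        print("Score: {}\t \nTopic: {}".format(score, ldamodel.print_topic(index, 6)))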

How to implement a context manager for MySql either with MySQL connector or pyMySql

Trying to implement a context manager for a MySQL connection, I get an error message.
I have used the mysql.connector module (including its connection options) to connect to a database, and also the PyMySQL module, always getting the same result.
import pymysql
# (QDialog is assumed to be imported from the GUI toolkit in use, e.g. PyQt.)
class MySQLConnector:
    def __init__(self, con_dict):
        self.cnx = None
        self.con_dict = con_dict
    def __enter__(self):
        self.cnx = pymysql.connect(**self.con_dict)
        return self.cnx
    def __exit__(self):
        self.cnx.close()
class ReceiveBroke(QDialog):
    def __init__(self, db, config):
        super().__init__()
        with MySQLConnector(config) as self.cnx:
            cur = self.cnx.cursor()
            qry = "SELECT * FROM horses"
            cur.execute(qry)
            result = cur.fetchall()
            print(result)
        self.setUI()
        self.conTest()
    def conTest(self):
        if self.cnx.open:
            print("y")
I hope to obtain a working context manager that closes the database connection after finishing the with block.
Result: an error message, always after executing the last line within the with block: "TypeError: __exit__ takes 1 positional argument but 4 were given", at which point the program crashes.
Hope you have found the error already. If not, I think the error message is quite clear:
TypeError: __exit__ takes 1 positional argument but 4 were given
The __exit__ method needs 4 arguments: self, type, value, traceback. The last 3 relate to any exception that may happen inside the with block.
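A minimal sketch of the corrected manager (the same class as in the question, just with the full __exit__ signature):
import pymysql

class MySQLConnector:
    def __init__(self, con_dict):
        self.cnx = None
        self.con_dict = con_dict
    def __enter__(self):
        self.cnx = pymysql.connect(**self.con_dict)
        return self.cnx
    # Accept the exception details that Python passes when the block exits.
    def __exit__(self, exc_type, exc_value, traceback):
        self.cnx.close()
        return False  # do not suppress any exception raised inside the block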

Count number of users per window using PySpark

I'm using Kafka to stream a JSON file, sending each line as a message. One of the keys is the user's email.
Then I use PySpark to count the number of unique users per window, using their email to identify them. The command
def print_users_count(count):
    print 'The number of unique users is:', count
print_users_count((lambda message: message['email']).distinct().count())
Gives me the error below. How can I fix this?
AttributeError Traceback (most recent call last)
<ipython-input-19-311ba744b41f> in <module>()
2 print 'The number of unique users is:', count
3
----> 4 print_users_count((lambda message: message['email']).distinct().count())
AttributeError: 'function' object has no attribute 'distinct'
Here is my PySpark code:
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
import json
try:
    sc.stop()
except:
    pass
sc = SparkContext(appName="KafkaStreaming")
sc.setLogLevel("WARN")
ssc = StreamingContext(sc, 60)
# Define the PySpark consumer.
kafkaStream = KafkaUtils.createStream(ssc, bootstrap_servers, 'spark-streaming2', {topicName: 1})
# Parse the incoming data as JSON.
parsed = kafkaStream.map(lambda v: json.loads(v[1]))
# Count the number of messages per batch.
parsed.count().map(lambda x: 'Messages in this batch: %s' % x).pprint()
You're not applying the lambda function to anything. What is message referencing? Right now the lambda function is just that, a function. That is why you're getting AttributeError: 'function' object has no attribute 'distinct'. It is not being applied to any data, so it is not returning any data. You need to reference the dataframe that the email key is in.
See the PySpark docs for pyspark.sql.functions.countDistinct(col, *cols) and pyspark.sql.functions.approx_count_distinct. This should be a simpler way to get a unique count.
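For the streaming code in the question, here is a minimal sketch of one way to count distinct emails per window. The parsed DStream, the 'email' field, and the 60-second batch interval come from the question; the window sizes and output wording are illustrative:
# Sketch only: apply the lambda to the parsed DStream, then count distinct emails per window.
emails = parsed.map(lambda message: message['email'])
# A 60-second window that slides every 60 seconds (one count per window).
unique_users = emails.window(60, 60).transform(lambda rdd: rdd.distinct()).count()
unique_users.map(lambda count: 'The number of unique users is: %s' % count).pprint()
ssc.start()
ssc.awaitTermination()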