I am a novice when it comes to Python and I am trying to import a .csv file into an already existing MySQL table. I have tried it several different ways but I cannot get anything to work. Below is my latest attempt (not the best syntax I'm sure). I originally tried using ‘%s’ instead of ‘?’, but that did not work. Then I saw an example of the question mark but that clearly isn’t working either. What am I doing wrong?
import mysql.connector
import pandas as pd
db = mysql.connector.connect(**Login Info**)
mycursor = db.cursor()
df = pd.read_csv("CSV_Test_5.csv")
insert_data = (
"INSERT INTO company_calculations.bs_import_test(ticker, date_updated, bs_section, yr_0, yr_1, yr_2, yr_3, yr_4, yr_5, yr_6, yr_7, yr_8, yr_9, yr_10, yr_11, yr_12, yr_13, yr_14, yr_15)"
"VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
)
for row in df.itertuples():
    data_inputs = (row.ticker, row.date_updated, row.bs_section, row.yr_0, row.yr_1, row.yr_2, row.yr_3, row.yr_4, row.yr_5, row.yr_6, row.yr_7, row.yr_8, row.yr_9, row.yr_10, row.yr_11, row.yr_12, row.yr_13, row.yr_14, row.yr_15)
    mycursor.execute(insert_data, data_inputs)
db.commit()
Error Message:
Traceback (most recent call last):
  File "C:\...\Python_Test\Excel_Test_v1.py", line 33, in <module>
    mycursor.execute(insert_data, data_inputs)
  File "C:\...\mysql\connector\cursor_cext.py", line 325, in execute
    raise ProgrammingError(
mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement
MySQL Connector/Python supports named parameters (which also includes printf-style format parameters).
>>> import mysql.connector
>>> mysql.connector.paramstyle
'pyformat'
According to PEP-249 (DB API level 2.0) the definition of pyformat is:
pyformat: Python extended format codes, e.g. ...WHERE name=%(name)s
Example:
>>> cursor.execute("SELECT %s", ("foo", ))
>>> cursor.fetchall()
[('foo',)]
>>> cursor.execute("SELECT %(var)s", {"var" : "foo"})
>>> cursor.fetchall()
[('foo',)]
As far as I know, the qmark paramstyle (using a question mark as the placeholder) is only supported by MariaDB Connector/Python.
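So your INSERT needs %s (format) placeholders rather than question marks. A minimal sketch of the corrected loop, assuming the CSV columns match the table columns (login_info stands in for your own connection details, and executemany simply replaces the per-row execute):
import mysql.connector
import pandas as pd

db = mysql.connector.connect(**login_info)  # your connection details here
mycursor = db.cursor()

df = pd.read_csv("CSV_Test_5.csv")

insert_data = (
    "INSERT INTO company_calculations.bs_import_test"
    "(ticker, date_updated, bs_section, yr_0, yr_1, yr_2, yr_3, yr_4, yr_5, yr_6, yr_7,"
    " yr_8, yr_9, yr_10, yr_11, yr_12, yr_13, yr_14, yr_15) "
    "VALUES (" + ", ".join(["%s"] * 19) + ")"
)

# itertuples(index=False, name=None) yields plain tuples in column order;
# executemany sends every row in one call
mycursor.executemany(insert_data, list(df.itertuples(index=False, name=None)))
db.commit()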
Related
I am trying to fetch the list of SQL queries running for more than 3600 seconds and kill those IDs using Python. Below is the code:
import json
import mysql.connector
import pymysql
def main():
    # TODO implement
    connection = pymysql.connect(user='', password='',
                                 host='',
                                 port=3306,
                                 database='');
    cursor = connection.cursor()  # get the cursor
    # cursor.execute('SHOW PROCESSLIST;')
    # extracted_data = cursor.fetchall();
    # for i in extracted_data:
    #     print(i)
    with connection.cursor() as cursor:
        print(cursor.execute('SHOW PROCESSLIST'))
        for item in cursor.fetchall():
            if item.get('Time') > 3600 and item.get('command') == 'query':
                _id = item.get('Id')
                print('kill %s' % item)
                cursor.execute('kill %s', _id)
    connection.close()

main()
Below is the error I am getting:
"C:\drive c\pyfile\venv\Scripts\python.exe" "C:/drive c/pyfile/sqlnew2.py"
Traceback (most recent call last):
File "C:\drive c\pyfile\sqlnew2.py", line 23, in <module>
main()
File "C:\drive c\pyfile\sqlnew2.py", line 18, in main
if item.get('Time') > 3600 and item.get('command') == 'query':
AttributeError: 'tuple' object has no attribute 'get'
The .fetchall() method returns the rows as tuples, not dictionaries, so you should access the elements by numerical index, for example item[0], item[1], etc.
As an alternative, if you want to fetch the results as dictionaries, you can use a DictCursor.
First import it:
import pymysql.cursors
Then modify the cursor line like that:
with connection.cursor(pymysql.cursors.DictCursor) as cursor:
...
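A minimal sketch of the whole loop with DictCursor (fill in your own connection details; note that the SHOW PROCESSLIST column names are capitalised 'Time', 'Command' and 'Id', and the command value for a running query is 'Query'):
import pymysql
import pymysql.cursors

connection = pymysql.connect(user='', password='', host='', port=3306, database='')

with connection.cursor(pymysql.cursors.DictCursor) as cursor:
    cursor.execute('SHOW PROCESSLIST')
    for item in cursor.fetchall():
        # each row is now a dict keyed by column name
        if item.get('Time', 0) > 3600 and item.get('Command') == 'Query':
            print('killing %s' % item['Id'])
            cursor.execute('KILL %s', (item['Id'],))
connection.close()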
I am currently working on a python script to print pieces of information on running EC2 instances on AWS using Boto3. I am trying to print the InstanceID, InstanceType, and PublicIp. I looked through Boto3's documentation and example scripts so this is what I am using:
import boto3
ec2client = boto3.client('ec2')
response = ec2client.describe_instances()
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        instance_id = instance["InstanceId"]
        instance_type = instance["InstanceType"]
        instance_ip = instance["NetworkInterfaces"][0]["Association"]
        print(instance)
        print(instance_id)
        print(instance_type)
        print(instance_ip)
When I run this, "instance" prints one large block of JSON along with my instance ID and type, but I have been getting an error since adding NetworkInterfaces.
instance_ip = instance["NetworkInterfaces"][0]["Association"]
returns:
Traceback (most recent call last):
File "/Users/me/AWS/describeInstances.py", line 12, in <module>
instance_ip = instance["NetworkInterfaces"][0]["Association"]
KeyError: 'Association'
What am I doing wrong while trying to print the PublicIp?
The structure of NetworkInterfaces and the full response syntax can be found in the boto3 documentation for describe_instances (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.describe_instances).
Association may not always be present. Also, an instance may have more than one interface. So your working loop could be:
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        instance_id = instance["InstanceId"]
        instance_type = instance["InstanceType"]
        # print(instance)
        print(instance_id, instance_type)
        for network_interface in instance["NetworkInterfaces"]:
            instance_ip = network_interface.get("Association", "no-association")
            print(' -', instance_ip)
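If you want just the public IP rather than the whole Association dict, a small variation on the inner loop (assuming the "PublicIp" key as shown in the describe_instances response syntax linked above):
        for network_interface in instance["NetworkInterfaces"]:
            association = network_interface.get("Association")
            # Association (and its PublicIp) is absent when the interface has no public IP
            instance_ip = association.get("PublicIp", "no-public-ip") if association else "no-public-ip"
            print(' -', instance_ip)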
I'm getting the following errors when trying to decode this data; the second error appeared after trying to compensate for the unicode error:
Error 1:
write.writerows(subjects)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 160: ordinal not in range(128)
Error 2:
with open("data.csv", encode="utf-8", "w",) as writeFile:
SyntaxError: non-keyword arg after keyword arg
Code
import requests
import json
import csv
from bs4 import BeautifulSoup
import urllib
r = urllib.urlopen('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1')
data = json.loads(r.read().decode('utf-8'))
subjects = []
for post in data['posts']:
    subjects.append([post['title'], post['episodeNumber'],
                     post['audioSource'], post['image']['large'], post['excerpt']['long']])
with open("data.csv", encode="utf-8", "w",) as writeFile:
    write = csv.writer(writeFile)
    write.writerows(subjects)
Using requests, and with the correction to the second part (as below), I have no problem running it. I think your first problem is a consequence of the second error (of that open call being incorrect).
I am on Python 3 and can run yours with my fix to the open line and with
r = urllib.request.urlopen('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1')
I personally would use requests.
import requests
import csv
data = requests.get('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1').json()
subjects = []
for post in data['posts']:
    subjects.append([post['title'], post['episodeNumber'],
                     post['audioSource'], post['image']['large'], post['excerpt']['long']])
with open("data.csv", encoding="utf-8", mode="w") as writeFile:
    write = csv.writer(writeFile)
    write.writerows(subjects)
For your second error: looking at the documentation for the open function, you need to use the correct argument names, and a positional argument cannot follow a keyword argument, so name the mode argument (or pass "w" before the keyword arguments):
with open("data.csv", encoding ="utf-8", mode = "w") as writeFile:
I have been trying to use nltk.pos_tag in my code but I face an error when I do so. I have already downloaded the Penn Treebank and max_ent_treebank_pos resources, but the error persists. Here is my code:
import nltk
from nltk import tag
from nltk import *
a = "Alan Shearer is the first player to score over a hundred Premier League goals."
a_sentences = nltk.sent_tokenize(a)
a_words = [nltk.word_tokenize(sentence) for sentence in a_sentences]
a_pos = [nltk.pos_tag(sentence) for sentence in a_words]
print(a_pos)
and this is the error I get:
"Traceback (most recent call last):
File "<pyshell#9>", line 1, in <module>
print (nltk.pos_tag(text))
File "C:\Python34\lib\site-packages\nltk\tag\__init__.py", line 110, in pos_tag
tagger = PerceptronTagger()
File "C:\Python34\lib\site-packages\nltk\tag\perceptron.py", line 140, in __init__
AP_MODEL_LOC = 'file:'+str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
File "C:\Python34\lib\site-packages\nltk\data.py", line 641, in find
raise LookupError(resource_not_found)
LookupError:
Resource 'taggers/averaged_perceptron_tagger/averaged_perceptron
_tagger.pickle' not found. Please use the NLTK Downloader to
obtain the resource: >>> nltk.download()
Searched in:
- 'C:\\Users\\T01142/nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
- 'C:\\Python34\\nltk_data'
- 'C:\\Python34\\lib\\nltk_data'
- 'C:\\Users\\T01142\\AppData\\Roaming\\nltk_data'
Call this from python:
nltk.download('averaged_perceptron_tagger')
I had the same problem in a Flask server. nltk used a different path when running under the server config, so I resorted to adding nltk.data.path.append("/home/yourusername/whateverpath/") in the server code right before the pos_tag call.
Note that there is some duplication of this question:
How to config nltk data directory from code?
nltk doesn't add $NLTK_DATA to search path?
POS tagging with NLTK. Can't locate averaged_perceptron_tagger
To resolve this error, run the following commands at the Python prompt:
import nltk
nltk.download('averaged_perceptron_tagger')
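Putting the fix together with the code from the question, a minimal sketch (the download only has to succeed once; sent_tokenize and word_tokenize additionally need the 'punkt' resource, which the question's code apparently already has):
import nltk

nltk.download('averaged_perceptron_tagger')  # fetches the missing tagger model into nltk_data

a = "Alan Shearer is the first player to score over a hundred Premier League goals."
a_sentences = nltk.sent_tokenize(a)
a_words = [nltk.word_tokenize(sentence) for sentence in a_sentences]
a_pos = [nltk.pos_tag(sentence) for sentence in a_words]
print(a_pos)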
Newbie to SQLAlchemy.
I'm having trouble adding a record. I modeled the add after the tutorial which passes multiple values (albeit hard coded values.) Attached is the routine and the error.
import pdb
from table import wrl
from sqlalchemy import or_, and_, desc, asc
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
rs = create_engine('credentials', echo=True)
aws = create_engine('credentials', echo=True)
rs_session = sessionmaker(bind=rs)
aws_session = sessionmaker(bind=aws)
rs = rs_session()
aws = aws_session()
# pdb.set_trace()
y = rs.query(wrl).order_by(wrl.UUID_PK).first()
cat = y.Added_Timestamp #now we have the oldest record time stamp value
query_string = cat[:8]+"%" #now we have the oldest record's date i.e. substring(20111215_121212;1;8)
move_me = rs.query(wrl).filter(wrl.Added_Timestamp.like(query_string)).limit(10)
pdb.set_trace()
for x in move_me:
    # pdb.set_trace()
    wrl_rec = wrl(x.UUID_PK,
                  x.Web_Request_Headers,
                  x.Web_Request_Body,
                  x.Current_Machine,
                  x.Current_Machine,
                  x.ResponseBody,
                  x.Full_Log_Message,
                  x.Remote_Address,
                  x.basic_auth_username,
                  x.Request_Method,
                  x.Request_URI,
                  x.Request_Protocol,
                  x.Time_To_Process_Request,
                  x.User_ID,
                  x.Error,
                  x.Added_Timestamp,
                  x.Processing_Time_Milliseconds,
                  x.mysql_timestamp)
    aws.add(wrl_rec)
    aws.commit()
    print 'added %s ' % x.UUID_PK
Traceback (most recent call last):
File "migrate.py", line 47, in <module>
x.mysql_timestamp)
TypeError: __init__() takes exactly 1 argument (19 given)
Any suggestions appreciated.
The problem is not really SQLAlchemy related. My conjecture is that your constructor (wrl.__init__(self, ...)) is either not defined or does not accept positional arguments, which is what you are trying to pass when creating this object in wrl_rec.
So basically, the error message is pretty much indicating your problem.
On a side note, does order_by(wrl.UUID_PK) really return the oldest record by the timestamp, as your comment a few lines below indicates? Somehow I highly doubt that.
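If wrl is a declarative model without a custom __init__, the default constructor only accepts keyword arguments for the mapped columns, so a sketch of the loop could look like this (assuming the column names match the attribute names used above):
for x in move_me:
    wrl_rec = wrl(UUID_PK=x.UUID_PK,
                  Web_Request_Headers=x.Web_Request_Headers,
                  Web_Request_Body=x.Web_Request_Body,
                  # ... remaining columns passed the same way ...
                  Added_Timestamp=x.Added_Timestamp,
                  mysql_timestamp=x.mysql_timestamp)
    aws.add(wrl_rec)
aws.commit()  # one commit at the end; the per-row commit also works, just slower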