I have a bunch of files in a directory named nameid_cityid.txt, where nameid and cityid are the ids of name (integer(10)) and city (integer(10)) in the mydata table.
While the following solution works, I am doing type conversions because fetchall returns long integers (the 'L' suffix) while the nameid and cityid parsed from the file names are strings.
If you can suggest a more Pythonic or elegant way of doing the same, that would be awesome, for me and the community!
What I am trying to achieve :
Find those files from a directory that don't have a record in the database and then do something with that file, like parse/move/delete it.
MySQL table mydata:

nameid   cityid
15633    45632
2354     76894
Python:

for pdffile in os.listdir(filepath):
    cityid, nameid = pdffile.strip('.txt').split('_')[0], pdffile.strip('.txt').split('_')[1]
    cursor.execute("select cityid, nameid from mydata")
    alreadyparsed = cursor.fetchall()
    targetvalues = ((str(cityid), str(nameid)) for cityid, nameid in alreadyparsed)
    if (int(cityid), int(nameid)) in alreadyparsed:
        print cityid, nameid, "Found"
    else:
        print cityid, nameid, "Not found"
I'd use a set for quick and easy testing:
cursor.execute("select CONCAT(nameid, '_', cityid, '.txt') from mydata")
present = set([r[0] for r in cursor])
for pdffile in os.listdir(filepath):
    nameid, cityid = map(int, pdffile.rsplit('.', 1)[0].split('_'))
    print nameid, cityid,
    print "Found" if pdffile in present else "Not found"
First, I've pulled the query outside of the filename loop; no point in querying the same set of rows each time.
Secondly, I'll let MySQL generate filenames for me using CONCAT for ease of collecting the information into a set.
Thirdly, because we now have a set of filenames, testing each individual filename against the set is a simple pdffile in present test.
And finally, I've simplified your filename splitting logic to one line.
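For example, for a made-up filename built from the first row of your table, the one-liner pulls out both ids at once:

pdffile = '15633_45632.txt'   # hypothetical example built from the first row above
nameid, cityid = map(int, pdffile.rsplit('.', 1)[0].split('_'))
print nameid, cityid          # 15633 45632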
Now, if all you want is a set of filenames that are not present yet in the database (rather than enumerate which ones are and which ones are not), just use a set operation:
cursor.execute("select CONCAT(nameid, '_', cityid, '.txt') from mydata")
present = set([r[0] for r in cursor])
for pdffile in (set(os.listdir(filepath)) - present):
nameid, cityid = map(int, pdffile.rsplit('.', 1)[0].split('_'))
print nameid, cityid, "Found"
Here we use the .difference operation (with the - operator) to remove all the filenames for which there are already rows in the database, in one simple operation.
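As a quick illustration, using your two sample rows plus one made-up file that has no row yet:

listed  = set(['15633_45632.txt', '2354_76894.txt', '9999_1111.txt'])   # as if from os.listdir()
present = set(['15633_45632.txt', '2354_76894.txt'])                    # as if from the query
print listed - present   # set(['9999_1111.txt']) -- only the files without a database row remain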
You could perform the concatenation in SQL, which will return a string:
SELECT CONCAT(nameid, '_', cityid, '.txt') FROM mydata
Related
Insert into D.d
Select * from A.a join B.b on A.a.a1=B.b.b1
Join C.c on C.c.c1=B.b.b1
I have complex statements from which I need to extract the source and target names (in the statement above the source DBs are A, B, C, the source tables are a, b, c, the target DB is D and the target table is d).
I need output like:

SourceDB  SourceTbl  TargetDB  TargetTbl
A,B,C     a,b,c      D         d

Getting the values in JSON format for each field would also work. This needs to accommodate UPDATE and DELETE statements as well. Please assist.
Thanks
You can use sqlparse to parse the statement. The code below is not written optimally or efficiently, but it contains the logic needed to get the information:
import sqlparse

raw = 'Insert into D.d ' \
      'Select * from A.a join B.b on ' \
      'A.a.a1=B.b.b1 Join C.c on C.c.c1=B.b.b1;'

parsed = sqlparse.parse(raw)[0]

tgt_switch = "N"   # flipped to "Y" once the INTO keyword has been seen
src_switch = "N"   # flipped to "Y" once a FROM or JOIN keyword has been seen
src_table = []
tgt_table = ""

for items in parsed.tokens:
    # print(items, items.ttype)
    if str(items) == "into":
        tgt_switch = "Y"
    if tgt_switch == "Y" and items.ttype is None:
        # grouped tokens (ttype None) after INTO hold the target table identifier, e.g. D.d
        tgt_switch = "N"
        tgt_table = items
    if str(items).lower() == "from" or str(items).lower() == "join":
        src_switch = "Y"
    if src_switch == "Y" and items.ttype is None:
        # grouped tokens after FROM/JOIN hold the source table identifiers
        src_switch = "N"
        src_table.append(str(items))

target_db = str(tgt_table).split(".")[0]
target_tbl = str(tgt_table).split(".")[1]
print("Target DB is {} and Target table is {}".format(target_db, target_tbl))

for obj in src_table:
    src_db = str(obj).split(".")[0]
    src_tbl = str(obj).split(".")[1]
    print("Source DB is {} and Source table is {}".format(src_db, src_tbl))
Snowflake does not offer any SQL statement parsing support. You can hack at it with regexes, of course, or use any of the tools on the market.
If this query ran, and ran successfully, you can use the ACCESS_HISTORY view (https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html) to see which tables (A.a, B.b, C.c, D.d) and columns (A.a.a1, B.b.b1, C.c.c1, D.d.d1) it accessed and how (read or write).
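As a rough sketch of what that lookup might look like from Python, assuming cur is a Snowflake connector cursor (the column names are taken from the ACCESS_HISTORY documentation linked above, so verify them against your account):

cur.execute("""
    SELECT query_id, direct_objects_accessed, objects_modified
    FROM snowflake.account_usage.access_history
    WHERE query_start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
""")
for query_id, accessed, modified in cur:
    # accessed/modified are JSON arrays describing the tables and columns touched
    print(query_id, accessed, modified)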
I am using Python with SQL to edit a very simple table named students (whose columns are name and age), as shown below:
('Rachel', 22)
('Linckle', 33)
('Bob', 45)
('Amanda', 25)
('Jacob', 85)
('Avi', 65)
('Michelle', 45)
I am defining python functions to execute SQL code.
In my first function I want to update the age values in students table where the name matches something (e.g. Bob). If I define the following function:
def update_age(age, name):
    c.execute("""UPDATE students SET age = %s
                 WHERE name = %s""", (age, name))
And then:
update_age(99, 'Bob')
I will get:
('Rachel', 22)
('Linckle', 33)
('Bob', 99)
('Amanda', 25)
('Jacob', 85)
('Avi', 65)
('Michelle', 45)
In a second function, I would also like to specify the name of the table, with the following code:
def update_age_table(table, age, name):
    c.execute("""UPDATE %s SET age = %s
                 WHERE name = %s""",
              (table, age, name))  # note that here I am also replacing 'students' with the placeholder %s
Then if I do:
update_age_table(table='students', age=95, name='Jacob')
I will get the following error message (it is long, I am only displaying the last sentence):
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''students' SET age = 95
WHERE name = 'Jacob'' at line 1
I guess the error comes from the fact that I am assigning two of the placeholders to variables, namely age and name, which is not the case for the table name, where there is no variable assignment.
Does anyone know how I can use placeholders in SQL commands without assigning them to variables?
That's because you cannot pass the table name as a parameter to execute(). You should do it this way:
def update_age_table(table, age, name):
    c.execute("UPDATE " + table + " SET age = %s"
              " WHERE name = %s",
              (table, age, name))
The prepared statement doesn't work for table names
EDIT
You have to remove the table parameter like this:
def update_age_table(table, age, name):
    c.execute("UPDATE " + table + " SET age = %s WHERE name = %s", (age, name))
Sorry, that was a mistake.
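As a hedged add-on (not from the original answer): since identifiers cannot be bound as parameters, one common precaution is to check the table name against a known allow-list before splicing it into the SQL text, while still binding the values. A minimal sketch, assuming the same cursor c:

ALLOWED_TABLES = {"students"}   # hypothetical allow-list; extend with your real table names

def update_age_table(table, age, name):
    if table not in ALLOWED_TABLES:
        raise ValueError("unknown table: %r" % table)
    # the identifier is interpolated, the values are still bound as parameters
    c.execute("UPDATE " + table + " SET age = %s WHERE name = %s", (age, name))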
dt = datetime.datetime.now()
new_date = str(dt)
idname = input("Please enter Your Id. ")
bname = input("Please Enter name of book which you want to Issue: ")
idn = (idname,)
sql = "insert into id%s (issuedbook,date)" % idn + " values (%s,%s)"
val = (bname, new_date)
cursor.execute(sql, val)
cnx.commit()
insert_data()
Without having tested it, this should be a better coding style than the accepted answer. As the whole Q&A shows, the values are passed only at cursor.execute() time to make the query more secure, but the table name in the execute() string has to be resolved before the arguments are evaluated; that is why tables have to be plain text before execute() while the values do not. See another example with a similar challenge at "Python - pass a list as params to SQL, plus more variables", where the table is not passed as a parameter either.
Therefore, just as an add-on to the rightly accepted answer:
def update_age_table(UPDATE_QUERY, args):
    c.execute(UPDATE_QUERY, args)
    c.commit()   # note: if c is a cursor, commit on the connection object instead

# example for string testing:
table, age, name = "table_x", 2, "name_y"

UPDATE_QUERY = f"""
    UPDATE {table}
    SET age = %s
    WHERE name = %s
    """
# UPDATE_QUERY Out:
# '\n    UPDATE table_x\n    SET age = %s\n    WHERE name = %s\n    '

args = [age, name]
update_age_table(UPDATE_QUERY, args)
I'm trying to import from a CSV where some lines have an account number and some don't. Where accounts do have numbers I'd like to merge using them: there will be records where the name on an account has changed but the number will always stay the same. For the other records without an account number the best I can do is merge on the account name.
So really I need some kind of conditional: if a line has a account number, merge on that, else merge on account name. Something like...
LOAD CSV WITH HEADERS FROM 'file:///testfile.csv' AS line
MERGE (x:Thing {
CASE line.accountNumber WHEN NULL
THEN name: line.accountName
ELSE number: line.accountNumber
END
})
ON CREATE SET
x.name = line.accountName,
x.number = line.accountNumber
Though of course that doesn't work. Any ideas?
To test for a 'NULL' value in a .csv file in LOAD CSV, you have to test for an empty string.
testfile.csv
acct_name,acct_num
John,1
Stacey,2
Alice,
Bob,4
This assumes the account names are unique...
LOAD CSV WITH HEADERS FROM 'file:///testfile.csv' AS line
// If acct_num is not null, merge on account number and set name if node is created instead of found.
FOREACH(number IN (CASE WHEN line.acct_num <> "" THEN [TOINT(line.acct_num)] ELSE [] END) |
    MERGE (x:Thing {number: number})
    ON CREATE SET x.name = line.acct_name
)
// If acct_num is null, merge on account name. This node will not have an account number if it is created instead of matched.
FOREACH(name IN (CASE WHEN line.acct_num = "" THEN [line.acct_name] ELSE [] END) |
    MERGE (x:Thing {name: name})
)
I have a table with a machine id column whose values look like (311a__) and (311bb__), and some of them like (08576). How can I retrieve them in SQL, and how do I insert a row where the machine ID is like (311a__)? My question is how to insert into and select from a column whose name has spaces in it. And is machine_name = "%s__" correct for retrieving the data?
sql_local = """SELECT id FROM customer_1.pay_machines WHERE machine_name="%s" """ % machine
retVal = cursor.execute(sql_local)
if (retVal == 0):
    sql_local = """INSERT INTO customer_1.pay_machines (machine_name, carpark_id) VALUES ("%s", 0)""" % machine
Surround odd (or reserved word) column names with backticks:
SELECT *
FROM pd
WHERE `machine id` = '(%s__)';
edit: removed invalid insert query as the first query is sufficient as an example
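On the retrieval part of the question, a hedged sketch (assuming the same cursor and a MySQLdb-style driver with %s placeholders): bind the pattern as a query parameter and match with LIKE, escaping the underscores when they are meant literally, since an unescaped _ is a single-character wildcard in LIKE.

pattern = "311a\\_\\_"   # literal '311a__'; an unescaped _ would match any single character
cursor.execute(
    "SELECT id FROM customer_1.pay_machines WHERE machine_name LIKE %s",
    (pattern,),
)
rows = cursor.fetchall()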
Two questions:
So I have a list of filenames, each of which I would like to feed into a MySQL query.
The first question is: how do I loop through the file list and pass the elements (the filenames) as a variable to MySQL?
The second question is: how do I print the results in a more elegant way, without the parentheses and L's from the tuple output that is returned? The way I have it below works for three columns, but I'd like a flexible way so that I don't have to add sublists (cleaned1, 2, ...) when I fetch more columns.
Any help highly appreciated!!!
MyConnection = MySQLdb.connect(host="localhost", user="root",
                               passwd="xxxx", db="xxxx")
MyCursor = MyConnection.cursor()

# MyList = (File1, File2, File3, File...., File36)
# for i in MyList:
#     do MySQL query

SQL = """SELECT a.column1, a.column2, b.column2 FROM **i in MyList** a, table2 b WHERE
         a.column1=b.column1;"""
SQLLen = MyCursor.execute(SQL)  # returns the number of records retrieved
AllOut = MyCursor.fetchall()

List = list(AllOut)                # this puts all the tuple information into a list
cleaned = [i[0] for i in List]     # this cleans up the tuple characters
cleaned1 = [i[1] for i in List]    # this cleans up the tuple characters
cleaned2 = [i[2] for i in List]    # this cleans up the tuple characters
NewList = zip(cleaned, cleaned1, cleaned2)   # this makes a new list
print NewList[0:10]

# Close the cursor and connection
MyCursor.close()
MyConnection.close()
I can figure out the saving to file, but I don't know how to pass a Python variable into MySQL.
Convert the tuple to a list first:

MyList = list(MyList)

and then you will have two options.
Try this:

for tablename in MyList:
    c.execute("SELECT a.column1, a.column2, b.column2 FROM %s a, table2 b WHERE a.column1=b.column1", (tablename,))
or:

for tablename in MyList:
    SQL = "SELECT a.column1, a.column2, b.column2 FROM tablevar a, table2 b WHERE a.column1=b.column1"
    SQL = SQL.replace('tablevar', tablename)
    c.execute(SQL)
To print the results without the brackets you can use:

for tablename in MyList:
    print tablename
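If the goal is to print the fetched rows themselves without the tuple parentheses and the long-integer L suffixes, a small sketch along the same lines (assuming the rows have just been fetched) is to format each row yourself:

for row in MyCursor.fetchall():
    print "\t".join(str(value) for value in row)   # tab-separated values, works for any number of columns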