UPDATE in mysql using python - mysql

How do I run an UPDATE in MySQL from Python? Assume the Rate column is int(8).
import MySQLdb
k = int(4000)
db = MySQLdb.connect("localhost", "user", "pass", "database")
cursor = db.cursor()
sql = "UPDATE tablename SET Rate=k WHERE Name='xxx'"
try:
    cursor.execute(sql)
    db.commit()
except:
    db.rollback()
db.close()

The query should be:
sql = "UPDATE tablename SET Rate = (%s) WHERE Name='xxx'"
and then you should call:
cursor.execute(sql, (k,))
Note that k needs to be in a container to be passed correctly to .execute(); in this case it is a one-element tuple. In the original string, k is literal text inside the SQL, so MySQL sees it as a column name rather than the value of the Python variable.
Also, int(4000) is redundant. You can just write: k = 4000
If you also want to pass in the name, you can do:
sql = "UPDATE tablename SET Rate = (%s) WHERE Name=(%s)"
and then:
cursor.execute(sql, (k, name))
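The pieces above can be combined into one small helper. This is only a sketch: the function name update_rate is made up for illustration, and it assumes conn is a DB-API connection whose driver uses the %s placeholder style (as MySQLdb does).

```python
def update_rate(conn, name, rate):
    """Set Rate for one Name with a parameterized UPDATE.

    `conn` is assumed to be a DB-API connection using the %s paramstyle.
    `update_rate` is a hypothetical helper name, not part of any library.
    """
    sql = "UPDATE tablename SET Rate = %s WHERE Name = %s"
    cursor = conn.cursor()
    try:
        # The values travel separately from the SQL text, so the driver escapes them.
        cursor.execute(sql, (rate, name))
        conn.commit()
        return cursor.rowcount  # number of rows the driver reports as affected
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()
```

With the connection from the question, it would be called as update_rate(db, 'xxx', 4000). Unlike the bare except in the question, this version re-raises the error after rolling back, so a failure is not silently hidden.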

Related

generate queries for each key in pyspark data frame

I have a data frame in pyspark like below
df = spark.createDataFrame(
[
('2021-10-01','A',25),
('2021-10-02','B',24),
('2021-10-03','C',20),
('2021-10-04','D',21),
('2021-10-05','E',20),
('2021-10-06','F',22),
('2021-10-07','G',23),
('2021-10-08','H',24)],("RUN_DATE", "NAME", "VALUE"))
Now using this data frame I want to update a table in MySql
# query to run should be similar to this
update_query = "UPDATE DB.TABLE SET DATE = '2021-10-01', VALUE = 25 WHERE NAME = 'A'"
# mysql_conn is a function which I use to connect to `MySql` from `pyspark` and run queries
# Invoking the function
mysql_conn(host, user_name, password, update_query)
Now when I invoke the mysql_conn function by passing parameters the query runs successfully and the record gets updated in the MySql table.
Now I want to run the update statement for all the records in the data frame.
For each NAME it has to pick the RUN_DATE and VALUE and replace in update_query and trigger the mysql_conn.
I think we need a for loop, but I am not sure how to proceed.
Instead of iterating through the DataFrame with a for loop, it would be better to distribute the workload across the partitions using foreachPartition. Moreover, since you are writing a custom query, it would be more efficient to execute one batch operation per partition instead of one query per row, which reduces round trips, latency, and concurrent connections. For example:
def update_db(rows):
    # Build one derived table (SELECT ... UNION ALL SELECT ...) from all rows in this partition.
    temp_table_query = ""
    for row in rows:
        if len(temp_table_query) > 0:
            temp_table_query = temp_table_query + " UNION ALL "
        temp_table_query = temp_table_query + " SELECT '%s' as RUNDATE, '%s' as NAME, %d as VALUE " % (row.RUN_DATE, row.NAME, row.VALUE)
    if not temp_table_query:
        return  # empty partition: nothing to update
    update_query = """
    UPDATE DBTABLE
    INNER JOIN (
        %s
    ) new_records ON DBTABLE.NAME = new_records.NAME
    SET
        DBTABLE.DATE = new_records.RUNDATE,
        DBTABLE.VALUE = new_records.VALUE
    """ % (temp_table_query)
    mysql_conn(host, user_name, password, update_query)
df.foreachPartition(update_db)
Let me know if this works for you.
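The answer above interpolates the values straight into the SQL text. As a hedged variation, the same batched UPDATE can be built with %s placeholders and a separate parameter tuple. The name build_batch_update is made up for this sketch, and the namedtuple only stands in for the Row objects Spark would pass. Executing it assumes you can call cursor.execute(sql, params) yourself, since mysql_conn as described in the question takes only a finished query string.

```python
from collections import namedtuple

def build_batch_update(rows):
    """Return (sql, params) for one batched UPDATE, or None for an empty batch."""
    selects = []
    params = []
    for row in rows:
        # One derived-table row per input row; the values stay out of the SQL text.
        selects.append("SELECT %s AS RUNDATE, %s AS NAME, %s AS VALUE")
        params.extend([row.RUN_DATE, row.NAME, row.VALUE])
    if not selects:
        return None
    sql = (
        "UPDATE DBTABLE INNER JOIN ("
        + " UNION ALL ".join(selects)
        + ") new_records ON DBTABLE.NAME = new_records.NAME "
        "SET DBTABLE.DATE = new_records.RUNDATE, "
        "DBTABLE.VALUE = new_records.VALUE"
    )
    return sql, tuple(params)

# Hypothetical stand-in for the rows of one partition.
Row = namedtuple("Row", ["RUN_DATE", "NAME", "VALUE"])
batch = [Row("2021-10-01", "A", 25), Row("2021-10-02", "B", 24)]
sql, params = build_batch_update(batch)
```

Keeping the query builder separate from the connection code also makes it easy to check the generated SQL without a running database.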

How can I use "no more data" (end of file) as the condition to escape from a while statement?

I am using Python to update a MySQL column called 'mine', in which every row is initially NULL. I want to escape the while statement once there are no more NULL values. What condition should I add to the code below? The data is currently updated correctly, but once every row has been updated, nothing indicates that the work is finished.
import pymysql
conn = pymysql.connect(
    user='root',
    passwd='*',
    host='',
    db='practice',
    charset='utf8')
curs = conn.cursor()
num = 0
while num >= 0:
    num += 1
    sql = "update zipcode set mine = %s where mine is null limit 1"
    data = (num)
    curs.execute(sql, data)
    conn.commit()
conn.close()
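One possible approach, offered only as a sketch: DB-API cursors expose rowcount, the number of rows affected by the last statement, so the loop can stop as soon as an UPDATE touches no rows. The function name number_null_rows is made up for illustration, and conn is assumed to be a connection like the pymysql one above.

```python
def number_null_rows(conn):
    """Give each NULL `mine` row the next integer; stop when none remain.

    `conn` is assumed to be a DB-API connection using the %s paramstyle.
    Returns how many rows were updated.
    """
    sql = "UPDATE zipcode SET mine = %s WHERE mine IS NULL LIMIT 1"
    cursor = conn.cursor()
    num = 0
    while True:
        cursor.execute(sql, (num + 1,))
        if cursor.rowcount == 0:
            # No row matched "mine IS NULL", so every row has been filled in.
            break
        num += 1
        conn.commit()
    cursor.close()
    return num
```

After the loop, a line such as print(num, "rows updated") would give the missing indication that the work is over.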

Pass variables to SQL query LIMIT sql Pandas

I have declared two variables and assigned values to them, and I want to pass them to the LIMIT clause of an SQL query in Python. Below is what I have tried so far. Any help would be very much appreciated.
limitstart = 10
limitend = 100
df = pd.read_sql("SELECT NUMBERS FROM `table` LIMIT '{limitstart}', '{limitend}'", con=dbConnection)
I am getting a syntax error. I want the query to eventually be:
df = pd.read_sql("SELECT NUMBERS FROM `table` LIMIT 10, 100", con=dbConnection)
But I need to pass the values in as variables.
You just have to build the command as a string, so try something like:
df = pd.read_sql("SELECT NUMBERS FROM table LIMIT " + str(limitstart) + ", "+str(limitend),con = dbConnection)
In Python 3 (3.6 or later), you're missing the f prefix before your query string, and the single quotes (') around the placeholders are unnecessary:
limitstart = 10
limitend = 100
df = pd.read_sql(f"SELECT NUMBERS FROM `table` LIMIT {limitstart}, {limitend}", con=dbConnection)
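Both answers paste the values directly into the SQL text, which is only safe when they are known to be integers. A minimal sketch of a guard, assuming the two values may come from outside the program (the helper name build_limit_query is made up for illustration):

```python
def build_limit_query(offset, row_count):
    """Build the LIMIT query, accepting only integer-like values."""
    offset = int(offset)        # raises for input that is not integer-like
    row_count = int(row_count)
    if offset < 0 or row_count < 0:
        raise ValueError("LIMIT values must be non-negative")
    return f"SELECT NUMBERS FROM `table` LIMIT {offset}, {row_count}"

limitstart = 10
limitend = 100
query = build_limit_query(limitstart, limitend)
print(query)  # SELECT NUMBERS FROM `table` LIMIT 10, 100
```

The resulting string can be passed to pd.read_sql(query, con=dbConnection) as before. Note that in MySQL the first LIMIT number is the offset and the second is the maximum number of rows, so LIMIT 10, 100 returns up to 100 rows after skipping the first 10.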

table not updating while using functions in python

I wrote a Python program that takes arguments from a function call to update a table. The arguments are passed successfully, but the table is not updated.
import mysql.connector
mydb = mysql.connector.connect(host="localhost",user='root',passwd="",database='student')
print(mydb)
mycursor = mydb.cursor()
mycursor.execute("create TABLE if not exists testtable ( num INT NOT NULL AUTO_INCREMENT,issue varchar(30), status varchar(30),PRIMARY KEY (num))")
def dev(y,z):
    values=(y,z)
    print(values)
    print(mydb)
    sql = "UPDATE form SET status = %s WHERE num = %s"
    mycursor.execute(sql,values)
    mydb.commit()
    print(mycursor.rowcount, "record(s) affected")
values=('goog',1)
sql = "UPDATE form SET status = %s WHERE num = %s"
mycursor.execute(sql,values)
mydb.commit()
print(mycursor.rowcount, "record(s) affected")
dev('goog',2)
A similar query outside the function works properly.
For some reason, mycursor.execute() won't execute inside the function.

How can I refer to the main query's table in a nested subquery?

I have a table named passive that contains a list of timestamped events per user. I want to fill the duration attribute, which corresponds to the time between the current row's event and the next event by the same user.
I tried the following query:
UPDATE passive as passive1
SET passive1.duration = (
SELECT min(UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) )
FROM passive as passive2
WHERE passive1.user_id = passive2.user_id
AND UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) > 0
);
This returns the error message Error 1093 - You can't specify target table for update in FROM.
In order to circumvent this limitation, I tried to follow the structure given in https://stackoverflow.com/a/45498/395857, which uses a nested subquery in the FROM clause to create an implicit temporary table, so that it doesn't count as the same table we're updating:
UPDATE passive
SET passive.duration = (
SELECT *
FROM (SELECT min(UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive.event_time))
FROM passive, passive as passive2
WHERE passive.user_id = passive2.user_id
AND UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) > 0
)
AS X
);
However, the passive table in the nested subquery doesn't refer to the same passive as in the main query. Because of that, all rows get the same passive.duration value. How can I refer to the main query's passive in the nested subquery? (Or are there alternative ways to structure such a query?)
Try it like this:
UPDATE passive as passive1
SET passive1.duration = (
SELECT min(UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) )
FROM (SELECT * from passive) passive2
WHERE passive1.user_id = passive2.user_id
AND UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) > 0
);
We can use a Python script to circumvent the issue:
#!/usr/bin/python
# -*- coding: utf-8 -*-
# An index on (user_id, timestamp) is needed to speed this up.
# Download MySQLdb at http://sourceforge.net/projects/mysql-python/?source=dlp
# Tutorials: http://mysql-python.sourceforge.net/MySQLdb.html
# http://zetcode.com/db/mysqlpython/
from __future__ import print_function  # keeps print() working on Python 2 as well
import datetime
import MySQLdb
def main():
    start = datetime.datetime.now()
    # Two connections: one streams the rows to process, the other runs the lookups and updates.
    db = MySQLdb.connect(user="root", passwd="password", db="db_name")
    db2 = MySQLdb.connect(user="root", passwd="password", db="db_name")
    cursor = db.cursor()
    cursor2 = db2.cursor()
    cursor.execute("SELECT observed_event_id, user_id, observed_event_timestamp FROM observed_events ORDER BY observed_event_timestamp ASC")
    count = 0
    for row in cursor:
        count += 1
        timestamp = row[2]
        user_id = row[1]
        primary_key = row[0]
        # Find the same user's next event; the values are passed as parameters so the driver escapes them.
        sql = 'SELECT observed_event_timestamp FROM observed_events WHERE observed_event_timestamp > %s AND user_id = %s ORDER BY observed_event_timestamp ASC LIMIT 1'
        cursor2.execute(sql, (timestamp, user_id))
        duration = 0
        for row2 in cursor2:
            duration = (row2[0] - timestamp).total_seconds()
            if duration > 60 * 60:
                # Gaps longer than one hour are stored as 0.
                duration = 0
            break
        cursor2.execute("UPDATE observed_events SET observed_event_duration = %s WHERE observed_event_id = %s", (duration, primary_key))
        if count % 1000 == 0:
            db2.commit()
            print("Percent done: " + str(float(count) / cursor.rowcount * 100) + "%" + " in " + str((datetime.datetime.now() - start).total_seconds()) + " seconds.")
    db2.commit()  # commit the final batch of fewer than 1000 updates
    db.close()
    db2.close()
    diff = (datetime.datetime.now() - start).total_seconds()
    print('finished in %s seconds' % diff)
if __name__ == "__main__":
    main()