Python string formatting on MySQL query - mysql

I'm trying to execute a query using two parameters, one is a list and the other is a number.
Here is my code:
cursor.execute("SELECT cod_art, price_EU, price_third_country
FROM lt_gamma_articles
WHERE cod_art in (%s) AND cod_store = %d
ORDER BY cod_art, timestamp"
% format_strings, tuple(cod_art_int), int(shop) )
I get this error:
TypeError: not enough arguments for format string
I think the error is in the string formatting, but I don't know how to format it correctly and I've been stuck for a while.

From the looks of your code, I'm guessing you're basing it on "imploding a list for use in a python MySQLDB IN clause". Assuming that's the case, and
format_strings is built similar to:
format_strings = ','.join(['%s'] * len(cod_art_int))
I would use .format to build the query string, and build up a list for the positional query parameters:
query_params = list(cod_art_int)
query_params.append(int(shop))
query_sql = """
SELECT cod_art, price_EU, price_third_country
FROM lt_gamma_articles
WHERE cod_art IN ({cod_art_list}) AND cod_store = %s
ORDER BY cod_art, timestamp
""".format(cod_art_list=format_strings)
cursor.execute(query_sql, query_params)
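As for why the original call fails: in the question's code the % operator is applied to the SQL string with format_strings as its only argument, because % binds more tightly than the commas separating the arguments to execute(). The string contains two conversion specifiers (%s and %d) but receives a single value, which is what raises the TypeError. A minimal sketch that reproduces this without a database (template and message are names chosen only for this illustration):

```python
# Same shape as the WHERE clause in the question: two conversion specifiers.
template = "cod_art in (%s) AND cod_store = %d"
format_strings = "%s,%s,%s"

try:
    # Only format_strings reaches the % operator, so %d is left without a value.
    template % format_strings
except TypeError as exc:
    message = str(exc)

print(message)
```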
Explanation:
Say cod_art_int is this list of integer values:
cod_art_int = [10, 20, 30]
In order to use those values in the cod_art IN (...) part of the query, you need to add a %s for each one. That is done by:
format_strings = ','.join(['%s'] * len(cod_art_int))
Which can be broken down to:
step_one = ['%s'] * len(cod_art_int)
# ['%s', '%s', '%s']
format_strings = ','.join(step_one)
# '%s,%s,%s'
When you build the final query string, you replace {cod_art_list} with format_strings:
query_sql = """
SELECT cod_art, price_EU, price_third_country
FROM lt_gamma_articles
WHERE cod_art IN ({cod_art_list}) AND cod_store = %s
ORDER BY cod_art, timestamp
""".format(cod_art_list=format_strings)
And you get the query string:
SELECT cod_art, price_EU, price_third_country
FROM lt_gamma_articles
WHERE cod_art IN (%s,%s,%s) AND cod_store = %s
ORDER BY cod_art, timestamp
Then your query parameters will be safely substituted into the query in place of the %s placeholders. You build up the parameter list in the same order as the placeholders. Since cod_art IN (%s,%s,%s) comes first, you add those values to the list first, followed by the value for cod_store (int(shop), which I'm going to say is 456):
query_params = list(cod_art_int)
# [10, 20, 30]
query_params.append(int(shop))
# [10, 20, 30, 456]
In the end you execute the query with its parameters:
cursor.execute(query_sql, query_params)
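The steps above could be wrapped into one helper, roughly like this. It is only a sketch: build_price_query is a hypothetical name, and the guard against an empty list is an addition of mine (an empty IN () is not valid MySQL). The execute call is left commented out because it needs the asker's existing cursor.

```python
def build_price_query(cod_art_int, shop):
    """Return (sql, params) for the IN-clause query described above."""
    if not cod_art_int:
        # "IN ()" is not valid MySQL, so refuse an empty list up front.
        raise ValueError("cod_art_int must contain at least one value")
    # One %s placeholder per article code, e.g. '%s,%s,%s'.
    format_strings = ",".join(["%s"] * len(cod_art_int))
    query_sql = """
        SELECT cod_art, price_EU, price_third_country
        FROM lt_gamma_articles
        WHERE cod_art IN ({cod_art_list}) AND cod_store = %s
        ORDER BY cod_art, timestamp
    """.format(cod_art_list=format_strings)
    # Parameters follow the placeholder order: article codes, then the store.
    query_params = list(cod_art_int) + [int(shop)]
    return query_sql, query_params


query_sql, query_params = build_price_query([10, 20, 30], "456")
# cursor.execute(query_sql, query_params)  # cursor: the existing DB cursor
```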

Related

generate queries for each key in pyspark data frame

I have a data frame in pyspark like below
df = spark.createDataFrame(
[
('2021-10-01','A',25),
('2021-10-02','B',24),
('2021-10-03','C',20),
('2021-10-04','D',21),
('2021-10-05','E',20),
('2021-10-06','F',22),
('2021-10-07','G',23),
('2021-10-08','H',24)],("RUN_DATE", "NAME", "VALUE"))
Now using this data frame I want to update a table in MySql
# query to run should be similar to this
update_query = "UPDATE DB.TABLE SET DATE = '2021-10-01', VALUE = 25 WHERE NAME = 'A'"
# mysql_conn is a function which I use to connect to `MySql` from `pyspark` and run queries
# Invoking the function
mysql_conn(host, user_name, password, update_query)
Now when I invoke the mysql_conn function by passing parameters the query runs successfully and the record gets updated in the MySql table.
Now I want to run the update statement for all the records in the data frame.
For each NAME it has to pick the RUN_DATE and VALUE and replace in update_query and trigger the mysql_conn.
I think we need a for loop, but I'm not sure how to proceed.
Instead of iterating through the dataframe with a for loop, it would be better to distribute the workload across the partitions using foreachPartition. Moreover, since you are writing a custom query, instead of executing one query per row it would be more efficient to execute a batch operation to reduce the round trips, latency and concurrent connections. For example:
def update_db(rows):
    temp_table_query = ""
    for row in rows:
        if len(temp_table_query) > 0:
            temp_table_query = temp_table_query + " UNION ALL "
        temp_table_query = temp_table_query + " SELECT '%s' as RUNDATE, '%s' as NAME, %d as VALUE " % (row.RUN_DATE, row.NAME, row.VALUE)
    update_query = """
    UPDATE DBTABLE
    INNER JOIN (
    %s
    ) new_records ON DBTABLE.NAME = new_records.NAME
    SET
    DBTABLE.DATE = new_records.RUNDATE,
    DBTABLE.VALUE = new_records.VALUE
    """ % (temp_table_query)
    mysql_conn(host, user_name, password, update_query)

df.foreachPartition(update_db)
Let me know if this works for you.
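One caveat with the snippet above is that the row values are pasted straight into the SQL text, so a NAME containing a quote would break the statement, and an empty partition would produce an empty subquery. A hedged sketch of a variant that keeps %s placeholders and returns the values separately follows. build_batch_update and the namedtuple Row are hypothetical stand-ins so the example runs without Spark, and it assumes mysql_conn could be adapted to pass the parameter list on to cursor.execute, which the original post does not show.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row so the sketch runs without Spark.
Row = namedtuple("Row", ["RUN_DATE", "NAME", "VALUE"])


def build_batch_update(rows):
    """Return (sql, params) for one UPDATE ... INNER JOIN covering all rows."""
    selects = []
    params = []
    for row in rows:
        # One derived-table row per data frame row, values left as placeholders.
        selects.append("SELECT %s AS RUNDATE, %s AS NAME, %s AS VALUE")
        params.extend([row.RUN_DATE, row.NAME, row.VALUE])
    if not selects:
        return None, []  # empty partition: nothing to update
    sql = (
        "UPDATE DBTABLE INNER JOIN ("
        + " UNION ALL ".join(selects)
        + ") new_records ON DBTABLE.NAME = new_records.NAME "
        "SET DBTABLE.DATE = new_records.RUNDATE, "
        "DBTABLE.VALUE = new_records.VALUE"
    )
    return sql, params


sql, params = build_batch_update([
    Row("2021-10-01", "A", 25),
    Row("2021-10-02", "B", 24),
])
```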

Pass variables to SQL query LIMIT sql Pandas

I have two variables that I have declared and assigned values to, and I want to pass them as the limits of an SQL query in Python. Below is what I have tried so far. Any help will be very much appreciated.
limitstart = 10
limitend = 100
df = pd.read_sql("SELECT NUMBERS FROM `table` LIMIT '{limitstart}', '{limitend}'", con=dbConnection)
I am getting a syntax error. I would want the query to eventually be
df = pd.read_sql("SELECT NUMBERS FROM `table` LIMIT 10, 100", con=dbConnection)
But I need to pass the variables.
You just have to build the command as a string, so try something like:
df = pd.read_sql("SELECT NUMBERS FROM table LIMIT " + str(limitstart) + ", "+str(limitend),con = dbConnection)
In Python 3.6+ you're missing the f prefix before your query string, and the quotes (') around the placeholders are unnecessary:
limitstart = 10
limitend = 100
df = pd.read_sql(f"SELECT NUMBERS FROM `table` LIMIT {limitstart}, {limitend}", con=dbConnection)
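Since both answers paste the values directly into the SQL text, it may be worth forcing them to integers first if they can ever come from user input. A small sketch under that assumption (build_limit_query is a hypothetical helper, not part of pandas). Note that in MySQL the first LIMIT value is the offset and the second is the row count.

```python
def build_limit_query(limitstart, limitend):
    """Return the SELECT with both LIMIT values forced to non-negative ints."""
    offset = int(limitstart)   # raises ValueError for non-numeric input
    row_count = int(limitend)
    if offset < 0 or row_count < 0:
        raise ValueError("LIMIT values must be non-negative")
    return f"SELECT NUMBERS FROM `table` LIMIT {offset}, {row_count}"


query = build_limit_query(10, 100)
# df = pd.read_sql(query, con=dbConnection)  # dbConnection as in the question
```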

placeholders for table names in python mysql

I am using python sql to edit a very simple table named students (whose columns are name and age), as shown below:
('Rachel', 22)
('Linckle', 33)
('Bob', 45)
('Amanda', 25)
('Jacob', 85)
('Avi', 65)
('Michelle', 45)
I am defining python functions to execute SQL code.
In my first function I want to update the age values in students table where the name matches something (e.g. Bob). If I define the following function:
def update_age(age, name):
    c.execute("""UPDATE students SET age = %s
                 WHERE name = %s""", (age, name))
And then:
update_age(99, 'Bob')
I will get:
('Rachel', 22)
('Linckle', 33)
('Bob', 99)
('Amanda', 25)
('Jacob', 85)
('Avi', 65)
('Michelle', 45)
On a second function I would like to specify also the name of the table, with the following code:
def update_age_table(table, age, name):
    c.execute("""UPDATE %s SET age = %s
                 WHERE name = %s""",
              (table, age, name))  # note that here I am only replacing students by the placeholder %s
Then if I do:
update_age_table(table='students', age=95, name='Jacob')
I will get the following error message (it is long, so I am only displaying the last sentence):
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''students' SET age = 95
WHERE name = 'Jacob'' at line 1
I guess that the error comes from the fact that I am assigning two of the placeholders to variables, namely age and name, which is not the case of the table name, where there is no variable assignment.
Does anyone know how I can use placeholders in SQL commands without assigning them to variables?
That's because you cannot pass the table name as a parameter in the execute call. You should do it this way:
def update_age_table(table, age, name):
    c.execute("UPDATE " + table + """ SET age = %s
                 WHERE name = %s""",
              (table, age, name))
The prepared statement doesn't work for table names
EDIT
You have to remove the table parameter like this:
def update_age_table(table, age, name):
    c.execute("UPDATE " + table + " SET age = %s WHERE name = %s", (age, name))
Sorry, that was a mistake.
dt= datetime.datetime.now()
new_date=str(dt)
idname=input("Please enter Your Id. ")
bname= input("Please Enter name of book which you want to Issue: ")
idn=(idname,)
sql="insert into id%s (issuedbook,date)"%idn +"values (%s,%s)"
val=(bname,new_date)
cursor.execute(sql,val)
cnx.commit()
insert_data()
I haven't tested it, but this should be a cleaner coding style than the accepted answer. As the whole Q/A shows, the values are passed only at cursor.execute() time to make it more secure, but the table name is part of the SQL text itself, which is fixed before the arguments are bound. That is why the table name has to be inserted as plain text before execute() is called, while the values do not. See another example with a similar challenge at "Python - pass a list as params to SQL, plus more variables", where the table is not passed as a parameter either.
Therefore, just as an add-on to the rightly accepted answer:
def update_age_table(UPDATE_QUERY, args):
    c.execute(UPDATE_QUERY, args)
    conn.commit()  # commit on the connection (assumed here to be named conn); cursors have no commit()

# example for string testing:
table, age, name = "table_x", 2, "name_y"
UPDATE_QUERY = f"""
    UPDATE {table}
    SET age = %s
    WHERE name = %s
"""
# UPDATE_QUERY Out:
# '\n    UPDATE table_x\n    SET age = %s\n    WHERE name = %s\n'
args = [age, name]
update_age_table(UPDATE_QUERY, args)
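Because the table name ends up in the SQL text as plain text, it is not protected by the driver's escaping. If it can come from outside the program, one option is to check it against a fixed set of known tables before interpolating it. A minimal sketch, where ALLOWED_TABLES and build_update_query are hypothetical names introduced for illustration:

```python
ALLOWED_TABLES = {"students"}  # hypothetical whitelist; list your real tables here


def build_update_query(table):
    """Return the UPDATE statement for a table name that is on the whitelist."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table!r}")
    # Only the table name is interpolated; age and name stay as placeholders.
    return f"UPDATE `{table}` SET age = %s WHERE name = %s"


query = build_update_query("students")
# c.execute(query, (95, "Jacob"))  # c is the cursor from the question
```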

Bulk update MySql with python

I have to update millions of rows in MySQL. I am currently using a for loop to execute the queries. To make the update faster I want to use executemany() of Python MySQL Connector, so that I can update in batches using a single query for each batch.
I don't think mysqldb has a way of handling multiple UPDATE queries at one time.
But you can use an INSERT query with ON DUPLICATE KEY UPDATE condition at the end.
I have written the following example for ease of use and readability.
import MySQLdb

def update_many(data_list=None, mysql_table=None):
    """
    Updates a mysql table with the data provided. If a row's key does not
    exist yet, the data will be inserted into the table.
    The dictionaries must have all the same keys due to how the query is built.
    Param:
        data_list (List):
            A list of dictionaries where the keys are the mysql table
            column names, and the values are the update values
        mysql_table (String):
            The mysql table to be updated.
    """
    # Connection and Cursor
    conn = MySQLdb.connect('localhost', 'jeff', 'atwood', 'stackoverflow')
    cur = conn.cursor()
    query = ""
    values = []
    for data_dict in data_list:
        if not query:
            # Build the query once, from the keys of the first dictionary
            columns = ', '.join('`{0}`'.format(k) for k in data_dict)
            duplicates = ', '.join('{0}=VALUES({0})'.format(k) for k in data_dict)
            place_holders = ', '.join('%s' for k in data_dict)
            query = "INSERT INTO {0} ({1}) VALUES ({2})".format(mysql_table, columns, place_holders)
            query = "{0} ON DUPLICATE KEY UPDATE {1}".format(query, duplicates)
        # list() so this also works on Python 3, where values() is a view
        values.append(list(data_dict.values()))
    try:
        cur.executemany(query, values)
    except MySQLdb.Error as e:
        try:
            print("MySQL Error [%d]: %s" % (e.args[0], e.args[1]))
        except IndexError:
            print("MySQL Error: %s" % str(e))
        conn.rollback()
        return False
    conn.commit()
    cur.close()
    conn.close()
Explanation of one liners
columns = ', '.join('`{}`'.format(k) for k in data_dict)
is the same as
column_list = []
for k in data_dict:
    column_list.append('`{}`'.format(k))
columns = ", ".join(column_list)
Here's an example of usage
test_data_list = []
test_data_list.append( {'id' : 1, 'name' : 'Marco', 'articles' : 1 } )
test_data_list.append( {'id' : 2, 'name' : 'Keshaw', 'articles' : 8 } )
test_data_list.append( {'id' : 3, 'name' : 'Wes', 'articles' : 0 } )
update_many(data_list=test_data_list, mysql_table='writers')
Query output (the column order follows the dictionary key order, so it may differ on your Python version)
INSERT INTO writers (`articles`, `id`, `name`) VALUES (%s, %s, %s) ON DUPLICATE KEY UPDATE articles=VALUES(articles), id=VALUES(id), name=VALUES(name)
Values output
[[1, 1, 'Marco'], [8, 2, 'Keshaw'], [0, 3, 'Wes']]
Maybe this can help
How to update multiple rows with single MySQL query in python?
cur.executemany("UPDATE Writers SET Name = %s WHERE Id = %s ",
[("new_value" , "3"),("new_value" , "6")])
conn.commit()
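With millions of rows, the parameter list passed to executemany() would probably be sent in slices rather than all at once. A sketch of one way to do the slicing follows. chunked is a hypothetical helper, the batch size of 3 is only for the demonstration, and the database calls are commented out because they need the cur and conn objects from the examples above.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


updates = [("new_value", i) for i in range(1, 8)]
batches = list(chunked(updates, 3))
# for batch in batches:
#     cur.executemany("UPDATE Writers SET Name = %s WHERE Id = %s", batch)
#     conn.commit()
```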

Ordering a queryset by occurrences

I have a django model:
class Field(models.Model):
    choice = models.CharField(choices=choices)
    value = models.CharField(max_length=255)
In my database I have some cases where there are 3 "fields" with the same choice, and some cases where there is 1 field of that choice
How can I order the queryset so it returns, sorted by choice, but with all ones in a set of 3 at the start?
For example
[1,1,1,3,3,3,4,4,4,2,5] where 1,2,3,4,5 are possible choices?
This is the best I can do using django's ORM. Basically, just like in SQL, you have to construct a custom order_by statement. In our case, we'll place it in the SELECT and then order by it:
1) Get a list of choices sorted by frequency: [1, 3, 4, 2, 5]
freq_list = (
Field.objects.values_list('choice', flat=True)
.annotate(c=Count('id')).order_by('-c', 'choice')
)
2) Add indexes with enumerate: [(0,1), (1,3), (2,4), (3,2), (4,5)]
enum_list = list(enumerate(freq_list))
3) Create a list of cases: ['CASE', 'WHEN choice=1 THEN 0', ..., 'END']
case_list = ['CASE']
case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enum_list]
case_list += ['END']
4) Combine the case list into one string: 'CASE WHEN choice=1 THEN 0 ...'
case_statement = ' '.join(case_list)
5) Finally, use the case statement to select an extra field 'o' which will be corresponding order, then just order by this field
Field.objects.extra(select={'o': case_statement}).order_by('o')
To simplify all this, you can put the above code into a Model Manager:
class FieldManager(models.Manager):
    def get_queryset(self):  # named get_query_set before Django 1.6
        freq_list = (
            Field.objects.values_list('choice', flat=True)
            .annotate(c=Count('id')).order_by('-c', 'choice')
        )
        enum_list = list(enumerate(freq_list))
        case_list = ['CASE']
        case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enum_list]
        case_list += ['END']
        case_statement = ' '.join(case_list)
        ordered = Field.objects.extra(select={'o': case_statement}).order_by('o')
        return ordered

class Field(models.Model):
    ...
    objects = models.Manager()  # keep the default manager, which FieldManager relies on
    freq_sorted = FieldManager()
Now you can query:
Field.freq_sorted.all()
Which will get you a Field QuerySet sorted by frequency of choices
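To see what steps 1 to 4 produce without a database, here is a plain-Python sketch that uses collections.Counter in place of the Django aggregation, applied to the example list from the question. It only mirrors the string building. Like the original, it compares choice to unquoted values, so string choices would need quoting in real SQL.

```python
from collections import Counter

choices = [1, 1, 1, 3, 3, 3, 4, 4, 4, 2, 5]

# Step 1: choices ordered by descending count, ties broken by choice value.
counts = Counter(choices)
freq_list = sorted(counts, key=lambda choice: (-counts[choice], choice))

# Steps 2-4: enumerate the choices and join everything into one CASE expression.
case_list = ["CASE"]
case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enumerate(freq_list)]
case_list += ["END"]
case_statement = " ".join(case_list)

print(freq_list)
print(case_statement)
```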
You could write a function that detects which choices are repeated and selects the unique ones, then call it from MySQL as a function.