Pass variables to SQL query LIMIT sql Pandas - mysql

I have two variables i have declared and asigned values to, I want to pass them to an sql query limits in python. Below is what i have tried so far. Any help will be very much appreciated
limitstart = 10
limitend = 100
df = pd.read_sql("SELECT NUMBERS FROM `table` LIMIT '{limitstart}', '{limitend}'", con=dbConnection)
I am getting a syntax error. I would want the query to eventually be
df = pd.read_sql("SELECT NUMBERS FROM `table` LIMIT 10, 100", con=dbConnection)
But i need to pass the variables

You have to just make a string of command so try something like
df = pd.read_sql("SELECT NUMBERS FROM table LIMIT " + str(limitstart) + ", "+str(limitend),con = dbConnection)

For Python3 you're missing f before your query and have unnecessary ':
limitstart = 10
limitend = 100
df = pd.read_sql(f"SELECT NUMBERS FROM `table` LIMIT {limitstart}, {limitend}", con=dbConnection)

Related

generate queries for each key in pyspark data frame

I have a data frame in pyspark like below
df = spark.createDataFrame(
[
('2021-10-01','A',25),
('2021-10-02','B',24),
('2021-10-03','C',20),
('2021-10-04','D',21),
('2021-10-05','E',20),
('2021-10-06','F',22),
('2021-10-07','G',23),
('2021-10-08','H',24)],("RUN_DATE", "NAME", "VALUE"))
Now using this data frame I want to update a table in MySql
# query to run should be similar to this
update_query = "UPDATE DB.TABLE SET DATE = '2021-10-01', VALUE = 25 WHERE NAME = 'A'"
# mysql_conn is a function which I use to connect to `MySql` from `pyspark` and run queries
# Invoking the function
mysql_conn(host, user_name, password, update_query)
Now when I invoke the mysql_conn function by passing parameters the query runs successfully and the record gets updated in the MySql table.
Now I want to run the update statement for all the records in the data frame.
For each NAME it has to pick the RUN_DATE and VALUE and replace in update_query and trigger the mysql_conn.
I think we need to a for loop but not sure how to proceed.
Instead of iterating through the dataframe with a for loop, it would be better to distribute the workload across each partitions using foreachPartition. Moreover, since you are writing a custom query instead of executing one query for each query, it would be more efficient to execute a batch operation to reduce the round trips, latency and concurrent connections. Eg
def update_db(rows):
temp_table_query=""
for row in rows:
if len(temp_table_query) > 0:
temp_table_query = temp_table_query + " UNION ALL "
temp_table_query = temp_table_query + " SELECT '%s' as RUNDATE, '%s' as NAME, %d as VALUE " % (row.RUN_DATE,row.NAME,row.VALUE)
update_query="""
UPDATE DBTABLE
INNER JOIN (
%s
) new_records ON DBTABLE.NAME = new_records.NAME
SET
DBTABLE.DATE = new_records.RUNDATE,
DBTABLE.VALUE = new_records.VALUE
""" % (temp_table_query)
mysql_conn(host, user_name, password, update_query)
df.foreachPartition(update_db)
View Demo on how the UPDATE query works
Let me know if this works for you.

SQL BINARY Operator not showing exact match

I have a table with the following columns...
[Name] = [Transliteration] = [Hexadecimal] = [HexadecimalUTF8]
...with multiple rows of UTF-8 characters, such as:
ङ = ṅa = 0919 = e0a499
ञ = ña = 091e = e0a49e
ण = ṇa = 0923 = e0a4a3
न = na = 0928 = e0a4a8
In order to search for the row that exactly matches ña in the Transliteration column , I enter the following command:
SELECT DISTINCT * FROM Samskrta
WHERE BINARY (Transliteration = concat(0xc3b161))
ORDER BY HexadecimalUTF8;
...which produces 4 rows.
Why is the SQL command not producing only the row that exactly matches ña?
What SQL command produces only the row that exactly matches ña?
The following command produces the same results:
SELECT DISTINCT * FROM Samskrta
WHERE BINARY (Transliteration = 'ña')
ORDER BY HexadecimalUTF8;
FIRST OF ALL, your query can't work as indicated: you are applying BINARY() to the result of the logical comparison, NOT comparing the BINARY() of whatever to whatever.
Try reproducing your code PRECISELY if you expect people to be able to tender assistance.

How to do MySQL update query in r

In MySQL, what I did is:
update table set group = 1 where log(id) < 0
I wanna do same thing in R but I have no idea
What should I do?
We can use data.table in R. Convert the 'data.frame' to 'data.table' (setDT(tab)), using the logical condition in 'i', we assign (:=) 1 to 'group' variable. As this does in place, it would be much faster and efficient.
library(data.table)
setDT(tab)[log(id) <0, group := 1]
Using sqldf: You cannot use group as the variable name so im changing to catg.
sqldf(c("update table set catg = 1 where log(id) < 0", "select * from main.xy"))

SQLAlchemy MySQL IF Statement

I'm in the middle of converting an old legacy PHP system to Flask + SQLAlchemy and was wondering how I would construct the following:
I have a model:
class Invoice(db.Model):
paidtodate = db.Column(DECIMAL(10,2))
fullinvoiceamount = db.Column(DECIMAL(10,2))
invoiceamount = db.Column(DECIMAL(10,2))
invoicetype = db.Column(db.String(10))
acis_cost = db.Column(DECIMAL(10,2))
The query I need to run is:
SELECT COUNT(*) AS the_count, sum(if(paidtodate>0,paidtodate,if(invoicetype='CPCN' or invoicetype='CPON' or invoicetype='CBCN' or invoicetype='CBON' or invoicetype='CPUB' or invoicetype='CPGU' or invoicetype='CPSO',invoiceamount,
fullinvoiceamount))) AS amount,
SUM(acis_cost) AS cost, (SUM(if(paidtodate>0,paidtodate,invoiceamount))-SUM(acis_cost)) AS profit FROM tblclientinvoices
Is there an SQLAlchemyish way to construct this query? - I've tried googling for Mysql IF statments with SQlAlchemy but drew blanks.
Many thanks!
Use func(documentation) to generate SQL function expression:
qry = select([
func.count().label("the_count"),
func.sum(func.IF(
Invoice.paidtodate>0,
Invoice.paidtodate,
# #note: I prefer using IN instead of multiple OR statements
func.IF(Invoice.invoicetype.in_(
("CPCN", "CPON", "CBCN", "CBON", "CPUB", "CPGU", "CPSO",)
),
Invoice.invoiceamount,
Invoice.fullinvoiceamount)
)
).label("amount"),
func.sum(Invoice.acis_cost).label("Cost"),
(func.sum(func.IF(
Invoice.paidtodate>0,
Invoice.paidtodate,
Invoice.invoiceamount
))
- func.sum(Invoice.acis_cost)
).label("Profit"),
],
)
rows = session.query(qry).all()
for row in rows:
print row

SQL Server 2008: Error converting data type nvarchar to float

Presently troubleshooting a problem where running this SQL query:
UPDATE tblBenchmarkData
SET OriginalValue = DataValue, OriginalUnitID = DataUnitID,
DataValue = CAST(DataValue AS float) * 1.335
WHERE
FieldDataSetID = '6956beeb-a1e7-47f2-96db-0044746ad6d5'
AND ZEGCodeID IN
(SELECT ZEGCodeID FROM tblZEGCode
WHERE(ZEGCode = 'C004') OR
(LEFT(ZEGParentCode, 4) = 'C004'))
Results in the following error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type nvarchar to float.
The really odd thing is, if I change the UPDATE to SELECT to inspect the values that are retrieved are numerical values:
SELECT DataValue
FROM tblBenchmarkData
WHERE FieldDataSetID = '6956beeb-a1e7-47f2-96db-0044746ad6d5'
AND ZEGCodeID IN
(SELECT ZEGCodeID
FROM tblZEGCode WHERE(ZEGCode = 'C004') OR
(LEFT(ZEGParentCode, 4) = 'C004'))
Here are the results:
DataValue
2285260
1205310
Would like to use TRY_PARSE or something like that; however, we are running on SQL Server 2008 rather than SQL Server 2012. Does anyone have any suggestions? TIA.
It would be helpful to see the schema definition of tblBenchmarkData, but you could try using ISNUMERIC in your query. Something like:
SET DataValue = CASE WHEN ISNUMERIC(DataValue)=1 THEN CAST(DataValue AS float) * 1.335
ELSE 0 END
Order of execution not always matches one's expectations.
If you set a where clause, it generally does not mean the calculations in the select list will only be applied to the rows that match that where. SQL Server may easily decide to do a bulk calculation and then filter out unwanted rows.
That said, you can easily write try_parse yourself:
create function dbo.try_parse(#v nvarchar(30))
returns float
with schemabinding, returns null on null input
as
begin
if isnumeric(#v) = 1
return cast(#v as float);
return null;
end;
So starting with your update query that's giving an error (please forgive me for rewriting it for my own clarity):
UPDATE B
SET
OriginalValue = DataValue,
OriginalUnitID = DataUnitID,
DataValue = CAST(DataValue AS float) * 1.335
FROM
dbo.tblBenchmarkData B
INNER JOIN dbo.tblZEGCode Z
ON B.ZEGCodeID = Z.ZEGCodeID
WHERE
B.FieldDataSetID = '6956beeb-a1e7-47f2-96db-0044746ad6d5'
AND (
Z.ZEGCode = 'C004' OR
Z.ZEGParentCode LIKE 'C004%'
)
I think you'll find that a SELECT statement with exactly the same expressions will give the same error:
SELECT
OriginalValue,
DataValue NewOriginalValue,
OriginalUnitID,
DataUnitID OriginalUnitID,
DataValue,
CAST(DataValue AS float) * 1.335 NewDataValue
FROM
dbo.tblBenchmarkData B
INNER JOIN dbo.tblZEGCode Z
ON B.ZEGCodeID = Z.ZEGCodeID
WHERE
B.FieldDataSetID = '6956beeb-a1e7-47f2-96db-0044746ad6d5'
AND (
Z.ZEGCode = 'C004' OR
Z.ZEGParentCode LIKE 'C004%'
)
This should show you the rows that can't convert:
SELECT
B.*
FROM
dbo.tblBenchmarkData B
INNER JOIN dbo.tblZEGCode Z
ON B.ZEGCodeID = Z.ZEGCodeID
WHERE
B.FieldDataSetID = '6956beeb-a1e7-47f2-96db-0044746ad6d5'
AND (
Z.ZEGCode = 'C004' OR
Z.ZEGParentCode LIKE 'C004%'
)
AND IsNumeric(DataValue) = 0
-- AND IsNumeric(DataValue + 'E0') = 0 -- try this if the prior doesn't work
The trick in the last commented line is to tack on things to the string to force only valid numbers to be numeric. For example, if you wanted only integers, IsNumeric(DataValue + '.0E0') = 0 would show you those that aren't.