I'm running an R script that grabs a query from MySQL. The query itself is functional, but is dependent on a variable "N".
To add this variable to my SQL code, I'm using sprintf to insert it by typing %s where I'd like it. However, my query also includes multiple of these LIKE statements:
FROM `Receipts`
WHERE `RetailerID`
IN ( '%s' ) # this is where "N" is placed
AND ( `Date` LIKE '%01/07/2014%')
I'm positive that this is the reason my query is not running. The sprintf command has trouble when it reaches these LIKE patterns, probably treating the % signs as format specifiers like %s.
Does anyone know how to get around this so that %01/07/2014% is still printed to the SQL query? I've tried using the escape %% like this, %%01/07/2014%% but it still doesn't work.
Is there a way I can format sprintf so it knows to skip these?
Thanks!
To make @cryo111's comment an explicit answer:
Use gsub like this:
N=10
query="select * from table where table.cnt=#N"
gsub("#N",N,query)
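You can use RMySQL's dbEscapeStrings together with sprintf to handle placeholders. Escape the value rather than the whole statement, then substitute it into the query:
library(RMySQL)
con <- dbConnect(MySQL(), dbname = "foobar")
# Escape only the value that is being inserted
val <- dbEscapeStrings(con, "sometext")
tmp <- sprintf("SELECT * FROM someField WHERE someOtherField = '%s'", val)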
sprintf works fine for me provided I escape the % signs:
x <- "from receipts where retailerid in ('%s') and (date like '%%01/07/2014%%')"
> sprintf(x,"a")
[1] "from receipts where retailerid in ('a') and (date like '%01/07/2014%')"
The above runs just fine for me. However, in general, I wouldn't recommend sprintf over gsub just because it will become cumbersome fairly quickly.
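If "N" can hold several retailer IDs, the same approach can be extended roughly like this. This is only a sketch: the ids vector is made up for illustration, and nothing here escapes the values, so it assumes they come from a trusted source.

```r
# Hypothetical retailer IDs; the original question uses a single value "N".
ids <- c("10", "27")

# Quote each ID and join them with commas to form the IN (...) list.
# No SQL escaping is done here, so this is only safe for trusted values.
in_list <- paste(sprintf("'%s'", ids), collapse = ", ")

# %s is the placeholder; %% yields a literal % for the LIKE pattern.
template <- "SELECT * FROM `Receipts` WHERE `RetailerID` IN (%s) AND (`Date` LIKE '%%01/07/2014%%')"
query <- sprintf(template, in_list)
cat(query, "\n")
```

Only the % signs that belong to the LIKE pattern need doubling; the %s placeholder stays as it is.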
I tend to use paste for sql queries. So in your case I would use something like:
paste("select * from Receipts where RetailerID in(",as.character(N),") and (date like '%01/07/2014%')")
This is my timestamp
q1<-Sys.time()-777000
q1
#"2019-09-12 08:39:27 GMT"
This is what I am trying to do, but I am getting an error:
Sys.time()
dbSendQuery(conn,"delete from anomaly_hourly_temp where report_time>q1")
Sys.time()
dbSendQuery(conn,"delete from anomaly_hourly_temp where report_time>q1")
Error in .local(conn, statement, ...) :
could not run statement: Unknown column 'q1' in 'where clause'
I also tried this. It does not show any error, but it does not delete any rows based on the timestamp:
Sys.time()
dbSendQuery(conn,"delete from anomaly_hourly_temp where report_time>'q1'")
Sys.time()
If I explicitly write out the timestamp (the value of q1), it does work, as shown below:
dbSendQuery(conn,"delete from anomaly_hourly_temp where report_time>'2019-09-22 11:42:51'")
Right now you attempt to interpolate the R variable into the SQL statement, but SQL reads the literal text q1 and not its underlying value. While concatenating the R variable into the SQL string is a solution, it is safer, more efficient, avoids quoting problems, and is industry best practice to use parameterization: a prepared statement with the parameter bound in a subsequent step:
# PREPARED STATEMENT
sql <- "delete from anomaly_hourly_temp where report_time > ?"
# BIND PARAM AND EXECUTE ACTION
dbSendQuery(conn, sql, list(q1))
Use paste0 to paste the query together.
DBI::dbSendQuery(conn,
paste0("delete from anomaly_hourly_temp where report_time > '", q1, "'"))
Besides paste0, you can also use paste, str_c, glue, sprintf, or any other function that assembles the query string.
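When building the string by hand, it may help to format the timestamp explicitly rather than rely on the implicit conversion of the POSIXct value. A minimal sketch, using a fixed timestamp in place of Sys.time() - 777000 so the result is reproducible:

```r
# Fixed example timestamp standing in for Sys.time() - 777000.
q1 <- as.POSIXct("2019-09-12 08:39:27", tz = "GMT")

# Format explicitly so the SQL literal does not depend on how the
# POSIXct value happens to be converted to character.
ts <- format(q1, "%Y-%m-%d %H:%M:%S")

# Substitute the formatted timestamp into the statement.
sql <- sprintf("delete from anomaly_hourly_temp where report_time > '%s'", ts)
cat(sql, "\n")
```

The resulting string can then be passed to dbSendQuery in the same way as the hand-written statement in the question.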
I'm using PDO to connect to MySQL. Everything is working fine, except this doesn't work.
Does anyone know why? And how should I do it?
SELECT * FROM flagIt WHERE :flagids LIKE CONCAT('%', flagIt.flagIt_id, '%')
:flagids is equivalent to a string like "ID1 ID2 ID3".
EDIT (just to compare)
SELECT * FROM flagIt WHERE 'ID1 ID2 ID3' LIKE CONCAT('%', flagIt.flagIt_id, '%')
If I write it like this, it works fine, so why does it not work with :flagids?
I hope you understand my problem.
Thank you very much.
EDIT
I tried:
"SELECT * FROM flagIt WHERE flagIt.flagIt_id IN(:flagids)"
and as Hobo Sapiens suggested
"SELECT * FROM flagIt WHERE FIND_IN_SET(flagIt.flagIt_id, :flagids)"
and nothing works!!!!!!!!!
This is the query you're submitting to PDO::prepare():
SELECT * FROM flagIt WHERE :flagids LIKE CONCAT('%', flagIt.flagIt_id, '%')
The process of preparing a statement doesn't just involve evaluating the contents of the placeholders, substituting them in a string and executing the resulting query.
A prepare asks the server to evaluate the query and prepare an execution plan that includes the table and indexes it will use. For that it needs to know which columns of which tables it must work with, which is why one cannot use a placeholder where you would need an identifier.
The problem with your query is that the server has no way to know at the time it prepares the statement whether the placeholder represents a string literal or a column identifier. Without that information, the preparation cannot be done, and your prepare will fail.
If you have some flexibility over the value you're using in :flagids you could use find_in_set():
SELECT * FROM flagIt WHERE find_in_set(flagIt_id, :flagids)
where a variable containing, for example, 'ID1,ID2,ID3' is bound to :flagids.
This will be fine for small lists, but will be slow for a large list.
MySQL reference for find_in_set()
I am researching how to read in data from a server directly to a data frame in R. In the past I have written SQL queries that were over 50 lines long (with all the selects and joins). Any advice on how to write long queries in R? Is there some way to write the query elsewhere in R, then paste it in to the "sqlQuery" part of the code?
Keep long SQL queries in .sql files and read them in using readLines + paste with collapse='\n'
my_query <- paste(readLines('your_query.sql'), collapse='\n')
results <- sqlQuery(con, my_query)
You can paste any SQL query into R as is and then simply replace the newlines + spaces with a single space. For instance:
## Connect to DB
library(RODBC)
con <- odbcConnect('MY_DB')
## Paste the query as is (you can have as many spaces and lines as you want)
query <-
"
SELECT [Serial Number]
,[Series]
,[Customer Name]
,[Zip_Code]
FROM [dbo].[some_db]
where [Is current] = 'Yes' and
[Serial Number] LIKE '5%' and
[Series] = '20'
order by [Serial Number]
"
## Simply replace the newlines + spaces with a single space and you are good to go
res <- sqlQuery(con, gsub("\\n\\s+", " ", query))
close(con)
Keeping queries in separate .sql files (which works for most SQL or NoSQL engines) can be a nuisance if you prefer to edit all your code in one file.
If you use RStudio (or another tool where code folding can be customized), you can simplify things by wrapping the query in braces. I prefer using {...} and folding the code.
For example:
query <- {'
SELECT
user_id,
format(date,"%Y-%m") month,
product_group,
product,
sum(amount_net) income,
count(*) number
FROM People
WHERE
date > "2015-01-01" and
country = "Canada"
GROUP BY 1,2,3;'}
A query can even be folded inside a function call (folding a long argument), or in other situations where the code grows inconveniently long.
I had this issue trying to run a 17-line SQL query through RODBC. I tried @arvi1000's solution, but no matter what I did it produced an error message and would not execute more than one line of the .sql file. I tried variations of the value for collapse and different ways of reading in the file, and spent 90 minutes trying to get it to work. I suspect RODBC might behave differently with multi-line queries on different platforms or with different versions of MySQL or ODBC settings.
Anyway, the following loop arrangement may not be as elegant but it works and is possibly more robust:
channel <- odbcConnect("mysql_odbc", uid="username", pwd="password")
sqlString<-readLines("your_query.sql")
for (i in 1:length(sqlString)) {
print(noquote(sqlString[i]))
sqlQuery(channel, as.name(sqlString[i]))
}
In my script, all lines except the last were doing joins, creating temporary tables, etc.; only the last line had a SELECT statement and produced output. The .sql file was tidy, with only one query per line and no comments or newline characters within a query. It seems that this loop runs all the code, but the output is lost because the return value of sqlQuery is not stored inside the loop, so the one SELECT statement needs to be repeated outside the loop.
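If the statements in the file span several lines, one alternative to the one-query-per-line layout is to split the file on semicolons. This is a rough sketch in base R: the file contents are invented for illustration, and the naive split would break if a ';' appeared inside a string literal.

```r
# Write a small example file; in practice this would be your_query.sql.
sql_file <- tempfile(fileext = ".sql")
writeLines(c(
  "CREATE TEMPORARY TABLE tmp_a AS",
  "  SELECT 1 AS x;",
  "SELECT x",
  "  FROM tmp_a;"
), sql_file)

# Join all lines, then split on ';' so each element is one statement.
sql_text <- paste(readLines(sql_file), collapse = "\n")
statements <- trimws(strsplit(sql_text, ";", fixed = TRUE)[[1]])

# Drop empty pieces left over from blank lines or a trailing ';'.
statements <- statements[nzchar(statements)]
print(statements)
```

Each element of statements could then be passed to sqlQuery in turn, assigning the result of the final SELECT to a variable so it is not lost.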
In my Django app, I need to generate a MySQL query like this:
SELECT * FROM player WHERE (myapp_player.sport_id = 4 AND (myapp_player.last_name LIKE 'smi%'))
UNION
SELECT * FROM player WHERE (myapp_player.sport_id = 4 AND (myapp_player.first_name LIKE 'smi%'));
I can't use Q objects to OR together the __istartswith filters because the query generated by the Django ORM does not use UNION and it runs at least 40 times slower than the UNION query above. For my application, this performance is unacceptable.
So I'm trying stuff like this:
Player.objects.raw("SELECT * FROM myapp_player WHERE (sport_id = %%s AND (last_name LIKE '%%s%')) UNION SELECT * FROM sports_player WHERE (sport_id = %%s AND (first_name LIKE '%%s%'))", (sport.id, qword, sport.id, qword))
I apologize for the long one-liner, but I wanted to avoid using a triple-quoted string while trying to debug this type of issue.
When I execute or repr this queryset object, I get exceptions like this:
*** ValueError: unsupported format character ''' (0x27) at index 133
That's a single-quote in single quotes, not a triple-quote. If I get rid of the single-quotes around the LIKE clauses, then I get a similar exception about the close-paren ) character that follows the LIKE clause.
Apparently Django and MySQL disagree on the correct syntax for this query, but is there a syntax that will work for both?
Finally, I'm not sure that my %%s syntax for string interpolation is correct, either. The Django docs suggest that I should be able to use the regular %s syntax in the arguments for raw(), but several online resources suggest using %%s or ? as the placeholder for string interpolation in raw SQL.
My sincere thanks for just a little bit of clarity on this issue!
I got it to work like this:
qword = word + '%'
Player.objects.raw("SELECT * FROM myapp_player WHERE (sport_id = %s AND (last_name LIKE %s)) UNION SELECT * FROM myapp_player WHERE (sport_id = %s AND (first_name LIKE %s))", (sport.id, qword, sport.id, qword))
Besides the fact that %s seems to be the correct way to parameterize the raw query, the key here was to add the % wildcard to the LIKE value before calling raw() and to leave out the single quotes around the LIKE placeholder. Even though there are no quotes around the placeholder, quotes appear in the query ultimately sent to the MySQL server.
I'm using R to run a MySQL statement, where I define the variable outside the statement, e.g.
foo = 23;
dbGetQuery(con, "select surname from names WHERE age = '.foo.' ;")
But this returns an empty set. I've googled around and tried '.&foo.', ".foo.", '".&&foo."' and many other combinations, but none of them work. I think this is a MySQL question rather than an R-specific problem, but I'm not sure. Normally variables are written as $values, but not in R.
This should work:
foo = 23;
sqlStatement <- paste("select surname from names WHERE age = ", foo, sep="")
dbGetQuery(con, sqlStatement)
You may want to look at the answers to this question: Can I gracefully include formatted SQL strings in an R script?.
The simplest solution is to use the paste command as Robert suggested.
The accepted answer gives bad advice that leaves your application vulnerable to SQL injection. You should always use bind variables instead of concatenating values directly into your query. Use the dbGetPreparedQuery method as described in this answer: Bind variables in R DBI
A semicolon at the end of the query sometimes causes problems. Try changing your query from:
dbGetQuery(con, "select surname from names WHERE age = '.foo.' ;")
to:
dbGetQuery(con, "select surname from names WHERE age = '.foo.'")
AFAIK the command has to be a string, so you should concatenate the individual components. Not being familiar with R, I can't tell you how to do that. In MS-VBA the string concatenation operator is '&'.