MySQL stored procedure fails when called from R

This procedure works from the MySQL command line, both remotely and on localhost, and it works when called from PHP. In all cases the grants are adequate:
CREATE PROCEDURE `myDB`.`lee_expout` (IN e int, IN g int)
BEGIN
  select lm.groupname, lee.location, starttime, dark,
    inadist, smldist, lardist, emptydur, inadur, smldur, lardur,
    emptyct, entct, inact, smlct, larct
  from lee join leegroup_map lm using (location)
  where exp_id = e and std_interval != 0 and groupset_id = g
  order by starttime, groupname, location;
END
I'm trying to call it from R:
library(DBI)
library(RMySQL)
db <- dbConnect(MySQL(), user="user", password="pswd",
dbname="myDB", host="the.host.com")
#args to pass to the procedure
exp_id<-16
group_id<-2
#the procedure call
p <- paste('CALL lee_expout(', exp_id, ',', group_id,')', sep= ' ')
#the bare query
q <- paste('select lm.groupname, lee.location, starttime, dark,
inadist,smldist,lardist,emptydur,inadur,smldur,lardur,emptyct,entct,inact,smlct,larct
from lee join leegroup_map lm using (location)
where exp_id=',
exp_id,
' and std_interval!=0 and groupset_id=',
group_id,
'order by starttime,groupname,location', sep=' ')
rs_p <- dbSendQuery(db, statement=p) #run procedure and fail
p_data<-fetch(rs_p,n=30)
rs_q <- dbSendQuery(db, statement=q) #or comment out p, run query and succeed
q_data<-fetch(rs_q,n=30)
The bare query runs fine. The procedure call fails with
RApache Warning/Error!!!
Error in mysqlExecStatement(conn, statement, ...) :
  RS-DBI driver: (could not run statement: PROCEDURE myDB.lee_expout
  can't return a result set in the given context)
The MySQL docs say:
For statements that can be determined only at runtime to return a result set, a PROCEDURE %s can't return a result set in the given context error occurs.
One would think that if a procedure were going to throw that error, it would be thrown under all circumstances instead of just from R.
Any thoughts on how to fix this?

As far as I know, calling SQL procedures from R (dbCallProc) is not yet formally implemented (see the reference manual of 24 July 2010: http://cran.r-project.org/web/packages/RMySQL/RMySQL.pdf).
RMySQL is being migrated from the S3 to the S4 programming style and is still under development (version 0.7 being the current one). I suggest you ask the same question on the database mailing list for R:
https://stat.ethz.ch/mailman/listinfo/r-sig-db
If it is possible, they'll show you how. If it isn't, they'll tell you why.

Try adding:
client.flag=CLIENT_MULTI_STATEMENTS
to your connection parameters. It may help.
There are some details about this in the RMySQL PDF.
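A minimal sketch of what that looks like, reusing the question's connection call (CLIENT_MULTI_STATEMENTS is a constant exported by RMySQL; whether it resolves the error depends on the server and driver versions):
# A stored procedure returns a status result in addition to its data;
# this client flag asks the driver to allow multiple results.
db <- dbConnect(MySQL(), user="user", password="pswd",
                dbname="myDB", host="the.host.com",
                client.flag=CLIENT_MULTI_STATEMENTS)
rs_p <- dbSendQuery(db, "CALL lee_expout(16, 2)")
p_data <- fetch(rs_p, n=30)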

Don't know about R, but this
p <- paste('CALL lee_expout(', exp_id, ',', group_id,')', sep= ' ')
does look a bit ugly, i.e. like string concatenation. Maybe R's database driver takes that badly. In general, you can use placeholders for variables and pass the values on as separate arguments. Besides various security arguments, this also takes care of any type/apostrophe/whatever issues; maybe here, too?
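A sketch of the placeholder style with DBI (hedged: this assumes a driver that supports dbBind, such as the newer RMariaDB; the RMySQL of this era may not):
# placeholders keep the values out of the SQL string entirely
res <- dbSendQuery(db, "CALL lee_expout(?, ?)")
dbBind(res, list(exp_id, group_id))  # values passed separately, nothing pasted
p_data <- dbFetch(res)
dbClearResult(res)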

Related

Numbers change when querying MySQL/MariaDB through R (RMariaDB)/ Integer conversion in R

I was able to implement a connection from R through RMariaDB and DBI to a remote MariaDB database. However, I am currently encountering a strange change of numbers when querying the database through R. I'll explain the differences:
I inserted one simple entry in my database with the following command:
INSERT INTO respondent ( id, name ) VALUES ( 2388793051, 'testuser' )
When I connect to this database directly on the remote server and execute a statement like this:
SELECT * FROM respondent;
it delivers this value:
id: 2388793051, name: testuser
So I should be able to connect to the database via R and receive the same results. When I execute the following code in R, I expect to get back the inserted and saved information displayed above:
library(DBI)
library(RMariaDB)
conn <- DBI::dbConnect(drv=RMariaDB::MariaDB(), user="myusername", password="mypassword", host="127.0.0.1", port="1111", dbname="mydbname")
res <- dbGetQuery(conn, "SELECT * FROM respondent")
print(res)
However, the result of this query is the following:
id name
-1906174245 testuser
As you can see, the id is now -1906174245 instead of the saved 2388793051 in the database. I don't understand this weird conversion of integers in the id field. Can someone explain how this problem emerges and how I might solve it?
EDIT: I don't expect this to be a problem, but just to inform you: I am using an SSH tunnel to enable a connection via these specified ports from my local to my remote machine.
SOLUTION: What made the difference was declaring the respondent id column as BIGINT instead of INT in the database schema. Thanks to @JonnyCrunch
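The arithmetic confirms a 32-bit signed overflow: 2388793051 exceeds the signed INT maximum of 2147483647, and 2388793051 - 2^32 = -1906174245, exactly the value R printed. Alongside widening the column to BIGINT, RMariaDB's dbConnect accepts a bigint argument controlling how wide integers are handed to R; a sketch reusing the question's placeholders:
conn <- DBI::dbConnect(drv=RMariaDB::MariaDB(), user="myusername",
                       password="mypassword", host="127.0.0.1", port="1111",
                       dbname="mydbname",
                       bigint="numeric")  # hand wide ids to R as doubles, not 32-bit integers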

Common Table Expressions -- Using a Variable in the Predicate

I've written a common table expression to return hierarchical information and it seems to work without issue if I hard code a value into the WHERE statement. If I use a variable (even if the variable contains the same information as the hard coded value), I get the error The maximum recursion 100 has been exhausted before statement completion.
This is easier shown with a simple example (note, I haven't included the actual code for the CTE just to keep things clearer. If you think it's useful, I can certainly add it).
This Works
WITH Blder
AS
(-- CODE IS HERE )
SELECT
*
FROM Blder as b
WHERE b.PartNo = 'ABCDE';
This throws the Max Recursion Error
DECLARE @part CHAR(25);
SET @part = 'ABCDE'
WITH Blder
AS
(-- CODE IS HERE )
SELECT
*
FROM Blder as b
WHERE b.PartNo = @part;
Am I missing something silly? Or does the SQL engine handle hardcoded values and parameter values differently in this type of scenario?
Kindly put a semicolon at the end of your variable assignment statement:
SET @part = 'ABCDE';
The SELECT statement itself isn't the problem: the SQL Server Query Optimizer is able to optimize away the potential cycle when fed the literal string, but not when fed a variable, in which case it uses a plan developed from the statistics.
SQL Server 2016 improved the Query Optimizer. If you can migrate your DB to SQL Server 2016 or newer, either set the DB compatibility level to 130 or higher (for SQL Server 2016 and up), or keep it at 100 (for SQL Server 2008) but add OPTION (USE HINT ('ENABLE_QUERY_OPTIMIZER_HOTFIXES')) to the bottom of your SELECT statement; you should then get the desired result without the max recursion error.
If you are stuck on SQL Server 2008, you could also add OPTION (RECOMPILE) to the bottom of your SELECT statement to create an ad hoc query plan that would be similar to the one that worked correctly.
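For illustration, a sketch of the OPTION (RECOMPILE) variant, with the CTE body still elided as in the question:
DECLARE @part CHAR(25);
SET @part = 'ABCDE';

WITH Blder
AS
( /* CODE IS HERE */ )
SELECT
*
FROM Blder as b
WHERE b.PartNo = @part
OPTION (RECOMPILE);  -- compiles an ad hoc plan using @part's actual value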

RODBC ERROR: Could not SQLExecDirect in mysql

I have been trying to write an R script to query Impala database. Here is the query to the database:
select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA
When I run this query manually (read: outside the Rscript via impala-shell), I am able to get the table contents. However, when the same is tried via the R script, I get the following error:
[1] "HY000 140 [Cloudera][ImpalaODBC] (140) Unsupported query."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA'
closing unused RODBC handle 1
Why does the query fail when tried via R? and how do I fix this? Thanks in advance :)
Edit 1:
The connection script looks as below:
library("RODBC");
connection <- odbcConnect("Impala");
query <- "select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA";
data <- sqlQuery(connection,query);
You need to install the relevant drivers; please look at the following link.
I had the same issue; all I had to do was update the ODBC drivers.
Also, if you can, update your odbcConnect call with the username and password, from
connection <- odbcConnect("Impala");
to
connection <- odbcConnect("Impala", uid="root", pwd="password")
The RODBC package is quirky: if no row is updated/deleted by the query execution, it will throw an error.
So before using sqlDelete to delete rows, or sqlUpdate to update values, first check that there is at least one row that will be deleted/updated by querying COUNT(*).
I've had no problem since implementing the check, on Oracle SQL 12g.
An alternative would be to use a staging table for the new batch of data and use sqlQuery to execute a MERGE command; RODBC won't complain if zero rows are merged.
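A sketch of that COUNT(*) guard (the table and predicate are placeholders, not from the question):
n <- sqlQuery(connection,
              "select count(*) from some_table where some_flag = 1")[1, 1]
if (n > 0) {
  sqlQuery(connection, "delete from some_table where some_flag = 1")
}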
This might also be due to an error in your SQL query itself. For example, I got this error when I missed an 'in' in the following generalized statement:
stringstuff <- someDT$columnyouwanttouse
somestring <- toString(sprintf("'%s'", stringstuff))
RESULTS <- sqlQuery(con, paste0("select
fling as flam
and toot in (", somestring, ")
limit 30
;"))
I got the error you did when I left out the 'in', so double-check your syntax.
This error message can arise if the table doesn't exist in the database.
A few sensible checks:
Check for typos in the table name in your query
See if you can run the same query on the same database via another sql client
Talk to your database administrator to confirm that the table does exist
Re-installing the RODBC package did the trick for me!
I had a similar problem. Uninstalling R version 4.2.1 and installing R version 4.1.3 solved it for me.

Compiling sqlalchemy statements with named parameters

This is a follow up to Convert sqlalchemy core statements to raw SQL without a connection?. I would like to use insert with named arguments, but without executing them through SQLA.
I have an insert statement defined as such:
from sqlalchemy import Table, Column, Integer, String, MetaData
import sqlalchemy

meta_data = MetaData()
table = Table("table", meta_data,
              Column("id", Integer, autoincrement=True, primary_key=True),
              Column("name", String),
              Column("full_name", String))
params = {"full_name": "some_name", "name": "some_other_name"}
stmt = sqlalchemy.sql.expression.insert(table).values(params)
c_stmt = stmt.compile(dialect=dialect)  # dialect as in the linked question
I would later execute this statement through a DBAPI connection. How can I ensure that my set of parameters lines up consistently with the generated SQL string?
Most SQL engines (all?) support named bind parameters as well as positional parameters, and SA provides an interface (bindparam, funnily enough) to do so. The SA Docs have a section with examples that will probably serve your needs. Of course, the examples assume you are executing the queries with SA, so you'll need to interpret them for your use case.
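A sketch along those lines, reusing the question's objects (a hedged reading of the docs, not a drop-in answer): an insert built with bindparam compiles to SQL with named placeholders, and the compiled object exposes the matching parameter mapping:
from sqlalchemy import bindparam

stmt = table.insert().values(
    name=bindparam("name"),
    full_name=bindparam("full_name"),
)
c_stmt = stmt.compile(dialect=dialect)
print(c_stmt.string)  # SQL text with named placeholders for the DBAPI
print(c_stmt.params)  # the parameter dictionary to pair with it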

how to close resultset in RMySQL?

I use RMySQL to import data; sometimes when I try to close the connection, I receive the following error:
Error in mysqlCloseConnection(conn, ...) :
connection has pending rows (close open results set first)
I have no way of correcting this other than restarting the computer. Is there anything I can do to solve this? Thanks!
We can use the method dbClearResult.
Example:
dbClearResult(dbListResults(conn)[[1]])
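More generally, the pattern is to clear every open result set before disconnecting; a minimal sketch (the connection and query are placeholders):
res <- dbSendQuery(con, "select * from some_table")
partial <- fetch(res, n = 10)  # deliberately leaves pending rows behind
dbClearResult(res)             # discards the pending rows
dbDisconnect(con)              # the connection now closes cleanly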
As Multiplexer noted, you are probably doing it wrong by leaving parts of the result set behind.
DBI and the accessor packages like RMySQL have documentation that is a little challenging at times. I try to remind myself to use dbGetQuery(), which grabs the whole result set at once. Here is a short snippet from the CRANberries code:
sql <- paste("select count(*) from packages ",
"where package='", curPkg, "' ",
"and version='", curVer, "';", sep="")
nb <- dbGetQuery(dbcon, sql)
After this I can close without worries (or do other operations).
As explained in previous answers, you get this error because RMySQL didn't return all the results of the query.
I had this problem when the result had over 500 rows, using:
my_result <- fetch( dbSendQuery(con, query))
Looking at the documentation for fetch, I found that you can specify the number of records retrieved:
n = maximum number of records to retrieve per fetch. Use n = -1 or n = Inf to retrieve all pending records.
Solutions:
1. Set the number of records to infinity:
my_result <- fetch( dbSendQuery(con, query), n=Inf)
2. Use dbGetQuery:
my_result <- dbGetQuery(con, query)
rs <- dbSendQuery(dbcon, sql)
data <- dbFetch(rs)
dbClearResult(rs)
The last line removed the following error when continuing to query:
Error in .local(conn, statement, ...) :
connection with pending rows, close resultSet before continuing
You need to close the result set before closing the connection.
If you try to close the connection before closing a result set that has pending rows, it can sometimes hang the machine.
I don't know much about RMySQL, but try closing the result set first.
You have to keep track of result sets yourself. The example below shows how to close/clear results and how to get the number of rows affected. To solve your problem, apply the last line of code to the variable holding the result of any statement or query you have sent. :)
statementRes <- DBI::dbSendStatement(conn = db,
"CREATE TABLE IF NOT EXISTS great_dupa_test (
taxonomy_id INTEGER NOT NULL,
scientific_name TEXT);")
DBI::dbGetRowsAffected(statementRes)
DBI::dbClearResult(statementRes)
This worked for me in R: just run dbConnect() again to reconnect to the database.