How to deploy an R model with the plumber API from the command line? - mlr

New to using the plumber API, I am trying to deploy an R model. I have saved the model and one record of test data. I ran the plumber API from the command line, but 127.0.0.1:8000 returns the error {"error":["500 - Internal server error"]}
and the terminal shows the error "simpleError in if (opts$show.learner.output) identity else capture.output: argument is of length zero".
My R code:
#plumb_test.R
library(plumber)

# Simple msg command
#* @apiTitle Plumber Example API
#* Echo back the input
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}

# My model
#* @get /run
function() {
  rf_prediction <- predict(readRDS("rf_unwrap.rds"),
                           newdata = as.data.frame(readRDS("Test_data.Rds")))
  rf_prediction$data
}
R code to run plumber (plumb_run.R):
library(plumber)
pr <- plumb("plumb_test.R")
pr$run(port=8000)
The echo endpoint works properly:
http://127.0.0.1:8000/echo?msg=hellohru
returns
{"msg":["The message is: 'hellohru'"]}
But my model returns
{"error":["500 - Internal server error"]}
and in the terminal I get:
> pr$run(port=8000)
Starting server to listen on port 8000
<simpleError in if (opts$show.learner.output) identity else capture.output: argument is of length zero>
I am running it from the Windows command line as follows:
C:\R\R-3.5.2\bin>r -f plumb_run.R
All the files (model, test data, and the plumber R scripts) are in the bin folder.
I am expecting the prediction output and am not sure what the error means.

Loading the mlr library along with plumber and using print() in the function made everything work:
library(plumber)
library(mlr)

# My model
#* @get /run
function() {
  print(predict(readRDS("rf_unwrap.rds"),
                newdata = as.data.frame(readRDS("Test_data.Rds")))$data)
}
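For completeness, here is a sketch of the full corrected plumb_test.R, simply combining the snippets above with library(mlr) attached up front (file names are the ones from the question; the .rds paths are resolved relative to the directory the plumber process is started from):

# plumb_test.R (sketch)
library(plumber)
library(mlr)   # attach mlr so predict() on the mlr model can find its options

#* @apiTitle Plumber Example API

#* Echo back the input
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}

#* Run the saved model on the saved test record
#* @get /run
function() {
  model    <- readRDS("rf_unwrap.rds")
  new_data <- as.data.frame(readRDS("Test_data.Rds"))
  print(predict(model, newdata = new_data)$data)
}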

Related

MySQL Connection Error Handling with tryCatch

I have an R Shiny application which uses MySQL as a data source. When the application loads and the user logs into the app with their username and password, a database connection interface opens up where the user enters their MySQL credentials. To prevent the app from crashing when the user enters the wrong MySQL connection credentials, I am trying to use the following error handling.
# run when Load Data button is clicked
datapms <- eventReactive(input$pull_data, {
  req(input$db_user, input$db_password, input$db_name, input$db_host, input$db_port)
  progress <- Progress$new(session, min = 1, max = 15)
  on.exit(progress$close())
  progress$set(message = 'Pulling data from database',
               detail = 'This message will disappear once completed.')
  # establish a database connection
  tryCatch({
    con <- RMySQL::dbConnect(
      RMySQL::MySQL(),
      user = input$db_user,
      password = input$db_password,
      dbname = input$db_name,
      host = input$db_host
    )
  }, error = function(e) {
    debug_msg(e$message)
  })
  # construct the SQL statement
  sql <- "SELECT * FROM pmsanalytics;"
  # Fetch data
  pmsanalytics <- tryCatch({
    pmsanalytics <- dbGetQuery(conn = con, sql)
  }, error = function(e) {
    debug_msg(e$message)
  })
})

### display debugging message in R (if local)
### and in the console log (if running in shiny)
debug_msg <- function(...) {
  is_local <- Sys.getenv('SHINY_PORT') == ""
  in_shiny <- !is.null(shiny::getDefaultReactiveDomain())
  txt <- toString(list(...))
  if (is_local) message(txt)
  if (in_shiny) shinyjs::runjs(sprintf("console.debug(\"%s\")", text))
}
Initially this code was working and the app did not crash. However, now, when for example one enters the wrong connection credentials, I get the following error message:
Warning: Error in as.character: cannot coerce type 'closure' to vector of type 'character'
138: sprintf
136: debug_msg [C:\PMSAnalytics/app.R#107]
135: value[[3L]] [C:\PMSAnalytics/app.R#211]
134: tryCatchOne
133: tryCatchList
132: tryCatch
131: eventReactiveValueFunc [C:\PMSAnalytics/app.R#202]
Basically, the app crashes because the data it expects never arrives; in other words, the reactive datapms() data source it depends on is empty.
Kindly assist in reviewing my code to prevent the app from crashing.
Regards,
Chris
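Judging from the traceback (sprintf inside debug_msg), a likely cause of the as.character error is that debug_msg() passes text (the base graphics function, i.e. a closure) to sprintf() instead of its local variable txt. Below is a sketch of that one-character fix, plus a guard added around the connection step; the guard is an assumption about how you might want to handle a failed connection, not part of the original code:

# sketch only: same debug_msg() as above, with txt used in the sprintf() call
debug_msg <- function(...) {
  is_local <- Sys.getenv('SHINY_PORT') == ""
  in_shiny <- !is.null(shiny::getDefaultReactiveDomain())
  txt <- toString(list(...))
  if (is_local) message(txt)
  if (in_shiny) shinyjs::runjs(sprintf("console.debug(\"%s\")", txt))  # txt, not text
}

# inside the eventReactive: keep con NULL on failure and bail out with req(),
# so the reactive stops quietly instead of erroring later in dbGetQuery()
con <- tryCatch(
  RMySQL::dbConnect(
    RMySQL::MySQL(),
    user     = input$db_user,
    password = input$db_password,
    dbname   = input$db_name,
    host     = input$db_host
  ),
  error = function(e) {
    debug_msg(e$message)
    NULL
  }
)
req(con)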

Connection issues in Storage trigger GCF

For my application, a new file uploaded to Storage is read and its data is appended to a main file. The new file contains two lines: a header, and an array whose values are separated by commas. The main file can grow to a maximum of 265 MB; the new files are at most 30 MB.
def write_append_to_ecg_file(filename, ecg, patientdata):
    file1 = open('/tmp/' + filename, "w+")
    file1.write(":".join(patientdata))
    file1.write('\n')
    file1.write(",".join(ecg.astype(str)))
    file1.close()

def storage_trigger_function(data, context):
    # Download the segment file
    download_files_storage(bucket_name, new_file_name, storage_folder_name=blob_path)
    # Read the segment file
    data_from_new_file, meta = read_new_file(new_file_name, scale=1, fs=125, include_meta=True)
    print("Length of ECG data from segment {} file {}".format(segment_no, len(data_from_new_file)))
    os.remove(new_file_name)
    # Check if the main ECG file exists
    file_exists = blob_exists(bucket_name, blob_with_the_main_file)
    print("File status {}".format(file_exists))
    data_from_main_file = []
    if file_exists:
        download_files_storage(bucket_name, main_file_name, storage_folder_name=blob_with_the_main_file)
        data_from_main_file, meta = read_new_file(main_file_name, scale=1, fs=125, include_meta=True)
        print("ECG data from main file {}".format(len(data_from_main_file)))
        os.remove(main_file_name)
        data_from_main_file = np.append(data_from_main_file, data_from_new_file)
        print("data after appending {}".format(len(data_from_main_file)))
        write_append_to_ecg_file(main_file, data_from_main_file, meta)
        token = upload_files_storage(bucket_name, main_file, storage_folder_name=main_file_blob, upload_file=True)
    else:
        write_append_to_ecg_file(main_file, data_from_new_file, meta)
        token = upload_files_storage(bucket_name, main_file, storage_folder_name=main_file_blob, upload_file=True)
The GCF is deployed with:
gcloud functions deploy storage_trigger_function --runtime python37 --trigger-resource patch-us.appspot.com --trigger-event google.storage.object.finalize --timeout 540s --memory 8192MB
For the first file, I was able to read the data and write it to the main file. But after uploading the 2nd file, it gives "Function execution took 70448 ms, finished with status: 'connection error'". On uploading the 3rd file, it gives "Function invocation was interrupted. Error: memory limit exceeded." Despite deploying the function with 8192 MB of memory, I am getting this error. Can I get some help with this?

Using nginx's Lua to validate GitHub webhooks and delete a cron lock file

What I have:
GNU/Linux host
nginx is up and running
there is a cron-job scheduled to run immediately after a specific file has been removed (similar to run-crons)
GitHub sends a webhook when someone pushes to a repository
What I want:
I now want to run Lua (or anything comparable) to parse GitHub's request, validate it, and then delete a file (if the request was valid, of course).
Preferably all of this should happen without the hassle of maintaining an additional PHP installation (there is currently none) or the need to use fcgiwrap or similar.
Template:
On the nginx side I have something equivalent to
location /deploy {
    # execute lua (or equivalent) here
}
To read the JSON body of the GitHub webhook you need the JSON4Lua library, and to validate the HMAC signature use luacrypto.
Preconfigure
Install the required modules:
$ sudo luarocks install JSON4Lua
$ sudo luarocks install luacrypto
In nginx, define a location for deploy:
location /deploy {
    client_body_buffer_size 3M;
    client_max_body_size 3M;
    content_by_lua_file /path/to/handler.lua;
}
client_max_body_size and client_body_buffer_size should be equal to prevent the error "request body in temp file not supported" (https://github.com/openresty/lua-nginx-module/issues/521).
Process the webhook
Get the request payload data and check that it is present:
ngx.req.read_body()
local data = ngx.req.get_body_data()
if not data then
    ngx.log(ngx.ERR, "failed to get request body")
    return ngx.exit(ngx.HTTP_BAD_REQUEST)
end
Verify the GitHub signature using luacrypto:
local crypto = require "crypto"        -- from luacrypto
local headers = ngx.req.get_headers()  -- 'secret' below is the webhook secret configured in GitHub

local function verify_signature(hub_sign, data)
    local sign = 'sha1=' .. crypto.hmac.digest('sha1', data, secret)
    -- this is a simple comparison, but it's better to use a constant time comparison
    return hub_sign == sign
end

-- validate GH signature
if not verify_signature(headers['X-Hub-Signature'], data) then
    ngx.log(ngx.ERR, "wrong webhook signature")
    return ngx.exit(ngx.HTTP_FORBIDDEN)
end
Parse the data as JSON and check that the push is to the master branch before deploying:
local json = require "json"  -- JSON4Lua
data = json.decode(data)
-- only deploy on the chosen branch, e.g. branch = "refs/heads/master"
if data['ref'] ~= branch then
    ngx.say("Skip branch ", data['ref'])
    return ngx.exit(ngx.HTTP_OK)
end
If everything checks out, call the deploy function:
local function deploy()
    -- run command for deploy
    local handle = io.popen("cd /path/to/repo && sudo -u username git pull")
    local result = handle:read("*a")
    handle:close()
    ngx.say(result)
    return ngx.exit(ngx.HTTP_OK)
end
Example: a constant-time string comparison
local function const_eq(a, b)
    -- check whether the strings are equal, without short-circuiting on the first mismatch
    -- allow indexing a string with s[i]
    getmetatable('').__index = function (str, i)
        return string.sub(str, i, i)
    end
    local diff = string.len(a) == string.len(b)
    for i = 1, math.min(string.len(a), string.len(b)) do
        diff = (a[i] == b[i]) and diff
    end
    return diff
end
A complete example of how I use it is in this GitHub gist: https://gist.github.com/Samael500/5dbdf6d55838f841a08eb7847ad1c926
This solution does not implement verification of GitHub's hooks and assumes you have the Lua extension and the cjson module installed:
location = /location {
    default_type 'text/plain';
    content_by_lua_block {
        local cjson = require "cjson.safe"
        ngx.req.read_body()
        local data = ngx.req.get_body_data()
        if data then
            local obj = cjson.decode(data)
            -- checksum checking should go here
            if (obj and obj.repository and obj.repository.full_name) == "user/reponame" then
                local file = io.open("<your file>", "w")
                if file then
                    file:close()
                    ngx.say("success")
                else
                    ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
                end
            else
                ngx.exit(ngx.HTTP_UNAUTHORIZED)
            end
        else
            ngx.exit(ngx.HTTP_NOT_ALLOWED)
        end
    }
}

Using the LDAvis package in R to create a gist file of the result

I'm using LDAvis for topic modeling and am trying to use the as.gist option to create a gist. When serVis executes, there is a timeout in curl::curl_fetch_memory after about 10 seconds. If I immediately execute serVis again I get a different error, 'Problems parsing JSON', and from then on whenever serVis is run that same error recurs.
If I start all over with a fresh workspace the same behavior occurs. The first time serVis is run, curl::curl_fetch_memory times out after about 10 seconds. Subsequent executions return 'Problems parsing JSON'.
If I don't use the as.gist option it works fine, but of course doesn't create a gist.
Very rarely, it works and a gist is created. If I change parameters to reduce the size of the JSON object it usually works, which makes me think it may be related to size.
I have explored the various RCurlOptions timeout settings. Currently, they are set as:
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"),
                            connecttimeout = 300, timeout = 3000,
                            followlocation = TRUE, dns.cache.timeout = 300))
Below is a console listing with debug set on curl::curl_fetch_memory.
> json <- createJSON(phi = cases$phi,
+ theta = cases$theta,
+ doc.len .... [TRUNCATED]
> serVis(json, open.browser = TRUE, as.gist = TRUE, description = 'APM Community')
debugging in: curl::curl_fetch_memory(url, handle = handle)
debug: {
output <- .Call(R_curl_fetch_memory, url, handle)
res <- handle_response_data(handle)
res$content <- output
res
}
Browse[2]> output <- .Call(R_curl_fetch_memory, url, handle)
Error: Timeout was reached
Browse[2]> output <- .Call(R_curl_fetch_memory, url, handle)
Browse[2]> rawToChar(output)
[1] "{\"message\":\"Problems parsing JSON\",\"documentation_url\":\"https://developer.github.com/v3\"}"
Browse[2]>
.
.
exiting from: curl::curl_fetch_memory(url, handle = handle)
Error: Problems parsing JSON
Any hints on how to debug this problem?
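One hedged avenue to explore: the traceback shows the request going through curl::curl_fetch_memory(), and the curl package does not consult RCurlOptions (those settings only affect RCurl), so the roughly 10-second limit is presumably a libcurl option on the handle itself. Since the listing above is already paused at the Browse[2]> prompt with url and handle in scope, a quick experiment is to raise the timeout on that very handle and retry the fetch, just to see whether the larger JSON upload then completes:

# at the Browse[2]> prompt, still inside curl::curl_fetch_memory
curl::handle_setopt(handle, timeout = 600, connecttimeout = 120)  # seconds
output <- .Call(R_curl_fetch_memory, url, handle)                 # retry the fetch
rawToChar(output)                                                 # inspect the response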

Run R silently from command line, export results to JSON

How might I call an R script from the shell (e.g. from Node.js exec) and export results as JSON (e.g. back to Node.js)?
The R code below basically works. It reads data, fits a model, converts the parameter estimates to JSON, and prints them to stdout:
#!/usr/bin/Rscript --quiet --slave
install.packages("cut", repos="http://cran.rstudio.com/");
install.packages("Hmisc", repos="http://cran.rstudio.com/");
install.packages("rjson", repos="http://cran.rstudio.com/");
library(rjson)
library(reshape2);
data = read.csv("/data/records.csv", header = TRUE, sep=",");
mylogit <- glm( y ~ x1 + x2 + x3, data=data, family="binomial");
params <- melt(mylogit$coefficients);
json <- toJSON(params);
json
Here's how I'd like to call it from Node...
var exec = require('child_process').exec;
exec('./model.R', function(err, stdout, stderr) {
  var params = JSON.parse(stdout); // FAIL! Too much junk in stdout
});
Except the R process won't stop printing to stdout. I've tried --quiet, --slave, and --silent, which all help a little but not enough. Here's what's sent to stdout:
The downloaded binary packages are in
/var/folders/tq/frvmq0kx4m1gydw26pcxgm7w0000gn/T//Rtmpyk7GmN/downloaded_packages
The downloaded binary packages are in
/var/folders/tq/frvmq0kx4m1gydw26pcxgm7w0000gn/T//Rtmpyk7GmN/downloaded_packages
[1] "{\"value\":[4.04458733165933,0.253895751245782,-0.1142272181932,0.153106007464742,-0.00289013062471735,-0.00282580664375527,0.0970325223603164,-0.0906967639834928,0.117150317941983,0.046131890754108,6.48538603593323e-06,6.70646151749708e-06,-0.221173770066275,-0.232262366060079,0.163331098409235]}"
What's the best way to use R scripts on the command line?
Running R --silent --slave CMD BATCH model.R per the post below still results in a lot of extraneous text printed to model.Rout:
Run R script from command line
Those options only stop R's own system messages from printing; they won't stop another R function from printing. (Nor would you want them to suppress everything, or your last line wouldn't print and you wouldn't get your JSON to stdout.)
Those messages are coming from install.packages, so try:
install.packages(-whatever-, quiet=TRUE)
which claims to reduce the amount of output. If it reduces it to zero, job done.
If not, then you can redirect stdout with sink, or run things inside capture.output.
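To make that concrete, here is a minimal sketch of how the script from the question could be quietened along those lines. It is an assumption-laden rewrite, not a drop-in: it installs only the two packages the script actually loads, swallows their stdout chatter with capture.output(), silences library() startup messages, and emits the JSON with cat() so it is not prefixed with the "[1]" that auto-printing adds:

#!/usr/bin/Rscript --quiet --slave
# Sketch: keep stdout clean so Node's JSON.parse() sees only the JSON string.
invisible(capture.output(
  install.packages(c("rjson", "reshape2"),
                   repos = "http://cran.rstudio.com/", quiet = TRUE)
))
suppressMessages({
  library(rjson)
  library(reshape2)
})

data    <- read.csv("/data/records.csv", header = TRUE, sep = ",")
mylogit <- glm(y ~ x1 + x2 + x3, data = data, family = "binomial")
params  <- melt(mylogit$coefficients)
cat(toJSON(params), "\n")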