Converting a JSON file into an R/text file without execution

RStudio crashed while I was working, and the unsaved files could not be loaded back into the session. However, the files are available in JSON format. An example:
{
    "contents" : "library(hgu133a.db)\nx <- hgu133aENSEMBL\nx\nlength(x)\ncount.mappedkeys(x)\nx[1:3]\nlinks(x[1:3])\n\n## Keep only the mapped keys\nkeys(x) <- mappedkeys(x)\nlength(x)\ncount.mappedkeys(x)\nx # now it is a submap\n\n## The above subsetting can also be achieved with\nx <- hgu133aENSEMBL[mappedkeys(hgu133aENSEMBL)]\n\n",
    "created" : 1463131195093.000,
    "dirty" : true,
    "encoding" : "",
    "folds" : "",
    "hash" : "1482602869",
    "id" : "737C178C",
    "lastKnownWriteTime" : 0,
    "path" : null,
    "project_path" : null,
    "properties" : {
        "tempName" : "Untitled3"
    },
    "source_on_save" : false,
    "type" : "r_source"
}
The JSON files can be read with jsonlite::fromJSON(), and the required information is stored in the contents field. But when I tried to read the commands using readLines() or scan(), the commands were executed instead of being converted into a simple file. How can I convert this into an .R file?
Desired output: the commands in an R script/text file.
library(hgu133a.db)
x <- hgu133aENSEMBL
x
length(x)
count.mappedkeys(x)
x[1:3]
links(x[1:3])
## Keep only the mapped keys
keys(x) <- mappedkeys(x)
length(x)
count.mappedkeys(x)
x # now it is a submap
## The above subsetting can also be achieved with
x <- hgu133aENSEMBL[mappedkeys(hgu133aENSEMBL)]

If anyone is looking for the answer to this question, the command suggested by @Kevin worked:
writeLines(json$contents, con = "/path/to/file.R")
Output:
library(hgu133a.db)
x <- hgu133aENSEMBL
x
length(x)
count.mappedkeys(x)
x[1:3]
links(x[1:3])
## Keep only the mapped keys
keys(x) <- mappedkeys(x)
length(x)
count.mappedkeys(x)
x # now it is a submap
## The above subsetting can also be achieved with
x <- hgu133aENSEMBL[mappedkeys(hgu133aENSEMBL)]
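For completeness, a minimal end-to-end sketch of the recovery (the session-file path is an assumption; RStudio keeps these unsaved-buffer JSON files in a sources directory whose location varies by RStudio version and OS):
library(jsonlite)
## Hypothetical location of an unsaved-buffer file; adjust to wherever your
## crashed session stored it (the id "737C178C" comes from the JSON above)
json <- fromJSON("~/.rstudio-desktop/sources/s-XXXXXXXX/737C178C")
## "contents" holds the raw script text; write it out verbatim as an .R file
writeLines(json$contents, con = "recovered_script.R")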

Related

How Do I Consume an Array of JSON Objects using Plumber in R

I have been experimenting with Plumber in R recently, and am having success when I pass the following data using a POST request:
{"Gender": "F", "State": "AZ"}
This allows me to write a function like the following to return the data.
#* @post /score
score <- function(Gender, State){
    data <- list(
        Gender = as.factor(Gender)
        , State = as.factor(State))
    return(data)
}
However, when I try to POST an array of JSON objects, I can't seem to access the data through the function:
[{"Gender":"F","State":"AZ"},{"Gender":"F","State":"NY"},{"Gender":"M","State":"DC"}]
I get the following error:
{
    "error": [
        "500 - Internal server error"
    ],
    "message": [
        "Error in is.factor(x): argument \"Gender\" is missing, with no default\n"
    ]
}
Does anyone have an idea of how Plumber parses JSON? I'm not sure how to access and assign the fields to vectors to score the data.
Thanks in advance
I see two possible solutions here. The first is a command-line approach, which I assume you were attempting. I tested this on Windows and used column-based data.frame encoding, which I prefer due to the shorter JSON string lengths. Make sure to escape quotation marks correctly to avoid 'argument "..." is missing, with no default' errors:
curl -H "Content-Type: application/json" --data "{\"Gender\":[\"F\",\"F\",\"M\"],\"State\":[\"AZ\",\"NY\",\"DC\"]}" http://localhost:8000/score
# [["F","F","M"],["AZ","NY","DC"]]
The second approach is R native and has the advantage of having everything in one place:
library(jsonlite)
library(httr)
## sample data
lst = list(
    Gender = c("F", "F", "M")
    , State = c("AZ", "NY", "DC")
)
## jsonify
jsn = lapply(
    lst
    , toJSON
)
## query
request = POST(
    url = "http://localhost:8000/score?"
    , query = jsn # values must be length 1
)
response = content(
    request
    , as = "text"
    , encoding = "UTF-8"
)
fromJSON(
    response
)
# [,1]
# [1,] "[\"F\",\"F\",\"M\"]"
# [2,] "[\"AZ\",\"NY\",\"DC\"]"
Be aware that httr::POST() expects a list of length-1 values as query input, so the array data should be jsonified beforehand. If you want to avoid the additional package imports altogether, some system(), sprintf(), etc. magic should do the trick.
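For instance, a rough sketch of that base-R route (the quoting assumes a Unix-like shell; on Windows, escape the inner double quotes as in the curl example above):
## build the curl call with sprintf() and fire it with system()
body <- '{"Gender":["F","F","M"],"State":["AZ","NY","DC"]}'
cmd <- sprintf(
    "curl -s -H 'Content-Type: application/json' --data '%s' http://localhost:8000/score",
    body
)
system(cmd)
# should print: [["F","F","M"],["AZ","NY","DC"]]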
Finally, here is my plumber endpoint (living in R/plumber.R and condensed a little bit):
#* @post /score
score = function(Gender, State){
    lapply(
        list(Gender, State)
        , as.factor
    )
}
and code to fire up the API:
pr = plumber::plumb("R/plumber.R")
pr$run(port = 8000)

Iteratively read a fixed number of lines into R

I have a JSON file I'm working with that contains multiple JSON objects in a single file. R is unable to read the file as a whole, but since each object occurs at regular intervals, I would like to iteratively read a fixed number of lines into R.
There are a number of SO questions on reading single lines into R, but I have been unable to extend these solutions to a fixed number of lines. For my problem I need to read 16 lines into R at a time (e.g. 1-16, 17-32, etc.).
I have tried using a loop but can't seem to get the syntax right:
## File
file <- "results.json"
## Create connection
con <- file(description=file, open="r")
## Loop over a file connection
for(i in 1:1000) {
    tmp <- scan(file=con, nlines=16, quiet=TRUE)
    data[i] <- fromJSON(tmp)
}
The file contains over 1000 objects of this form:
{
    "object": [
        [
            "a",
            0
        ],
        [
            "b",
            2
        ],
        [
            "c",
            2
        ]
    ]
}
With @tomtom's inspiration I was able to find a solution.
## File
file <- "results.json"
## Loop over a file
for(i in 1:1000) {
    tmp <- paste(scan(file=file, what="character", sep="\n", nlines=16, skip=(i-1)*16, quiet=TRUE), collapse=" ")
    assign(x = paste("data", i, sep = "_"), value = fromJSON(tmp))
}
I couldn't create a connection as each time I tried the connection would close before the file had been completely read. So I got rid of that step.
I had to include the what="character" argument, as scan() expects numeric input by default.
I included sep="\n", paste() and collapse=" " to create a single string rather than the vector of characters that scan() creates by default.
Finally I just changed the final assignment operator to have a bit more control over the names of the output.
This might help:
EDITED to make it use a list and Reduce into one file
## Loop over a file connection
data <- NULL
for(i in 1:1000) {
    tmp <- scan(file=con, nlines=16, skip=(i-1)*16, quiet=TRUE)
    data[[i]] <- fromJSON(tmp)
}
df <- Reduce(function(x, y) paste(x, y, collapse = " "), data)
You would have to make sure that you don't reach further than the end of the file though ;-)
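For reference, here is a connection-based variant of the same idea that reads each 16-line block sequentially instead of re-scanning the file with skip on every pass (a sketch using jsonlite's fromJSON and assuming, as in the question, exactly 16 lines per object):
library(jsonlite)
con <- file("results.json", open = "r")
data <- list()
i <- 1
repeat {
    block <- readLines(con, n = 16, warn = FALSE)
    if (length(block) == 0) break    # stop cleanly at end of file
    data[[i]] <- fromJSON(paste(block, collapse = " "))
    i <- i + 1
}
close(con)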

R fromJSON incorrectly reads Unicode from file

I am trying to read a JSON object from a file in R, which contains names and surnames in Unicode. Here is the content of the file "x1.json":
{"general": {"last_name":
"\u041f\u0430\u0449\u0435\u043d\u043a\u043e", "name":
"\u0412\u0456\u0442\u0430\u043b\u0456\u0439"}}
I use the RJSONIO package, and when I declare the JSON object directly, everything goes well:
x<-fromJSON('{"general": {"last_name": "\u041f\u0430\u0449\u0435\u043d\u043a\u043e", "name": "\u0412\u0456\u0442\u0430\u043b\u0456\u0439"}}')
x
# $general
# last_name name
# "Пащенко" "Віталій"
But when I read the same from a file, the strings are converted to some encoding unknown to me:
x1<-fromJSON("x1.json")
x1
# $general
# last_name name
# "\0370I5=:>" "\022VB0;V9"
Note that these are not escaped "\u" sequences (which have been discussed elsewhere).
I have tried to specify the "encoding" argument, but this did not help:
> x1<-fromJSON("x1.json", encoding = "UTF-8")
> x1
$general
last_name name
"\0370I5=:>" "\022VB0;V9"
System information:
> Sys.getlocale()
[1] "LC_COLLATE=Ukrainian_Ukraine.1251;LC_CTYPE=Ukrainian_Ukraine.1251;LC_MONETARY=Ukrainian_Ukraine.1251;LC_NUMERIC=C;LC_TIME=Ukrainian_Ukraine.1251"
Switching to English (Sys.setlocale("LC_ALL","English")) did not change the situation.
If your file had Unicode data like this (instead of its escaped representation):
{"general": {"last_name":"Пащенко", "name":"Віталій"}}
then,
> fromJSON("x1.json", encoding = "UTF-8")
will work
If you really want your code to work with the current file, try something like this:
JSONstring=""
con <- file("x1.json",open = "r")
while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
JSONstring <- paste(JSONstring,parse(text = paste0("'",oneLine, "'"))[[1]],sep='')
}
fromJSON(JSONstring)
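The parse() call is what decodes the escapes: wrapping each line in single quotes and parsing it as an R string literal makes R's own parser interpret the \uXXXX sequences. A one-line illustration of the idea:
## R's parser turns the escape sequence into the actual character
parse(text = "'\\u041f'")[[1]]
# [1] "П"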
use library("jsonlite") not rjson
library("jsonlite")
mydf <- toJSON( mydf, encoding = "UTF-8")
will be fine

Get information from a dataset (JSON format) in R

I created a data frame from a MongoDB collection. The data in it is in JSON format, but I can't manage to extract the information from it:
{"place":{"bounding_box":{
"type":"Polygon",
"coordinates":[
[
[
-119.932568,
36.648905
],
[
-119.632419,
36.648905
]
]
]
}}}
I need the first two values of the coordinates: lat = 36.648905 and lon = -119.932568
But I can't seem to extract that info:
my_lon <- myBigDF$place.bounding_box.coordinates[1[1[1]]]
I have tried a few combinations, but I'm always getting NULL.
Thank you for any help.
EDIT: Including the code for how I'm connecting to the DB and creating a data frame from it:
mongo <- mongo.create(host="localhost", db="mydb")
library(plyr)
## create the empty data frame
myDF = data.frame(stringsAsFactors = FALSE)
## create the cursor we will iterate over, basically a select * in SQL
cursor = mongo.find(mongo, namespace)
## create the counter
i = 1
## iterate over the cursor
while (mongo.cursor.next(cursor)) {
    # iterate and grab the next record
    tmp = mongo.bson.to.list(mongo.cursor.value(cursor))
    # make it a dataframe
    tmp.df = as.data.frame(t(unlist(tmp)), stringsAsFactors = F)
    # bind to the master dataframe
    myDF = rbind.fill(myDF, tmp.df)
}
It's hard to tell exactly how you are going from the JSON string to an R object; different libraries parse things differently. If I assume for the moment that you use "rjson", then you would have something like:
x <- rjson::fromJSON('{"place":{"bounding_box":{ "type":"Polygon", "coordinates":[ [ [ -119.932568, 36.648905 ], [ -119.632419, 36.648905 ] ] ] }}}')
And because your data seems to have an excessive number of square brackets, things are a bit messy. You can get to the coordinates section with
x$place$bounding_box$coordinates
# [[1]]
# [[1]][[1]]
# [1] -119.9326 36.6489
#
# [[1]][[2]]
# [1] -119.6324 36.6489
which is a list of lists of vectors. To make a nice matrix of lat/long coordinates you can do
do.call(rbind, x$place$bounding_box$coordinates[[1]])
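From that matrix you can then pick out the first coordinate pair asked for in the question (note that GeoJSON-style coordinates list longitude first):
coords <- do.call(rbind, x$place$bounding_box$coordinates[[1]])
lon <- coords[1, 1]    # -119.932568
lat <- coords[1, 2]    #  36.648905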

Ragged list or data frame to JSON

I am trying to create a ragged list in R that corresponds to the D3 tree structure of flare.json. My data is in a data.frame:
path <- data.frame(P1=c("direct","direct","organic","direct"),
                   P2=c("direct","direct","end","end"),
                   P3=c("direct","organic","",""),
                   P4=c("end","end","",""), size=c(5,12,23,45))
path
       P1     P2      P3  P4 size
1  direct direct  direct end    5
2  direct direct organic end   12
3 organic    end               23
4  direct    end               45
but it could also be a list or reshaped if necessary:
path <- list()
path[[1]] <- list(name=c("direct","direct","direct","end"),size=5)
path[[2]] <- list(name=c("direct","direct","organic","end"), size=12)
path[[3]] <- list(name=c("organic", "end"), size=23)
path[[4]] <- list(name=c("direct", "end"), size=45)
The desired output is:
rl <- list()
rl <- list(name="root", children=list())
rl$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[[1]]$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[[1]]$children[[1]]$children[1] <- list(list(name="end", size=5))
rl$children[[1]]$children[[1]]$children[2] <- list(list(name="organic", children=list()))
rl$children[[1]]$children[[1]]$children[[2]]$children[1] <- list(list(name="end", size=12))
rl$children[[1]]$children[2] <- list(list(name="end", size=23))
rl$children[2] = list(list(name="organic", children=list()))
rl$children[[2]]$children[1] <- list(list(name="end", size=45))
So when I print to JSON it's:
require(RJSONIO)
cat(toJSON(rl, pretty=T))
{
    "name" : "root",
    "children" : [
        {
            "name" : "direct",
            "children" : [
                {
                    "name" : "direct",
                    "children" : [
                        {
                            "name" : "direct",
                            "children" : [
                                {
                                    "name" : "end",
                                    "size" : 5
                                }
                            ]
                        },
                        {
                            "name" : "organic",
                            "children" : [
                                {
                                    "name" : "end",
                                    "size" : 12
                                }
                            ]
                        }
                    ]
                },
                {
                    "name" : "end",
                    "size" : 23
                }
            ]
        },
        {
            "name" : "organic",
            "children" : [
                {
                    "name" : "end",
                    "size" : 45
                }
            ]
        }
    ]
}
I am having a hard time wrapping my head around the recursive steps that are necessary to create this list structure in R. In JS I can pretty easily move around the nodes and at each node determine whether to add a new node or keep moving down the tree, using push as needed, e.g. new = {"name": node, "children": []}; or new = {"name": node, "size": size}; as in this example. I tried to split the data.frame as in this example:
makeList <- function(x){
    if(ncol(x) > 2){
        listSplit <- split(x, x[1], drop=T)
        lapply(names(listSplit), function(y){list(name=y, children=makeList(listSplit[[y]]))})
    } else {
        lapply(seq(nrow(x[1])), function(y){list(name=x[,1][y], size=x[,2][y])})
    }
}
jsonOut<-toJSON(list(name="root",children=makeList(path)))
but it gives me an error:
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
The function given in the linked Q&A is essentially what you need; however, it was failing on your data set because of the null values in some rows of the later columns. Instead of blindly repeating the recursion until you run out of columns, you need to check for your "end" value and use it to switch to making leaves:
makeList <- function(x){
    listSplit <- split(x[-1], x[1], drop=TRUE)
    lapply(names(listSplit), function(y){
        if (y == "end") {
            l <- list()
            rows <- listSplit[[y]]
            for(i in 1:nrow(rows)) {
                l <- c(l, list(name=y, size=rows[i, "size"]))
            }
            l
        } else {
            list(name=y, children=makeList(listSplit[[y]]))
        }
    })
}
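With that fix in place, the original call from the question should now run without the infinite-recursion error:
jsonOut <- toJSON(list(name="root", children=makeList(path)))
cat(jsonOut)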
I believe this does what you want, though it has some limitations. In particular, it is assumed that every branch in your network is unique (i.e. there can't be two rows in your data frame that are equal for every column other than size):
df.split <- function(p.df) {
    p.lst.tmp <- unname(split(p.df, p.df[, 1]))
    p.lst <- lapply(
        p.lst.tmp,
        function(x) {
            if(ncol(x) == 2L && nrow(x) == 1L) {
                return(list(name=x[1, 1], size=unname(x[, 2])))
            } else if (isTRUE(is.na(unname(x[, 2])))) {
                return(list(name=x[1, 1], size=unname(x[, ncol(x)])))
            }
            list(name=x[1, 1], children=df.split(x[, -1, drop=F]))
        }
    )
    p.lst
}
all.equal(rl, df.split(path)[[1]])
# [1] TRUE
Though note you had the organic size switched, so I had to fix your rl to get this result (rl has it as 45, but your path as 23). Also, I modified your path data.frame slightly:
path <- data.frame(
    root=rep("root", 4),
    P1=c("direct","direct","organic","direct"),
    P2=c("direct","direct","end","end"),
    P3=c("direct","organic",NA,NA),
    P4=c("end","end",NA,NA),
    size=c(5,12,23,45),
    stringsAsFactors=F
)
WARNING: I haven't tested this with other structures, so it's possible it will hit corner cases that you'll need to debug.
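As a quick sanity check, the JSON produced from the df.split() result can be compared by eye against the target structure from the question (using RJSONIO's toJSON, as earlier in the thread):
require(RJSONIO)
cat(toJSON(df.split(path)[[1]], pretty = TRUE))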