Getting data from JSON file in R - json

Lets say that I have the following json file:
{
"id": "000018ac-04ef-4270-81e6-9e3cb8274d31",
"currentCompany": "",
"currentTitle": "--",
"currentPosition": ""
}
I use the following code:
Usersfile <- ('trial.json') #where trial the json above
library('rjson')
c <- file(Usersfile,'r')
l <- readLines(c,-71L)
json <- lapply(X=l,fromJSON)
and I have the following error:
Error: parse error: premature EOF
{
(right here) ------^
But when I enter the json file(with notepad) and put the data in one line:
{"id": "000018ac-04ef-4270-81e6-9e3cb8274d31","currentCompany": "","currentTitle": "--","currentPosition": ""}
The code works fine.(In reality the file is really big to do it manually for each line). Why is this happening? How can I overcome that?
Also this one doesnt work:
{ "id": "000018ac-04ef-4270-81e6-9e3cb8274d31","currentCompany": "","currentTitle": "--","currentPosition": ""
}
EDIT: I used the following code that I could read only the first value:
library('rjson')
c <- file.path(Usersfile)
data <- fromJSON(file=c)

Surprised this was never answered! Using the jsonlite package, you can collapse your json data into one character element using paste(x, collapse="") removing EOF markers for proper import into an R dataframe. I, too, faced a pretty-printed json with exact error:
library(jsonlite)
json <- do.call(rbind,
lapply(paste(readLines(Usersfile, warn=FALSE),
collapse=""),
jsonlite::fromJSON))

Related

Error while trying to parse json into R

I have recently started using R and have a task regarding parsing json in R to get a non-json format. For this, i am using the "fromJSON()" function. I have tried to parse json as a text file. It runs successfully when i do it with just a single row entry. But when I try it with multiple row entries, i get the following error:
fromJSON("D:/Eclairs/Printing/test3.txt")
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
lexical error: invalid char in json text.
[{'CategoryType':'dining','City':
(right here) ------^
> fromJSON("D:/Eclairs/Printing/test3.txt")
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
parse error: trailing garbage
"mumbai","Location":"all"}] [{"JourneyType":"Return","Origi
(right here) ------^
> fromJSON("D:/Eclairs/Printing/test3.txt")
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
parse error: after array element, I expect ',' or ']'
:"mumbai","Location":"all"} {"JourneyType":"Return","Origin
(right here) ------^
The above errors are due to three different formats in which i tried to parse the json text, but the result was the same, only the location suggested by changed.
Please help me to identify the cause of this error or if there is a more efficient way o performing the task.
The original file that i have is an excel sheet with multiple columns and one of those columns consists of json text. The way i tried right now is by extracting just the json column and converting it to a tab separated text and then parsing it as:
fromJSON("D:/Eclairs/Printing/test3.txt")
Please also suggest if this can be done more efficiently. I need to map all the columns in the excel to the non-json text as well.
Example:
[{"CategoryType":"dining","City":"mumbai","Location":"all"}]
[{"CategoryType":"reserve-a-table","City":"pune","Location":"Kothrud,West Pune"}]
[{"Destination":"Mumbai","CheckInDate":"14-Oct-2016","CheckOutDate":"15-Oct-2016","Rooms":"1","NoOfPax":"3","NoOfAdult":"3","NoOfChildren":"0"}]
Consider reading in the text line by line with readLines(), iteratively saving the JSON dataframes to a growing list:
library(jsonlite)
con <- file("C:/Path/To/Jsons.txt", open="r")
jsonlist <- list()
while (length(line <- readLines(con, n=1, warn = FALSE)) > 0) {
jsonlist <- append(jsonlist, list(fromJSON(line)))
}
close(con)
jsonlist
# [[1]]
# CategoryType City Location
# 1 dining mumbai all
# [[2]]
# CategoryType City Location
# 1 reserve-a-table pune Kothrud,West Pune
# [[3]]
# Destination CheckInDate CheckOutDate Rooms NoOfPax NoOfAdult NoOfChildren
# 1 Mumbai 14-Oct-2016 15-Oct-2016 1 3 3 0

JSON (using jsonlite) parsing error in R

I have the following JSON file:
{"id":1140854908,"name":"'Amran"}
{"id":1140852651,"name":"'Asir"}
{"id":1140855190,"name":"'Eua"}
{"id":1140851307,"name":"A Coruna"}
{"id":1140854170,"name":"A`Ana"}
I used the package jsonlite but I get a parsing error
library(jsonlite)
try <- fromJSON("states.txt",simplifyDataFrame = T)
# Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
# parse error: trailing garbage
# :1140854908,"name":"'Amran"} {"id":1140852651,"name":"'Asir"
# (right here) ------^
Try changing your data file to below
[
{"id":1140854908,"name":"'Amran"}
,{"id":1140852651,"name":"'Asir"}
,{"id":1140855190,"name":"'Eua"}
,{"id":1140851307,"name":"A Coruna"}
,{"id":1140854170,"name":"A`Ana"}
]
The same code worked for me.. It is looking for an array..
Your file is a newline delimited JSON (http://ndjson.org/). You can read it with jsonlite like this:
try <- stream_in(file("states.txt"))

Iteratively read a fixed number of lines into R

I have a josn file I'm working with that contains multiple json objects in a single file. R is unable to read the file as a whole. But since each object occurs at regular intervals, I would like to iteratively read a fixed number of lines into R.
There are a number of SO questions on reading single lines into R but I have been unable to extend these solutions to a fixed number of lines. For my problem I need to read 16 lines into R at a time (eg 1-16, 17-32 etc)
I have tried using a loop but can't seem to get the syntax right:
## File
file <- "results.json"
## Create connection
con <- file(description=file, open="r")
## Loop over a file connection
for(i in 1:1000) {
tmp <- scan(file=con, nlines=16, quiet=TRUE)
data[i] <- fromJSON(tmp)
}
The file contains over 1000 objects of this form:
{
"object": [
[
"a",
0
],
[
"b",
2
],
[
"c",
2
]
]
}
With #tomtom inspiration I was able to find a solution.
## File
file <- "results.json"
## Loop over a file
for(i in 1:1000) {
tmp <- paste(scan(file=file, what="character", sep="\n", nlines=16, skip=(i-1)*16, quiet=TRUE),collapse=" ")
assign(x = paste("data", i, sep = "_"), value = fromJSON(tmp))
}
I couldn't create a connection as each time I tried the connection would close before the file had been completely read. So I got rid of that step.
I had to include the what="character" variable as scan() seems to expect a number by default.
I included sep="\n", paste() and collapse=" " to create a single string rather than the vector of characters that scan() creates by default.
Finally I just changed the final assignment operator to have a bit more control over the names of the output.
This might help:
EDITED to make it use a list and Reduce into one file
## Loop over a file connection
data <- NULL
for(i in 1:1000) {
tmp <- scan(file=con, nlines=16, skip=(i-1)*16, quiet=TRUE)
data[[i]] <- fromJSON(tmp)
}
df <- Reduce(function(x, y) {paste(x, y, collapse = " ")})
You would have to make sure that you don't reach further than the end of the file though ;-)

Convert tweets into Bson using twitteR and rmongo library

Since the streamR connection API doesn't work anymore on Tweeter I try to convert the output from searchTwitter function (from TwitteR) into BSON before insert it in a mongodb database.
test.tweets = searchTwitter("mongodb", n=10, lang="en")
class(test.tweets)
test.text=laply(test.tweets,function(t) t$getText())
class(toJSON(test.text))
bson <- mongo.bson.from.JSON(test.text)
R return an error : "Error in mongo.bson.from.JSON(test.text) : Not a valid JSON content:..."
How to resolve this conversion or does exist another solution ?
Thank you
This works
library(rmongodb)
library(jsonlite)
test.text <- c("A tweet", "Another tweet")
(bson <- mongo.bson.from.JSON(toJSON(test.text)))
# 1 : 2 A tweet
# 2 : 2 Another tweet

R JSON UTF-8 parsing

I have an issue when trying to parse a JSON file in russian alphabet in R. The file looks like this:
[{"text": "Валера!", "type": "status"}, {"text": "когда выйдет", "type": "status"}, {"text": "КАК ДЕЛА?!)", "type": "status"}]
and it is saved in UTF-8 encoding. I tried libraries rjson, RJSONIO and jsonlite to parse it, but it doesn't work:
library(jsonlite)
allFiles <- fromJSON(txt="ru_json_example_short.txt")
gives me error
Error in feed_push_parser(buf) :
lexical error: invalid char in json text.
[{"text": "Валера!", "
(right here) ------^
When I save the file in ANSI encodieng, it works OK, but then, the Russian alphabet transforms into question marks, so the output is unusable.
Does anyone know how to parse such JSON file in R, please?
Edit: Above mentioned applies for UTF-8 file saved in Windows Notepad. When I save it in PSPad and the parse it, the result looks like this:
text type
1 <U+0412><U+0430><U+043B><U+0435><U+0440><U+0430>! status
2 <U+043A><U+043E><U+0433><U+0434><U+0430> <U+0432><U+044B><U+0439><U+0434><U+0435><U+0442> status
3 <U+041A><U+0410><U+041A> <U+0414><U+0415><U+041B><U+0410>?!) status
Try the following:
dat <- fromJSON(sprintf("[%s]",
paste(readLines("./ru_json_example_short.txt"),
collapse=",")))
dat
[[1]]
text type
1 Валера! status
2 когда выйдет status
3 КАК ДЕЛА?!) status
ref: Error parsing JSON file with the jsonlite package