Filter in Nested Data Frame - json

I am playing around with the Yelp data set and want to filter the business set according to the category.
I imported the JSON file into R with
yelp_business = stream_in(file("yelp_academic_dataset_business.json"))
which results then in the following data frame:
'data.frame': 77445 obs. of 15 variables:
$ business_id : chr "5UmKMjUEUNdYWqANhGckJw" "UsFtqoBl7naz8AVUBZMjQQ" "3eu6MEFlq2Dg7bQh8QbdOg" "cE27W9VPgO88Qxe4ol6y_g" ...
$ full_address : chr "4734 Lebanon Church Rd\nDravosburg, PA 15034" "202 McClure St\nDravosburg, PA 15034" "1 Ravine St\nDravosburg, PA 15034" "1530 Hamilton Rd\nBethel Park, PA 15234" ...
$ hours :'data.frame': 77445 obs. of 7 variables:
..$ Friday :'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr "21:00" NA NA NA ...
.. ..$ open : chr "11:00" NA NA NA ...
..$ Tuesday :'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr "21:00" NA NA NA ...
.. ..$ open : chr "11:00" NA NA NA ...
..$ Thursday :'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr "21:00" NA NA NA ...
.. ..$ open : chr "11:00" NA NA NA ...
..$ Wednesday:'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr "21:00" NA NA NA ...
.. ..$ open : chr "11:00" NA NA NA ...
..$ Monday :'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr "21:00" NA NA NA ...
.. ..$ open : chr "11:00" NA NA NA ...
..$ Sunday :'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr NA NA NA NA ...
.. ..$ open : chr NA NA NA NA ...
..$ Saturday :'data.frame': 77445 obs. of 2 variables:
.. ..$ close: chr NA NA NA NA ...
.. ..$ open : chr NA NA NA NA ...
$ open : logi TRUE TRUE TRUE FALSE TRUE TRUE ...
$ categories :List of 77445
..$ : chr "Fast Food" "Restaurants"
..$ : chr "Nightlife"
..$ : chr "Auto Repair" "Automotive"
..$ : chr "Active Life" "Mini Golf" "Golf"
..$ : chr "Shopping" "Home Services" "Internet Service Providers" "Mobile Phones" ...
..$ : chr "Bars" "American (New)" "Nightlife" "Lounges" ...
..$ : chr "Active Life" "Trainers" "Fitness & Instruction"
..$ : chr "Bars" "American (Traditional)" "Nightlife" "Restaurants"
..$ : chr "Auto Repair" "Automotive" "Tires"
..$ : chr "Active Life" "Mini Golf"
..$ : chr "Home Services" "Contractors"
..$ : chr "Veterinarians" "Pets"
..$ : chr "Libraries" "Public Services & Government"
..$ : chr "Automotive" "Auto Parts & Supplies"
I now want to filter the rows by business category, keeping every business that has "food" somewhere in its category list.
However, if I just try it that way:
input ="food"
engage = filter(yelp_business, grepl(input, categories))
I receive the following error code:
Error: data_frames can only contain 1d atomic vectors and lists
I first suspected the nested structure was the reason for that. However, using tidyjson does not help either, as categories is a list and not a data frame within the main data frame.
Does anyone have an idea how to solve this? I just need a list of all food restaurant's business ids to then filter the review json file from Yelp to extract the written reviews.
Any help with this is really appreciated! Thanks a lot!

tidyjson does not yet support ndjson, and I am not quite sure how to nicely work with stream_in().
However, it is possible to read the file directly and process naturally with tidyjson. I am using the development version from devtools::install_github('jeremystan/tidyjson').
document.id gives a nice identification of objects, so I find the document.ids that have "food" in one of the "categories." From that point, we filter and do whatever additional data analysis is desired.
library(dplyr)
library(stringr)
library(tidyjson)
j <- readLines("yelp_academic_dataset_business.json")
raw <- j %>% as.tbl_json()
## pull out the categories for filtering
prep <- raw %>% enter_object("categories") %>%
gather_array() %>% append_values_string()
## filter to 'food' categories (use document.id to identify json objects)
keepids <- prep[str_detect(str_to_lower(prep$string), "food"), ]$document.id %>%
unique()
## filter and do any further data analysis you want to do
raw %>% filter(document.id %in% keepids) %>%
spread_values(
name = json_chr(name),
city = json_chr(city),
state = json_chr(state),
stars = json_chr(stars))
#> # A tbl_json: 21 x 5 tibble with a "JSON" attribute
#> `attr(., "JSON")` document.id name city
#> <chr> <int> <chr> <chr>
#> 1 "{\"business_id\":..." 2 Cut and Taste Las Vegas
#> 2 "{\"business_id\":..." 8 Taco Bell Scottsdale
#> 3 "{\"business_id\":..." 10 Sehne Backwaren Stuttgart
#> 4 "{\"business_id\":..." 20 Graceful Cake Creations Mesa
#> 5 "{\"business_id\":..." 26 Chipotle Mexican Grill Toronto
#> 6 "{\"business_id\":..." 30 Carrabba's Italian Grill Glendale
#> 7 "{\"business_id\":..." 32 I Deal Coffee Toronto
#> 8 "{\"business_id\":..." 34 Lo-Lo's Chicken & Waffles Phoenix
#> 9 "{\"business_id\":..." 38 Kabob Palace Las Vegas
#> 10 "{\"business_id\":..." 43 Tea Shop 168 Markham
#> # ... with 11 more rows, and 2 more variables: state <chr>, stars <chr>
NOTE - I only processed the first 100 records of the yelp_academic_dataset_business.json file.
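If the data is already in R via stream_in(), a base R approach may also be enough, since categories is a plain list column. What follows is only a minimal sketch: the toy yelp_business below is a hypothetical stand-in for the real data frame, and it assumes every element of categories is a character vector. Matching is case-insensitive, because grepl("food", ...) on its own would miss "Fast Food".

```r
# Hypothetical stand-in for the data frame returned by stream_in()
yelp_business <- data.frame(business_id = c("a", "b", "c"),
                            stringsAsFactors = FALSE)
yelp_business$categories <- list(c("Fast Food", "Restaurants"),
                                 "Nightlife",
                                 c("Food", "Coffee & Tea"))

# One TRUE/FALSE per business: does any of its categories mention "food"?
has_food <- vapply(yelp_business$categories,
                   function(x) any(grepl("food", x, ignore.case = TRUE)),
                   logical(1))

# Business ids to use later when filtering the review file
food_ids <- yelp_business$business_id[has_food]
food_ids
```

vapply() is used instead of sapply() so that the result is guaranteed to be a logical vector with one entry per row, even when a business has no categories at all.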

Related

Select data frame elements from multiple lists inside a list

I have a list (IPCs) containing multiple data frames.
here is a sample from my list:
$ http://www.sumobrain.com/patents/us/Measured-object-support-mechanism-for-unbalance-measuring-apparatus/4981043.html
:List of 1
..$ :'data.frame': 3 obs. of 5 variables:
.. ..$ X1: chr [1:3] "2001826A" "2857764A" "3452604A"
.. ..$ X2: chr [1:3] "1935-05-21" "1958-10-28" "1969-07-01"
.. ..$ X3: chr [1:3] "Russell et al." "Frank" "Schaub"
.. ..$ X4: chr [1:3] "73/478" "73/477" "73/475"
.. ..$ X5: chr [1:3] "Machine for balancing heavy bodies" "Rotor balance testing machine" "BALANCE TESTING APPARATUS HEAD"
$ http://www.sumobrain.com/patents/us/Encoder-with-wide-index/4982189.html
:List of 1
..$ :'data.frame': 8 obs. of 5 variables:
.. ..$ X1: chr [1:8] "3500449A" "4212000A" "4233592A" "4524347A" ...
.. ..$ X2: chr [1:8] "1970-03-10" "1980-07-08" "1980-11-11" "1985-06-18" ...
.. ..$ X3: chr [1:8] "Lenz" "Yamada" "Leichle" "Rogers" ...
.. ..$ X4: chr [1:8] "341/6" "341/16" "341/6" "341/3" ...
.. ..$ X5: chr [1:8] "ELECTRONIC ENCODER INDEX" "Position-to-digital encoder" "Method for detection of the angular position of a part driven in rotation and instrumentation using it" "Position encoder" ...
$ http://www.sumobrain.com/patents/us/Device-for-detecting-at-least-one-variable-relating-to-the-movement-of-a-movable-body/4982106.html
:List of 1
..$ :'data.frame': 2 obs. of 5 variables:
.. ..$ X1: chr [1:2] "3956973A" "4797564A"
.. ..$ X2: chr [1:2] "1976-05-18" "1989-01-10"
.. ..$ X3: chr [1:2] "Pomplas" "Ramunas"
.. ..$ X4: chr [1:2] "92/5R" "307/119"
.. ..$ X5: chr [1:2] "Die casting machine with piston positioning control" "Robot overload detection mechanism"
I would like to select only the first and fifth elements (X1 and X5) from all data frames, to later construct a further dataset with only these two elements.
I have tried to grab X1 with this:
citations_IPC <- sapply(IPCs, function(x){
y<-x[,1]
return(y)
})
and X5 with:
citations_titles <- sapply(IPCs[[1]], function(z){
e<-z[,5]
return(e)
})
Then I convert citations_IPC and citations_titles into a single data frame with:
citation_list <- data.frame(IPC = unlist(lapply(citations_IPC, paste)), title = unlist(lapply(citations_titles, paste)) )
Problem:
If I write the sapply function on an individual list (e.g. IPCs[[1]]) I get the result I want:
citations_IPC <- sapply(IPCs[[1]], function(x){
y<-x[,1]
return(y)
})
result:
> citations_IPC
[,1]
[1,] "3415985A"
[2,] "3916190A"
[3,] "4088895A"
[4,] "4633084A"
[5,] "4670651A"
[6,] "4860224A"
However, this function doesn't work for the whole list (IPCs).
The error I get is:
"Error in x[, 1] : incorrect number of dimensions"
I am guessing the problem is caused by a few elements in my list that contain no data frame (no observations and no variables). In that case I would need a function that lets me use sapply() on the whole list despite the elements without a data frame.
Any suggestions would be really appreciated.
Many thanks
> str(IPCs)
List of 19
$ http://www.sumobrain.com/patents/us/Method-and-apparatus-for-the-quantitative,-depth-differential-analysis-of-solid-samples-with-the-use-of-two-ion-beams/4982090.html :List of 1
..$ :'data.frame': 6 obs. of 5 variables:
.. ..$ X1: chr [1:6] "3415985A" "3916190A" "4088895A" "4633084A" ...
.. ..$ X2: chr [1:6] "1968-12-10" "1975-10-28" "1978-05-09" "1986-12-30" ...
.. ..$ X3: chr [1:6] "Castaing et al." "Valentine et al." "Martin" "Gruen et al." ...
.. ..$ X4: chr [1:6] "250/309" "250/309" "250/309" "250/309" ...
.. ..$ X5: chr [1:6] "Ionic microanalyzer wherein secondary ions are emitted from a sample surface upon bombardment by neutral atoms" "Depth profile analysis apparatus" "Memory device utilizing ion beam readout" "High efficiency direct detection of ions from resonance ionization of sputtered atoms" ...
$ http://www.sumobrain.com/patents/us/Set-on-oscillator/4982165.html
:List of 1
..$ :'data.frame': 2 obs. of 5 variables:
.. ..$ X1: chr [1:2] "4437066A" "4558282A"
.. ..$ X2: chr [1:2] "1984-03-13" "1985-12-10"
.. ..$ X3: chr [1:2] "Gordon" "Lowenschuss"
.. ..$ X4: chr [1:2] "328/14" "307/523"
.. ..$ X5: chr [1:2] "Apparatus for synthesizing a signal by producing samples of such signal at a rate less than the Nyquist sampling rate" "Digital frequency synthesizer"
$ http://www.sumobrain.com/patents/us/Voltage-measuring-apparatus/4982151.html
:List of 1
..$ :'data.frame': 7 obs. of 5 variables:
.. ..$ X1: chr [1:7] "3419802A" "3419803A" "4446425A" "4603293A" ...
.. ..$ X2: chr [1:7] "1968-12-31" "1968-12-31" "1984-05-01" "1986-07-29" ...
.. ..$ X3: chr [1:7] "Pelenc et al." "Pelenc et al." "Valdmanis et al." "Mourou et al." ...
.. ..$ X4: chr [1:7] "324/96" "324/96" "" "" ...
.. ..$ X5: chr [1:7] "Apparatus for current measurement by means of the faraday effect" "Apparatus for current measurement by means of the faraday effect" "Measurement of electrical signals with picosecond resolution" "Measurement of electrical signals with subpicosecond resolution" ...
Here is an example.
First, let's make a list with some random iris columns:
data(iris)
lis = list(iris[1:3], iris[2:4])
Use lapply with a custom function to extract columns 1 and 2 from each data frame. If the columns are not named the same, force a rename so that rbind works in the next step:
b = lapply(lis, function(x){
z = x[,c(1,2)]
colnames(z) = c("z1", "z2")
return(z)
}
)
Now b is a list containing only the columns you want.
Finally, rbind the data frames in b:
do.call(rbind, b)
done
Here is a way to do what I understand of your question.
First some fake data.
op <- options(stringsAsFactors = FALSE) # to make sure we have characters not factors
set.seed(9506)
nr <- c(6, 2, 7)
IPCs <- lapply(1:3, function(n){
res <- as.data.frame(replicate(5, sample(LETTERS, nr[n], TRUE)))
names(res) <- paste0("X", 1:5)
res
})
names(IPCs) <- paste0("df", seq_along(IPCs))
str(IPCs)
options(op) # put it back as it was
Now the code to extract the 1st and 5th columns of each data.frame and paste them together in order to form a df.
result <- list(
sapply(IPCs, `[[`, 1),
sapply(IPCs, function(x) x[[ncol(x)]])
)
result <- as.data.frame(lapply(result, function(x) sapply(x, paste, collapse = "")))
names(result) <- c("citations_IPC", "citations_titles")
result
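If the "incorrect number of dimensions" error really does come from entries that hold no data frame, one option is to skip those entries before binding. This is only a sketch under that assumption: the toy IPCs below is hypothetical and mimics the structure shown in the question (each element is a list holding at most one data frame), and extract_cols is a made-up helper name.

```r
# Hypothetical stand-in: p2 has no citation table at all
IPCs <- list(
  p1 = list(data.frame(X1 = c("A1", "A2"), X2 = "d", X3 = "n", X4 = "c",
                       X5 = c("title 1", "title 2"),
                       stringsAsFactors = FALSE)),
  p2 = list(),
  p3 = list(data.frame(X1 = "B1", X2 = "d", X3 = "n", X4 = "c",
                       X5 = "title 3",
                       stringsAsFactors = FALSE))
)

# Return columns X1 and X5, or NULL when there is no usable data frame
extract_cols <- function(x) {
  df <- if (length(x) > 0) x[[1]] else NULL
  if (!is.data.frame(df) || nrow(df) == 0) return(NULL)
  df[, c("X1", "X5"), drop = FALSE]
}

pieces <- lapply(IPCs, extract_cols)
pieces <- Filter(Negate(is.null), pieces)   # drop the empty entries

citation_list <- do.call(rbind, pieces)
names(citation_list) <- c("IPC", "title")
citation_list
```

lapply() is used rather than sapply() so that the result is always a list of data frames, whatever their number of rows.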

Unnest list with nested data frames in R

I'm working on parsing out some data from the foursquare api and I have one portion of the results that look like the following, a large list with nested dataframes:
List of 1
$ :List of 26
..$ :'data.frame': 1 obs. of 6 variables:
.. ..$ id : chr "4bf58dd8d48988d129951735"
.. ..$ name : chr "Train Station"
.. ..$ pluralName: chr "Train Stations"
.. ..$ shortName : chr "Train Station"
.. ..$ icon :'data.frame': 1 obs. of 2 variables:
.. .. ..$ prefix: chr "https://ss3.4sqi.net/img/categories_v2/travel/trainstation_"
.. .. ..$ suffix: chr ".png"
.. ..$ primary : logi TRUE
..$ :'data.frame': 1 obs. of 6 variables:
.. ..$ id : chr "4bf58dd8d48988d1fe931735"
.. ..$ name : chr "Bus Station"
.. ..$ pluralName: chr "Bus Stations"
.. ..$ shortName : chr "Bus Station"
.. ..$ icon :'data.frame': 1 obs. of 2 variables:
.. .. ..$ prefix: chr "https://ss3.4sqi.net/img/categories_v2/travel/busstation_"
.. .. ..$ suffix: chr ".png"
.. ..$ primary : logi TRUE
..$ :'data.frame': 1 obs. of 6 variables:
I'm trying to unnest these data frames for certain elements so that I can cbind them to a pre-existing file I have. Ultimately I would like the end result to look something like the following:
$id $name
4bf58dd8d48988d129951735 train station
4bf58dd8d48988d1fe931735 bus station
etc.
Thanks!
Suppose your large list is called mylist. Then you can either iterate through mylist[[1]] and extract the relevant columns:
do.call(rbind, lapply(mylist[[1]], `[`, c("id", "name")))
or use the rbind_pages function from jsonlite (named rbind.pages in older versions of the package):
jsonlite::rbind_pages(mylist[[1]])[c("id", "name")]
both of which will give you
# id name
# 1 4bf58dd8d48988d129951735 Train Station
# 2 4bf58dd8d48988d1fe931735 Bus Station
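To try the first approach without calling the API, here is a small self-contained sketch. The helper make_cat and the venues data frame are hypothetical, made up for illustration; the ids and names are copied from the question. Selecting id and name before binding means the nested icon data frame never has to be combined.

```r
# Hypothetical helper: builds one category data frame with a nested "icon" column
make_cat <- function(id, name) {
  df <- data.frame(id = id, name = name, stringsAsFactors = FALSE)
  df$icon <- data.frame(prefix = "prefix_", suffix = ".png",
                        stringsAsFactors = FALSE)
  df
}

mylist <- list(list(
  make_cat("4bf58dd8d48988d129951735", "Train Station"),
  make_cat("4bf58dd8d48988d1fe931735", "Bus Station")
))

# Keep only id and name from each data frame, then stack the rows
cats <- do.call(rbind, lapply(mylist[[1]], `[`, c("id", "name")))

# Hypothetical pre-existing data with one row per category, in the same order
venues <- data.frame(venue = c("venue 1", "venue 2"), stringsAsFactors = FALSE)
combined <- cbind(venues, cats)
combined
```

Note that cbind only lines the rows up by position, so this assumes the existing data and the category list are in the same order.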

Extracting data from list in R

library(RCurl)
library(rjson)
json <- getURL('https://extraction.import.io/query/runtime/17d882b5-c118-4f27-8ce1-90085ec0b116?_apikey=d5a8a01e20174e95887dc0f385e4e3f6d7ef5ca1428d5a029f2aa352509948ade8e5d7fb0dc941f4769a32b541ca6b38a7cd6578dfd81b357fbc4f2e008f5154f1dbfcff31878798fa887b70b1ff59dd&url=http%3A%2F%2Fwww.numbeo.com%2Fcost-of-living%2Fcompare_cities.jsp%3Fcountry1%3DSingapore%26country2%3DAustralia%26city1%3DSingapore%26city2%3DMelbourne')
obj <- fromJSON(json)
I would like to get the data into nice columns of data, but many steps in the list are "nameless". Any idea of how to organise the data?
Check out this difference, and let me know what you think. This is what your object looks like:
library(RCurl)
library(rjson)
json <- getURL('https://extraction.import.io/query/runtime/17d882b5-c118-4f27-8ce1-90085ec0b116?_apikey=d5a8a01e20174e95887dc0f385e4e3f6d7ef5ca1428d5a029f2aa352509948ade8e5d7fb0dc941f4769a32b541ca6b38a7cd6578dfd81b357fbc4f2e008f5154f1dbfcff31878798fa887b70b1ff59dd&url=http%3A%2F%2Fwww.numbeo.com%2Fcost-of-living%2Fcompare_cities.jsp%3Fcountry1%3DSingapore%26country2%3DAustralia%26city1%3DSingapore%26city2%3DMelbourne')
obj <- rjson::fromJSON(json)
str(obj)
List of 2
$ extractorData:List of 3
..$ url : chr "http://www.numbeo.com/cost-of-living/compare_cities.jsp?country1=Singapore&country2=Australia&city1=Singapore&city2=Melbourne"
..$ resourceId: chr "b1250747011ee774e7c881617c86a5a9"
..$ data :List of 1
.. ..$ :List of 1
.. .. ..$ group:List of 52
.. .. .. ..$ :List of 6
.. .. .. .. ..$ COL VALUE :List of 1
.. .. .. .. .. ..$ :List of 1
.. .. .. .. .. .. ..$ text: chr "Meal, Inexpensive Restaurant"
Indeed a lot of Lists in between there that you don't need. Now try the jsonlite package's fromJSON function:
library(jsonlite)
obj2 <- jsonlite::fromJSON(json)
str(obj2)
List of 2
$ extractorData:List of 3
..$ url : chr "http://www.numbeo.com/cost-of-living/compare_cities.jsp?country1=Singapore&country2=Australia&city1=Singapore&city2=Melbourne"
..$ resourceId: chr "b1250747011ee774e7c881617c86a5a9"
..$ data :'data.frame': 1 obs. of 1 variable:
.. ..$ group:List of 1
.. .. ..$ :'data.frame': 52 obs. of 6 variables:
.. .. .. ..$ COL VALUE :List of 52
.. .. .. .. ..$ :'data.frame': 1 obs. of 1 variable:
.. .. .. .. .. ..$ text: chr "Meal, Inexpensive Restaurant"
.. .. .. .. ..$ :'data.frame': 1 obs. of 1 variable:
.. .. .. .. .. ..$ text: chr "Meal for 2 People, Mid-range Restaurant, Three-course"
.. .. .. .. ..$ :'data.frame': 1 obs. of 1 variable:
Still though, this JSON just isn't pretty, we'll need to fix this.
I take it you want that data frame in there. So start with
df <- obj2$extractorData$data$group[[1]]
and there's your data frame. One problem, though: every single cell is wrapped in a list, including the NULL values. You can't just unlist those columns, because the NULLs would disappear and the columns that contained them would come out shorter than the rest.
Edit: Here's how to handle the columns with list(NULL) values.
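# replace NULL cells with NA so that unlist() keeps one value per row
df[sapply(df[,2],is.null),2] <- NA
df[sapply(df[,3],is.null),3] <- NA
df[sapply(df[,4],is.null),4] <- NA
df[sapply(df[,5],is.null),5] <- NA
# unlist every column and rebuild the data frame (no pipe, so no extra package is needed)
df2 <- as.data.frame(sapply(df, unlist))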
It can be written more elegantly for sure, but this'll get you going and it's understandable.
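One possible way to write the same idea more compactly is a small helper applied to every column. This is a sketch, not a drop-in replacement: null_to_na is a made-up name, the toy cols list stands in for the list columns of df, and it assumes each cell holds at most one value. It uses lapply, so the columns are rebuilt directly rather than going through the matrix that sapply returns.

```r
# Hypothetical helper: turn NULL cells into NA, then flatten the list column
null_to_na <- function(col) {
  col[vapply(col, is.null, logical(1))] <- NA
  unlist(col)
}

# Toy stand-in for the list columns of df; the second price is missing
cols <- list(
  item  = list("Meal", "Coffee", "Milk"),
  price = list("10", NULL, "3")
)

df2 <- as.data.frame(lapply(cols, null_to_na), stringsAsFactors = FALSE)
df2
```

Because every NULL becomes an NA before unlist() runs, all columns keep the same length and the rows stay aligned.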

Parsing BLS JSON with R

I would like to run some canned reports with Knitr in R relying on a number of third party resources, some offered as text files, and some offered through public APIs.
I am not particularly well versed in parsing JSON files, however, and quickly lose my bearings when they get mildly complicated (which I don't think my example is, but still).
Here's the call:
library(rjson)
addr = 'http://api.bls.gov/publicAPI/v1/timeseries/data/ENU0607510010'
json_data <- fromJSON(file=addr, method='C')
Here's what it looks like--any way to stuff that into a dataframe for further (automatic) melting and plotting?
> str(json_data)[1:100]
List of 4
$ status : chr "REQUEST_SUCCEEDED"
$ responseTime: num 14
$ message : list()
$ Results :List of 1
..$ series:List of 1
.. ..$ :List of 2
.. .. ..$ seriesID: chr "ENU0607510010"
.. .. ..$ data :List of 35
.. .. .. ..$ :List of 5
.. .. .. .. ..$ year : chr "2013"
.. .. .. .. ..$ period : chr "M09"
.. .. .. .. ..$ periodName: chr "September"
.. .. .. .. ..$ value : chr "615958"
.. .. .. .. ..$ footnotes :List of 1
.. .. .. .. .. ..$ :List of 2
.. .. .. .. .. .. ..$ code: chr "P"
.. .. .. .. .. .. ..$ text: chr " Preliminary."
.. .. .. ..$ :List of 5
.. .. .. .. ..$ year : chr "2013"
.. .. .. .. ..$ period : chr "M08"
.. .. .. .. ..$ periodName: chr "August"
.. .. .. .. ..$ value : chr "615326"
.. .. .. .. ..$ footnotes :List of 1
.. .. .. .. .. ..$ :List of 2
.. .. .. .. .. .. ..$ code: chr "P"
.. .. .. .. .. .. ..$ text: chr " Preliminary."
.. .. .. ..$ :List of 5
.. .. .. .. ..$ year : chr "2013"
.. .. .. .. ..$ period : chr "M07"
.. .. .. .. ..$ periodName: chr "July"
.. .. .. .. ..$ value : chr "611071"
.. .. .. .. ..$ footnotes :List of 1
.. .. .. .. .. ..$ :List of 2
.. .. .. .. .. .. ..$ code: chr "P"
.. .. .. .. .. .. ..$ text: chr " Preliminary."
.. .. .. ..$ :List of 5
Give this a go. I need to move from RJSONIO to jsonlite at some point, but this will get you your data. It's all a matter of figuring out the structure so you can do the sapply's. I added the bar chart because I had it in a gist example for BLS data already.
library(RCurl)
library(RJSONIO)
library(ggplot2)
bls.content <- getURLContent("http://api.bls.gov/publicAPI/v1/timeseries/data/ENU0607510010")
bls.json <- fromJSON(bls.content, simplify=TRUE)
tmp <-bls.json$Results[[1]][[1]]
bls.df <- data.frame(year=sapply(tmp$data,"[[","year"),
period=sapply(tmp$data,"[[","period"),
periodName=sapply(tmp$data,"[[","periodName"),
value=as.numeric(sapply(tmp$data,"[[","value")),
stringsAsFactors=FALSE)
head(bls.df, n=10)
## year period periodName value
## 1 2013 M09 September 615958
## 2 2013 M08 August 615326
## 3 2013 M07 July 611071
## 4 2013 M06 June 610893
## 5 2013 M05 May 610750
## 6 2013 M04 April 607797
## 7 2013 M03 March 603286
## 8 2013 M02 February 600868
## 9 2013 M01 January 593770
## 10 2012 M13 Annual 586538
gg <- ggplot(data=bls.df, aes(x=year, y=value, group=period))
gg <- gg + geom_bar(stat="identity", position="dodge", aes(fill=period))
gg
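The same structure can also be walked without any JSON package once the list is in memory. The sketch below is an assumption-laden illustration: series is a hand-built stand-in for one element of json_data$Results$series, with values copied from the str() output in the question, and the footnotes are simply left out of the result.

```r
# Hand-built stand-in for one parsed series (two observations only)
series <- list(
  seriesID = "ENU0607510010",
  data = list(
    list(year = "2013", period = "M09", periodName = "September",
         value = "615958",
         footnotes = list(list(code = "P", text = " Preliminary."))),
    list(year = "2013", period = "M08", periodName = "August",
         value = "615326",
         footnotes = list(list(code = "P", text = " Preliminary.")))
  )
)

# Scalar fields to keep; the nested footnotes are skipped
fields <- c("year", "period", "periodName", "value")

# One single-row data frame per observation, then stack them
rows <- lapply(series$data, function(obs) {
  as.data.frame(obs[fields], stringsAsFactors = FALSE)
})
bls.df <- do.call(rbind, rows)
bls.df$value <- as.numeric(bls.df$value)
bls.df
```

Picking the fields by name rather than by position keeps the code working if the API adds or reorders fields, as long as these four are still present.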

Get only specific object within json in a data frame

I would like to import a single object from a JSON file into an R data frame. Normally I use fromJSON() from the jsonlite package. However, this time I only want the object called plays in the data frame.
If I use:
library(jsonlite)
df <- fromJSON("http://live.nhl.com/GameData/20132014/2013020555/PlayByPlay.json")
It gives a data frame containing all the objects. Is there a way to only load the plays object in the data frame? Or should I just load the complete json and restructure this within R?
That does return a data frame, although it's kind of a mangled gemisch of list and data frame. If you use a different package, it is just a list. Using str(df) (warning: long output)
library(RJSONIO)
str(df)
#------------
List of 1
$ data:List of 2
..$ refreshInterval: num 0
..$ game :List of 7
.. ..$ awayteamid : num 24
.. ..$ awayteamname: chr "Anaheim Ducks"
.. ..$ hometeamname: chr "Washington Capitals"
.. ..$ plays :List of 1
.. .. ..$ play:List of 102
.. .. .. ..$ :List of 28
-----------Output truncated----------------
.... shows that the plays portions can be obtained with:
plays_out <- df$data$game$plays
I do not see that there is any advantage in trying to parse this yourself. Most of the "volume" of data is in the plays component.
When I use jsonlite::fromJSON I get a slightly different structure, different enough that I need to use a different call to get the plays items:
> str(df )
'data.frame': 1 obs. of 2 variables:
$ refreshInterval:List of 1
..$ data: num 0
$ game :'data.frame': 1 obs. of 7 variables:
..$ awayteamid :List of 1
.. ..$ data: num 24
..$ awayteamname:List of 1
.. ..$ data: chr "Anaheim Ducks"
..$ hometeamname:List of 1
.. ..$ data: chr "Washington Capitals"
..$ plays :'data.frame': 1 obs. of 1 variable:
.. ..$ play:List of 1
.. .. ..$ data:'data.frame': 102 obs. of 29 variables:
.. .. .. ..$ aoi :List of 102
.. .. .. .. ..$ : num 8470612 8470621 8473933 8473972 8475151 ...
.. .. .. .. ..$ : num 8459442 8467332 8467400 8471476 8471699 ...
.. .. .. .. ..$ : num 8459442 8467332 8467400 8471476 8471699 ...
.. .. .. .. ..$ : num 8459442 8467332 8467400 8471476 8471699 ...
#------snipped output------------
> length(df$game$plays)
[1] 1
> length(df$game$plays$play)
[1] 1
> length(df$game$plays$play$data)
[1] 29
I think I prefer the result from RJSONIO::fromJSON, since it doesn't add the complexity of dataframe coercion.
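With the plain-list result, the plays can then be turned into a data frame in a second step. The following is a rough sketch only: game_json is a hand-built stand-in for the parsed list, the two play fields (period, desc) are hypothetical, and it assumes every play contains scalar fields only, which is not true for all fields of the real feed (aoi, for example, holds several values per play).

```r
# Hand-built stand-in for the list returned by a list-based JSON parser
game_json <- list(data = list(
  refreshInterval = 0,
  game = list(
    awayteamname = "Anaheim Ducks",
    hometeamname = "Washington Capitals",
    plays = list(play = list(
      list(period = 1, desc = "hypothetical play A"),
      list(period = 1, desc = "hypothetical play B")
    ))
  )
))

# Walk down to the list of plays, ignoring everything else
plays <- game_json$data$game$plays$play

# One single-row data frame per play, stacked into one data frame
plays_df <- do.call(rbind,
                    lapply(plays, as.data.frame, stringsAsFactors = FALSE))
plays_df
```

For plays with vector-valued fields, those fields would have to be dropped or collapsed (for example with paste) before calling as.data.frame.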