Convert JSON to data.frame with more than 2 columns - json

I am trying to properly convert a JSON to a data.frame with 3 columns.
This is a simplification of my data
# simplification of my real data
my_data <- '{"Bag 1": [["bananas", 1], ["oranges", 2]],"Bag 2": [["bananas", 3], ["oranges", 4], ["apples", 5]]}'
library(jsonlite)
my_data <- fromJSON(my_data)
> my_data
$`Bag 1`
[,1] [,2]
[1,] "bananas" "1"
[2,] "oranges" "2"
$`Bag 2`
[,1] [,2]
[1,] "bananas" "3"
[2,] "oranges" "4"
[3,] "apples" "5"
I try to convert that to a data.frame
# this returns an error about "arguments imply differing number of rows: 2, 3"
my_data <- as.data.frame(my_data)
> my_data <- as.data.frame(my_data)
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 2, 3
This is my solution to create the data.frame
# my solution
my_data <- data.frame(fruit = do.call(c, my_data),
bag_number = rep(1:length(my_data),
sapply(my_data, length)))
# how it looks
my_data
> my_data
fruit bag_number
Bag 11 bananas 1
Bag 12 oranges 1
Bag 13 1 1
Bag 14 2 1
Bag 21 bananas 2
Bag 22 oranges 2
Bag 23 apples 2
Bag 24 3 2
Bag 25 4 2
Bag 26 5 2
What I want instead is something like the table below, so that I don't have to resort to indexing such as my_data[a:b,1] when I use ggplot2 and other packages.
fruit | quantity | bag_number
oranges | 2 | 1
bananas | 1 | 1
oranges | 4 | 2
bananas | 3 | 2
apples | 5 | 2

library(plyr)
# import data (note that the rjson package does this differently than the jsonlite package)
data.import <- jsonlite::fromJSON(my_data)
# combine all data using plyr
df <- ldply(data.import, rbind)
# clean up column names
colnames(df) <- c('bag_number', 'fruit', 'quantity')
bag_number fruit quantity
1 Bag 1 bananas 1
2 Bag 1 oranges 2
3 Bag 2 bananas 3
4 Bag 2 oranges 4
5 Bag 2 apples 5

purrr / tidyverse version. With this you also get proper column types and get rid of the "Bag" prefix:
library(jsonlite)
library(purrr)
library(readr)
fromJSON(my_data, flatten=TRUE) %>%
map_df(~as.data.frame(., stringsAsFactors=FALSE), .id="bag") %>%
type_convert() %>%
setNames(c("bag_number", "fruit", "quantity")) -> df
df$bag_number <- gsub("Bag ", "", df$bag_number)
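For comparison, the same reshaping can be sketched outside R. This is a minimal illustration in Python using only the standard library, not a replacement for the answers above. The function name bags_to_rows is made up for this sketch, and it assumes every key has the form "Bag <integer>".

```python
import json

# Same JSON string as in the question.
raw = ('{"Bag 1": [["bananas", 1], ["oranges", 2]],'
       '"Bag 2": [["bananas", 3], ["oranges", 4], ["apples", 5]]}')


def bags_to_rows(text):
    """Return one (fruit, quantity, bag_number) tuple per fruit entry."""
    rows = []
    for bag_name, items in json.loads(text).items():
        # "Bag 1" -> 1; assumes every key looks like "Bag <integer>".
        bag_number = int(bag_name.split()[-1])
        for fruit, quantity in items:
            rows.append((fruit, quantity, bag_number))
    return rows


rows = bags_to_rows(raw)
for row in rows:
    print(row)
```

Each tuple corresponds to one row of the desired fruit | quantity | bag_number table, and the quantities stay numeric because they are never coerced into a character matrix.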

Scraping Website with Unchanging URL in R

I would like to scrape a series of tables from a website whose URL does not change when I click through the tables in my browser. Each table corresponds to a unique date. The default table is that which corresponds to today's date. I can scroll through past dates in my browser, but can't seem to find a way to do so in R.
Using library(rvest) this bit of code will reliably download the table that corresponds to today's date (I'm only interested in the first of the three tables).
webad <- "https://official.nba.com/referee-assignments/"
off <- webad %>%
read_html() %>%
html_table()
off <- off[[1]]
How can I download the table that corresponds to, say "2022-10-04", to "2022-10-06", or to yesterday?
I've tried to work through it by identifying the node under which the table lies, in the hopes that I could manipulate it to reflect a prior date. However, the following reproduces the same table as above:
webad <- "https://official.nba.com/referee-assignments/"
off <- webad %>%
read_html() %>%
html_nodes("#main > div > section:nth-child(1) > article > div > div.dayContent > div > table") %>%
html_table()
off <- off[[1]]
Scrolling through past dates in my browser, I've identified various places in the HTML that reference the prior date, but I can't seem to change it from R, let alone get the table I download to reflect a change:
webad %>%
read_html() %>%
html_nodes("#main > div > section:nth-child(1) > article > header > div")
I've messed around some with html_form(), follow_link(), and set_values() also, but to no avail.
Is there a good way to navigate this particular URL in R?
You can consider the following approach, which uses RSelenium with Chrome:
library(RSelenium)
library(rvest)
port <- as.integer(4444L + rpois(lambda = 1000, 1))
rd <- rsDriver(chromever = "105.0.5195.52", browser = "chrome", port = port)
remDr <- rd$client
remDr$open()
url <- "https://official.nba.com/referee-assignments/"
remDr$navigate(url)
web_Obj_Date <- remDr$findElement("css selector", "#ref-filters-menu > li > div > button")
web_Obj_Date$clickElement()
web_Obj_Date_Input <- remDr$findElement("id", 'ref-date')
web_Obj_Date_Input$clearElement()
web_Obj_Date_Input$sendKeysToElement(list("2022-10-05"))
web_Obj_Date_Input$doubleclick()
web_Obj_Date <- remDr$findElement("css selector", "#ref-filters-menu > li > div > button")
web_Obj_Date$clickElement()
web_Obj_Go_Button <- remDr$findElement("css selector", "#date-filter")
web_Obj_Go_Button$submitElement()
html_Content <- remDr$getPageSource()[[1]]
read_html(html_Content) %>% html_table()
[[1]]
# A tibble: 5 x 5
Game `Official 1` `Official 2` `Official 3` Alternate
<chr> <chr> <chr> <chr> <lgl>
1 Indiana # Charlotte John Goble (#10) Lauren Holtkamp (#7) Phenizee Ransom (#70) NA
2 Cleveland # Philadelphia Marc Davis (#8) Jacyn Goble (#68) Tyler Mirkovich (#97) NA
3 Toronto # Boston Josh Tiven (#58) Matt Boland (#18) Intae hwang (#96) NA
4 Dallas # Oklahoma City Courtney Kirkland (#61) Mitchell Ervin (#27) Cheryl Flores (#91) NA
5 Phoenix # L.A. Lakers Bill Kennedy (#55) Rodney Mott (#71) Jenna Reneau (#93) NA
[[2]]
# A tibble: 0 x 5
# ... with 5 variables: Game <lgl>, Official 1 <lgl>, Official 2 <lgl>, Official 3 <lgl>, Alternate <lgl>
# i Use `colnames()` to see all variable names
[[3]]
# A tibble: 0 x 5
# ... with 5 variables: Game <lgl>, Official 1 <lgl>, Official 2 <lgl>, Official 3 <lgl>, Alternate <lgl>
# i Use `colnames()` to see all variable names
[[4]]
# A tibble: 6 x 7
S M T W T F S
<int> <int> <int> <int> <int> <int> <int>
1 NA NA NA NA NA NA 1
2 2 3 4 5 6 7 8
3 9 10 11 12 13 14 15
4 16 17 18 19 20 21 22
5 23 24 25 26 27 28 29
6 30 31 NA NA NA NA NA
Here is another approach that can be considered, using RDCOMClient with Internet Explorer:
library(RDCOMClient)
library(rvest)
url <- "https://official.nba.com/referee-assignments/"
IEApp <- COMCreate("InternetExplorer.Application")
IEApp[['Visible']] <- TRUE
IEApp$Navigate(url)
Sys.sleep(5)
doc <- IEApp$Document()
clickEvent <- doc$createEvent("MouseEvent")
clickEvent$initEvent("click", TRUE, FALSE)
web_Obj_Date <- doc$querySelector("#ref-filters-menu > li > div > button")
web_Obj_Date$dispatchEvent(clickEvent)
web_Obj_Date_Input <- doc$GetElementById('ref-date')
web_Obj_Date_Input[["Value"]] <- "2022-10-05"
web_Obj_Go_Button <- doc$querySelector("#date-filter")
web_Obj_Go_Button$dispatchEvent(clickEvent)
html_Content <- doc$Body()$innerHTML()
read_html(html_Content) %>% html_table()
[[1]]
# A tibble: 5 x 5
Game `Official 1` `Official 2` `Official 3` Alternate
<chr> <chr> <chr> <chr> <lgl>
1 Indiana # Charlotte John Goble (#10) Lauren Holtkamp (#7) Phenizee Ransom (#70) NA
2 Cleveland # Philadelphia Marc Davis (#8) Jacyn Goble (#68) Tyler Mirkovich (#97) NA
3 Toronto # Boston Josh Tiven (#58) Matt Boland (#18) Intae hwang (#96) NA
4 Dallas # Oklahoma City Courtney Kirkland (#61) Mitchell Ervin (#27) Cheryl Flores (#91) NA
5 Phoenix # L.A. Lakers Bill Kennedy (#55) Rodney Mott (#71) Jenna Reneau (#93) NA
[[2]]
# A tibble: 8 x 7
Game `Official 1` `Official 2` `Official 3` Alternate `` ``
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 "Game" "Official 1" "Official 2" "Official 3" "Alternate" NA NA
2 "S" "M" "T" "W" "T" "F" "S"
3 "" "" "" "" "" "" "1"
4 "2" "3" "4" "5" "6" "7" "8"
5 "9" "10" "11" "12" "13" "14" "15"
6 "16" "17" "18" "19" "20" "21" "22"
7 "23" "24" "25" "26" "27" "28" "29"
8 "30" "31" "" "" "" "" ""
[[3]]
# A tibble: 7 x 7
Game `Official 1` `Official 2` `Official 3` Alternate `` ``
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 "S" "M" "T" "W" "T" "F" "S"
2 "" "" "" "" "" "" "1"
3 "2" "3" "4" "5" "6" "7" "8"
4 "9" "10" "11" "12" "13" "14" "15"
5 "16" "17" "18" "19" "20" "21" "22"
6 "23" "24" "25" "26" "27" "28" "29"
7 "30" "31" "" "" "" "" ""
[[4]]
# A tibble: 6 x 7
S M T W T F S
<int> <int> <int> <int> <int> <int> <int>
1 NA NA NA NA NA NA 1
2 2 3 4 5 6 7 8
3 9 10 11 12 13 14 15
4 16 17 18 19 20 21 22
5 23 24 25 26 27 28 29
6 30 31 NA NA NA NA NA
If you install the Docker software (see https://docs.docker.com/engine/install/), you can consider the following approach with Firefox:
library(RSelenium)
library(rvest)
shell('docker run -d -p 4445:4444 selenium/standalone-firefox')
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
url <- "https://official.nba.com/referee-assignments/"
remDr$navigate(url)
web_Obj_Date <- remDr$findElement("css selector", "#ref-filters-menu > li > div > button")
web_Obj_Date$clickElement()
web_Obj_Date_Input <- remDr$findElement("id", 'ref-date')
web_Obj_Date_Input$clearElement()
web_Obj_Date_Input$sendKeysToElement(list("2022-10-05"))
web_Obj_Date_Input$doubleclick()
web_Obj_Date <- remDr$findElement("css selector", "#ref-filters-menu > li > div > button")
web_Obj_Date$clickElement()
web_Obj_Go_Button <- remDr$findElement("css selector", "#date-filter")
web_Obj_Go_Button$submitElement()
html_Content <- remDr$getPageSource()[[1]]
read_html(html_Content) %>% html_table()
[[1]]
# A tibble: 5 x 5
Game `Official 1` `Official 2` `Official 3` Alternate
<chr> <chr> <chr> <chr> <lgl>
1 Indiana # Charlotte John Goble (#10) Lauren Holtkamp (#7) Phenizee Ransom (#70) NA
2 Cleveland # Philadelphia Marc Davis (#8) Jacyn Goble (#68) Tyler Mirkovich (#97) NA
3 Toronto # Boston Josh Tiven (#58) Matt Boland (#18) Intae hwang (#96) NA
4 Dallas # Oklahoma City Courtney Kirkland (#61) Mitchell Ervin (#27) Cheryl Flores (#91) NA
5 Phoenix # L.A. Lakers Bill Kennedy (#55) Rodney Mott (#71) Jenna Reneau (#93) NA
[[2]]
# A tibble: 0 x 5
# ... with 5 variables: Game <lgl>, Official 1 <lgl>, Official 2 <lgl>, Official 3 <lgl>, Alternate <lgl>
# i Use `colnames()` to see all variable names
[[3]]
# A tibble: 0 x 5
# ... with 5 variables: Game <lgl>, Official 1 <lgl>, Official 2 <lgl>, Official 3 <lgl>, Alternate <lgl>
# i Use `colnames()` to see all variable names
[[4]]
# A tibble: 6 x 7
S M T W T F S
<int> <int> <int> <int> <int> <int> <int>
1 NA NA NA NA NA NA 1
2 2 3 4 5 6 7 8
3 9 10 11 12 13 14 15
4 16 17 18 19 20 21 22
5 23 24 25 26 27 28 29
6 30 31 NA NA NA NA NA

Comparing the contents of two csv files, where the relation between the two files is specified in a third file?

I have two files with sales data, and I want to validate whether the sales numbers in the first file are the same as the sales numbers in the second file. However, the product IDs used in each file are different. I do have a third file with the correspondence between the old product IDs and the new product IDs.
Old Sales file
Product ID Store ID Week ID Sales
a 1 201801 5
a 2 201801 4
a 2 201802 3
b 1 201801 3
b 2 201802 4
b 3 201801 2
c 2 201802 2
New Sales file
Product ID Store ID Week ID Sales
X 1 201801 5
X 2 201801 4
X 2 201802 3
Y 1 201801 5
Y 2 201802 4
Y 3 201801 2
Z 2 201802 2
And an Old product ID/New Product ID correspondence file:
Old Product ID New Product ID
a X
b Y
c Z
I want to run a script or a command that could verify if the sales are the same for each product/store/week combination in both files. That is:
If a and X designate the same product, then I want to check whether, for a given store and a given week, the sales match in both files.
Note that not all products present in the old sales file are necessarily present in the new sales file.
The output should look like:
Product ID Store ID Week ID Sales Diff
X 1 201801 0
X 2 201801 0
X 2 201802 0
Y 1 201801 2
Y 2 201802 0
Y 3 201801 0
Z 2 201802 0
I'm thinking of either pulling all 3 files into a bunch of pandas data frames and then merging and doing the validation using pandas merge and difference utilities, or pulling the files into some redshift tables and using SQL to validate. But both seem like overkill. Is there a simpler way of doing this using command line/bash utilities?
I'm a fan of the "do it in sql" approach, specifically, sqlite:
#!/bin/sh
oldsales="$1"
newsales="$2"
junction="$3"
# Import into database. Do once and reuse if running repeated reports on the same data
if [ ! -f sales.db ]; then
sqlite3 -batch sales.db <<EOF
CREATE TABLE old_sales(product_id TEXT, store_id INTEGER, week_id INTEGER, sales INTEGER
, PRIMARY KEY(product_id, store_id, week_id)) WITHOUT ROWID;
CREATE TABLE new_sales(product_id TEXT, store_id INTEGER, week_id INTEGER, sales INTEGER
, PRIMARY KEY(product_id, store_id, week_id)) WITHOUT ROWID;
CREATE TABLE mapping(old_id TEXT PRIMARY KEY, new_id TEXT) WITHOUT ROWID;
.mode csv
.separator \t
.import '|tail -n +2 "$oldsales"' old_sales
.import '|tail -n +2 "$newsales"' new_sales
.import '|tail -n +2 "$junction"' mapping
.quit
EOF
fi
# And query it
sqlite3 -batch sales.db <<EOF
.headers on
.mode list
.separator \t
SELECT n.product_id AS "Product ID", n.store_id AS "Store ID", n.week_id AS "Week ID"
, n.sales - o.sales AS "Sales Diff"
FROM old_sales AS o
JOIN mapping AS m ON o.product_id = m.old_id
JOIN new_sales AS n ON m.new_id = n.product_id
AND o.store_id = n.store_id
AND o.week_id = n.week_id
ORDER BY "Product ID", "Store ID", "Week ID";
.quit
EOF
This assumes your data files are delimited by tabs, and produces tab-delimited output (easy to change if desired). It also caches the data in the file sales.db and reuses it if it exists, so you can run the report multiple times on the same data and only populate the database the first time, for efficiency's sake.
$ ./report.sh old_sales.tsv new_sales.tsv product_mappings.tsv
Product ID Store ID Week ID Sales Diff
X 1 201801 0
X 2 201801 0
X 2 201802 0
Y 1 201801 2
Y 2 201802 0
Y 3 201801 0
Z 2 201802 0
Here's a suggestion for your pandas approach. I called your old dataframe old, your new dataframe new, and the correspondence dataframe df3.
First we use the third dataframe as a dictionary to map the old Product IDs to the new ones:
product_id_dct = dict(zip(df3['Old Product ID'], df3['New Product ID']))
old['Product ID'] = old['Product ID'].map(product_id_dct)
print(old)
Product ID Store ID Week ID Sales
0 X 1 201801 5
1 X 2 201801 4
2 X 2 201802 3
3 Y 1 201801 3
4 Y 2 201802 4
5 Y 3 201801 2
6 Z 2 201802 2
Then we do a left merge on the columns you want to compare. Note that all four columns are keys here, so the result is simply the rows of new and a mismatch does not show up as NaN (passing indicator=True to merge would flag unmatched rows in a _merge column):
new.merge(old, on=['Product ID', 'Store ID', 'Week ID', 'Sales'],
suffixes=['_new', '_old'],
how='left')
Product ID Store ID Week ID Sales
0 X 1 201801 5
1 X 2 201801 4
2 X 2 201802 3
3 Y 1 201801 5
4 Y 2 201802 4
5 Y 3 201801 2
6 Z 2 201802 2
If we leave sales out as a key, we can compare more easily because of the suffixes argument:
new.merge(old, on=['Product ID', 'Store ID', 'Week ID'],
suffixes=['_new', '_old'],
how='left')
Product ID Store ID Week ID Sales_new Sales_old
0 X 1 201801 5 5
1 X 2 201801 4 4
2 X 2 201802 3 3
3 Y 1 201801 5 3
4 Y 2 201802 4 4
5 Y 3 201801 2 2
6 Z 2 201802 2 2
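The merge above stops short of the Sales Diff column the question asks for. As a rough, dependency-free sketch of the same map-then-join logic, here is a version using plain dictionaries. The rows are copied from the question, the function name sales_diff is made up for illustration, and it assumes each product/store/week combination appears at most once per file.

```python
# Rows copied from the question: (product_id, store_id, week_id, sales).
old_sales = [
    ("a", 1, 201801, 5), ("a", 2, 201801, 4), ("a", 2, 201802, 3),
    ("b", 1, 201801, 3), ("b", 2, 201802, 4), ("b", 3, 201801, 2),
    ("c", 2, 201802, 2),
]
new_sales = [
    ("X", 1, 201801, 5), ("X", 2, 201801, 4), ("X", 2, 201802, 3),
    ("Y", 1, 201801, 5), ("Y", 2, 201802, 4), ("Y", 3, 201801, 2),
    ("Z", 2, 201802, 2),
]
id_map = {"a": "X", "b": "Y", "c": "Z"}  # old product ID -> new product ID


def sales_diff(old_rows, new_rows, old_to_new):
    """Return (product, store, week, new_sales - old_sales) per matched key."""
    # Index old sales by the *new* product ID so both sides share a key.
    old_by_key = {
        (old_to_new[pid], store, week): sales
        for pid, store, week, sales in old_rows
        if pid in old_to_new
    }
    result = []
    for pid, store, week, sales in new_rows:
        key = (pid, store, week)
        if key in old_by_key:  # skip combinations with no old counterpart
            result.append((pid, store, week, sales - old_by_key[key]))
    return result


diffs = sales_diff(old_sales, new_sales, id_map)
for row in diffs:
    print(*row, sep="\t")
```

In pandas terms this corresponds to subtracting Sales_old from Sales_new after the three-key merge shown above.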
$ cat tst.awk
BEGIN { OFS="\t" }
ARGIND==1 { map[$2] = $1; next }
ARGIND==2 { old[$1,$2,$3] = $4; next }
FNR==1 { gsub(/ +/,OFS); sub(/ *$/,"_Diff"); print; next }
{ print $1, $2, $3, $4 - old[map[$1],$2,$3] }
$ awk -f tst.awk map old new | column -s$'\t' -t
Product ID Store ID Week ID Sales_Diff
X 1 201801 0
X 2 201801 0
X 2 201802 0
Y 1 201801 2
Y 2 201802 0
Y 3 201801 0
Z 2 201802 0
The above uses GNU awk for ARGIND. With other awks just add the line FNR==1 { ARGIND++ } just after the BEGIN line.

mysql sort by number and then varchar

I want to sort data like the following in MySQL:
unsorted data
aa
1
2
f
11
3
df
10
Sorted data (desired output)
1
2
3
10
11
aa
df
f
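The desired order amounts to a two-level sort key: numeric values first, compared as numbers, then the remaining values compared as text. Here is a minimal Python sketch of that rule. It is not MySQL syntax, it assumes the numeric entries are non-negative integers stored as strings, and the name numbers_then_text is made up for illustration.

```python
values = ["aa", "1", "2", "f", "11", "3", "df", "10"]


def numbers_then_text(value):
    # Numeric strings get flag 0 and are compared as integers;
    # everything else gets flag 1 and is compared as text.
    if value.isdigit():
        return (0, int(value), "")
    return (1, 0, value)


ordered = sorted(values, key=numbers_then_text)
print(ordered)
```

The same idea, a flag that separates numeric from non-numeric values followed by a numeric comparison and then a text comparison, is what an ORDER BY clause would need to express.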

Json list not 'flattening' properly

I have the below list in a column of a data frame.
As you can see, the variables change from item to item: the affiliations column is not always present.
I have been trying to flatten the list to a data frame or to a list of 3, but I am getting a single column with all elements of every column.
Is there a way I can tell R that each element has 3 columns, that the first one is not always present, and to fill it with, let's say, NULL when it is missing?
[[1]]
NULL
[[2]]
affiliations author_id author_name
1 Punjabi University 780E3459 munish puri
2 Punjabi University 48D92C79 rajesh dhaliwal
3 Punjabi University 7D9BD37C r s singh
[[3]]
author_id author_name
1 7FF872BC barbara eileen ryan
[[4]]
author_id author_name
1 0299B8E9 fraser j harbutt
[[5]]
author_id author_name
1 7DAB7B72 richard m freeland
[[6]]
NULL
This is what I'm getting when I try and flatten it.
authors
1 Punjabi University
2 Punjabi University
3 Punjabi University
4 780E3459
5 48D92C79
6 7D9BD37C
7 munish puri
8 rajesh dhaliwal
9 r s singh
10 7FF872BC
But what I really need would be:
[[1]] NULL
[[2]]affiliations author_id author_name
1 Punjabi University 780E3459 munish puri
2 Punjabi University 48D92C79 rajesh dhaliwal
3 Punjabi University 7D9BD37C r s singh
[[3]] NULL author_id author_name
1 NULL 7FF872BC barbara eileen ryan
If I understand you correctly, you have data as follows:
require(tidyverse)
list(
NULL,
tibble(a=c(2, 2), b=c(2, 2), c=c(2, 2)),
tibble(b=3, c=3)
)
So:
[[1]]
NULL
[[2]]
# A tibble: 2 x 3
a b c
<dbl> <dbl> <dbl>
1 2 2 2
2 2 2 2
[[3]]
# A tibble: 1 x 2
b c
<dbl> <dbl>
1 3 3
Using bind_rows results in:
bind_rows(list(
NULL,
tibble(a=c(2, 2), b=c(2, 2), c=c(2, 2)),
tibble(b=3, c=3)
))
# A tibble: 3 x 3
a b c
<dbl> <dbl> <dbl>
1 2 2 2
2 2 2 2
3 NA 3 3
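What bind_rows does here is take the union of the column names, skip NULL elements, and fill any column a piece lacks with NA. A small Python sketch of that behaviour follows, purely as an illustration of the logic. The data is a hypothetical stand-in for the list column in the question, and this bind_rows function is written for the sketch; it is not the dplyr implementation.

```python
# Hypothetical stand-in for the list column: each element is either None
# or a list of row dictionaries, and "affiliations" may be missing.
authors = [
    None,
    [
        {"affiliations": "Punjabi University", "author_id": "780E3459",
         "author_name": "munish puri"},
        {"affiliations": "Punjabi University", "author_id": "48D92C79",
         "author_name": "rajesh dhaliwal"},
    ],
    [{"author_id": "7FF872BC", "author_name": "barbara eileen ryan"}],
    None,
]
COLUMNS = ("affiliations", "author_id", "author_name")


def bind_rows(groups, columns):
    """Stack all records into one table, padding missing columns with None."""
    rows = []
    for group in groups:
        for record in group or []:  # a None element contributes no rows
            # dict.get returns None for a column the record lacks.
            rows.append(tuple(record.get(col) for col in columns))
    return rows


table = bind_rows(authors, COLUMNS)
for row in table:
    print(row)
```

The row for barbara eileen ryan ends up with None in the affiliations position, which mirrors the NA in the tibble above.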

How to extract strings from rows which have .json like format?

I have imported a .json file using library(jsonlite) and stream_in(file(".json")).
However, one of the columns still looks like JSON.
I'm not really sure how to proceed in order to extract the ID and email fields from the json column.
My example:
date <- as.Date(as.character( c("2015-02-13",
"2015-02-14",
"2015-02-14")))
ID <- c(1,2,3)
name <- c("John","Michael","Thomas")
drinks <- c("Beer","Coffee","Tee")
consumed <- c(2,5,3)
john<- "{\"employeID\":\"1\",\"other_details\":{\"email\":\"john#gmx.com\"},\"computer\":\"yes\"}"
michael<- "{\"employeID\":\"2\",\"other_details\":{\"email\":\"michael#yahoo.com\"},\"computer\":\"yes\"}"
thomas<- "{\"employeID\":\"3\",\"other_details\":{\"email\":\"thomas#gmail.com\"},\"computer\":\"yes\"}"
json <- c(john,michael,thomas)
df <- data.frame(date,ID,name,drinks,consumed,json)
Where the data.frame looks like that:
I would like to get the following format:
date ID name drinks consumed email computer
#1 2015-02-13 1 John Beer 2 john#gmx.com yes
#2 2015-02-14 2 Michael Coffee 5 michael#yahoo.com no
#3 2015-02-14 3 Thomas Tee 3 thomas#gmail.com yes
What I have tried so far is using library(jsonlite) again in different variations, but it always results in:
fromJSON(df$json[1])
Error: Argument 'txt' must be a JSON string, URL or file.
How can I extract these fields properly?
df$json is a factor vector while fromJSON only accepts a JSON string, URL or file. You can try
fromJSON(as.character(df$json[1]))
or add stringsAsFactors=FALSE when you create df.
To do your task, you can try:
library(tidyverse)
df %>%
filter(json != "{}") %>% # Drop rows with json == "{}"
rowwise() %>%
do(data.frame(ID = .$ID, jsonlite::fromJSON(.$json), stringsAsFactors=FALSE)) %>%
merge(df %>% select(-json), by="ID", all.y=TRUE)
Output:
ID employeID email computer date name drinks consumed
1 1 1 john#gmx.com yes 2015-02-13 John Beer 2
2 2 2 michael#yahoo.com yes 2015-02-14 Michael Coffee 5
3 3 3 thomas#gmail.com yes 2015-02-14 Thomas Tee 3
It can handle cases with "{}" in json column.
df2 <- df %>%
rbind(data.frame(date="2015-02-14", ID=4, name="Kitman",
drinks="Chocolate", consumed=1, json="{}"))
df2 %>%
filter(json != "{}") %>%
rowwise() %>%
do(data.frame(ID = .$ID, jsonlite::fromJSON(.$json), stringsAsFactors=FALSE)) %>%
merge(df2 %>% select(-json), by="ID", all.y=TRUE)
Output:
ID employeID email computer date name drinks consumed
1 1 1 john#gmx.com yes 2015-02-13 John Beer 2
2 2 2 michael#yahoo.com yes 2015-02-14 Michael Coffee 5
3 3 3 thomas#gmail.com yes 2015-02-14 Thomas Tee 3
4 4 <NA> <NA> <NA> 2015-02-14 Kitman Chocolate 1
Outdated:
cbind(
df %>% select(-json),
df$json %>%
map(~as.data.frame(jsonlite::fromJSON(.))) %>%
do.call("rbind", .)
)
Output:
date ID name drinks consumed employeID email computer
1 2015-02-13 1 John Beer 2 1 john#gmx.com yes
2 2015-02-14 2 Michael Coffee 5 2 michael#yahoo.com yes
3 2015-02-14 3 Thomas Tee 3 3 thomas#gmail.com yes
First, try:
ndjson::stream_in("filename.json")
The ndjson package is faster than jsonlite and was built for flattening (it's very task-specific and not as swiss-army-knife-ish as the highly useful jsonlite pkg).
Or, we can keep the tidyverse idioms all the way through:
library(tidyverse)
map_df(df$json, ~jsonlite::fromJSON(as.character(.))) %>%
bind_cols(select(df, -json)) %>%
mutate_if(is.factor, as.character) %>%
mutate_if(is.list, as.character) %>%
select(ID, name, drinks, consumed, everything())
## # A tibble: 3 × 8
## ID name drinks consumed computer employeID other_details.email date
## <dbl> <chr> <chr> <dbl> <chr> <chr> <chr> <date>
## 1 1 John Beer 2 yes 1 john#gmx.com 2015-02-13
## 2 2 Michael Coffee 5 yes 2 michael#yahoo.com 2015-02-14
## 3 3 Thomas Tee 3 yes 3 thomas#gmail.com 2015-02-14
And, you get your character columns.
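The other_details.email column name above comes from flattening the nested object into dotted keys. Here is a minimal Python sketch of that flattening step, using the three JSON strings from the question. The helper name flatten is made up for this illustration, and it assumes the nesting consists of JSON objects only (no arrays).

```python
import json

# The three JSON strings from the question's json column.
records = [
    '{"employeID":"1","other_details":{"email":"john#gmx.com"},"computer":"yes"}',
    '{"employeID":"2","other_details":{"email":"michael#yahoo.com"},"computer":"yes"}',
    '{"employeID":"3","other_details":{"email":"thomas#gmail.com"},"computer":"yes"}',
]


def flatten(obj, prefix=""):
    """Turn nested dictionaries into one level with dotted key names."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse, carrying the parent key as a dotted prefix.
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


parsed = [flatten(json.loads(text)) for text in records]
emails = [row["other_details.email"] for row in parsed]
print(emails)
```

Each flattened dictionary can then be joined back to the other columns by position or by employeID, as the R answers do with bind_cols and merge.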