How to convert a JSON column to a list in R

I have a data set like this:
genres
[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]
[{"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 28, "name": "Action"}]
[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 80, "name": "Crime"}]
[{"id": 28, "name": "Action"}, {"id": 80, "name": "Crime"}, {"id": 18, "name": "Drama"}, {"id": 53, "name": "Thriller"}]
[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 878, "name": "Science Fiction"}]
[{"id": 14, "name": "Fantasy"}, {"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]
[{"id": 16, "name": "Animation"}, {"id": 10751, "name": "Family"}]
Now I want to populate a drop-down list in a Shiny application, so I want to convert the JSON column to a data table. I tried the apply functions but did not get the desired result. Can someone please help me out?
Code:
lapply(dt, fromJSON(dt$genres))

I am new to R, so this may not be the best solution, but here is what I found to work in my case. Please let me know if you have any feedback.
library(jsonlite)
genre <- raw$genres
# Parse each JSON string into a data frame with columns id and name
genreList <- lapply(genre, fromJSON)
# Drop entries that did not parse to a non-empty data frame (for example an empty array)
genreList <- Filter(function(x) is.data.frame(x) && nrow(x) > 0, genreList)
# Full outer merge of all data frames, then keep the distinct genres
finalGenre <- unique(Reduce(function(...) merge(..., all = TRUE), genreList))
And the output:
id name
1 28 Action
2 53 Thriller
3 10769 Foreign
4 12 Adventure
5 10751 Family
6 878 Science Fiction
7 27 Horror
8 16 Animation
9 80 Crime
10 35 Comedy
11 10770 TV Movie
12 18 Drama
13 99 Documentary
14 10752 War
15 10402 Music
16 10749 Romance
17 14 Fantasy
18 37 Western
19 36 History
20 9648 Mystery
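For comparison, the same idea (parse every JSON string, then keep each genre once) can be sketched outside R. This is a minimal Python illustration, not the poster's method: the rows are shortened from the question, the empty '[]' entry and the helper name unique_genres are assumptions made for the example.

```python
import json

# Rows shortened from the question's genres column
rows = [
    '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
    '[{"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}]',
    '[]',  # hypothetical empty entry, to show it is handled
    '[{"id": 16, "name": "Animation"}, {"id": 10751, "name": "Family"}]',
]

def unique_genres(json_rows):
    """Return {id: name} for every genre seen, in first-seen order."""
    seen = {}
    for text in json_rows:
        for genre in json.loads(text):
            # setdefault keeps the first name recorded for each id
            seen.setdefault(genre["id"], genre["name"])
    return seen

genres = unique_genres(rows)
print(genres)
```

The resulting mapping of id to name is the kind of structure a drop-down list typically needs: one entry per distinct genre.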

Related

Capture instances of a column value where it's seen more than once

I have the following data frame:
id_1 id_2 id_3 id_4 id_5
0133 11 kelly AA-1 1
2119 22 Wade AA-2 1
3903 33 John BB-1 1
3903 33 John BB-2 1
3903 33 John BB-3 1
5133 44 Emily C-1 1
9148 99 Pete BB-34 1
9148 99 Pete BB-23 1
2910 111 Mark DD-3 1
I want to iterate through it and capture any instance where the same id_1 value occurs more than once.
I only want to capture columns id_4 and id_5 for each occurrence. This will ultimately be added to a JSON object, so the end result would be:
{"id_1": "0133", "id_2": "11", "id_3": "kelly", "items": [{"id_4":"AA-1", "id_5":"1"}]}
{"id_1": "2119", "id_2": "22", "id_3": "Wade", "items": [{"id_4":"AA-2", "id_5":"1"}]}
{"id_1": "3903", "id_2": "33", "id_3": "John", "items": [{"id_4":"BB-1", "id_5":"1"}, {"id_4":"BB-2", "id_5":"1"}, {"id_4":"BB-3", "id_5":"1"}]}
{"id_1": "5133", "id_2": "44", "id_3": "Emily", "items": [{"id_4":"C-1", "id_5":"1"}]}
{"id_1": "9148", "id_2": "99", "id_3": "Pete", "items": [{"id_4":"BB-34", "id_5":"1"}, {"id_4":"BB-23", "id_5":"1"}]}
{"id_1": "2910", "id_2": "111", "id_3": "Mark", "items": [{"id_4":"DD-3", "id_5":"1"}]}
Would anyone know the best approach to accomplish something like this? Any insight is greatly appreciated.
You can try something like:
df.groupby('id_1').nth(1).reset_index()[['id_4','id_5']]
Then you can convert it to JSON.
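If the goal is the nested structure from the question (one object per id_1, with every id_4/id_5 pair collected under "items"), a plain-Python grouping pass is one way to get there. This is a minimal sketch rather than the answer's approach: the helper nest_items is hypothetical, only a few of the question's rows are included, and it assumes every column was read as a string. With pandas, the input records could come from df.to_dict("records").

```python
import json

# A few rows from the question as plain dicts (all values kept as strings)
records = [
    {"id_1": "0133", "id_2": "11", "id_3": "kelly", "id_4": "AA-1", "id_5": "1"},
    {"id_1": "3903", "id_2": "33", "id_3": "John", "id_4": "BB-1", "id_5": "1"},
    {"id_1": "3903", "id_2": "33", "id_3": "John", "id_4": "BB-2", "id_5": "1"},
    {"id_1": "9148", "id_2": "99", "id_3": "Pete", "id_4": "BB-34", "id_5": "1"},
]

def nest_items(rows, key_cols=("id_1", "id_2", "id_3"), item_cols=("id_4", "id_5")):
    """Group rows by key_cols and collect item_cols of each row under "items"."""
    grouped = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in grouped:
            entry = dict(zip(key_cols, key))
            entry["items"] = []
            grouped[key] = entry
        grouped[key]["items"].append({c: row[c] for c in item_cols})
    # dicts preserve insertion order, so groups come out in first-seen order
    return list(grouped.values())

nested = nest_items(records)
for obj in nested:
    print(json.dumps(obj))
```

Each printed line is one JSON object per id_1 group, so repeated ids such as 3903 end up with several entries in "items".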

Adding more data in array of object in PostgreSQL

I have a table cart with two columns (user_num, data).
user_num holds the user's phone number, and
data holds an array of objects like [{ "id": 1, "quantity": 1 }, { "id": 2, "quantity": 2 }, { "id": 3, "quantity": 3 }], where id is the product id.
user_num | data
----------+--------------------------------------------------------------------------------------
1 | [{ "id": 1, "quantity": 1 }, { "id": 2, "quantity": 2 }, { "id": 3, "quantity": 3 }]
I want to add more products to the above array of objects in PostgreSQL.
Thanks!
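To add the values, use the jsonb concatenation operator ||. This assumes the data column is of type jsonb; the operator is not defined for the json type, so a json column would need a cast first.
update test
set data = data || '[{"id": 4, "quantity": 4}, {"id": 5, "quantity": 5}]'
where user_num = 1;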
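When the new items come from application code, the right-hand side of || can be serialized with json.dumps and passed as a bound parameter instead of being pasted into the SQL text. The sketch below only builds the statement and its parameters and does not connect to a database. The helper build_append_query is hypothetical, the table name test is taken from the statement above, and the %s placeholder style assumes a psycopg-like driver.

```python
import json

def build_append_query(user_num, new_items):
    """Return (sql, params) that append new_items to a jsonb array column."""
    # %s placeholders are an assumption (psycopg-style parameter binding)
    sql = "update test set data = data || %s::jsonb where user_num = %s"
    params = (json.dumps(new_items), user_num)
    return sql, params

sql, params = build_append_query(
    1, [{"id": 4, "quantity": 4}, {"id": 5, "quantity": 5}]
)
print(sql)
print(params)
```

With a real driver, sql and params would be handed to the cursor's execute call together, which keeps the JSON payload out of the query string.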

How to make pgsql return the json array

Hi everyone, I am having trouble converting data into a JSON object. There is a table called milestone with the following data:
id name parentId
a test1 A
b test2 B
c test3 C
I want to convert the result into a json type in Postgres:
[{"id": "a", "name": "test1", "parentId": "A"}]
[{"id": "b", "name": "test2", "parentId": "B"}]
[{"id": "c", "name": "test3", "parentId": "C"}]
If anyone knows how to handle this, please let me know. Thanks, all.
You can get each row of the table as simple json object with to_jsonb():
select to_jsonb(m)
from milestone m
to_jsonb
-----------------------------------------------
{"id": "a", "name": "test1", "parentid": "A"}
{"id": "b", "name": "test2", "parentid": "B"}
{"id": "c", "name": "test3", "parentid": "C"}
(3 rows)
If you want to get a single element array for each row, use jsonb_build_array():
select jsonb_build_array(to_jsonb(m))
from milestone m
jsonb_build_array
-------------------------------------------------
[{"id": "a", "name": "test1", "parentid": "A"}]
[{"id": "b", "name": "test2", "parentid": "B"}]
[{"id": "c", "name": "test3", "parentid": "C"}]
(3 rows)
You can also get all rows as a json array with jsonb_agg():
select jsonb_agg(to_jsonb(m))
from milestone m
jsonb_agg
-----------------------------------------------------------------------------------------------------------------------------------------------
[{"id": "a", "name": "test1", "parentid": "A"}, {"id": "b", "name": "test2", "parentid": "B"}, {"id": "c", "name": "test3", "parentid": "C"}]
(1 row)
Read about JSON Functions and Operators in the documentation.
You can use ROW_TO_JSON.
From the documentation:
Returns the row as a JSON object. Line feeds will be added between
level-1 elements if pretty_bool is true.
For the query:
select row_to_json(tbl)
from (select * from tbl) as tbl;

Unknown duplicates from querying a nested JSON

I would like to do text search in a JSON object in a table.
I have a table called Audio that is structured like below:
id| keyword | transcript | user_id | company_id | client_id
-----------------------------------------------------------
This is the JSON data structure of transcript:
{"transcript": [
{"duration": 2390.0,
"interval": [140.0, 2530.0],
"speaker": "Speaker_2",
"words": [
{"p": 0, "s": 0, "e": 320, "c": 0.545, "w": "This"},
{"p": 1, "s": 320, "e": 620, "c": 0.825, "w": "call"},
{"p": 2, "s": 620, "e": 780, "c": 0.909, "w": "is"},
{"p": 3, "s": 780, "e": 1010, "c": 0.853, "w": "being"},
{"p": 4, "s": 1010, "e": 1250, "c": 0.814, "w": "recorded"}
]
},
{"duration": 4360.0,
"interval": [3280.0, 7640.0],
"speaker": "Speaker_1",
"words": [
{"p": 5, "s": 5000, "e": 5020, "c": 0.079, "w": "as"},
{"p": 6, "s": 5020, "e": 5100, "c": 0.238, "w": "a"},
{"p": 7, "s": 5100, "e": 5409, "c": 0.689, "w": "group"},
{"p": 8, "s": 5410, "e": 5590, "c": 0.802, "w": "called"},
{"p": 9, "s": 5590, "e": 5870, "c": 0.834, "w": "tricks"}
]
},
...
}
What I am trying to do is to do a text search in the "w" field within "words". This is the query that I tried to run:
WITH info_data AS (
SELECT transcript_info->'words' AS info
FROM Audio t, json_array_elements(transcript->'transcript') AS transcript_info)
SELECT info_item->>'w', id
FROM Audio, info_data idata, json_array_elements(idata.info) AS info_item
WHERE info_item->>'w' ilike '%this';
Right now only four rows have data and the fifth row is null; there are five rows in total. However, I got the following result, where even the row that doesn't have data produces output:
?column? | id
----------+----
This | 2
This | 5
This | 1
This | 3
This | 4
This | 2
This | 5
I would love to know what the problem with my query is and whether there is a more efficient way of doing this.
The problem is that you make a cartesian join between table Audio on the one hand and info_data and info_item on the other hand (there is an implicit lateral join between these latter two) here:
FROM Audio, info_data idata, json_array_elements(idata.info) AS info_item
You can solve this by adding Audio.id to the CTE and then adding WHERE Audio.id = info_data.id.
It is doubtful that this is the most efficient solution (CTEs rarely are). If you just want to get those rows where the word "this" is a word in the transcript, then you are most likely better off like this:
SELECT DISTINCT id
FROM (
SELECT id, transcript_info->'words' AS info
FROM Audio, json_array_elements(transcript->'transcript') AS transcript_info) AS t,
json_array_elements(info) AS words
WHERE words->>'w' ILIKE 'this';
Note that the leading % in the pattern string is very inefficient. Since very few English words other than "this" end in "this", I have taken the liberty of removing it.
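The duplication is easier to see outside SQL. The following Python sketch is an illustration only: the audio dictionary is a made-up stand-in for the Audio table (id mapped to parsed transcript JSON), and words_of is a hypothetical helper. The first comprehension pairs every matching word with every id, as the original FROM clause does; the second keeps each word attached to the row it came from.

```python
# Hypothetical stand-in for the Audio table: id -> parsed transcript JSON
audio = {
    1: {"transcript": [{"words": [{"w": "This"}, {"w": "call"}]}]},
    2: {"transcript": [{"words": [{"w": "as"}, {"w": "a"}]},
                       {"words": [{"w": "this"}, {"w": "group"}]}]},
    3: {"transcript": []},  # a row with no words at all
}

def words_of(doc):
    """Yield every "w" value nested under transcript -> words."""
    for segment in doc["transcript"]:
        for word in segment["words"]:
            yield word["w"]

# Cross join: each matching word is paired with every id in the table,
# which is what listing Audio next to the uncorrelated CTE does.
cross = [(w, row_id)
         for row_id in audio
         for doc in audio.values()
         for w in words_of(doc)
         if w.lower() == "this"]

# Correlated: the id stays attached to the document the word came from.
matched_ids = sorted({row_id
                      for row_id, doc in audio.items()
                      for w in words_of(doc)
                      if w.lower() == "this"})

print(len(cross), matched_ids)
```

In the cross-joined list, id 3 shows up even though its transcript is empty, which mirrors the unexpected rows in the question's output.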

converting R dataframes to json object

Say I have the following dataframes:
df1 <- data.frame(Name = c("Harry","George"), color=c("#EA0001", "#EEEEEE"))
Name color
1 Harry #EA0001
2 George #EEEEEE
df.details <- data.frame(Name = c(rep("Harry",each=3), rep("George", each=3)),
age=21:23,
total=c(14,19,24,1,9,4)
)
Name age total
1 Harry 21 14
2 Harry 22 19
3 Harry 23 24
4 George 21 1
5 George 22 9
6 George 23 4
I know how to convert each df to json like this:
library(jsonlite)
toJSON(df.details)
[{"Name":"Harry","age":21,"total":14},{"Name":"Harry","age":22,"total":19},{"Name":"Harry","age":23,"total":24},{"Name":"George","age":21,"total":1},{"Name":"George","age":22,"total":9},{"Name":"George","age":23,"total":4}]
However, I am looking to get the following structure to my JSON data:
{
"myjsondata": [
{
"Name": "Harry",
"color": "#EA0001",
"details": [
{
"age": 21,
"total": 14
},
{
"age": 22,
"total": 19
},
{
"age": 23,
"total": 24
}
]
},
{
"Name": "George",
"color": "#EEEEEE",
"details": [
{
"age": 21,
"total": 1
},
{
"age": 22,
"total": 9
},
{
"age": 23,
"total": 4
}
]
}
]
}
I think the answer may be in how I store the data in a list in R before converting, but I am not sure.
Try this format:
df1$details <- split(df.details[-1], df.details$Name)[df1$Name]
df1
# Name color details
#1 Harry #EA0001 21, 22, 23, 14, 19, 24
#2 George #EEEEEE 21, 22, 23, 1, 9, 4
toJSON(df1)
#[{
#"Name":"Harry",
#"color":"#EA0001",
#"details":[
# {"age":21,"total":14},
# {"age":22,"total":19},
# {"age":23,"total":24}]},
#{
#"Name":"George",
#"color":"#EEEEEE",
#"details":[
# {"age":21,"total":1},
# {"age":22,"total":9},
# {"age":23,"total":4}]}
#]