Cannot read JSON in pandas

Cannot read JSON in pandas - json

I have a nested JSON. I want it to be read in pandas in order to explore it, but I got errors. When to use read_json method, I got: "Trailing data". It is valid JSON. How to read it in pd? (Tried differently, but did not work). It looks like this:
{
"contributors": null,
"coordinates": null,
"created_at": "Fri May 26 08:54:00 +0000 2017",
"entities": {
"hashtags": [],
"media": [
{
"display_url": "pic.twitter.com/Pm28ORTePl",
"expanded_url": "",
"id": 868027417121751040,
"id_str": "868027417121751040",
"indices": [
94,
117
],
"media_url": "",
"sizes": {
"large": {
"h": 404,
"resize": "fit",
"w": 773
},
"medium": {
"h": 404,
"resize": "fit",
"w": 773
},
"small": {
"h": 355,
"resize": "fit",
"w": 680
},
"thumb": {
"h": 150,
"resize": "crop",
"w": 150
}
},
"type": "photo",
"url": ""
}
],
"symbols": [],
"urls": [
{
"display_url": "",
"expanded_url": "",
"indices": [
70,
93
],
"url": ""
}
],
"user_mentions": []
},
"extended_entities": {
"media": [
{
"display_url": "pic.twitter.com/Pm28ORTePl",
"expanded_url": "1",
"id": 868027417121751040,
"id_str": "868027417121751040",
"indices": [
94,
117
],
"media_url": "",
"media_url_https": "",
"sizes": {
"large": {
"h": 404,
"resize": "fit",
"w": 773
},
"medium": {
"h": 404,
"resize": "fit",
"w": 773
},
"small": {
"h": 355,
"resize": "fit",
"w": 680
},
"thumb": {
"h": 150,
"resize": "crop",
"w": 150
}
},
"type": "photo",
"url": ""
}
]
},
"favorite_count": 1,
"favorited": false,
"geo": null,
"id": 868027425757724672,
"id_str": "868027425757724672",
"in_reply_to_screen_name": null,
"in_reply_to_status_id": null,
"in_reply_to_status_id_str": null,
"in_reply_to_user_id": null,
"in_reply_to_user_id_str": null,
"is_quote_status": false,
"lang": "ru",
"place": null,
"possibly_sensitive": false,
"retweet_count": 0,
"retweeted": false,
"source": "Twitter Web Client",
"text": "\u041f\u0440\u043e\u043f\u0430\u0432\u0448\u0430\u044f \u0432 \u041a\u043e\u043a\u0448\u0435\u0442\u0430\u0443 \u0448\u043a\u043e\u043b\u044c\u043d\u0438\u0446\u0430 \u0436\u0438\u043b\u0430 \u0432 \u0437\u0430\u0431\u0440\u043e\u0448\u0435\u043d\u043d\u043e\u043c \u0434\u043e\u043c\u0435 \u0438 \u0431\u0440\u043e\u0434\u044f\u0436\u043d\u0438\u0447\u0430\u043b\u0430\n",
"truncated": false,
"user": {
"contributors_enabled": false,
"created_at": "Wed May 18 11:59:50 +0000 2011",
"default_profile": true,
"default_profile_image": false,
"description": "\u041a\u0430\u0437\u0430\u0445\u0441\u0442\u0430\u043d\u0441\u043a\u0438\u0439 \u0438\u043d\u0442\u0435\u0440\u043d\u0435\u0442-\u043f\u043e\u0440\u0442\u0430\u043b",
"entities": {
"description": {
"urls": []
},
"url": {
"urls": [
{
"display_url": "",
"expanded_url": "",
"indices": [
0,
22
],
"url": ""
}
]
}
},
"favourites_count": 87,
"follow_request_sent": false,
"followers_count": 17989,
"following": true,
"friends_count": 98,
"geo_enabled": true,
"has_extended_profile": false,
"id": 300811189,
"id_str": "300811189",
"is_translation_enabled": false,
"is_translator": false,
"lang": "ru",
"listed_count": 86,
"location": "\u0410\u043b\u043c\u0430\u0442\u044b",
"name": "",
"notifications": false,
"profile_background_color": "C0DEED",
"profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
"profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
"profile_background_tile": false,
"profile_banner_url": "https://pbs.twimg.com/profile_banners/300811189/1489117916",
"profile_image_url": "http://pbs.twimg.com/profile_images/840047424882298881/NxZSyfhM_normal.jpg",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/840047424882298881/NxZSyfhM_normal.jpg",
"profile_link_color": "1DA1F2",
"profile_sidebar_border_color": "C0DEED",
"profile_sidebar_fill_color": "DDEEF6",
"profile_text_color": "333333",
"profile_use_background_image": true,
"protected": false,
"screen_name": "",
"statuses_count": 53011,
"time_zone": "Quito",
"translator_type": "none",
"url": "",
"utc_offset": -18000,
"verified": false
}
}

Sorry, but your JSON is actually not valid, despite your saying it is.
This line:
"media_url": "": "",
Should probably be:
"media_url": "",
At which point, when I added the final bracket } that was outside of your code block, validated as properly formed JSON.

Related

Nested Json data to postgresql using python

How can i import my json file to a postgresql using python my data will look likes
{
"blocked_by": false,
"blocking": false,
"contributors_enabled": false,
"created_at": "Thu Dec 17 06:32:35 +0000 2020",
"default_profile": true,
"default_profile_image": false,
"description": "seo",
"entities": {
"description": {
"urls": []
}
},
"favourites_count": 12,
"follow_request_sent": false,
"followers_count": 56,
"following": false,
"friends_count": 1344,
"geo_enabled": false,
"has_extended_profile": true,
"id": 1339458394508374018,
"id_str": "1339458394508374018",
"is_translation_enabled": false,
"is_translator": false,
"lang": null,
"listed_count": 3,
"live_following": false,
"location": "",
"muting": false,
"name": "katheryn myle",
"notifications": false,
"profile_background_color": "F5F8FA",
"profile_background_image_url": null,
"profile_background_image_url_https": null,
"profile_background_tile": false,
"profile_image_url": "http://pbs.twimg.com/profile_images/1343124937389842432/30cfUmGe_normal.jpg",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/1343124937389842432/30cfUmGe_normal.jpg",
"profile_link_color": "1DA1F2",
"profile_sidebar_border_color": "C0DEED",
"profile_sidebar_fill_color": "DDEEF6",
"profile_text_color": "333333",
"profile_use_background_image": true,
"protected": false,
"screen_name": "KatherynMyle",
"status": {
"contributors": null,
"coordinates": null,
"created_at": "Tue Apr 05 10:17:04 +0000 2022",
"entities": {
"hashtags": [],
"symbols": [],
"urls": [
{
"display_url": "twitter.com/i/web/status/1\u2026",
"expanded_url": "https://twitter.com/i/web/status/1511286791319564293",
"indices": [
117,
140
],
"url": "shortened url"
}
],
"user_mentions": [
{
"id": 2247373052,
"id_str": "2247373052",
"indices": [
0,
12
],
"name": "GDevelop",
"screen_name": "GDevelopApp"
}
]
},
"favorite_count": 0,
"favorited": false,
"geo": null,
"id": 1511286791319564293,
"id_str": "1511286791319564293",
"in_reply_to_screen_name": "GDevelopApp",
"in_reply_to_status_id": null,
"in_reply_to_status_id_str": null,
"in_reply_to_user_id": 2247373052,
"in_reply_to_user_id_str": "2247373052",
"is_quote_status": false,
"lang": "en",
"place": null,
"retweet_count": 0,
"retweeted": false,
"source": "Twitter Web App",
"text": "#GDevelopApp Hello\nHow are you fine I am a link builder and we are selling links and posts on high-quality sites if\u2026 ",
"truncated": true
},
"statuses_count": 382,
"time_zone": null,
"translator_type": "none",
"url": null,
"utc_offset": null,
"verified": false,
"withheld_in_countries": []
I want to create a table with it have ing the index or key name as the column name and also i want only some index data from it not all (like utc _offset,place,etc.).so i want to insert the specific data i wanted to postgresql
https://drive.google.com/file/d/1OkJFfHyU4Eb-V3mrU2qJ_jwBaQQkunJa/view?usp=drivesdk this is the json file

Your question is very vague and I can't come up with a short answer except this one:
pSQL does support JSON as a data type:
https://www.postgresql.org/docs/current/datatype-json.html
https://www.postgresql.org/docs/current/functions-json.html
You could just store the whole json as a single json element and select/convert/whatever within postgre later on.
How to insert JSONB into Postgresql with Python?

json jq request format

I need help to extract data from this JSON file using jq.
The app flv then verify the streamname mystrame is active and extract meta.video.height
I did try lot of queries without success and my jq knowledge is poor.
{
"port": 1935,
"server_index": 0,
"applications": [{
"name": "hls",
"live": {
"streams": [{
"name": "donbosco",
"time": 2380739,
"bw_in": 2112440,
"bytes_in": 541618713,
"bw_out": 0,
"bytes_out": 0,
"bw_audio": 35544,
"bw_video": 2076888,
"clients": [{
"id": 453,
"address": "127.0.0.1",
"time": 2380959,
"flashver": "FMLE/3.0 (compatible; Lavf57.83.100)",
"dropped": 0,
"avsync": 28,
"timestamp": 2382635,
"publishing": true,
"active": true
}],
"records": [],
"meta": {
"video": {
"width": 1168,
"height": 720,
"frame_rate": 25,
"codec": "H264",
"profile": "High",
"level": 3.1
},
"audio": {
"codec": "AAC",
"profile": "LC",
"channels": 2,
"sample_rate": 16000
}
},
"nclients": 1,
"publishing": true,
"active": true
}],
"nclients": 1
},
"recorders": {
"count": 0,
"lists": []
}
},
{
"name": "flv",
"live": {
"streams": [{
"name": "mystream",
"time": 2382811,
"bw_in": 2059096,
"bytes_in": 541841549,
"bw_out": 2059096,
"bytes_out": 543351459,
"bw_audio": 35472,
"bw_video": 2023624,
"clients": [{
"id": 452,
"address": "127.0.0.1",
"time": 2382727,
"flashver": "LNX 9,0,124,2",
"dropped": 0,
"avsync": -12,
"timestamp": 2384520,
"publishing": false,
"active": true
},
{
"id": 451,
"address": "127.0.0.1",
"time": 2383031,
"flashver": "FMLE/3.0 (compatible; Lavf58.74.100)",
"dropped": 0,
"avsync": -12,
"timestamp": 2384520,
"publishing": true,
"active": true
}
],
"records": [],
"meta": {
"video": {
"width": 1168,
"height": 720,
"frame_rate": 25,
"codec": "H264",
"profile": "High",
"level": 3.1
},
"audio": {
"codec": "AAC",
"profile": "LC",
"channels": 2,
"sample_rate": 16000
}
},
"nclients": 2,
"publishing": true,
"active": true
}],
"nclients": 2
},
"recorders": {
"count": 0,
"lists": []
}
}
]
}

Are you asking for help with the command-line JSON processor jq or the JavaScript library jQuery? They are not the same. For jq, try
jq '
.applications
| map(select(.name == "flv"))[].live.streams
| map(select(.name == "mystream" and .active))[].meta.video.height
'
720
Demo

Ionic Json nested

I'm new in Ionic development. I'm having a problem retrieving JSON.
"Failed to load https://api.wh.geniussports.com/v1/basketball/competitions/19816/matcheslive?ak=eebd8ae256142ac3fd24bd2003d28782: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://localhost:8100' is therefore not allowed access."
Here's my json format
{
"response": {
"meta": {
"version": 1,
"code": 200,
"status": "success",
"request": "http://api.wh.geniussports.com/v1/basketball/competitions/19816/matcheslive?ak=eebd8ae256142ac3fd24bd2003d28782",
"time": 1528177532,
"count": 10,
"limit": 10
},
"data": [
{
"leagueId": 6,
"matchId": 784470,
"competitionId": 19816,
"venueId": 17386,
"poolNumber": 0,
"roundNumber": "1",
"roundDescription": "",
"matchNumber": 1,
"matchStatus": "COMPLETE",
"matchName": "",
"phaseName": "",
"extraPeriodsUsed": 0,
"matchTime": "2017-11-17 19:30:00",
"matchTimeUTC": "2017-11-17 11:30:00",
"enddate": null,
"timeActual": "2017-11-17 19:45:37",
"timeEndActual": "2017-11-17 21:12:10",
"durationActual": 87,
"temperature": 0,
"attendance": 0,
"duration": 120,
"weather": "",
"twitterHashtag": "",
"liveStream": 1,
"matchType": "REGULAR",
"keywords": "",
"ticketURL": "",
"externalId": "920",
"nextMatchId": 0,
"placeIfWon": 0,
"placeIfLost": 0,
"updated": "2017-11-24 11:04:00",
"linkDetail": "/v1/basketball/matches/784470",
"linkDetailLeague": "/v1/basketball/leagues/6",
"venue": {
"venueId": 17386,
"venueName": "Nanhai Gymnasium",
"venueNameInternational": "",
"venueNickname": "Nanhai Gym",
"venueNicknameInternational": "",
"surfaceName": "",
"locationName": "",
"website": "",
"ticketURL": "",
"externalId": "54",
"linkDetailVenue": "/v1/basketball/venues/17386"
},
"leagueName": "ASEAN Basketball League",
"leagueNameInternational": "",
"competitionName": "2017 ASEAN Basketball League",
"competitionNameInternational": "",
"gsId": "",
"competitors": [
{
"competitorType": "TEAM",
"competitorName": "Singapore Slingers",
"competitorId": 88261,
"linkDetailCompetitor": "/v1/basketball/teams/88261",
"scoreString": "59",
"scoreSecondaryString": "",
"completionStatus": "COMPLETE",
"resultPlacing": 0,
"isDrawn": 0,
"isHomeCompetitor": 0,
"teamId": 88261,
"teamName": "Singapore Slingers",
"teamGsId": null,
"teamNameInternational": "",
"teamNickname": "Singapore Slingers",
"teamNicknameInternational": "",
"teamCode": "",
"teamCodeInternational": "",
"website": "",
"internationalReference": "",
"externalId": "74",
"images": {
"logo": {
"L1": {
"size": "L1",
"height": 600,
"width": 600,
"bytes": 45196,
"url": "http://img.wh.sportingpulseinternational.com/5b71cf0a1af51c8376eda43e6ba5bc22L1.jpg"
},
"M1": {
"size": "M1",
"height": 400,
"width": 400,
"bytes": 25005,
"url": "http://img.wh.sportingpulseinternational.com/5b71cf0a1af51c8376eda43e6ba5bc22M1.jpg"
},
"S1": {
"size": "S1",
"height": 200,
"width": 200,
"bytes": 9180,
"url": "http://img.wh.sportingpulseinternational.com/5b71cf0a1af51c8376eda43e6ba5bc22S1.jpg"
},
"T1": {
"size": "T1",
"height": 75,
"width": 75,
"bytes": 2270,
"url": "http://img.wh.sportingpulseinternational.com/5b71cf0a1af51c8376eda43e6ba5bc22T1.jpg"
}
}
},
"clubId": 62,
"clubGsId": null,
"clubName": "Singapore Slingers",
"clubNameInternational": "",
"linkDetailClub": "/v1/basketball/clubs/62"
},
Below is my loadUser function
loadUser(){
this.http.get('http://api.wh.geniussports.com/v1/basketball/competitions/19816/matcheslive?ak=eebd8ae256142ac3fd24bd2003d28782')
.map(res => res.json())
.subscribe(res => {
this.data = data.results;
console.log(data.results);
}, err => {
console.log(err);
});
}
My main goal is to log the data[] array. Please help me

This is the problem with browser, typically its a security concern not to allow other requests which may lead to XSS attack easily. If only for development I suggest you to install a plugin which will disable in your browser plugin
If for production, then you need to configure your API then do this .

From JSON file to Data Frame in R

I have an extract of tweets in JSON format. I have attached an sample of the data. I would need to convert this JSON into a dataframe.
So far I managed to convert it using the "jsonlite" package:
json_data <- jsonlite::stream_in(file("myjsonfile.txt"))
But it does not load all the information contained in the tweets. For example I only see the user who retweeted but not who posted the tweet.
You can view the json file better using this website by copy pasting the file and selecting format: http://jsonviewer.stack.hu/
The data is coming from the Twitter API (more information on this data available here: https://dev.twitter.com/overview/api/tweets
Thank you in advance for your time and help.
ML_Enthousiast
{"favorited": false, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "truncated": false, "in_reply_to_user_id_str": null, "coordinates": null, "retweeted": false, "text": "RT #Antoniotalks: Revenue streams for #OpenData companies!\n#Cloud #StartUp #SMM #AI #IoT #Fintech #BigData #deeplearning #Mpgvip\u2026 ", "retweet_count": 0, "filter_level": "low", "created_at": "Thu Jun 29 18:47:18 +0000 2017", "favorite_count": 0, "retweeted_status": {"favorited": false, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "display_text_range": [0, 140], "truncated": true, "in_reply_to_user_id_str": null, "coordinates": null, "retweeted": false, "text": "Revenue streams for #OpenData companies!\n#Cloud #StartUp #SMM #AI #IoT #Fintech #BigData #deeplearning #Mpgvip\u2026 ", "retweet_count": 38, "filter_level": "low", "created_at": "Wed Jun 28 12:45:08 +0000 2017", "favorite_count": 48, "in_reply_to_screen_name": null, "extended_tweet": {"extended_entities": {"media": [{"media_url_https": "", "sizes": {"thumb": {"w": 150, "h": 150, "resize": "crop"}, "large": {"w": 1200, "h": 927, "resize": "fit"}, "medium": {"w": 1200, "h": 927, "resize": "fit"}, "small": {"w": 680, "h": 525, "resize": "fit"}}, "type": "photo", "expanded_url": "", "id": 880044388679901184, "media_url": "http://pbs.twimg.com/media/DDaLtXXXYAAI2eM.jpg", "id_str": "880044388679901184", "display_url": "pic.twitter.com/aw9HeukUYv", "indices": [139, 162], "url": ""}]}, "full_text": "Revenue streams for #OpenData companies!\n#Cloud #StartUp #SMM #AI #IoT #Fintech #BigData #deeplearning #Mpgvip #defstar5 #DataScience #CIO ", "entities": {"user_mentions": [], "hashtags": [{"text": "OpenData", "indices": [20, 29]}, {"text": "Cloud", "indices": [41, 47]}, {"text": "StartUp", "indices": [48, 56]}, {"text": "SMM", "indices": [57, 61]}, {"text": "AI", "indices": [62, 65]}, {"text": "IoT", "indices": [66, 70]}, {"text": "Fintech", "indices": [71, 79]}, {"text": "BigData", "indices": [80, 88]}, {"text": "deeplearning", "indices": [89, 102]}, {"text": "Mpgvip", "indices": [103, 110]}, {"text": "defstar5", "indices": [111, 120]}, {"text": "DataScience", "indices": [121, 133]}, {"text": "CIO", "indices": [134, 138]}], "media": [{"media_url_https": "", "sizes": {"thumb": {"w": 150, "h": 150, "resize": "crop"}, "large": {"w": 1200, "h": 927, "resize": "fit"}, "medium": {"w": 1200, "h": 927, "resize": "fit"}, "small": {"w": 680, "h": 525, "resize": "fit"}}, "type": "photo", "expanded_url": "", "id": 880044388679901184, "media_url": "", "id_str": "880044388679901184", "display_url": "pic.twitter.com/aw9HeukUYv", "indices": [139, 162], "url": ""}], "symbols": [], "urls": []}, "display_text_range": [0, 138]}, "in_reply_to_status_id": null, "source": "Buffer", "id_str": "880044392110796800", "entities": {"user_mentions": [], "hashtags": [{"text": "OpenData", "indices": [20, 29]}, {"text": "Cloud", "indices": [41, 47]}, {"text": "StartUp", "indices": [48, 56]}, {"text": "SMM", "indices": [57, 61]}, {"text": "AI", "indices": [62, 65]}, {"text": "IoT", "indices": [66, 70]}, {"text": "Fintech", "indices": [71, 79]}, {"text": "BigData", "indices": [80, 88]}, {"text": "deeplearning", "indices": [89, 102]}, {"text": "Mpgvip", "indices": [103, 110]}], "symbols": [], "urls": [{"display_url": "twitter.com/i/web/status/8\u2026", "indices": [112, 135], "expanded_url": "", "url": "8H"}]}, "lang": "en", "id": 880044392110796800, "is_quote_status": false, "geo": null, "user": {"screen_name": "Antoniotalks", "profile_background_image_url": "", "profile_image_url": "jpg", "follow_request_sent": null, "profile_background_tile": false, "id": 2445890839, "is_translator": false, "description": "A father & CEO of Recruitd (#imrecruitd). Helping companies magnify their #employer and #recruitment #brand and #jobseekers with the #skillstosucceed.", "listed_count": 198, "favourites_count": 398, "created_at": "Tue Apr 15 19:13:52 +0000 2014", "notifications": null, "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", "contributors_enabled": false, "profile_background_color": "C0DEED", "following": null, "friends_count": 6792, "protected": false, "default_profile": true, "profile_use_background_image": true, "name": "Antonio Giugno", "location": "London, England", "geo_enabled": true, "id_str": "2445890839", "utc_offset": -25200, "profile_banner_url": "0", "profile_text_color": "333333", "lang": "en-gb", "statuses_count": 4058, "profile_sidebar_fill_color": "DDEEF6", "default_profile_image": false, "profile_image_url_https": "4433/dVeGYfTX_normal.jpg", "profile_link_color": "1DA1F2", "url": "rnontubein", "verified": false, "profile_sidebar_border_color": "C0DEED", "followers_count": 6323, "time_zone": "Pacific Time (US & Canada)"}, "contributors": null, "possibly_sensitive": false, "place": null}, "in_reply_to_screen_name": null, "timestamp_ms": "1498762038396", "in_reply_to_status_id": null, "source": "Mobile Web (M2)", "id_str": "880497923150286848", "entities": {"user_mentions": [{"screen_name": "Antoniotalks", "id": 2445890839, "id_str": "2445890839", "name": "Antonio Giugno", "indices": [3, 16]}], "hashtags": [{"text": "OpenData", "indices": [38, 47]}, {"text": "Cloud", "indices": [59, 65]}, {"text": "StartUp", "indices": [66, 74]}, {"text": "SMM", "indices": [75, 79]}, {"text": "AI", "indices": [80, 83]}, {"text": "IoT", "indices": [84, 88]}, {"text": "Fintech", "indices": [89, 97]}, {"text": "BigData", "indices": [98, 106]}, {"text": "deeplearning", "indices": [107, 120]}, {"text": "Mpgvip", "indices": [121, 128]}], "symbols": [], "urls": [{"indices": [130, 130], "expanded_url": null, "url": ""}]}, "lang": "en", "id": 880497923150286848, "is_quote_status": false, "geo": null, "user": {"screen_name": "henrymbuguak", "profile_background_image_url": "://abs.twimg.com/images/themes/theme3/bg.gif", "profile_image_url": "://pbs.twimg.com/profile_images/822772556818239489/0yTbHCGj_normal.jpg", "follow_request_sent": null, "profile_background_tile": false, "id": 310697279, "is_translator": false, "description": "I enjoy coding. Visit my github project: :// ://github.com/henrymbuguak", "listed_count": 62, "favourites_count": 978, "created_at": "Sat Jun 04 05:55:09 +0000 2011", "notifications": null, "profile_background_image_url_https": "://abs.twimg.com/images/themes/theme3/bg.gif", "contributors_enabled": false, "profile_background_color": "EDECE9", "following": null, "friends_count": 2540, "protected": false, "default_profile": false, "profile_use_background_image": true, "name": "kiarie henry mbugua", "location": "Njoro, Kenya.", "geo_enabled": false, "id_str": "310697279", "utc_offset": 10800, "profile_banner_url": "://pbs.twimg.com/profile_banners/310697279/1484999353", "profile_text_color": "634047", "lang": "en", "statuses_count": 3775, "profile_sidebar_fill_color": "E3E2DE", "default_profile_image": false, "profile_image_url_https": "//pbs.twimg.com/profile_images/822772556818239489/0yTbHCGj_normal.jpg", "profile_link_color": "088253", "url": null, "verified": false, "profile_sidebar_border_color": "D3D2CF", "followers_count": 2141, "time_zone": "Nairobi"}, "contributors": null, "place": null}

If I read in your data using
indata <- jsonlite::read_json("myjsonfile.json")
then I get all the information contained in the JSON file. It is a nested list so you may need to extract the information you want from one of the elements in the list
> names(indata)
[1] "favorited" "in_reply_to_status_id_str"
[3] "in_reply_to_user_id" "truncated"
[5] "in_reply_to_user_id_str" "coordinates"
[7] "retweeted" "text"
[9] "retweet_count" "filter_level"
[11] "created_at" "favorite_count"
[13] "retweeted_status" "in_reply_to_screen_name"
[15] "timestamp_ms" "in_reply_to_status_id"
[17] "source" "id_str"
[19] "entities" "lang"
[21] "id" "is_quote_status"
[23] "geo" "user"
[25] "contributors" "place"
The information about the user is for example (only a part shown)
> indata$user
$screen_name
[1] "henrymbuguak"
$profile_background_image_url
[1] "://abs.twimg.com/images/themes/theme3/bg.gif"
$profile_image_url
[1] "://pbs.twimg.com/profile_images/822772556818239489/0yTbHCGj_normal.jpg"
$follow_request_sent
NULL
$profile_background_tile
[1] FALSE
$id
[1] 310697279
so you can get the user with indata$user$screen_name

parsing a large multi nested json into scala case class

Below is a json example of Twitter's tweet. It's a large json. What is the best library/method to parse it into a case class in scala?
For instance, in Play Framework 2.x it's possible to do that with it's internal library by defining case classes and implicit conversions, but in this case I don't use Play. Should I?
spray-json seems to be most popular scala json library, but in this case it looks quite disappointing - standard approach seems to be limited to 22 elements and uses pattern matching, which becomes ridiculous in the context of multi nested structure with hundreds of elements. Any ideas?
{
"created_at": "Sat Oct 24 06:44:34 +0000 2015",
"id": 657809891558576132,
"id_str": "657809891558576132",
"text": "RT #M23projects: Kara Walker \"Go to Hell or Atlanta, Whichever Come First\" #victoriamiro #London https://t.co/HapqKa4i0l https://t.co/95G…",
"source": "Twitter for iPhone",
"truncated": false,
"in_reply_to_status_id": null,
"in_reply_to_status_id_str": null,
"in_reply_to_user_id": null,
"in_reply_to_user_id_str": null,
"in_reply_to_screen_name": null,
"user": {
"id": 2792146884,
"id_str": "2792146884",
"name": "Tonbridge School Art",
"screen_name": "ArtTonSchool",
"location": "Tonbridge",
"url": null,
"description": "Tonbridge School is an independent day and boarding school for boys. Tweets by the Art Department.",
"protected": false,
"verified": false,
"followers_count": 187,
"friends_count": 288,
"listed_count": 10,
"favourites_count": 1069,
"statuses_count": 1764,
"created_at": "Fri Sep 05 15:37:43 +0000 2014",
"utc_offset": 3600,
"time_zone": "London",
"geo_enabled": true,
"lang": "en-gb",
"contributors_enabled": false,
"is_translator": false,
"profile_background_color": "C0DEED",
"profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
"profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
"profile_background_tile": false,
"profile_link_color": "0084B4",
"profile_sidebar_border_color": "C0DEED",
"profile_sidebar_fill_color": "DDEEF6",
"profile_text_color": "333333",
"profile_use_background_image": true,
"profile_image_url": "http://pbs.twimg.com/profile_images/507921409738543104/V35eZACR_normal.jpeg",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/507921409738543104/V35eZACR_normal.jpeg",
"profile_banner_url": "https://pbs.twimg.com/profile_banners/2792146884/1410119421",
"default_profile": true,
"default_profile_image": false,
"following": null,
"follow_request_sent": null,
"notifications": null
},
"geo": null,
"coordinates": null,
"place": null,
"contributors": null,
"retweeted_status": {
"created_at": "Sat Oct 24 02:27:06 +0000 2015",
"id": 657745100739506176,
"id_str": "657745100739506176",
"text": "Kara Walker \"Go to Hell or Atlanta, Whichever Come First\" #victoriamiro #London https://t.co/HapqKa4i0l https://t.co/95GaLC4XTo",
"source": "Twitter for iPhone",
"truncated": false,
"in_reply_to_status_id": null,
"in_reply_to_status_id_str": null,
"in_reply_to_user_id": null,
"in_reply_to_user_id_str": null,
"in_reply_to_screen_name": null,
"user": {
"id": 999716342,
"id_str": "999716342",
"name": "M23",
"screen_name": "M23projects",
"location": "New York",
"url": "http://M23.co",
"description": "M23's project space + itinerant program promotes new work by new artists. \nhttp://Instagram.com/m23projects",
"protected": false,
"verified": false,
"followers_count": 9150,
"friends_count": 7353,
"listed_count": 174,
"favourites_count": 1354,
"statuses_count": 4666,
"created_at": "Sun Dec 09 17:13:35 +0000 2012",
"utc_offset": -14400,
"time_zone": "Eastern Time (US & Canada)",
"geo_enabled": true,
"lang": "en",
"contributors_enabled": false,
"is_translator": false,
"profile_background_color": "547587",
"profile_background_image_url": "http://pbs.twimg.com/profile_background_images/884257252/e329bbc1b91d695862d5b23a209f2d34.jpeg",
"profile_background_image_url_https": "https://pbs.twimg.com/profile_background_images/884257252/e329bbc1b91d695862d5b23a209f2d34.jpeg",
"profile_background_tile": true,
"profile_link_color": "414A4D",
"profile_sidebar_border_color": "FFFFFF",
"profile_sidebar_fill_color": "DDEEF6",
"profile_text_color": "333333",
"profile_use_background_image": true,
"profile_image_url": "http://pbs.twimg.com/profile_images/458985956830236673/Z_4Bq9PJ_normal.jpeg",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/458985956830236673/Z_4Bq9PJ_normal.jpeg",
"profile_banner_url": "https://pbs.twimg.com/profile_banners/999716342/1398650659",
"default_profile": false,
"default_profile_image": false,
"following": null,
"follow_request_sent": null,
"notifications": null
},
"geo": null,
"coordinates": null,
"place": null,
"contributors": null,
"is_quote_status": false,
"retweet_count": 2,
"favorite_count": 3,
"entities": {
"hashtags": [
{
"text": "London",
"indices": [
74,
81
]
}
],
"urls": [
{
"url": "https://t.co/HapqKa4i0l",
"expanded_url": "http://instagram.com/m23projects",
"display_url": "instagram.com/m23projects",
"indices": [
82,
105
]
}
],
"user_mentions": [
{
"screen_name": "victoriamiro",
"name": "Victoria Miro",
"id": 373924746,
"id_str": "373924746",
"indices": [
58,
71
]
}
],
"symbols": [],
"media": [
{
"id": 657745078413201408,
"id_str": "657745078413201408",
"indices": [
106,
129
],
"media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 255,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 450,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 1024,
"h": 768,
"resize": "fit"
}
}
}
]
},
"extended_entities": {
"media": [
{
"id": 657745078413201408,
"id_str": "657745078413201408",
"indices": [
106,
129
],
"media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 255,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 450,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 1024,
"h": 768,
"resize": "fit"
}
}
},
{
"id": 657745085275095040,
"id_str": "657745085275095040",
"indices": [
106,
129
],
"media_url": "http://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 453,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 800,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 768,
"h": 1024,
"resize": "fit"
}
}
},
{
"id": 657745085300277248,
"id_str": "657745085300277248",
"indices": [
106,
129
],
"media_url": "http://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 453,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 800,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 768,
"h": 1024,
"resize": "fit"
}
}
},
{
"id": 657745085275082752,
"id_str": "657745085275082752",
"indices": [
106,
129
],
"media_url": "http://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 255,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 450,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 1024,
"h": 768,
"resize": "fit"
}
}
}
]
},
"favorited": false,
"retweeted": false,
"possibly_sensitive": false,
"filter_level": "low",
"lang": "en"
},
"is_quote_status": false,
"retweet_count": 0,
"favorite_count": 0,
"entities": {
"hashtags": [
{
"text": "London",
"indices": [
91,
98
]
}
],
"urls": [
{
"url": "https://t.co/HapqKa4i0l",
"expanded_url": "http://instagram.com/m23projects",
"display_url": "instagram.com/m23projects",
"indices": [
99,
122
]
}
],
"user_mentions": [
{
"screen_name": "M23projects",
"name": "M23",
"id": 999716342,
"id_str": "999716342",
"indices": [
3,
15
]
},
{
"screen_name": "victoriamiro",
"name": "Victoria Miro",
"id": 373924746,
"id_str": "373924746",
"indices": [
75,
88
]
}
],
"symbols": [],
"media": [
{
"id": 657745078413201408,
"id_str": "657745078413201408",
"indices": [
123,
140
],
"media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 255,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 450,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 1024,
"h": 768,
"resize": "fit"
}
},
"source_status_id": 657745100739506176,
"source_status_id_str": "657745100739506176",
"source_user_id": 999716342,
"source_user_id_str": "999716342"
}
]
},
"extended_entities": {
"media": [
{
"id": 657745078413201408,
"id_str": "657745078413201408",
"indices": [
123,
140
],
"media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 255,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 450,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 1024,
"h": 768,
"resize": "fit"
}
},
"source_status_id": 657745100739506176,
"source_status_id_str": "657745100739506176",
"source_user_id": 999716342,
"source_user_id_str": "999716342"
},
{
"id": 657745085275095040,
"id_str": "657745085275095040",
"indices": [
123,
140
],
"media_url": "http://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 453,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 800,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 768,
"h": 1024,
"resize": "fit"
}
},
"source_status_id": 657745100739506176,
"source_status_id_str": "657745100739506176",
"source_user_id": 999716342,
"source_user_id_str": "999716342"
},
{
"id": 657745085300277248,
"id_str": "657745085300277248",
"indices": [
123,
140
],
"media_url": "http://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 453,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 800,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 768,
"h": 1024,
"resize": "fit"
}
},
"source_status_id": 657745100739506176,
"source_status_id_str": "657745100739506176",
"source_user_id": 999716342,
"source_user_id_str": "999716342"
},
{
"id": 657745085275082752,
"id_str": "657745085275082752",
"indices": [
123,
140
],
"media_url": "http://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
"media_url_https": "https://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
"url": "https://t.co/95GaLC4XTo",
"display_url": "pic.twitter.com/95GaLC4XTo",
"expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
"type": "photo",
"sizes": {
"small": {
"w": 340,
"h": 255,
"resize": "fit"
},
"medium": {
"w": 600,
"h": 450,
"resize": "fit"
},
"thumb": {
"w": 150,
"h": 150,
"resize": "crop"
},
"large": {
"w": 1024,
"h": 768,
"resize": "fit"
}
},
"source_status_id": 657745100739506176,
"source_status_id_str": "657745100739506176",
"source_user_id": 999716342,
"source_user_id_str": "999716342"
}
]
},
"favorited": false,
"retweeted": false,
"possibly_sensitive": false,
"filter_level": "low",
"lang": "en",
"timestamp_ms": "1445669074321"
}
**UPDATE: ** I guess I should stick to play-json, even more so for performance reasons - http://derekwyatt.org/2014/01/15/benchmarking-spray-json-argonaut-play-json/

You can depends on Play's JSON library by itself:
// build.sbt
libraryDependencies += "com.typesafe.play" % "play-json_2.11" % "X.X.X"
// Tweet.scala
import play.api.libs.json._
case class User(id: String, name: String, ...)
implicit val userFormat = Json.format[User]
case class Tweet(id: String, content: String, user: User)
implicit val tweetFormat = Json.format[Tweet]
This will use play-json's macros to auto-generate the formatters you need to parse JSON into instances of Tweet and User.
Regardless of library you choose you won't find an elegant solution for handling more than 22 fields since that's a limitation of the case class implementation (up until 2.11) rather than any specific design choice by a library.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Cannot read JSON in pandas - json

Sorry, but your JSON is actually not valid, despite your saying it is. This line: "media_url": "": "", Should probably be: "media_url": "", At which point, when I added the final bracket } that was outside of your code block, validated as properly formed JSON.

Related

Nested Json data to postgresql using python

json jq request format

Ionic Json nested

From JSON file to Data Frame in R

parsing a large multi nested json into scala case class

Categories

Resources