hourly_forecast from Wunderground API - json

I'm trying to get the hourly forecast temperature for a specific hour of the day from the Wunderground API for a research project:
This is an example of the JSON:
http://api.wunderground.com/api/[key]/geolookup/astronomy/forecast/history_/hourly/conditions/q/mo/columbia.json
The specific section looks like this:
"hourly_forecast": [
{
"FCTTIME": {
"hour": "17","hour_padded": "17","min": "00","sec": "0","year": "2013","mon": "7","mon_padded": "07","mon_abbrev": "Jul","mday": "15","mday_padded": "15","yday": "195","isdst": "1","epoch": "1373925600","pretty": "5:00 PM CDT on July 15, 2013","civil": "5:00 PM","month_name": "July","month_name_abbrev": "Jul","weekday_name": "Monday","weekday_name_night": "Monday Night","weekday_name_abbrev": "Mon","weekday_name_unlang": "Monday","weekday_name_night_unlang": "Monday Night","ampm": "PM","tz": "","age": ""
},
"temp": {"english": "86", "metric": "30"},
"dewpoint": {"english": "71", "metric": "22"},
"condition": "Thunderstorm",
"icon": "tstorms",
"icon_url":"http://icons-ak.wxug.com/i/c/k/tstorms.gif",
"fctcode": "15",
"sky": "66",
"wspd": {"english": "10", "metric": "16"},
"wdir": {"dir": "ESE", "degrees": "119"},
"wx": "Scattered Thunderstorms , Scattered Light Rain Showers",
"uvi": "3",
"humidity": "60",
"windchill": {"english": "-9998", "metric": "-9998"},
"heatindex": {"english": "91", "metric": "33"},
"feelslike": {"english": "91", "metric": "33"},
"qpf": {"english": "", "metric": ""},
"snow": {"english": "", "metric": ""},
"pop": "30",
"mslp": {"english": "30.14", "metric": "1020"}
}
,
I get the JSON like so:
$.ajax({
url : "http://api.wunderground.com/api/[key]/geolookup/forecast/hourly/history_/astronomy/conditions/q/"+lat+","+lon+".json",
dataType : "jsonp",
success : function(parsed_json) {
Then I try to run through the hourly forecasts like so (normally mday and hour are replaced with a variable containing today's date and a specific hour, but for troubleshooting, I put these numbers in).
// get the forecast
$.each( parsed_json['hourly_forecast'], function( i, value ) {
if( value.FCTTIME.mday == 15 && value.FCTTIME.hour == 19) {
six_hour_forecast = value.temp.english;
}
However, I consistently get the wrong temp.english for six_hour_forecast.
So close — what am I missing?

This did it:
// Get the forecast 6 hour temp
$.each( parsed_json['hourly_forecast'], function( index, value ) {
if( value['FCTTIME']['hour']==sunrise_hour*1 + 6 && value['FCTTIME']['mday']==window.current_day ) {
window.six_hour_forecast = value.temp.english;
}
}); // end each
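Note that every FCTTIME field in the feed is a string ("hour": "17", "mday": "15"), so converting to numbers before doing arithmetic or comparisons (as sunrise_hour*1 does above) avoids surprises. A minimal Python sketch of the same lookup, assuming a hypothetical helper name forecast_temp and a trimmed sample payload instead of a live API response (the second entry's values are illustrative):

```python
def forecast_temp(parsed_json, day, hour, units="english"):
    """Return the forecast temperature for a day-of-month and hour, or None."""
    for entry in parsed_json.get("hourly_forecast", []):
        fct = entry["FCTTIME"]
        # The feed encodes numbers as strings, so convert before comparing.
        if int(fct["mday"]) == int(day) and int(fct["hour"]) == int(hour):
            return entry["temp"][units]
    return None


# Trimmed sample in the shape shown in the question (second entry is illustrative).
sample = {
    "hourly_forecast": [
        {"FCTTIME": {"hour": "17", "mday": "15"}, "temp": {"english": "86", "metric": "30"}},
        {"FCTTIME": {"hour": "19", "mday": "15"}, "temp": {"english": "84", "metric": "29"}},
    ]
}

sunrise_hour = "13"  # hypothetical value, kept as a string like the feed's fields
print(forecast_temp(sample, 15, int(sunrise_hour) + 6))
```

Returning None when nothing matches makes a missing hour easy to tell apart from a wrong one.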

How to identify and explode a nested json file as columns of a dataframe?

I am rephrasing my question to make it clearer.
My data looks like this:
{
"Research": {
"#xmlns": "http://www.xml.org/2013/2/XML",
"#language": "eng",
"#createDateTime": "2022-03-25T10:12:39Z",
"#researchID": "abcd",
"Product": {
"#productID": "abcd",
"StatusInfo": {
"#currentStatusIndicator": "Yes",
"#statusDateTime": "2022-03-25T12:18:41Z",
"#statusType": "Published"
},
"Source": {
"Organization": {
"#primaryIndicator": "Yes",
"#type": "SellSideFirm",
"OrganizationID": [
{
"#idType": "L1",
"#text": "D827C98E315F"
},
{
"#idType": "TR",
"#text": "3202"
},
{
"#idType": "TR",
"#text": "SZA"
}
],
"OrganizationName": {
"#nameType": "Legal",
"#text": "Citi"
},
"PersonGroup": {
"PersonGroupMember": {
"#primaryIndicator": "Yes",
"#sequence": "1",
"Person": {
"#personID": "tr56",
"FamilyName": "Wang",
"GivenName": "Bond",
"DisplayName": "Bond Wang",
"Biography": "Bond Wang is a",
"BiographyFormatted": "Bond Wang",
"PhotoResourceIdRef": "AS44556"
}
}
}
}
},
"Content": {
"Title": "Premier",
"Abstract": "None",
"Synopsis": "Premier’s solid 1H22 result .",
"Resource": [
{
"#language": "eng",
"#primaryIndicator": "Yes",
"#resourceID": "9553",
"Length": {
"#lengthUnit": "Pages",
"#text": "17"
},
"MIMEType": "text/html",
"URL": "https://www.DFKJG.com/rendition/eppublic"
},
{
"#language": "eng",
"#primaryIndicator": "No",
"#resourceID": "4809",
"Length": {
"#lengthUnit": "Pages",
"#text": "17"
},
"MIMEType": "ABS/pdf",
"Name": "asdf.pdf",
"Comments": "fr5.pdf"
},
{
"#language": "eng",
"#primaryIndicator": "No",
"#resourceID": "6d13a965723e",
"Length": {
"#lengthUnit": "Pages",
"#text": "17"
},
"MIMEType": "text/html",
"URL": "https://www.dfgdfg.com/"
},
{
"#primaryIndicator": "No",
"#resourceID": "709c7bdb1c99",
"MIMEType": "tyy/image",
"URL": "https://ir.ght.com"
},
{
"#primaryIndicator": "No",
"#resourceID": "gfjhgj",
"MIMEType": "gtty/image",
"URL": "https://ir.gtty.com"
}
]
},
"Context": {
"#external": "Yes",
"IssuerDetails": {
"Issuer": {
"#issuerType": "Corporate",
"#primaryIndicator": "Yes",
"SecurityDetails": {
"Security": {
"#estimateAction": "Revision",
"#primaryIndicator": "Yes",
"#targetPriceAction": "Increase",
"SecurityID": [
{
"#idType": "RIC",
"#idValue": "PMV.AX",
"#publisherDefinedValue": "RIC"
},
{
"#idType": "Bloomberg",
"#idValue": "PMV#AU"
},
{
"#idType": "SEDOL",
"#idValue": "6699781"
}
],
"SecurityName": "Premier Investments Ltd",
"AssetClass": {
"#assetClass": "Equity"
},
"AssetType": {
"#assetType": "Stock"
},
"SecurityType": {
"#securityType": "Common"
},
"Rating": {
"#rating": "NeutralSentiment",
"#ratingType": "Rating",
"#aspect": "Investment",
"#ratingDateTime": "2020-07-31T08:24:37Z",
"RatingEntity": {
"#ratingEntity": "PublisherDefined",
"PublisherDefinedValue": "Citi"
}
}
}
},
"IssuerID": {
"#idType": "PublisherDefined",
"#idValue": "PMV.AX",
"#publisherDefinedValue": "TICKER"
},
"IssuerName": {
"#nameType": "Legal",
"NameValue": "Premier Investments Ltd"
}
}
},
"ProductDetails": {
"#periodicalIndicator": "No",
"#publicationDateTime": "2022-03-25T12:18:41Z",
"ProductCategory": {
"#productCategory": "Report"
},
"ProductFocus": {
"#focus": "Issuer",
"#primaryIndicator": "Yes"
},
"EntitlementGroup": {
"Entitlement": [
{
"#includeExcludeIndicator": "Include",
"#primaryIndicator": "No",
"AudienceTypeEntitlement": {
"#audienceType": "PublisherDefined",
"#entitlementContext": "TR",
"#text": "20012"
}
},
{
"#includeExcludeIndicator": "Include",
"#primaryIndicator": "No",
"AudienceTypeEntitlement": {
"#audienceType": "PublisherDefined",
"#entitlementContext": "TR",
"#text": "2001"
}
}
]
}
},
"ProductClassifications": {
"Discipline": {
"#disciplineType": "Investment",
"#researchApproach": "Fundamental"
},
"Subject": {
"#publisherDefinedValue": "TREPS",
"#subjectValue": "PublisherDefined"
},
"Country": {
"#code": "AU",
"#primaryIndicator": "Yes"
},
"Region": {
"#primaryIndicator": "Yes",
"#emergingIndicator": "No",
"#regionType": "Australasia"
},
"AssetClass": {
"#assetClass": "Equity"
},
"AssetType": {
"#assetType": "Stock"
},
"SectorIndustry": [
{
"#classificationType": "GICS",
"#code": "25201040",
"#focusLevel": "Yes",
"#level": "4",
"#primaryIndicator": "Yes",
"Name": "Household Appliances"
},
{
"#classificationType": "GICS",
"#code": "25504020",
"#focusLevel": "Yes",
"#level": "4",
"#primaryIndicator": "Yes",
"Name": "Computer & Electronics Retail"
},
{
"#classificationType": "GICS",
"#code": "25504040",
"#focusLevel": "Yes",
"#level": "4",
"#primaryIndicator": "Yes",
"Name": "Specialty Stores"
},
{
"#classificationType": "GICS",
"#code": "25504030",
"#focusLevel": "Yes",
"#level": "4",
"#primaryIndicator": "Yes",
"Name": "Home Improvement Retail"
},
{
"#classificationType": "GICS",
"#code": "25201050",
"#focusLevel": "Yes",
"#level": "4",
"#primaryIndicator": "Yes",
"Name": "Housewares & Specialties"
}
]
}
}
}
}
}
I want to explode all of its elements into a data frame.
The number of columns that hold list-like structures can also change.
Basically, we will not know in advance whether the next input will have fewer or more columns to explode.
This is what I have tried so far, but it does not give me the correct answer.
Also, I have hardcoded the column names, but the code should identify the list columns itself and then explode them.
import xmltodict as xmltodict
from pprint import pprint
import pandas as pd
import json
from tabulate import tabulate
dict =(xmltodict.parse("""xml data"""))
json_str = json.dumps(dict)
resp = json.loads(json_str)
print(resp)
df = pd.json_normalize(resp)
cols=['Research.Product.Source.Organization.OrganizationID','Research.Product.Content.Resource','Research.Product.Context.IssuerDetails.Issuer.SecurityDetails.Security.SecurityID','Research.Product.Context.ProductDetails.EntitlementGroup.Entitlement','Research.Product.Context.ProductClassifications.SectorIndustry']
def expplode_columns(df, cols):
    df_e = df.copy()
    for c in cols:
        df_e = df_e.explode(c, ignore_index=True)
    return df_e
df2 = expplode_columns(df, cols)
print(tabulate(df2, headers="keys", tablefmt="psql"))
# df2.to_csv('dataframe.csv', header=True, index=False)
As suggested in the comments, you can define a helper function in pure Python to recursively flatten the nested values of your data.
So, with the json file you provided, here is one way to do it:
def flatten(data, new_data):
    """Recursive helper function.
    Args:
        data: nested dictionary.
        new_data: empty dictionary.
    Returns:
        Flattened dictionary.
    """
    for key, value in data.items():
        if isinstance(value, dict):
            flatten(value, new_data)
        if isinstance(value, (str, int, list)):
            new_data[key] = value
    return new_data
And then:
import json
import pandas as pd
with open("file.json") as f:
    content = json.load(f)
df = pd.DataFrame.from_dict(flatten(content, {}), orient="index").T
From here, you can deal with the columns that contain lists of dictionaries with identical keys but different values by exploding them and repeating the other values, like this:
cols_with_lists = [col for col in df.columns if isinstance(df.loc[0, col], list)]
for col in cols_with_lists:
    temp_df = pd.concat(
        [pd.DataFrame(item, index=[i]) for i, item in enumerate(df.loc[0, col])],
        axis=0,
    )
    # .ffill() replaces fillna(method="ffill"), which is deprecated in recent pandas versions
    df = pd.concat([df.drop(columns=[col]), temp_df], axis=1).ffill()
So that, finally, the json file is entirely flattened:
print(df)
# Output
#xmlns #language ... #primaryIndicator Name
0 http://www.xml.org/2013/2/XML eng ... Yes Household Appliances
1 http://www.xml.org/2013/2/XML eng ... Yes Computer & Electronics Retail
2 http://www.xml.org/2013/2/XML eng ... Yes Specialty Stores
3 http://www.xml.org/2013/2/XML eng ... Yes Home Improvement Retail
4 http://www.xml.org/2013/2/XML eng ... Yes Housewares & Specialties
[5 rows x 73 columns]
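One caveat of the flatten helper above: it stores each value under its leaf key only, so names that recur at several levels (for example #primaryIndicator) overwrite one another. A minimal sketch, assuming you want to keep those values apart, builds each key from the full path instead. The name flatten_with_paths is hypothetical (it is not part of pandas), and the sample is a trimmed fragment of the question's data:

```python
def flatten_with_paths(data, parent_key="", sep="."):
    """Flatten nested dicts, keying each leaf by its dotted path."""
    flat = {}
    for key, value in data.items():
        path = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_with_paths(value, path, sep))
        else:
            # Lists and scalars are kept as-is; lists can still be exploded later.
            flat[path] = value
    return flat


# Trimmed fragment of the question's document.
sample = {
    "Research": {
        "#language": "eng",
        "Product": {
            "Source": {"Organization": {"#primaryIndicator": "Yes"}},
            "Context": {
                "ProductClassifications": {
                    "Country": {"#code": "AU", "#primaryIndicator": "Yes"}
                }
            },
        },
    }
}

flat = flatten_with_paths(sample)
for path, value in flat.items():
    print(path, "=", value)
```

The resulting dictionary can be passed to pd.DataFrame.from_dict in the same way as the output of flatten.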
This is a little hacky, but you can extract the columns that contain a list type, then use reduce to recursively explode and normalize those columns until there are no more lists or objects left.
I haven't tested it thoroughly, but something like this should work.
from functools import reduce
import pandas as pd
def full_explode_normalize(df):
    # Extract list columns
    explode_cols = [x for x in df.columns if isinstance(df.iloc[0][x], list)]
    if len(explode_cols) < 1:
        return df
    # Explode and normalize the list
    df = reduce(_explode, explode_cols, df)
    return df
def _explode(df, col):
    df = df.explode(col)
    if isinstance(df.iloc[0][col], list):
        df = _explode(df, col)
    # Note: every Python value is an instance of object, so this branch always runs for non-list values.
    elif isinstance(df.iloc[0][col], object):
        df_child = pd.json_normalize(df[col])
        # To prevent column name collision, add the parent column name as prefix.
        df_child.columns = [f'{col}.{x}' for x in df_child.columns]
        df = pd.concat([df.loc[:, ~df.columns.isin([col])].reset_index(drop=True), df_child], axis=1)
    return df

Snowflake - Querying Nested JSON

I need some help querying this JSON file I've ingested into a temp table in Snowflake. So, I've created a JSON_DATA variant column and plan to query and do a COPY INTO another table, but my query isn't working yet... I feel I'm close (possibly?)
JSON layout:
{
"nextPage": "01",
"page": "0",
"status": "ok",
"transactions": [
{
"id": "65985",
"recordTp": "vendorbill",
"values": {
"account": [
{
"text": "14500 Deferred Expenses",
"value": "249"
}
],
"account.number": "1450",
"account.type": [
{
"text": "Deferred Expense",
"value": "DeferExpense"
}
],
"amount": "51733",
"classnohierarchy": [
{
"text": "901 Corporate",
"value": "139"
}
],
"currency": [
{
"text": "Canadian Dollar",
"value": "3"
}
],
"customer.altname": "V Sties expenses (Tor)",
"customer.custate": "12/31/2019",
"customer.custentient": "ada Inc.",
"customer.custendate": "1/1/2019",
"customer.entyid": "PR781",
"departmentnohierarchy": [
{
"text": "8rity",
"value": "37"
}
],
"fxamount": "689",
"location": [
{
"text": "Othad Projects",
"value": "48"
}
],
"postingperiod": [
{
"text": "Jan 2020",
"value": "1"
}
],
"subsidiary.custrecord_region": [
{
"text": "CANADA",
"value": "3"
}
],
"subsidiarynohierarchy": [
{
"text": "ada Inc.",
"value": "25"
}
]
}
},
I've been able to query the values that are not (deeply) nested, but I need help getting, for example, the values from 'classnohierarchy'. To get both the 'text' and the 'value', I tried:
transactions.value:"values".classnohierarchy.text::string as class_txt,
transactions.value:"values".classnohierarchy.value::string as class_val,
but it's returning NULL values.
Below is my entire query:
SELECT
JSON_DATA:status::string as connection_status,
transactions.value:id::string as id,
transactions.value:recordType::string as record_type,
transactions.value:"values"::variant as trans_val,
transactions.value:"values".account as acc,
transactions.value:"values".account.text as text,
transactions.value:"values".account.value as val,
transactions.value:"values"."account.number"::string as acc_num,
transactions.value:"values"."account.type".text::string as acc_type_txt,
transactions.value:"values"."account.type".value::string as acc_type_val,
transactions.value:"values".amount::string as amount,
**transactions.value:"values".classnohierarchy.text::string as class_txt,
transactions.value:"values".classnohierarchy.value::string as class_val,**
transactions.value:"values".currency.text::string as currency_text,
transactions.value:"values".currency.value::string as currency_val,
transactions.value:"values"."customer.altname"::string as customer_project_name,
transactions.value:"values"."customer.custate"::string as customer_end_date,
transactions.value:"values"."customer.custentient"::string as customer_end_client,
transactions.value:"values"."customer.custendate"::string as customer_start_date,
transactions.value:"values"."customer.entyid"::string as customer_project_id,
transactions.value:"values".departmentnohierarchy.text::string as department_name,
transactions.value:"values".departmentnohierarchy.value::string as department_value,
transactions.value:"values".fxamount::string as fx_amount,
transactions.value:"values".location.text::string as product_name,
transactions.value:"values".postingperiod.text::string as postingperiod,
transactions.value:"values".postingperiod.value::string as postingperiod,
transactions.value:"values"."subsidiary.custrecord_region".text::string as region_name,
transactions.value:"values"."subsidiary.custrecord_region".value::string as region_value,
transactions.value:"values".subsidiarynohierarchy.text::string as entity_name,
transactions.value:"values".subsidiarynohierarchy.value::string as entity_value,
FROM MY_TABLE,
LATERAL FLATTEN (JSON_DATA:transactions) as transactions
departmentnohierarchy is an array, so you need to specify the index, as below. The same applies to the other array-valued keys in the JSON, such as classnohierarchy.
select *,transactions.VALUE:"values".departmentnohierarchy[0].value::text as department_name
FROM jsont1,
LATERAL FLATTEN (JSON_DATA:transactions) as transactions
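To see why the un-indexed path finds nothing, it can help to look at the same structure outside SQL. A minimal Python sketch, assuming a trimmed copy of the question's JSON (this is not Snowflake code, only an illustration of the shape):

```python
import json

# Trimmed copy of one transaction from the question.
raw = """
{"transactions": [{"id": "65985", "values": {
    "amount": "51733",
    "classnohierarchy": [{"text": "901 Corporate", "value": "139"}]
}}]}
"""
doc = json.loads(raw)

rows = []
for txn in doc["transactions"]:
    values = txn["values"]
    cls = values["classnohierarchy"]  # a list holding one object, not an object
    first = cls[0] if cls else {}     # same idea as classnohierarchy[0] in the query
    rows.append((txn["id"], values["amount"], first.get("text"), first.get("value")))

print(rows)
```

Because the "text" and "value" keys live on the element inside the list, the element has to be selected first, which is what the [0] in the query does.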

First two values from json object using std template package

The json response from api is like this
{
"ResponseCode": "1",
"Response": "Data Found",
"data": [
{
"Season": "KHARIF",
"Sector": "AGRICULTURE",
"Category": "Cereals",
"Crop": "Paddy (Dhan)",
"QueryType": "\tField Preparation\t",
"QueryText": "top dressing for paddy",
"KccAns": "top dressing for paddy : apply urea 25kg+SSP 15kg + neem cake 5kg+MN mixture 5kg mixed with 40kg of sand",
"StateName": "PUDUCHERRY",
"DistrictName": "KARAIKAL",
"BlockName": "KARAIKAL",
"CreatedOn": "1/5/2014 6:48:09 PM"
},
{
"Season": "KHARIF",
"Sector": "AGRICULTURE",
"Category": "Others",
"Crop": "Others",
"QueryType": "Weather",
"QueryText": "weather forecasting details",
"KccAns": "today no rain",
"StateName": "PUDUCHERRY",
"DistrictName": "KARAIKAL",
"BlockName": "KARAIKAL",
"CreatedOn": "1/5/2014 9:04:50 PM"
},
{
"Season": "KHARIF",
"Sector": "AGRICULTURE",
"Category": "Others",
"Crop": "Others",
"QueryType": "0",
"QueryText": "details about soil testing",
"KccAns": "contact to agricultural department",
"StateName": "PUDUCHERRY",
"DistrictName": "KARAIKAL",
"BlockName": "KARAIKAL",
"CreatedOn": "1/8/2014 10:21:18 AM"
},
{
"Season": "KHARIF",
"Sector": "AGRICULTURE",
"Category": "Cereals",
"Crop": "Paddy (Dhan)",
"QueryType": "Fertilizer Use and Availability",
"QueryText": "paddy top dressing fertilizer",
"KccAns": "apply urea 25 kg + potash 15 kg + neem cake 5 kg + microfood 5 kg / ac",
"StateName": "PUDUCHERRY",
"DistrictName": "KARAIKAL",
"BlockName": "KARAIKAL",
"CreatedOn": "1/12/2014 8:01:45 AM"
}
]
}
I am trying to write a Go template that returns only the first two data points in the data section of the response object. This is the template I am using at the moment: {{range $element := .data}} {{$element}} {{end}}, but it returns all of the entries in the .data field. How can I make this work?
You can use the slice template function to take the first two elements. Example:
{{$dataSliced := slice .data 0 2}}
{{range $element := $dataSliced}}
{{$element}}
{{end}}
Alternatively, you can create a custom template function for the slicing.
More about template functions: https://golang.org/pkg/text/template.

Parsing a json using Angular js

{
"statusCode": "000",
"statusMessage": "Record Successfully Fetched",
"dsStatusCode": "000",
"dsStatusMessage": "Record Successfully Fetched",
"businessInput": null,
"businessOutput": {
"systemCircleId": "2",
"category": [
{
"categoryId": "abcs",
"sys": "5ID",
"displayName": "National Roaming Recharge",
"packsList": [
{
"amount": "79",
"benefits": "dsdsdsds",
"packId": "1344",
"processingFees": "70.3",
"serviceTax": "8.7",
"validity": "30 Days",
"volume": "0.0",
"isTop5": "no",
"fileName": "null"
},
{
"amount": "188",
"benefits": "Roaming Tariff - Incoming Free, Outgoing local # 80p/min, STD #1.15Rs/min with Talk Time 120 in main A/c",
"packId": "1263",
"fess": "47.3",
"serviceTax": "20.7",
"validity": "28 Days",
"volume": "0.0",
"isTop5": "no",
"fileName": "null"
},
{
"amount": "306",
"benefits": "FTT 306 with Roaming Tariff - Incoming Free, Outgoing local # 80p/min, STD #1.15Rs/min",
"packId": "1290",
"processingFees": "0",
"serviceTax": "33.7",
"validity": "28 Days",
"volume": "0.0",
"isTop5": "no",
"fileName": "null"
}
]
}
]
}
}
I want to parse this JSON and filter packsList for each categoryId using AngularJS.
Assign the JSON to a variable and use scope.$eval on that variable.
Example:
var jsonVar = { "statusCode": "000",
"statusMessage": "Record Successfully Fetched",
"dsStatusCode": "000",
"dsStatusMessage": "Record Successfully Fetched",
"businessInput": null
}
scope.$eval(jsonVar) // this gives the object on which you can do the ng-repeat
If you still have problems, try using JSON.stringify(jsonVar) and then perform scope.$eval on the result.
var jsonString = JSON.stringify(jsonVar);
scope.$eval(jsonString);// This returns a object too

How to convert a json file without the same length of json objects into csv

I have a JSON file and I want to convert it to CSV format.
The problem I face is that the JSON objects in the file do not all flatten to the same number of columns. For example, one object has 49 columns and the next has 50.
Here is an example with two records: the first one has creator.slug, but the second one does not, and that is where the data goes wrong. The process creates all 50 columns, but for the object that lacks creator.slug, the following value is shifted into that column.
{
"id": 301852363,
"name": "Song of the Sea",
"blurb": "One evening, two shows: SIRENS and The Girl From Bare Cove. Building a community. Giving voice to survivors of sexual violence.",
"goal": 5000,
"pledged": 671,
"state": "live",
"slug": "song-of-the-sea",
"disable_communication": false,
"country": "US",
"currency": "USD",
"currency_symbol": "$",
"currency_trailing_code": true,
"deadline": 1399293386,
"state_changed_at": 1397133386,
"created_at": 1396672480,
"launched_at": 1397133386,
"backers_count": 20,
"photo": {
"full": "https://s3.amazonaws.com/ksr/projects/939387/photo-full.jpg?1397874930",
"ed": "https://s3.amazonaws.com/ksr/projects/939387/photo-ed.jpg?1397874930",
"med": "https://s3.amazonaws.com/ksr/projects/939387/photo-med.jpg?1397874930",
"little": "https://s3.amazonaws.com/ksr/projects/939387/photo-little.jpg?1397874930",
"small": "https://s3.amazonaws.com/ksr/projects/939387/photo-small.jpg?1397874930",
"thumb": "https://s3.amazonaws.com/ksr/projects/939387/photo-thumb.jpg?1397874930",
"1024x768": "https://s3.amazonaws.com/ksr/projects/939387/photo-1024x768.jpg?1397874930",
"1536x1152": "https://s3.amazonaws.com/ksr/projects/939387/photo-1536x1152.jpg?1397874930"
},
"creator": {
"id": 1714048992,
"name": "Maridee Slater",
"slug": "maridee",
"avatar": {
"thumb": "https://s3.amazonaws.com/ksr/avatars/996153/DSC_0310.thumb.jpg?1337713264",
"small": "https://s3.amazonaws.com/ksr/avatars/996153/DSC_0310.small.jpg?1337713264",
"medium": "https://s3.amazonaws.com/ksr/avatars/996153/DSC_0310.medium.jpg?1337713264"
},
"urls": {
"web": {
"user": "https://www.kickstarter.com/profile/maridee"
},
"api": {
"user": "https://api.kickstarter.com/v1/users/1714048992?signature=1398256877.e6d63adcca055cd041a5920368b197d40459f748"
}
}
},
"location": {
"id": 2459115,
"name": "New York",
"slug": "new-york-ny",
"short_name": "New York, NY",
"displayable_name": "New York, NY",
"country": "US",
"state": "NY",
"urls": {
"web": {
"discover": "https://www.kickstarter.com/discover/places/new-york-ny",
"location": "https://www.kickstarter.com/locations/new-york-ny"
},
"api": {
"nearby_projects": "https://api.kickstarter.com/v1/discover?signature=1398256786.89b2c4539aeab4ad25982694dd7e659e8c12028f&woe_id=2459115"
}
}
},
"category": {
"id": 17,
"name": "Theater",
"slug": "theater",
"position": 14,
"urls": {
"web": {
"discover": "http://www.kickstarter.com/discover/categories/theater"
}
}
},
"urls": {
"web": {
"project": "https://www.kickstarter.com/projects/maridee/song-of-the-sea"
}
}
},
{
"id": 967108708,
"name": "Good Bread Alley",
"blurb": "A play by April Yvette Thompson. A Gullah Healer Woman and an Afro-Cuban Priest forge a new world of magic & dreams in Jim Crow Miami.",
"goal": 100000,
"pledged": 33242,
"state": "live",
"slug": "good-bread-alley",
"disable_communication": false,
"country": "US",
"currency": "USD",
"currency_symbol": "$",
"currency_trailing_code": true,
"deadline": 1399271911,
"state_changed_at": 1396334313,
"created_at": 1393278556,
"launched_at": 1396334311,
"backers_count": 261,
"photo": {
"full": "https://s3.amazonaws.com/ksr/projects/883489/photo-full.jpg?1397869394",
"ed": "https://s3.amazonaws.com/ksr/projects/883489/photo-ed.jpg?1397869394",
"med": "https://s3.amazonaws.com/ksr/projects/883489/photo-med.jpg?1397869394",
"little": "https://s3.amazonaws.com/ksr/projects/883489/photo-little.jpg?1397869394",
"small": "https://s3.amazonaws.com/ksr/projects/883489/photo-small.jpg?1397869394",
"thumb": "https://s3.amazonaws.com/ksr/projects/883489/photo-thumb.jpg?1397869394",
"1024x768": "https://s3.amazonaws.com/ksr/projects/883489/photo-1024x768.jpg?1397869394",
"1536x1152": "https://s3.amazonaws.com/ksr/projects/883489/photo-1536x1152.jpg?1397869394"
},
"creator": {
"id": 749318998,
"name": "April Yvette Thompson",
"avatar": {
"thumb": "https://s3.amazonaws.com/ksr/avatars/9751919/kick_thumb.thumb.jpg?1396128151",
"small": "https://s3.amazonaws.com/ksr/avatars/9751919/kick_thumb.small.jpg?1396128151",
"medium": "https://s3.amazonaws.com/ksr/avatars/9751919/kick_thumb.medium.jpg?1396128151"
},
"urls": {
"web": {
"user": "https://www.kickstarter.com/profile/749318998"
},
"api": {
"user": "https://api.kickstarter.com/v1/users/749318998?signature=1398256877.af4db50c53f93339b05c7813f4534e833eaca270"
}
}
},
"location": {
"id": 2459115,
"name": "New York",
"slug": "new-york-ny",
"short_name": "New York, NY",
"displayable_name": "New York, NY",
"country": "US",
"state": "NY",
"urls": {
"web": {
"discover": "https://www.kickstarter.com/discover/places/new-york-ny",
"location": "https://www.kickstarter.com/locations/new-york-ny"
},
"api": {
"nearby_projects": "https://api.kickstarter.com/v1/discover?signature=1398256786.89b2c4539aeab4ad25982694dd7e659e8c12028f&woe_id=2459115"
}
}
},
"category": {
"id": 17,
"name": "Theater",
"slug": "theater",
"position": 14,
"urls": {
"web": {
"discover": "http://www.kickstarter.com/discover/categories/theater"
}
}
},
"urls": {
"web": {
"project": "https://www.kickstarter.com/projects/749318998/good-bread-alley"
}
}
}
Here is the code I run
#open the json file
require(RJSONIO)
require(rjson)
library("rjson")
filename2 <- "C:/Users/Desktop/in.json"
json_data <- fromJSON(file = filename2)
#unlist the json because it has a problem
unlisted <- unlist(unlist(json_data,recursive=FALSE),recursive=FALSE)
# This was meant to fill in NA, but as I now understand, it only handles nulls that already exist:
# http://stackoverflow.com/questions/16947643/getting-imported-json-data-into-a-data-frame-in-r/16948174#16948174
unlisted <- lapply(unlisted, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})
json <- do.call("rbind", unlisted)
Here is the full list of columns in the output CSV. After that, I list the smaller set of columns I would like to keep from every JSON object.
id
name
blurb
goal
pledged
state
slug
disable_communication
country
currency
currency_symbol
currency_trailing_code
deadline
state_changed_at
created_at
launched_at
backers_count
photo.full
photo.ed
photo.med
photo.little
photo.small
photo.thumb
photo.1024x768
photo.1536x1152
creator.id
creator.name
creator.slug
creator.avatar.thumb
creator.avatar.small
creator.avatar.medium
creator.urls.web.user
creator.urls.api.user
location.id
location.name
location.slug
location.short_name
location.displayable_name
location.country
location.state
location.urls.web.discover
location.urls.web.location
location.urls.api.nearby_projects
category.id
category.name
category.slug
category.position
category.urls.web.discover
category.urls.web.project
category.urls.web.rewards
Here is the list of columns I would like to have in the output CSV:
id
name
blurb
goal
pledged
state
slug
disable_communication
country
currency
currency_symbol
currency_trailing_code
deadline
state_changed_at
created_at
launched_at
backers_count
creator.id
creator.name
creator.slug
location.id
location.name
location.slug
location.short_name
location.displayable_name
location.country
location.state
category.id
category.name
category.slug
category.position
Looks like there's a very similar question (with answer, though not pure R) here: convert json to csv format
However, since you do seem to want most, if not all, the JSON in a "wide CSV" format you can use fromJSON from jsonlite, rbindlist from data.table (which gets you the fill=TRUE parameter to handle uneven lists nicely) and unlist:
library(jsonlite)
library(data.table)
# tell fromJSON we want a list back
json_data <- fromJSON("in.json", simplifyDataFrame=FALSE)
# iterate over the list we have so we can "flatten" it, then
# convert it back to a data.frame-like object
dat <- rbindlist(lapply(json_data, function(x) {
  as.list(unlist(x))
}), fill=TRUE)
You may need to tweak column names, but I think this gets you what you're looking for.
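The key idea behind rbindlist(..., fill=TRUE) is that rows are aligned by column name rather than by position, so a record without creator.slug gets a blank in that column instead of shifting its neighbors. For readers outside R, here is a minimal standard-library Python sketch of the same idea. The helper names flatten and to_csv are hypothetical, and the two records are trimmed from the question's data:

```python
import csv
import io


def flatten(record, prefix=""):
    """Flatten nested dicts into one level with dotted keys (creator.slug, ...)."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def to_csv(records):
    """Return CSV text, leaving a blank where a record lacks a column."""
    rows = [flatten(r) for r in records]
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)  # union of keys, in first-seen order
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Two records trimmed from the question; the second has no creator.slug.
projects = [
    {"id": 301852363, "creator": {"name": "Maridee Slater", "slug": "maridee"}},
    {"id": 967108708, "creator": {"name": "April Yvette Thompson"}},
]
print(to_csv(projects))
```

To keep only a subset of columns, as in the shorter list in the question, the columns list could be filtered before the writer is created.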