I am trying to create a DataFrame from JSON data, but I cannot access multiple objects inside a list; only the first value is being retrieved.
This is my JSON data:
[{'id': '1', 'fnamae': 'Rasab', 'lname': 'Asdaf', 'Age': 21, 'Language': ['python', 'json'], 'parents': {'mother': {'name': 'Mrs. Mother', 'phone': '1212121212'}, 'father': {'name': 'Mr. Father', 'phone': '1212121212'}}, 'siblings': [{'name': 'jamuna', 'phone': 564851312}, {'name': 'Killana', 'phone': 1212121212}]}, {'id': '2', 'fnamae': 'Muddassir', 'lname': 'Jameel', 'Age': 25, 'Language': ['React', 'json'], 'parents': {'mother': {'name': 'Mrs. Mutherinlaw', 'phone': 9654512}, 'father': {'name': 'Mr. Futherinlaw', 'phone': 53154278}}, 'siblings': [{'name': 'Giallan', 'phone': 998742568}, {'name': 'Simba', 'phone': 12355875}]}, {'id': '3', 'fnamae': 'Farhan', 'lname': 'Akhtar', 'Age': 25, 'Language': ['Drupal', 'PHP'], 'parents': {'mother': {'name': 'Heung min son', 'phone': 89546487}, 'father': {'name': 'Kane', 'phone': 4564823545}}, 'siblings': [{'name': 'Xamcs', 'phone': 78654325}, {'name': 'sinfbad', 'phone': 45648232}]}]
And this is my code to access the "siblings" list from the JSON and create a DataFrame:
s = l['siblings']
df2 = pd.DataFrame(s.str[0].values.tolist())
df2
But the output is:
name phone
0 jamuna 564851312
1 Giallan 998742568
2 Xamcs 78654325
My expected output would include all of the siblings' names and phone numbers:
name phone
0 [jamuna,Killana] 564851312,468451
1 [Giallan,Simba] 998742568,654684
2 [Xamcs, sinfbad] 786543254,654654
When I change my code to s.str[1], I am able to retrieve the second item of each list. But how do I iterate over all of them?
You're going to have to do a nested list comprehension:
import pandas as pd
pd.DataFrame(
    {
        key: [[j[key] for j in i["siblings"]] for i in json_content]
        for key in ["name", "phone"]
    }
)
This will give you
| | name | phone |
|---:|:----------------------|:------------------------|
| 0 | ['jamuna', 'Killana'] | [564851312, 1212121212] |
| 1 | ['Giallan', 'Simba'] | [998742568, 12355875] |
| 2 | ['Xamcs', 'sinfbad'] | [78654325, 45648232] |
Use a list comprehension to derive the output:
pd.DataFrame([d for l in json_content for d in l['siblings']])
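If, instead, you want each person's siblings gathered into lists, as in the expected output, a sketch along these lines may help (it uses a trimmed-down stand-in for the question's json_content, keeping only the fields involved):

```python
import pandas as pd

# Trimmed-down stand-in for the question's json_content
json_content = [
    {"id": "1", "siblings": [{"name": "jamuna", "phone": 564851312},
                             {"name": "Killana", "phone": 1212121212}]},
    {"id": "2", "siblings": [{"name": "Giallan", "phone": 998742568},
                             {"name": "Simba", "phone": 12355875}]},
]

# One row per sibling, keeping the original record number as the index...
df = pd.DataFrame(json_content).explode("siblings")
flat = pd.json_normalize(df["siblings"].tolist()).set_index(df.index)
# ...then gather names and phones back into lists per record
out = flat.groupby(level=0).agg(list)
print(out)
```

The explode/re-aggregate step generalizes to any number of siblings per record, unlike indexing with s.str[0], s.str[1], and so on.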
Related
I need to insert a list of dicts into one cell of an SQLite table and be able to get it back as it is: a list of dicts. There is an option to insert it as a JSON string. SQLite accepts it but returns a 'malformed node or string' error. I'm using PyCharm. I tried ast.literal_eval() to transform it back to a list, with the same error. And why all those brackets, and the comma at the end? Would appreciate any help.
import json, sqlite3, pprint
list1 = [{'id': '9', 'name': 'Buiding'},
{'id': '10', 'name': 'Security'},
{'id': '11', 'name': 'Mass Media'},
{'id': '12', 'name': 'Consulting'},
{'id': '13', 'name': 'Medical care'}]
conn = sqlite3.connect('test.db')
c = conn.cursor()
c.executescript("DROP TABLE IF EXISTS result; CREATE TABLE result (data json)")
c.execute("insert into result values (?)", [json.dumps(list1)])
conn.commit()
a = c.execute("select json_extract (data, '$') FROM result;").fetchall()
conn.close()
pprint.pprint(a)
[('[{"id":"9","name":"Buiding"},{"id":"10","name":"Security"},
{"id":"11","name":"Mass Media"},{"id":"12","name":"Consulting"},
{"id":"13","name":"Medical care"}]',)]
Do not use json_extract! You insert the list as a string via json.dumps, read it back as a string into a str variable, and unpack it via json.loads.
import json, sqlite3, pprint
list1 = [{'id': '9', 'name': 'Buiding'},
{'id': '10', 'name': 'Security'},
{'id': '11', 'name': 'Mass Media'},
{'id': '12', 'name': 'Consulting'},
{'id': '13', 'name': 'Medical care'}]
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.executescript("DROP TABLE IF EXISTS result; CREATE TABLE result (data json)")
c.execute("insert into result(data) values (?)", [json.dumps(list1)])
conn.commit()
a = json.loads(c.execute("select data FROM result;").fetchone()[0])
conn.close()
pprint.pprint(a == list1)
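A further option (a sketch, not part of the original answer): the sqlite3 module can do the dumps/loads round trip for you if you register an adapter and a converter for a column declared with type json:

```python
import json
import sqlite3

# Store Python lists as JSON text, and turn "json"-typed columns back
# into Python objects on the way out.
sqlite3.register_adapter(list, json.dumps)
sqlite3.register_converter("json", lambda b: json.loads(b.decode("utf-8")))

list1 = [{'id': '9', 'name': 'Buiding'}, {'id': '10', 'name': 'Security'}]

# PARSE_DECLTYPES makes sqlite3 consult the declared column type ("json")
conn = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
c = conn.cursor()
c.execute("CREATE TABLE result (data json)")
c.execute("INSERT INTO result (data) VALUES (?)", (list1,))
conn.commit()
roundtripped = c.execute("SELECT data FROM result").fetchone()[0]
conn.close()
print(roundtripped == list1)
```

With this in place the insert and select no longer need explicit json.dumps/json.loads calls at each call site.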
I have a dataframe of 20k rows x 45 columns that has been normalized nearly fully, but I have one pesky column in particular.
I have copied just the index and the problem column, omitting the other 44 columns for simplicity in data display.
agencies
0 [{'id': 29, 'name': 'Air Force, Dept of'}, {'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'}, {'id': 1, 'name': 'SENATE'}]
1 [{'id': 29, 'name': 'Air Force, Dept of'}, {'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'}, {'id': 1, 'name': 'SENATE'}]
2 [{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'}, {'id': 1, 'name': 'SENATE'}]
3 [{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'}, {'id': 1, 'name': 'SENATE'}]
4 [{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'}, {'id': 1, 'name': 'SENATE'}]
Here, I would like to extract each of the values under the name key; however, they all have the same key, so json_normalize() puts them all in the same column and lengthens the dataset by however many entries are in each array.
I would like to extract them into name_1, name_2, name_3, ... , name_max_amount_of_names. So let's suppose the max amount of name entries in the column is 5, I would like to have:
name_1, name_2, name_3, name_4, name_5.
I have tried normalization and cannot figure this out further.
Thank you in advance.
EDIT:
Thanks to the kind answer below, I'm close; however, it seems to be creating a new column for each unique 'name', and that's not what I was trying to accomplish, as it clutters the data with many NaNs.
I have included a screenshot of the results.
Try as follows.
We apply Series.explode to get each item from each list on a separate row (but still with the appropriate index number).
We wrap this result inside pd.json_normalize to get a flat table.
We now need to set a new index (with apply(pd.Series) we wouldn't have this problem) with the exploded index values (so: .set_index(df.agencies.explode().index)).
Finally, we use df.pivot to get the data in the correct shape.
Now, we are basically done, except for renaming the df.columns.
import pandas as pd
data = {'agencies':
{0: [{'id': 29, 'name': 'Air Force, Dept of'},
{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'},
{'id': 1, 'name': 'SENATE'}],
1: [{'id': 29, 'name': 'Air Force, Dept of'},
{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'},
{'id': 1, 'name': 'SENATE'}],
2: [{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'},
{'id': 1, 'name': 'SENATE'}],
3: [{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'},
{'id': 1, 'name': 'SENATE'}],
4: [{'id': 2, 'name': 'HOUSE OF REPRESENTATIVES'},
{'id': 1, 'name': 'SENATE'}]}}
df = pd.DataFrame(data)
df_names = pd.json_normalize(df.agencies.explode())\
.set_index(df.agencies.explode().index).pivot(
index=None,columns='id', values='name')
# order of column names will be:
# sorted(pd.json_normalize(df.agencies.explode())\
# .set_index(df.agencies.explode().index)['id'].unique())
# i.e.: [1, 2, 29]
# (reorder them as appropriate, and then) overwrite as name_1, name_2, name_3
df_names.columns = [f'name_{idx}' for idx in range(1, len(df_names.columns)+1)]
print(df_names)
name_1 name_2 name_3
0 SENATE HOUSE OF REPRESENTATIVES Air Force, Dept of
1 SENATE HOUSE OF REPRESENTATIVES Air Force, Dept of
2 SENATE HOUSE OF REPRESENTATIVES NaN
3 SENATE HOUSE OF REPRESENTATIVES NaN
4 SENATE HOUSE OF REPRESENTATIVES NaN
# assignment to orig df would be:
# df = pd.concat([df,df_names],axis=1)
Update
The OP has updated the question. Let's produce a small example to clarify the apparent problem. The adjusted data is as follows:
import pandas as pd
data = {'agencies':
{0: [{'id': 29, 'name': 'Air Force, Dept of 29'},
{'id': 1, 'name': 'SENATE'},
{'id': 4, 'name': 'Air Force, Dept of 4'},],
1: [{'id': 2, 'name': 'Air Force, Dept of 2'},
{'id': 1, 'name': 'SENATE'}]
}}
So, here we have 3 unmatched key-value pairs: 'id': 2, 4, and 29. Applying the method described above, we will end up with this:
name_1 name_2 name_3 name_4
0 SENATE NaN Air Force, Dept of 4 Air Force, Dept of 29
1 SENATE Air Force, Dept of 2 NaN NaN
Here, the names associated with id: 1 work fine (name_1), because this key is found in both lists of dicts. However, the other name keys all lack a "match" in the other list, so they each end up with their own column, in consecutive order (based on the ids). I.e. name_2 holds names associated with 'id': 2, then name_3 for 4, and name_4 for 29.
If I understand the update correctly, the OP rather wishes to "use up" each new consecutive name column with name-keys as much as possible, before creating a new column. I.e., in the current example, this would mean that name_2 is to be filled with the name for 'id': 4 in row 0, and 'id': 2 in row 1. And then only the name for 'id': 29 will get its own column (name_3), since name_2 is already "full". We can achieve this quite easily by adding an intermediate step:
df = pd.DataFrame(data)
first = pd.json_normalize(df.agencies.explode())
second = first.set_index(df.agencies.explode().index)
# rank all `ids` per group, and overwrite the original `ids`
# i.e. [1, 4, 29] -> [1, 2, 3]
second['id'] = second.groupby(level=0)['id'].rank()
final = second.pivot(index=None,columns='id', values='name')
final.columns = [f'name_{idx}' for idx in range(1, len(final.columns)+1)]
print(final)
name_1 name_2 name_3
0 SENATE Air Force, Dept of 4 Air Force, Dept of 29
1 SENATE Air Force, Dept of 2 NaN
Edit: sample JSON of the details column:
{6591: '[]',
8112: "[{'name': 'start', 'time': 1659453223851}, {'name': 'arrival', 'time': 1659454209024, 'location': [-73.7895605, 40.6869539]}, {'name': 'departure', 'time': 1659453289013, 'location': [-73.8124575, 40.7091602]}]",
5674: '[]',
4236: '[]',
3148: "[{'name': 'start', 'time': 1659121571280}, {'name': 'arrival', 'time': 1659122768105, 'location': [-74.220351348, 40.748419051]}, {'name': 'departure', 'time': 1659121605076, 'location': [-74.189452444, 40.715865856]}]",
3408: "[{'name': 'start', 'time': 1659113772531}, {'name': 'arrival', 'time': 1659114170204, 'location': [-73.9469142, 40.671488]}, {'name': 'departure', 'time': 1659113832693, 'location': [-73.956379, 40.6669802]}]",
1438: '[]',
3634: '[]',
5060: "[{'name': 'start', 'time': 1659190337964}, {'name': 'arrival', 'time': 1659190367182, 'location': [-76.614058283, 39.292697049]}, {'name': 'departure', 'time': 1659190345722, 'location': [-76.614058283, 39.292697049]}]",
6614: '[]',
7313: '[]',
7653: '[]',
9446: '[]',
1237: '[]',
6974: "[{'name': 'start', 'time': 1659383554887}, {'name': 'adminCompletion', 'time': 1659386192031, 'data': {'adminId': 'ZFQCAL6aeS', 'sendNotificationFromAdminComplete': False}}, {'name': 'arrival', 'time': 1659385764198, 'location': [-73.943001009, 40.705886527]}, {'name': 'departure', 'time': 1659383653199, 'location': [-73.94038015, 40.814893186]}]",
762: '[]',
4843: '[]',
8682: '[]',
7271: '[]',
4672: "[{'name': 'start', 'time': 1659131562088}, {'name': 'arrival', 'time': 1659131937387, 'location': [-87.62621, 41.9015626]}, {'name': 'departure', 'time': 1659131637316, 'location': [-87.6263294, 41.9094856]}]"}
I have a DataFrame with columns like 'details' and 'id'. It looks like this. I want to completely flatten the details column.
details id
[{'name': 'start', 'time': 1659479418}, {'name': 'arrival', 'time': 1659452651073, 'location': [-75.040536278, 40.034055]}, {'name': 'departure', 'time': 1659451650, 'location': [-75.1609003, 39.947729034]}] 1
[] 2
[] 3
[{'name': 'start', 'time': 1659126581459}, {'name': 'arrival', 'time': 1659128206850, 'location': [-80.3165751, 25.8625698]}, {'name': 'departure', 'time': 1659126641679, 'location': [-80.2511886, 25.921769]}] 4
[{'name': 'start', 'time': 1659120813100}, {'name': 'arrival', 'time': 1659121980125, 'location': [-76.642292, 39.307895253]}, {'name': 'departure', 'time': 1659120903093, 'location': [-76.741190426, 39.34240617]}] 5
[] 6
[] 7
[{'name': 'start', 'time': 1659217203753}, {'name': 'adminCompletion', 'time': 1659217336224, 'data': {'adminId': '~R~WZt7bKO979BRTqHyarS2p', 'sendNotification': False}}, {'name': 'arrival', 'time': 1659217308939, 'location': [-73.941830752, 40.702405857]}, {'name': 'departure', 'time': 1659217288936, 'location': [-73.941830752, 40.702405857]}] 8
[{'name': 'start', 'time': 1659189824814}, {'name': 'arrival', 'time': 1659191937100, 'location': [-76.406627, 39.984]}, {'name': 'departure', 'time': 1659189915191, 'location': [-76.614515552, 39.292407218]}] 9
[] 10
What is expected from this is:
start_time admincompletiontime adminId sendnotification arrival_time arrival_location departure_time departure_location id
1659479418 1.65945E+12 [-75.040536278, 40.034055] 1659451650 [-75.1609003, 39.947729034] 1
2
3
1.65913E+12 1.65913E+12 [-80.3165751, 25.8625698] 1.65913E+12 [-80.2511886, 25.921769] 4
1.65922E+12 1.65922E+12 ~R~WZt7bKO979BRTqHyarS2p FALSE 1.65922E+12 [-73.941830752, 40.702405857] 1.65922E+12 [-73.941830752, 40.702405857] 8
I want to extract all the fields that are passed as values. pd.json_normalize() did not work for me in this case. Please advise.
Your data is pretty scuffed and needs cleaning up, but following a pattern like this should start you in the right direction:
from ast import literal_eval
data = {key:literal_eval(value) for key, value in data.items()}
data = [[{y['name']:{'time':y['time'],'location':y.get('location')}} for y in x] for x in data.values() if x]
df = pd.concat([pd.json_normalize(x) for x in data])
df = (df.dropna(how='all', axis=1)
.bfill()
.dropna()
.drop_duplicates('start.time')
.reset_index(drop=True))
print(df)
Output:
start.time arrival.time arrival.location departure.time departure.location adminCompletion.time
0 1.659453e+12 1.659454e+12 [-73.7895605, 40.6869539] 1.659453e+12 [-73.8124575, 40.7091602] 1.659386e+12
1 1.659122e+12 1.659454e+12 [-73.7895605, 40.6869539] 1.659453e+12 [-73.8124575, 40.7091602] 1.659386e+12
2 1.659114e+12 1.659123e+12 [-74.220351348, 40.748419051] 1.659122e+12 [-74.189452444, 40.715865856] 1.659386e+12
3 1.659190e+12 1.659114e+12 [-73.9469142, 40.671488] 1.659114e+12 [-73.956379, 40.6669802] 1.659386e+12
4 1.659384e+12 1.659190e+12 [-76.614058283, 39.292697049] 1.659190e+12 [-76.614058283, 39.292697049] 1.659386e+12
5 1.659132e+12 1.659386e+12 [-73.943001009, 40.705886527] 1.659384e+12 [-73.94038015, 40.814893186] 1.659386e+12
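Note that the bfill/dropna chain above appears to mix values across records (compare the arrival times against the start times row by row). If keeping each row tied to its original id matters, a hedged alternative sketch (shown on a two-entry subset of the sample data) builds one flat dict per id before normalizing:

```python
import pandas as pd
from ast import literal_eval

# Two entries from the sample 'details' mapping (one empty)
data = {
    8112: "[{'name': 'start', 'time': 1659453223851}, "
          "{'name': 'arrival', 'time': 1659454209024, 'location': [-73.7895605, 40.6869539]}, "
          "{'name': 'departure', 'time': 1659453289013, 'location': [-73.8124575, 40.7091602]}]",
    5674: '[]',
}

rows = {}
for key, raw in data.items():
    events = literal_eval(raw)  # each cell is a stringified list
    flat = {}
    for ev in events:
        prefix = ev['name']
        for field, value in ev.items():
            if field != 'name':
                flat[f'{prefix}.{field}'] = value
    rows[key] = flat

# Empty lists become all-NaN rows, and the id survives as the index
df = pd.json_normalize(list(rows.values())).set_index(
    pd.Index(list(rows.keys()), name='id'))
print(df)
```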
I have a large Excel file (100 MB) with multiple worksheets.
sheet A
id | name | address
1 | joe | A
2 | gis | B
3 | leo | C
work_1
id| call
1 | 10
1 | 8
2 | 1
3 | 3
work_2
id| call
2 | 4
3 | 8
3 | 7
Desired JSON for each id:
data = { id: 1,
address: A,
name: Joe,
log : [{call:10}, {call:8 }]
}
data= { id: 2,
address: B,
name: Gis,
log : [{call:1}, {call:4}]
}
data= { id: 3,
address: C,
name: Leo,
log : [{call:3}, {call:8}, {call:7}]
}
I've tried with pandas, but it takes 5 minutes just to run read_excel, without any processing. Is there any way to make it faster, and how do I get the desired JSON?
Maybe divide the process into chunks (but pandas removed chunksize for read_excel) and add some threading, so progress could be printed for each batch?
You can do:
works=pd.concat([work1,work2],ignore_index=True)
mapper_works=works.groupby('id')[['call']].apply(lambda x: x.to_dict('records'))
dfa['log']=dfa['id'].map(mapper_works)
data=dfa.reindex(columns=['id','address','name','log']).to_dict('records')
print(data)
The output is a list of dict for each id:
[{'id': 1, 'address': 'A', 'name': 'joe', 'log': [{'call': 10}, {'call': 8}]},
{'id': 2, 'address': 'B', 'name': 'gis', 'log': [{'call': 1}, {'call': 4}]},
{'id': 3, 'address': 'C', 'name': 'leo', 'log': [{'call': 3}, {'call': 8}, {'call': 7}]}
]
If you want you can assign to a column:
dfa['dicts']=data
print(dfa)
id name address log \
0 1 joe A [{'call': 10}, {'call': 8}]
1 2 gis B [{'call': 1}, {'call': 4}]
2 3 leo C [{'call': 3}, {'call': 8}, {'call': 7}]
dicts
0 {'id': 1, 'address': 'A', 'name': 'joe', 'log'...
1 {'id': 2, 'address': 'B', 'name': 'gis', 'log'...
2 {'id': 3, 'address': 'C', 'name': 'leo', 'log'...
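Put together as a runnable sketch, with small in-memory frames standing in for the worksheets (with a real file, a single pd.read_excel(path, sheet_name=None) call parses every sheet in one pass and returns a dict of DataFrames keyed by sheet name, which avoids re-opening the 100 MB file per sheet):

```python
import pandas as pd

# In-memory stand-ins for 'sheet A', 'work_1' and 'work_2'
# (in practice: sheets = pd.read_excel('file.xlsx', sheet_name=None))
dfa = pd.DataFrame({'id': [1, 2, 3],
                    'name': ['joe', 'gis', 'leo'],
                    'address': ['A', 'B', 'C']})
work_1 = pd.DataFrame({'id': [1, 1, 2, 3], 'call': [10, 8, 1, 3]})
work_2 = pd.DataFrame({'id': [2, 3, 3], 'call': [4, 8, 7]})

# Stack the work sheets, build a per-id list of {'call': ...} dicts,
# attach it as 'log', then emit one dict per id
works = pd.concat([work_1, work_2], ignore_index=True)
mapper = works.groupby('id')[['call']].apply(lambda x: x.to_dict('records'))
dfa['log'] = dfa['id'].map(mapper)
data = dfa.reindex(columns=['id', 'address', 'name', 'log']).to_dict('records')
print(data)
```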
I have a nested JSON data set, example below, where the attributes vary by well. How can I normalize this data into a DataFrame when the keys change from case to case? I'd like rows that don't have a given key to show null for it.
{'WellID': 3,
 'Attributes': [
     {'Name': 'xxx', 'Value': 'yyy'},
     ...
 ]}
Sample Data:
[{'WellID': 3,
'Attributes': [{'Name': 'Production Start Date',
'Value': '5/17/2012 12:00:00 AM'},
{'Name': 'Latitude', 'Value': '36.594260510'},
{'Name': 'Longitude', 'Value': '-97.706833870'},
{'Name': 'Has Plunger', 'Value': 'True'},
{'Name': 'Has Flare', 'Value': 'True'},
{'Name': 'Has VRU', 'Value': 'True'},
{'Name': 'State', 'Value': 'OK'},
{'Name': 'Country', 'Value': 'USA'},
{'Name': 'County', 'Value': '047'},
{'Name': 'Alcohol Injector', 'Value': 'False'},
{'Name': 'Shut In', 'Value': 'False'},
{'Name': 'Active', 'Value': 'True'}]},
{'WellID': 4,
'Attributes': [{'Name': 'Production Start Date',
'Value': '5/31/2012 12:00:00 AM'},
{'Name': 'Latitude', 'Value': '36.564503337'},
{'Name': 'Longitude', 'Value': '-97.600837012'},
{'Name': 'State', 'Value': 'OK'},
{'Name': 'Country', 'Value': 'USA'},
{'Name': 'County', 'Value': '047'},
{'Name': 'Alcohol Injector', 'Value': 'False'},
{'Name': 'Shut In', 'Value': 'False'},
{'Name': 'Active', 'Value': 'True'}]},
{'WellID': 5,
'Attributes': [{'Name': 'Production Start Date',
'Value': '8/18/2012 12:00:00 AM'},
{'Name': 'Latitude', 'Value': '36.592378770'},
{'Name': 'Longitude', 'Value': '-97.725740930'},
{'Name': 'Has Plunger', 'Value': 'True'},
{'Name': 'Has Flare', 'Value': 'True'},
{'Name': 'Has VRU', 'Value': 'True'},
{'Name': 'State', 'Value': 'OK'},
{'Name': 'Country', 'Value': 'USA'},
{'Name': 'County', 'Value': '047'},
{'Name': 'Alcohol Injector', 'Value': 'False'},
{'Name': 'Shut In', 'Value': 'True'},
{'Name': 'Active', 'Value': 'True'}]},
{'WellID': 6,
'Attributes': [{'Name': 'Latitude', 'Value': '36.572665500'},
{'Name': 'Longitude', 'Value': '-97.672614600'},
{'Name': 'State', 'Value': 'OK'},
{'Name': 'Country', 'Value': 'USA'},
{'Name': 'County', 'Value': '047'},
{'Name': 'Alcohol Injector', 'Value': 'False'},
{'Name': 'Shut In', 'Value': 'False'},
{'Name': 'Active', 'Value': 'True'}]},
{'WellID': 7,
'Attributes': [{'Name': 'Latitude', 'Value': '36.562985200'},
{'Name': 'Longitude', 'Value': '-97.617945400'},
{'Name': 'State', 'Value': 'OK'},
{'Name': 'Country', 'Value': 'USA'},
{'Name': 'County', 'Value': '047'},
{'Name': 'Alcohol Injector', 'Value': 'False'},
{'Name': 'Shut In', 'Value': 'False'},
{'Name': 'Active', 'Value': 'True'}]}]
I tried to use this statement:
result = json_normalize(subset, 'Attributes', ['WellID'], errors='ignore')
But it results in following which isn't flat:
Name Value WellID
0 Production Start Date 5/17/2012 12:00:00 AM 3
1 Latitude 36.594260510 3
2 Longitude -97.706833870 3
3 Has Plunger True 3
4 Has Flare True 3
5 Has VRU True 3
6 State OK 3
7 Country USA 3
8 County 047 3
9 Alcohol Injector False 3
10 Shut In False 3
11 Active True 3
12 Production Start Date 5/31/2012 12:00:00 AM 4
13 Latitude 36.564503337 4
14 Longitude -97.600837012 4
15 State OK 4
16 Country USA 4
17 County 047 4
18 Alcohol Injector False 4
19 Shut In False 4
20 Active True 4
21 Production Start Date 8/18/2012 12:00:00 AM 5
22 Latitude 36.592378770 5
23 Longitude -97.725740930 5
24 Has Plunger True 5
25 Has Flare True 5
26 Has VRU True 5
27 State OK 5
28 Country USA 5
29 County 047 5
30 Alcohol Injector False 5
31 Shut In True 5
32 Active True 5
33 Latitude 36.572665500 6
34 Longitude -97.672614600 6
35 State OK 6
36 Country USA 6
37 County 047 6
38 Alcohol Injector False 6
39 Shut In False 6
40 Active True 6
41 Latitude 36.562985200 7
42 Longitude -97.617945400 7
43 State OK 7
44 Country USA 7
45 County 047 7
46 Alcohol Injector False 7
47 Shut In False 7
48 Active True 7
Please advise on how to get it into the following format:
Well ID | Latitude | Longitude | State | .... etc
I now have a dataset that has multiple fields on the Well ID label. Is there a way to get all those fields into the data frame without manually typing them all in?
Thanks,
You may try .pivot after json_normalize.
import pandas as pd

# json_normalize moved to the top level in pandas >= 1.0;
# in older versions: from pandas.io.json import json_normalize
df1 = pd.json_normalize(your_data, meta=['WellID'], record_path=['Attributes'])
df2 = df1.pivot(index='WellID', columns='Name', values='Value')
print(df2)
# Output
# Name Active Alcohol Injector Country County Has Flare Has Plunger Has VRU \
# WellID
# 3 True False USA 047 True True True
# 4 True False USA 047 None None None
# 5 True False USA 047 True True True
# 6 True False USA 047 None None None
# 7 True False USA 047 None None None
#
# Name Latitude Longitude Production Start Date Shut In State
# WellID
# 3 36.594260510 -97.706833870 5/17/2012 12:00:00 AM False OK
# 4 36.564503337 -97.600837012 5/31/2012 12:00:00 AM False OK
# 5 36.592378770 -97.725740930 8/18/2012 12:00:00 AM True OK
# 6 36.572665500 -97.672614600 None False OK
# 7 36.562985200 -97.617945400 None False OK
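One caveat (an addition, not part of the original answer): df.pivot raises ValueError if any (WellID, Name) pair occurs twice. If duplicates are possible, pivot_table with aggfunc='first' is a more forgiving variant, sketched here on a small hypothetical subset:

```python
import pandas as pd

# Hypothetical subset with a duplicated attribute for WellID 3
records = [
    {'WellID': 3, 'Attributes': [{'Name': 'State', 'Value': 'OK'},
                                 {'Name': 'State', 'Value': 'OK'},
                                 {'Name': 'Active', 'Value': 'True'}]},
    {'WellID': 4, 'Attributes': [{'Name': 'State', 'Value': 'OK'}]},
]

df_long = pd.json_normalize(records, record_path='Attributes', meta=['WellID'])
# pivot would raise on the duplicate (WellID, Name) pair; pivot_table
# with aggfunc='first' keeps the first value instead
wide = df_long.pivot_table(index='WellID', columns='Name',
                           values='Value', aggfunc='first')
print(wide)
```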
Did you just want to pivot your result DataFrame? If so, here is a minimal example of how to do that.
Create data in a long table format, similar to your normalized json:
import pandas as pd
data = pd.DataFrame({'name': ['lat', 'long', 'country', 'active', 'state'], 'value': [90, 90, 'US', True, 'OH'], 'id': 2})
data
Then to pivot, use:
pivoted = data.pivot(index='id', columns='name')
pivoted
This gives a wide table with one row per id and one column per name.