Python: JSON to Dictionary - json

Two examples for a JSON request. Both examples should have the correct JSON syntax, yet only the second version seems to be translatable to a dictionary.
#doesn't work
string_js3 = """{"employees": [
{
"FNAME":"FTestA",
"LNAME":"LTestA",
"SSN":6668844441
},
{
"FNAME":"FTestB",
"LNAME":"LTestB",
"SSN":6668844442
}
]}
"""
#works
string_js4 = """[
{
"FNAME":"FTestA",
"LNAME":"LTestA",
"SSN":6668844441
},
{
"FNAME":"FTestB",
"LNAME":"LTestB",
"SSN":6668844442
}]
"""
This gives an error, while the same with string_js4 works
L1 = json.loads(string_js3)
print(L1[0]['FNAME'])
So I have 2 questions:
1) Why doesn't the first version work
2) Is there a simple way to make the first version also work?

Both of these strings are valid JSON. Where you are getting stuck is in how you are accessing the resulting data structures.
L1 (from string_js3) is a (nested) dict;
L2 (from string_js4) is a list of dicts.
Walkthrough:
import json
string_js3 = """{
"employees": [{
"FNAME": "FTestA",
"LNAME": "LTestA",
"SSN": 6668844441
},
{
"FNAME": "FTestB",
"LNAME": "LTestB",
"SSN": 6668844442
}
]
}"""
string_js4 = """[{
"FNAME": "FTestA",
"LNAME": "LTestA",
"SSN": 6668844441
},
{
"FNAME": "FTestB",
"LNAME": "LTestB",
"SSN": 6668844442
}
]"""
L1 = json.loads(string_js3)
L2 = json.loads(string_js4)
The resulting objects:
L1
{'employees': [{'FNAME': 'FTestA', 'LNAME': 'LTestA', 'SSN': 6668844441},
{'FNAME': 'FTestB', 'LNAME': 'LTestB', 'SSN': 6668844442}]}
L2
[{'FNAME': 'FTestA', 'LNAME': 'LTestA', 'SSN': 6668844441},
{'FNAME': 'FTestB', 'LNAME': 'LTestB', 'SSN': 6668844442}]
type(L1), type(L2)
(dict, list)
1) Why doesn't the first version work?
Because L1[0] asks the dictionary for the value stored under the key 0, and that key doesn't exist, so Python raises a KeyError. From the docs, "It is an error to extract a value using a non-existent key." L1 is a dictionary with just one key:
L1.keys()
dict_keys(['employees'])
2) Is there a simple way to make the first version also work?
There are several ways, but it ultimately depends on what your larger problem looks like. I'm going to assume you want to modify the Python code rather than the JSON files/strings themselves. You could do:
L3 = L1['employees'].copy()
You now have a list of dictionaries that resembles L2:
L3
[{'FNAME': 'FTestA', 'LNAME': 'LTestA', 'SSN': 6668844441},
{'FNAME': 'FTestB', 'LNAME': 'LTestB', 'SSN': 6668844442}]
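If a copy is not needed, the nested list can also be indexed directly. A minimal sketch, assuming the same string_js3 data as in the question (the variable names first_name, all_first_names and lookup_failed are only illustrative):

```python
import json

string_js3 = """{"employees": [
    {"FNAME": "FTestA", "LNAME": "LTestA", "SSN": 6668844441},
    {"FNAME": "FTestB", "LNAME": "LTestB", "SSN": 6668844442}
]}"""

L1 = json.loads(string_js3)

# Index by the dictionary key first, then by list position.
first_name = L1["employees"][0]["FNAME"]

# Collect one field from every employee record.
all_first_names = [employee["FNAME"] for employee in L1["employees"]]

# Integer lookup on the outer dict fails because 0 is not one of its keys.
try:
    L1[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(first_name, all_first_names, lookup_failed)
```

The order of the subscripts follows the nesting of the JSON: braces become dicts (index by key), brackets become lists (index by position).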

Related

Pulling specific Parent/Child JSON data with Python

I'm having a difficult time figuring out how to pull specific information from a json file.
So far I have this:
# Import json library
import json
# Open json database file
with open('jsondatabase.json', 'r') as f:
    data = json.load(f)
# assign variables from json data and convert to usable information
identifier = data['ID']
identifier = str(identifier)
name = data['name']
name = str(name)
# Collect data from user to compare with data in json file
print("Please enter your numerical identifier and name: ")
user_id = input("Numerical identifier: ")
user_name = input("Name: ")
if user_id == identifier and user_name == name:
    print("Your inputs matched. Congrats.")
else:
    print("Your inputs did not match our data. Please try again.")
And that works great for a simple JSON file like this:
{
"ID": "123",
"name": "Bobby"
}
But ideally I need to create a more complex JSON file and can't find deeper information on how to pull specific information from something like this:
{
"Parent": [
{
"Parent_1": [
{
"Name": "Bobby",
"ID": "123"
}
],
"Parent_2": [
{
"Name": "Linda",
"ID": "321"
}
]
}
]
}
Here is an example that you might be able to pick apart.
You could either:
1) Write a custom de-jsonify object_hook, as shown below, that converts each decoded JSON object while the text is parsed, and then work with the result.
2) Load the whole dictionary without a custom de-jsonify hook, drill down into it, and build a list or set of the results. (not shown)
Example:
import json
from collections import namedtuple
data = '''
{
"Parents":
[
{
"Name": "Bobby",
"ID": "123"
},
{
"Name": "Linda",
"ID": "321"
}
]
}
'''
Parent = namedtuple('Parent', ['name', 'id'])
def dejsonify(json_str: dict):
    if json_str.get("Name"):
        parent = Parent(json_str.get('Name'), int(json_str.get('ID')))
        return parent
    return json_str
res = json.loads(data, object_hook=dejsonify)
print(res)
# then we can do whatever... if you need lookups by name/id,
# we could put the result into a dictionary
all_parents = {(p.name, p.id) : p for p in res['Parents']}
lookup_from_input = ('Bobby', 123)
print(f'found match: {all_parents.get(lookup_from_input)}')
Result:
{'Parents': [Parent(name='Bobby', id=123), Parent(name='Linda', id=321)]}
found match: Parent(name='Bobby', id=123)
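The other option, drilling down into the plain dictionary without an object_hook, might look roughly like this. It is only a sketch, assuming the nested "Parent" / "Parent_1" / "Parent_2" layout from the question; the records dictionary and the matches helper are hypothetical names introduced for illustration.

```python
import json

# The nested structure from the question.
text = """
{
  "Parent": [
    {
      "Parent_1": [{"Name": "Bobby", "ID": "123"}],
      "Parent_2": [{"Name": "Linda", "ID": "321"}]
    }
  ]
}
"""

data = json.loads(text)

# Flatten to {(name, id): record}. Each level is either a list to iterate
# or a dict to index, mirroring the brackets and braces in the JSON.
records = {}
for group in data["Parent"]:              # list of dicts
    for label, people in group.items():   # "Parent_1" -> list of records
        for person in people:
            records[(person["Name"], person["ID"])] = person

def matches(user_name, user_id):
    """Return True if the (name, id) pair exists in the data (hypothetical helper)."""
    return (user_name, user_id) in records

print(matches("Bobby", "123"))
print(matches("Bobby", "321"))
```

Here the IDs stay strings, as they are in the JSON, so they compare directly against what input() returns.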

Dask how to open json with list of dicts

I'm trying to open a bunch of JSON files using read_json in order to get a DataFrame as follows:
ddf.compute()
id owner pet_id
0 1 "Charlie" "pet_1"
1 2 "Charlie" "pet_2"
3 4 "Buddy" "pet_3"
but I'm getting the following error
_meta = pd.DataFrame(
    columns=["id", "owner", "pet_id"]
).astype({
    "id": int,
    "owner": "object",
    "pet_id": "object"
})
ddf = dd.read_json("mypets/*.json", meta=_meta)
ddf.compute()
*** ValueError: Metadata mismatch found in `from_delayed`.
My JSON files look like this:
[
{
"id": 1,
"owner": "Charlie",
"pet_id": "pet_1"
},
{
"id": 2,
"owner": "Charlie",
"pet_id": "pet_2"
}
]
As far as I understand, the problem is that each file contains a list of dicts, so I'm looking for the right way to specify that in the meta= argument.
PS:
I also tried doing it in the following way
{
"id": [1, 2],
"owner": ["Charlie", "Charlie"],
"pet_id": ["pet_1", "pet_2"]
}
But Dask is wrongly interpreting the data
ddf.compute()
id owner pet_id
0 [1, 2] ["Charlie", "Charlie"] ["pet_1", "pet_2"]
1 [4] ["Buddy"] ["pet_3"]
The invocation you want is the following:
dd.read_json("data.json", meta=meta,
blocksize=None, orient="records",
lines=False)
which can be largely gleaned from the docstring.
meta looks OK from your code
blocksize must be None, since you have a whole JSON object per file and cannot split the file
orient "records" means list of objects
lines=False means this is not a line-delimited JSON file, which is the more common case for Dask (you are not assuming that a newline character means a new record)
So why the error? Probably Dask split your file on some newline character, and so a partial record got parsed, which therefore did not match your given meta.
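Why a newline split breaks a pretty-printed file can be illustrated with the standard library alone. This is not Dask's implementation, just a sketch of the idea: parse_per_line is a hypothetical helper that treats every line as its own JSON document, which is roughly what a line-delimited reader assumes.

```python
import json

# A pretty-printed JSON array: one JSON value spanning many lines,
# like the files in the question.
array_text = """[
    {
        "id": 1,
        "owner": "Charlie",
        "pet_id": "pet_1"
    },
    {
        "id": 2,
        "owner": "Charlie",
        "pet_id": "pet_2"
    }
]"""

# Line-delimited JSON: every line is a complete record on its own.
lines_text = (
    '{"id": 1, "owner": "Charlie", "pet_id": "pet_1"}\n'
    '{"id": 2, "owner": "Charlie", "pet_id": "pet_2"}'
)

def parse_per_line(text):
    """Parse each non-empty line as its own JSON document; None marks a failure."""
    parsed = []
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            parsed.append(json.loads(line))
        except json.JSONDecodeError:
            parsed.append(None)
    return parsed

per_line_from_array = parse_per_line(array_text)   # every line fails
per_line_from_lines = parse_per_line(lines_text)   # every line is a record
whole_array = json.loads(array_text)               # parsing the file as one value works

print(per_line_from_array)
print(per_line_from_lines == whole_array)
```

No single line of the pretty-printed file is a complete record, which is why the file has to be read whole (blocksize=None, lines=False) rather than split.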

How to create a list from json key:values in python3

I'm looking to create a python3 list of the locations from the json file city.list.json downloaded from OpenWeatherMaps http://bulk.openweathermap.org/sample/city.list.json.gz. The file passes http://json-validator.com/ but I can not figure out how to correctly open the file and create a list of values of key 'name'. I keep hitting json.loads errors about io.TextIOWrapper etc.
I created a short test file
[
{
"id": 707860,
"name": "Hurzuf",
"country": "UA",
"coord": {
"lon": 34.283333,
"lat": 44.549999
}
}
,
{
"id": 519188,
"name": "Novinki",
"country": "RU",
"coord": {
"lon": 37.666668,
"lat": 55.683334
}
}
]
Is there a way to parse this and create a list ["Hurzuf", "Novinki"] ?
You should use json.load() instead of json.loads(). I named my test file file.json and here is the code:
import json
with open('file.json', mode='r') as f:
    # First, read the JSON file and store its content in a Python variable
    # by using the json.load() function
    json_data = json.load(f)
# json_data now contains a list of dictionaries
# (a JSON array is decoded to a Python list, each JSON object to a dict)
# Then we create a result list, in which we will store our names
result_list = []
# We iterate over each dictionary in our list
for json_dict in json_data:
    # We append each name value to our result list
    result_list.append(json_dict['name'])
print(result_list) # ['Hurzuf', 'Novinki']
# Shorter solution using a list comprehension
result_list = [json_dict['name'] for json_dict in json_data]
print(result_list) # ['Hurzuf', 'Novinki']
You simply iterate over the dictionaries in your list and read the value stored under the key 'name' from each one.
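The difference between the two functions can be seen side by side in a small sketch. It assumes a shortened version of the test data, and io.StringIO stands in for the file object that open() would return, so nothing needs to exist on disk.

```python
import io
import json

text = '[{"id": 707860, "name": "Hurzuf"}, {"id": 519188, "name": "Novinki"}]'

# json.loads() takes the JSON text itself (str, bytes or bytearray).
names_from_string = [city["name"] for city in json.loads(text)]

# json.load() takes a file-like object and reads the text from it.
names_from_file = [city["name"] for city in json.load(io.StringIO(text))]

# Handing a file object to json.loads() raises TypeError, which is the
# kind of error mentioned in the question.
try:
    json.loads(io.StringIO(text))
    raised_type_error = False
except TypeError:
    raised_type_error = True

print(names_from_string, names_from_file, raised_type_error)
```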

GAE python27 return nested json

This seems such a simple task, yet it eludes me...
class ViewAllDogs(webapp2.RequestHandler):
    """ Returns an array of json objects representing all dogs. """
    def get(self):
        query = Dog.query()
        results = query.fetch(limit = MAX_DOGS) # 100
        aList = []
        for match in results:
            aList.append({'id': match.id, 'name': match.name,
                          'owner': match.owner, 'arrival_date': match.arrival_date})
            aList.append({'departure_history': {'departure_date': match.departure_date,
                                                'departed_dog': match.departed_dog}})
        self.response.headers['Content-Type'] = 'application/json'
        self.response.write(json.dumps(aList))
The above, my best attempt to date, gets me:
[
{
"arrival_date": null,
"id": "a link to self",
"owner": 354773,
"name": "Rover"
},
{
"departure_history": {
"departed_dog": "Jake",
"departure_date": 04/24/2017
}
},
# json array of objects continues...
]
What I'm trying to get is the departure_history nested:
[
{
"id": "a link to self...",
"owner": 354773,
"name": "Rover",
"departure_history": {
"departed_dog": "Jake",
"departure_date": 04/24/2017
},
"arrival_date": 04/25/2017,
},
# json array of objects continues...
]
I've tried a bunch of different combinations and looked at the json docs and the python27 docs, with no joy, and have burned way too many hours on this. I got this far with the many related SO posts on this topic. Thanks in advance.
You can simplify a little:
aList = []
for match in results:
    aDog = {'id': match.id,
            'name': match.name,
            'owner': match.owner,
            'arrival_date': match.arrival_date,
            'departure_history': {
                'departure_date': match.departure_date,
                'departed_dog': match.departed_dog}
            }
    aList.append(aDog)
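The same idea can be tried outside App Engine. The sketch below is not the datastore model from the question: Dog is a hypothetical namedtuple stand-in, dog_to_dict is an illustrative helper, and the dates are plain strings because json.dumps cannot serialize date objects without extra handling.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the datastore entity.
Dog = namedtuple("Dog", ["id", "name", "owner", "arrival_date",
                         "departure_date", "departed_dog"])

results = [
    Dog("a link to self", "Rover", 354773, "04/25/2017", "04/24/2017", "Jake"),
]

def dog_to_dict(dog):
    """Build one dict per dog with departure_history nested inside it."""
    return {
        "id": dog.id,
        "name": dog.name,
        "owner": dog.owner,
        "arrival_date": dog.arrival_date,
        "departure_history": {
            "departure_date": dog.departure_date,
            "departed_dog": dog.departed_dog,
        },
    }

payload = json.dumps([dog_to_dict(dog) for dog in results], indent=2)
print(payload)
```

Building the complete dict before appending it means there is one list entry per dog, so no index bookkeeping is needed to attach the nested part afterwards.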
This seems a bit hackish, but it works. If you know a better way, by all means, let me know. Thanks.
class ViewAllDogs(webapp2.RequestHandler):
    """ Returns an array of json objects representing all dogs. """
    def get(self):
        query = Dog.query()
        results = query.fetch(limit = MAX_DOGS) # 100
        aList = []
        i = 0
        for match in results:
            aList.append({'id': match.id, 'name': match.name,
                          'owner': match.owner, 'arrival_date': match.arrival_date})
            # Attach the nested dict to the entry that was just appended
            aList[i]['departure_history'] = {'departure_date': match.departure_date,
                                             'departed_dog': match.departed_dog}
            i += 1
        self.response.headers['Content-Type'] = 'application/json'
        self.response.write(json.dumps(aList))

compare input to fields in a json file in ruby

I am trying to create a function that takes an input, which in this case is a tracking code, looks that tracking code up in a JSON file, and then returns the tracking code as output. The JSON file is as follows:
[
{
"tracking_number": "IN175417577",
"status": "IN_TRANSIT",
"address": "237 Pentonville Road, N1 9NG"
},
{
"tracking_number": "IN175417578",
"status": "NOT_DISPATCHED",
"address": "Holly House, Dale Road, Coalbrookdale, TF8 7DT"
},
{
"tracking_number": "IN175417579",
"status": "DELIVERED",
"address": "Number 10 Downing Street, London, SW1A 2AA"
}
]
I have started using this function:
def compare_content(tracking_number)
  File.open("pages/tracking_number.json", "r") do |file|
    file.print()
  end
end
Not sure how I would compare the input to the json file. Any help would be much appreciated.
You can use the built-in JSON module.
require 'json'
def compare_content(tracking_number)
  # Loads ENTIRE file into string. Will not be effective on very large files
  json_string = File.read("pages/tracking_number.json")
  # Uses the JSON module to create an array from the JSON string
  array_from_json = JSON.parse(json_string)
  # Iterates through the array of hashes
  array_from_json.each do |tracking_hash|
    if tracking_number == tracking_hash["tracking_number"]
      # If this code runs, tracking_hash has the data for the number you are looking up
    end
  end
end
This will parse the JSON supplied into an array of hashes which you can then compare to the number you are looking up.
If you are the one generating the JSON file and this method will be called a lot, consider mapping the tracking numbers directly to their data for this method to potentially run much faster. For example,
{
"IN175417577": {
"status": "IN_TRANSIT",
"address": "237 Pentonville Road, N1 9NG"
},
"IN175417578": {
"status": "NOT_DISPATCHED",
"address": "Holly House, Dale Road, Coalbrookdale, TF8 7DT"
},
"IN175417579": {
"status": "DELIVERED",
"address": "Number 10 Downing Street, London, SW1A 2AA"
}
}
That would parse into a hash, where you could much more easily grab the data:
require 'json'
def compare_content(tracking_number)
  json_string = File.read("pages/tracking_number.json")
  hash_from_json = JSON.parse(json_string)
  if hash_from_json.key?(tracking_number)
    tracking_hash = hash_from_json[tracking_number]
  else
    # Tracking number does not exist
  end
end