Find if nested key exists in json python - json

In the following JSON response, what's the proper way to check whether the nested key "C" exists in Python 2.7?
{
    "A": {
        "B": {
            "C": {"D": "yes"}
        }
    }
}
one line JSON
{ "A": { "B": { "C": {"D": "yes"} } } }

This is an old question with an accepted answer, but I would do this with nested if statements instead.
import json
# Use a name other than "json" for the result so the module is not shadowed
data = json.loads('{ "A": { "B": { "C": {"D": "yes"} } } }')
if 'A' in data:
    if 'B' in data['A']:
        if 'C' in data['A']['B']:
            print(data['A']['B']['C'])  # or whatever you want to do
Or, if you know that you always have 'A' and 'B':
import json
data = json.loads('{ "A": { "B": { "C": {"D": "yes"} } } }')
if 'C' in data['A']['B']:
    print(data['A']['B']['C'])  # or whatever

Use the json module to parse the input. Then, inside a try statement, retrieve key "A" from the parsed input, then key "B" from that result, and then key "C" from that result. If an error is raised, the nested key "C" does not exist.
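A minimal sketch of that try-based approach. The helper name has_nested_key is made up for illustration, and catching TypeError in addition to KeyError is my own assumption, to cover the case where an intermediate value is not a dict.

```python
import json

def has_nested_key(data, *keys):
    """Return True if data[keys[0]][keys[1]]... can be resolved."""
    try:
        for key in keys:
            data = data[key]
    except (KeyError, TypeError):
        # KeyError: the key is missing at this level
        # TypeError: the current value cannot be indexed by this key
        return False
    return True

parsed = json.loads('{ "A": { "B": { "C": {"D": "yes"} } } }')
print(has_nested_key(parsed, "A", "B", "C"))
print(has_nested_key(parsed, "A", "X", "C"))
```

The loop stops at the first level that fails, so it does not matter how deep the missing key is.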

A quite easy and convenient way is to use the python-benedict package, which has full keypath support. Cast your existing dict d with the function benedict():
from benedict import benedict
d = benedict(d)
Now your dict has full keypath support and you can check whether the key exists in the Pythonic way, using the in operator:
if 'mainsnak.datavalue.value.numeric-id' in d:
    pass  # do something
Please find here the complete documentation.

I used a simple recursive solution:
def check_exists(exp, value):
    # For the case that we have an empty element
    if exp is None:
        return False
    # Check existence of the first key
    if value[0] in exp:
        # If this is the last key in the list, there is no need to look further
        if len(value) == 1:
            return True
        else:
            next_value = value[1:]
            return check_exists(exp[value[0]], next_value)
    else:
        return False
To use this code, pass the nested key as a list of strings, for example:
rc = check_exists(json, ["A", "B", "C", "D"])
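An iterative variant, as a sketch only. The name nested_key_exists and the isinstance check are my own additions: the check stops the walk when an intermediate value is not a dict, so a string leaf is not searched with a substring test (with the recursive version, looking up one more key below "D" would run `in` against the string "yes").

```python
def nested_key_exists(data, keys):
    """Walk keys one level at a time; stop as soon as a level is not a dict."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return True

doc = {"A": {"B": {"C": {"D": "yes"}}}}
print(nested_key_exists(doc, ["A", "B", "C", "D"]))
print(nested_key_exists(doc, ["A", "B", "C", "D", "y"]))
```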

Related

Pulling specific Parent/Child JSON data with Python

I'm having a difficult time figuring out how to pull specific information from a json file.
So far I have this:
# Import json library
import json
# Open json database file
with open('jsondatabase.json', 'r') as f:
    data = json.load(f)
# assign variables from json data and convert to usable information
identifier = data['ID']
identifier = str(identifier)
name = data['name']
name = str(name)
# Collect data from user to compare with data in json file
print("Please enter your numerical identifier and name: ")
user_id = input("Numerical identifier: ")
user_name = input("Name: ")
if user_id == identifier and user_name == name:
    print("Your inputs matched. Congrats.")
else:
    print("Your inputs did not match our data. Please try again.")
And that works great for a simple JSON file like this:
{
"ID": "123",
"name": "Bobby"
}
But ideally I need to create a more complex JSON file and can't find deeper information on how to pull specific information from something like this:
{
    "Parent": [
        {
            "Parent_1": [
                {
                    "Name": "Bobby",
                    "ID": "123"
                }
            ],
            "Parent_2": [
                {
                    "Name": "Linda",
                    "ID": "321"
                }
            ]
        }
    ]
}
Here is an example that you might be able to pick apart.
You could either:
Make a custom de-jsonify object_hook as shown below and do something with it. There is a good tutorial here.
Just gobble up the whole dictionary that you get without a custom de-jsonify and drill down into it and make a list or set of the results. (not shown)
Example:
import json
from collections import namedtuple
data = '''
{
"Parents":
[
{
"Name": "Bobby",
"ID": "123"
},
{
"Name": "Linda",
"ID": "321"
}
]
}
'''
Parent = namedtuple('Parent', ['name', 'id'])
def dejsonify(obj: dict):
    # object_hook is called with every decoded JSON object (already a dict)
    if obj.get("Name"):
        return Parent(obj.get('Name'), int(obj.get('ID')))
    return obj
res = json.loads(data, object_hook=dejsonify)
print(res)
# then we can do whatever... if you need lookups by name/id,
# we could put the result into a dictionary
all_parents = {(p.name, p.id) : p for p in res['Parents']}
lookup_from_input = ('Bobby', 123)
print(f'found match: {all_parents.get(lookup_from_input)}')
Result:
{'Parents': [Parent(name='Bobby', id=123), Parent(name='Linda', id=321)]}
found match: Parent(name='Bobby', id=123)
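For the second option (no custom hook), here is a rough sketch that drills into the nested structure from the question and collects the results into a set. It assumes the layout is exactly as posted: a "Parent" list whose elements map names such as "Parent_1" to lists of records. The variable names are mine.

```python
import json

raw = '''
{
  "Parent": [
    {
      "Parent_1": [{"Name": "Bobby", "ID": "123"}],
      "Parent_2": [{"Name": "Linda", "ID": "321"}]
    }
  ]
}
'''

data = json.loads(raw)

# Collect every (name, id) pair found under the "Parent" list.
known = set()
for group in data["Parent"]:          # each element is a dict of Parent_N lists
    for entries in group.values():    # each value is a list of records
        for record in entries:
            known.add((record["Name"], record["ID"]))

print(("Bobby", "123") in known)
print(("Bobby", "321") in known)
```

The IDs stay strings here, so they can be compared directly with what input() returns.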

Change name of main row Rails in JSON

So I have this JSON:
{
"code": "Q0934X",
"name": "PIDBA",
"longlat": "POINT(23.0 33.0)",
"altitude": 33
}
And I want to rename the key code to Identifier.
The desired output is this:
{
"Identifier": "Q0934X",
"name": "PIDBA",
"longlat": "POINT(23.0 33.0)",
"altitude": 33
}
How can I do this in the shortest way? Thanks
It appears that both "the json" you have and your desired result are JSON strings. If the one you have is json_str you can write:
json = JSON.parse(json_str).tap { |h| h["Identifier"] = h.delete("code") }.to_json
puts json
#=> {"name":"PIDBA","longlat":"POINT(23.0 33.0)","altitude":33,"Identifier":"Q0934X"}
Note that Hash#delete returns the value of the key being removed.
Perhaps transform_keys is an option.
The following works for me (Ruby 2.6):
json = JSON.parse(json_str).transform_keys { |k| k == 'code' ? 'Identifier' : k }.to_json
And this should work from Ruby 3.0 onwards (if I've understood the docs). Note the string key with =>, because JSON.parse returns string keys, whereas 'code': would create a symbol key:
json = JSON.parse(json_str).transform_keys({ 'code' => 'Identifier' }).to_json

Reading array fields in Spark 2.2

Suppose you have a bunch of data whose rows look like this:
{
'key': [
{'key1': 'value11', 'key2': 'value21'},
{'key1': 'value12', 'key2': 'value22'}
]
}
I would like to read this into a Spark Dataset. One way to do it is as follows:
case class ObjOfLists(k1: List[String], k2: List[String])
case class Data(k: ObjOfLists)
Then you can do:
sparkSession.read.json(pathToData).select(
struct($"key.key1" as "k1", $"key.key2" as "k2") as "k"
)
.as[Data]
This works fine, but it kind of butchers the data a little bit; after all in the data 'key' points to a list of objects rather than an object of lists. In other words, what I really want is:
case class Obj(k1: String, k2: String)
case class DataOfList(k: List[Obj])
My question: is there some other syntax I can put in select which allows the resulting Dataframe to be converted to a Dataset[DataOfList]?
I tried using the same select syntax as above, and got:
Exception in thread "main" org.apache.spark.sql.AnalysisException: need an array field but got struct<k1:array<string>,k2:array<string>>;
So I also tried:
sparkSession.read.json(pathToData).select(
array(struct($"key.key1" as "k1", $"key.key2" as "k2")) as "k"
)
.as[DataOfList]
This compiles and runs, but the data looks like this:
DataOfList(List(Obj(org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@bb2a5516,org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@bec5e4a7)))
Any other ideas?
Just recast data to reflect expected names:
case class Obj(k1: String, k2: String)
case class DataOfList(k: Seq[Obj])
val text = Seq("""{
"key": [
{"key1": "value11", "key2": "value21"},
{"key1": "value12", "key2": "value22"}
]
}""").toDS
val df = spark.read.json(text)
df
.select($"key".cast("array<struct<k1:string,k2:string>>").as("k"))
.as[DataOfList]
.first
DataOfList(List(Obj(value11,value21), Obj(value12,value22)))
If the objects contain extraneous fields, define the schema on read:
val textExtended = Seq("""{
"key": [
{"key0": "value01", "key1": "value11", "key2": "value21"},
{"key1": "value12", "key2": "value22", "key3": "value32"}
]
}""").toDS
import org.apache.spark.sql.types._
val schemaSubset = StructType(Seq(
  StructField("key", ArrayType(StructType(Seq(
    StructField("key1", StringType),
    StructField("key2", StringType)
  ))))
))
val df = spark.read.schema(schemaSubset).json(textExtended)
and proceed as before.

Look for JSON example with all allowed combinations of structure in max depth 2 or 3

I've written a program that processes JSON objects. Now I want to verify whether I've missed something.
Is there a JSON example containing all allowed combinations of JSON structures? Something like this:
{
"key1" : "value",
"key2" : 1,
"key3" : {"key1" : "value"},
"key4" : [
[
"string1",
"string2"
],
[
1,
2
],
...
],
"key5" : true,
"key6" : false,
"key7" : null,
...
}
As you can see on the right-hand side of http://json.org/, the JSON grammar isn't very difficult, but I've hit several exceptions because I forgot to handle some structure combinations that are possible. E.g. an array can contain "string, number, object, array, true, false, null", but my program couldn't handle arrays inside an array until I ran into an exception. So everything was fine until I got a valid JSON object with arrays inside an array.
I want to test my program with such a JSON object (which is what I'm looking for). After this test I want to feel certain that my program handles every possible valid JSON structure on earth without an exception.
I don't need nesting to depth 5 or so. I only need nesting to depth 2 or at most 3, with every base type containing every base type that is allowed inside it.
Have you thought of escaped characters and objects within an object?
{
    "key1" : {
        "key1" : "value",
        "key2" : [
            "String1",
            "String2"
        ]
    },
    "key2" : "\"This is a quote\"",
    "key3" : "This contains an escaped backslash: \\",
    "key4" : "This contains accented characters: \u00eb \u00ef"
}
Note: \u00eb and \u00ef are the characters ë and ï, respectively.
Choose a programming language that supports JSON.
Try to load your JSON; on failure, the exception's message is descriptive.
Example:
Python:
import json, sys
with open(sys.argv[1]) as f:
    json.load(f)
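To get the position of the failure rather than a traceback, the exception can be caught and inspected. This is only a sketch: check_json is a made-up helper name, and it assumes Python 3.5 or later, where json.JSONDecodeError and its lineno, colno and msg attributes exist.

```python
import json

def check_json(text):
    """Return None if text parses, otherwise a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return "line %d, column %d: %s" % (err.lineno, err.colno, err.msg)
    return None

print(check_json('{"key1": [1, 2, [3, 4]]}'))   # nested arrays are valid JSON
print(check_json('{"key1": [1, 2,]}'))          # trailing comma is not
```

The exact wording of the message differs between Python versions, so it is better to rely on the line and column than on the text.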
Generate:
import random, json, string
def json_null(depth=0):
    return None
def json_int(depth=0):
    return random.randint(-999, 999)
def json_float(depth=0):
    return random.uniform(-999, 999)
def json_string(depth=0):
    return ''.join(random.sample(string.printable, random.randrange(10, 40)))
def json_bool(depth=0):
    return random.randint(0, 1) == 1
def json_list(depth):
    lst = []
    if depth:
        for i in range(random.randrange(8)):
            lst.append(gen_json(random.randrange(depth)))
    return lst
def json_object(depth):
    obj = {}
    if depth:
        for i in range(random.randrange(8)):
            obj[json_string()] = gen_json(random.randrange(depth))
    return obj
def gen_json(depth=8):
    if depth:
        return random.choice([json_list, json_object])(depth)
    else:
        return random.choice([json_null, json_int, json_float, json_string, json_bool])(depth)
print(json.dumps(gen_json(), indent=2))
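Since the random generator above does not guarantee that a given combination shows up in one run, a deterministic alternative is sketched here. It builds a document in which every container type holds every value type, nested a fixed number of levels. The names SAMPLES, all_values and build are my own, and the choice of sample values is only an assumption about what is worth covering.

```python
import json

# One sample of every JSON scalar type.
SAMPLES = {
    "string": "text with \"quote\", \\ backslash and \u00eb",
    "integer": -1,
    "float": 1.5e3,
    "true": True,
    "false": False,
    "null": None,
}

def all_values():
    """Every scalar sample plus an empty array and an empty object."""
    return list(SAMPLES.values()) + [[], {}]

def build(depth):
    """Return a list of values in which arrays and objects hold every value type, `depth` levels deep."""
    if depth == 0:
        return all_values()
    inner = build(depth - 1)
    as_array = list(inner)
    as_object = {"k%d" % i: v for i, v in enumerate(inner)}
    return all_values() + [as_array, as_object]

document = {"root": build(2)}
text = json.dumps(document, indent=2)
assert json.loads(text) == document   # the generated text round-trips
print(text)
```

With build(2) the output contains arrays inside arrays, objects inside arrays, arrays inside objects and objects inside objects, which is the case that caused the exception in the question.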

Find Duplicate JSON Keys in Sublime Text 3

I have a JSON file that, for now, is validated by hand prior to being placed into production. Ideally, this is an automated process, but for now this is the constraint.
One thing I found helpful in Eclipse were the JSON tools that would highlight duplicate keys in JSON files. Is there similar functionality in Sublime Text or through a plugin?
The following JSON, for example, could produce a warning about duplicate keys.
{
"a": 1,
"b": 2,
"c": 3,
"a": 4,
"d": 5
}
Thanks!
There are plenty of JSON validators available online. I just tried this one and it picked out the duplicate key right away. The problem with using Sublime-based JSON linters like JSONLint is that they use Python's json module, which does not raise an error on duplicate keys:
import json
json_str = """
{
"a": 1,
"b": 2,
"c": 3,
"a": 4,
"d": 5
}"""
py_data = json.loads(json_str) # changes JSON into a Python dict
# which is unordered
print(py_data)
yields
{'c': 3, 'b': 2, 'a': 4, 'd': 5}
showing that the first a key is overwritten by the second. So, you'll need another, non-Python-based, tool.
Even the Python documentation says:
The RFC specifies that the names within a JSON object should be
unique, but does not mandate how repeated names in JSON objects should
be handled. By default, this module does not raise an exception;
instead, it ignores all but the last name-value pair for a given name:
>>> weird_json = '{"x": 1, "x": 2, "x": 3}'
>>> json.loads(weird_json)
{'x': 3}
The object_pairs_hook parameter can be used to alter this behavior.
So, as the docs point out:
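class JsonUniqueKeysChecker:
    def check(self, pairs):
        # The hook is called once per JSON object, so track keys per call;
        # a list shared across calls would flag equal keys in different objects.
        seen = set()
        for key, _value in pairs:
            if key in seen:
                raise ValueError("Non unique Json key: '%s'" % key)
            seen.add(key)
        # Return a dict so the decoded result has the usual shape
        return dict(pairs)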
And then:
c = JsonUniqueKeysChecker()
print(json.loads(json_str, object_pairs_hook=c.check)) # raises
JSON is a very simple format without much detail, so things like this can be painful. Detecting duplicate keys is easy, but I bet it's quite a lot of work to build a plugin from that.