Remove empty elements from nested JSON

I have a nested JSON structure of arbitrary depth:
json_list = [
    {
        'class': 'Year 1',
        'room': 'Yellow',
        'students': [
            {'name': 'James', 'sex': 'M', 'grades': {}},
        ]
    },
    {
        'class': 'Year 2',
        'info': {
            'teachers': {
                'math': 'Alan Turing',
                'physics': []
            }
        },
        'students': [
            {'name': 'Tony', 'sex': 'M', 'age': ''},
            {'name': 'Jacqueline', 'sex': 'F'},
        ],
        'other': []
    }
]
I want to remove any element whose value meets certain criteria.
For example:
values_to_drop = ({}, (), [], '', ' ')
filtered_json = clean_json(json_list, values_to_drop)
filtered_json
Expected output of clean_json:
[
    {
        'class': 'Year 1',
        'room': 'Yellow',
        'students': [
            {'name': 'James', 'sex': 'M'},
        ]
    },
    {
        'class': 'Year 2',
        'info': {
            'teachers': {
                'math': 'Alan Turing',
            }
        },
        'students': [
            {'name': 'Tony', 'sex': 'M'},
            {'name': 'Jacqueline', 'sex': 'F'},
        ]
    }
]
I thought of first converting the object to a string with json.dumps, then searching the string and replacing each value that meets the criteria with some kind of flag, and filtering on that flag before reading the string back with json.loads. I couldn't figure it out, though, and I don't know if this is the way to go.

I managed to get the desired output by tweaking this answer a bit:
def clean_json(json_obj, values_to_drop):
    if isinstance(json_obj, dict):
        json_obj = {
            key: clean_json(value, values_to_drop)
            for key, value in json_obj.items()
            if value not in values_to_drop}
    elif isinstance(json_obj, list):
        json_obj = [clean_json(item, values_to_drop)
                    for item in json_obj
                    if item not in values_to_drop]
    return json_obj
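One thing to watch with the function above: it tests each value before recursing into it, so a container that only becomes empty after cleaning (for example {'teachers': {'physics': []}}) is kept as an empty dict. If that matters, a possible variant is to clean the children first and filter afterwards. This is only a sketch; the name prune and the sample data are made up for illustration.

```python
def prune(obj, values_to_drop):
    """Clean children first, then drop any child that ended up in values_to_drop."""
    if isinstance(obj, dict):
        cleaned = {key: prune(value, values_to_drop) for key, value in obj.items()}
        return {key: value for key, value in cleaned.items()
                if value not in values_to_drop}
    if isinstance(obj, list):
        cleaned = [prune(item, values_to_drop) for item in obj]
        return [item for item in cleaned if item not in values_to_drop]
    return obj  # scalars are returned unchanged


values_to_drop = ({}, (), [], '', ' ')

# Hypothetical sample where containers become empty only after cleaning
sample = {
    'class': 'Year 3',
    'info': {'teachers': {'physics': []}},
    'students': [{'grades': {}}],
}
result = prune(sample, values_to_drop)
print(result)  # {'class': 'Year 3'}
```

On the json_list from the question both versions give the same result, since no container there becomes empty only after cleaning.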

Related

splitting JSON file using python

I have the big JSON file below:
{
"sections": [
{
"facts": [
{
"name": "Server",
"value": "<https://xxxxxxx:18443/collector/pipeline/v1_allagents>"
},
{
"name": "Environment",
"value": "dev"
},
{
"name": "Issue",
"value": "Server is [EDITED]"
}
]
},
{
"facts": [
{
"name": "Server",
"value": "<https://xxxxx:18443/collector/pipeline/customer-characterstics-v1>"
},
{
"name": "Environment",
"value": "dev"
},
{
"name": "Issue",
"value": "Server is [STOPPED]"
}
]
}
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxx:18443/collector/pipeline/soap-post-v1_relations>'},
{'name': 'Environment', 'value': u'dev'}, {'name': 'Issue', 'value': u' status is [STOPPED]'}
]
},
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxxx.134:18443/collector/pipeline/characterstics-v1_allagents>'},
{'name': 'Environment', 'value': u'dev'}, {'name': 'Issue', 'value': u' status is [EDITED]'}
]
},
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxxx:18443/collector/pipeline/ab23-8128b7c9fcf2>'},
{'name': 'Environment', 'value': u'dev'}, {'name': 'Issue', 'value': u'status is [EDITED]'}
]
}
]
}
....
Now I'm struggling to split the file above as shown below and dump each part into a new file:
{
"text": "Status",
"themeColor": "#FF0000",
{
"sections": [
{
"facts": [
{
"name": "Server",
"value": "<https://xxxxxxx:18443/collector/pipeline/v1_allagents>"
},
{
"name": "Environment",
"value": "dev"
},
{
"name": "Issue",
"value": "Server is [EDITED]"
}
]
}
]
}
}
What I have achieved so far is printing each entry under facts, but not in the format shown above.
I'm having trouble adding the extra header lines before the sections and then dumping the result to another file.
How should I approach this without using jq?
Each split file should have the same header, followed by exactly the same structure for the sections and facts keys.
Edit:
Andrej's solution works perfectly for one section per file.
But how can I split the file into chunks of size n? Say my original big file contains 5 facts blocks and I want 2 per file (n = 2).
It should then create 3 JSON files: the first two contain 2 facts blocks each, and the last one contains the single remaining block.
The final output should then be:
{'text': ' Status', 'themeColor': '#FF0000', 'sections':
[
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxx:18443/collector/pipeline/soap-post-v1>'},
{'name': 'Environment', 'value': u'dev'},
{'name': 'Issue', 'value': u' status is [STOPPED]'}
]
},
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxx:18443/collector/pipeline/be9694085a70>'},
{'name': 'Environment', 'value': u'dev'},
{'name': 'Issue', 'value': u' status is [STOPPED]'}
]
}
]
}
and
{'text': ' Status', 'themeColor': '#FF0000', 'sections':
[
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxx:18443/collector/pipeline/soap-post-v1_relations>'},
{'name': 'Environment', 'value': u'dev'}, {'name': 'Issue', 'value': u' status is [STOPPED]'}
]
},
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxxx.134:18443/collector/pipeline/characterstics-v1_allagents>'},
{'name': 'Environment', 'value': u'dev'}, {'name': 'Issue', 'value': u' status is [EDITED]'}
]
}
]}
One block of facts from the original file is left over, so it gets its own JSON file:
{'text': ' Status', 'themeColor': '#FF0000', 'sections':
[
{'facts':
[
{'name': 'Server', 'value': u'<https://xxxxxxx:18443/collector/pipeline/ab23-8128b7c9fcf2>'},
{'name': 'Environment', 'value': u'dev'}, {'name': 'Issue', 'value': u'status is [EDITED]'}
]
}
]}
You can load the big JSON file into a dictionary using the json module, then treat the loaded data as a regular Python dict.
If your file contains the JSON from the question, then this example:
import json
with open('YOUR_JSON_FILE.json', 'r') as f_in:
    data = json.load(f_in)
    for i, fact in enumerate(data['sections'], 1):
        with open('data_out_{}.json'.format(i), 'w') as f_out:
            d = {}
            d['text'] = 'Status'
            d['themeColor'] = '#FF0000'
            d['sections'] = fact
            json.dump(d, f_out, indent=4)
This creates two files data_out_1.json and data_out_2.json containing:
{
"text": "Status",
"themeColor": "#FF0000",
"sections": {
"facts": [
{
"name": "Server",
"value": "<https://xxxxxxx:18443/collector/pipeline/v1_allagents>"
},
{
"name": "Environment",
"value": "dev"
},
{
"name": "Issue",
"value": "Server is [EDITED]"
}
]
}
}
and
{
"text": "Status",
"themeColor": "#FF0000",
"sections": {
"facts": [
{
"name": "Server",
"value": "<https://xxxxx:18443/collector/pipeline/customer-characterstics-v1>"
},
{
"name": "Environment",
"value": "dev"
},
{
"name": "Issue",
"value": "Server is [STOPPED]"
}
]
}
}
EDIT:
To chunk the JSON file, you can use this example:
import json
def chunk(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
with open('YOUR_JSON_FILE.json', 'r') as f_in:
    data = json.load(f_in)
    for i, fact in enumerate(chunk(data['sections'], 2), 1): # <-- change 2 to your chunk size
        with open('data_out_{}.json'.format(i), 'w') as f_out:
            d = {}
            d['text'] = 'Status'
            d['themeColor'] = '#FF0000'
            d['sections'] = fact
            json.dump(d, f_out, indent=4)
import json
with open('/tmp/json_response_output.json') as datafile:
    datastore = json.load(datafile)
for n, details in enumerate(datastore['sections']):
    split_json = datastore.copy()
    split_json['sections'] = [details]
    with open(f'json_response_output_part{n}.json', 'w') as f:
        json.dump(split_json, f, indent=4, ensure_ascii=False)
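To check the chunking logic without touching the file system, the same idea can be written as a generator that builds the payloads in memory. This is a rough sketch: split_sections, the header dict, and the shortened stand-in data are assumptions for illustration, not part of the answers above.

```python
import json


def split_sections(data, n, header):
    """Yield one payload per chunk of n sections, each carrying the same header."""
    sections = data['sections']
    for start in range(0, len(sections), n):
        payload = dict(header)  # shallow copy, so each payload has its own top-level dict
        payload['sections'] = sections[start:start + n]  # always a list, even for one section
        yield payload


# Hypothetical stand-in for the big file: five shortened sections
data = {
    'sections': [
        {'facts': [{'name': 'Server', 'value': 'server-{}'.format(i)}]}
        for i in range(5)
    ]
}
header = {'text': 'Status', 'themeColor': '#FF0000'}

payloads = list(split_sections(data, 2, header))
for i, payload in enumerate(payloads, 1):
    print(i, len(payload['sections']))

# Each payload can then be written out with json.dump(payload, f_out, indent=4)
first_as_text = json.dumps(payloads[0], indent=4)
```

With five sections and n = 2 this yields three payloads holding 2, 2 and 1 sections, and 'sections' stays a list in every payload.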

MongoDB - How to query and aggregate nested documents values

I have the following 3 JSON documents in a MongoDB collection:
{
'title': 'Best',
'array' : [
{'name' : '1',
'value': '2'
},
{'name' : '3',
'value': '4'
}
]
}
and :
{
'title': 'Best',
'array' : [
{'name' : '5',
'value': '6'
},
{'name' : '7',
'value': '8'
}
]
}
and:
{
'title': 'Worst',
'array' : [
{'name' : 'Not_needed',
'value': 'Not_needed'
},
{'name' : 'Not_needed',
'value': 'Not_needed'
}
]
}
I need a query that gives me:
{[
{'name' : '1',
'value': '2'
},
{'name' : '3',
'value': '4'
},
{'name' : '5',
'value': '6'
},
{'name' : '7',
'value': '8'
}
]}
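How can I do that? Is that what people refer to as aggregation? Could you please provide a MongoDB query for that?
Here is a query using the aggregation framework that nearly generates the document you want. It returns one grouped document per title; adding a leading { "$match": { "title": "Best" } } stage would restrict the result to the 'Best' documents.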
db.test.aggregate([
    { "$unwind": "$array" },
    { "$group":
        {
            "_id": "$title",
            "array": {"$push": "$array"}
        }
    },
    { "$project":
        {
            "array": 1,
            "_id": 0
        }
    }
])
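To see what each stage of the pipeline does, here is a plain-Python walk-through of the same steps on the three documents from the question. It does not talk to MongoDB; it only mimics $unwind, $group with $push, and $project on in-memory dicts, and the variable names are made up for illustration.

```python
from collections import defaultdict

docs = [
    {'title': 'Best', 'array': [{'name': '1', 'value': '2'}, {'name': '3', 'value': '4'}]},
    {'title': 'Best', 'array': [{'name': '5', 'value': '6'}, {'name': '7', 'value': '8'}]},
    {'title': 'Worst', 'array': [{'name': 'Not_needed', 'value': 'Not_needed'},
                                 {'name': 'Not_needed', 'value': 'Not_needed'}]},
]

# $unwind: one (title, element) pair per array element
unwound = [(doc['title'], element) for doc in docs for element in doc['array']]

# $group with $push: collect the unwound elements per title
grouped = defaultdict(list)
for title, element in unwound:
    grouped[title].append(element)

# $project: keep only the array field, one result document per title
projected = [{'array': elements} for elements in grouped.values()]

# Restricting to one title corresponds to a $match stage on 'title'
best_only = grouped['Best']
print(best_only)
```

The real $group stage makes no promise about the order of its output documents, so this sketch only mirrors the shape of the result.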

How to return Nested TreeView JSON in Web API

I'm new to Web API. I need some help generating JSON like the following:
[
{
'id': 66,
'text': 'This is the first comment.',
'creator': {
'id': 52,
'display_name': 'Ben'
},
'respondsto': null,
'created_at': '2014-08-14T13:19:59.751Z',
'responses': [
{
'id': 71,
'text': 'This is a response to the first comment.',
'creator': {
'id': 14,
'display_name': 'Daniel',
},
'respondsto': 66,
'created_at': '2014-08-14T13:27:13.915Z',
'responses': [
{
'id': 87,
'text': 'This is a response to the response.',
'creator': {
'id': 52,
'display_name': 'Ben',
},
'respondsto': 71,
'created_at': '2014-08-14T13:27:38.046Z',
'responses': []
}
]
}
]
},
{
'id': 70,
'text': 'Đây là bình luận thứ hai.',
'creator': {
'id': 12,
'display_name': 'Nguyễn'
},
'respondsto': null,
'created_at': '2014-08-14T13:25:47.933Z',
'responses': []
}
];
My intention is to provide JSON data for the image.
I'm able to generate normal JSON data. I am stuck on how to create responses nested inside responses until an empty responses list is reached.
Any help would be appreciated.
UPDATE: I found the answer.
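The answer that was found is not shown in the thread. One common way to build this kind of nesting, sketched here in Python rather than Web API code, is to index the comments by id and attach each one to its parent's responses list. It assumes the comments start out as a flat list with a respondsto parent id; build_tree and the shortened sample rows are hypothetical.

```python
def build_tree(comments):
    """Nest a flat list of comments by their 'respondsto' parent id."""
    # Copy every comment and give it an empty 'responses' list
    nodes = {c['id']: dict(c, responses=[]) for c in comments}
    roots = []
    for c in comments:
        node = nodes[c['id']]
        parent_id = c['respondsto']
        if parent_id is None:
            roots.append(node)  # top-level comment
        else:
            nodes[parent_id]['responses'].append(node)  # reply to another comment
    return roots


# Hypothetical flat input, shortened from the question's example
flat = [
    {'id': 66, 'text': 'This is the first comment.', 'respondsto': None},
    {'id': 71, 'text': 'This is a response to the first comment.', 'respondsto': 66},
    {'id': 87, 'text': 'This is a response to the response.', 'respondsto': 71},
    {'id': 70, 'text': 'Second comment.', 'respondsto': None},
]
tree = build_tree(flat)
```

Because every node starts with an empty responses list, leaf comments end up with 'responses': [] as in the desired output, and the nesting can go to any depth without explicit recursion.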

JSON Help regarding objects and stringify

I have a variable with the following value, and a series definition that currently hard-codes the same data:
var4 = "{name: 'TestUser', data: [1.0, 0.8, 0.64]}"
series: [
    {
        name: 'TestUser',
        data: [1.0, 0.8, 0.64]
    }
],
I would like to find out how I can put var4 into my series instead of typing in the data. I have read up on JSON.stringify and JSON.parse, but they don't seem to work here.
Use JSON.parse(str) to turn a string into an object. The string must be valid JSON, so the property names and string values have to be double-quoted, e.g. var4 = '{"name": "TestUser", "data": [1.0, 0.8, 0.64]}'.
series = [
    {
        'name': 'TestUser',
        'data': [1.0, 0.8, 0.64]
    },
];
series.push(JSON.parse(var4));
Or maybe this is what you want:
myJsonObj = {
    'series': [
        {
            'name': 'TestUser',
            'data': [1.0, 0.8, 0.64]
        },
    ]
}
myJsonObj.series.push(JSON.parse(var4));
Or this:
jsonObj = JSON.parse(var4);
myJsonObj = {
    'series': [
        {
            'name': jsonObj.name,
            'data': jsonObj.data
        },
    ]
}
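The quoting requirement comes from the JSON format itself rather than from JavaScript, so any strict JSON parser will reject the original var4 string. As a small illustration, here is the same check with Python's json module instead of JSON.parse; the variable names are made up.

```python
import json

# The string from the question: unquoted keys and single quotes, so not valid JSON
invalid = "{name: 'TestUser', data: [1.0, 0.8, 0.64]}"
# The same data as valid JSON: double-quoted keys and strings
valid = '{"name": "TestUser", "data": [1.0, 0.8, 0.64]}'

try:
    json.loads(invalid)
    parsed_invalid = True
except json.JSONDecodeError:
    parsed_invalid = False  # the parser rejects the unquoted property name

series = [json.loads(valid)]  # the parsed object can go straight into the series list
print(parsed_invalid, series)
```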

Trouble formatting json objects with lodash _.groupBy

I'm having trouble reformatting an object so that each of its lists is grouped by name.
// input
{
'M': [
{name: Bob, id: 1},
{name: John, id: 2},
{name: John, id: 3},
],
'F': [
{name: Liz, id: 4},
{name: Mary, id: 5},
{name: Mary, id: 6},
]
}
// desired output
{
'M': [
'Bob': [ {name: Bob, id: 1},]
'John': [ {name: John, id: 2}, {name: John, id: 3} ]
],
'F': [
'Liz': [ {name: Liz, id: 4} ]
'Mary': [ {name: Mary, id: 5}, {name: Mary, id: 6} ]
]
}
My current script only returns the 'M' key, and I'm not sure what is causing it:
for (var key in obj) {
    var data = _.groupBy(obj[key], 'name')
    return data;
}
I've also tried
Object.keys(obj).forEach(obj, key => {
    var data = _.groupBy(obj[key], 'name')
    return data;
})
but it throws TypeError: #<Object> is not a function
You can use mapValues to go over each gender group and groupBy to group its members by name.
var output = _.mapValues(input, names => _.groupBy(names, 'name'));
var input = {
    'M': [
        {name: 'Bob', id: 1},
        {name: 'John', id: 2},
        {name: 'John', id: 3},
    ],
    'F': [
        {name: 'Liz', id: 4},
        {name: 'Mary', id: 5},
        {name: 'Mary', id: 6},
    ]
};
var output = _.mapValues(input, names => _.groupBy(names, 'name'));
console.log(output);
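Alternatively, use _.forOwn, since you want to iterate over the object's own properties. The callback receives (value, key), and its return value is not collected, so store the results in an outer object:
var ret = {};
_.forOwn(obj, function(val, key) {
    ret[key] = _.groupBy(val, 'name');
});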
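For comparison, the same two-level grouping can be written without lodash. The sketch below uses Python, with a small hypothetical group_by helper standing in for _.groupBy and a dict comprehension standing in for _.mapValues.

```python
from collections import defaultdict


def group_by(items, key):
    """Group a list of dicts by the value stored under key (similar to _.groupBy)."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item)
    return dict(groups)


people = {
    'M': [{'name': 'Bob', 'id': 1}, {'name': 'John', 'id': 2}, {'name': 'John', 'id': 3}],
    'F': [{'name': 'Liz', 'id': 4}, {'name': 'Mary', 'id': 5}, {'name': 'Mary', 'id': 6}],
}

# Apply the grouping to every top-level value (similar to _.mapValues)
output = {gender: group_by(members, 'name') for gender, members in people.items()}
print(output['M']['John'])  # [{'name': 'John', 'id': 2}, {'name': 'John', 'id': 3}]
```

Building a new dict for every key is what fixes the original loop, which returned after processing the first key.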