I am new to Python and am struggling to remove a key and its value from the JSON returned by an HTTP request. When querying a task I get the following back.
data = requests.get(url,headers=hed).json()['data']
[{
'gid': '12011553977',
'due_on': None,
'name': 'do something',
'notes': 'blalbla',
'projects': [{
'gid': '120067502445',
'name': 'Project1'
}]
}, {
'gid': '12002408815',
'due_on': '2021-10-21',
'name': 'Proposal',
'notes': 'bla',
'projects': [{
'gid': '12314323523',
'name': 'Project1'
}, {
'gid': '12314323523',
'name': 'Project2'
}, {
'gid': '12314323523',
'name': 'Project3'
}]
}]
I am trying to remove 'gid' from all projects so that projects look like this:
'projects': [{
'name': 'Company'
}]
What is the best way to do this with python3?
You can use recursion to make a simpler function to handle all elements and sub-elements. I haven't done extensive testing, or included any error checking or exception handling; but this should be close to what you want:
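def rec_pop(top_level_list, key_to_pop='gid'):
    for item in top_level_list:
        # a default of None avoids a KeyError when the key is absent
        item.pop(key_to_pop, None)
        for v in item.values():
            if isinstance(v, list):
                # pass the key along so a non-default key is honoured
                rec_pop(v, key_to_pop)

# call recursive fn
rec_pop(data)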
Result:
In [25]: data
Out[25]:
[{'due_on': None,
'name': 'do something',
'notes': 'blalbla',
'projects': [{'name': 'Project1'}]},
{'due_on': '2021-10-21',
'name': 'Proposal',
'notes': 'bla',
'projects': [{'name': 'project2'}]}]
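Note that rec_pop as written also removes the top-level 'gid' of each task. If only the entries under 'projects' should lose it, as the question asks, a non-recursive pass may be enough. This is a minimal sketch, assuming every task is a dict whose 'projects' value is a list of dicts; strip_project_gids is a hypothetical helper name, and the sample data is shortened from the question:

```python
def strip_project_gids(tasks, key="gid"):
    """Remove `key` from every project dict, leaving the task-level key alone."""
    for task in tasks:
        for project in task.get("projects", []):
            # pop with a default so a project without the key is not an error
            project.pop(key, None)
    return tasks


data = [
    {"gid": "12011553977", "name": "do something",
     "projects": [{"gid": "120067502445", "name": "Project1"}]},
    {"gid": "12002408815", "name": "Proposal",
     "projects": [{"gid": "12314323523", "name": "Project1"},
                  {"gid": "12314323523", "name": "Project2"}]},
]
strip_project_gids(data)
print(data[0]["projects"])
```

Like rec_pop, this mutates the list in place; copy it first (for example with copy.deepcopy) if the original response is still needed.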
Related
I have a nested json with an arbitrary depth level :
json_list = [
{
'class': 'Year 1',
'room': 'Yellow',
'students': [
{'name': 'James', 'sex': 'M', 'grades': {}},
]
},
{
'class': 'Year 2',
'info': {
'teachers': {
'math': 'Alan Turing',
'physics': []
}
},
'students': [
{ 'name': 'Tony', 'sex': 'M', 'age': ''},
{ 'name': 'Jacqueline', 'sex': 'F' },
],
'other': []
}
]
I want to remove any element whose value meets certain criteria.
For example:
values_to_drop = ({}, (), [], '', ' ')
filtered_json = clean_json(json_list, values_to_drop)
filtered_json
Expected Output of clean_json:
[
{
'class': 'Year 1',
'room': 'Yellow',
'students': [
{'name': 'James', 'sex': 'M'},
]
},
{
'class': 'Year 2',
'info': {
'teachers': {
'math': 'Alan Turing',
}
},
'students': [
{ 'name': 'Tony', 'sex': 'M'},
{ 'name': 'Jacqueline', 'sex': 'F'},
]
}
]
I thought of first converting the object to a string with json.dumps, then replacing each value that meets the criteria with some kind of flag so it can be filtered out before reading the string back with json.loads. I couldn't figure it out, though, and I don't know if this is the way to go.
I managed to get the desired output by tweaking this answer a bit:
def clean_json(json_obj, values_to_drop):
    if isinstance(json_obj, dict):
        json_obj = {
            key: clean_json(value, values_to_drop)
            for key, value in json_obj.items()
            if value not in values_to_drop}
    elif isinstance(json_obj, list):
        json_obj = [clean_json(item, values_to_drop)
                    for item in json_obj
                    if item not in values_to_drop]
    return json_obj
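One thing to watch with clean_json: the filter runs on each value before that value is cleaned, so a container that only becomes empty after cleaning (for example {'a': {'b': ''}}) is kept as {'a': {}}. If that matters, the children can be cleaned first and the result filtered afterwards. The following is a sketch of that variant, not part of the original answer; clean_json_deep is a hypothetical name:

```python
def clean_json_deep(json_obj, values_to_drop):
    """Clean children first, then drop any child whose cleaned value matches."""
    if isinstance(json_obj, dict):
        cleaned = {key: clean_json_deep(value, values_to_drop)
                   for key, value in json_obj.items()}
        return {key: value for key, value in cleaned.items()
                if value not in values_to_drop}
    if isinstance(json_obj, list):
        cleaned = [clean_json_deep(item, values_to_drop) for item in json_obj]
        return [item for item in cleaned if item not in values_to_drop]
    # scalars are returned unchanged
    return json_obj


values_to_drop = ({}, (), [], "", " ")
nested = {"class": "Year 1", "info": {"teachers": {"physics": []}}}
result = clean_json_deep(nested, values_to_drop)
print(result)
```

Here 'info' disappears entirely, because 'teachers' is empty once 'physics' has been dropped. Whether that cascading removal is wanted depends on the data, so treat it as an option rather than a fix.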
I've copied a sample from the site https://www.newtonsoft.com/jsonschema/help/html/ValidatingJson.htm:
using Newtonsoft.Json.Linq;
using Newtonsoft.Json.Schema;
...
string schemaJson = @"{
'description': 'A person',
'type': 'object',
'properties': {
'name': {'type': 'string'},
'hobbies': {
'type': 'array',
'items': {'type': 'string'}
}
}
}";
JSchema schema = JSchema.Parse(schemaJson);
JObject person = JObject.Parse(@"{
'name': 'James',
'hobbies': ['.NET', 'Blogging', 'Reading', 'Xbox', 'LOLCATS']
}");
bool valid = person.IsValid(schema);
// true
When I try to debug it in VS2019 I get an exception: "Method not found: 'Boolean Newtonsoft.Json.Schema.SchemaExtensions.IsValid"
I feel so stupid for asking, but what am I doing wrong?
I am making an API call and getting my JSON data like so:
import requests
import jmespath
import pandas as pd
import json
url = 'a.com'
r = requests.get(url).json()
The object returned looks like this:
{'question': [{
'response': {'firstname': {'value': 'John'},
'lastname': {'value': 'Bob'}},
'profile_question': [{
'identities': [{'type': 'ID,
'value': '1'},
{'type': 'EMAIL',
'value': 'test@test.com'}]}]}]}
I tried pasting this into json.fr but I get an error saying it is not well-formed JSON. However, I can crawl the object as is, just not successfully for what I need.
I am trying to use the jmespath library to crawl it and pull out four pieces of information (firstname, lastname, ID, EMAIL), appending the data to a list like so:
lst =[]
fname = jmespath.search('question[*].response.{firstname:firstname.value}',my_dict)
lst.append(fname)
lname = jmespath.search('question[*].response.{lastname:lastname.value}',my_dict)
lst.append(lname)
email = jmespath.search("question[*].profile_question[].identities.{email:[?type=='EMAIL'].value}",my_dict)
lst.append(email)
ID = jmespath.search("question[*].profile_question[].identities.{email:[?type=='ID'].value}",my_dict)
lst.append(ID)
I append into a list in hopes of creating tuples per iteration that I can push into a dataframe.
The list looks like this:
[[{'firstname': 'John'}],
[{'lastname': 'Bob'}],
[{'email': ['test@test.com']}],
[{'ID': ['1']}]]
However when I crawl the dictionary with missing values like so:
{'question': [{
'response': {'firstname': {'value': 'John'},
'lastname': {'value': 'Bob'}},
'profile-question': [{
'identities': [{'type': 'ID,
'value': '1'},
{'type': 'EMAIL',
'value': 'test@test.com'}]}]}],
'response': {'firstname': {'value': 'John1'},
'lastname': {'value': 'Bob1'}},
'profile-question': [{
'identities': [{'type': 'ID,
'value': '2'}]}]}
causes my list to behave like this (I can not tell why):
[[{'firstname': 'John'}], [{'email': ['test@test.com']}], [{'email': ['1']},[{'firstname': 'John'}],
[{'lastname': 'Bob'}],
[{'email': [][][]}],
[{'ID': ['1']}]]]
which causes the df to look like this:
firstname  lastname  email          ID
john       bob       test@test.com  1
john1      bob1      test@test.com  1
How do I crawl a JSON dict object as it comes in from the API, pulling out four pieces of data firstname, lastname, email, ID and appending into a dataframe like so? :
firstname  lastname  email          ID
john       bob       test@test.com  1
john1      bob1                     2
I am more than willing to move away from the jmespath library. Also, the real dictionary has many more fields; I have shortened it so that only the key fields and their nesting are shown.
First of all, the reason for the error is that the JSON object is missing the closing quote after ID:
{'question': [{
'response': {'firstname': {'value': 'John'},
'lastname': {'value': 'Bob'}},
'profile_question': [{
'identities': [{'type': 'ID,
'value': '1'},
{'type': 'EMAIL',
'value': 'test@test.com'}]}]}]}
It should look like this:
{'question': [{
'response': {'firstname': {'value': 'John'},
'lastname': {'value': 'Bob'}},
'profile_question': [{
'identities': [{'type': 'ID',
'value': '1'},
{'type': 'EMAIL',
'value': 'test@test.com'}]}]}]}
From here you can use the json library to turn the JSON object into a Python dictionary with json.loads(). Once you have fixed the JSON object, your code can look something like this:
import jmespath as jp
import pandas as pd

jon = {'question':
       [{'response': {'firstname': {'value': 'John'},
                      'lastname': {'value': 'Bob'}},
         'profile_question': [{'identities': [{'type': 'ID',
                                               'value': '1'},
                                              {'type': 'EMAIL', 'value': 'test@test.com'}]}]}]}
jsons = [jon]  # list of all json objects
df_list = []
for obj in jsons:  # search the loop variable, not the outer `jon`
    try:
        fname = jp.search('question[*].response.firstname.value', obj)[0]
    except IndexError:
        fname = None
    try:
        lname = jp.search('question[*].response.lastname.value', obj)[0]
    except IndexError:
        lname = None
    try:
        email = jp.search("question[*].profile_question[].identities.{email:[?type=='EMAIL'].value}", obj)[0]['email'][0]
    except IndexError:
        email = None
    try:
        # the 'email' key here is only a label for the projected ID value
        user_id = jp.search("question[*].profile_question[].identities.{email:[?type=='ID'].value}", obj)[0]['email'][0]
    except IndexError:
        user_id = None
    df_list.append(pd.DataFrame({'firstname': fname, 'lastname': lname, 'email': email, 'id': user_id}, index=[0]))
df = pd.concat(df_list, ignore_index=True, sort=False)
print(df)
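Since the question mentions being open to dropping jmespath, the same extraction can also be sketched with plain dictionary access. This assumes each person is a separate entry in the 'question' list (the second sample in the question is not well-formed, so that layout is a guess); extract_row and payload are hypothetical names:

```python
def extract_row(record):
    """Flatten one entry of the 'question' list into a single row dict."""
    response = record.get("response", {})
    row = {
        "firstname": response.get("firstname", {}).get("value"),
        "lastname": response.get("lastname", {}).get("value"),
        "email": None,  # stays None when no EMAIL identity is present
        "id": None,     # stays None when no ID identity is present
    }
    for block in record.get("profile_question", []):
        for identity in block.get("identities", []):
            if identity.get("type") == "EMAIL":
                row["email"] = identity.get("value")
            elif identity.get("type") == "ID":
                row["id"] = identity.get("value")
    return row


payload = {"question": [
    {"response": {"firstname": {"value": "John"}, "lastname": {"value": "Bob"}},
     "profile_question": [{"identities": [
         {"type": "ID", "value": "1"},
         {"type": "EMAIL", "value": "test@test.com"}]}]},
    {"response": {"firstname": {"value": "John1"}, "lastname": {"value": "Bob1"}},
     "profile_question": [{"identities": [{"type": "ID", "value": "2"}]}]},
]}
rows = [extract_row(record) for record in payload.get("question", [])]
print(rows)
```

Passing rows to pd.DataFrame(rows) should then give one row per person, with the missing email left empty instead of being filled from another record.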
Suppose I have the JSON data below. I only want to print a few of its key-value pairs: name, description, start and end, all with a single call to print. All other key-value pairs should be skipped, since only the ones listed above are needed for my work and I don't want to keep the rest.
'events': [
{
'name': {
'text': 'Sample Event',
'html': 'Sample Event'
},
'description': {
'text': 'This is a test event',
'html': '<P>This is a test event</P>'
},
'start': {
'timezone': 'Asia/Kolkata',
'local': '2018-10-12T19:00:00',
'utc': '2018-10-12T13:30:00Z'
},
'end': {
'timezone': 'Asia/Kolkata',
'local': '2018-10-12T22:00:00',
'utc': '2018-10-12T16:30:00Z'
},
'organization_id': '269994560152',
'created': '2018-09-02T15:48:49Z',
'changed': '2018-09-02T15:49:00Z',
'capacity': 1,
'capacity_is_custom': False,
'status': 'live',
'currency': 'USD',
'listed': True,
'shareable': True,
'invite_only': False,
'online_event': False,
'show_remaining': True,
'tx_time_limit': 480,
'hide_start_date': False,
'hide_end_date': False,
'locale': 'en_US',
'is_locked': False,
'privacy_setting': 'unlocked',
'is_series': False,
'is_series_parent': False,
'is_reserved_seating': False,
'show_pick_a_seat': False,
'show_seatmap_thumbnail': False,
'show_colors_in_seatmap_thumbnail': False,
'source': 'create_2.0',
'is_free': True,
'version': '3.0.0',
'logo_id': None,
'category_id': '199',
'subcategory_id': None,
'format_id': '16',
'logo': None
}
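One possible approach, assuming the data has already been parsed into a Python dict with an 'events' list, is a dict comprehension that keeps only the wanted keys. This is a sketch rather than a definitive answer; pick_fields, WANTED_KEYS and the shortened response dict are hypothetical:

```python
WANTED_KEYS = ("name", "description", "start", "end")


def pick_fields(event, wanted=WANTED_KEYS):
    """Return a new dict holding only the wanted keys that the event has."""
    return {key: event[key] for key in wanted if key in event}


response = {"events": [{
    "name": {"text": "Sample Event", "html": "Sample Event"},
    "description": {"text": "This is a test event",
                    "html": "<P>This is a test event</P>"},
    "start": {"timezone": "Asia/Kolkata", "local": "2018-10-12T19:00:00",
              "utc": "2018-10-12T13:30:00Z"},
    "end": {"timezone": "Asia/Kolkata", "local": "2018-10-12T22:00:00",
            "utc": "2018-10-12T16:30:00Z"},
    "status": "live",
    "currency": "USD",
}]}
trimmed = [pick_fields(event) for event in response["events"]]
print(trimmed)  # a single print call covers all four fields of every event
```

Because pick_fields builds a new dict, the unwanted keys are simply never copied, and the original response is left untouched.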
This works fine, filling grid:
$("#grid").kendoGrid({
dataSource: {
data: [
{'id': 1, 'name': 2, 'author': 3},
{'id': 1, 'name': 2, 'author': 3},
{'id': 1, 'name': 2, 'author': 3},
] ,
},
but when I load list from getJSON:
$.getJSON('/api/notes/', function(data) {
dataSource = data.rows;
});
When I point dataSource at the returned array, nothing is displayed :(
If the received data is in data.rows, you should do:
$("#grid").data("kendoGrid").dataSource.data(data.rows);
But why not use transport.read in the grid's dataSource to load the data instead of using getJSON?
You should use the data method of the dataSource.
e.g.
$.getJSON('/api/notes/', function(data) {
dataSource.data(data.rows);
});