Extracting JSON data using python and running into keyerror - json

I am trying to extract json file data using python but running in some errors.
aircraft.json (json file):
{ "now" : 1609298440.3,
"messages" : 31501,
"aircraft" : [
{"hex":"abadf9","alt_baro":37000,"alt_geom":36625,"gs":541.9,"track":73.3,"baro_rate":0,"version":0,"nac_p":7,"nac_v":1,"sil":2,"sil_type":"unknown","mlat":[],"tisb":[],"messages":13,"seen":6.6,"rssi":-25.3},
{"hex":"acc02b","flight":"SWA312 ","alt_baro":37000,"alt_geom":36650,"gs":549.3,"track":62.2,"baro_rate":0,"category":"A3","nav_qnh":1013.6,"nav_altitude_mcp":36992,"nav_heading":56.2,"lat":42.171346,"lon":-93.298198,"nic":8,"rc":186,"seen_pos":66.3,"version":2,"nic_baro":1,"nac_p":8,"nac_v":1,"sil":3,"sil_type":"perhour","gva":1,"sda":2,"mlat":[],"tisb":[],"messages":1205,"seen":7.4,"rssi":-26.0},
{"hex":"ac9e9a","category":"A4","version":2,"sil_type":"perhour","mlat":[],"tisb":[],"messages":746,"seen":119.1,"rssi":-26.6},
{"hex":"a96577","flight":"DAL673 ","alt_baro":40025,"alt_geom":39625,"gs":371.4,"track":265.1,"baro_rate":0,"squawk":"2641","emergency":"none","category":"A4","nav_qnh":1013.6,"nav_altitude_mcp":40000,"nav_heading":258.8,"lat":42.057220,"lon":-94.098337,"nic":8,"rc":186,"seen_pos":0.9,"version":2,"nic_baro":1,"nac_p":9,"nac_v":1,"sil":3,"sil_type":"perhour","gva":2,"sda":2,"mlat":[],"tisb":[],"messages":3021,"seen":0.3,"rssi":-21.8},
{"hex":"aa56db","category":"A3","version":2,"sil_type":"perhour","mlat":[],"tisb":[],"messages":1651,"seen":85.3,"rssi":-26.4}
]
}
My code:
import json
json_file = open('test.json')
aircraft_json = json.load(json_file)
for i in aircraft_json['aircraft']:
print(i['hex'],i['flight'],i['alt_baro'],i['alt_geom'],i['gs'],i['gs'],i['track'],i['baro_rate'],i[
'category'],i['nav_qnh'],i['nav_altitude_mcp'],i['lat'],i['lon'],i['nic'],i['rc'],i['seen_pos'],i['version'],i['nic_baro'],i['nac_p'],i['nac_v'],i['sil'],i['sil_type'],i['gva'],i['sda'],i['mlat'],i['tisb'],i['messages'],i['seen'],i['rssi'])
json_file.close()
Output:
Traceback (most recent call last):
File "/home/pi/aircraft_json_to_csv.py", line 11, in <module>
print(i['hex'],i['flight'],i['alt_baro'],i['alt_geom'],i['gs'],i['gs'],i['track'],i['baro_rate'],i[
KeyError: 'flight
The json file is updated every second and json file may miss key values like 'flight' or any random key values. My question is if that key is missing then how to replace those missing value with empty space without getting keyerror.
Thank you

My advice would be to give each field a suitable default value and store these fields in a dictionary.
Then, instead of assuming the field is present, check if the field exists. If it doesn't, then apply the default value.
Below is a simple example of this in action.
The defaults dict has been populated with a few possible defaults
to get you started, to which you would add the rest of the fields as well.
I've adapted the loop to iterate through the keys of the dict (all the known fields so to speak), and add the default value for any missing field.
import json
with open('aircraft.json') as json_file:
aircraft_json = json.load(json_file)
defaults = {
'alt_baro': 0,
'alt_geom': 0,
'version': 0,
'baro_rate': 0,
'mlat': [],
'tisb': []
# similarly for the other fields
}
for dat in aircraft_json['aircraft']:
for field in defaults.keys():
if field not in dat:
dat[field] = defaults[field]
print(dat[field], end=' ')
print('')

Related

Check if a value exists in a json file with python

I've the following json file (banneds.json):
{
"players": [
{
"avatar": "https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/07/07aa315f664efa92456569429230bc2c254c3ff8_full.jpg",
"created": 1595050663,
"created_by": "<#128152620136267776>",
"nick": "teste",
"steam64": 76561198046619692
},
{
"avatar": "https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/21/21fa5c468597e9c890212b2e3bdb0fac781c040c_full.jpg",
"created": 1595056420,
"created_by": "<#128152620136267776>",
"nick": "ingridão",
"steam64": 76561199058918551
}
]
}
And I want to insert new values if the new value (inserted by user) is not already in the json, however when I try to search if the value is already there I receive a false value, an example of what I'm doing ( not the original code, only an example ):
import json
check = 76561198046619692
with open('banneds.json', 'r') as file:
data = json.load(file)
if check in data:
print(True)
else:
print(False)
I'm always receiving the "False" result, but the value is there, someone can give me a light of what I'm doing wrong please? I tried the entire night to find a solution, but no one works :(
Thanks for the help!
You are checking data as a dictionary object. When checking using if check in data it checks if data object have a key matching the value of the check variable (data.keys() to list all keys).
One easy way would be to use if check in data["players"].__str__() which will convert value to a string and search for the match.
If you want to make sure that check value only checks for the steam64 values, you can write a simple function that will iterate over all "players" and will check their "steam64" values. Another solution would be to make list of "steam64" values for faster and easier checking.
You can use any() to check if value of steam64 key is there.
For example:
import json
def check_value(data, val):
return any(player['steam64']==val for player in data['players'])
with open('banneds.json', 'r') as f_in:
data = json.load(f_in)
print(check_value(data, 76561198046619692))
Prints:
True

How to check for specific field values based on some condition while converting csv file to json format

Below is the code to convert csv file to json format in python.
I have two fields 'recommendation' and 'rating'. Based on the recommendation value I need to set the value for rating field like if recommendation is 1 then rating =1 and vice versa. With the answer I got I'm getting output for only one record entry instead of getting all the records. I think it's overriding. Do I need to create separate list for that and append each record entry to the list to get the output for all records.
here's the updated code:
def main(input_file):
csv_rows = []
with open(input_file, 'r') as csvfile:
reader = csv.DictReader(csvfile, delimiter='|')
title = reader.fieldnames
for row in reader:
entry = OrderedDict()
for field in title:
entry[field] = row[field]
[c.update({'RATING': c['RECOMMENDATIONS']}) for c in reader]
csv_rows.append(entry)
with open(json_file, 'w') as f:
json.dump(csv_rows, f, sort_keys=True, indent=4, ensure_ascii=False)
f.write('\n')
I want to create the nested format like the below:
"rating": {
"user_rating": {
"rating": 1
},
"recommended": {
"rating": 1
}
After you've read the file in, using the csv.DictReader, you'll have a list of dicts. Since you want to set the values now, it's a simple dict manipulation. There are several ways, of which one is:
[c.update({'rating': c['recommendation']}) for c in read_csvDictReader]
Hope that helps.

filter particular field name and value from field_dict of package django-reversion

I have a function which returns json data as history from Version of reversion.models.
from django.http import HttpResponse
from reversion.models import Version
from django.contrib.admin.models import LogEntry
import json
def history_list(request):
history_list = Version.objects.all().order_by('-revision__date_created')
data = []
for i in history_list:
data.append({
'date_time': str(i.revision.date_created),
'user': str(i.revision.user),
'object': i.object_repr,
'field': i.revision.comment.split(' ')[-1],
'new_value_field': str(i.field_dict),
'type': i.content_type.name,
'comment': i.revision.comment
})
data_ser = json.dumps(data)
return HttpResponse(data_ser, content_type="application/json")
When I run the above snippet I get the output json as
[{"type": "fruits", "field": "colour", "object": "anyobject", "user": "anyuser", "new_value_field": "{'price': $23, 'weight': 2kgs, 'colour': 'red'}", "comment": "Changed colour."}]
From the function above,
'comment': i.revision.comment
returns json as "comment": "changed colour" and colour is the field which I have written in the function to retrieve it from comment as
'field': i.revision.comment.split(' ')[-1]
But i assume getting fieldname and value from field_dict is a better approach
Problem: from the above json list I would like to filter new_field_value and old_value. In the new_filed_value only value of colour.
Getting the changed fields isn't as easy as checking the comment, as this can be overridden.
Django-reversion just takes care of storing each version, not comparing.
Your best option is to look at the django-reversion-compare module and its admin.py code.
The majority of the code in there is designed to produce a neat side-by-side HTML diff page, but the code should be able to be re-purposed to generate a list of changed fields per object (as there can be more than one changed field per version).
The code should* include a view independent way to get the changed fields at some point, but this should get you started:
from reversion_compare.admin import CompareObjects
from reversion.revisions import default_revision_manager
def changed_fields(obj, version1, version2):
"""
Create a generic html diff from the obj between version1 and version2:
A diff of every changes field values.
This method should be overwritten, to create a nice diff view
coordinated with the model.
"""
diff = []
# Create a list of all normal fields and append many-to-many fields
fields = [field for field in obj._meta.fields]
concrete_model = obj._meta.concrete_model
fields += concrete_model._meta.many_to_many
# This gathers the related reverse ForeignKey fields, so we can do ManyToOne compares
reverse_fields = []
# From: http://stackoverflow.com/questions/19512187/django-list-all-reverse-relations-of-a-model
changed_fields = []
for field_name in obj._meta.get_all_field_names():
f = getattr(
obj._meta.get_field_by_name(field_name)[0],
'field',
None
)
if isinstance(f, models.ForeignKey) and f not in fields:
reverse_fields.append(f.rel)
fields += reverse_fields
for field in fields:
try:
field_name = field.name
except:
# is a reverse FK field
field_name = field.field_name
is_reversed = field in reverse_fields
obj_compare = CompareObjects(field, field_name, obj, version1, version2, default_revision_manager, is_reversed)
if obj_compare.changed():
changed_fields.append(field)
return changed_fields
This can then be called like so:
changed_fields(MyModel,history_list_item1, history_list_item2)
Where history_list_item1 and history_list_item2 correspond to various actual Version items.
*: Said as a contributor, I'll get right on it.

DynamoDB2 and Boto 2.9.5 bad request on batch query and single item get

I am having trouble doing any single or batch query with boto 2.9.5 using the DynamoDB2 API
I need to do a batch query like this:
one_org = Table('[table-name]').batch_get(keys=[
{'key': '[user-id-hash]'},
{'key': '[user-id-hash]'},
{'key': '[user-id-hash]'},
{'key': '[user-id-hash]'},
])
for user in one_org:
for key, value in user.items():
print key, value
I keep getting this exception:
boto.dynamodb2.exceptions.ValidationException: ValidationException: 400 Bad Request
{
u'message': u'The provided key element does not match the schema',
u'__type': u'com.amazon.coral.validate#ValidationException'
}
Given this message I'd think there'd be a problem with the name of the key, but our key is called key, so it doesn't make any sense to me.
I included the stack trace below:
Traceback (most recent call last):
File "aws/interfaces.py", line 38, in <module>
for user in one_org:
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/dynamodb2/results.py", line 59, in next
self.fetch_more()
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/dynamodb2/results.py", line 141, in fetch_more
results = self.the_callable(*args, **kwargs)
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/dynamodb2/table.py", line 949, in _batch_get
raw_results = self.connection.batch_get_item(request_items=items)
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/dynamodb2/layer1.py", line 152, in batch_get_item
body=json.dumps(params))
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/dynamodb2/layer1.py", line 1479, in make_request
retry_handler=self._retry_handler)
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/connection.py", line 852, in _mexe
status = retry_handler(response, i, next_sleep)
File "/home/kasper/Falcon/thenest/venv/local/lib/python2.7/site-packages/boto/dynamodb2/layer1.py", line 1518, in _retry_handler
response.status, response.reason, data)
boto.dynamodb2.exceptions.ValidationException: ValidationException: 400 Bad Request
{u'message': u'The provided key element does not match the schema', u'__type': u'com.amazon.coral.validate#ValidationException'}
I was facing the same issue this morning. If you have defined a RangeKey in your schema then you need to specify that as well. If you don't want to specify the RangeKey and only get the item using HashKey then consider removing the RangeKey.
The error is saying that the value(s) provided do not match the type defined in the schema. I don't know what your schema is but, as an example, if the schema defined the primary key (called key in your case) as a string and you provided an integer value or vice-versa you would get this error.
Check your schema and make sure you are passing in the right type of value for your query.

Using Python's csv.dictreader to search for specific key to then print its value

BACKGROUND:
I am having issues trying to search through some CSV files.
I've gone through the python documentation: http://docs.python.org/2/library/csv.html
about the csv.DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds) object of the csv module.
My understanding is that the csv.DictReader assumes the first line/row of the file are the fieldnames, however, my csv dictionary file simply starts with "key","value" and goes on for atleast 500,000 lines.
My program will ask the user for the title (thus the key) they are looking for, and present the value (which is the 2nd column) to the screen using the print function. My problem is how to use the csv.dictreader to search for a specific key, and print its value.
Sample Data:
Below is an example of the csv file and its contents...
"Mamer","285713:13"
"Champhol","461034:2"
"Station Palais","972811:0"
So if i want to find "Station Palais" (input), my output will be 972811:0. I am able to manipulate the string and create the overall program, I just need help with the csv.dictreader.I appreciate any assistance.
EDITED PART:
import csv
def main():
with open('anchor_summary2.csv', 'rb') as file_data:
list_of_stuff = []
reader = csv.DictReader(file_data, ("title", "value"))
for i in reader:
list_of_stuff.append(i)
print list_of_stuff
main()
The documentation you linked to provides half the answer:
class csv.DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)
[...] maps the information read into a dict whose keys are given by the optional fieldnames parameter. If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as the fieldnames.
It would seem that if the fieldnames parameter is passed, the given file will not have its first record interpreted as headers (the parameter will be used instead).
# file_data is the text of the file, not the filename
reader = csv.DictReader(file_data, ("title", "value"))
for i in reader:
list_of_stuff.append(i)
which will (apparently; I've been having trouble with it) produce the following data structure:
[{"title": "Mamer", "value": "285713:13"},
{"title": "Champhol", "value": "461034:2"},
{"title": "Station Palais", "value": "972811:0"}]
which may need to be further massaged into a title-to-value mapping by something like this:
data = {}
for i in list_of_stuff:
data[i["title"]] = i["value"]
Now just use the keys and values of data to complete your task.
And here it is as a dictionary comprehension:
data = {row["title"]: row["value"] for row in csv.DictReader(file_data, ("title", "value"))}
The currently accepted answer is fine, but there's a slightly more direct way of getting at the data. The dict() constructor in Python can take any iterable.
In addition, your code might have issues on Python 3, because Python 3's csv module expects the file to be opened in text mode, not binary mode. You can make your code compatible with 2 and 3 by using io.open instead of open.
import csv
import io
with io.open('anchor_summary2.csv', 'r', newline='', encoding='utf-8') as f:
data = dict(csv.reader(f))
print(data['Champhol'])
As a warning, if your csv file has two rows with the same value in the first column, the later value will overwrite the earlier value. (This is also true of the other posted solution.)
If your program really is only supposed to print the result, there's really no reason to build a keyed dictionary.
import csv
import io
# Python 2/3 compat
try:
input = raw_input
except NameError:
pass
def main():
# Case-insensitive & leading/trailing whitespace insensitive
user_city = input('Enter a city: ').strip().lower()
with io.open('anchor_summary2.csv', 'r', newline='', encoding='utf-8') as f:
for city, value in csv.reader(f):
if user_city == city.lower():
print(value)
break
else:
print("City not found.")
if __name __ == '__main__':
main()
The advantage of this technique is that the csv isn't loaded into memory and the data is only iterated over once. I also added a little code the calls lower on both the keys to make the match case-insensitive. Another advantage is if the city the user requests is near the top of the file, it returns almost immediately and stops looking through the file.
With all that said, if searching performance is your primary consideration, you should consider storing the data in a database.