Getting data from a json file in Python 3.4.4 - json

How I can get data from a json file?
Like, I have a json, with the content:
{Key:"MyValue",KeyTwo:{KeyThree:"Value Two"}}

OK, first of all, JSON strings must use double quotes. The JSON python library enforces this so you are unable to load your string. Your data needs to look like this:
{"Key":"MyValue","KeyTwo":{"KeyThree":"Value Two"}}
Then you can try this:
import json
data = json.load(open("file_name.json"))
for x in data:
print("%s: %s" % (x, data[x]))

First, modify your JSON data as follows. Let suppose it is in Data.json.
{"Key":"MyValue","KeyTwo":{"KeyThree":"Value Two"}}
Now, you can try the below code.
import json
with open('Data.json', 'r') as f:
data = f.read().strip();
# String
print(data)
print(type(data))
"""
{"Key":"MyValue","KeyTwo":{"KeyThree":"Value Two"}}
<type 'str'>
"""
# 1st way
dict1 = eval(data) # KeyError, if your JSON data is not in correct form
print(dict1)
print(type(dict1))
"""
{'KeyTwo': {'KeyThree': 'Value Two'}, 'Key': 'MyValue'}
<type 'dict'>
"""
# 2nd way
dict2 = json.loads(data) # ValueError, if your JSON data is not in correct form
print(dict2)
print(type(dict2))
"""
{u'KeyTwo': {u'KeyThree': u'Value Two'}, u'Key': u'MyValue'}
<type 'dict'>
"""

Related

Convert csv to json with django admin actions

Register a csv in django-admin and through an action in django-admin to convert to json and store the value in a JSONField
However, in action django-admin I'm getting this error and I can't convert it to json...
admin.py
(....)
def read_data_csv(path):
with open(path, newline='') as csvfile:
reader = csv.DictReader(csvfile)
data = []
for row in reader:
data.append(dict(row))
return data
def convert(modeladmin, request, queryset):
for extraction in queryset:
csv_file_path = extraction.lawsuits
read_data_csv(csv_file_path)
Error:
TypeError at /admin/core/extraction/
expected str, bytes or os.PathLike object, not FieldFile
This is about extraction.lawsuits which is a FieldFile instance. Just pass read_data_csv function csv_file_path.path as argument. That should work.

How to convert this json file to pandas dataframe

The format in the file looks like this
{ 'match' : 'a', 'score' : '2'},{......}
I've tried pd.DataFrame and I've also tried reading it by line but it gives me everything in one cell
I'm new to python
Thanks in advance
Expected result is a pandas dataframe
Try use json_normalize() function
Example:
from pandas.io.json import json_normalize
values = [{'match': 'a', 'score': '2'}, {'match': 'b', 'score': '3'}, {'match': 'c', 'score': '4'}]
df = json_normalize(values)
print(df)
Output:
If one line of your file corresponds to one JSON object, you can do the following:
# import library for working with JSON and pandas
import json
import pandas as pd
# make an empty list
data = []
# open your file and add every row as a dict to the list with data
with open("/path/to/your/file", "r") as file:
for line in file:
data.append(json.loads(line))
# make a pandas data frame
df = pd.DataFrame(data)
If there is more than only one JSON object on one row of your file, then you should find those JSON objects, for example here are two possible options. The solution with the second option would look like this:
# import all you will need
import pandas as pd
import json
from json import JSONDecoder
# define function
def extract_json_objects(text, decoder=JSONDecoder()):
pos = 0
while True:
match = text.find('{', pos)
if match == -1:
break
try:
result, index = decoder.raw_decode(text[match:])
yield result
pos = match + index
except ValueError:
pos = match + 1
# make an empty list
data = []
# open your file and add every JSON object as a dict to the list with data
with open("/path/to/your/file", "r") as file:
for line in file:
for item in extract_json_objects(line):
data.append(item)
# make a pandas data frame
df = pd.DataFrame(data)

Cannot write dict object as correct JSON to file

I try to read JSON from file, get values, transform them and back write to new file.
{
"metadata": {
"info": "important info"
},
"timestamp": "2018-04-06T12:19:38.611Z",
"content": {
"id": "1",
"name": "name test",
"objects": [
{
"id": "1",
"url": "http://example.com",
"properties": [
{
"id": "1",
"value": "1"
}
]
}
]
}
}
Above is a JSON that I read from file.
Below I attach a python program that gets values, creates new JSON and write it to file.
import json
from pprint import pprint
def load_json(file_name):
return json.load(open(file_name))
def get_metadata(json):
return json["metadata"]
def get_timestamp(json):
return json["timestamp"]
def get_content(json):
return json["content"]
def create_json(metadata, timestamp, content):
dct = dict(__metadata=metadata, timestamp=timestamp, content=content)
return json.dumps(dct)
def write_json_to_file(file_name, json_content):
with open(file_name, 'w') as file:
json.dump(json_content, file)
STACK_JSON = 'stack.json';
STACK_OUT_JSON = 'stack-out.json'
if __name__ == '__main__':
json_content = load_json(STACK_JSON)
print("Loaded JSON:")
print(json_content)
metadata = get_metadata(json_content)
print("Metadata:", metadata)
timestamp = get_timestamp(json_content)
print("Timestamp:", timestamp)
content = get_content(json_content)
print("Content:", content)
created_json = create_json(metadata, timestamp, content)
print("\n\n")
print(created_json)
write_json_to_file(STACK_OUT_JSON, created_json)
But the problem is that create json is not correct. Finally as result I get:
"{\"__metadata\": {\"info\": \"important info\"}, \"timestamp\": \"2018-04-06T12:19:38.611Z\", \"content\": {\"id\": \"1\", \"name\": \"name test\", \"objects\": [{\"id\": \"1\", \"url\": \"http://example.com\", \"properties\": [{\"id\": \"1\", \"value\": \"1\"}]}]}}"
It is not that what I want to achieve. It's not correct JSON. What do I wrong?
Solution:
Change the write_json_to_file(...) method like this:
def write_json_to_file(file_name, json_content):
with open(file_name, 'w') as file:
file.write(json_content)
Explanation:
The problem is, that when you're calling write_json_to_file(STACK_OUT_JSON, created_json) at the end of your script, the variable created_json contains a string - it's the JSON representation of the dictionary created in the create_json(...) function. But inside the write_json_to_file(file_name, json_content), you're calling:
json.dump(json_content, file)
You're telling the json module write the JSON representation of variable json_content (which contains a string) into the file. And JSON representation of a string is a single value encapsulated in double-quotes ("), with all the double-quotes it contains escaped by \.
What you want to achieve is to simply write the value of the json_content variable into the file and not have it first JSON-serialized again.
Problem
You're converting a dict into a json and then right before you write it into a file, you're converting it into a json again. When you retry to convert a json to a json it gives you the \" since it's escaping the " since it assumes that you have a value there.
How to solve it?
It's a great idea to read the json file, convert it into a dict and perform all sorts of operations to it. And only when you want to print out an output or write to a file or return an output you convert to a json since json.dump() is expensive, it adds 2ms (approx) of overhead which might not seem much but when your code is running in 500 microseconds it's almost 4 times.
Other Recommendations
After seeing your code, I realize you're coming from a java background and while in java the getThis() or getThat() is a great way to module your code since we represent our code in classes in java, in python it just causes problems in the readability of the code as mentioned in the PEP 8 style guide for python.
I've updated the code below:
import json
def get_contents_from_json(file_path)-> dict:
"""
Reads the contents of the json file into a dict
:param file_path:
:return: A dictionary of all contents in the file.
"""
try:
with open(file_path) as file:
contents = file.read()
return json.loads(contents)
except json.JSONDecodeError:
print('Error while reading json file')
except FileNotFoundError:
print(f'The JSON file was not found at the given path: \n{file_path}')
def write_to_json_file(metadata, timestamp, content, file_path):
"""
Creates a dict of all the data and then writes it into the file
:param metadata: The meta data
:param timestamp: the timestamp
:param content: the content
:param file_path: The file in which json needs to be written
:return: None
"""
output_dict = dict(metadata=metadata, timestamp=timestamp, content=content)
with open(file_path, 'w') as outfile:
json.dump(output_dict, outfile, sort_keys=True, indent=4, ensure_ascii=False)
def main(input_file_path, output_file_path):
# get a dict from the loaded json
data = get_contents_from_json(input_file_path)
# the print() supports multiple args so you don't need multiple print statements
print('JSON:', json.dumps(data), 'Loaded JSON as dict:', data, sep='\n')
try:
# load your data from the dict instead of the methods since it's more pythonic
metadata = data['metadata']
timestamp = data['timestamp']
content = data['content']
# just cumulating your print statements
print("Metadata:", metadata, "Timestamp:", timestamp, "Content:", content, sep='\n')
# write your json to the file.
write_to_json_file(metadata, timestamp, content, output_file_path)
except KeyError:
print('Could not find proper keys to in the provided json')
except TypeError:
print('There is something wrong with the loaded data')
if __name__ == '__main__':
main('stack.json', 'stack-out.json')
Advantages of the above code:
More Modular and hence easily unit testable
Handling of exceptions
Readable
More pythonic
Comments because they are just awesome!

How do I grab info from this json file?

I'm trying to grab some numbers from this json file, but I don't how to do it correctly. This is the json file I am trying to gather information from:
http://stats.nba.com/stats/leaguedashteamstats?Conference=&DateFrom=&DateTo=&Division=&GameScope=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2016-17&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=
I've been trying to get this code to work, but I can't figure it out:
import json
from pprint import pprint
with open('data.json') as data_file:
data = json.load(data_file)
data["rowSet"] ["1610612737"] ["Atlanta Hawks"]
I'm trying to get the statistics from each team.
The following Python script should do it.
#!/usr/bin/env python
import json
with open('leaguedashteamstats.json') as data_file:
data = json.load(data_file)
# extract headers names
headers = data['resultSets'][0]['headers']
# extract raw json rows
raw_rows = data['resultSets'][0]['rowSet']
team_stats = []
for row in raw_rows:
print row[1] # prints team name
# mixes header names and values and prints them out
for (header, value) in zip(headers, row):
print header, value
print '\n'
Both data and code can be seen here:
https://gist.github.com/cevaris/24d0b7d97677667aedb14059a6959da1#file-1-team-stats-output
Disclaimer: this code doesn't contain any validation, but it should lead you in the right direction:
import json
with open('data.json') as data_file:
data = json.load(data_file)
for rs in data.get('resultSets'):
for r_ in [r for r in rs.get('rowSet') if r[1] == 'Atlanta Hawks']:
print(r_)
You basically need to determine specific keys that you are going to loop through, or obtain.
This should hopefully get you to where you need to be.

Convert Pandas DataFrame to JSON format

I have a Pandas DataFrame with two columns – one with the filename and one with the hour in which it was generated:
File Hour
F1 1
F1 2
F2 1
F3 1
I am trying to convert it to a JSON file with the following format:
{"File":"F1","Hour":"1"}
{"File":"F1","Hour":"2"}
{"File":"F2","Hour":"1"}
{"File":"F3","Hour":"1"}
When I use the command DataFrame.to_json(orient = "records"), I get the records in the below format:
[{"File":"F1","Hour":"1"},
{"File":"F1","Hour":"2"},
{"File":"F2","Hour":"1"},
{"File":"F3","Hour":"1"}]
I'm just wondering whether there is an option to get the JSON file in the desired format. Any help would be appreciated.
The output that you get after DF.to_json is a string. So, you can simply slice it according to your requirement and remove the commas from it too.
out = df.to_json(orient='records')[1:-1].replace('},{', '} {')
To write the output to a text file, you could do:
with open('file_name.txt', 'w') as f:
f.write(out)
In newer versions of pandas (0.20.0+, I believe), this can be done directly:
df.to_json('temp.json', orient='records', lines=True)
Direct compression is also possible:
df.to_json('temp.json.gz', orient='records', lines=True, compression='gzip')
I think what the OP is looking for is:
with open('temp.json', 'w') as f:
f.write(df.to_json(orient='records', lines=True))
This should do the trick.
use this formula to convert a pandas DataFrame to a list of dictionaries :
import json
json_list = json.loads(json.dumps(list(DataFrame.T.to_dict().values())))
Try this one:
json.dumps(json.loads(df.to_json(orient="records")))
convert data-frame to list of dictionary
list_dict = []
for index, row in list(df.iterrows()):
list_dict.append(dict(row))
save file
with open("output.json", mode) as f:
f.write("\n".join(str(item) for item in list_dict))
To transform a dataFrame in a real json (not a string) I use:
from io import StringIO
import json
import DataFrame
buff=StringIO()
#df is your DataFrame
df.to_json(path_or_buf=buff,orient='records')
dfJson=json.loads(buff)
instead of using dataframe.to_json(orient = “records”)
use dataframe.to_json(orient = “index”)
my above code convert the dataframe into json format of dict like {index -> {column -> value}}
Here is small utility class that converts JSON to DataFrame and back: Hope you find this helpful.
# -*- coding: utf-8 -*-
from pandas.io.json import json_normalize
class DFConverter:
#Converts the input JSON to a DataFrame
def convertToDF(self,dfJSON):
return(json_normalize(dfJSON))
#Converts the input DataFrame to JSON
def convertToJSON(self, df):
resultJSON = df.to_json(orient='records')
return(resultJSON)