Creating JSON data from string and using json.dumps

I am trying to create JSON data to pass to InfluxDB. I build it from strings, but I get errors. What am I doing wrong? I am using json.dumps, as has been suggested in various posts.
Here is basic Python code:
json_body = "[{'points':["
json_body += "['appx', 1, 10, 0]"
json_body += "], 'name': 'WS1', 'columns': ['RName', 'RIn', 'SIn', 'OIn']}]"
print("Write points: {0}".format(json_body))
client.write_points(json.dumps(json_body))
The output I get is
Write points: [{'points':[['appx', 1, 10, 0]], 'name': 'WS1', 'columns': ['RName', 'RIn', 'SIn', 'OIn']}]
Traceback (most recent call last):
line 127, in main
client.write_points(json.dumps(json_body))
File "/usr/local/lib/python2.7/dist-packages/influxdb/client.py", line 173, in write_points
return self.write_points_with_precision(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/influxdb/client.py", line 197, in write_points_with_precision
status_code=200
File "/usr/local/lib/python2.7/dist-packages/influxdb/client.py", line 127, in request
raise error
influxdb.client.InfluxDBClientError
I have tried with double quotes too but get the same error. This is stub code (to keep the example minimal). I realize that in the example the points list contains just one list object, but in reality it contains multiple. I am generating the JSON by reading through the outputs of various API calls.
json_body = '[{\"points\":['
json_body += '[\"appx\", 1, 10, 0]'
json_body += '], \"name\": \"WS1\", \"columns\": [\"RName\", \"RIn\", \"SIn\", \"OIn\"]}]'
print("Write points: {0}".format(json_body))
client.write_points(json.dumps(json_body))
I understand that if I used the following, things would work:
json_body = [{ "points": [["appx", 1, 10, 0]], "name": "WS1", "columns": ["Rname", "RIn", "SIn", "OIn"]}]

You don't need to create JSON manually. Just pass an appropriate Python structure to the write_points function. Try something like this:
data = [{'points':[['appx', 1, 10, 0]],
'name': 'WS1',
'columns': ['RName', 'RIn', 'SIn', 'OIn']}]
client.write_points(data)
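Since the real data has several rows gathered from API calls, the same idea extends to building the points list in a loop. The following is only a minimal sketch: `rows` is a hypothetical stand-in for the API results, and the `client.write_points` call is left commented out because it needs a live InfluxDB connection.

```python
import json

# Hypothetical rows standing in for values collected from the API calls.
rows = [
    ("appx", 1, 10, 0),
    ("appy", 2, 20, 1),
]

# Build ordinary Python lists and dicts instead of concatenating strings.
points = [list(row) for row in rows]
data = [{
    "points": points,
    "name": "WS1",
    "columns": ["RName", "RIn", "SIn", "OIn"],
}]

# client.write_points(data)  # pass the structure itself, not a string

# json.dumps is only needed to inspect the JSON text the structure maps to.
print(json.dumps(data))
```

Appending to `points` inside whatever loop reads the API output keeps the quoting and bracket handling out of your own code.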

Please visit JSON.org for proper JSON structure. I can see several errors with your self-generated JSON:
The outer-most item can be an unordered object, enclosed by curly braces {}, or an ordered array, enclosed by brackets []. Don't use both. Since your data is structured like a dict, the curly braces are appropriate.
All strings need to be enclosed in double quotes, not single. "This is valid JSON". 'This is not valid'.
Your 'points' value array is surrounded by double brackets, which is unnecessary. Only use a single set.
Please check out the documentation of the json module for details on how to use it. Basically, you can feed json.dumps() your Python data structure, and it will output it as valid JSON.
In [1]: my_data = {'points': ["appx", 1, 10, 0], 'name': "WS1", 'columns': ["RName", "RIn", "SIn", "OIn"]}
In [2]: my_data
Out[2]: {'points': ['appx', 1, 10, 0], 'name': 'WS1', 'columns': ['RName', 'RIn', 'SIn', 'OIn']}
In [3]: import json
In [4]: json.dumps(my_data)
Out[4]: '{"points": ["appx", 1, 10, 0], "name": "WS1", "columns": ["RName", "RIn", "SIn", "OIn"]}'
You'll notice the value of using a Python data structure first: because it's Python, you don't need to worry about single vs. double quotes; json.dumps() will convert them automatically. However, building a string with embedded single quotes leads to this:
In [5]: op_json = "[{'points':[['appx', 1, 10, 0]], 'name': 'WS1', 'columns': ['RName', 'RIn', 'SIn', 'OIn']}]"
In [6]: json.dumps(op_json)
Out[6]: '"[{\'points\':[[\'appx\', 1, 10, 0]], \'name\': \'WS1\', \'columns\': [\'RName\', \'RIn\', \'SIn\', \'OIn\']}]"'
since you fed the string to json.dumps(), not the data structure.
So next time, don't attempt to build JSON yourself; rely on the dedicated module to do it.
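The difference can be checked by decoding each result again. This is a small sketch with made-up variable names (`structure`, `hand_built`), not code from the question:

```python
import json

structure = [{"name": "WS1", "points": [["appx", 1, 10, 0]]}]
hand_built = "[{'name': 'WS1', 'points': [['appx', 1, 10, 0]]}]"

encoded_structure = json.dumps(structure)  # a JSON array of objects
encoded_string = json.dumps(hand_built)    # a single JSON string literal

# Decoding shows what each payload really contains.
print(type(json.loads(encoded_structure)))  # a list
print(type(json.loads(encoded_string)))     # still just a string
```

A receiver that expects an array of objects gets only one quoted string in the second case, which is consistent with the client error in the question.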

Related

How to parse nested JSON file in Pandas

I'm trying to transform a JSON file generated by the Day One Journal to a text file using Python but hit a brick wall.
This is broadly the format:
{'metadata': {'version': '1.0'},
'entries': [{'richText': '{"meta":{"version":1,"small-lines-removed":true,"created":{"platform":"com.bloombuilt.dayone-mac","version":1344}},"contents":[{"attributes":{"line":{"header":1,"identifier":"F78B28DA-488E-489E-9C95-1A0648099792"}},"text":"2022\\n"},{"attributes":{"line":{"header":0,"identifier":"FA8C6594-F43D-4652-B442-DAF72A379799"}},"text":"\\n"},{"attributes":{"line":{"header":0,"identifier":"0923BCC8-B24A-4C0D-963C-73D09561EECD"}},"text":"It’s the beginning of a new year"},{"embeddedObjects":[{"type":"horizontalRuleLine"}]},{"text":"\\n\\n\\n\\n"},{"embeddedObjects":[{"type":"horizontalRuleLine"}]}]}',
'duration': 0,
'creationOSVersion': '12.1',
'weather': {'sunsetDate': '2022-01-12T16:15:28Z',
'temperatureCelsius': 7,
'weatherServiceName': 'HAMweather',
'windBearing': 230,
'sunriseDate': '2022-01-12T08:00:44Z',
'conditionsDescription': 'Mostly Clear',
'pressureMB': 1042,
'visibilityKM': 48.28020095825195,
'relativeHumidity': 81,
'windSpeedKPH': 6,
'weatherCode': 'clear-night',
'windChillCelsius': 6.699999809265137},
'editingTime': 2925.313938140869,
'timeZone': 'Europe/London',
'creationDeviceType': 'Hal 9000',
'uuid': '988D9D9876624FAEB88F9BCC666FD9CD',
'creationDeviceModel': 'MacBookPro15,2',
'starred': False,
'location': {'region': {'center': {'longitude': -0.0095,
'latitude': 51},
'radius': 75},
'localityName': 'London',
'country': 'United Kingdom',
'timeZoneName': 'Europe/London',
'administrativeArea': 'England',
'longitude': -0.0095,
'placeName': 'Somewhere',
'latitude': 51},
'isPinned': False,
'creationDevice': 'somedevice'...,
}
I only want the 'text' (of which there might be a number of entries) and the 'creationDate', so that I get a daily record.
My code to pull out the data is straightforward:
import json
# Opening JSON file
f = open('files/2022.json')
# returns JSON object as
# a dictionary
data = json.load(f)
# Closing file
f.close()
I've tried using list comprehensions and then concatenating the Series in Pandas, but the two don't match in length, because multiple entries on one day mix up the dataframe.
I wanted to use this code:
result = []
for i in data['entries']:
    entry = i['creationDate'] + i['text']
    result.append(entry)
but I get this error:
KeyError: 'text'
What do I need to do?
Update:
{'richText': '{"meta":{"version":1,"small-lines-removed":true,"created":{"platform":"com.bloombuilt.dayone-mac","version":1344}},"contents":[{"text":"Later than I planned\\n"}]}',
'duration': 0,
'creationOSVersion': '12.1',
'weather': {'sunsetDate': '2022-01-12T16:15:28Z',
'temperatureCelsius': 7,
'weatherServiceName': 'HAMweather',
'windBearing': 230,
'sunriseDate': '2022-01-12T08:00:44Z',
'conditionsDescription': 'Mostly Clear',
'pressureMB': 1042,
'visibilityKM': 48.28020095825195,
'relativeHumidity': 81,
'windSpeedKPH': 6,
'weatherCode': 'clear-night',
'windChillCelsius': 6.699999809265137},
'editingTime': 672.3099998235703,
'timeZone': 'Europe/London',
'creationDeviceType': 'Computer',
'uuid': 'F53DCC5E05BB4106A49C76954117DBF4',
'creationDeviceModel': 'xompurwe',
'isPinned': False,
'creationDevice': 'Computer',
'text': 'Later than I planned \\\n',
'modifiedDate': '2022-01-05T01:01:29Z',
'isAllDay': False,
'creationDate': '2022-01-05T00:39:19Z',
'creationOSName': 'macOS'},
I sort of managed to work out a solution. Thank you to everyone who helped this morning, particularly @Tomer S.
My solution was:
result = []
for i in data['entries']:
    print (i['creationDate'] + i['text'])
    result.append(entry)
It still doesn't give me what I want.
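One possible approach, sketched under the assumption that the KeyError comes from entries that have no 'text' key at all: look the key up with dict.get so those entries are skipped instead of raising. The `entries` list below is a hypothetical, trimmed-down stand-in for data['entries'].

```python
# Hypothetical, trimmed-down stand-in for data['entries'].
entries = [
    {"creationDate": "2022-01-05T00:39:19Z", "text": "Later than I planned"},
    {"creationDate": "2022-01-05T09:10:00Z", "text": "Second entry that day"},
    {"creationDate": "2022-01-12T08:00:44Z"},  # no 'text' key
]

result = []
for entry in entries:
    text = entry.get("text")  # None instead of KeyError when the key is missing
    if text is None:
        continue
    # Keep date and text together so several entries on one day stay aligned.
    result.append({"creationDate": entry["creationDate"], "text": text})

for row in result:
    print(row["creationDate"], row["text"])
```

Because each row keeps its own date and text, a list like `result` can be handed to pandas.DataFrame(result) to get one row per entry, which avoids the length mismatch from building two separate Series.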

json.loads function not giving python dictionary

I am trying to convert the JSON string below to a Python dictionary. I am using Python 3's json package for this. Here is the code that I am using:
a = "[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10751, 'name': 'Family'}, {'id': 10749, 'name': 'Romance'}]"
b = json.loads(json.dumps(a))
print(type(b))
And the output that I am getting from the above code is:
<class 'str'>
I saw similar questions asked on Stack Overflow, but the solutions presented for those questions do not apply to my case.
The JSON string that you are trying to convert is not properly formatted: JSON requires double quotes around strings. Also, you only need to call json.loads to convert a string into a dict or list.
The updated code would look like:
import json
a = '[{"id": 35, "name": "Comedy"}, {"id": 18, "name": "Drama"}, {"id": 10751, "name": "Family"}, {"id": 10749, "name": "Romance"}]'
b = json.loads(a)
print(type(b))
Hope this explains why you are not getting the expected results.
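If the single-quoted string cannot be changed at its source, one alternative is to treat it as a Python literal rather than JSON. This is a sketch that assumes the string really is a plain Python literal (lists, dicts, numbers, strings); it uses a shortened version of the string from the question.

```python
import ast
import json

a = "[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}]"

# The string is a Python literal, not JSON, so json.loads(a) would raise
# json.JSONDecodeError. ast.literal_eval parses Python literals only and
# does not execute arbitrary code.
b = ast.literal_eval(a)

print(type(b))
print(json.dumps(b))  # the same data as valid JSON, with double quotes
```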
A JSON array is enclosed in [ ], while a JSON object is enclosed in { }.
The string in a is a JSON array, so it can only be converted into a list.
Keys and string values must be enclosed in double quotes; JSON itself requires this, so Python's json library rejects single quotes.
b = json.loads(a) will give a list of dictionary objects.
To get a dictionary of dictionaries instead, you need to associate a key with each individual dictionary.
d = dict()
ind = 0
for data in b:
    d[ind] = data
    ind+=1
Now the output that you get will be
{0: {'id': 35, 'name': 'Comedy'}, 1: {'id': 18, 'name': 'Drama'}, 2: {'id': 10751, 'name': 'Family'}, 3: {'id': 10749, 'name': 'Romance'}}
which is a dictionary of dictionaries.
Thank you

Why does dask.bag.read_text(filename).map(json.loads) return a list?

I need to read several json.gz files using Dask. I am trying to achieve this by using dask.bag.read_text(filename).map(json.loads), but the output is a nested list (the files contain lists of dictionaries), whereas I would like to get just a list of dictionaries.
I have included a small example that reproduces my problem, below.
import json
import gzip
import dask.bag as db
dict_list = [{'id': 123, 'name': 'lemurt', 'indices': [1,10]}, {'id': 345, 'name': 'katin', 'indices': [2,11]}]
filename = './test.json.gz'
# Write json
with gzip.open(filename, 'wt') as write_file:
    json.dump(dict_list , write_file)
# Read json
with gzip.open(filename, "r") as read_file:
    data = json.load(read_file)
# Read json with Dask
data_dask = db.read_text(filename).map(json.loads).compute()
print(data)
print(data_dask)
I would like to get the first output:
[{'id': 123, 'name': 'lemurt', 'indices': [1, 10]}, {'id': 345, 'name': 'katin', 'indices': [2, 11]}]
But instead I get the second one:
[[{'id': 123, 'name': 'lemurt', 'indices': [1, 10]}, {'id': 345, 'name': 'katin', 'indices': [2, 11]}]]
The read_text function returns a bag, where each element is a line of text. So you have a list of strings. Then, you parse each of those lines of text with json.loads, so each of those lines of text becomes a list again. So you have a list of lists.
In your case you might use map_partitions and a function that expects a list containing a single line of text:
b = db.read_text("*.json.gz").map_partitions(lambda L: json.loads(L[0]))
Following the comment by @MRocklin, I ended up solving my problem by changing the way I was writing the json.gz files.
Instead of
with gzip.open(filename, 'wt') as write_file:
    json.dump(dict_list , write_file)
I used
with gzip.open(filename, 'wt') as write_file:
    for dd in dict_list:
        json.dump(dd , write_file)
        write_file.write("\n")
and kept reading the files as
db.read_text(filename).map(json.loads)
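The effect of the one-document-per-line layout can be checked without Dask. The sketch below uses only the standard library and a temporary path chosen for illustration; parsing each line separately mirrors what mapping json.loads over the lines of a text file does.

```python
import gzip
import json
import os
import tempfile

dict_list = [
    {"id": 123, "name": "lemurt", "indices": [1, 10]},
    {"id": 345, "name": "katin", "indices": [2, 11]},
]

# Temporary path used only for this illustration.
filename = os.path.join(tempfile.mkdtemp(), "test.json.gz")

# Write one JSON document per line.
with gzip.open(filename, "wt") as write_file:
    for dd in dict_list:
        write_file.write(json.dumps(dd) + "\n")

# Parse each line on its own; every line yields one dictionary.
with gzip.open(filename, "rt") as read_file:
    records = [json.loads(line) for line in read_file]

print(records)
```

If the files cannot be rewritten, flattening the nested result is another option to consider; Dask bags have a flatten method that should serve this purpose, though that variant is not tested here.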

Get JSON's attribute value in Chatterbot and Django integration

statement.text in chatterbot and Django integration returns
{'text': u'How are you doing?', 'created_at': datetime.datetime(2017, 2, 20, 7, 37, 30, 746345, tzinfo=<UTC>), 'extra_data': {}, 'in_response_to': [{'text': u'Hi', 'occurrence': 3}]}
I want a value of text attribute so that it prints How are you doing?
ChatterBot returns a dict here, so you can use ordinary dictionary operations such as the following:
[1]: data = {'text': u'How are you doing?', 'created_at': datetime.datetime(2017, 2, 20, 7, 37, 30, 746345, tzinfo=<UTC>), 'extra_data': {}, 'in_response_to': [{'text': u'Hi', 'occurrence': 3}]}
[2]: data['text'] or data.get('text') [this approach is good].
What you got is a dictionary. A value can be obtained with the get() function. You can also use data['text'], but that raises a KeyError if the key is missing, whereas get() returns None when the key is not present.
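The difference between the two lookups shows up only for a missing key. Here is a short sketch using a trimmed-down stand-in for the dictionary above; `missing_key` is a made-up name for illustration.

```python
# Trimmed-down stand-in for the dictionary shown above.
data = {"text": "How are you doing?", "extra_data": {}}

print(data["text"])      # direct lookup
print(data.get("text"))  # same value via get()

# For a key that does not exist, get() returns None (or a chosen default),
# while square brackets raise KeyError.
missing = data.get("missing_key")
fallback = data.get("missing_key", "n/a")
try:
    data["missing_key"]
except KeyError:
    raised = True
else:
    raised = False
```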

Iterate JSON with Ruby and insert to MySQL db

I'm accessing an open JSON API like this:
require 'net/http'
require 'rubygems'
require 'json'
require 'uri'
require 'pp'
url = "http://api.turfgame.com/v4/users"
uri = URI.parse(url)
data = [{"name" => "tbone"}]
headers = {"Content-Type" => "application/json"}
http = Net::HTTP.new(uri.host,uri.port)
response = http.post(uri.path,data.to_json,headers)
This gives JSON output like this:
[{"region"=>{"id"=>141, "name"=>"Stockholm"}, "medals"=>[34, 53, 12, 5, 46], "pointsPerHour"=>95, "blocktime"=>24, "zones"=>[275, 42460, 35956, 31926, 24247, 31722, 1097, 26104, 6072, 24283, 289, 325, 22199, 37740, 22198, 37743, 37074, 22845, 22201, 22846, 7477, 7310], "country"=>"se", "id"=>95195, "rank"=>24, "name"=>"tbone", "uniqueZonesTaken"=>178, "taken"=>1170, "points"=>41693, "place"=>377, "totalPoints"=>176654}]
What I want to do is to grab some of the tags:
name (not in the region block but "tbone")
blocktime
totalPoints
all the IDs from the zone-array
and insert into a mysql table. But I don't get how to iterate the JSON object and get the stuff I want.
doing
puts data["name"]
gives an error like
./headerTest.rb:28:in `[]': can't convert String into Integer (TypeError)
from ./headerTest.rb:28:in `<main>'
I get that it's because there are two name tags at different depths, but I don't get how to access either one specifically.
Please?
So you have:
result = [{"region"=>{"id"=>141, "name"=>"Stockholm"}, "medals"=>[34, 53, 12, 5, 46], "pointsPerHour"=>95, "blocktime"=>24, "zones"=>[275, 42460, 35956, 31926, 24247, 31722, 1097, 26104, 6072, 24283, 289, 325, 22199, 37740, 22198, 37743, 37074, 22845, 22201, 22846, 7477, 7310], "country"=>"se", "id"=>95195, "rank"=>24, "name"=>"tbone", "uniqueZonesTaken"=>178, "taken"=>1170, "points"=>41693, "place"=>377, "totalPoints"=>176654}]
This is an array with one element. To obtain the values you want, do:
result[0].slice("name", "blocktime", "totalPoints", "zones")
# this returns => {"name"=>"tbone", "blocktime"=>24, "totalPoints"=>176654, "zones"=>[275, 42460, 35956, 31926, 24247, 31722, 1097, 26104, 6072, 24283, 289, 325, 22199, 37740, 22198, 37743, 37074, 22845, 22201, 22846, 7477, 7310]}