Add a comma after a curly brace for JSON format

Hello there, I am working on a dataset, but it's not formatted correctly. It's missing its square brackets and the comma after every object.
Example:
{"is_sarcastic": 1, "headline": "thirtysomething scientists unveil doomsday clock of hair loss", "article_link": "something"}
{"is_sarcastic": 0, "headline": "dem rep. totally nails why congress is falling short on gender, racial equality", "article_link": "somethingelse"}
I want to format it such that it turns to this:
[{"is_sarcastic": 1, "headline": "thirtysomething scientists unveil doomsday clock of hair loss", "article_link": "something"},
{"is_sarcastic": 0, "headline": "dem rep. totally nails why congress is falling short on gender, racial equality", "article_link": "somethingelse"}]
I am using Python 3.x to achieve this task.

You can run the following Python script; it writes a file containing the output you want.
import json

dataJson = []
with open('data.json') as f:
    for jsonObj in f:
        # Each line holds one JSON object; skip blank lines
        if jsonObj.strip():
            dataJson.append(json.loads(jsonObj))

# Write all objects out as a single JSON array
with open('data2.json', 'w') as jsonfile:
    json.dump(dataJson, jsonfile)
Here data.json is the name of the file containing the dataset, and data2.json is the output file.
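If you want to see which line is malformed when the conversion fails, the same idea can be wrapped in a small function. This is only a sketch: the function name json_lines_to_list and the inline sample are made up for illustration, and it assumes one complete JSON object per line, as in the question.

```python
import json

def json_lines_to_list(lines):
    """Parse an iterable of lines, one JSON object per line, into a list."""
    records = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Report the offending line instead of failing with a bare traceback
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc
    return records

# Sample input taken from the question; a real script would pass an open file instead
sample = [
    '{"is_sarcastic": 1, "headline": "thirtysomething scientists unveil doomsday clock of hair loss", "article_link": "something"}\n',
    '\n',
    '{"is_sarcastic": 0, "headline": "dem rep. totally nails why congress is falling short on gender, racial equality", "article_link": "somethingelse"}\n',
]
records = json_lines_to_list(sample)
print(json.dumps(records))
```

Because an open file is itself an iterable of lines, json_lines_to_list(f) should work unchanged inside a with open(...) block.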

Related

how to convert a text file of JSON to JSON array in Python?

I'm a bit stumped on how to convert a text file of JSON to a JSON array in Python.
So I have a text file that is constantly being written to with JSON entries, but the file does not have the surrounding brackets that would make it a JSON array.
{"apple": "2", "orange": "3"}, {"apple": "5", "orange": "6"}, {"apple": "9", "orange": "10"} ...
How can I put that into a JSON array? My end goal is to read the latest entry every time a new entry is added. The JSON file is created by some other program that I have no control over, and it's being written to constantly, so I can't just slap brackets on the start and end of the file.
Thanks
After you read in the file, you can treat its contents as a string and add brackets around it. Then pass that string to the json library to decode it.
import json

with open('data.txt') as f:
    raw_data = f.read().splitlines()[-1]

list_data = f'[{raw_data}]'
json_data = json.loads(list_data)
print(json_data)
# [{'apple': '2', 'orange': '3'}, {'apple': '5', 'orange': '6'}, {'apple': '9', 'orange': '10'}]
This assumes that each line of the file should become a new array. The code extracts the last line of the file and converts it.
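Since the file is still being written to, the last entry may be incomplete at the moment you read it. One possible alternative, sketched below, is to walk through the text with json.JSONDecoder.raw_decode and stop at the first object that does not parse. The function name parse_comma_separated_objects is invented for this example, and it assumes the entries are separated only by commas and whitespace.

```python
import json

def parse_comma_separated_objects(text):
    """Return every complete JSON value found in a comma-separated string."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace or commas, so do it here
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text):
            break
        try:
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Most likely a half-written entry at the end of the file; stop here
            break
        objects.append(obj)
    return objects

text = '{"apple": "2", "orange": "3"}, {"apple": "5", "orange": "6"}, {"apple": "9", "orange": "10"}'
entries = parse_comma_separated_objects(text)
latest = entries[-1]
print(latest)
```

The latest complete entry is then simply the last element of the returned list.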

Python dictionary has json in it. How do I convert that dictionary to json?

I have this dictionary (or so type() tells me):
{'uploadedby': 'fred',
'return_url': '',
'id': '2200',
'question_json': '{"ops":[{"insert":"What metal is responsible for a Vulcan\'s green blood?\\n"}]}'}
When I use json.dumps on it, I get this:
{"uploadedby": "fred",
"return_url": "",
"id": "2200",
"question_json": "{\"ops\":[{\"insert\":\"What metal is responsible for a Vulcan's green blood?\\n\"}]}", "question": "What metal is responsible for a Vulcan's green blood?\r\n"}
I don't want all the escaping that's going on. Is there something I can do to correct this?
You can do something like the following to convert question_json into a python dict, and then dump the entire dict:
import json

test = {'uploadedby': 'fred',
        'return_url': '',
        'id': '2200',
        'question_json': '{"ops":[{"insert":"What metal is responsible for a Vulcan\'s green blood?\\n"}]}'}

# Decode only the value that is itself a JSON string, then dump the whole dict
json.dumps(
    {k: json.loads(v) if k == 'question_json' else v for k, v in test.items()}
)
'{"question_json": {"ops": [{"insert": "What metal is responsible for a Vulcan\'s green blood?\\n"}]}, "uploadedby": "fred", "return_url": "", "id": "2200"}'
You could try the following, which has the added benefit of not needing to specify which key contains the offending value. Here we're checking to see if we can effectively load a JSON string from any of the key-value pairs and leaving them alone if that fails.
import json

mydict = {'uploadedby': 'fred',
          'return_url': '',
          'id': '2200',
          'question_json': '{"ops":[{"insert":"What metal is responsible for a Vulcan\'s green blood?\\n"}]}'}

for key in mydict:
    try:
        mydict[key] = json.loads(mydict[key])
    except (TypeError, ValueError):
        # Not a JSON string (json.JSONDecodeError is a ValueError); leave the value alone
        pass
Now mydict looks like this, so json.dumps(mydict) no longer double-escapes the offending key, and the other keys are as they were:
{'uploadedby': 'fred',
'return_url': '',
'id': 2200,
'question_json': {'ops': [{'insert': "What metal is responsible for a Vulcan's green blood?\n"}]}}
Note that the id value has been converted to an int, which may or may not be your intent. It's hard to tell from the original question.
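If converting the id to an int is not what you want, one option is to accept a decoded value only when it turns out to be a dict or a list. The sketch below does that recursively; decode_nested_json is a hypothetical helper name, and the rule "only containers count as embedded JSON" is an assumption about what the asker wants.

```python
import json

def decode_nested_json(value):
    """Replace strings that contain a JSON object or array with the decoded value."""
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except ValueError:
            return value  # plain text, not JSON
        # Keep scalars such as '2200' as strings; only unwrap dicts and lists
        if isinstance(parsed, (dict, list)):
            return decode_nested_json(parsed)
        return value
    if isinstance(value, dict):
        return {key: decode_nested_json(item) for key, item in value.items()}
    if isinstance(value, list):
        return [decode_nested_json(item) for item in value]
    return value

example = {'uploadedby': 'fred',
           'return_url': '',
           'id': '2200',
           'question_json': '{"ops":[{"insert":"What metal is responsible for a Vulcan\'s green blood?\\n"}]}'}
result = decode_nested_json(example)
print(json.dumps(result))
```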

Convert data from CSV to JSON with grouping

Example CSV data (the top row is the column header, followed by three data lines):
floor,room,note1,note2,note3
floor1,room1,2people
floor2,room4,6people,projector
floor6,room5,20people,projector,phone
I need the output in JSON, but grouped by floor, like this:
floor
  room
    note1
    note2
    note3
  room
    note1
    note2
    note3
floor
  room
    note1
    note2
    note3
  room
    note1
    note2
    note3
So all floor1 rooms are in their own json grouping, then floor2 rooms etc.
Please could someone point me in the right direction in terms of which tools to look at and any specific functions, e.g. jq + categories. I've done some searching already and got muddled up between lots of different posts relating to csvtojson, jq and some Python scripts. Ideally I would like to include the solution in a shell script rather than a separate program/language (I have sysadmin experience but am not a programmer).
Many thanks
Perhaps this can get you started.
Use a programming language like Python to convert the CSV data into a dictionary data structure by splitting on the commas, and use the JSON library to dump your dictionary out as JSON.
I have assumed that actually you expect to have more than one room per floor and thus I took the liberty to adjust your input data a little.
import json

csv = """floor1,room1,note1,note2,note3
floor1,room2,2people
floor1,room3,3people
floor2,room4,6people,projector
floor2,room5,3people,projector
floor3,room6,1person
"""

response = {}
for line in csv.splitlines():
    fields = line.split(",")
    floor, room, data = fields[0], fields[1], fields[2:]
    if floor not in response:
        response[floor] = {}
    response[floor][room] = data

# print() as a function works on Python 3 (and on Python 2 for a single argument)
print(json.dumps(response))
If you then run that script and pipe it into jq (jq is only used here to pretty-print the output on your screen; it is not really required), you will see something like this (the order of the keys may vary with the Python version):
$ python test.py | jq .
{
  "floor1": {
    "room2": [
      "2people"
    ],
    "room3": [
      "3people"
    ],
    "room1": [
      "note1",
      "note2",
      "note3"
    ]
  },
  "floor2": {
    "room4": [
      "6people",
      "projector"
    ],
    "room5": [
      "3people",
      "projector"
    ]
  },
  "floor3": {
    "room6": [
      "1person"
    ]
  }
}
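As a variation on the script above, Python's csv module can do the splitting, which also copes with quoted fields that contain commas. This is a rough sketch rather than a finished tool: the function name group_rooms_by_floor and the has_header parameter are invented here, and it assumes the first two columns are always the floor and the room.

```python
import csv
import io
import json

def group_rooms_by_floor(csv_text, has_header=True):
    """Group CSV rows as {floor: {room: [notes...]}}."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the column header row
    grouped = {}
    for row in reader:
        if len(row) < 2:
            continue  # blank or incomplete line
        floor, room, notes = row[0], row[1], row[2:]
        grouped.setdefault(floor, {})[room] = notes
    return grouped

# Sample data as given in the question, including its header row
sample = """floor,room,note1,note2,note3
floor1,room1,2people
floor2,room4,6people,projector
floor6,room5,20people,projector,phone
"""
grouped = group_rooms_by_floor(sample)
print(json.dumps(grouped, indent=2))
```

Passing indent=2 to json.dumps pretty-prints the result, so jq is not needed for display.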

Pretty Printing Arbitrarily Nested Dictionaries & Lists in Vim

I've run into several scenarios where I have lists & dictionaries of data in vim, with arbitrarily nested data structures, e.g.:
a = [ 'somedata', d : { 'one': 'x', 'two': 'y', 'three': 'z' }, 'moredata' ]
b = { 'one': '1', 'two': '2', 'three': [ 'x', 'y', 'z' ] }
I'd really like to have a way to 'pretty print' them in a tabular format. It would be especially helpful to simply treat them as JSON directly in vim. Any suggestions?
You may want to look at Tim Pope's Scriptease.vim which provides many niceties for vim scripting and plugin development.
Although I am not sure how pretty :PP is I have found it pretty enough for my uses.
It should also be noted that vim script dictionaries and arrays are very similar to JSON, so you could in theory use any JSON tools after some clean up.
If your text is valid JSON, you can filter it through the external tool python -m json.tool.
In vim, just execute :%!python -m json.tool
Unfortunately your example won't work as written, because it is not valid JSON. It does work if you take a valid JSON example with nested dicts/lists.
Note that in the screencast I have ft=json, so some quotes cannot be seen in normal mode. The text I used:
[{"test1": 1, "test2": "win", "t3":{"nest1":"foo","nest2":"bar"}}, {"test1": 1, "test2": "win", "t3":{"nest1":"foo","nest2":"bar"}}, {"test1": 1, "test2": "win", "t3":{"nest1":"foo","nest2":"bar"}}, {"test1": 1, "test2": "win", "t3":{"nest1":"foo","nest2":"bar"}}]
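For data written with single quotes, like the b example in the question, json.tool will reject the input. If the text happens to be a valid Python literal, a small helper along these lines might bridge the gap. It is only a sketch: literal_to_pretty_json is a made-up name, and it assumes the text is a bare literal without the leading "b = " assignment.

```python
import ast
import json

def literal_to_pretty_json(text, indent=2):
    """Convert a Python-style literal (single quotes allowed) to indented JSON."""
    # ast.literal_eval only evaluates literals (strings, numbers, lists, dicts, ...),
    # so it does not run arbitrary code taken from the buffer
    value = ast.literal_eval(text.strip())
    return json.dumps(value, indent=indent)

sample = "{ 'one': '1', 'two': '2', 'three': [ 'x', 'y', 'z' ] }"
pretty = literal_to_pretty_json(sample)
print(pretty)
```

Wrapped in a script that reads standard input and prints the result, this could presumably be used as a vim filter in the same way as json.tool.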

Reading JSON data in a shell script

I have a JSON file containing data about some images:
{
"imageHeight": 1536,
"sessionID": "4340cc80cb532ecf106a7077fc2a166cb84e2c21",
"bottomHeight": 1536,
"imageID": 1,
"crops": 0,
"viewPortHeight": 1296,
"imageWidth": 2048,
"topHeight": 194,
"totalHeight": 4234
}
I wish to process these values in a simple manner in a shell script. I searched online but was not able to find any simple material to understand.
EDIT : What I wish to do with the values ?
I'm using convert (ImageMagick) to process the images. The whole workflow is something like this: read an entry, say crop, from a line in the JSON file and then use that value when cropping the image:
convert -crop [image width from json]x[image height from json]+0+[crop value from json] [session_id from json]-[imageID from json].png [sessionID]-[ImageID]-cropped.png
I would recommend using jq. For example, to get the imageHeight, you can use:
jq ".imageHeight" data.json
Output:
1536
If you want to store the value in a shell variable use:
variable_name=$(jq ".imageHeight" data.json)
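Python solution:
import json
from pprint import pprint

# Open the file in a with-block so it is closed automatically
with open('json_file') as json_data:
    data = json.load(json_data)

pprint(data)
data['bottomHeight']
Output (from an IPython session under Python 2, hence the u'' prefixes):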
In [28]: pprint(data)
{u'bottomHeight': 1536,
u'crops': 0,
u'imageHeight': 1536,
u'imageID': 1,
u'imageWidth': 2048,
u'sessionID': u'4340cc80cb532ecf106a7077fc2a166cb84e2c21',
u'topHeight': 194,
u'totalHeight': 4234,
u'viewPortHeight': 1296}
In [29]: data['bottomHeight']
Out[29]: 1536
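To connect this with the convert workflow from the question, the loaded values can be assembled into the command's argument list. The sketch below only builds the arguments and does not run ImageMagick. The function name build_convert_command is invented, and mapping "crop value" to the crops key is a guess based on the sample file.

```python
import json

def build_convert_command(data):
    """Build the argument list for the convert call described in the question."""
    # Assumption: the "crop value" in the question corresponds to the "crops" key
    geometry = f"{data['imageWidth']}x{data['imageHeight']}+0+{data['crops']}"
    source = f"{data['sessionID']}-{data['imageID']}.png"
    target = f"{data['sessionID']}-{data['imageID']}-cropped.png"
    return ["convert", "-crop", geometry, source, target]

# Inline copy of the JSON from the question; a real script would read the file
sample_json = """{
    "imageHeight": 1536,
    "sessionID": "4340cc80cb532ecf106a7077fc2a166cb84e2c21",
    "bottomHeight": 1536,
    "imageID": 1,
    "crops": 0,
    "viewPortHeight": 1296,
    "imageWidth": 2048,
    "topHeight": 194,
    "totalHeight": 4234
}"""
command = build_convert_command(json.loads(sample_json))
print(" ".join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) once the file names have been checked.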