How to retain double quotes while loading JSON in Python

Dumping JSON using YAML:
import yaml
c = {"a": 1}
d = yaml.dump(c)
Loading JSON using YAML:
yaml.load(d)
{'a': 1}  # the double quotes are lost
How do I ensure that the output of the load has double quotes?
Note: I also tried json and simplejson; both behave the same way.

To Python there is no difference between single and double quotes.
If you need the result as a JSON string, then use the standard json module - it will create a string with correctly formatted JSON, with double quotes.
>>> import json
>>> json.dumps({'a': 1})
'{"a": 1}'
Some frameworks or modules (such as requests) have built-in functions to send correctly formatted JSON (they may use the standard json module in the background), so you don't have to do it on your own.
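For example, with requests you can pass a dict via the json= parameter and the library serialises it to properly quoted JSON for you (the URL below is just a placeholder):
import requests

payload = {'a': 1}
# requests serialises the dict to JSON (double quotes) before sending
response = requests.post('https://example.com/api', json=payload)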

This
c = {"a": 1}
d = yaml.dump(c)
doesn't dump JSON, it dumps a Python dict as YAML. Use json.dumps() to make a JSON string from the dict, then optionally load/dump it as YAML and preserve the double quotes by specifying preserve_quotes while loading:
import sys
import json
import ruamel.yaml
c= {"a":1}
json_string = json.dumps(c)
print(json_string)
print('---------')
data = ruamel.yaml.round_trip_load(json_string, preserve_quotes=True)
data['a'] = 3
ruamel.yaml.round_trip_dump(data, sys.stdout)
that will print:
{"a": 1}
---------
{"a": 3}

Related

Convert text based key/value pair to JSON format

I have a text file with a lot of key/value pairs in the given format:
secret_key="XXXXX"
database_password="1234"
timout=30
.
.
.
and list continues...
I want these key/value pairs to be stored in JSON format so that I can make use of this data as JSON. Is there any way of doing this, i.e. any website or method to do it automatically?
The Python 3.8 script below would do the job ◡̈
import json

with open('text', 'r') as fp:
    dic = {}
    while line := fp.readline().strip():
        key, value = line.split('=', 1)  # split only on the first '='
        dic[key] = eval(value)
print(json.dumps(dic))
Note: eval is used so that quoted values lose their surrounding quotes (instead of the quotes being escaped in the JSON output) and bare numbers are parsed as numbers.
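If running eval on file contents is a concern, ast.literal_eval gives the same behaviour for plain literals without executing arbitrary code; a minimal sketch using the sample lines from the question:
import ast
import json

lines = ['secret_key="XXXXX"', 'database_password="1234"', 'timout=30']
dic = {}
for line in lines:
    key, value = line.split('=', 1)
    dic[key] = ast.literal_eval(value)  # '"XXXXX"' -> 'XXXXX', '30' -> 30
print(json.dumps(dic))
# {"secret_key": "XXXXX", "database_password": "1234", "timout": 30}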
I guess that is a .env file, so I would suggest you try to implement something like this in Python:
import json
import sys

try:
    dotenv = sys.argv[1]
except IndexError:
    dotenv = '.env'

with open(dotenv, 'r') as f:
    content = f.readlines()

# removes whitespace chars like '\n' at the end of each line and keeps only lines containing '='
content = [x.strip().split('=', 1) for x in content if '=' in x]
print(json.dumps(dict(content)))
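You can run it as, say, python dotenv_to_json.py some.env (the script name here is arbitrary), or with no argument to read .env from the current directory; the JSON is printed to stdout.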
Reference: https://gist.github.com/GabLeRoux/d6b2c2f7a69ebcd8430ea59c9bcc62c0
Please let me know if you want to implement it in a different language, such as JavaScript.

Writing JSON preserving double backslashes

I want to store Python data structures as JSON in a PostgreSQL database.
json.dumps() works well and I get properly formed JSON, as in:
>>> import json
>>> j = { 'table': '"public"."client"' }
>>> json.dumps(j)
'{"table": "\\"public\\".\\"client\\""}'
If I do print(json.dumps(j)), only one backslash per quote is printed, since the backslash acts as an escape character in the Python string literal:
>>> print(json.dumps(j))
{"table": "\"public\".\"client\""}
The problem
When I store this JSON in PostgreSQL with psycopg2, the backslashes should not be stripped, I think.
import psycopg2
import json

conn = None  # so the finally block works even if connect() fails
try:
    conn = psycopg2.connect("service=geotuga")
    cursor = conn.cursor()
    j = { 'table': '"public"."client"' }
    cursor.execute("INSERT INTO users.logger(subject, detail) VALUES (%s, %s);",
                   ('json', json.dumps(j)))
    conn.commit()
    cursor.close()
except (Exception, psycopg2.Error) as e:
    print(e)
finally:
    if conn is not None:
        conn.close()
In the database, the JSON string is stored as: {"table": "\"public\".\"client\""}. The double backslashes are gone.
How can I store the JSON created by json.dumps with psycopg2 without losing the double backslashes?
Note: the JSON stored in the database seems to be no longer valid. If I try to parse it with JavaScript, for example, it fails:
> x = '{"table": "\"public\".\"client\""}'
'{"table": ""public"."client""}'
> JSON.parse(x)
SyntaxError: Unexpected token p in JSON at position 12
As luigibertaco pointed out, the problem was how I observed the data in the database. The double backslashes are being properly written to the database, using psycopg2.
If I do:
# select detail from users.logger where subject = 'json' limit 1;
detail
------------------------------------
{"table": "\"public\".\"client\""}
(1 row)
The output shows just a single backslash before each quote.
But if I use the quote_literal PostgreSQL function, I get the raw data:
# select quote_literal(detail) from users.logger where subject = 'json' limit 1;
quote_literal
-------------------------------------------
E'{"table": "\\"public\\".\\"client\\""}'
(1 row)
PostgreSQL was able to parse the string
Another check I made was testing the JSON parsing on the PostgreSQL side. It works, so the string is properly encoded.
# select detail::json->'table' from users.logger where subject = 'json' limit 1;
?column?
-------------------------
"\"public\".\"client\""
(1 row)
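A quick read-back check from Python (reusing the service and table names from the question; just a sketch) confirms the round trip:
import json
import psycopg2

conn = psycopg2.connect("service=geotuga")
cursor = conn.cursor()
cursor.execute("SELECT detail FROM users.logger WHERE subject = %s LIMIT 1;", ('json',))
stored = cursor.fetchone()[0]
print(json.loads(stored))  # {'table': '"public"."client"'}
conn.close()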

Serialise and deserialise a pandas PeriodIndex series

The pandas Series.to_json() function is creating unreadable JSON when using a PeriodIndex.
The error that occurs is:
json.decoder.JSONDecodeError: Expecting ':' delimiter: line 1 column 5 (char 4)
I've tried changing the orient, but in every combination of serialising and deserialising the index is lost.
import json
import pandas as pd

idx = pd.PeriodIndex(['2019', '2020'], freq='A')
series = pd.Series([1, 2], index=idx)
json_series = series.to_json()  # this is a demo - in reality I'm storing this in a database, but this code throws the same error
value = json.loads(json_series)
A link to the pandas to_json docs
A link to the python json lib docs
The reason I'm not using json.dumps is that the pandas series object is not serialisable.
Python 3.7.3, pandas 0.24.2
A workaround is to convert the PeriodIndex to a regular Index before the dump and convert it back to a PeriodIndex after the load:
regular_idx = period_idx.astype(str)
# then dump
# after load
period_idx = pd.to_datetime(regular_idx).to_period()
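Applied to the series from the question, the full round trip might look like this (a sketch; the freq is passed explicitly when rebuilding, matching the question's freq='A'):
import json
import pandas as pd

idx = pd.PeriodIndex(['2019', '2020'], freq='A')
series = pd.Series([1, 2], index=idx)

# dump with a plain string index
series.index = series.index.astype(str)
json_series = series.to_json()

# load and restore the PeriodIndex
value = json.loads(json_series)
restored = pd.Series(value)
restored.index = pd.to_datetime(restored.index).to_period('A')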

Python 3 - Writing data from struct.unpack into json without individual recasting

I have a large object that is read from a binary file using struct.unpack and some of the values are character arrays which are read as bytes.
Since the character arrays in Python 3 are read as bytes instead of strings (as in Python 2), they cannot be passed directly to json.dumps, because bytes are not JSON serializable.
Is there any way to go from unpacked struct to json without searching through each value and converting the bytes to strings?
You can use a custom encoder in this case. See below:
import json

x = {}
x['bytes'] = [b"i am bytes", "test"]
x['string'] = "strings"
x['unicode'] = u"unicode string"

class MyEncoder(json.JSONEncoder):
    def default(self, o):
        if type(o) is bytes:
            return o.decode("utf-8")
        return super(MyEncoder, self).default(o)

print(json.dumps(x, cls=MyEncoder))
# {"bytes": ["i am bytes", "test"], "string": "strings", "unicode": "unicode string"}

Forcing Python json module to work with ASCII

I'm using json.dump() and json.load() to save/read a dictionary of strings to/from disk. The issue is that I can't have any of the strings be unicode - they come back as unicode no matter how I set the parameters to dump/load (including ensure_ascii and encoding).
If you are just dealing with simple JSON objects, you can use the following:
import json

def ascii_encode_dict(data):
    ascii_encode = lambda x: x.encode('ascii')
    return dict(map(ascii_encode, pair) for pair in data.items())

json.loads(json_data, object_hook=ascii_encode_dict)
Here is an example of how it works:
>>> json_data = '{"foo": "bar", "bar": "baz"}'
>>> json.loads(json_data) # old call gives unicode
{u'foo': u'bar', u'bar': u'baz'}
>>> json.loads(json_data, object_hook=ascii_encode_dict) # new call gives str
{'foo': 'bar', 'bar': 'baz'}
This answer works for a more complex JSON structure and gives a nice explanation of the object_hook parameter. There is also another answer there that recursively takes the result of a json.loads() call and converts all of the Unicode strings to byte strings.
And if the JSON object is a mix of datatypes, not only unicode strings, you can use this expression:
def ascii_encode_dict(data):
    ascii_encode = lambda x: x.encode('ascii') if isinstance(x, unicode) else x
    return dict(map(ascii_encode, pair) for pair in data.items())
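For example, under Python 2 this returns byte strings for the keys and string values while leaving the numbers untouched:
>>> json.loads('{"foo": "bar", "num": 3}', object_hook=ascii_encode_dict)
{'foo': 'bar', 'num': 3}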