Python 3: deserialize nested dictionaries from SQLite - JSON

I have this sqlite3.register_converter function:
def str_to_dict(s: ByteString) -> Dict:
    if s and isinstance(s, ByteString):
        s = s.decode('UTF-8').replace("'", '"')
        return json.loads(s)
    raise TypeError(f'value: "{s}" should be a byte string')
which raises this exception:
File "/usr/lib64/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 30 (char 29)
when it encounters this string:
s = b"{'foo': {'bar': [('for', 'grid')]}}"
It seems that the issue comes from the nested list/tuple/dictionary, but what I don't understand is that in the sqlite shell the value is correctly returned by a select command:
select * from table;
whereas the same command issued from a Python script raises the exception above:
class SqliteDb:
    def __init__(self, file_path: str = '/tmp/database.db'):
        self.file_path = file_path
        self._db = sqlite3.connect(self.file_path, detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES)
        if self._db:
            self._cursor = self._db.cursor()
        else:
            raise ValueError
        # register data types converters and adapters
        sqlite3.register_adapter(Dict, dict_to_str)
        sqlite3.register_converter('Dict', str_to_dict)
        sqlite3.register_adapter(List, list_to_str)
        sqlite3.register_converter('List', str_to_list)

    def __del__(self):
        self._cursor.close()
        self._db.close()

    def select_from(self, table_name: str):
        with self._db:
            query = f'SELECT * FROM {table_name}'
            self._cursor.execute(query)
            return self._cursor.fetchall()
if __name__ == '__main__':
    try:
        sq = SqliteDb()
        selection_item = sq.select_from("table")[0]
        print(f'selection_item : {selection_item}')
    except KeyboardInterrupt:
        print('\n')
        sys.exit(0)
Also, the value is already saved in the database with no issue; only the selection causes this error.
So, does anybody have a clue why?

Your input is really a Python dict literal, and contains structures such as the tuple ('for', 'grid') that cannot be directly parsed as JSON even after you replace single quotes with double quotes.
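A quick way to see this (a standalone repro using the byte string from the question):
import json

s = b"{'foo': {'bar': [('for', 'grid')]}}"
# even after the quote replacement the tuple's parentheses remain,
# and '(' is not a valid start of a JSON value
json.loads(s.decode('UTF-8').replace("'", '"'))  # raises json.decoder.JSONDecodeError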
You can use ast.literal_eval instead to parse the input:
from ast import literal_eval

def str_to_dict(s: ByteString) -> Dict:
    return literal_eval(s.decode())
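As a quick sanity check (a minimal sketch using the byte string from the question):
from ast import literal_eval

s = b"{'foo': {'bar': [('for', 'grid')]}}"
print(literal_eval(s.decode()))
# {'foo': {'bar': [('for', 'grid')]}}
Note that literal_eval returns ('for', 'grid') as a real tuple, which JSON has no equivalent for; if you later need actual JSON output, convert tuples to lists before serializing.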


dumping YAML with tags as JSON

I know I can use ruamel.yaml to load a file with tags in it, but when I want to dump without them I get an error. Simplified example:
from ruamel.yaml import YAML
from json import dumps
import sys

yaml = YAML()
data = yaml.load("""
!mytag
a: 1
b: 2
c: 2022-05-01
""")
try:
    yaml2 = YAML(typ='safe', pure=True)
    yaml.default_flow_style = True
    yaml2.dump(data, sys.stdout)
except Exception as e:
    print('exception dumping using yaml', e)
try:
    print(dumps(data))
except Exception as e:
    print('exception dumping using json', e)
exception dumping using yaml cannot represent an object: ordereddict([('a', 1), ('b', 2), ('c', datetime.date(2022, 5, 1))])
exception dumping using json Object of type date is not JSON serializable
I cannot change the load() without getting an error on the tag. How to get output with tags stripped (YAML or JSON)?
You get the error because neither the safe dumper (pure or not) nor JSON knows about the ruamel.yaml internal types that preserve comments, tagging, block/flow-style, etc.
Dumping as YAML, you could register these types with alternate dump methods. As JSON this is more complex, as AFAIK you can only convert the leaf nodes (i.e. the YAML scalars; you would e.g. be able to use that to dump the datetime.date instance that is loaded as the value of key c).
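For the YAML route, here is a minimal sketch of registering alternate representers on the safe dumper; CommentedMap/CommentedSeq are ruamel.yaml internals, so treat this as an illustration rather than a guaranteed stable API:
import sys
import ruamel.yaml
from ruamel.yaml import YAML

yaml = YAML()
data = yaml.load("!mytag\na: 1\nb: 2\nc: 2022-05-01\n")

yaml2 = YAML(typ='safe', pure=True)
# dump the tag/comment preserving containers as their plain equivalents,
# which drops the tags along the way
yaml2.representer.add_representer(
    ruamel.yaml.comments.CommentedMap,
    lambda dumper, d: dumper.represent_dict(dict(d)),
)
yaml2.representer.add_representer(
    ruamel.yaml.comments.CommentedSeq,
    lambda dumper, s: dumper.represent_list(list(s)),
)
yaml2.dump(data, sys.stdout)  # a: 1, b: 2, c: 2022-05-01, no tag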
I have used YAML as a readable, editable and programmatically updatable config file, with a much faster loading JSON version of the data used if its file is not older than the corresponding YAML (if it is older, the JSON gets created from the YAML). The thing to do in order to dump(s) is to recursively generate Python primitives that JSON understands.
The following does so, but there are other constructs besides datetime instances that JSON doesn't allow, e.g. sequences or dicts used as keys (which is allowed in YAML, but not in JSON). For keys that are sequences, I concatenate the string representations of the elements:
import sys
import datetime
import json
from collections.abc import Mapping

import ruamel.yaml  # module import needed for the isinstance checks below
from ruamel.yaml import YAML

yaml = YAML()
data = yaml.load("""\
!mytag
a: 1
b: 2
c: 2022-05-01
[d, e]: !myseq [42, 196]
{f: g, 18: y}: !myscalar x
""")

def json_dump(data, out, indent=None):
    def scalar(obj):
        if obj is None:
            return None
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return str(obj)
        if isinstance(obj, ruamel.yaml.scalarbool.ScalarBoolean):
            return obj == 1
        if isinstance(obj, bool):
            return bool(obj)
        if isinstance(obj, int):
            return int(obj)
        if isinstance(obj, float):
            return float(obj)
        if isinstance(obj, tuple):
            return '_'.join([str(x) for x in obj])
        if isinstance(obj, Mapping):
            return '_'.join([f'{k}-{v}' for k, v in obj.items()])
        if not isinstance(obj, str):
            print('type', type(obj))
        return obj

    def prep(obj):
        if isinstance(obj, dict):
            return {scalar(k): prep(v) for k, v in obj.items()}
        if isinstance(obj, list):
            return [prep(elem) for elem in obj]
        if isinstance(obj, ruamel.yaml.comments.TaggedScalar):
            return prep(obj.value)
        return scalar(obj)

    res = prep(data)
    json.dump(res, out, indent=indent)

json_dump(data, sys.stdout, indent=2)
which gives:
{
  "a": 1,
  "b": 2,
  "c": "2022-05-01",
  "d_e": [
    42,
    196
  ],
  "f-g_18-y": "x"
}

How to raise an exception and return None when the conditions are not satisfied

I am new to Python and have just started learning it. I have written code to find the common characters in two strings, and I am getting the desired output. I want to modify the code so that it returns None for the following conditions:
1) for the two strings, there is no match
2) either string1 or string2 is nil/empty
3) either string1 or string2 is a hash/array/set/Fixnum [i.e. anything other than a string]
I am supposed to raise an exception for the above cases. I have gone through the forums and links but could not figure it out correctly. Could anyone please help me with how to raise an exception for the above conditions?
This is the code:
class CharactersInString:
    def __init__(self, value1, value2):
        self.value1 = value1
        self.value2 = value2

    def find_chars_order_n(self):
        new_string = []
        new_value1 = list(self.value1)
        new_value2 = list(self.value2)
        print("new_value1: ", new_value1)
        print("new_value2: ", new_value2)
        for i in new_value1:
            if i in new_value2 and i not in new_string:
                new_string.append(i)
        final_list = list(new_string)
        return ''.join(final_list)

if __name__ == "__main__":
    obj = CharactersInString("ho", "killmse")
    print(obj.find_chars_order_n())
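For reference, a minimal sketch of the validation the question asks about (the exception types below are illustrative choices, not the only reasonable ones):
class CharactersInString:
    def __init__(self, value1, value2):
        # case 3: reject anything other than a string
        if not isinstance(value1, str) or not isinstance(value2, str):
            raise TypeError("both arguments must be strings")
        # case 2: reject empty strings
        if not value1 or not value2:
            raise ValueError("both strings must be non-empty")
        self.value1 = value1
        self.value2 = value2

    def find_chars_order_n(self):
        common = []
        for i in self.value1:
            if i in self.value2 and i not in common:
                common.append(i)
        if not common:
            # case 1: no match between the two strings
            raise ValueError("no common characters found")
        return ''.join(common)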

Inserting cipher text into mysql using python

So I have a program which encrypts a string using AES and generates the cipher as bytes.
I wish to store this cipher as-is in a MySQL database.
I found we could use the VARBINARY data type in MySQL to do so.
In what ways could we achieve this?
Here is my try to do so:
import ast
import mysql.connector
from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

def encrypt(key, msg):
    iv = get_random_bytes(16)
    cipher = AES.new(key, AES.MODE_CFB, iv)
    ciphertext = cipher.encrypt(msg)  # Use the right method here
    db = iv + ciphertext
    print(db)
    cursor.executemany(sql_para_query, db)
    print(cursor.fetchone())
    connection.commit()
    return iv + ciphertext

def decrypt(key, ciphertext):
    iv = ciphertext[:16]
    ciphertext = ciphertext[16:]
    cipher = AES.new(key, AES.MODE_CFB, iv)
    msg = cipher.decrypt(ciphertext)
    return msg.decode("utf-8")

if __name__ == "__main__":
    connection = mysql.connector.connect(host="localhost", database="test_db", user="sann", password="userpass", use_pure=True)
    cursor = connection.cursor(prepared=True)
    sql_para_query = """insert into test1 values(UNHEX(%s)) """
    ed = input("(e)ncrypt or (d)ecrypt: ")
    key = str(1234567899876543)
    if ed == "e":
        msg = input("message: ")
        s = encrypt(key, msg)
        print("Encrypted message: ", s)
        file = open("e_tmp", "wb+")
        file.write(s)
        print(type(s))
    elif ed == "d":
        #smsg = input("encrypted message: ")
        #file = open("e_tmp","rb")
        #smsg = file.read()
        #print(type(smsg))
        sql_para_query = """select * from test1"""
        cursor.execute(sql_para_query)
        row = cursor.fetchone()
        print(row)
        #smsg = str(smsg)
        #msg = ast.literal_eval(smsg)
        #print(msg)
        #print(type(msg))
        #s=decrypt(key, msg)
        #print("Decrypted message: ", s)
        #print(type(s))
Error I'm getting:
Traceback (most recent call last):
  File "/home/mr_pool/.local/lib/python3.6/site-packages/mysql/connector/cursor.py", line 1233, in executemany
    self.execute(operation, params)
  File "/home/mr_pool/.local/lib/python3.6/site-packages/mysql/connector/cursor.py", line 1207, in execute
    elif len(self._prepared['parameters']) != len(params):
TypeError: object of type 'int' has no len()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tmp1.py", line 36, in <module>
    s = encrypt(key, msg)
  File "tmp1.py", line 14, in encrypt
    cursor.executemany(sql_para_query, db)
  File "/home/mr_pool/.local/lib/python3.6/site-packages/mysql/connector/cursor.py", line 1239, in executemany
    "Failed executing the operation; {error}".format(error=err))
mysql.connector.errors.InterfaceError: Failed executing the operation; object of type 'int' has no len()
Any other alternatives are also welcome.
My ultimate goal is to store the encrypted text in the database.
I reproduced your error, but it seems there are more errors in your code.
The key as well as the message are strings, therefore I got this error:
TypeError: Object type <class 'str'> cannot be passed to C code
Which I fixed by encoding them in UTF-8:
# line 38:
key = str(1234567899876543).encode("utf8")
# .... line 41:
s = encrypt(key, msg.encode("utf8"))
The UNHEX function in your SQL Query is not needed because we are entering the data as VARBINARY. You can change your statement to:
"""insert into test1 values(%s) """
The function executemany() can be replaced by execute() because you are only inserting one row. However, I will show the solution both ways, with execute() and with executemany().
insert with execute():
From the documentation:
cursor.execute(operation, params=None, multi=False)
iterator = cursor.execute(operation, params=None, multi=True)
This method executes the given database operation (query or command). The parameters found in the tuple or dictionary params are bound to the variables in the operation. Specify variables using %s or %(name)s parameter style (that is, using format or pyformat style). execute() returns an iterator if multi is True.
https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html
So we just need to build a tuple with your parameters and change the cursor call to:
cursor.execute(sql_para_query, (db, ))
insert with executemany():
From the documentation:
cursor.executemany(operation, seq_of_params)
This method prepares a database operation (query or command) and executes it against all parameter sequences or mappings found in the sequence seq_of_params.
https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-executemany.html
Therefore we need to build a sequence of the values you'd like to insert; in your case just one value:
cursor.executemany(sql_para_query, [(db, )])
To insert multiple values, you can add as many tuples into your sequence as you want.
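For example (an illustrative sketch; blob1 and blob2 stand in for values such as iv + ciphertext produced by encrypt()):
rows = [(blob1,), (blob2,)]  # one tuple per row to insert
cursor.executemany(sql_para_query, rows)
connection.commit()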
full code:
import ast
import mysql.connector
from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

def encrypt(key, msg):
    iv = get_random_bytes(16)
    cipher = AES.new(key, AES.MODE_CFB, iv)
    ciphertext = cipher.encrypt(msg)  # Use the right method here
    db = iv + ciphertext
    cursor.execute(sql_para_query, (db, ))
    connection.commit()
    return iv + ciphertext

def decrypt(key, ciphertext):
    iv = ciphertext[:16]
    ciphertext = ciphertext[16:]
    cipher = AES.new(key, AES.MODE_CFB, iv)
    msg = cipher.decrypt(ciphertext)
    return msg.decode("utf-8")

if __name__ == "__main__":
    connection = mysql.connector.connect(host="localhost", database="test_db", user="sann", password="userpass", use_pure=True)
    cursor = connection.cursor(prepared=True)
    sql_para_query = """insert into test1 values(%s) """
    ed = input("(e)ncrypt or (d)ecrypt: ")
    key = str(1234567899876543).encode("utf8")
    if ed == "e":
        msg = input("message: ")
        s = encrypt(key, msg.encode("utf8"))
        print("Encrypted message: ", s)
        file = open("e_tmp", "wb+")
        file.write(s)
        print(type(s))
    elif ed == "d":
        sql_para_query = """select * from test1"""
        cursor.execute(sql_para_query)
        row = cursor.fetchone()
        msg = row[0]  # row is a tuple, therefore get first element of it
        print("Unencrypted message: ", msg)
        s = decrypt(key, msg)
        print("Decrypted message: ", s)
output:
#encrypt:
(e)ncrypt or (d)ecrypt: e
message: this is my test message !!
Encrypted message: b"\x8f\xdd\xe6f\xb1\x8e\xb51\xc1'\x9d\xbf\xb5\xe1\xc7\x87\x99\x0e\xd4\xb2\x06;g\x85\xc4\xc1\xd2\x07\xb5\xc53x\xb9\xbc\x03+\xa2\x95\r4\xd1*"
<class 'bytes'>
#decrypt:
(e)ncrypt or (d)ecrypt: d
Unencrypted message: bytearray(b"\x8f\xdd\xe6f\xb1\x8e\xb51\xc1\'\x9d\xbf\xb5\xe1\xc7\x87\x99\x0e\xd4\xb2\x06;g\x85\xc4\xc1\xd2\x07\xb5\xc53x\xb9\xbc\x03+\xa2\x95\r4\xd1*")
Decrypted message: this is my test message !!

How to use mock_open with json.load()?

I'm trying to get a unit test working that validates a function that reads credentials from a JSON-encoded file. Since the credentials themselves aren't fixed, the unit test needs to provide some and then test that they are correctly retrieved.
Here is the credentials function:
def read_credentials():
    basedir = os.path.dirname(__file__)
    with open(os.path.join(basedir, "authentication.json")) as f:
        data = json.load(f)
    return data["bot_name"], data["bot_password"]
and here is the test:
def test_credentials(self):
    with patch("builtins.open", mock_open(
        read_data='{"bot_name": "name", "bot_password": "password"}\n'
    )):
        name, password = shared.read_credentials()
    self.assertEqual(name, "name")
    self.assertEqual(password, "password")
However, when I run the test, the json code blows up with a decode error. Looking at the json code itself, I'm struggling to see why the mocked test is failing, because json.load(f) simply calls f.read() and then calls json.loads().
Indeed, if I change my authentication function to the following, the unit test works:
def read_credentials():
    # Read the authentication file from the current directory and create a
    # HTTPBasicAuth object that can then be used for future calls.
    basedir = os.path.dirname(__file__)
    with open(os.path.join(basedir, "authentication.json")) as f:
        content = f.read()
    data = json.loads(content)
    return data["bot_name"], data["bot_password"]
I don't necessarily mind leaving my code in this form, but I'd like to understand if I've got something wrong in my test that would allow me to keep my function in its original form.
Stack trace:
Traceback (most recent call last):
  File "test_shared.py", line 56, in test_credentials
    shared.read_credentials()
  File "shared.py", line 60, in read_credentials
    data = json.loads(content)
  File "/home/philip/.local/share/virtualenvs/atlassian-webhook-basic-3gOncDp4/lib/python3.6/site-packages/flask/json/__init__.py", line 205, in loads
    return _json.loads(s, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 367, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I had the same issue and got around it by mocking json.load and builtins.open:
import json
from unittest.mock import patch, MagicMock

# I don't care about the actual open
p1 = patch("builtins.open", MagicMock())
m = MagicMock(side_effect=[{"foo": "bar"}])
p2 = patch("json.load", m)

with p1 as p_open:
    with p2 as p_json_load:
        f = open("filename")
        print(json.load(f))
Result:
{'foo': 'bar'}
I had the exact same issue and solved it. Full code below, first the function to test, then the test itself.
The original function I want to test loads a json file that is structured like a dictionary, and checks to see if there's a specific key-value pair in it:
def check_if_file_has_real_data(filepath):
    with open(filepath, "r") as f:
        data = json.load(f)
    if "fake" in data["the_data"]:
        return False
    else:
        return True
But I want to test this without loading any actual file, exactly as you describe. Here's how I solved it:
from my_module import check_if_file_has_real_data
import mock

@mock.patch("my_module.json.load")
@mock.patch("my_module.open")
def test_check_if_file_has_real_data(mock_open, mock_json_load):
    mock_json_load.return_value = dict({"the_data": "This is fake data"})
    assert check_if_file_has_real_data("filepath") == False

    mock_json_load.return_value = dict({"the_data": "This is real data"})
    assert check_if_file_has_real_data("filepath") == True
The mock_open object isn't called explicitly in the test function, but without that decorator and argument you get a filepath error when the with open part of check_if_file_has_real_data tries to run using the actual open function rather than the MagicMock that has been patched in.
Then you overwrite the response provided by the json.load mock with whatever you want to test.
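One detail worth spelling out, since it trips people up: stacked patch decorators are applied bottom-up, so the decorator closest to the function supplies the first mock argument. Schematically, for the test above:
@mock.patch("my_module.json.load")   # applied second -> second argument
@mock.patch("my_module.open")        # applied first  -> first argument
def test_check_if_file_has_real_data(mock_open, mock_json_load):
    ...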

JSON serialization is empty when serializing from eulxml in Python

I am working with eXistDB in Python and leveraging the eulxml library to handle mapping from the XML in the database into custom objects. I then want to serialize these objects to JSON (for another application to consume), but I'm running into issues. jsonpickle doesn't work (it ends up returning all sorts of excess garbage, and the values of the fields aren't actually encoded, only their eulxml types), and the standard json.dumps() simply gives me empty JSON (this was after trying to implement the solution detailed here). The problem seems to stem from the fact that the __dict__ values are not initialised in __init__ (as they are mapped as class properties), so __dict__ appears empty upon serialization. Here is some sample code:
Serializable Class Object
class Serializable(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # hack to fix _json.so make_encoder serialize properly
        self.__setitem__('dummy', 1)

    def _myattrs(self):
        return [
            (x, self._repr(getattr(self, x)))
            for x in self.__dir__()
            if x not in Serializable().__dir__()
        ]

    def _repr(self, value):
        if isinstance(value, (str, int, float, list, tuple, dict)):
            return value
        else:
            return repr(value)

    def __repr__(self):
        return '<%s.%s object at %s>' % (
            self.__class__.__module__,
            self.__class__.__name__,
            hex(id(self))
        )

    def keys(self):
        return iter([x[0] for x in self._myattrs()])

    def values(self):
        return iter([x[1] for x in self._myattrs()])

    def items(self):
        return iter(self._myattrs())
Base Class
from eulxml import xmlmap
import inspect
import lxml
import json as JSON
from models.serializable import Serializable

class AlcalaBase(xmlmap.XmlObject, Serializable):
    def toJSON(self):
        return JSON.dumps(self, indent=4)

    def to_json(self, skipBegin=False):
        json = list()
        if not skipBegin:
            json.append('{')
        json.append(str.format('"{0}": {{', self.ROOT_NAME))
        for attr, value in inspect.getmembers(self):
            if (attr.find("_") == -1
                    and attr.find("serialize") == -1
                    and attr.find("context") == -1
                    and attr.find("node") == -1
                    and attr.find("schema") == -1):
                if type(value) is xmlmap.fields.NodeList:
                    if len(value) > 0:
                        json.append(str.format('"{0}": [', attr))
                        for v in value:
                            json.append(v.to_json())
                            json.append(",")
                        json = json[:-1]
                        json.append("]")
                    else:
                        json.append(str.format('"{0}": null', attr))
                elif (type(value) is xmlmap.fields.StringField
                        or type(value) is str
                        or type(value) is lxml.etree._ElementUnicodeResult):
                    value = JSON.dumps(value)
                    json.append(str.format('"{0}": {1}', attr, value))
                elif (type(value) is xmlmap.fields.IntegerField
                        or type(value) is int
                        or type(value) is float):
                    json.append(str.format('"{0}": {1}', attr, value))
                elif value is None:
                    json.append(str.format('"{0}": null', attr))
                elif type(value) is list:
                    if len(value) > 0:
                        json.append(str.format('"{0}": [', attr))
                        for x in value:
                            json.append(x)
                            json.append(",")
                        json = json[:-1]
                        json.append("]")
                    else:
                        json.append(str.format('"{0}": null', attr))
                else:
                    json.append(value.to_json(skipBegin=True))
                json.append(",")
        json = json[:-1]
        if not skipBegin:
            json.append('}')
        json.append('}')
        return ''.join(json)
Sample Class that implements Base
from eulxml import xmlmap
from models.alcalaMonth import AlcalaMonth
from models.alcalaBase import AlcalaBase

class AlcalaPage(AlcalaBase):
    ROOT_NAME = "page"
    id = xmlmap.StringField('pageID')
    year = xmlmap.IntegerField('content/@yearID')
    months = xmlmap.NodeListField('content/month', AlcalaMonth)
The toJSON() method on the base is the method that uses the Serializable class, and it is the one returning empty JSON, e.g. "{}". The to_json() method is my attempt at a hand-rolled JSON implementation, but that has its own problems (for some reason it skips certain properties / child objects for no reason I can see, but that's a thread for another day).
If I access myobj.keys or myobj.values (both of which are exposed via Serializable) I can see the property names and values I would expect, but I have no idea why json.dumps() produces an empty JSON string.
Does anyone have any idea why I cannot get these objects to serialize to JSON? I've been pulling my hair out for weeks with this. Any help would be greatly appreciated.
So after a lot of playing around, I was finally able to fix this with jsonpickle, and it took only 3 lines of code:
import jsonpickle

def toJson(self):
    jsonpickle.set_preferred_backend('simplejson')
    return jsonpickle.encode(self, unpicklable=False)
I used simplejson to eliminate some of the additional object notation that was being added, and unpicklable=False removed the rest (I'm not sure if this would work with the default json backend as I didn't test it).
Now when I call toJson() on any object that inherits from this base class, I get very nice json and it works brilliantly.
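As a standalone illustration of what those two options do (using a plain class here instead of the eulxml-backed ones; simplejson must be installed for the backend call to succeed):
import jsonpickle

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

jsonpickle.set_preferred_backend('simplejson')
# unpicklable=False drops the py/object metadata and emits plain JSON
print(jsonpickle.encode(Point(1, 2), unpicklable=False))
# {"x": 1, "y": 2}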