Use JSON to override default argparse parameters

I have an argparse function containing a mix of internal and user-specified settings. I want to use a JSON file as a configuration file to store the user-specified parameters, so that the JSON can be parsed back into this argparse function.
The parameters also mix several data types; the types are defined in argparse but not in the JSON.
My argparse function looks like this:
def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--name', nargs='+', type=str, default='experiment', help='project name')  # specified by users
    parser.add_argument('--visualise', action='store_true', help='output contains graphs')  # specified by users
    parser.add_argument('--imgsize', '--img', '--img-size', nargs='+', type=int, default=[640], help='image size h,w')  # let users specify
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')  # internal default setting
    parser.add_argument('--thres', type=float, default=0.3, help='threshold')  # internal default setting
    opt = parser.parse_args()
    return opt
My JSON configuration file config.json looks like this; it lets users specify their parameters:
{
    "name": "trial_001",
    "visualise": true,
    "imgsize": 1280
}
I tried to pass in the new configuration with the script below and ran into the error TypeError: 'bool' object is not subscriptable. In the main() function, I want all default settings parsed as opt first; then the three user-defined parameters in config.json should override opt.name, opt.visualise and opt.imgsize. Finally, detect(**vars(opt)) reads all user and default parameters and applies the detect() function to them (note: my detect() function isn't included in this post as it is quite long). I'd appreciate any pointers here, thanks.
import argparse
import json

def main(opt):
    opt = parse_opt()
    with open('config.json') as config_file:
        d = json.loads(config_file.read())
    for item in d.items():
        args.extend(item)
    detect(**vars(opt))  # detect() is a function that reads all variables from opt

if __name__ == "__main__":
    main(opt)
EDIT: this is the full error message I encountered:
for item in d.items():
    args.extend(item)
parser.parse_args(args)
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-26-53a113868d66>", line 1, in <module>
parser.parse_args(args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/argparse.py", line 1749, in parse_args
args, argv = self.parse_known_args(args, namespace)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/argparse.py", line 1781, in parse_known_args
namespace, args = self._parse_known_args(args, namespace)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/argparse.py", line 1822, in _parse_known_args
option_tuple = self._parse_optional(arg_string)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/argparse.py", line 2108, in _parse_optional
if not arg_string[0] in self.prefix_chars:
TypeError: 'bool' object is not subscriptable
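A minimal sketch of one way to do this kind of override (illustrative only, not the asker's actual pipeline: detect() is replaced by a print and the internal-only options are omitted): parse the defaults into a namespace first, then copy each key/value pair from config.json onto it with setattr, assuming the JSON keys match the argparse destination names.
import argparse
import json

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--name', nargs='+', type=str, default='experiment', help='project name')
    parser.add_argument('--visualise', action='store_true', help='output contains graphs')
    parser.add_argument('--imgsize', '--img', '--img-size', nargs='+', type=int, default=[640], help='image size h,w')
    return parser.parse_args()

def main():
    opt = parse_opt()                        # defaults (plus anything given on the command line)
    with open('config.json') as config_file:
        overrides = json.load(config_file)   # e.g. {"name": "trial_001", "visualise": true, "imgsize": 1280}
    for key, value in overrides.items():
        setattr(opt, key, value)             # overrides opt.name, opt.visualise, opt.imgsize
    print(vars(opt))                         # detect(**vars(opt)) would go here

if __name__ == "__main__":
    main()
Note that the JSON values are used as-is (imgsize becomes the int 1280 rather than a list), and that setattr sidesteps the original TypeError, which comes from feeding raw JSON values, including the bool true, back into parser.parse_args(), which expects a list of strings.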

Related

Pandas reading JSON gives me ValueError: Invalid file path or buffer object type in Plotly Dash

This line:
else:
    # add to this
    nutrients_totals_df = pd.read_json(total_nutrients_json, orient='split')
is throwing the error.
I write my json like:
nutrients_json = nutrients_df.to_json(date_format='iso', orient='split')
Then I stash it in a hidden div or dcc.Storage in one callback and get it in another callback. How do I fix this error?
When I read JSON files that I've written with pandas, I use the function below and call it inside json.loads().
def read_json_file_from_local(fullpath):
    """read json file from local"""
    with open(fullpath, 'rb') as f:
        data = f.read().decode('utf-8')
    return data

df = json.loads(read_json_file_from_local(fullpath))
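For a quick round-trip check outside of Dash, a minimal sketch along the same lines (the DataFrame here is made up; wrapping the JSON string in StringIO makes the buffer type unambiguous for newer pandas versions):
from io import StringIO

import pandas as pd

nutrients_df = pd.DataFrame({"nutrient": ["protein", "fat"], "grams": [30, 10]})  # hypothetical data

# serialize exactly as in the question ...
nutrients_json = nutrients_df.to_json(date_format='iso', orient='split')

# ... then read it back from a file-like buffer rather than a bare string
nutrients_totals_df = pd.read_json(StringIO(nutrients_json), orient='split')
print(nutrients_totals_df)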

Convert a pipeline_pb2.TrainEvalPipelineConfig to a JSON or YAML file for the TensorFlow Object Detection API

I want to convert a pipeline_pb2.TrainEvalPipelineConfig to JSON or YAML file format for the TensorFlow Object Detection API. I tried converting the protobuf file using:
import tensorflow as tf
from google.protobuf import text_format
import yaml
from object_detection.protos import pipeline_pb2

def get_configs_from_pipeline_file(pipeline_config_path, config_override=None):
    '''
    read .config and convert it to proto_buffer_object
    '''
    pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
    with tf.gfile.GFile(pipeline_config_path, "r") as f:
        proto_str = f.read()
        text_format.Merge(proto_str, pipeline_config)
    if config_override:
        text_format.Merge(config_override, pipeline_config)
    # print(pipeline_config)
    return pipeline_config

def create_configs_from_pipeline_proto(pipeline_config):
    '''
    Returns the configurations as dictionary
    '''
    configs = {}
    configs["model"] = pipeline_config.model
    configs["train_config"] = pipeline_config.train_config
    configs["train_input_config"] = pipeline_config.train_input_reader
    configs["eval_config"] = pipeline_config.eval_config
    configs["eval_input_configs"] = pipeline_config.eval_input_reader
    # Keeps eval_input_config only for backwards compatibility. All clients should
    # read eval_input_configs instead.
    if configs["eval_input_configs"]:
        configs["eval_input_config"] = configs["eval_input_configs"][0]
    if pipeline_config.HasField("graph_rewriter"):
        configs["graph_rewriter_config"] = pipeline_config.graph_rewriter
    return configs

configs = get_configs_from_pipeline_file('pipeline.config')
config_as_dict = create_configs_from_pipeline_proto(configs)
But when I try to convert the returned dictionary to YAML with yaml.dump(config_as_dict), it says:
TypeError: can't pickle google.protobuf.pyext._message.RepeatedCompositeContainer objects
For json.dumps(config_as_dict) it says:
Traceback (most recent call last):
File "config_file_parsing.py", line 48, in <module>
config_as_json = json.dumps(config_as_dict)
File "/usr/lib/python3.5/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
return _iterencode(o, 0)
File "/usr/lib/python3.5/json/encoder.py", line 179, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: label_map_path: "label_map.pbtxt"
shuffle: true
tf_record_input_reader {
input_path: "dataset.record"
}
is not JSON serializable
Would appreciate some help here.
JSON can only dump a subset of the Python primitives plus the dict and list collections (with limitations on self-referencing).
YAML is more powerful and can be used to dump arbitrary Python objects, but only if those objects can be "investigated" during the representation phase of the dump, which essentially limits that to instances of pure Python classes. For objects created at the C level, one can write explicit dumpers, and if none are available Python will try to use the pickle protocol to dump the data to YAML.
Inspecting protobuf on PyPI shows me that there are non-generic wheels available, which is always an indication of some C code optimization. Inspecting one of these files indeed shows a pre-compiled shared object.
Although you make a dict out of the config, this dict can of course only be dumped when all its keys and all its values can be dumped. Since your keys are strings (necessary for JSON), you need to look at each of the values, to find the one that doesn't dump, and convert that to a dumpable object structure (dict/list for JSON, pure Python class for YAML).
You might want to take a look at the google.protobuf.json_format module.
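For illustration, a rough sketch of what that module offers, assuming pipeline_config is the proto returned by get_configs_from_pipeline_file above; it serializes the whole message, repeated fields included, without building the intermediate dict:
import yaml
from google.protobuf import json_format

pipeline_config = get_configs_from_pipeline_file('pipeline.config')

config_as_json = json_format.MessageToJson(pipeline_config)   # JSON string
config_as_dict = json_format.MessageToDict(pipeline_config)   # plain dicts/lists/scalars
print(yaml.safe_dump(config_as_dict))                         # now dumpable as YAML as well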

Python JSON reference and validation

I'm starting to use Python to validate some JSON data. I'm using a JSON schema with references, but I'm having trouble referencing those files. This is the code:
from os.path import join, dirname
from jsonschema import validate
import jsonref

def assert_valid_schema(data, schema_file):
    """ Checks whether the given data matches the schema """
    schema = _load_json_schema(schema_file)
    return validate(data, schema)

def _load_json_schema(filename):
    """ Loads the given schema file """
    relative_path = join('schemas', filename).replace("\\", "/")
    absolute_path = join(dirname(__file__), relative_path).replace("\\", "/")
    base_path = dirname(absolute_path)
    base_uri = 'file://{}/'.format(base_path)
    with open(absolute_path) as schema_file:
        return jsonref.loads(schema_file.read(), base_uri=base_uri, jsonschema=True)

assert_valid_schema(data, 'grandpa.json')
The JSON data is:
data = {"id":1,"work":{"id":10,"name":"Miroirs","composer":{"id":100,"name":"Maurice Ravel","functions":["Composer"]}},"recording_artists":[{"id":101,"name":"Alexandre Tharaud","functions":["Piano"]},{"id":102,"name":"Jean-Martial Golaz","functions":["Engineer","Producer"]}]}
And I'm saving the schema and the referenced file into a schemas folder:
recording.json:
{"$schema":"http://json-schema.org/draft-04/schema#","title":"Schema for a recording","type":"object","properties":{"id":{"type":"number"},"work":{"type":"object","properties":{"id":{"type":"number"},"name":{"type":"string"},"composer":{"$ref":"artist.json"}}},"recording_artists":{"type":"array","items":{"$ref":"artist.json"}}},"required":["id","work","recording_artists"]}
artist.json:
{"$schema":"http://json-schema.org/draft-04/schema#","title":"Schema for an artist","type":"object","properties":{"id":{"type":"number"},"name":{"type":"string"},"functions":{"type":"array","items":{"type":"string"}}},"required":["id","name","functions"]}
And this is my error:
Connected to pydev debugger (build 181.5281.24)
Traceback (most recent call last):
File "C:\Python\lib\site-packages\proxytypes.py", line 207, in __subject__
return self.cache
File "C:\Python\lib\site-packages\proxytypes.py", line 131, in __getattribute__
return _oga(self, attr)
AttributeError: 'JsonRef' object has no attribute 'cache'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python\lib\site-packages\jsonref.py", line 163, in callback
base_doc = self.loader(uri)
<MORE>
Python version: 3.6.5
Windows 7
IDE: IntelliJ IDEA
Can somebody help me?
Thank you
I am not sure why, but on Windows the file:// URI needs an extra /, so the following change should do the trick:
base_uri = 'file:///{}/'.format(base_path)
I arrived at this answer from a solution posted for a related issue in JSON Schema.
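An alternative sketch, if you would rather not hand-build the URI at all: pathlib produces the correct file:/// form (extra slash and drive letter included) on Windows as well as on POSIX.
from pathlib import Path

base_path = (Path(__file__).parent / 'schemas').resolve()
base_uri = base_path.as_uri() + '/'   # e.g. 'file:///C:/project/schemas/' on Windows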

How to use mock_open with json.load()?

I'm trying to get a unit test working that validates a function that reads credentials from a JSON-encoded file. Since the credentials themselves aren't fixed, the unit test needs to provide some and then test that they are correctly retrieved.
Here is the credentials function:
def read_credentials():
    basedir = os.path.dirname(__file__)
    with open(os.path.join(basedir, "authentication.json")) as f:
        data = json.load(f)
    return data["bot_name"], data["bot_password"]
and here is the test:
def test_credentials(self):
    with patch("builtins.open", mock_open(
        read_data='{"bot_name": "name", "bot_password": "password"}\n'
    )):
        name, password = shared.read_credentials()
        self.assertEqual(name, "name")
        self.assertEqual(password, "password")
However, when I run the test, the json code blows up with a decode error. Looking at the json code itself, I'm struggling to see why the mocked test is failing, because json.load(f) simply calls f.read() and then calls json.loads().
Indeed, if I change my authentication function to the following, the unit test works:
def read_credentials():
    # Read the authentication file from the current directory and create a
    # HTTPBasicAuth object that can then be used for future calls.
    basedir = os.path.dirname(__file__)
    with open(os.path.join(basedir, "authentication.json")) as f:
        content = f.read()
    data = json.loads(content)
    return data["bot_name"], data["bot_password"]
I don't necessarily mind leaving my code in this form, but I'd like to understand if I've got something wrong in my test that would allow me to keep my function in its original form.
Stack trace:
Traceback (most recent call last):
File "test_shared.py", line 56, in test_credentials
shared.read_credentials()
File "shared.py", line 60, in read_credentials
data = json.loads(content)
File "/home/philip/.local/share/virtualenvs/atlassian-webhook-basic-3gOncDp4/lib/python3.6/site-packages/flask/json/__init__.py", line 205, in loads
return _json.loads(s, **kwargs)
File "/usr/lib/python3.6/json/__init__.py", line 367, in loads
return cls(**kw).decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I had the same issue and got around it by mocking json.load and builtins.open:
import json
from unittest.mock import patch, MagicMock

# I don't care about the actual open
p1 = patch("builtins.open", MagicMock())

m = MagicMock(side_effect=[{"foo": "bar"}])
p2 = patch("json.load", m)

with p1 as p_open:
    with p2 as p_json_load:
        f = open("filename")
        print(json.load(f))
Result:
{'foo': 'bar'}
I had the exact same issue and solved it. Full code below, first the function to test, then the test itself.
The original function I want to test loads a json file that is structured like a dictionary, and checks to see if there's a specific key-value pair in it:
def check_if_file_has_real_data(filepath):
    with open(filepath, "r") as f:
        data = json.load(f)
    if "fake" in data["the_data"]:
        return False
    else:
        return True
But I want to test this without loading any actual file, exactly as you describe. Here's how I solved it:
from my_module import check_if_file_has_real_data
import mock

@mock.patch("my_module.json.load")
@mock.patch("my_module.open")
def test_check_if_file_has_real_data(mock_open, mock_json_load):
    mock_json_load.return_value = dict({"the_data": "This is fake data"})
    assert check_if_file_has_real_data("filepath") == False

    mock_json_load.return_value = dict({"the_data": "This is real data"})
    assert check_if_file_has_real_data("filepath") == True
The mock_open object isn't called explicitly in the test function, but if you don't include that decorator and argument, you get a file-path error when the with open part of check_if_file_has_real_data tries to run with the actual open function rather than the MagicMock that has been passed into it.
Then you overwrite the response provided by the json.load mock with whatever you want to test.
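As a side note, here is a minimal self-contained sketch (with illustrative names, and without the os.path.dirname lookup from the question) in which mock_open and the standard-library json.load do work together, since f.read() on the mocked handle returns read_data:
import json
from unittest.mock import mock_open, patch

def read_credentials():
    with open("authentication.json") as f:
        data = json.load(f)
    return data["bot_name"], data["bot_password"]

def test_credentials():
    fake = '{"bot_name": "name", "bot_password": "password"}\n'
    with patch("builtins.open", mock_open(read_data=fake)):
        assert read_credentials() == ("name", "password")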

Defining module variables from functions

I've finally been getting into Python, and have noticed something strange that works in Java but not in Python.
When I type the following:
fn = "" # Local filename storage.
def read(filename):
fn = filename
return open(filename, 'r').read()
My flake8 linter for Atom gives me the following error:
F841 - local variable 'fn' is assigned to but never used.
I'm assuming this means that the variable is being defined at the def level and not the module level, which is what I intend. Please correct me if I'm wrong.
I've searched Google with multiple wordings, but can't seem to phrase it in a way that brings up the right results.
Any ideas on how I can achieve module-level variable definitions from the function level?
If you want to declare fn as a global (module-level) variable, use the global statement.
def read(filename):
    global fn  # <-----
    fn = filename
    return open(filename, 'r').read()
BTW, ; is optional. Don't use it.
You can set a module level variable from the function by doing:
import sys

def read(filename):
    module = sys.modules[__name__]
    setattr(module, 'fn', filename)
    return open(filename, 'r').read()
However, this is a very strange thing to need; consider changing your architecture.
UPD: Let's consider an example:
# module1
# uncomment it to fix NameError and AttributeError
# some_var = ''

def foo(val):
    global some_var
    some_var = val

# module2
from module1 import *

print(some_var)  # raises NameError: name 'some_var' is not defined
foo('bar')
print(some_var)  # still raises NameError: name 'some_var' is not defined

# module3
import module1

print(module1.some_var)  # raises AttributeError: 'module' object has no attribute 'some_var'
module1.foo('bar')
print(module1.some_var)  # prints 'bar' even without the some_var = '' definition in module1
So it's not so obvious how global behaves during the import process. I think that manually doing setattr(module, 'attr_name', value) inside the read() call is clearer.