Is it possible to access solver properties through Pycaffe? - caffe

Is it possible to read and access the solver properties in Pycaffe?
I need to use some of the information stored in the solver file, but apparently the solver object which is created using
import caffe
solver = caffe.get_solver(solver_path)
is of no use in this case. Is there any other way to get around this problem?

I couldn't find what I was after using the solver object, so I ended up writing a quick function to get around this issue:
def retrieve_field(solver_path, field=None):
    '''
    Returns a specific solver parameter value using the specified field,
    or the whole content of the solver file when no field is provided.
    returns:
        a string (the field value), or the whole content as a list of lines
    '''
    lines = []
    with open(solver_path, 'r') as solver_file:
        for line in solver_file:
            line = line.strip()
            lines.append(line)
            field_segments = line.split(':')
            if field_segments[0] == field:
                # if that line contains a # mark (for comments), drop the comment
                if '#' in field_segments[-1]:
                    idx = field_segments[-1].index('#')
                    return field_segments[-1][0:idx]
                else:
                    return field_segments[-1]
    return lines
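As a variation on the function above, the file content can be parsed once into a dictionary so that several fields can be looked up without rereading it. This is only a sketch under assumptions: parse_solver_fields is a hypothetical helper, the sample solver text is made up, the parser only handles flat `key: value` lines with no '#' inside quoted values, and a repeated key keeps only its last value.

```python
def parse_solver_fields(text):
    """Parse flat `key: value` lines of solver text into a dict of strings."""
    fields = {}
    for raw_line in text.splitlines():
        # Drop anything after a '#' comment marker, then surrounding whitespace.
        line = raw_line.split('#', 1)[0].strip()
        if not line or ':' not in line:
            continue
        key, value = line.split(':', 1)
        # Values are kept as strings; surrounding double quotes are removed.
        fields[key.strip()] = value.strip().strip('"')
    return fields


# Made-up solver content, for illustration only.
example = '''
net: "train_val.prototxt"   # network definition
base_lr: 0.01
max_iter: 10000
'''
fields = parse_solver_fields(example)
print(fields["base_lr"])
```

Values come back as strings, so numeric fields still need an explicit float() or int() conversion.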

Why can't I perform gradients on a variable passed as an argument to a tf.function?

My training loop was giving me the following warning:
WARNING:tensorflow:Gradients do not exist for variables ['noise:0'] when minimizing the loss.
After some tinkering I determined this only happened when the noise variable was being passed as an argument to my loss function which is a tf.function. The code below shows that there is no problem when the loss function is not a tf.function or when the global noise variable is referenced in the function. It also shows that an error results from trying to perform a gradient on the noise variable when it is used as argument in a tf.function:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
from tensorflow_probability import bijectors as tfb
constrain_positive = tfb.Shift(np.finfo(np.float64).tiny)(tfb.Exp())
noise = tfp.util.TransformedVariable(initial_value=.1, bijector=constrain_positive, dtype=np.float64, name="noise")
trainable_variables = [noise.variables[0]]
kernel = tfp.math.psd_kernels.ExponentiatedQuadratic()
optimizer = tf.keras.optimizers.Adam()
index_points = tf.constant([[0]], dtype=np.float64)
observations = tf.constant([0], dtype=np.float64)
# I can train noise when it is passed as an argument to a python function
def loss_function_1(index_points, observations, kernel, observation_noise_variance):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=observation_noise_variance)
    return -gp.log_prob(observations)
with tf.GradientTape() as tape:
    nll_1 = loss_function_1(index_points, observations, kernel, noise)
grad_1 = tape.gradient(nll_1, trainable_variables)
print(grad_1)
optimizer.apply_gradients(zip(grad_1, trainable_variables))
# I can train noise if it is used in a tf.function and not passed as an argument
@tf.function(autograph=False, experimental_compile=False)
def loss_function_2(index_points, observations, kernel):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=noise)
    return -gp.log_prob(observations)
with tf.GradientTape() as tape:
    nll_2 = loss_function_2(index_points, observations, kernel)
grad_2 = tape.gradient(nll_2, trainable_variables)
print(grad_2)
optimizer.apply_gradients(zip(grad_2, trainable_variables))
# I can train noise if it is passed as an argument to a tf.function, as long as
# the tf.function uses the global variable
@tf.function(autograph=False, experimental_compile=False)
def loss_function_3(index_points, observations, kernel, observation_noise_variance):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=noise)
    return -gp.log_prob(observations)
with tf.GradientTape() as tape:
    nll_3 = loss_function_3(index_points, observations, kernel, noise)
grad_3 = tape.gradient(nll_3, trainable_variables)
print(grad_3)
optimizer.apply_gradients(zip(grad_3, trainable_variables))
# I cannot train noise if it is passed as an argument to a tf.function and
# the tf.function uses the local variable (the argument)
@tf.function(autograph=False, experimental_compile=False)
def loss_function_4(index_points, observations, kernel, observation_noise_variance):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=observation_noise_variance)
    return -gp.log_prob(observations)
with tf.GradientTape() as tape:
    nll_4 = loss_function_4(index_points, observations, kernel, noise)
grad_4 = tape.gradient(nll_4, trainable_variables)
print(grad_4)
optimizer.apply_gradients(zip(grad_4, trainable_variables))
This code prints:
[<tf.Tensor: shape=(), dtype=float64, numpy=0.045454545454545456>]
[<tf.Tensor: shape=(), dtype=float64, numpy=0.045413242911911206>]
[<tf.Tensor: shape=(), dtype=float64, numpy=0.04537197429557289>]
[None]
And then it returns the error message:
ValueError: No gradients provided for any variable: ['noise:0'].
Ideally I would get the performance boost of a tf.function so I don't want to use loss_function_1. Also, I would like to be able to pass different noise variables to my loss function so I do not want to use the global variable like I do in loss_function_2 or loss_function_3.
Why do I get None when I try to perform a gradient on a variable passed as an argument to a tf.function? How can I get around this?
You can't work around it; it works like that by design.
When you use tf.function, you're converting the Python code to a static graph (in particular, a DAG). This graph has some input nodes and some output nodes.
The input nodes are the parameters of your function and the output nodes are the return values.
Defining a tf.Variable inside the function body or, equivalently, passing a tf.Variable as a function parameter means creating a new variable node in the static graph every time you invoke the function, and creating a new variable on every call is not what you want.
In practice, when you have objects with state (tf.Variable and similar), you can't define them inside a tf.function-decorated function; you have to break out of the function scope and declare the variable outside it.
Your solution of declaring a global variable is the one to use. A better solution is to refactor your code to be more object-oriented, declaring the variable as a private attribute of a class so that the variable object is not exposed globally.
I covered this behavior in this article, where you can find several insights on how to refactor your code and how to think when using tf.function.

Why must I use DataParallel when testing?

Train on the GPU, num_gpus is set to 1:
device_ids = list(range(num_gpus))
model = NestedUNet(opt.num_channel, 2).to(device)
model = nn.DataParallel(model, device_ids=device_ids)
Test on the CPU:
model = NestedUNet_Purn2(opt.num_channel, 2).to(dev)
device_ids = list(range(num_gpus))
model = torch.nn.DataParallel(model, device_ids=device_ids)
model_old = torch.load(path, map_location=dev)
pretrained_dict = model_old.state_dict()
model_dict = model.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)
This will get the correct result, but when I delete:
device_ids = list(range(num_gpus))
model = torch.nn.DataParallel(model, device_ids=device_ids)
the result is wrong.
nn.DataParallel wraps the model, where the actual model is assigned to the module attribute. That also means that the keys in the state dict have a module. prefix.
Let's look at a very simplified version with just one convolution to see the difference:
class NestedUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
model = NestedUNet()
model.state_dict().keys() # => odict_keys(['conv1.weight', 'conv1.bias'])
# Wrap the model in DataParallel
model_dp = nn.DataParallel(model, device_ids=range(num_gpus))
model_dp.state_dict().keys() # => odict_keys(['module.conv1.weight', 'module.conv1.bias'])
The state dict you saved with nn.DataParallel does not line up with the regular model's state dict. Because you merge the loaded state dict into the current one and keep only the keys that already exist in the model, every loaded entry is dropped (none of its keys match), and you are left with the randomly initialised model.
To avoid making that mistake, you shouldn't merge the state dicts; apply the loaded state dict directly to the model instead, in which case there will be an error if the keys don't match:
RuntimeError: Error(s) in loading state_dict for NestedUNet:
Missing key(s) in state_dict: "conv1.weight", "conv1.bias".
Unexpected key(s) in state_dict: "module.conv1.weight", "module.conv1.bias".
To make the state dict that you have saved compatible, you can strip off the module. prefix:
pretrained_dict = {key.replace("module.", ""): value for key, value in pretrained_dict.items()}
model.load_state_dict(pretrained_dict)
You can also avoid this issue in the future by unwrapping the model from nn.DataParallel before saving its state, i.e. saving model.module.state_dict(). That way you can always load the model with its state first and decide later to wrap it in nn.DataParallel if you want to use multiple GPUs.
You trained your model using DataParallel and saved it. So, the model weights were stored with a module. prefix. Now, when you load without DataParallel, you basically are not loading any model weights (the model has random weights). As a result, the model predictions are wrong.
I am giving an example.
model = nn.Linear(2, 4)
model = torch.nn.DataParallel(model, device_ids=device_ids)
model.state_dict().keys() # => odict_keys(['module.weight', 'module.bias'])
On the other hand,
another_model = nn.Linear(2, 4)
another_model.state_dict().keys() # => odict_keys(['weight', 'bias'])
See the difference in the OrderedDict keys.
So, in your code, the following three lines run without error, but no model weights are loaded.
pretrained_dict = model_old.state_dict()
model_dict = model.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
Here, model_dict has keys without the module. prefix, but pretrained_dict has them when you do not use DataParallel. So, essentially, the filtered pretrained_dict is empty when DataParallel is not used.
Solution: If you want to avoid using DataParallel, you can load the weights file, create a new OrderedDict without the module. prefix, and load it back.
Something like the following would work for your case without using DataParallel.
# original saved file with DataParallel (a whole model, as in the question)
model_old = torch.load(path, map_location=dev)
# create new OrderedDict that does not contain `module.`
from collections import OrderedDict
new_state_dict = OrderedDict()
for k, v in model_old.state_dict().items():
    name = k[7:]  # remove `module.` (the first 7 characters)
    new_state_dict[name] = v
# load params
model.load_state_dict(new_state_dict)
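The prefix handling above can be tried without PyTorch. The sketch below uses a plain OrderedDict with placeholder strings in place of tensors; strip_module_prefix is a hypothetical helper name, not part of torch. It removes the prefix only from keys that actually start with it, so keys that never had the prefix are left alone.

```python
from collections import OrderedDict

PREFIX = "module."


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with a leading 'module.' removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(PREFIX):
            key = key[len(PREFIX):]
        cleaned[key] = value
    return cleaned


# Stand-in for a state dict saved from a DataParallel-wrapped model
# (placeholder values instead of tensors).
saved = OrderedDict([
    ("module.conv1.weight", "w"),
    ("module.conv1.bias", "b"),
])
print(list(strip_module_prefix(saved).keys()))  # ['conv1.weight', 'conv1.bias']
```

With a real checkpoint, the returned dictionary would be what gets passed to model.load_state_dict(...).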

reading json into django error

I'm passing a context variable, x, into a template from a Django view. It is a list of strings:
x = ['Braselton', 'Buford']
Then I am using an ajax function to pass that variable back to a django view. The problem is when I retrieve that variable in a python view with the following code:
new_x = request.GET['x']
print(new_x)
I see the following:
[&#39;Braselton&#39;, &#39;Buford&#39;]
I've tried json.loads(request.GET['x']) and I keep getting the following error
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
Any help is much appreciated
You need to unescape those characters; there are lots of ways to do it.
See the Python documentation for more info.
import html
import json
variable = "[&#39;Braselton&#39;, &#39;Buford&#39;]"
new_variable = html.unescape(variable)  # "['Braselton', 'Buford']"
new_variable = json.loads(new_variable.replace("'", '"'))  # replace single quotes so the string is valid JSON
print(new_variable)  # ['Braselton', 'Buford'] (a list)
The problem is that the value is being HTML-escaped. Note that it's not JSON.
To unescape it, you have to use the html module.
import html
y = html.unescape(new_x)
print(y) # output is ['Braselton', 'Buford']
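The unescaping step can also be combined with parsing in one small function. The sketch below assumes the value arrives HTML-escaped with `&#39;` entities; parse_escaped_list is a hypothetical helper name, and ast.literal_eval is swapped in here for json.loads because the unescaped text is a Python-style list literal with single quotes rather than JSON.

```python
import ast
import html


def parse_escaped_list(raw):
    """Undo HTML escaping, then parse the Python-style list literal."""
    unescaped = html.unescape(raw)
    # literal_eval only evaluates literals (strings, numbers, lists, ...),
    # so it does not execute arbitrary code from the request.
    value = ast.literal_eval(unescaped)
    if not isinstance(value, list):
        raise ValueError("expected a list literal")
    return value


raw = "[&#39;Braselton&#39;, &#39;Buford&#39;]"
print(parse_escaped_list(raw))  # ['Braselton', 'Buford']
```

Unlike replacing single quotes with double quotes, this also copes with strings that contain an apostrophe.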
Mark the variable as safe.
'{{ x | safe }}'

save list to CSV - python

I am trying to save my output as CSV using the csv module, but I only get errors. Any idea why?
Since I cannot get it to run: will it also notify me if the file already exists?
Thanks a lot
from tkinter import *
from tkinter.filedialog import asksaveasfilename
from tkinter import ttk
import csv
def data():
    ...
    output= <class 'list'> #just an example
    ...
def savefile():
    name= asksaveasfilename()
    create = csv.writer(open(name, "wb"))
    create.writerow(output)
    for x in output:
        create.writerow(x)
root = Tk()
Mframe = ttk.Frame(root)
Mframe.grid(column=0, row=0, sticky=(N, W, E, S))
bSave=ttk.Button(Mframe, text='Save File', command=savefile)
bSave.grid(column=1, row=0)
root.mainloop()
You are opening the file but never closing it. A good practice is to use a with statement to make sure it gets closed. Also, in Python 3 the csv module expects a file opened in text mode ('w' with newline=''), not 'wb'. By the way, I don't know what the output list looks like, but if it isn't a list of lists, it makes more sense to me to call writerow once.
Besides, make sure this list is also a global variable; otherwise it won't be available within the scope of savefile. However, global variables are not a very good solution, so consider passing it as an argument to savefile or using a class to hold all this data:
def savefile():
    name = asksaveasfilename()
    with open(name, 'w', newline='') as csvfile:
        create = csv.writer(csvfile)
        create.writerow(output)
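As a sketch of the "pass it as an argument" suggestion, the function below takes the rows as a parameter instead of reading a global. The name save_rows, the overwrite flag, and the sample data are all made up for illustration. It also touches on the "file already exists" part of the question at the file level: opening with mode 'x' raises FileExistsError rather than silently replacing an existing file.

```python
import csv
import os
import tempfile


def save_rows(path, rows, overwrite=False):
    """Write rows (a list of lists) to a CSV file.

    With overwrite=False the file is opened in 'x' mode, so an existing
    file raises FileExistsError instead of being replaced.
    """
    mode = 'w' if overwrite else 'x'
    with open(path, mode, newline='') as csvfile:
        csv.writer(csvfile).writerows(rows)


# Usage with a temporary directory and made-up data.
tmp_dir = tempfile.mkdtemp()
target = os.path.join(tmp_dir, 'output.csv')
save_rows(target, [['a', 1], ['b', 2]])
try:
    save_rows(target, [['c', 3]])  # second write to the same path
    already_existed = False
except FileExistsError:
    already_existed = True
print(already_existed)  # True
```

In the Tkinter program, the button callback could call such a function with the chosen file name and the list to save.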

how to chunk a csv (dict)reader object in python 3.2?

I am trying to use Pool from the multiprocessing module to speed up reading large CSV files. For this, I adapted an example (from py2k), but it seems the csv.DictReader object has no length. Does that mean I can only iterate over it? Is there still a way to chunk it?
These questions seemed relevant, but did not really answer my question:
Number of lines in csv.DictReader,
How to chunk a list in Python 3?
My code tried to do this:
source = open('/scratch/data.txt','r')
def csv2nodes(r):
    strptime = time.strptime
    mktime = time.mktime
    l = []
    ppl = set()
    for row in r:
        cell = int(row['cell'])
        id = int(row['seq_ei'])
        st = mktime(strptime(row['dat_deb_occupation'],'%d/%m/%Y'))
        ed = mktime(strptime(row['dat_fin_occupation'],'%d/%m/%Y'))
        # collect list
        l.append([(id,cell,{1:st,2: ed})])
        # collect separate sets
        ppl.add(id)
    return (l,ppl)
def csv2graph(source):
    r = csv.DictReader(source,delimiter=',')
    MG=nx.MultiGraph()
    l = []
    ppl = set()
    # Remember that I use integers for edge attributes, to save space! Dic above.
    # start: 1
    # end: 2
    p = Pool(processes=4)
    node_divisor = len(p._pool)*4
    node_chunks = list(chunks(r,int(len(r)/int(node_divisor))))
    num_chunks = len(node_chunks)
    pedgelists = p.map(csv2nodes,
                       zip(node_chunks))
    ll = []
    for l in pedgelists:
        ll.append(l[0])
        ppl.update(l[1])
    MG.add_edges_from(ll)
    return (MG,ppl)
According to the csv.DictReader documentation (and that of the csv.reader object it wraps), the reader is an iterator, so it has no length. The code should have thrown a TypeError when you called len().
You can still chunk the data, but you'll have to read it entirely into memory. If you're concerned about memory, you can switch from csv.DictReader to csv.reader and skip the overhead of the dictionaries csv.DictReader creates. To improve readability in csv2nodes(), you can assign constants to address each field's index:
CELL = 0
SEQ_EI = 1
DAT_DEB_OCCUPATION = 4
DAT_FIN_OCCUPATION = 5
I also recommend using a different variable than id, since that's a built-in function name.
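One possible way to chunk an iterator such as a csv.DictReader is sketched below with itertools.islice. The helper name chunked and the sample data are assumptions for illustration, not part of the original code. Each chunk is materialised as a plain list of rows, so it has a length and can be handed to a worker function.

```python
import csv
import io
from itertools import islice


def chunked(iterable, size):
    """Yield lists of up to `size` items from any iterable, including iterators."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Made-up sample data using two of the column names from the question.
sample = io.StringIO("cell,seq_ei\n1,10\n2,20\n3,30\n4,40\n5,50\n")
reader = csv.DictReader(sample)
chunk_sizes = [len(chunk) for chunk in chunked(reader, 2)]
print(chunk_sizes)  # [2, 2, 1]
```

Collecting the chunks with list(chunked(reader, n)) gives a list that can be passed to Pool.map, at the cost of holding all rows in memory, as noted above.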