How to Force Tensorflow to Run under float16?

I am building a sequential model in Keras with a custom activation function, defined as a new class that uses Keras' tf backend and some of tf's tensor operators directly. I put the custom activation function in ../keras/advanced_activation.py.
I intend to run it at float16 precision. Without the custom function, I can easily choose between float32 and float16 with the following:
if self.precision == 'float16':
    K.set_floatx('float16')
    K.set_epsilon(1e-4)
else:
    K.set_floatx('float32')
    K.set_epsilon(1e-7)
Yet when I involve the custom function in my model, tf seems to persist in float32 even when I choose float16. I understand that tf runs under float32 by default, so my questions are:
There are also several built-in activation functions in the same file; how does Keras make them run under float16, so that I might be able to do the same? There is a tf method tf.dtypes.cast(...); can I use it in my custom function to force tf? There is no such cast in those built-in functions.
Alternatively, how can I force tf to run under float16 directly, using Keras with tf as the backend?
Many thanks.

I got the answer by debugging. The lessons are:
First, tf.dtypes.cast(...) works.
Second, I can specify a second argument to my custom activation function to indicate the data type for cast(...); the associated code is below.
Third, we do not need tf.constant to indicate the data type of those constants.
Fourth, I conclude that adding a custom function in custom_activation.py is the easiest way to define our own layer/activation, as long as it is differentiable everywhere, or at least piecewise differentiable with no discontinuities at the junctures.
# Quadruple Piece-Wise Constant Function
class MyFunc(Layer):
    def __init__(self, sharp=100, DataType='float32', **kwargs):
        super(MyFunc, self).__init__(**kwargs)
        self.supports_masking = True
        self.sharp = K.cast_to_floatx(sharp)
        self.DataType = DataType

    def call(self, inputs):
        inputss = tf.dtypes.cast(inputs, dtype=self.DataType)
        orig = inputss
        # some calculations
        return  # my_results

    def get_config(self):
        config = {'sharp': float(self.sharp),
                  'DataType': self.DataType}
        base_config = super(MyFunc, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def compute_output_shape(self, input_shape):
        return input_shape
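For completeness, here is a minimal usage sketch (not part of the original post) showing the layer together with the float16 switch from the question; the layer sizes and the surrounding model are made up for illustration, and MyFunc is assumed to be the class defined above:

from tensorflow import keras
from tensorflow.keras import backend as K

K.set_floatx('float16')
K.set_epsilon(1e-4)

model = keras.Sequential([
    keras.layers.Dense(64, input_shape=(16,)),
    MyFunc(sharp=100, DataType='float16'),   # custom activation casting to float16
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
print(model.layers[0].dtype)  # should report float16 once floatx has been set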
Thanks @y.selivonchyk for your worthy discussion with me, and @Yolo Swaggins for your contribution.

Related

DeepSORT with custom detectors

I have created a class in which I load my custom weights for YOLO v5, v7, and v8; from this class I get the xyxy values and the class name for each detection. Now I want to deploy Deep SORT on top of it, using this specific class and no extra files. In my detection class I simply use torch.hub.load() and access my basic functions in YOLO v5, v7, and v8. I am looking for a specific and straightforward technique to apply Deep SORT using my detection class. Is it possible? If it is, please tell me how; if it is not, what is the simplest method to implement Deep SORT?
import torch
import cv2

class YoloDetector:
    def __init__(self, conf_thold, device, weights_path, expected_objs):
        # taking the weights path, confidence level and GPU or CPU selection
        self._model = torch.hub.load("WongKinYiu/yolov7", "custom", f"{weights_path}", trust_repo=True)
        self._model.conf = conf_thold         # NMS confidence threshold
        self._model.classes = expected_objs   # (optional list) filter by class
        self._model.to(device)                # specifying device type

    def process_image(self, image):
        results = self._model(image)
        predictions = []  # final list to return all detections
        detection = results.pandas().xyxy[0]
        for i in range(len(detection)):
            # getting bbox and class name one by one
            class_name = detection["name"]
            xmin = detection["xmin"][i]
            ymin = detection["ymin"][i]
            xmax = detection["xmax"][i]
            ymax = detection["ymax"][i]
            # appending the values to the list as a dictionary
            predictions.append({'class': class_name[i],
                                'bbox': [int(xmin), int(ymin), int(xmax), int(ymax)]})
        return predictions
That is the code I am using to get detections. Now, how can I implement Deep SORT on top of it?
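One possible direction, sketched below under the assumption that the third-party deep_sort_realtime package is available (its API may differ between versions): its DeepSort tracker takes per-frame detections as ([left, top, width, height], confidence, class) tuples, which the xyxy output of process_image can be converted into. The detector arguments, the video path, and the dummy confidence of 1.0 are placeholders.

import cv2
from deep_sort_realtime.deepsort_tracker import DeepSort

detector = YoloDetector(conf_thold=0.5, device="cuda:0",
                        weights_path="best.pt", expected_objs=None)  # hypothetical arguments
tracker = DeepSort(max_age=30)

cap = cv2.VideoCapture("video.mp4")  # hypothetical input
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # convert xyxy boxes to the ([left, top, width, height], confidence, class)
    # format expected by deep_sort_realtime; process_image does not return a
    # confidence, so a placeholder of 1.0 is used here
    raw = []
    for det in detector.process_image(frame):
        x1, y1, x2, y2 = det["bbox"]
        raw.append(([x1, y1, x2 - x1, y2 - y1], 1.0, det["class"]))
    tracks = tracker.update_tracks(raw, frame=frame)
    for track in tracks:
        if not track.is_confirmed():
            continue
        print(track.track_id, track.to_ltrb())  # stable track ID plus current box
cap.release()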

Why can't I perform gradients on a variable passed as an argument to a tf.function?

My training loop was giving me the following warning:
WARNING:tensorflow:Gradients do not exist for variables ['noise:0'] when minimizing the loss.
After some tinkering I determined this only happened when the noise variable was passed as an argument to my loss function, which is a tf.function. The code below shows that there is no problem when the loss function is not a tf.function, or when the global noise variable is referenced inside the function. It also shows that an error results from trying to take a gradient with respect to the noise variable when it is passed as an argument to a tf.function:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
from tensorflow_probability import bijectors as tfb

constrain_positive = tfb.Shift(np.finfo(np.float64).tiny)(tfb.Exp())
noise = tfp.util.TransformedVariable(initial_value=.1, bijector=constrain_positive, dtype=np.float64, name="noise")
trainable_variables = [noise.variables[0]]
kernel = tfp.math.psd_kernels.ExponentiatedQuadratic()
optimizer = tf.keras.optimizers.Adam()
index_points = tf.constant([[0]], dtype=np.float64)
observations = tf.constant([0], dtype=np.float64)

# I can train noise when it is passed as an argument to a python function
def loss_function_1(index_points, observations, kernel, observation_noise_variance):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=observation_noise_variance)
    return -gp.log_prob(observations)

with tf.GradientTape() as tape:
    nll_1 = loss_function_1(index_points, observations, kernel, noise)
grad_1 = tape.gradient(nll_1, trainable_variables)
print(grad_1)
optimizer.apply_gradients(zip(grad_1, trainable_variables))

# I can train noise if it is used in a tf.function and not passed as an argument
@tf.function(autograph=False, experimental_compile=False)
def loss_function_2(index_points, observations, kernel):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=noise)
    return -gp.log_prob(observations)

with tf.GradientTape() as tape:
    nll_2 = loss_function_2(index_points, observations, kernel)
grad_2 = tape.gradient(nll_2, trainable_variables)
print(grad_2)
optimizer.apply_gradients(zip(grad_2, trainable_variables))

# I can train noise if it is passed as an argument to a tf.function
# that uses the global variable
@tf.function(autograph=False, experimental_compile=False)
def loss_function_3(index_points, observations, kernel, observation_noise_variance):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=noise)
    return -gp.log_prob(observations)

with tf.GradientTape() as tape:
    nll_3 = loss_function_3(index_points, observations, kernel, noise)
grad_3 = tape.gradient(nll_3, trainable_variables)
print(grad_3)
optimizer.apply_gradients(zip(grad_3, trainable_variables))

# I cannot train noise if it is passed as an argument to a tf.function
# that uses the local variable
@tf.function(autograph=False, experimental_compile=False)
def loss_function_4(index_points, observations, kernel, observation_noise_variance):
    gp = tfd.GaussianProcess(kernel, index_points, observation_noise_variance=observation_noise_variance)
    return -gp.log_prob(observations)

with tf.GradientTape() as tape:
    nll_4 = loss_function_4(index_points, observations, kernel, noise)
grad_4 = tape.gradient(nll_4, trainable_variables)
print(grad_4)
optimizer.apply_gradients(zip(grad_4, trainable_variables))
This code prints:
[<tf.Tensor: shape=(), dtype=float64, numpy=0.045454545454545456>]
[<tf.Tensor: shape=(), dtype=float64, numpy=0.045413242911911206>]
[<tf.Tensor: shape=(), dtype=float64, numpy=0.04537197429557289>]
[None]
And then it returns the error message:
ValueError: No gradients provided for any variable: ['noise:0'].
Ideally I would get the performance boost of a tf.function so I don't want to use loss_function_1. Also, I would like to be able to pass different noise variables to my loss function so I do not want to use the global variable like I do in loss_function_2 or loss_function_3.
Why do I get None when I try to perform a gradient on a variable passed as an argument to a tf.function? How can I get around this?
You can't work around it; it works like that by design.
When you use tf.function you're converting the Python code to a static graph (in particular a DAG). This graph has some input nodes and some output nodes.
The input nodes are the parameters of your function and the output nodes are the return values.
Defining a tf.Variable inside the function body, or equivalently passing a tf.Variable as a function parameter, means creating a new variable node in the static graph every time you invoke it, and creating a new variable on every call is not what you want.
In practice, when you have objects with state (tf.Variable and similar) you can't define them inside a tf.function-decorated function; you have to break the function scope and declare the variable outside it.
Your solution of declaring a global variable works. A better solution is to refactor your code to be more object-oriented, declaring the variable as a private attribute of a class so that the variable object is not exposed globally.
I covered this behavior in this article, where you can find several insights on how to refactor your code and how to think when using tf.function.
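A minimal sketch of that object-oriented refactor (not from the original answer), reusing the setup from the question; the class name NoiseModel is hypothetical and the tf.function options are reduced to autograph=False:

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
from tensorflow_probability import bijectors as tfb

class NoiseModel:
    def __init__(self):
        constrain_positive = tfb.Shift(np.finfo(np.float64).tiny)(tfb.Exp())
        # the variable lives on the object, outside the tf.function
        self._noise = tfp.util.TransformedVariable(
            initial_value=.1, bijector=constrain_positive, dtype=np.float64, name="noise")
        self.trainable_variables = [self._noise.variables[0]]

    @tf.function(autograph=False)
    def loss(self, index_points, observations, kernel):
        gp = tfd.GaussianProcess(
            kernel, index_points, observation_noise_variance=self._noise)
        return -gp.log_prob(observations)

model = NoiseModel()
kernel = tfp.math.psd_kernels.ExponentiatedQuadratic()
optimizer = tf.keras.optimizers.Adam()
index_points = tf.constant([[0]], dtype=np.float64)
observations = tf.constant([0], dtype=np.float64)

with tf.GradientTape() as tape:
    nll = model.loss(index_points, observations, kernel)
grads = tape.gradient(nll, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))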

Writing a dropout layer using the nn.Sequential() method + PyTorch

I am trying to create a Dropout Layer for my neural network using nn.Sequential() like this:
class DropoutLayer(nn.Module):
    def __init__(self, p):
        super().__init__()
        self.p = p

    def forward(self, input):
        if self.training:
            u1 = (np.random.rand(*input.shape) < self.p)
            u1 *= input
            return u1
        else:
            input *= self.p

model = nn.Sequential(Flatten(), DropoutLayer(p=0.7), nn.LogSoftmax(dim=-1))
opt = torch.optim.Adam(modelDp.parameters(), lr=0.005)
train(modelDp, opt, 5)
But I get this error:
ValueError: optimizer got an empty parameter list
First, there is what I assume to be a small typo: you declare model = nn.Sequential(...) but then use modelDp.parameters(). I assume you just made a small copy-paste mistake and these are actually the same thing.
This error is raised because no layer in your model has trainable parameters, i.e. parameters that will be affected by the gradient backpropagation step. As it stands, this "network" cannot learn anything at all.
To get rid of the error and get an actual working neural network, you need to include the learning layers, which according to the previous error you reported are linear layers. That would be something like:
model = nn.Sequential(nn.Linear(784, 10), Flatten(), DropoutLayer(0.7), nn.LogSoftmax(dim=-1))
Now a couple of additional remarks:
You may want to use PyTorch's random tensors instead of NumPy's. It will be easier to deal with the device when you eventually want to move your network to the GPU.
This code is going to raise another error as soon as you try it in eval mode, because the second conditional branch in the forward method does not return anything. You may want to replace that instruction with return input * self.p (see the sketch below).
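A corrected sketch of the layer with both remarks applied (not the original answer's code), using torch.rand_like so the mask follows the input's device and dtype:

import torch
import torch.nn as nn

class DropoutLayer(nn.Module):
    def __init__(self, p):
        super().__init__()
        self.p = p  # keep probability, as in the original code

    def forward(self, input):
        if self.training:
            # random mask drawn with the same shape, device and dtype as the input
            mask = (torch.rand_like(input) < self.p).to(input.dtype)
            return mask * input
        else:
            # scale at evaluation time instead of masking
            return input * self.p

layer = DropoutLayer(0.7)
x = torch.randn(4, 10)
layer.train(); print(layer(x))  # random mask applied
layer.eval();  print(layer(x))  # deterministic scaling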

FastAI Segmentation Problem: Updating Model Weights with custom Item- and LabelList

This might be a stupid question, as nobody on the fastai forums is trying to answer it. If it is, go ahead and tell me, but please also tell me the answer, because right now I am completely lost.
I am currently working on a U-Net model for the segmentation of cells in microscopy images. Due to class imbalances and to amplify the importance of cell boundaries, I calculated a pixelwise weightmap for each image that I pass into fastai. Therefore I created a new ItemBase class to save labels and weights together:
class WeightedLabels(ItemBase):
    """
    Custom ItemBase to store and process labels and pixelwise weights together.
    Also handling the target_size of the labels.
    """

    def __init__(self, lbl: Image, wgt: Image, target_size: Tuple = None):
        self.lbl, self.wgt = lbl, wgt
        self.obj, self.data = (lbl, wgt), [lbl.data, wgt.data]
        self.target_size = target_size
    ...
I use extensive augmentation, like elastic deformation, mirroring and rotations, on both weights and labels, as well as on the original image. I compute the loss with a custom cross-entropy loss function that uses the weights to get the weighted loss for each pixel and averages them.
My problem is that I do not get very good performance. I have the feeling that this might be because fastai is trying to predict the weights as well. My questions are:
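For reference, a minimal sketch of a pixelwise weighted cross-entropy in plain PyTorch (not the poster's actual implementation; the function name is hypothetical), which is the kind of loss described above:

import torch
import torch.nn.functional as F

def weighted_cross_entropy(logits, target, weights):
    # logits: (N, C, H, W), target: (N, H, W) class indices, weights: (N, H, W)
    per_pixel = F.cross_entropy(logits, target, reduction='none')  # (N, H, W)
    return (per_pixel * weights).mean()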
Am I right to assume my model tries to predict both?
If so, how do I tell the learner what to use for updating the layers and to only predict part of my labels, while still applying augmentation to both?
Here's the code for how I implemented my custom LabelList and my custom ItemList:
class CustomSegmentationLabelList(ImageList):
    "'Item List' suitable for WeightedLabels containing labels and pixelweights"
    _processor = vision.data.SegmentationProcessor

    def __init__(self,
                 items: Iterator,
                 wghts=None,
                 classes: Collection = None,
                 target_size: Tuple = None,
                 loss_func=CrossEntropyFlat(axis=1),
                 **kwargs):
        super().__init__(items, **kwargs)
        self.copy_new.append('classes')
        self.copy_new.append('wghts')
        self.classes, self.loss_func, self.wghts = classes, loss_func, wghts
        self.target_size = target_size

    def open(self, fn):
        res = io.imread(fn)
        res = pil2tensor(res, np.float32)
        return Image(res)

    def get(self, i):
        fn = super().get(i)
        wt = self.wghts[i]
        return WeightedLabels(fn, self.open(wt), self.target_size)

    def reconstruct(self, t: Tensor):
        return WeightedLabels(Image(t[0]), Image(t[1]), self.target_size)


class CustomSegmentationItemList(ImageList):
    "'ItemList' suitable for segmentation with pixelwise weighted loss"
    _label_cls, _square_show_res = CustomSegmentationLabelList, False

    def label_from_funcs(self, get_labels: Callable, get_weights: Callable,
                         label_cls: Callable = None, classes=None,
                         target_size: Tuple = None, **kwargs) -> 'LabelList':
        "Get weights and labels from two functions. Saves them in a CustomSegmentationLabelList"
        kwargs = {}
        wghts = [get_weights(o) for o in self.items]
        labels = [get_labels(o) for o in self.items]
        if target_size:
            print(
                f'Masks will be cropped to {target_size}. Choose \'target_size = None\' to keep initial size.')
        else:
            print(f'Masks will not be cropped.')
        y = CustomSegmentationLabelList(
            labels, wghts, classes, target_size, path=self.path)
        res = self._label_list(x=self, y=y)
        return res
Here is also the part where I initialize my DataBunch object:
data = (CustomSegmentationItemList.from_df(img_df, IMG_PATH, convert_mode=IMAGE_TYPE)
        .split_by_rand_pct(valid_pct=(1/N_SPLITS), seed=SEED)
        .label_from_funcs(get_labels, get_weights, target_size=MASK_SHAPE, classes=array(['background', 'cell']))
        .transform(tfms=tfms, tfm_y=True)
        .databunch(bs=BATCH_SIZE))
I use the regular learner, where I pass in my U-Net model, the data, my loss function and some additional arguments that shouldn't really matter here. When I apply my model after training it, it gives me two identical output tensors. I assume that one is probably for the labels and the other for the weights. Both have the dimensions (W x H x C). I do not understand why this is happening, because my model is supposed to have only one output of shape (W x H x C). If this happens during prediction, it probably also happens during training. How can I overcome this?

Why must I use DataParallel when testing?

Train on the GPU, num_gpus is set to 1:
device_ids = list(range(num_gpus))
model = NestedUNet(opt.num_channel, 2).to(device)
model = nn.DataParallel(model, device_ids=device_ids)
Test on the CPU:
model = NestedUNet_Purn2(opt.num_channel, 2).to(dev)
device_ids = list(range(num_gpus))
model = torch.nn.DataParallel(model, device_ids=device_ids)
model_old = torch.load(path, map_location=dev)
pretrained_dict = model_old.state_dict()
model_dict = model.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)
This will get the correct result, but when I delete:
device_ids = list(range(num_gpus))
model = torch.nn.DataParallel(model, device_ids=device_ids)
the result is wrong.
nn.DataParallel wraps the model, where the actual model is assigned to the module attribute. That also means that the keys in the state dict have a module. prefix.
Let's look at a very simplified version with just one convolution to see the difference:
class NestedUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)

model = NestedUNet()
model.state_dict().keys()  # => odict_keys(['conv1.weight', 'conv1.bias'])

# Wrap the model in DataParallel
model_dp = nn.DataParallel(model, device_ids=range(num_gpus))
model_dp.state_dict().keys()  # => odict_keys(['module.conv1.weight', 'module.conv1.bias'])
The state dict you saved with nn.DataParallel does not line up with the regular model's state. You are merging the current state dict with the loaded state dict, which means that the loaded state is ignored, because the model does not have any attributes that match those keys, and you are left with the randomly initialised model.
To avoid making that mistake, you shouldn't merge the state dicts, but rather directly apply it to the model, in which case there will be an error if the keys don't match.
RuntimeError: Error(s) in loading state_dict for NestedUNet:
Missing key(s) in state_dict: "conv1.weight", "conv1.bias".
Unexpected key(s) in state_dict: "module.conv1.weight", "module.conv1.bias".
To make the state dict that you have saved compatible, you can strip off the module. prefix:
pretrained_dict = {key.replace("module.", ""): value for key, value in pretrained_dict.items()}
model.load_state_dict(pretrained_dict)
You can also avoid this issue in the future by unwrapping the model from nn.DataParallel before saving its state, i.e. saving model.module.state_dict(). So you can always load the model first with its state and then later decide to put it into nn.DataParallel if you wanted to use multiple GPUs.
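A minimal sketch of that approach (not from the original answer), using the simplified NestedUNet from above; the file path is hypothetical:

import torch
import torch.nn as nn

model = NestedUNet()
model_dp = nn.DataParallel(model)

# save the underlying module's state, so the keys carry no "module." prefix
torch.save(model_dp.module.state_dict(), "nested_unet.pth")

# later: load into a plain model (CPU here), no DataParallel needed
model_cpu = NestedUNet()
model_cpu.load_state_dict(torch.load("nested_unet.pth", map_location="cpu"))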
You trained your model using DataParallel and saved it. So, the model weights were stored with a module. prefix. Now, when you load without DataParallel, you basically are not loading any model weights (the model has random weights). As a result, the model predictions are wrong.
I am giving an example.
model = nn.Linear(2, 4)
model = torch.nn.DataParallel(model, device_ids=device_ids)
model.state_dict().keys() # => odict_keys(['module.weight', 'module.bias'])
On the other hand,
another_model = nn.Linear(2, 4)
another_model.state_dict().keys() # => odict_keys(['weight', 'bias'])
See the difference in the OrderedDict keys.
So, in your code, the following three lines work, but no model weights are loaded:
pretrained_dict = model_old.state_dict()
model_dict = model.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
Here, model_dict has keys without the module. prefix, but pretrained_dict does, because the weights were saved with DataParallel. So, when DataParallel is not used at load time, the filtered pretrained_dict is essentially empty.
Solution: if you want to avoid using DataParallel, you can load the weights file, create a new OrderedDict without the module. prefix, and load it back.
Something like the following would work for your case without using DataParallel.
# original saved file with DataParallel
model_old = torch.load(path, map_location=dev)

# create new OrderedDict that does not contain `module.`
from collections import OrderedDict
new_state_dict = OrderedDict()
for k, v in model_old.items():
    name = k[7:]  # remove `module.`
    new_state_dict[name] = v

# load params
model.load_state_dict(new_state_dict)