Why must use DataParallel when testing? - deep-learning

Train on the GPU, num_gpus is set to 1:
device_ids = list(range(num_gpus))
model = NestedUNet(opt.num_channel, 2).to(device)
model = nn.DataParallel(model, device_ids=device_ids)
Test on the CPU:
model = NestedUNet_Purn2(opt.num_channel, 2).to(dev)
device_ids = list(range(num_gpus))
model = torch.nn.DataParallel(model, device_ids=device_ids)
model_old = torch.load(path, map_location=dev)
pretrained_dict = model_old.state_dict()
model_dict = model.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)
This will get the correct result, but when I delete:
device_ids = list(range(num_gpus))
model = torch.nn.DataParallel(model, device_ids=device_ids)
the result is wrong.

nn.DataParallel wraps the model, where the actual model is assigned to the module attribute. That also means that the keys in the state dict have a module. prefix.
Let's look at a very simplified version with just one convolution to see the difference:
class NestedUNet(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
model = NestedUNet()
model.state_dict().keys() # => odict_keys(['conv1.weight', 'conv1.bias'])
# Wrap the model in DataParallel
model_dp = nn.DataParallel(model, device_ids=range(num_gpus))
model_dp.state_dict().keys() # => odict_keys(['module.conv1.weight', 'module.conv1.bias'])
The state dict you saved with nn.DataParallel does not line up with the regular model's state. You are merging the current state dict with the loaded state dict, that means that the loaded state is ignored, because the model does not have any attributes that belong to the keys and instead you are left with the randomly initialised model.
To avoid making that mistake, you shouldn't merge the state dicts, but rather directly apply it to the model, in which case there will be an error if the keys don't match.
RuntimeError: Error(s) in loading state_dict for NestedUNet:
Missing key(s) in state_dict: "conv1.weight", "conv1.bias".
Unexpected key(s) in state_dict: "module.conv1.weight", "module.conv1.bias".
To make the state dict that you have saved compatible, you can strip off the module. prefix:
pretrained_dict = {key.replace("module.", ""): value for key, value in pretrained_dict.items()}
model.load_state_dict(pretrained_dict)
You can also avoid this issue in the future by unwrapping the model from nn.DataParallel before saving its state, i.e. saving model.module.state_dict(). So you can always load the model first with its state and then later decide to put it into nn.DataParallel if you wanted to use multiple GPUs.

You trained your model using DataParallel and saved it. So, the model weights were stored with a module. prefix. Now, when you load without DataParallel, you basically are not loading any model weights (the model has random weights). As a result, the model predictions are wrong.
I am giving an example.
model = nn.Linear(2, 4)
model = torch.nn.DataParallel(model, device_ids=device_ids)
model.state_dict().keys() # => odict_keys(['module.weight', 'module.bias'])
On the other hand,
another_model = nn.Linear(2, 4)
another_model.state_dict().keys() # => odict_keys(['weight', 'bias'])
See the difference in the OrderedDict keys.
So, in your code, the following three-line works but no model weights are loaded.
pretrained_dict = model_old.state_dict()
model_dict = model.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
Here, model_dict has keys without the module. prefix but pretrained_dict has when you do not use DataParalle. So, essentially pretrained_dict is empty when DataParallel is not used.
Solution: If you want to avoid using DataParallel, or you can load the weights file, create a new OrderedDict without the module prefix, and load it back.
Something like the following would work for your case without using DataParallel.
# original saved file with DataParallel
model_old = torch.load(path, map_location=dev)
# create new OrderedDict that does not contain `module.`
from collections import OrderedDict
new_state_dict = OrderedDict()
for k, v in model_old.items():
name = k[7:] # remove `module.`
new_state_dict[name] = v
# load params
model.load_state_dict(new_state_dict)

Related

DeepSORT with custom detectors

I have created a class in which am loading my custom weights for yolo v5, v7, and v8 to get detection out of this class I am getting xyxy values and class name. now I want to deploy deep SORT on it by using this specific class with no extra files. in my detection class I am simply using torch.hub.load() and access to my basic functions in yolo v5, v7, and v8. now I am looking for specific and straightforward techniques to apply deep sort by using my detection class. is it possible? if it is possible please tell me how can I do it. if it is not possible, what is the simplest method to implement deep sort?
import torch
import cv2
class YoloDetector:
def __init__(self, conf_thold, device, weights_path, expected_objs):
# taking the image file path, confidence level and GPU or CPU selection
self._model = torch.hub.load("WongKinYiu/yolov7", "custom", f"{weights_path}", trust_repo=True)
self._model.conf = conf_thold # NMS confidence threshold
self._model.classes = expected_objs # (optional list) filter by class
self._model.to(device) # specifying device type
def process_image(self, image):
results = self._model(image)
predictions = [] # final list to return all detections
detection = results.pandas().xyxy[0]
for i in range(len(detection)):
# getting bbox and class name one by one
class_name = detection["name"]
xmin = detection["xmin"][i]
ymin = detection["ymin"][i]
xmax = detection["xmax"][i]
ymax = detection["ymax"][i]
# parallely appending the values in list using dictionary
predictions.append({'class': class_name[i], 'bbox': [int(xmin), int(ymin), int(xmax), int(ymax)]})
return predictions
that is the code I am using for getting detections now please tell me how can I implement deep sort by using this.

Count the number of people having a property bounded by two numbers

The following code goes over the 10 pages of JSON returned by GET request to the URL.
and checks how many records satisfy the condition that bloodPressureDiastole is between the specified limits. It does the job, but I was wondering if there was a better or cleaner way to achieve this in python
import urllib.request
import urllib.parse
import json
baseUrl = 'https://jsonmock.hackerrank.com/api/medical_records?page='
count = 0
for i in range(1, 11):
url = baseUrl+str(i)
f = urllib.request.urlopen(url)
response = f.read().decode('utf-8')
response = json.loads(response)
lowerlimit = 110
upperlimit = 120
for elem in response['data']:
bd = elem['vitals']['bloodPressureDiastole']
if bd >= lowerlimit and bd <= upperlimit:
count = count+1
print(count)
There is no access through fields to json content because you get dict object from json.loads (see translation scheme here). It realises access via __getitem__ method (dict[key]) instead of __getattr__ (object.field) as keys may be any hashible objects not only strings. Moreover, even strings cannot serve as fields if they starts with digits or are the same with built-in dictionary methods.
Despite this, you can define your own custom class realising desired behavior with acceptable key names. json.loads has an argument object_hook wherein you may put any callable object (function or class) that take a dict as the sole argument (not only the resulted one but every one in json recursively) & return something as the result. If your jsons match some template, you can define a class with predefined fields for the json content & even with methods in order to get a robust Python-object, a part of your domain logic.
For instance, let's realise the access through fields. I get json content from response.json method from requests but it has the same arguments as in json package. Comments in code contain remarks about how to make your code more pythonic.
from collections import Counter
from requests import get
class CustomJSON(dict):
def __getattr__(self, key):
return self[key]
def __setattr__(self, key, value):
self[key] = value
LOWER_LIMIT = 110 # Constants should be in uppercase.
UPPER_LIMIT = 120
base_url = 'https://jsonmock.hackerrank.com/api/medical_records'
# It is better use special tools for handling URLs
# in order to evade possible exceptions in the future.
# By the way, your option could look clearer with f-strings
# that can put values from variables (not only) in-place:
# url = f'https://jsonmock.hackerrank.com/api/medical_records?page={i}'
counter = Counter(normal_pressure=0)
# It might be left as it was. This option is useful
# in case of additional counting any other information.
for page_number in range(1, 11):
records = get(
base_url, data={"page": page_number}
).json(object_hook=CustomJSON)
# Python has a pile of libraries for handling url requests & responses.
# urllib is a standard library rewritten from scratch for Python 3.
# However, there is a more featured (connection pooling, redirections, proxi,
# SSL verification &c.) & convenient third-party
# (this is the only disadvantage) library: urllib3.
# Based on it, requests provides an easier, more convenient & friendlier way
# to work with url-requests. So I highly recommend using it
# unless you are aiming for complex connections & url-processing.
for elem in records.data:
if LOWER_LIMIT <= elem.vitals.bloodPressureDiastole <= UPPER_LIMIT:
counter["normal_pressure"] += 1
print(counter)

FastAI Segmentation Problem: Updating Model Weights with custom Item- and LabelList

This might be a stupid question as nobody on fastai is trying to answer it. If it is one you can go ahead and tell me but also please tell me the answer, because right now I am completely lost.
I am currently working on a U-Net model for the segmentation of cells in microscopy images. Due to class imbalances and to amplify the importance of cell boundaries, I calculated a pixelwise weightmap for each image that I pass into fastai. Therefore I created a new ItemBase class to save labels and weights together:
class WeightedLabels(ItemBase):
"""
Custom ItemBase to store and process labels and pixelwise weights together.
Also handling the target_size of the labels.
"""
def __init__(self, lbl: Image, wgt: Image, target_size: Tuple = None):
self.lbl, self.wgt = lbl, wgt
self.obj, self.data = (lbl, wgt), [lbl.data, wgt.data]
self.target_size = target_size
...
I use extensive augmentation, like elastic deformation, mirroring and rotations on both weights and labels, as well as the original image. I determine the Loss with a custom Cross-entropy loss function that uses the weights to get the weighted loss for each pixel and averages them.
My problem is, that I do not get a very good performace. I have the feeling that might be because of fastai trying to predict the weights as well. My questions are:
Am I right to assume my model tries to predict both?
If so, how do I tell the learner what to use for updating the layers and to only predict part of my labels, while still applying augmentation to both?
Here's the code for how I implemented my custom LabelList and my custom ItemList:
class CustomSegmentationLabelList(ImageList):
"'Item List' suitable for WeightedLabels containing labels and pixelweights"
_processor = vision.data.SegmentationProcessor
def __init__(self,
items: Iterator,
wghts = None,
classes: Collection = None,
target_size: Tuple = None,
loss_func=CrossEntropyFlat(axis=1),
**kwargs):
super().__init__(items, **kwargs)
self.copy_new.append('classes')
self.copy_new.append('wghts')
self.classes, self.loss_func, self.wghts = classes, loss_func, wghts
self.target_size = target_size
def open(self, fn):
res = io.imread(fn)
res = pil2tensor(res, np.float32)
return Image(res)
def get(self, i):
fn = super().get(i)
wt = self.wghts[i]
return WeightedLabels(fn, self.open(wt), self.target_size)
def reconstruct(self, t: Tensor):
return WeightedLabels(Image(t[0]), Image(t[1]), self.target_size)
class CustomSegmentationItemList(ImageList):
"'ItemList' suitable for segmentation with pixelwise weighted loss"
_label_cls, _square_show_res = CustomSegmentationLabelList, False
def label_from_funcs(self, get_labels: Callable, get_weights: Callable,
label_cls: Callable = None, classes=None,
target_size: Tuple = None, **kwargs) -> 'LabelList':
"Get weights and labels from two functions. Saves them in a CustomSegmentationLabelList"
kwargs = {}
wghts = [get_weights(o) for o in self.items]
labels = [get_labels(o) for o in self.items]
if target_size:
print(
f'Masks will be cropped to {target_size}. Choose \'target_size \\= None \' to keep initial size.')
else:
print(f'Masks will not be cropped.')
y = CustomSegmentationLabelList(
labels, wghts, classes, target_size, path=self.path)
res = self._label_list(x=self, y=y)
return res
Also here the part, where I initiate my databunch object:
data = (CustomSegmentationItemList.from_df(img_df,IMG_PATH, convert_mode=IMAGE_TYPE)
.split_by_rand_pct(valid_pct=(1/N_SPLITS),seed=SEED)
.label_from_funcs(get_labels, get_weights, target_size=MASK_SHAPE, classes = array(['background','cell']))
.transform(tfms=tfms, tfm_y=True)
.databunch(bs=BATCH_SIZE))
I use the regular learner, where I pass in my U-Net model, the data, my loss function and some additional arguments that shouldn't really matter here. When trying to apply my model, after training it, it gives me two identical output tensors. I assume that one is probably for the labels and the other for the weights. Both have the following dimensions: (WxHxC). I do not understand why this is happening, because my model is supposed to only have one output in form of (WxHxC). If this happens during prediction, this probably also happens during training. How can I overcome this?

Variable length output in keras

I'm trying to create an autoencoder in keras with bucketing where the input and the output have different time steps.
model = Sequential()
#encoder
model.add(Embedding(vocab_size, embedding_size, mask_zero=True))
model.add(LSTM(units=hidden_size, return_sequences=False))
#decoder
model.add(RepeatVector(max_out_length))
model.add(LSTM(units=hidden_size, return_sequences=True))
model.add(TimeDistributed(Dense(num_class, activation='softmax')))
For the input there is no problem as the network can accept different length inputs as long as the whole batch has the same length. However the problem is with the output size as its determined by the RepeatVector length and there is not easy way to change it.
Is there a solution for such a problem?
If you mean "inputs with variable lengths" and "outputs with the same lengths as the inputs", you can do this:
Warning: this solution must work with batch size = 1
You will need to create an external loop and pass each sample as a numpy array with the exact length
You cannot use masking in this solution, and the right output depends on the correct length of the input
This is a working code using Keras + Tensorflow:
Imports:
from keras.layers import *
from keras.models import Model
import numpy as np
import keras.backend as K
from keras.utils.np_utils import to_categorical
Custom functions to use in Lambda layers:
#this function gets the length from the original input
#and stores it in the final output of the encoder
def storeLength(x):
inputTensor = x[0]
storeInto = x[1] #the final output
length = K.shape(inputTensor)[1]
length = K.cast(length,K.floatx())
length = K.reshape(length,(1,1))
#will put length as the first element in the final output
return K.concatenate([length,storeInto])
#this function expands the length of the input in the decoder
def expandLength(x):
#lenght is the first element in the encoded input
length = K.cast(x[0,0],'int32') #or int64 if necessary
#the remaining elements are the actual data to be decoded
data = x[:,1:]
#a tensor with shape (length,)
length = K.ones_like(K.arange(0,length))
#make both length tensor and data tensor 3D and with paired dimensions
length = K.cast(K.reshape(length,(1,-1,1)),K.floatx())
data = K.reshape(data,(1,1,-1))
#this automatically repeats the elements based on the paired shapes
return data*length
Creating the models:
I assumed the output is equal to the input, but since you're using an Embedding, I made "num_classes" equal to the number of words.
For this solution, we use a branching, thus I had to use the functional API Model. Which will be way better later, because you will want to train with autoencoder.train_on_batch and then just encode with encoder.predict() or just decode with decoder.predict().
vocab_size = 100
embedding_size = 7
num_class=vocab_size
hidden_size = 3
#encoder
inputs = Input(batch_shape = (1,None))
outputs = Embedding(vocab_size, embedding_size)(inputs)
outputs = LSTM(units=hidden_size, return_sequences=False)(outputs)
outputs = Lambda(storeLength)([inputs,outputs])
encoder = Model(inputs,outputs)
#decoder
inputs = Input(batch_shape=(1,hidden_size+1))
outputs = Lambda(expandLength)(inputs)
outputs = LSTM(units=hidden_size, return_sequences=True)(outputs)
outputs = TimeDistributed(Dense(num_class, activation='softmax'))(outputs)
decoder = Model(inputs,outputs)
#autoencoder
inputs = Input(batch_shape=(1,None))
outputs = encoder(inputs)
outputs = decoder(outputs)
autoencoder = Model(inputs,outputs)
#see each model's shapes
encoder.summary()
decoder.summary()
autoencoder.summary()
Just an example with fake data and the method that should be used for training:
inputData = []
outputData = []
for i in range(7,10):
inp = np.arange(i).reshape((1,i))
inputData.append(inp)
outputData.append(to_categorical(inp,num_class))
autoencoder.compile(loss='mse',optimizer='adam')
for epoch in range(1):
for inputSample,outputSample in zip(inputData,outputData):
print(inputSample.shape,outputSample.shape)
autoencoder.train_on_batch(inputSample,outputSample)
for inputSample in inputData:
print(autoencoder.predict(inputSample).shape)

how to chunk a csv (dict)reader object in python 3.2?

I try to use Pool from the multiprocessing module to speed up reading in large csv files. For this, I adapted an example (from py2k), but it seems like the csv.dictreader object has no length. Does it mean I can only iterate over it? Is there a way to chunk it still?
These questions seemed relevant, but did not really answer my question:
Number of lines in csv.DictReader,
How to chunk a list in Python 3?
My code tried to do this:
source = open('/scratch/data.txt','r')
def csv2nodes(r):
strptime = time.strptime
mktime = time.mktime
l = []
ppl = set()
for row in r:
cell = int(row['cell'])
id = int(row['seq_ei'])
st = mktime(strptime(row['dat_deb_occupation'],'%d/%m/%Y'))
ed = mktime(strptime(row['dat_fin_occupation'],'%d/%m/%Y'))
# collect list
l.append([(id,cell,{1:st,2: ed})])
# collect separate sets
ppl.add(id)
return (l,ppl)
def csv2graph(source):
r = csv.DictReader(source,delimiter=',')
MG=nx.MultiGraph()
l = []
ppl = set()
# Remember that I use integers for edge attributes, to save space! Dic above.
# start: 1
# end: 2
p = Pool(processes=4)
node_divisor = len(p._pool)*4
node_chunks = list(chunks(r,int(len(r)/int(node_divisor))))
num_chunks = len(node_chunks)
pedgelists = p.map(csv2nodes,
zip(node_chunks))
ll = []
for l in pedgelists:
ll.append(l[0])
ppl.update(l[1])
MG.add_edges_from(ll)
return (MG,ppl)
From the csv.DictReader documentation (and the csv.reader class it subclasses), the class returns an iterator. The code should have thrown a TypeError when you called len().
You can still chunk the data, but you'll have to read it entirely into memory. If you're concerned about memory you can switch from csv.DictReader to csv.reader and skip the overhead of the dictionaries csv.DictReader creates. To improve readability in csv2nodes(), you can assign constants to address each field's index:
CELL = 0
SEQ_EI = 1
DAT_DEB_OCCUPATION = 4
DAT_FIN_OCCUPATION = 5
I also recommend using a different variable than id, since that's a built-in function name.