Why does the permute layer fail with in-place operation? - caffe

I tried to use the Permute layer with Intel Caffe. With an in-place operation, it failed with a wrong top blob shape:
layer {
name: "conv4_3_norm_mbox_conf_perm"
type: "Permute"
bottom: "per_blob"
top: "per_blob"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
The same layer succeeds when the bottom and top names differ.
Why does the in-place operation fail?

Values get clobbered during an in-place permutation. To swap two values you need a temporary buffer (unless you use XOR tricks or the like), and a general permutation is a whole sequence of such swaps: once the layer starts writing permuted values into the same blob it reads from, later reads see already-overwritten data.
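A minimal numpy sketch (not Caffe code) of why this goes wrong: permuting a (N, C, H, W) blob to (N, H, W, C) with order 0, 2, 3, 1 works with a separate top buffer but corrupts data when the top aliases the bottom.

```python
import numpy as np

src = np.arange(18).reshape(1, 2, 3, 3)           # bottom blob (N, C, H, W)
correct = np.transpose(src, (0, 2, 3, 1)).copy()  # what Permute should produce

in_shape, out_shape = (1, 2, 3, 3), (1, 3, 3, 2)

def src_index(out_idx):
    """Flat input index that feeds flat output index out_idx."""
    n, h, w, c = np.unravel_index(out_idx, out_shape)
    return np.ravel_multi_index((n, c, h, w), in_shape)

# With a separate top buffer the permutation is exact.
flat = src.flatten()
top = np.empty_like(flat)
for idx in range(flat.size):
    top[idx] = flat[src_index(idx)]
assert np.array_equal(top.reshape(out_shape), correct)

# In place (top is the same buffer as the bottom), earlier writes
# clobber values that later reads still need.
buf = src.flatten()
for idx in range(buf.size):
    buf[idx] = buf[src_index(idx)]
print(np.array_equal(buf.reshape(out_shape), correct))  # False
```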


Image augmentation options in config for Object detection and high loss value using SSD MobileNetv2

I noticed that when I include the image augmentation options below to train my object detection model, the loss value is incredibly high, around 30K and 65K, unlike when I don't use these options.
Why is that so? Note that I have only observed this for the first few hundred steps and haven't babysat my model for long.
65K loss value with these
data_augmentation_options {
random_image_scale {
min_scale_ratio:0.5
max_scale_ratio: 2
}
}
data_augmentation_options {
scale_boxes_to_pixel_coordinates {
}
}
~30K plus loss value with these
data_augmentation_options {
random_image_scale {
min_scale_ratio:0.5
max_scale_ratio: 2
}
}
data_augmentation_options {
random_pixel_value_scale {
}
}
data_augmentation_options {
random_crop_image {
}
}
data_augmentation_options {
scale_boxes_to_pixel_coordinates {
}
}
I observed the same just yesterday and came to the following conclusion: if you watch training for only a few steps, the model cannot generalize over the whole test set as fast as with a non-augmented data set. Think of an augmented data set as a bigger one, with much more variance in it. That variance is not exposed from the first epoch on; it takes some time until it surfaces during training.
So just train longer and observe the outcome; the results should improve.

Dynamic infoGainMatrix (H) in InfoGainLossLayer, Caffe

I am trying to scale the contribution of the positive samples and negative samples in the classification loss (l_cls) differentially in the multitask loss function (L) of the RPN in Faster-RCNN.
As far as I know, the straightforward way to do this in Caffe is to use the InfoGainLossLayer and pass an infoGainMatrix (H) that contains the different scales. Unfortunately, to my knowledge, H cannot be computed on the fly and passed to the InfoGainLossLayer (ref). I would like to have my H computed dynamically.
It would be great if anyone could explain how this can be circumvented.
You can write a "Python" layer that computes H "on the fly" and feed it as a third "bottom" of "InfogainLoss" layer:
layer {
name: "computeH"
bottom: "label"
top: "H"
type: "Python"
python_param { ... }
}
layer {
name: "loss"
type: "InfogainLoss"
bottom: "pred"
bottom: "label"
bottom: "H" # new H for each batch
top: "loss"
}
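A minimal numpy sketch of what such a "computeH" Python layer's forward pass could compute. Here H is a diagonal matrix that scales each class's loss contribution; the function name and the concrete weights are illustrative, not part of the Caffe API.

```python
import numpy as np

def compute_H(num_classes, class_weights):
    # Diagonal infogain matrix: H[c, c] scales the loss of class c.
    # With H = identity, InfogainLoss reduces to plain cross-entropy.
    H = np.zeros((num_classes, num_classes), dtype=np.float32)
    H[np.arange(num_classes), np.arange(num_classes)] = class_weights
    return H

# e.g. down-weight the negative class (0) relative to the positive class (1)
H = compute_H(2, [0.25, 1.0])
print(H)
```

Inside the Python layer's forward(), you would fill the top blob with such an H, shaped as "InfogainLoss" expects (1 x 1 x L x L for L classes); since the weights can depend on the "label" bottom, you get a new H for every batch.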

Explain silence layer in caffe

template <typename Dtype>
void SilenceLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
for (int i = 0; i < bottom.size(); ++i) {
if (propagate_down[i]) {
caffe_set(bottom[i]->count(), Dtype(0),
bottom[i]->mutable_cpu_diff());
}
}
}
It just sets the diff to zero.
What is the use of this layer?
The purpose of this layer is simply to prevent the output of unused blobs from being reported in the log. Since it is an output-sink layer, its gradient is naturally zero.
For instance, assume we are using AlexNet and we change the bottom of the 'fc7' layer to 'pool5' instead of 'fc6'. If we do not delete the 'fc6' layer declaration, that layer is no longer used, but its output will still be printed to stderr: it is considered an output of the whole architecture. If we want to keep 'fc6' for some reason, but without showing its values, we can use the SilenceLayer.
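For the example above, a minimal sketch of silencing the kept-but-unused 'fc6' blob (the layer name is illustrative); a Silence layer takes bottoms and produces no tops:

```
layer {
  name: "silence_fc6"
  type: "Silence"
  bottom: "fc6"
}
```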
http://caffe.berkeleyvision.org/tutorial/layers/silence.html
See also caffe.help.

How to split a Blob along channels in Caffe

I would like to split the Blob channels in Caffe, so that I can split one Blob of (N, c, w, h) into two output Blobs of size (N, c/2, w, h).
What I have described above is the general case. Concretely, I want to separate a two-channel input image into two different images: one goes to a convolutional layer and the other goes to a pooling layer, and finally I concatenate the outputs.
So I am wondering whether a Caffe layer exists that allows the user to do this, and how to define it in the prototxt file.
Yes, the Slice layer is for that purpose. From the Layer Catalogue:
The Slice layer is a utility layer that slices an input layer to multiple output layers along a given dimension (currently num or channel only) with given slice indices.
To slice a Blob of size N x 2 x H x W into two Blobs of size N x 1 x H x W, you have to slice axis: 1 (along channels) at slice_point: 1 (after the first channel):
layer {
name: "slice-conv-pool"
type: "Slice"
bottom: "data"
top: "conv1"
top: "pool1"
slice_param {
axis: 1
slice_point: 1
}
}
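As a quick sanity check of the shapes (numpy, not Caffe): slicing axis 1 at slice_point 1 corresponds to splitting the channel dimension after the first channel, and the Concat layer's job is the inverse.

```python
import numpy as np

# a bottom blob of shape (N, 2, H, W)
data = np.arange(16, dtype=np.float32).reshape(2, 2, 2, 2)

# equivalent of Slice with axis: 1, slice_point: 1
conv1, pool1 = np.split(data, [1], axis=1)
print(conv1.shape, pool1.shape)  # (2, 1, 2, 2) (2, 1, 2, 2)

# concatenating along channels restores the original blob
assert np.array_equal(np.concatenate([conv1, pool1], axis=1), data)
```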

Time complexity of this function?

algo(n)
for i in 0 to n {
for 0 to 8^i {
}
}
for i to 8^d {
}
Any kind of analysis or information about the time complexity of this algorithm would be useful: worst case, best case, lower/upper bounds, theta/omega/big-O, recurrence relation, etc.
Your algorithm runs in exponential time (T ∈ Θ(c^n) for some c > 1). You can analyse the number of iterations of the inner for loop (... for 0 to 8^i) using Sigma notation:
sum_{i=0}^{n} 8^i = (8^(n+1) − 1) / 7 ∈ Θ(8^n)
Since your algorithm is in Θ(8^n), it is also in O(8^n) (upper asymptotic bound) and Ω(8^n) (lower asymptotic bound).
The above analysis assumes that the d in the final for loop is less than or equal to n, in which case the two nested for loops before it dominate (and hence we needn't analyse the last, non-dominant for loop explicitly).
algo(n) is basically made of two parts:
for i in 0 to n
for 0 to 8^i
and
for i to 8^d
Let's start with the first. Assuming each iteration of the inner loop takes constant time, its cost is C·8^i.
Now, if we sum it across possible values of i we get:
8^0 + 8^1 + 8^2 + ... + 8^(n−1)
This is the sum of a geometric series with a = 1, r = 8 and n terms, so its sum is:
1 · (1 − 8^n) / (1 − 8) = (8^n − 1) / 7
For n → ∞, this can be approximated as 8^n / 7, and we can conclude the complexity is Θ(8^n / 7) = Θ(8^n).
As for the 2nd part, it is pretty straightforward: it contributes 8^d.
This gives a total complexity of T(n) ∈ Θ(8^d + 8^n).
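A quick empirical check of the geometric-series closed form (Python; reading "for i in 0 to n" as i = 0 .. n−1, as in the analysis above):

```python
def inner_iterations(n):
    # total iterations of the nested loops in algo(n)
    count = 0
    for i in range(n):       # for i in 0 to n (exclusive upper bound)
        count += 8 ** i      # the inner loop body runs 8^i times
    return count

# matches the closed form (8^n - 1) / 7 for every n
for n in range(1, 8):
    assert inner_iterations(n) == (8 ** n - 1) // 7

print(inner_iterations(5))  # 4681
```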