I am trying to scale the contributions of the positive and negative samples differently in the classification loss (l_cls) of the multi-task loss function (L) of the RPN in Faster R-CNN.
As far as I know, the straightforward way to do this in Caffe is to use the "InfogainLoss" layer and pass an infogain matrix (H) that contains the different scales. Unfortunately, to my knowledge, H cannot be computed on the fly and passed to the "InfogainLoss" layer (ref). I would like to have my H computed dynamically.
It would be great if anyone could explain how this can be circumvented.
You can write a "Python" layer that computes H "on the fly" and feed it as a third "bottom" of the "InfogainLoss" layer:
layer {
name: "computeH"
bottom: "label"
top: "H"
type: "Python"
python_param { ... }
}
layer {
name: "loss"
type: "InfogainLoss"
bottom: "pred"
bottom: "label"
bottom: "H" # new H for each batch
top: "loss"
}
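A possible sketch of the H computation such a "Python" layer could perform (the function name and the weighting scheme are my own assumptions, not part of the original answer): build a diagonal H whose per-class weights are inversely proportional to the class frequencies in the current batch, so the rarer positives get a larger scale:

```python
import numpy as np

def compute_H(labels, num_classes, eps=1e-8):
    """Build a diagonal infogain matrix H on the fly from the batch labels.

    Each class is weighted inversely to its frequency in the batch, so
    under-represented classes (e.g. positives in RPN) contribute more
    to the loss. The weighting scheme here is an illustrative choice.
    """
    counts = np.bincount(labels.astype(int).ravel(), minlength=num_classes)
    weights = counts.sum() / (num_classes * (counts + eps))
    return np.diag(weights).astype(np.float32)
```

Inside the Python layer, reshape() would set top[0] to shape (1, 1, num_classes, num_classes), and forward() would copy compute_H(bottom[0].data, num_classes) into top[0].data for each batch.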
Related
I tried to use the Permute layer with Intel Caffe; the code with an in-place operation failed with a wrong top blob shape:
layer {
name: "conv4_3_norm_mbox_conf_perm"
type: "Permute"
bottom: "per_blob"
top: "per_blob"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
It succeeds with different bottom and top names.
Why does the in-place operation fail?
Obviously, values are messed up during permutation. To swap two variables, you need a temporary buffer (unless you use XOR tricks or something).
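The problem can be reproduced outside Caffe with a toy one-dimensional buffer (a sketch of the semantics, not Caffe code): writing permuted elements back into the same storage destroys values that are still needed.

```python
import numpy as np

def permute_inplace_naive(buf, perm):
    """Write buf[perm[i]] into buf[i] element by element, reusing the
    same storage -- the way an in-place Permute would have to."""
    for i in range(len(buf)):
        buf[i] = buf[perm[i]]  # may read a slot that was already overwritten
    return buf

def permute_with_temp(buf, perm):
    """Correct version: read everything from a temporary copy first."""
    tmp = buf.copy()
    for i in range(len(buf)):
        buf[i] = tmp[perm[i]]
    return buf

perm = [2, 0, 1]
print(permute_inplace_naive(np.array([10, 20, 30]), perm))  # corrupted: all 30s
print(permute_with_temp(np.array([10, 20, 30]), perm))      # [30 10 20]
```

This is why Permute needs distinct bottom and top blobs: the top acts as the temporary buffer.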
Sorry for this primitive question; I've just started using Caffe.
I want to multiply the output of a layer by a constant:
top = bottom * k
Any ideas? This is what I have tried so far (for example, constant = 0.5):
expParam = {'power': 1, 'scale': 0.5, 'shift': 0}
L.Exp(bottom, exp_param=expParam, in_place=False)
Using "Scale" layer:
L.Scale(bottom, scale_param={'filler': {'type': 'constant', 'value': 0.5}},
param={'lr_mult': 0, 'decay_mult': 0})
Note that by default "Scale" layer learns the scale factor. If you want it to remain fixed you need to set lr_mult to zero.
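If you'd rather avoid a parameter blob altogether, the "Power" layer computes top = (shift + scale * bottom) ^ power, so with power: 1 and shift: 0 it reduces to a plain multiplication by a constant (layer and blob names below are placeholders):

```
layer {
  name: "mul_by_k"
  type: "Power"
  bottom: "bottom_name"
  top: "top_name"
  power_param {
    power: 1
    scale: 0.5
    shift: 0
  }
}
```

Since the Power layer has no learnable parameters, no lr_mult trick is needed.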
template <typename Dtype>
void SilenceLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  for (int i = 0; i < bottom.size(); ++i) {
    if (propagate_down[i]) {
      // propagate a zero gradient: the silenced blob does not affect the loss
      caffe_set(bottom[i]->count(), Dtype(0),
                bottom[i]->mutable_cpu_diff());
    }
  }
}
It just sets the diff to zero.
What is the use of this layer?
The purpose of this layer is simply to prevent the output of unused blobs from being reported in the log. Since it only consumes blobs and produces no output, its gradient is zero.
For instance, suppose we are using AlexNet and we change the bottom of the 'fc7' layer from 'fc6' to 'pool5'. If we do not delete the 'fc6' layer declaration, that layer is no longer used, but its output will be printed to stderr: it is considered an output of the whole architecture. If we want to keep 'fc6' for some reason, but without showing its values, we can use the 'SilenceLayer'.
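In prototxt, silencing the unused 'fc6' blob from the example above is just the following (the layer takes a bottom and produces no top; the layer name is arbitrary):

```
layer {
  name: "silence_fc6"
  type: "Silence"
  bottom: "fc6"
}
```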
http://caffe.berkeleyvision.org/tutorial/layers/silence.html
See also caffe.help.
I would like to split the Blob channels in Caffe, so that I can split one Blob of (N, c, w, h) into two output Blobs of size (N, c/2, w, h).
What I have described above is very general, what I want to do actually is to separate a two-channel input image into two different images. One goes to a convolutional layer and the other goes to a pooling layer. Finally, I concatenate the outputs.
So I am wondering whether a Caffe layer exists that allows the user to do such a thing, and how to define it in the prototxt file.
Yes, the Slice layer is for that purpose. From the Layer Catalogue:
The Slice layer is a utility layer that slices an input layer to multiple output layers along a given dimension (currently num or channel only) with given slice indices.
To slice a Blob of size N x 2 x H x W into two Blobs of size N x 1 x H x W, you have to slice axis: 1 (along channels) at slice_point: 1 (after the first channel):
layer {
name: "slice-conv-pool"
type: "Slice"
bottom: "data"
top: "conv1"
top: "pool1"
slice_param {
axis: 1
slice_point: 1
}
}
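In NumPy terms, this Slice configuration performs the equivalent of the following (a sketch of the semantics, not Caffe code):

```python
import numpy as np

# a toy batch: N=2, C=2, H=3, W=3
data = np.arange(2 * 2 * 3 * 3).reshape(2, 2, 3, 3)

# axis: 1, slice_point: 1  ->  cut along channels after the first one
conv1_in = data[:, :1, :, :]  # first channel,  shape (2, 1, 3, 3)
pool1_in = data[:, 1:, :, :]  # second channel, shape (2, 1, 3, 3)

print(conv1_in.shape, pool1_in.shape)  # (2, 1, 3, 3) (2, 1, 3, 3)
```

A "Concat" layer with axis: 1 on the two branch outputs then performs the inverse operation, as described in the question.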
This is my data layer of net.prototxt:
layer {
name: "csv"
type: "MemoryData"
top: "data"
top: "label"
include {
phase: TRAIN
}
memory_data_param {
batch_size: 10
channels: 1
width: 14
height: 1
}
}
I found the function
MemoryDataLayer<Dtype>::Reset(Dtype* data, Dtype* labels, int n)
but I don't know where I should call this function from. I would also like to know where the label data comes from, because I only see the label keyword in the Datum struct.
I always use a MemoryData layer when I train a network through the pycaffe module, like this:
solver = caffe.SGDSolver(solver_file)
X = np.zeros((batch_size, 3, im_height, im_width), dtype=np.float32)
Y = np.zeros((batch_size,), dtype=np.float32)
# put processed images into X, put labels into Y
solver.net.set_input_arrays(X, Y)
You can refer to caffe_root/python/caffe/pycaffe.py and _caffe.cpp for details.
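One caveat worth sketching (based on my reading of the checks in _caffe.cpp; treat the exact shape requirement as an assumption): set_input_arrays expects float32, C-contiguous arrays, and it is often safest to shape the label blob as N x 1 x 1 x 1 to match the data's leading dimension. Preparing the arrays explicitly:

```python
import numpy as np

batch_size, im_height, im_width = 10, 14, 1

# data blob: N x C x H x W, float32 and C-contiguous
X = np.ascontiguousarray(
    np.zeros((batch_size, 3, im_height, im_width)), dtype=np.float32)

# label blob: N x 1 x 1 x 1, same leading dimension as the data
Y = np.ascontiguousarray(
    np.zeros((batch_size, 1, 1, 1)), dtype=np.float32)

assert X.flags['C_CONTIGUOUS'] and Y.flags['C_CONTIGUOUS']
assert X.shape[0] == Y.shape[0]
```

If the shapes or dtypes do not match what the MemoryData layer declares in the prototxt, set_input_arrays raises an error instead of silently misreading the buffer.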