I wrote a Siamese-like network in Caffe with two inputs. The output of the convolutional layer for the first input is always the same, while the output for the second input changes every time. The input layer and the convolutional layers are as follows:
layer {
name: "input"
type: "Input"
top: "data1"
top: "data2"
input_param {
shape {
dim: 1
dim: 1
dim: 28
dim: 28
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data1"
top: "conv1_1"
convolution_param {
num_output: 20
kernel_size: 5
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data2"
top: "conv1_2"
convolution_param {
num_output: 20
kernel_size: 5
bias_term: false
weight_filler {
type: "xavier"
}
}
}
Can I build the convolutional layer with a Python layer? If so, how?
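For context, my rough (untested) understanding of the Python layer interface is sketched below; the forward pass is just an identity copy to show the four methods Caffe calls, not a real convolution, and the module/class names are placeholders:
import caffe

class ConvSketchLayer(caffe.Layer):
    # Skeleton of a Caffe Python layer. The forward pass is an identity copy
    # to illustrate the interface; it is NOT an actual convolution.

    def setup(self, bottom, top):
        # Called once at net construction; parameters arrive via self.param_str.
        if len(bottom) != 1:
            raise Exception('ConvSketchLayer expects exactly one bottom blob')

    def reshape(self, bottom, top):
        # The output shape must be declared before every forward pass.
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        # Gradient of the identity; a real convolution needs its own gradient code.
        if propagate_down[0]:
            bottom[0].diff[...] = top[0].diff
It would be referenced from the prototxt with a layer of type: "Python" and python_param { module: "conv_sketch" layer: "ConvSketchLayer" }, and Caffe has to be built with WITH_PYTHON_LAYER := 1.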
This may be a very stupid question, but since I don't have sufficient background knowledge and have no more time to search for the answer, I have to ask for help here. I programmatically generated a training dataset of images of simple geometric shapes (triangles, squares, diamonds, etc.) and built a CNN with two convolutional layers, one pooling layer, and a final fully connected layer to learn to classify these shapes. But the network simply does not learn: the loss just does not decrease. What is the cause?
In Caffe, the neural network configuration file "very_simple_one.prototxt" looks like:
name: "very_simple_one"
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/train_valid_lmdb"
batch_size: 1000
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/test_lmdb"
batch_size: 100
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 50
kernel_size: 5
stride: 5
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 5
stride: 5
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
convolution_param {
num_output: 3
kernel_size: 8
stride: 8
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "fc3"
type: "InnerProduct"
bottom: "conv2"
top: "fc3"
inner_product_param {
num_output: 3
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc3"
bottom: "label"
}
The "solver.prototxt" looks like:
net: "very_simple_one.prototxt"
type: "SGD"
test_iter: 15
test_interval: 100
base_lr: 0.05
lr_policy: "step"
gamma: 0.9999
stepsize: 100
display: 20
max_iter: 50000
snapshot: 2000
momentum: 0.9
weight_decay: 0.00000000000
solver_mode: GPU
I also tried AdaGrad by commenting out "momentum" and changing "type" to "AdaGrad".
I train this net with the command:
....../caffe/build/tools/caffe train -solver solver.prototxt
All of these failed to train; the loss just does not decrease. It hovers within a very, very small interval but never really goes down.
I wonder whether this dataset simply cannot be learned, or whether there is something wrong with my configuration files above.
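For reference, the same training can also be driven from pycaffe, which makes it easy to watch the outputs chunk by chunk; a rough sketch (same solver.prototxt as above):
import caffe

caffe.set_mode_gpu()

# Load the solver above; it pulls in very_simple_one.prototxt itself.
solver = caffe.get_solver('solver.prototxt')

for i in range(10):
    solver.step(20)               # 20 SGD iterations per chunk, matching display: 20
    out = solver.net.forward()    # extra forward pass; returns the net's output blobs
    print('chunk %d outputs: %s' % (i, out))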
I have also modified the network following Ibrahim Yousuf's suggestion, replacing the pooling layer with a convolutional layer:
name: "very_simple_one"
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/train_valid_lmdb"
batch_size: 1000
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/test_lmdb"
batch_size: 100
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 50
kernel_size: 5
##stride: 5
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "conv1.5"
type: "Convolution"
bottom: "conv1"
top: "conv1.5"
convolution_param {
num_output: 10
kernel_size: 5
stride: 2
}
}
layer {
name: "relu1.5"
type: "ReLU"
bottom: "conv1.5"
top: "conv1.5"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "conv1.5"
top: "conv2"
convolution_param {
num_output: 3
kernel_size: 8
stride: 4
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "fc3"
type: "InnerProduct"
bottom: "conv2"
top: "fc3"
inner_product_param {
num_output: 3
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc3"
bottom: "label"
}
But the loss still does not decrease. Should I conclude that the cause is my dataset? It is really very small, and if anyone is willing to take a look, I can upload it to a file-sharing service for testing.
Solved. The classification labels should start from zero, not one: e.g. 0, 1, 2 for a three-class problem, not 1, 2, 3.
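In other words, the labels written into the LMDB have to be shifted down by one. A minimal sketch of the fix at LMDB-writing time (the helper below is hypothetical; only the label shift matters):
import caffe
import numpy as np

def put_example(txn, key, image, label_one_based):
    # txn: an open lmdb write transaction; image: (C, H, W) uint8 array.
    # SoftmaxWithLoss expects labels in 0..C-1, so map 1, 2, 3 -> 0, 1, 2.
    label = int(label_one_based) - 1
    datum = caffe.io.array_to_datum(image.astype(np.uint8), label)
    txn.put(key, datum.SerializeToString())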
I'm using the DeepLab_v2 version of Caffe to do semantic segmentation. I can fine-tune ResNet-101 from the ImageNet model, but I cannot train the model from scratch on custom data. Has anyone had a similar experience and managed to solve this issue?
This is what a functional block of the ResNet I'm currently training looks like:
layer {
bottom: "data"
top: "conv1"
name: "conv1"
type: "Convolution"
param {
name: "conv1_0"
lr_mult: 1
decay_mult: 1
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 2
bias_term: false
weight_filler {
type: "msra"
}
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "bn_conv1"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
param {
name: "bn_conv1_0"
lr_mult: 0
}
param {
name: "bn_conv1_1"
lr_mult: 0
}
param {
name: "bn_conv1_2"
lr_mult: 0
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "scale_conv1"
type: "Scale"
scale_param {
bias_term: true
filler {
value: 0.5
}
bias_filler {
value: -2
}
}
param {
name: "scale_conv1_0"
lr_mult: 0
}
param {
name: "scale_conv1_1"
lr_mult: 0
}
}
layer {
top: "conv1"
bottom: "conv1"
name: "conv1_relu"
type: "ReLU"
}
I tried all kinds of variations, including use_global_stats: false. I am able to train a single block of the type above, but when I try to use all 101 layers, the model no longer converges.
Any ideas?
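One way to narrow this down is to check from pycaffe whether the BatchNorm statistics actually move during training, since (as far as I understand) with use_global_stats: true they are never updated. A sketch, where the solver file name is a placeholder and the layer name bn_conv1 comes from the block above:
import caffe

caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')   # placeholder solver for this net

def bn_stats(net, name='bn_conv1'):
    # Caffe's BatchNorm layer stores the running mean, running variance and a
    # moving-average scale factor as its three param blobs.
    mean, var, factor = [b.data.copy() for b in net.params[name]]
    scale = 1.0 / factor[0] if factor[0] != 0 else 0.0
    return mean * scale, var * scale

before = bn_stats(solver.net)
solver.step(100)
after = bn_stats(solver.net)
print('max mean shift: %g' % abs(after[0] - before[0]).max())
print('max var shift: %g' % abs(after[1] - before[1]).max())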
Hello, I am quite new to deep learning and Caffe, so please don't mind if my question is a little stupid.
I have been looking into pixel-wise classification / segmentation / regression. I have seen that there is a GitHub repo for image segmentation (the Berkeley FCN) and some other posts like question 1 and question 2.
What I want to do is something similar but slightly different. I have a dataset of images and their corresponding ground truth, also stored as images. I am not sure whether it is better to use pixel-wise classification via SoftmaxWithLoss or regression via EuclideanLoss. My ground-truth images contain values from 0-255 and have only one channel.
I have been trying a regression task with a fully convolutional network made of a few convolutional layers that preserve the output size; the last layers look like this (below). In the end I want to do depth prediction, so I am not sure whether SoftmaxWithLoss or EuclideanLoss is the better choice. This question might be a bit stupid, but is this approach correct? As a first step I have tried to learn the shape of my images, i.e. I set the values in the ground truth to 0.5 wherever my input image has a value greater than 0 at the corresponding location (sketched after the prototxt below). Could anyone help me please?
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 256
kernel_size: 53
stride: 1
pad: 26
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "conv1"
top: "conv2"
convolution_param {
num_output: 128
kernel_size: 15
stride: 1
pad: 7
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3"
type: "Convolution"
bottom: "conv2"
top: "conv3"
convolution_param {
num_output: 1
kernel_size: 11
stride: 1
pad: 5
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
#layer {
# name: "loss"
# type: "SoftmaxWithLoss"
# bottom: "score"
# bottom: "label"
# top: "loss"
# loss_param {
# ignore_label: 255
# normalize: true
# }
#}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "conv3"
bottom: "label"
top: "loss"
}
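For reference, the 0.5-valued ground truth described above is built roughly like this (simplified sketch; the helper name is made up):
import numpy as np

def make_shape_target(input_image):
    # input_image: single-channel array with values in 0-255.
    # The regression target is 0.5 wherever the input is non-zero, else 0,
    # so EuclideanLoss should first learn the silhouette of the shape.
    target = np.zeros(input_image.shape, dtype=np.float32)
    target[input_image > 0] = 0.5
    return target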
In a recent discussion, I found out that some parts of the deploy.prototxt exist only because they have been directly copied from the train_test.prototxt and are ignored during testing. For example:
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param { #Starting here
lr_mult: 1
}
param {
lr_mult: 2
} #To here
convolution_param { #is this section useful?
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
I was told that the section containing the learning-rate multipliers for the weights and biases is useless in deploy files and can be deleted. This got me thinking: is the convolution_param portion absolutely required? If yes, do we still need to define the weight and bias fillers, given that we only do testing with this file and fillers are only used when initializing a network for training? Is there any other detail that is unnecessary?
The convolution_param portion is required, but you can remove weight_filler and bias_filler if you want.
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
}
}
The above layer will run fine at test time.
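The fillers are safe to drop because at test time the parameters are not initialized from fillers at all; they are loaded from the trained .caffemodel. For example (file names are placeholders):
import caffe

caffe.set_mode_cpu()
# The learned parameters come from the .caffemodel, so any weight_filler /
# bias_filler left in the deploy prototxt is simply ignored.
net = caffe.Net('deploy.prototxt', 'trained.caffemodel', caffe.TEST)
print('conv1 weight shape: %s' % (net.params['conv1'][0].data.shape,))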
I am running the MNIST example with some manual changes to the layers. During training everything works great, and I reach a final test accuracy of ~99%. I am now trying to work with the generated model in Python using pycaffe, following the steps given here. I want to compute the confusion matrix, so I'm looping through the test images one by one from LMDB and then running the network. Here is the code:
net = caffe.Net(args.proto, args.model, caffe.TEST)
...
datum = caffe.proto.caffe_pb2.Datum()
datum.ParseFromString(value)
label = int(datum.label)
image = caffe.io.datum_to_array(datum).astype(np.uint8)
...
net.blobs['data'].reshape(1, 1, 28, 28) # Greyscale 28x28 images
net.blobs['data'].data[...] = image
net.forward()
# Get predicted label
print net.blobs['label'].data[0] # use this later for confusion matrix
Here is my network definition prototxt:
name: "MNISTNet"
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "pool2"
top: "fc1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "fc1"
top: "fc1"
}
layer {
name: "fc2"
type: "InnerProduct"
bottom: "fc1"
top: "fc2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc2"
bottom: "label"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
Note that the test batch size is 100, which is why I need that reshape in the Python code. Now, if I change the test batch size to 1, the exact same Python code prints different (and mostly correct) predicted class labels. Thus, with batch size 1 the code produces the expected result with ~99% accuracy, while with batch size 100 it is horrible.
However, based on the ImageNet pycaffe tutorial, I don't see what I'm doing wrong. As a last resort I could create a copy of my prototxt with batch size 1 for testing and use that in my Python code while keeping the original for training, but that is not ideal.
Also, I don't think it is a preprocessing issue, since that wouldn't explain why it works well with batch size 1.
Any pointers appreciated!
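For completeness, the full confusion-matrix loop I'm aiming for looks roughly like this (a sketch reusing args.proto / args.model from the snippet above, with the predicted class taken as the argmax over the fc2 blob):
import numpy as np
import lmdb
import caffe

net = caffe.Net(args.proto, args.model, caffe.TEST)
confusion = np.zeros((10, 10), dtype=np.int64)

env = lmdb.open('test_lmdb', readonly=True)
with env.begin() as txn:
    for key, value in txn.cursor():
        datum = caffe.proto.caffe_pb2.Datum()
        datum.ParseFromString(value)
        label = int(datum.label)
        image = caffe.io.datum_to_array(datum).astype(np.uint8)

        net.blobs['data'].reshape(1, 1, 28, 28)   # greyscale 28x28, batch of one
        net.blobs['data'].data[...] = image
        net.forward()

        predicted = int(net.blobs['fc2'].data[0].argmax())   # scores from the last InnerProduct
        confusion[label, predicted] += 1

print(confusion)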