Caffe FCN pixel-wise segmentation to regression - deep-learning

Hello, I am quite new to deep learning and Caffe, so please bear with me if my question sounds a little naive.
I have been looking into pixel-wise classification / segmentation / regression. I found the Berkeley FCN image-segmentation repository on GitHub, as well as some related posts (question 1, question 2).
What I want to do is similar but slightly different. I have a dataset of images and their corresponding ground truth, also stored as images. The ground-truth images have a single channel with values from 0 to 255, and I am not sure whether pixel-wise classification via SoftmaxWithLoss or regression via EuclideanLoss is the better fit.
I have been trying the regression approach with a fully convolutional network: a few convolutional layers that preserve the spatial output size. The end goal is depth prediction, which is why I am unsure whether SoftmaxWithLoss or EuclideanLoss is the better choice. As a first sanity check I tried to learn the shape of my images, i.e. I set the ground truth to 0.5 wherever the corresponding input pixel is greater than 0. Is this approach correct? Could anyone help me, please? The network looks like this:
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 256
kernel_size: 53
stride: 1
pad: 26
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "conv1"
top: "conv2"
convolution_param {
num_output: 128
kernel_size: 15
stride: 1
pad: 7
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3"
type: "Convolution"
bottom: "conv2"
top: "conv3"
convolution_param {
num_output: 1
kernel_size: 11
stride: 1
pad: 5
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
#layer {
# name: "loss"
# type: "SoftmaxWithLoss"
# bottom: "score"
# bottom: "label"
# top: "loss"
# loss_param {
# ignore_label: 255
# normalize: true
# }
#}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "conv3"
bottom: "label"
top: "loss"
}
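One practical note on the regression route: EuclideanLoss on raw 0-255 targets tends to produce very large losses and gradients, so the ground truth is usually rescaled (for example to [0, 1]) before training. Below is a minimal data-preparation sketch, assuming h5py and numpy and an HDF5Data layer whose top blobs are named "data" and "label"; the file, dataset and function names are my own examples, not taken from the question:
import h5py
import numpy as np

def write_regression_h5(images, depths, path='train_regression.h5'):
    """Write inputs and scaled single-channel targets for EuclideanLoss.

    images: float32 array of shape (N, C, H, W)
    depths: uint8 array of shape (N, H, W) with values in 0..255
    """
    data = images.astype(np.float32)
    # add a channel axis and rescale 0..255 -> 0..1 so the regression
    # targets are in a well-behaved range for EuclideanLoss
    label = depths.astype(np.float32)[:, np.newaxis, :, :] / 255.0
    with h5py.File(path, 'w') as f:
        f.create_dataset('data', data=data)
        f.create_dataset('label', data=label)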

Related

CNN doesn't learn simple geometric patterns

This may be a very basic question, but since I do not have enough background knowledge and no more time to search for the answer, I have to ask for help here. I programmatically generated a training dataset of images of simple geometric shapes (triangles, squares, diamonds, etc.) and built a CNN with two convolutional layers, one pooling layer and a final fully connected layer to classify these shapes. But the network simply does not learn: the loss does not decrease. What is the cause?
In Caffe, the neural network configuration file "very_simple_one.prototxt" looks like:
name: "very_simple_one"
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/train_valid_lmdb"
batch_size: 1000
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/test_lmdb"
batch_size: 100
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 50
kernel_size: 5
stride: 5
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 5
stride: 5
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
convolution_param {
num_output: 3
kernel_size: 8
stride: 8
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "fc3"
type: "InnerProduct"
bottom: "conv2"
top: "fc3"
inner_product_param {
num_output: 3
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc3"
bottom: "label"
}
The "solver.prototxt" looks like:
net: "very_simple_one.prototxt"
type: "SGD"
test_iter: 15
test_interval: 100
base_lr: 0.05
lr_policy: "step"
gamma: 0.9999
stepsize: 100
display: 20
max_iter: 50000
snapshot: 2000
momentum: 0.9
weight_decay: 0.00000000000
solver_mode: GPU
I also tried AdaGrad by commenting out "momentum" and changing "type" to "AdaGrad".
The net is trained with the command:
....../caffe/build/tools/caffe train -solver solver.prototxt
All of these attempts failed to train; the loss simply does not decrease. It hovers within a very small interval but never really goes down.
I wonder whether the dataset simply cannot be learned, or whether there is something wrong with my configuration files above.
I also modified the network, following Ibrahim Yousuf's suggestion, by replacing the pooling layer with a convolutional layer:
name: "very_simple_one"
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/train_valid_lmdb"
batch_size: 1000
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
##name: "input"
name: "data"
##type: "Input"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "images/train_valid_lmdb_mean.binaryproto"
}
data_param {
source: "images/test_lmdb"
batch_size: 100
backend: LMDB
}
input_param {
shape {
dim: 1
dim: 3
dim: 200
dim: 200
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 50
kernel_size: 5
##stride: 5
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "conv1.5"
type: "Convolution"
bottom: "conv1"
top: "conv1.5"
convolution_param {
num_output: 10
kernel_size: 5
stride: 2
}
}
layer {
name: "relu1.5"
type: "ReLU"
bottom: "conv1.5"
top: "conv1.5"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "conv1.5"
top: "conv2"
convolution_param {
num_output: 3
kernel_size: 8
stride: 4
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "fc3"
type: "InnerProduct"
bottom: "conv2"
top: "fc3"
inner_product_param {
num_output: 3
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc3"
bottom: "label"
}
But the loss still does not decrease. Should I conclude that the cause is my dataset? It is really small, so if anyone is willing to give me a hand, I can upload it somewhere for testing.
Solved. The labels for classification should start from zero, not one, e.g. 0, 1, 2 for a three-class problem rather than 1, 2, 3.
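For reference, a minimal pycaffe sketch of that fix when writing the LMDB; the function and path names are my own examples, and the essential point is that the Datum label is already 0-based:
import lmdb
import caffe

def write_classification_lmdb(samples, db_path='train_valid_lmdb'):
    # samples: iterable of (image, label) with image as a uint8 array of
    # shape (C, H, W) and label an int in {0, 1, 2} -- zero-based!
    env = lmdb.open(db_path, map_size=1 << 30)
    with env.begin(write=True) as txn:
        for i, (img, label) in enumerate(samples):
            datum = caffe.io.array_to_datum(img, label)
            txn.put('{:08d}'.format(i).encode('ascii'), datum.SerializeToString())
    env.close()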

Caffe fails on fine-tune training mean.binaryproto: Movidius

After correctly creating train.txt and val.txt from my own dataset with two classes, I created train_leveldb, containing data.mdb (67.7 MB), lock.mdb (8.2 kB) and mean.binaryproto (786.4 kB), and val_leveldb, containing data.mdb (16.9 MB), lock.mdb (8.2 kB) and mean.binaryproto (786.4 kB).
After that I proceeded to train the net as follows:
>/opt/movidius/caffe/build/tools/caffe train --solver=/opt/movidius/caffe/models/bvlc_reference_caffenet/solver_isia.prototxt --weights /opt/movidius/caffe/models/bvlc_reference_caffenet/bvlc.caffemodel 2>&1 | tee /opt/movidius/caffe/models/blvc_reference_caffenet/train.log
The MDB database files (train and val) exist and are accessible, and so are both mean.binaryproto files. Any idea how to fix this? Any comments are welcome.
Thanks.
LOGFILE:
I0906 16:56:47.615576 10762 caffe.cpp:210] Use CPU.
I0906 16:56:47.615811 10762 solver.cpp:63] Initializing solver from parameters:
test_iter: 1000
test_interval: 1000
base_lr: 0.01
display: 20
max_iter: 40000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 2500
snapshot: 5000
snapshot_prefix: "/opt/movidius/caffe/models/bvlc_reference_caffenet/caffenet_isia"
solver_mode: CPU
net: "/opt/movidius/caffe/models/bvlc_reference_caffenet/train_isia.prototxt"
train_state {
level: 0
stage: ""
}
I0906 16:56:47.615988 10762 solver.cpp:106] Creating training net from net file: /opt/movidius/caffe/models/bvlc_reference_caffenet/train_isia.prototxt
I0906 16:56:47.616300 10762 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0906 16:56:47.616331 10762 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0906 16:56:47.616339 10762 net.cpp:58] Initializing net from parameters:
name: "CaffeNet"
state {
phase: TRAIN
level: 0
stage: ""
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "/home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb/mean.binaryproto"
}
data_param {
source: "/home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb"
batch_size: 256
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8-isia"
type: "InnerProduct"
bottom: "fc7"
top: "fc8-isia"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8-isia"
bottom: "label"
top: "loss"
}
I0906 16:56:47.616586 10762 layer_factory.hpp:77] Creating layer data
I0906 16:56:47.616915 10762 net.cpp:100] Creating Layer data
I0906 16:56:47.616928 10762 net.cpp:408] data -> data
I0906 16:56:47.616962 10762 net.cpp:408] data -> label
I0906 16:56:47.616978 10762 data_transformer.cpp:27] Loading mean file from: /home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb/mean.binaryproto
F0906 16:56:47.616992 10765 db_lmdb.hpp:15] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
F0906 16:56:47.616993 10762 io.cpp:63] Check failed: fd != -1 (-1 vs. -1) File not found: /home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb/mean.binaryproto
*** Check failure stack trace: ***
# 0x7f6b3b1dc0cd google::LogMessage::Fail()
# 0x7f6b3b1dc0cd google::LogMessage::Fail()
# 0x7f6b3b1ddf33 google::LogMessage::SendToLog()
# 0x7f6b3b1ddf33 google::LogMessage::SendToLog()
# 0x7f6b3b1dbc28 google::LogMessage::Flush()
# 0x7f6b3b1dbc28 google::LogMessage::Flush()
# 0x7f6b3b1de999 google::LogMessageFatal::~LogMessageFatal()
# 0x7f6b3b1de999 google::LogMessageFatal::~LogMessageFatal()
# 0x7f6b3b9aad4a caffe::ReadProtoFromBinaryFile()
# 0x7f6b3b993c4a caffe::db::LMDB::Open()
# 0x7f6b3b7a8250 caffe::DataTransformer<>::DataTransformer()
# 0x7f6b3b797ab7 caffe::DataReader<>::Body::InternalThreadEntry()
# 0x7f6b3b7d6775 caffe::BaseDataLayer<>::LayerSetUp()
# 0x7f6b396a4bcd (unknown)
# 0x7f6b3b7d689a caffe::BasePrefetchingDataLayer<>::LayerSetUp()
# 0x7f6b38f596db start_thread
# 0x7f6b3b93925b caffe::Net<>::Init()
# 0x7f6b399d988f clone
You can solve the problem without the mean file by using fixed per-channel mean values in transform_param instead:
transform_param {
mirror: true
crop_size: 227
mean_value: 104 # Blue
mean_value: 116 # Green
mean_value: 122 # Red
}
To get the actual values from mean.binaryproto, use this code: https://gist.github.com/Coderx7/26eebeefaa3fb28f654d2951980b80ba or compute them yourself.
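If you go the second route, here is a small pycaffe sketch (the file path is just an example) that reads a mean.binaryproto and prints the per-channel BGR means, which can then be plugged into transform_param as mean_value entries:
import numpy as np
import caffe

blob = caffe.proto.caffe_pb2.BlobProto()
with open('train_leveldb/mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())

mean = np.array(caffe.io.blobproto_to_array(blob))[0]  # shape: (channels, H, W)
print('BGR mean values:', mean.mean(axis=(1, 2)))      # average over H and W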

The outputs of the convolutional layer in Caffe are different

I wrote a Siamese-like network in Caffe with two inputs. The output of the convolutional layer for the first input is always the same, while the output for the second input changes every time. The input layer and the convolutional layers are as follows:
layer {
name: "input"
type: "Input"
top: "data1"
top: "data2"
input_param {
shape {dim: 1
dim: 1
dim: 28
dim: 28
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data1"
top: "conv1_1"
convolution_param {
num_output: 20
kernel_size: 5
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data2"
top: "conv1_2"
convolution_param {
num_output: 20
kernel_size: 5
bias_term: false
weight_filler {
type: "xavier"
}
}
}
Can I build the convolutional layer as a Python layer? If so, how?
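Regarding the last part of the question: yes, a convolution can be written as a "Python" layer. Below is a minimal, non-trainable sketch (the class name, module name and the fixed 3x3 kernel are my own examples) that runs a naive 2-D filter in forward() and leaves backward() empty, so it is a starting point rather than a replacement for the built-in Convolution layer:
import caffe
import numpy as np
from scipy.ndimage import convolve


class NaiveConvLayer(caffe.Layer):
    """Applies a fixed 3x3 kernel to every channel of the input blob."""

    def setup(self, bottom, top):
        if len(bottom) != 1 or len(top) != 1:
            raise Exception("NaiveConvLayer expects exactly one bottom and one top")
        # example kernel (a Laplacian); a real layer would read it from a
        # parameter string or learn it
        self.kernel = np.array([[0, 1, 0],
                                [1, -4, 1],
                                [0, 1, 0]], dtype=np.float32)

    def reshape(self, bottom, top):
        # 'same' filtering: the output shape equals the input shape
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        data = bottom[0].data
        for n in range(data.shape[0]):          # images in the batch
            for c in range(data.shape[1]):      # channels
                top[0].data[n, c] = convolve(data[n, c], self.kernel, mode='constant')

    def backward(self, top, propagate_down, bottom):
        pass  # no gradients in this sketch
In the prototxt it would be referenced with type: "Python" and a python_param block whose module and layer fields name the Python file and the class above; note that such a layer runs far slower than the native Convolution layer.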

Which part of the deploy.prototxt file in caffe is absolutely necessary for testing?

In a recent discussion, I found out that some parts of the deploy.prototxt exist only because they have been directly copied from the train_test.prototxt and are ignored during testing. For example:
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param { #Starting here
lr_mult: 1
}
param {
lr_mult: 2
} #To here
convolution_param { #is this section useful?
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
I was told that the section containing the learning-rate multipliers for the weights and biases is useless in deploy files and can be deleted. This got me thinking: is the convolution_param portion absolutely required? If yes, do we still have to define the weight and bias fillers, given that we only do testing with this file and fillers are only used when initializing a network for training? Is there any other detail that is unnecessary?
The convolution_param portion is required, but you can remove weight_filler and bias_filler if you want.
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
}
}
The above layer will run fine at test time.
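A quick way to confirm this is to load the trimmed deploy definition together with the trained weights in pycaffe; a hedged sketch, with the file names and the input shape as placeholders:
import numpy as np
import caffe

# the learned weights come from the .caffemodel, so no fillers are needed
net = caffe.Net('deploy.prototxt', 'trained.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1, 1, 28, 28)                     # example input shape
net.blobs['data'].data[...] = np.random.rand(1, 1, 28, 28)  # dummy input
net.forward()
print({name: blob.data.shape for name, blob in net.blobs.items()})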

Discrepancy in results when using batch size 1 in prototxt versus coercing batch size to 1 in pycaffe

I am running the MNIST example with some manual changes to the layers. Training works great and I reach a final test accuracy of ~99%. I am now trying to work with the generated model in Python using pycaffe, following the steps given here. I want to compute the confusion matrix, so I am looping through the test images one by one from LMDB and running the network on each. Here is the code:
net = caffe.Net(args.proto, args.model, caffe.TEST)
...
datum = caffe.proto.caffe_pb2.Datum()
datum.ParseFromString(value)
label = int(datum.label)
image = caffe.io.datum_to_array(datum).astype(np.uint8)
...
net.blobs['data'].reshape(1, 1, 28, 28) # Greyscale 28x28 images
net.blobs['data'].data[...] = image
net.forward()
# Get predicted label
print net.blobs['label'].data[0] # use this later for confusion matrix
Here is my network definition prototxt
name: "MNISTNet"
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "pool2"
top: "fc1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "fc1"
top: "fc1"
}
layer {
name: "fc2"
type: "InnerProduct"
bottom: "fc1"
top: "fc2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc2"
bottom: "label"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
Note that the test batch size is 100, which is why I need that reshape in the Python code. Now, if I change the test batch size to 1, the exact same Python code prints different (and mostly correct) predicted class labels. In other words, running with batch size 1 produces the expected ~99% accuracy, while batch size 100 gives terrible results.
However, based on the ImageNet pycaffe tutorial, I don't see what I'm doing wrong. As a last resort I could keep a copy of my prototxt with batch size 1 for testing and use the original one for training, but that is not ideal.
Also, I don't think it is a preprocessing issue, since that would not explain why it works well with batch size 1.
Any pointers appreciated!
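For what it's worth, here is a small sketch of the confusion-matrix bookkeeping itself, assuming the pre-softmax scores live in the 'fc2' blob as in the prototxt above; it only shows how a per-image prediction can be read with argmax and does not by itself explain the batch-size discrepancy:
import numpy as np

def update_confusion(confusion, net, true_label, score_blob='fc2'):
    # call after net.forward() for a single image (batch index 0)
    pred = int(net.blobs[score_blob].data[0].argmax())
    confusion[true_label, pred] += 1
    return pred

confusion = np.zeros((10, 10), dtype=np.int64)  # 10 MNIST classes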