When looking through the prototxt of GoogLeNet, one finds that each inception module has a Concat layer at the end which takes several bottom inputs, e.g.:
layer {
name: "inception_3a/output"
type: "Concat"
bottom: "inception_3a/1x1"
bottom: "inception_3a/3x3"
bottom: "inception_3a/5x5"
bottom: "inception_3a/pool_proj"
top: "inception_3a/output"
}
As can be seen, there is one 1x1 conv layer, one 3x3 conv layer, one 5x5 conv layer and finally a pooling layer. These layers are defined as follows:
layer {
name: "inception_3a/1x1"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_1x1"
type: "ReLU"
bottom: "inception_3a/1x1"
top: "inception_3a/1x1"
}
layer {
name: "inception_3a/3x3_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3_reduce"
}
layer {
name: "inception_3a/3x3"
type: "Convolution"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3"
type: "ReLU"
bottom: "inception_3a/3x3"
top: "inception_3a/3x3"
}
layer {
name: "inception_3a/5x5_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5_reduce"
}
layer {
name: "inception_3a/5x5"
type: "Convolution"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5"
type: "ReLU"
bottom: "inception_3a/5x5"
top: "inception_3a/5x5"
}
layer {
name: "inception_3a/pool"
type: "Pooling"
bottom: "pool2/3x3_s2"
top: "inception_3a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_3a/pool_proj"
type: "Convolution"
bottom: "inception_3a/pool"
top: "inception_3a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
These layers have different numbers of outputs and different filter sizes. The documentation for the Concat layer is as follows:
Input:
n_i * c_i * h * w for each input blob i from 1 to K.
Output:
if axis = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w, and all input c_i should be the same.
if axis = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w, and all input n_i should be the same.
Firstly, I am not sure what the default axis is, and secondly I am not sure what dimensions the output volume will have, since width and height should stay the same but all three conv layers produce different numbers of channels. Any pointers would be really appreciated.
The default value of axis for 'Concat' is 1, so it concatenates along the channel dimension. For this to work, all the concatenated layers must have the same height and width. Looking at the log, the dimensions are (assuming batch size 32):
inception_3a/1x1 -> [32, 64, 28, 28]
inception_3a/3x3 -> [32, 128, 28, 28]
inception_3a/5x5 -> [32, 32, 28, 28]
inception_3a/pool_proj -> [32, 32, 28, 28]
Thus the final output will have dimension:
inception_3a/output -> [32, (64+128+32+32), 28, 28] -> [32, 256, 28, 28]
As expected from the Caffe log:
Creating Layer inception_3a/output
inception_3a/output <- inception_3a/1x1
inception_3a/output <- inception_3a/3x3
inception_3a/output <- inception_3a/5x5
inception_3a/output <- inception_3a/pool_proj
inception_3a/output -> inception_3a/output
Setting up inception_3a/output
Top shape: 32 256 28 28 (6422528)
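For intuition, here is a minimal NumPy sketch (not Caffe code, dummy data) of what concatenation along axis 1 does to blobs with these shapes:

import numpy as np

# dummy blobs with the shapes reported in the log, in (N, C, H, W) order
b1x1  = np.zeros((32, 64, 28, 28))
b3x3  = np.zeros((32, 128, 28, 28))
b5x5  = np.zeros((32, 32, 28, 28))
bpool = np.zeros((32, 32, 28, 28))

# axis=1 stacks the channel dimension; N, H and W must match across inputs
out = np.concatenate([b1x1, b3x3, b5x5, bpool], axis=1)
print(out.shape)  # (32, 256, 28, 28)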
After correctly creating train.txt and val.txt from my own dataset with two different classes, I created train_leveldb (containing data.mdb (67.7 MB), lock.mdb (8.2 kB) and mean.binaryproto (786.4 kB)) and val_leveldb (containing data.mdb (16.9 MB), lock.mdb (8.2 kB) and mean.binaryproto (786.4 kB)).
After that I proceed to train the net as follows:
>/opt/movidius/caffe/build/tools/caffe train --solver=/opt/movidius/caffe/models/bvlc_reference_caffenet/solver_isia.prototxt --weights /opt/movidius/caffe/models/bvlc_reference_caffenet/bvlc.caffemodel 2>&1 | tee /opt/movidius/caffe/models/blvc_reference_caffenet/train.log
The MDB database files (train and val) exist and are accessible, and so do both mean.binaryproto files. Any idea how to fix it? Any comment will be welcome.
Thanks.
LOGFILE:
I0906 16:56:47.615576 10762 caffe.cpp:210] Use CPU.
I0906 16:56:47.615811 10762 solver.cpp:63] Initializing solver from parameters:
test_iter: 1000
test_interval: 1000
base_lr: 0.01
display: 20
max_iter: 40000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 2500
snapshot: 5000
snapshot_prefix: "/opt/movidius/caffe/models/bvlc_reference_caffenet/caffenet_isia"
solver_mode: CPU
net: "/opt/movidius/caffe/models/bvlc_reference_caffenet/train_isia.prototxt"
train_state {
level: 0
stage: ""
}
I0906 16:56:47.615988 10762 solver.cpp:106] Creating training net from net file: /opt/movidius/caffe/models/bvlc_reference_caffenet/train_isia.prototxt
I0906 16:56:47.616300 10762 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0906 16:56:47.616331 10762 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0906 16:56:47.616339 10762 net.cpp:58] Initializing net from parameters:
name: "CaffeNet"
state {
phase: TRAIN
level: 0
stage: ""
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "/home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb/mean.binaryproto"
}
data_param {
source: "/home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb"
batch_size: 256
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8-isia"
type: "InnerProduct"
bottom: "fc7"
top: "fc8-isia"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8-isia"
bottom: "label"
top: "loss"
}
I0906 16:56:47.616586 10762 layer_factory.hpp:77] Creating layer data
I0906 16:56:47.616915 10762 net.cpp:100] Creating Layer data
I0906 16:56:47.616928 10762 net.cpp:408] data -> data
I0906 16:56:47.616962 10762 net.cpp:408] data -> label
I0906 16:56:47.616978 10762 data_transformer.cpp:27] Loading mean file from: /home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb/mean.binaryproto
F0906 16:56:47.616992 10765 db_lmdb.hpp:15] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
F0906 16:56:47.616993 10762 io.cpp:63] Check failed: fd != -1 (-1 vs. -1) File not found: /home/spalomar/workspace/ISIA/lmdb/Imagenet/train_leveldb/mean.binaryproto
*** Check failure stack trace: ***
# 0x7f6b3b1dc0cd google::LogMessage::Fail()
# 0x7f6b3b1dc0cd google::LogMessage::Fail()
# 0x7f6b3b1ddf33 google::LogMessage::SendToLog()
# 0x7f6b3b1ddf33 google::LogMessage::SendToLog()
# 0x7f6b3b1dbc28 google::LogMessage::Flush()
# 0x7f6b3b1dbc28 google::LogMessage::Flush()
# 0x7f6b3b1de999 google::LogMessageFatal::~LogMessageFatal()
# 0x7f6b3b1de999 google::LogMessageFatal::~LogMessageFatal()
# 0x7f6b3b9aad4a caffe::ReadProtoFromBinaryFile()
# 0x7f6b3b993c4a caffe::db::LMDB::Open()
# 0x7f6b3b7a8250 caffe::DataTransformer<>::DataTransformer()
# 0x7f6b3b797ab7 caffe::DataReader<>::Body::InternalThreadEntry()
# 0x7f6b3b7d6775 caffe::BaseDataLayer<>::LayerSetUp()
# 0x7f6b396a4bcd (unknown)
# 0x7f6b3b7d689a caffe::BasePrefetchingDataLayer<>::LayerSetUp()
# 0x7f6b38f596db start_thread
# 0x7f6b3b93925b caffe::Net<>::Init()
# 0x7f6b399d988f clone
You can solve the problem without a mean file by using per-channel mean values instead:
transform_param {
mirror: true
crop_size: 227
mean_value: 104 # Blue
mean_value: 116 # Green
mean_value: 122 # Red
}
To get the actual values from mean.binaryproto, use this code: https://gist.github.com/Coderx7/26eebeefaa3fb28f654d2951980b80ba
or compute them yourself.
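As a rough sketch of computing it yourself (assuming pycaffe is installed; the file path is a placeholder you would adjust):

import caffe

blob = caffe.proto.caffe_pb2.BlobProto()
with open('mean.binaryproto', 'rb') as f:   # placeholder path
    blob.ParseFromString(f.read())

# convert the BlobProto to a (C, H, W) array and average over height and width
mean = caffe.io.blobproto_to_array(blob)[0]
print(mean.mean(axis=(1, 2)))  # per-channel means (typically BGR order for Caffe image data)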
I've played with Caffe for a long time but have never done multi-label classification, and it seems I'm getting stuck:
What I'm using
First of all, I'm creating the lmdb (train_lmdb, val_lmdb), labels (labels_train_lmdb, labels_val_lmdb) and mean (mean_lmdb.binaryproto) with Caffe-LMDBCreation-MultiLabel.
The dataset has around 13000 images for 7 classes.
2000 of those images have two classes (for example, the label vector is [1, 0, 0, 1, 0, 0, 0]).
The rest of the images have only one class (for example, the label vector would be [0, 0, 1, 0, 0, 0, 0]).
What I'm expecting
I'm expecting, at least, to grab an image from the train dataset, for example:
img1.jpg 0 0 0 1 0 0 0
classify it, and get a prediction similar to [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
What I'm having instead
For the image above (img1.jpg), I'm getting this type of result:
[0.48112139105796814, 0.5486980676651001, 0.5396456122398376, 0.44233766198158264, 0.5605107545852661, 0.3539462387561798, 0.5215630531311035]
which doesn't make sense. I've tried with several snapshots (one every 10000 iterations) and the results are similar, all of them really close to 0.50.
My prototxt
train_val.prototxt
name: "multi-class-alexnet"
# --------------------------------- TRAIN -------------------------------
# -----------------------------------------------------------------------
layer {
name: "data"
type: "Data"
top: "data"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 180
mean_file: "./mean_lmdb.binaryproto"
}
data_param {
source: "./train_lmdb"
batch_size: 64
backend: LMDB
}
}
# ---------------------------- TRAIN LABELS -----------------------------
# -----------------------------------------------------------------------
layer {
name: "data"
type: "Data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
mean_value: 0
}
data_param {
source: "./labels_train_lmdb"
batch_size: 64
backend: LMDB
}
}
# ---------------------------------- VAL --------------------------------
# -----------------------------------------------------------------------
layer {
name: "data"
type: "Data"
top: "data"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 180
mean_file: "./mean_lmdb.binaryproto"
}
data_param {
source: "./val_lmdb"
batch_size: 32
backend: LMDB
}
}
# ----------------------------- VAL LABELS ------------------------------
# -----------------------------------------------------------------------
layer {
name: "data"
type: "Data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
mean_value: 0
}
data_param {
source: "./labels_val_lmdb"
batch_size: 32
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6_"
type: "InnerProduct"
bottom: "pool5"
top: "fc6_"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6_"
top: "fc6_"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6_"
top: "fc6_"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6_"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "latent"
type: "InnerProduct"
bottom: "fc7"
top: "latent"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 48
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
bottom: "latent"
top: "latent_sigmoid"
name: "latent_sigmoid"
type: "Sigmoid"
}
layer {
name: "fc9"
type: "InnerProduct"
bottom: "latent_sigmoid"
top: "fc9"
param {
lr_mult: 10
decay_mult: 1
}
param {
lr_mult: 20
decay_mult: 0
}
inner_product_param {
num_output: 7
weight_filler {
type: "gaussian"
std: 0.2
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "accuracy"
type: "MultiLabelAccuracy"
bottom: "fc9"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
# ----------------------------------------------------------------
# ----------------- Multi-label Loss Function -------------------
# ----------------------------------------------------------------
layer {
name: "loss"
type: "SigmoidCrossEntropyLoss"
bottom: "fc9"
bottom: "label"
top: "loss"
}
deploy.prototxt:
name: "multi-class-alexnet"
input: "data"
input_shape {
dim: 10
dim: 3
dim: 180
dim: 180
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6_"
type: "InnerProduct"
bottom: "pool5"
top: "fc6_"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6_"
top: "fc6_"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6_"
top: "fc6_"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6_"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "latent_"
type: "InnerProduct"
bottom: "fc7"
top: "latent_"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 7
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
bottom: "latent_"
top: "latent_sigmoid"
name: "latent_sigmoid"
type: "Sigmoid"
}
Loss function
Somehow my model is showing two losses, which I don't understand: losses on output #0 and output #5. These are grepped from the last lines of the log (after over 110000 iterations):
output #0:
I0427 10:20:04.475754 1817 solver.cpp:238] Train net output #0: loss = 0.0867133 (* 1 = 0.0867133 loss)
I0427 10:20:38.257825 1817 solver.cpp:238] Train net output #0: loss = 0.0477974 (* 1 = 0.0477974 loss)
I0427 10:21:11.794013 1817 solver.cpp:238] Train net output #0: loss = 0.0390092 (* 1 = 0.0390092 loss)
I0427 10:21:45.620671 1817 solver.cpp:238] Train net output #0: loss = 0.039954 (* 1 = 0.039954 loss)
I0427 10:22:19.271747 1817 solver.cpp:238] Train net output #0: loss = 0.0477802 (* 1 = 0.0477802 loss)
I0427 10:22:53.160802 1817 solver.cpp:238] Train net output #0: loss = 0.0406158 (* 1 = 0.0406158 loss)
I0427 10:23:26.843694 1817 solver.cpp:238] Train net output #0: loss = 0.0355715 (* 1 = 0.0355715 loss)
I0427 10:24:31.727321 1817 solver.cpp:238] Train net output #0: loss = 0.0396538 (* 1 = 0.0396538 loss)
I0427 10:25:05.019598 1817 solver.cpp:238] Train net output #0: loss = 0.037121 (* 1 = 0.037121 loss)
I0427 10:25:38.730303 1817 solver.cpp:238] Train net output #0: loss = 0.0362058 (* 1 = 0.0362058 loss)
output #5:
I0427 09:26:52.251719 1817 solver.cpp:398] Test net output #5: loss = 6.98116 (* 1 = 6.98116 loss)
I0427 09:33:01.639736 1817 solver.cpp:398] Test net output #5: loss = 6.99285 (* 1 = 6.99285 loss)
I0427 09:39:09.991879 1817 solver.cpp:398] Test net output #5: loss = 7.02165 (* 1 = 7.02165 loss)
I0427 09:45:18.013739 1817 solver.cpp:398] Test net output #5: loss = 7.01533 (* 1 = 7.01533 loss)
I0427 09:51:27.065721 1817 solver.cpp:398] Test net output #5: loss = 7.02347 (* 1 = 7.02347 loss)
I0427 09:58:13.271441 1817 solver.cpp:398] Test net output #5: loss = 6.98176 (* 1 = 6.98176 loss)
I0427 10:05:31.896226 1817 solver.cpp:398] Test net output #5: loss = 6.99103 (* 1 = 6.99103 loss)
I0427 10:12:12.693677 1817 solver.cpp:398] Test net output #5: loss = 7.02868 (* 1 = 7.02868 loss)
I0427 10:18:23.250385 1817 solver.cpp:398] Test net output #5: loss = 7.03427 (* 1 = 7.03427 loss)
I0427 10:24:31.239820 1817 solver.cpp:398] Test net output #5: loss = 6.97721 (* 1 = 6.97721 loss)
While training on the dataset, I am getting the following error:
I0614 19:07:11.271327 30865 layer_factory.hpp:77] Creating layer data
I0614 19:07:11.271596 30865 net.cpp:84] Creating Layer data
I0614 19:07:11.271848 30865 net.cpp:380] data -> data
I0614 19:07:11.271896 30865 net.cpp:380] data -> label
I0614 19:07:11.271941 30865 data_transformer.cpp:25] Loading mean file from: train_mean
I0614 19:07:11.275465 30865 image_data_layer.cpp:38] Opening file
F0614 19:07:11.275923 30865 image_data_layer.cpp:49] Check failed: !lines_.empty() File is empty
*** Check failure stack trace: ***
# 0x7fba518d25cd google::LogMessage::Fail()
# 0x7fba518d4433 google::LogMessage::SendToLog()
# 0x7fba518d215b google::LogMessage::Flush()
# 0x7fba518d4e1e google::LogMessageFatal::~LogMessageFatal()
# 0x7fba51ce9509 caffe::ImageDataLayer<>::DataLayerSetUp()
# 0x7fba51d1f62e caffe::BasePrefetchingDataLayer<>::LayerSetUp()
# 0x7fba51de7897 caffe::Net<>::Init()
# 0x7fba51de9fde caffe::Net<>::Net()
# 0x7fba51df24e5 caffe::Solver<>::InitTrainNet()
# 0x7fba51df3925 caffe::Solver<>::Init()
# 0x7fba51df3c4f caffe::Solver<>::Solver()
# 0x7fba51dc8bb1 caffe::Creator_SGDSolver<>()
# 0x40a4b8 train()
# 0x406fa0 main
# 0x7fba50843830 __libc_start_main
# 0x4077c9 _start
# (nil) (unknown)
Aborted (core dumped)
I used the template from the Caffe GitHub repo after installing it.
I have created a subdirectory named playground under the Caffe root directory.
I am attaching the complete folder for reproducibility.
GitHub Link
The commands that I executed successfully:
../build/tools/convert_imageset -resize_height 256 -resize_width 256 train_raw_img/ train_files.txt train_lmdb
../build/tools/convert_imageset -resize_height 256 -resize_width 256 test_raw_img/ test_files.txt test_lmdb
../build/tools/compute_image_mean train_lmdb train_mean
../build/tools/compute_image_mean train_lmdb test_mean
However, when I proceed to train the network, I receive the above error:
../build/tools/caffe train --solver=my_solver_val.prototxt
Complete log of error:
I0614 19:32:54.634418 31048 caffe.cpp:211] Use CPU.
I0614 19:32:54.635144 31048 solver.cpp:44] Initializing solver from parameters:
test_iter: 1000
test_interval: 1000
base_lr: 0.01
display: 20
max_iter: 50000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 10000
snapshot: 10000
snapshot_prefix: "models/mymodel/caffenet_train"
solver_mode: CPU
net: "my_train_val.prototxt"
train_state {
level: 0
stage: ""
}
I0614 19:32:54.639066 31048 solver.cpp:87] Creating training net from net file: my_train_val.prototxt
I0614 19:32:54.640214 31048 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0614 19:32:54.640645 31048 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0614 19:32:54.641345 31048 net.cpp:51] Initializing net from parameters:
name: "CaffeNet"
state {
phase: TRAIN
level: 0
stage: ""
}
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 256
mean_file: "train_mean"
}
data_param {
source: "train_files.txt"
batch_size: 2
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
I0614 19:32:54.644022 31048 layer_factory.hpp:77] Creating layer data
I0614 19:32:54.644239 31048 net.cpp:84] Creating Layer data
I0614 19:32:54.644256 31048 net.cpp:380] data -> data
I0614 19:32:54.644280 31048 net.cpp:380] data -> label
I0614 19:32:54.644448 31048 data_transformer.cpp:25] Loading mean file from: train_mean
I0614 19:32:54.646653 31048 image_data_layer.cpp:38] Opening file
F0614 19:32:54.646975 31048 image_data_layer.cpp:49] Check failed: !lines_.empty() File is empty
*** Check failure stack trace: ***
# 0x7f83c21c95cd google::LogMessage::Fail()
# 0x7f83c21cb433 google::LogMessage::SendToLog()
# 0x7f83c21c915b google::LogMessage::Flush()
# 0x7f83c21cbe1e google::LogMessageFatal::~LogMessageFatal()
# 0x7f83c25e0509 caffe::ImageDataLayer<>::DataLayerSetUp()
# 0x7f83c261662e caffe::BasePrefetchingDataLayer<>::LayerSetUp()
# 0x7f83c26de897 caffe::Net<>::Init()
# 0x7f83c26e0fde caffe::Net<>::Net()
# 0x7f83c26e94e5 caffe::Solver<>::InitTrainNet()
# 0x7f83c26ea925 caffe::Solver<>::Init()
# 0x7f83c26eac4f caffe::Solver<>::Solver()
# 0x7f83c26bfbb1 caffe::Creator_SGDSolver<>()
# 0x40a4b8 train()
# 0x406fa0 main
# 0x7f83c113a830 __libc_start_main
# 0x4077c9 _start
# (nil) (unknown)
Aborted (core dumped)
You are using "ImageData" input layer. The layer takes a text file (in your case source: "train_files.txt") and expects each line of the file to contain a path to an image file and the classification label for that image.
It seems like this file ('train_files.txt') is empty in your case.
1. Verify that 'train_files.txt' lists image file names.
2. Verify that the listed image files do exist on your machine and you have reading permission for these files.
BTW,
If you already went through all the trouble of creating train_lmdb why not use an input "Data" layer that directly reads the lmdb?
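For reference, a hypothetical train_files.txt for an "ImageData" layer could look like this (file names and labels are made up; one image path and one integer class label per line):

cat_0001.jpg 0
cat_0002.jpg 0
dog_0001.jpg 1

And a quick, non-authoritative Python check that the list file is actually populated:

with open('train_files.txt') as f:
    lines = [l.strip() for l in f if l.strip()]
print(len(lines), 'entries; first few:', lines[:3])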
I am using AlexNet and am trying to deploy my network, but when I do so I get the following error:
I0109 15:16:56.645679 4240 net.cpp:100] Creating Layer fc6
I0109 15:16:56.645681 4240 net.cpp:434] fc6 <- pool5
I0109 15:16:56.645684 4240 net.cpp:408] fc6 -> fc6
I0109 15:16:56.712829 4240 net.cpp:150] Setting up fc6
I0109 15:16:56.712869 4240 net.cpp:157] Top shape: 1 4096 (4096)
I0109 15:16:56.712873 4240 net.cpp:165] Memory required for data: 6778220
I0109 15:16:56.712882 4240 layer_factory.hpp:77] Creating layer relu6
I0109 15:16:56.712890 4240 net.cpp:100] Creating Layer relu6
I0109 15:16:56.712893 4240 net.cpp:434] relu6 <- fc6
I0109 15:16:56.712915 4240 net.cpp:395] relu6 -> fc6 (in-place)
F0109 15:16:56.713158 4240 blob.hpp:122] Check failed: axis_index < num_axes() (2 vs. 2) axis 2 out of range for 2-D Blob with shape 1 4096 (4096)
*** Check failure stack trace: ***
I have no clue why. It has always worked for me, and now this error occurs.
EDIT
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } }
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6l"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6l"
top: "fc6d"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6d"
top: "fc7"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
inner_product_param {
num_output: 612
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "softmax"
type: "Softmax"
bottom: "fc8"
top: "softmax"
}
Above is my .prototxt (it should be the same as AlexNet).
Check failed: axis_index < num_axes() (2 vs. 2) axis 2 out of range for 2-D Blob with shape 1 4096 (4096)
*** Check failure stack trace: ***
# 0x7f8b8e3125cd google::LogMessage::Fail()
# 0x7f8b8e314433 google::LogMessage::SendToLog()
# 0x7f8b8e31215b google::LogMessage::Flush()
# 0x7f8b8e314e1e google::LogMessageFatal::~LogMessageFatal()
# 0x7f8b8e92c86a caffe::Blob<>::CanonicalAxisIndex()
# 0x7f8b8eaa09c2 caffe::CuDNNReLULayer<>::Reshape()
# 0x7f8b8e97f481 caffe::Net<>::Init()
# 0x7f8b8e980d01 caffe::Net<>::Net()
# 0x7f8b8eac7c5a caffe::Solver<>::InitTrainNet()
# 0x7f8b8eac8fc7 caffe::Solver<>::Init()
# 0x7f8b8eac936a caffe::Solver<>::Solver()
# 0x7f8b8e960c53 caffe::Creator_SGDSolver<>()
# 0x40ac89 train()
# 0x407590 main
# 0x7f8b8d283830 __libc_start_main
# 0x407db9 _start
# (nil) (unknown)
Aborted (core dumped)
Example 2:
I0228 13:03:22.875816 4395 layer_factory.hpp:77] Creating layer relu6
I0228 13:03:22.875828 4395 net.cpp:100] Creating Layer relu6
I0228 13:03:22.875831 4395 net.cpp:434] relu6 <- fc-main
I0228 13:03:22.875855 4395 net.cpp:395] relu6 -> fc-main (in-place)
F0228 13:03:22.876565 4395 blob.hpp:122] Check failed: axis_index < num_axes() (2 vs. 2) axis 2 out of range for 2-D Blob with shape 4 4096 (16384)
*** Check failure stack trace: ***
# 0x7fe1271d85cd google::LogMessage::Fail()
# 0x7fe1271da433 google::LogMessage::SendToLog()
# 0x7fe1271d815b google::LogMessage::Flush()
# 0x7fe1271dae1e google::LogMessageFatal::~LogMessageFatal()
# 0x7fe1277f286a caffe::Blob<>::CanonicalAxisIndex()
# 0x7fe1279669c2 caffe::CuDNNReLULayer<>::Reshape()
# 0x7fe127845481 caffe::Net<>::Init()
# 0x7fe127846d01 caffe::Net<>::Net()
# 0x7fe12798dc5a caffe::Solver<>::InitTrainNet()
# 0x7fe12798efc7 caffe::Solver<>::Init()
# 0x7fe12798f36a caffe::Solver<>::Solver()
# 0x7fe127826c53 caffe::Creator_SGDSolver<>()
# 0x40ac89 train()
# 0x407590 main
# 0x7fe126149830 __libc_start_main
# 0x407db9 _start
# (nil) (unknown)
Aborted (core dumped)
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
Assuming we have a layer like this:
layer {
name: "fully-connected"
type: "InnerProduct"
bottom: "bottom"
top: "top"
inner_product_param {
num_output: 1
}
}
The output is batch_size x 1. In several papers (for example link1, page 3, picture at the top, or link2, page 4 at the top) I have seen such a layer used at the end to produce a 2D image for pixel-wise prediction. How is it possible to transform this into a 2D image? I was thinking of reshape or deconvolution, but I cannot figure out how that would work. A simple example would be helpful.
UPDATE: My input images are 304x228 and my ground_truth (depth images) are 75x55.
################# Main net ##################
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relufc6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4070
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
type: "Reshape"
name: "reshape"
bottom: "fc7"
top: "fc7_reshaped"
reshape_param {
shape { dim: 1 dim: 1 dim: 55 dim: 74 }
}
}
layer {
name: "deconv1"
type: "Deconvolution"
bottom: "fc7_reshaped"
top: "deconv1"
convolution_param {
num_output: 64
kernel_size: 5
pad: 2
stride: 1
#group: 256
weight_filler {
type: "bilinear"
}
bias_term: false
}
}
#########################
layer {
name: "conv6"
type: "Convolution"
bottom: "data"
top: "conv6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 63
kernel_size: 9
stride: 2
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "conv6"
top: "conv6"
}
layer {
name: "pool6"
type: "Pooling"
bottom: "conv6"
top: "pool6"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
########################
layer {
name: "concat"
type: "Concat"
bottom: "deconv1"
bottom: "pool6"
top: "concat"
concat_param {
concat_dim: 1
}
}
layer {
name: "conv7"
type: "Convolution"
bottom: "concat"
top: "conv7"
convolution_param {
num_output: 64
kernel_size: 5
pad: 2
stride: 1
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "conv7"
top: "conv7"
relu_param{
negative_slope: 0.01
engine: CUDNN
}
}
layer {
name: "conv8"
type: "Convolution"
bottom: "conv7"
top: "conv8"
convolution_param {
num_output: 64
kernel_size: 5
pad: 2
stride: 1
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu8"
type: "ReLU"
bottom: "conv8"
top: "conv8"
relu_param{
negative_slope: 0.01
engine: CUDNN
}
}
layer {
name: "conv9"
type: "Convolution"
bottom: "conv8"
top: "conv9"
convolution_param {
num_output: 1
kernel_size: 5
pad: 2
stride: 1
weight_filler {
type: "gaussian"
std: 0.011
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu9"
type: "ReLU"
bottom: "conv9"
top: "result"
relu_param{
negative_slope: 0.01
engine: CUDNN
}
}
log:
I1108 19:34:57.239722 4277 data_layer.cpp:41] output data size: 1,1,228,304
I1108 19:34:57.243340 4277 data_layer.cpp:41] output data size: 1,1,55,74
I1108 19:34:57.247392 4277 net.cpp:150] Setting up conv1
I1108 19:34:57.247407 4277 net.cpp:157] Top shape: 1 96 55 74 (390720)
I1108 19:34:57.248191 4277 net.cpp:150] Setting up pool1
I1108 19:34:57.248196 4277 net.cpp:157] Top shape: 1 96 27 37 (95904)
I1108 19:34:57.253263 4277 net.cpp:150] Setting up conv2
I1108 19:34:57.253276 4277 net.cpp:157] Top shape: 1 256 27 37 (255744)
I1108 19:34:57.254202 4277 net.cpp:150] Setting up pool2
I1108 19:34:57.254220 4277 net.cpp:157] Top shape: 1 256 13 18 (59904)
I1108 19:34:57.269943 4277 net.cpp:150] Setting up conv3
I1108 19:34:57.269961 4277 net.cpp:157] Top shape: 1 384 13 18 (89856)
I1108 19:34:57.285303 4277 net.cpp:150] Setting up conv4
I1108 19:34:57.285338 4277 net.cpp:157] Top shape: 1 384 13 18 (89856)
I1108 19:34:57.294801 4277 net.cpp:150] Setting up conv5
I1108 19:34:57.294841 4277 net.cpp:157] Top shape: 1 256 13 18 (59904)
I1108 19:34:57.295207 4277 net.cpp:150] Setting up pool5
I1108 19:34:57.295210 4277 net.cpp:157] Top shape: 1 256 6 9 (13824)
I1108 19:34:57.743222 4277 net.cpp:150] Setting up fc6
I1108 19:34:57.743259 4277 net.cpp:157] Top shape: 1 4096 (4096)
I1108 19:34:57.881680 4277 net.cpp:150] Setting up fc7
I1108 19:34:57.881718 4277 net.cpp:157] Top shape: 1 4070 (4070)
I1108 19:34:57.881826 4277 net.cpp:150] Setting up reshape
I1108 19:34:57.881846 4277 net.cpp:157] Top shape: 1 1 55 74 (4070)
I1108 19:34:57.884768 4277 net.cpp:150] Setting up conv6
I1108 19:34:57.885309 4277 net.cpp:150] Setting up pool6
I1108 19:34:57.885327 4277 net.cpp:157] Top shape: 1 63 55 74 (256410)
I1108 19:34:57.885395 4277 net.cpp:150] Setting up concat
I1108 19:34:57.885412 4277 net.cpp:157] Top shape: 1 64 55 74 (260480)
I1108 19:34:57.886759 4277 net.cpp:150] Setting up conv7
I1108 19:34:57.886786 4277 net.cpp:157] Top shape: 1 64 55 74 (260480)
I1108 19:34:57.897269 4277 net.cpp:150] Setting up conv8
I1108 19:34:57.897303 4277 net.cpp:157] Top shape: 1 64 55 74 (260480)
I1108 19:34:57.899129 4277 net.cpp:150] Setting up conv9
I1108 19:34:57.899138 4277 net.cpp:157] Top shape: 1 1 55 74 (4070)
The value of num_output of the last fully connected layer will not be 1 for pixel-wise prediction; it will be equal to w*h of the predicted image. What made you think the value would be 1?
Edit 1:
Below are the dimensions of each layer mentioned in link1 page 3 figure:
LAYER     OUTPUT DIM [c*h*w]
coarse1   96*h1*w1     conv layer
coarse2   256*h2*w2    conv layer
coarse3   384*h3*w3    conv layer
coarse4   384*h4*w4    conv layer
coarse5   256*h5*w5    conv layer
coarse6   4096*1*1     fc layer
coarse7   X*1*1        fc layer, where 'X' could be interpreted as w*h
To understand this further, let's assume we have a network that predicts the pixels of an image. The images are of size 10*10, so the final output of the fc layer will have dimension 100*1*1 (like in coarse7), which can be interpreted as 10*10.
Now the question is how a 1D array can correctly predict a 2D image. Note that the loss is computed on this output against labels corresponding to the pixel data, so during training the weights learn to predict the pixel values.
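A minimal NumPy sketch of this interpretation (dummy values, using the hypothetical 10*10 case above):

import numpy as np

fc_output = np.arange(100, dtype=np.float32)  # output of an fc layer with num_output = 100
depth_map = fc_output.reshape(10, 10)         # reinterpreted as a 10x10 image
print(depth_map.shape)  # (10, 10)

In Caffe, this reinterpretation is what the "Reshape" layer in your prototxt does when it turns fc7 into fc7_reshaped.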
EDIT 2:
Drawing the net with draw_net.py in Caffe gives you this (drawing not shown): the ReLU layers connected to conv6 and fc6 have the same name, leading to a complicated connectivity in the drawn image. I am not sure whether this will cause issues during training, but I would suggest renaming one of the ReLU layers to a unique name to avoid unforeseen issues.
Coming back to your question, there doesn't seem to be any upsampling happening after the fully connected layers. As seen in the log:
I1108 19:34:57.881680 4277 net.cpp:150] Setting up fc7
I1108 19:34:57.881718 4277 net.cpp:157] Top shape: 1 4070 (4070)
I1108 19:34:57.881826 4277 net.cpp:150] Setting up reshape
I1108 19:34:57.881846 4277 net.cpp:157] Top shape: 1 1 55 74 (4070)
I1108 19:34:57.884768 4277 net.cpp:150] Setting up conv6
I1108 19:34:57.885309 4277 net.cpp:150] Setting up pool6
I1108 19:34:57.885327 4277 net.cpp:157] Top shape: 1 63 55 74 (256410)
fc7 has an output dimension of 4070*1*1. This is reshaped to 1*55*74 and passed as input to the deconv1 layer (and, via the concat with pool6, on to conv7).
The output of the whole network is produced by conv9, which has an output dimension of 1*55*74, exactly matching the dimension of the labels (depth data).
If my answer is still not clear, please pinpoint where you feel the upsampling is happening.
If you simply need a fully-connected network like a conventional multi-layer perceptron, use 2D blobs (shape (N, D)) and the InnerProduct layer.
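If it helps to verify where the shapes change, a rough pycaffe sketch (assuming pycaffe is built; 'deploy.prototxt' is a placeholder for a deploy-style definition of this net) prints every blob shape in order:

import caffe

net = caffe.Net('deploy.prototxt', caffe.TEST)  # placeholder file name
for name, blob in net.blobs.items():
    print(name, blob.data.shape)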