Unknown blob input data to layer 0 in caffe

I am getting following error while using my caffe prototxt:
F0329 17:37:40.771555 24587 insert_splits.cpp:35] Unknown blob input data to layer 0
*** Check failure stack trace: ***
The first 2 layers in my caffe prototxt are given below:
layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  image_data_param {
    source: "train2.txt"
    batch_size: 100
    new_height: 28
    new_width: 28
    is_color: false
  }
}
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 3
  convolution_param {
    num_output: 8
    kernel_size: 9
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
What could be the possible reason for this?

It seems your IMAGE_DATA layer is only defined for the TRAIN phase, so the blobs data and label are not defined for the TEST phase. I suspect you see no error while the solver builds the train-phase net; the error only appears once the test-phase net is built.
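A minimal sketch of the fix, in the same old-style syntax as the question: add a second IMAGE_DATA layer scoped to the TEST phase (the file name test2.txt here is hypothetical; point it at your actual validation list):

```
layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  image_data_param {
    source: "test2.txt"   # hypothetical validation list
    batch_size: 100
    new_height: 28
    new_width: 28
    is_color: false
  }
}
```

Alternatively, if you never want a test phase, removing the test_iter/test_interval settings from the solver also avoids building a test net.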

Related

problems of training RCF model using caffe

I'm using caffe to train a model. I am sure I have connected the data layer to train.txt via the source of image_data_param, but when I run ./train.sh it always prompts: can not find image.
Ubuntu 18.04 openCV3 python2
layer {
  name: "data"
  type: "ImageLabelmapData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: false
    mean_value: 104.00699
    mean_value: 116.66877
    mean_value: 122.67892
  }
  image_data_param {
    root_folder: "/home/yogang/Desktop/rawdata/train/"
    source: "/home/yogang/Desktop/rawdata/train.txt"
    batch_size: 1
    shuffle: true
    new_height: 0
    new_width: 0
  }
}
I0828 19:29:13.834946 14079 layer_factory.hpp:77] Creating layer data
I0828 19:29:13.835011 14079 net.cpp:101] Creating Layer data
I0828 19:29:13.835031 14079 net.cpp:409] data -> data
I0828 19:29:13.835059 14079 net.cpp:409] data -> label
I0828 19:29:13.835124 14079 image_labelmap_data_layer.cpp:42] Opening file /home/yogang/Desktop/rawdata/train.txt
I0828 19:29:13.835505 14079 image_labelmap_data_layer.cpp:52] Shuffling data
I0828 19:29:13.835677 14079 image_labelmap_data_layer.cpp:57] A total of 242 images.
E0828 19:29:13.836748 14079 io.cpp:80] Could not open or find file /home/yogang/Desktop/rawdata/train//home/yogang/Desktop/rawdata/train/satellite144.jpg
E0828 19:29:13.836797 14079 io.cpp:80] Could not open or find file /home/yogang/Desktop/rawdata/train//home/yogang/Desktop/rawdata/train/400.jpg
F0828 19:29:13.836818 14079 image_labelmap_data_layer.cpp:86] Check failed: cv_img.data Could not load /home/yogang/Desktop/rawdata/train/satellite144.jpg
*** Check failure stack trace: ***
./train.sh: line 8: 14079 Aborted (core dumped) ./solve.py
The image path is wrong: the log shows root_folder being prepended to paths that are already absolute. Check the paths in your train.txt.
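The doubled path in the log (root_folder prepended to an already-absolute path) points at the cause: when root_folder is set, each line of train.txt is appended to it, so the list should contain paths relative to root_folder. A hypothetical train.txt for the layer above, with an image and a labelmap per line (the label file names are made up for illustration):

```
satellite144.jpg satellite144.png
400.jpg 400.png
```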

Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM for FASTER RCNN Library

I am testing Faster R-CNN. The installation completed fine; during installation I had one issue with cudnn5.1, which I resolved by following the suggestion here.
Now I run the demo as
./tools/demo.py
and get this error:
I1117 09:48:41.011925 12503 net.cpp:51] Initializing net from parameters:
name: "VGG_ILSVRC_16_layers"
state {
  phase: TEST
  level: 0
}
.
.
.
layer {
  name: "cls_prob"
  type: "Softmax"
  bottom: "cls_score"
  top: "cls_prob"
}
I1117 09:48:41.012234 12503 layer_factory.hpp:77] Creating layer input
I1117 09:48:41.012251 12503 net.cpp:84] Creating Layer input
I1117 09:48:41.012259 12503 net.cpp:380] input -> data
I1117 09:48:41.012271 12503 net.cpp:380] input -> im_info
I1117 09:48:41.328574 12503 net.cpp:122] Setting up input
I1117 09:48:41.328608 12503 net.cpp:129] Top shape: 1 3 224 224 (150528)
I1117 09:48:41.328614 12503 net.cpp:129] Top shape: 1 3 (3)
I1117 09:48:41.328618 12503 net.cpp:137] Memory required for data: 602124
I1117 09:48:41.328624 12503 layer_factory.hpp:77] Creating layer conv1_1
I1117 09:48:41.328655 12503 net.cpp:84] Creating Layer conv1_1
I1117 09:48:41.328660 12503 net.cpp:406] conv1_1 <- data
I1117 09:48:41.328670 12503 net.cpp:380] conv1_1 -> conv1_1
F1117 09:48:41.676553 12503 cudnn.hpp:128] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
*** Check failure stack trace: ***
Aborted (core dumped)
What is wrong with my installation for this Faster R-CNN?
I have cuda8.0, and libcudnn5_5.1.10-1+cuda8.0 is installed on Ubuntu16.04.
I have a Quadro K4200 graphics card.
It works for me now. libcudnn5_5.1 is built for CUDA 7.5 (you can check this in cuDNN's user guide under GPU and driver requirements), so I switched to cuDNN v6.0, which is for CUDA 8.0.
You may then face the issue
Check failed: error == cudaSuccess (8 vs. 0) invalid device function
For that, edit py-faster-rcnn/lib/fast_rcnn/config.py and change
__C.USE_GPU_NMS = True
to
__C.USE_GPU_NMS = False
After that it works.

Cannot find mean file while training on Imagenet

I am trying to train and validate a network on Imagenet. The validation process works without any problems (with the pretrained weights). However, when I try to perform the training, an error appears saying that the imagenet_mean.binaryproto file is not found; the very same file worked for the validation process. What is wrong?
...
I0222 15:29:15.108032 15823 net.cpp:399] data -> label
I0222 15:29:15.108057 15823 data_transformer.cpp:25] Loading mean file from: /home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto
F0222 15:29:15.108577 15830 db_lmdb.hpp:14] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
# 0x7fc82857edaa (unknown)
# 0x7fc82857ece4 (unknown)
# 0x7fc82857e6e6 (unknown)
# 0x7fc828581687 (unknown)
# 0x7fc828ba115e caffe::db::LMDB::Open()
# 0x7fc828b75644 caffe::DataReader::Body::InternalThreadEntry()
# 0x7fc828cc1470 caffe::InternalThread::entry()
# 0x7fc81f4a8a4a (unknown)
# 0x7fc826a98184 start_thread
# 0x7fc8271b437d (unknown)
# (nil) (unknown)
Aborted (core dumped)
Here is the prototxt I am using:
name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto"
    #mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  # mean pixel / channel-wise mean instead of mean image
  # transform_param {
  #   crop_size: 227
  #   mean_value: 104
  #   mean_value: 117
  #   mean_value: 123
  #   mirror: true
  # }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "/home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto"
    #mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  # mean pixel / channel-wise mean instead of mean image
  # transform_param {
  #   crop_size: 227
  #   mean_value: 104
  #   mean_value: 117
  #   mean_value: 123
  #   mirror: false
  # }
  data_param {
    source: "/sdc/repository/myuser/Imagenet2012/Imagenet2012trainLMDB"
    #source: "examples/imagenet/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  …
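One thing worth noting: the fatal check comes from db_lmdb.hpp, and the stack trace shows caffe::db::LMDB::Open(), so the file that cannot be found may be the LMDB data source rather than the mean file. The TRAIN source above is a relative path, resolved against whatever directory caffe is launched from. A quick sketch (paths taken from the prototxt above) to verify both from the launch directory:

```python
import os

def check_paths(paths):
    """Map each path to whether it exists on disk."""
    return {p: os.path.exists(p) for p in paths}

# The relative LMDB source is resolved against the directory caffe is
# launched from, not against the prototxt's location.
paths = [
    "/home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto",
    "examples/imagenet/ilsvrc12_train_lmdb",
]
for p, ok in check_paths(paths).items():
    print(p, "->", "exists" if ok else "MISSING")
```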

Caffe: change data layer after surgery

I trained an FC network with an HDF5 data layer, then used surgery to transplant it to a convolutional network, then changed the data layer to a probe-suitable data layer, i.e.:
from:
layer {
  name: "layer_data_left"
  type: "HDF5Data"
  top: "data_left"
  top: "labels_left"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "/home/me/Desktop/trainLeftPatches.txt"
    batch_size: 128
  }
}
to
layer {
  name: "data_left"
  type: "Input"
  top: "data_right"
  input_param { shape: { dim: 1 dim: 1 dim: 1241 dim: 367 } }
}
Is there any reason this would run out of memory?
>>> fc_net.forward()
F0729 20:02:02.205382 6821 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
Or, is it more likely that I made a mistake somewhere in surgery & exchanging data layers?
Thank you.
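For scale, a rough back-of-envelope sketch (the patch size and the 64-feature-map first conv layer below are assumptions, since the conv net itself isn't shown): a single full-resolution 1241x367 input makes each conv layer's top blob take over a hundred megabytes once it has a few dozen channels, where a 128-patch HDF5 batch was tiny.

```python
def blob_bytes(n, c, h, w, dtype_bytes=4):
    """Memory for one float32 blob of shape (n, c, h, w)."""
    return n * c * h * w * dtype_bytes

# Training batch: 128 single-channel patches (assumed 32x32 for illustration).
patch_batch = blob_bytes(128, 1, 32, 32)

# Probe input: one full 1x1x1241x367 image, then a hypothetical first
# conv layer producing 64 feature maps at the same resolution.
full_input = blob_bytes(1, 1, 1241, 367)
conv_top = blob_bytes(1, 64, 1241, 367)

for name, b in [("128-patch batch", patch_batch),
                ("full-image input", full_input),
                ("64-map conv top", conv_top)]:
    print(f"{name}: {b / 2**20:.1f} MiB")
```

Since every layer's top blob (plus its diff, if gradients are kept) lives on the GPU at the same time, a deep net at this resolution can easily exhaust a small GPU even though the surgery itself was fine; running the probe net in CPU mode or on a cropped input is a common way to check.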

How do I specify num_test_nets in restart?

I trained a GoogleNet model for a while, and now I'd like to restart from a checkpoint, adding a test phase. I already have the test phase in my train_val.prototxt file, and I added the proper parameters to my solver.prototxt, but I get an error on the restart:
I0712 15:53:02.615947 47646 net.cpp:278] This network produces output loss2/loss1
I0712 15:53:02.615964 47646 net.cpp:278] This network produces output loss3/loss3
I0712 15:53:02.616109 47646 net.cpp:292] Network initialization done.
F0712 15:53:02.616665 47646 solver.cpp:128] Check failed: param_.test_iter_size() == num_test_nets (1 vs. 0) test_iter must be specified for each test network.
*** Check failure stack trace: ***
# 0x7f550cf70e6d (unknown)
# 0x7f550cf72ced (unknown)
# 0x7f550cf70a5c (unknown)
# 0x7f550cf7363e (unknown)
# 0x7f550d3b605b caffe::Solver<>::InitTestNets()
# 0x7f550d3b63ed caffe::Solver<>::Init()
# 0x7f550d3b6738 caffe::Solver<>::Solver()
# 0x7f550d4fa633 caffe::Creator_SGDSolver<>()
# 0x7f550da5bb76 caffe::SolverRegistry<>::CreateSolver()
# 0x7f550da548f4 train()
# 0x7f550da52316 main
# 0x7f5508f43b15 __libc_start_main
# 0x7f550da52d3d (unknown)
solver.prototxt
train_net: "<my_path>/train_val.prototxt"
test_iter: 1000
test_interval: 4000
test_initialization: false
display: 40
average_loss: 40
base_lr: 0.01
lr_policy: "step"
stepsize: 320000
gamma: 0.96
max_iter: 10000000
momentum: 0.9
weight_decay: 0.0002
snapshot: 40000
snapshot_prefix: "models/<my_path>"
solver_mode: CPU
train_val.prototxt train and test layers:
name: "GoogleNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "/<blah>/ilsvrc12_train_lmdb"
    batch_size: 32
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: true
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "/<blah>/ilsvrc12_val_lmdb"
    batch_size: 32
    backend: LMDB
  }
}
You should modify one place in your solver.prototxt: change
train_net: "/train_val.prototxt"
to
net: "/train_val.prototxt"
The solver does not use the value of "train_net" to initialize a test net, so the test phase you added was never found by the solver. In fact, "train_net" and "test_net" each initialize only a train net or a test net respectively, while "net" is used for both.
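For illustration, the top of the corrected solver.prototxt (paths as in the question; everything from test_interval down stays the same):

```
# "net" builds both the train and test nets from one file, using the
# include { phase: ... } blocks to select layers per phase:
net: "<my_path>/train_val.prototxt"
test_iter: 1000
test_interval: 4000
```

Alternatively, keep train_net and add a separate test_net entry pointing at a test-only prototxt.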