I trained a GoogleNet model for a while, and now I'd like to restart from a checkpoint, adding a test phase. I have the test already in my train_val.prototxt file, and I added the proper parameters to my solver.prototxt ... but I get an error on the restart:
I0712 15:53:02.615947 47646 net.cpp:278] This network produces output loss2/loss1
I0712 15:53:02.615964 47646 net.cpp:278] This network produces output loss3/loss3
I0712 15:53:02.616109 47646 net.cpp:292] Network initialization done.
F0712 15:53:02.616665 47646 solver.cpp:128] Check failed: param_.test_iter_size() == num_test_nets (1 vs. 0) test_iter must be specified for each test network.
*** Check failure stack trace: ***
# 0x7f550cf70e6d (unknown)
# 0x7f550cf72ced (unknown)
# 0x7f550cf70a5c (unknown)
# 0x7f550cf7363e (unknown)
# 0x7f550d3b605b caffe::Solver<>::InitTestNets()
# 0x7f550d3b63ed caffe::Solver<>::Init()
# 0x7f550d3b6738 caffe::Solver<>::Solver()
# 0x7f550d4fa633 caffe::Creator_SGDSolver<>()
# 0x7f550da5bb76 caffe::SolverRegistry<>::CreateSolver()
# 0x7f550da548f4 train()
# 0x7f550da52316 main
# 0x7f5508f43b15 __libc_start_main
# 0x7f550da52d3d (unknown)
solver.prototxt
train_net: "<my_path>/train_val.prototxt"
test_iter: 1000
test_interval: 4000
test_initialization: false
display: 40
average_loss: 40
base_lr: 0.01
lr_policy: "step"
stepsize: 320000
gamma: 0.96
max_iter: 10000000
momentum: 0.9
weight_decay: 0.0002
snapshot: 40000
snapshot_prefix: "models/<my_path>"
solver_mode: CPU
train_val.prototxt train and test layers:
name: "GoogleNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 224
mean_value: 104
mean_value: 117
mean_value: 123
}
data_param {
source: "/<blah>/ilsvrc12_train_lmdb"
batch_size: 32
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: true
crop_size: 224
mean_value: 104
mean_value: 117
mean_value: 123
}
data_param {
source: "/<blah>/ilsvrc12_val_lmdb"
batch_size: 32
backend: LMDB
}
}
You should modify one place in your solver.prototxt from
train_net: "/train_val.prototxt"
to
net: "/train_val.prototxt"
The solver does not use the value of "train_net" to initialize a test net, so the test phase you added was never found by the solver.
In fact, "train_net" and "test_net" initialize only a train net and a test net, respectively, while "net" is used for both.
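As a sketch using the values already present in the question's solver.prototxt, the top of the file would then read:
net: "<my_path>/train_val.prototxt"   # used to build both the TRAIN and TEST nets
test_iter: 1000             # forward passes per test run
test_interval: 4000         # run the test net every 4000 training iterations
test_initialization: false
Alternatively, you could keep "train_net" and add an explicit "test_net" entry pointing to a prototxt that defines the test network, but since your train_val.prototxt already holds both phases, switching to "net" is the simpler fix.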
Related
I'm using LiipImagineBundle. It works just fine on localhost, but when I use the bundle with the same version on the server (shared hosting with public_html as the public folder), images are no longer loaded or resolved.
I really appreciate the help.
liip_imagine.yaml
# See docs on how to configure the bundle: https://symfony.com/doc/current/bundles/LiipImagineBundle/basic-usage.html
liip_imagine:
    # valid drivers options include "gd" or "gmagick" or "imagick"
    driver: "gd"
    resolvers:
        default:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product_detail:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product_mobile:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product_list:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product_category:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product_home:
            web_path:
                web_root: "%kernel.project_dir%/public"
        product_home_sales:
            web_path:
                web_root: "%kernel.project_dir%/public"
        category_home:
            web_path:
                web_root: "%kernel.project_dir%/public"
        bg_home:
            web_path:
                web_root: "%kernel.project_dir%/public"
    loaders:
        product:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        product_list:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        product_mobile:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        product_detail:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        product_category:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        product_home:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        product_home_sales:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/products"
        category_home:
            filesystem:
                data_root: "%kernel.project_dir%/public/uploads/categories"
        bg_home:
            filesystem:
                data_root: "%kernel.project_dir%/public/front/img"
    webp:
        generate: false
    filter_sets:
        cache: ~
        # the name of the "filter set"
        products_list:
            cache: product
            data_loader: product
            # adjust the image quality to 100%
            quality: 100
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 368
                    height: 368
                #thumbnail: { size: [314, 211], mode: outbound }
        bg_home:
            cache: bg_home
            data_loader: bg_home
            # adjust the image quality to 100%
            quality: 100
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 414
                    height: 221
                #thumbnail: { size: [314, 211], mode: outbound }
        products_display:
            cache: product
            data_loader: product_list
            # adjust the image quality to 100%
            quality: 100
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 350
                    height: 324
                #thumbnail: { size: [314, 211], mode: outbound }
        product_category:
            cache: product
            data_loader: product_category
            # adjust the image quality to 100%
            quality: 100
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 255
                    height: 324
                #thumbnail: { size: [314, 211], mode: outbound }
        product_mobile:
            cache: product
            data_loader: product_mobile
            # adjust the image quality to 100%
            quality: 100
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 338
                    height: 400
                #thumbnail: { size: [314, 211], mode: outbound }
        product_thumb:
            cache: product_detail
            data_loader: product_detail
            # adjust the image quality to 75%
            quality: 75
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 60
                    height: 50
                #thumbnail: { size: [850, 450], mode: outbound }
        product_home:
            cache: product_home
            data_loader: product_home
            # adjust the image quality to 75%
            quality: 75
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 270
                    height: 344
                #thumbnail: { size: [850, 450], mode: outbound }
        product_home_sales:
            cache: product_home_sales
            data_loader: product_home_sales
            # adjust the image quality to 75%
            quality: 75
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 210
                    height: 260
                #thumbnail: { size: [850, 450], mode: outbound }
        category_home:
            cache: category_home
            data_loader: category_home
            # adjust the image quality to 75%
            quality: 75
            # list of transformations to apply (the "filters")
            filters:
                fixed:
                    width: 376
                    height: 231
                #thumbnail: { size: [850, 450], mode: outbound }
I really appreciate the help; I've been stuck on this error forever now.
I'm using Caffe to train a model. I am sure I have connected the data layer to train.txt via the source field of image_data_param, but when I run ./train.sh it always reports that it cannot find the images.
Ubuntu 18.04, OpenCV 3, Python 2
layer {
name: "data"
type: "ImageLabelmapData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: false
mean_value: 104.00699
mean_value: 116.66877
mean_value: 122.67892
}
image_data_param {
root_folder: "/home/yogang/Desktop/rawdata/train/"
source: "/home/yogang/Desktop/rawdata/train.txt"
batch_size: 1
shuffle: true
new_height:0
new_width: 0
}
}
I0828 19:29:13.834946 14079 layer_factory.hpp:77] Creating layer data
I0828 19:29:13.835011 14079 net.cpp:101] Creating Layer data
I0828 19:29:13.835031 14079 net.cpp:409] data -> data
I0828 19:29:13.835059 14079 net.cpp:409] data -> label
I0828 19:29:13.835124 14079 image_labelmap_data_layer.cpp:42] Opening file /home/yogang/Desktop/rawdata/train.txt
I0828 19:29:13.835505 14079 image_labelmap_data_layer.cpp:52] Shuffling data
I0828 19:29:13.835677 14079 image_labelmap_data_layer.cpp:57] A total of 242 images.
E0828 19:29:13.836748 14079 io.cpp:80] Could not open or find file /home/yogang/Desktop/rawdata/train//home/yogang/Desktop/rawdata/train/satellite144.jpg
E0828 19:29:13.836797 14079 io.cpp:80] Could not open or find file /home/yogang/Desktop/rawdata/train//home/yogang/Desktop/rawdata/train/400.jpg
F0828 19:29:13.836818 14079 image_labelmap_data_layer.cpp:86] Check failed: cv_img.data Could not load /home/yogang/Desktop/rawdata/train/satellite144.jpg
*** Check failure stack trace: ***
./train.sh: line 8: 14079 Aborted (core dumped) ./solve.py
I think your image paths are wrong; check the entries in train.txt. The log shows root_folder being prepended to paths that are already absolute (/home/yogang/Desktop/rawdata/train//home/yogang/Desktop/rawdata/train/satellite144.jpg), so either make the paths in train.txt relative to root_folder, or leave root_folder empty.
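As a minimal sketch, with root_folder kept as "/home/yogang/Desktop/rawdata/train/", each line of train.txt would list an image and its label map relative to that folder. The label file names below are hypothetical; only the image names appear in the log:
satellite144.jpg labels/satellite144.png
400.jpg labels/400.png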
I am getting the following error while using my Caffe prototxt:
F0329 17:37:40.771555 24587 insert_splits.cpp:35] Unknown blob input data to layer 0
*** Check failure stack trace: ***
The first two layers in my Caffe prototxt are given below:
layers {
name: "data"
type: IMAGE_DATA
top: "data"
top: "label"
include {
phase: TRAIN
}
image_data_param {
source: "train2.txt"
batch_size: 100
new_height: 28
new_width: 28
is_color: false
}
}
layers {
name: "conv1"
type: CONVOLUTION
bottom: "data"
top: "conv1"
blobs_lr: 1
blobs_lr: 3
convolution_param {
num_output: 8
kernel_size: 9
stride: 1
weight_filler { type: "xavier" }
bias_filler { type: "constant" }
}
}
What could be the possible reason for this?
It seems like your IMAGE_DATA layer is only defined for the TRAIN phase, so the blobs data and label are not defined for the TEST phase. I suspect you see no error while the solver builds the train-phase net; the error only appears when the test-phase net is built.
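A minimal sketch of an additional TEST-phase data layer in the same old-style syntax; the validation list file val2.txt is an assumption:
layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  image_data_param {
    source: "val2.txt"   # hypothetical validation list
    batch_size: 100
    new_height: 28
    new_width: 28
    is_color: false
  }
}
Alternatively, remove the test settings (test_iter, test_interval) from the solver so that no test-phase net is built at all.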
I am trying to train and validate a network on ImageNet. The validation process works without any problems (with the pretrained weights). However, when I try to perform the training, I get an error that the imagenet_mean.binaryproto file is not found; the very same file that worked for the validation process. What is wrong?
...
I0222 15:29:15.108032 15823 net.cpp:399] data -> label
I0222 15:29:15.108057 15823 data_transformer.cpp:25] Loading mean file from: /home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto
F0222 15:29:15.108577 15830 db_lmdb.hpp:14] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
# 0x7fc82857edaa (unknown)
# 0x7fc82857ece4 (unknown)
# 0x7fc82857e6e6 (unknown)
# 0x7fc828581687 (unknown)
# 0x7fc828ba115e caffe::db::LMDB::Open()
# 0x7fc828b75644 caffe::DataReader::Body::InternalThreadEntry()
# 0x7fc828cc1470 caffe::InternalThread::entry()
# 0x7fc81f4a8a4a (unknown)
# 0x7fc826a98184 start_thread
# 0x7fc8271b437d (unknown)
# (nil) (unknown)
Aborted (core dumped)
Here is the prototxt I am using:
name: "CaffeNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "/home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto"
#mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
# mean pixel / channel-wise mean instead of mean image
# transform_param {
# crop_size: 227
# mean_value: 104
# mean_value: 117
# mean_value: 123
# mirror: true
# }
data_param {
source: "examples/imagenet/ilsvrc12_train_lmdb"
batch_size: 256
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 227
mean_file: "/home/myuser/learning/caffe/data/ilsvrc12/imagenet_mean.binaryproto"
#mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
# mean pixel / channel-wise mean instead of mean image
# transform_param {
# crop_size: 227
# mean_value: 104
# mean_value: 117
# mean_value: 123
# mirror: false
# }
data_param {
source: "/sdc/repository/myuser/Imagenet2012/Imagenet2012trainLMDB"
#source: "examples/imagenet/ilsvrc12_val_lmdb"
batch_size: 50
backend: LMDB
}
}
layer {
name: "conv1"
…
I trained an FC network with an HDF5 data layer, then used net surgery to transplant the weights into a convolutional network, and then changed the data layer to a probe-suitable data layer, i.e.:
from:
layer {
name: "layer_data_left"
type: "HDF5Data"
top: "data_left"
top: "labels_left"
include {
phase: TRAIN
}
hdf5_data_param {
source: "/home/me/Desktop/trainLeftPatches.txt"
batch_size: 128
}
}
to
layer {
name: "data_left"
type: "Input"
top: "data_right"
input_param { shape: { dim: 1 dim: 1 dim: 1241 dim: 367 } }
}
Is there any reason this would go out of memory?
>>> fc_net.forward()
F0729 20:02:02.205382 6821 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
Or, is it more likely that I made a mistake somewhere in surgery & exchanging data layers?
Thank you.