Fine tuning a pre-trained network with new data - caffe

I have a pre-trained network (the prototxt definition and the binary caffemodel with the weights) designed for image recognition. I got it online without knowing how it was trained or on which data, and I haven't seen the solver file.
The network has 3 layers (as far as I can tell; I have 3 prototxt files).
I'm trying to add another "feature" to the network: make it recognize some pose as well.
The steps I've taken so far:
- Add another output to the last layer, similar to the outputs that were already there
- Process the image database through the first two layers, and save the output to lmdb
- Create a new solver for fine-tuning
- Create a train_test prototxt for fine-tuning the last layer
Running "caffe train" with the solver simply crashes.
I tried to find out more by loading the net in Python:
caffe.Net(train_test_file_path)
I got:
I0703 11:10:54.095563 21756 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0703 11:10:54.095655 21756 net.cpp:51] Initializing net from parameters:
<train_test_file_content>
I0703 11:10:54.096817 21756 layer_factory.hpp:77] Creating layer data
I0703 11:10:54.097033 21756 db_lmdb.cpp:35] Opened lmdb /home/user/yaw_db/test/lmdb/
I0703 11:10:54.097090 21756 net.cpp:84] Creating Layer data
I0703 11:10:54.097111 21756 net.cpp:380] data -> data
I0703 11:10:54.097158 21756 net.cpp:380] data -> label
I0703 11:10:54.097657 21756 data_layer.cpp:45] output data size: 50,1,1,193536
I0703 11:10:54.097937 21756 net.cpp:122] Setting up data
I0703 11:10:54.097960 21756 net.cpp:129] Top shape: 50 1 1 193536 (9676800)
I0703 11:10:54.097983 21756 net.cpp:129] Top shape: 50 (50)
I0703 11:10:54.097999 21756 net.cpp:137] Memory required for data: 38707400
I0703 11:10:54.098014 21756 layer_factory.hpp:77] Creating layer label_data_1_split
I0703 11:10:54.098047 21756 net.cpp:84] Creating Layer label_data_1_split
I0703 11:10:54.098063 21756 net.cpp:406] label_data_1_split <- label
I0703 11:10:54.098084 21756 net.cpp:380] label_data_1_split -> label_data_1_split_0
I0703 11:10:54.098106 21756 net.cpp:380] label_data_1_split -> label_data_1_split_1
I0703 11:10:54.098131 21756 net.cpp:122] Setting up label_data_1_split
I0703 11:10:54.098145 21756 net.cpp:129] Top shape: 50 (50)
I0703 11:10:54.098163 21756 net.cpp:129] Top shape: 50 (50)
I0703 11:10:54.098176 21756 net.cpp:137] Memory required for data: 38707800
I0703 11:10:54.098188 21756 layer_factory.hpp:77] Creating layer conv1_3
I0703 11:10:54.098212 21756 net.cpp:84] Creating Layer conv1_3
I0703 11:10:54.098227 21756 net.cpp:406] conv1_3 <- data
I0703 11:10:54.098245 21756 net.cpp:380] conv1_3 -> conv1_3
F0703 11:10:54.098325 21756 blob.cpp:32] Check failed: shape[i] >= 0 (-1 vs. 0)
*** Check failure stack trace: ***
Aborted (core dumped)
Opening the lmdb I've created and using stat() on it produced:
{'branch_pages': 1,
'depth': 2,
'entries': 12651,
'leaf_pages': 75,
'overflow_pages': 561233,
'psize': 4096}
Searching the internet gave me a vague idea that perhaps I saved the processed images incorrectly.
Any further ideas?
PS. I am very new to Caffe, neural networks etc., so I might be missing even the simplest of things.

You saved your intermediate features into an lmdb file ('/home/user/yaw_db/test/lmdb').
The data there is stored as a collection of 1x1x193,536-dimensional features, and you read them in batches of 50. You can see this in your log:
I0703 11:10:54.097657 21756 data_layer.cpp:45] output data size: 50,1,1,193536
Now it seems you are trying to apply a 3x3 convolution (at layer 'conv1_3'). However, the spatial dimensions of your input blob are 1x193,536: the blob has a height of only 1, which is not enough for a 3x3 kernel. The computed output height comes out negative, which is why you get the error
F0703 11:10:54.098325 21756 blob.cpp:32] Check failed: shape[i] >= 0 (-1 vs. 0)
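To see where the -1 comes from, here is a minimal sketch of the usual convolution output-size arithmetic. The 3x3 kernel, zero padding and stride 1 are assumptions (the prototxt for conv1_3 is not shown in the question), and conv_output_size is a hypothetical helper, not a Caffe function.

```python
def conv_output_size(input_size, kernel, pad=0, stride=1):
    """Output length along one axis: (input + 2*pad - kernel) / stride + 1."""
    span = input_size + 2 * pad - kernel
    # Truncate toward zero like C++ integer division (Python's // floors instead).
    quotient = abs(span) // stride
    if span < 0:
        quotient = -quotient
    return quotient + 1


# Shape taken from the log: each item is 1 x 1 x 193536 (channels x height x width).
height, width = 1, 193536

print(conv_output_size(height, kernel=3))  # -1
print(conv_output_size(width, kernel=3))   # 193534
```

A height of -1 is exactly the value that fails the shape[i] >= 0 check. Storing the features with their original channels x height x width shape, rather than flattened into one long row, would presumably avoid it.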

Related

Pytorch Convolutional Autoencoder output is blurry. How to improve it?

I created a convolutional autoencoder using PyTorch and I'm trying to improve it.
For the encoder I use the first 4 layers of a pre-trained ResNet 18 model from torchvision.models.resnet.
The middle part is a single convolutional layer with input and output channel sizes of 512. For the decoder I use convolutional layers, each followed by BatchNorm and a ReLU activation function.
The decoder reduces the number of channels at each layer: 512 -> 256 -> 128 -> 64 -> 32 -> 16 -> 3, and increases the resolution of the image by interpolation to match the dimensions of the corresponding layer in the encoder. For the last layer I use sigmoid instead of ReLU.
All convolutional layers are:
self.up = nn.Sequential(
    nn.Conv2d(input_channels, output_channels,
              kernel_size=5, stride=1,
              padding=2, bias=False),
    nn.BatchNorm2d(output_channels),
    nn.ReLU()
)
The input images are scaled to the [0, 1] range and have shape 224x224x3. Sample outputs (the first is from the training set, the second from the test set):
[Images: first image and its output; second image and its output]
Any ideas why the output is blurry? The model has been trained for around 160 epochs on ~16000 images using the Adam optimizer with lr=0.00005. I'm thinking about adding one more convolutional layer to the self.up block given above. This will increase the complexity of the model, but I'm not sure it is the right way to improve it.
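For a rough sense of how much capacity the decoder already has, the convolution weights can be counted by hand. This is only a sketch: it assumes one bias-free 5x5 convolution per step in the channel sequence above and ignores the BatchNorm parameters; conv_weight_count is a hypothetical helper.

```python
def conv_weight_count(in_channels, out_channels, kernel_size=5):
    """Number of weights in a bias-free 2D convolution with a square kernel."""
    return in_channels * out_channels * kernel_size * kernel_size


# Channel sequence of the decoder as described in the question.
channels = [512, 256, 128, 64, 32, 16, 3]

per_layer = [conv_weight_count(c_in, c_out)
             for c_in, c_out in zip(channels, channels[1:])]
total = sum(per_layer)

print(per_layer)  # [3276800, 819200, 204800, 51200, 12800, 1200]
print(total)      # 4366000
```

Under these assumptions about three quarters of the decoder weights sit in the first step (512 -> 256).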

Check failed: error == cudaSuccess (3 vs. 0) initialization error *** Check failure stack trace: **

I am very new to CUDA and Caffe. In my code I use one Caffe model for text-line detection and another Caffe model for character recognition. After detecting the text lines, I process all of them in parallel for segmentation and then recognition. However, during recognition I get the following error:
I0503 14:40:41.661458 3996 net.cpp:436] Input 0 -> data
I0503 14:40:41.661509 3996 layer_factory.hpp:76] Creating layer scale
I0503 14:40:41.661527 3996 net.cpp:111] Creating Layer scale
I0503 14:40:41.661536 3996 net.cpp:478] scale <- data
I0503 14:40:41.661545 3996 net.cpp:434] scale -> scaled
I0503 14:40:41.661563 3996 net.cpp:156] Setting up scale
I0503 14:40:41.661576 3996 net.cpp:164] Top shape: 1 1 20 20 (400)
I0503 14:40:41.661583 3996 layer_factory.hpp:76] Creating layer conv1
I0503 14:40:41.661597 3996 net.cpp:111] Creating Layer conv1
I0503 14:40:41.661605 3996 net.cpp:478] conv1 <- scaled
I0503 14:40:41.661615 3996 net.cpp:434] conv1 -> conv1
F0503 14:40:41.661710 3996 syncedmem.hpp:19] Check failed: error == cudaSuccess (3 vs. 0) initialization error
*** Check failure stack trace: ***
How can I fix this?
I also ran into this problem. Here are some suggestions that may help:
- Initialize the whole Caffe net separately in each thread.
- The Caffe::mode_ variable that controls the CPU/GPU mode is thread-local, so make sure you call caffe.set_mode_gpu() in each thread before running any Caffe functions.
- Use the threading module instead of the multiprocessing module.
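The suggestions above can be sketched as a per-thread initialization pattern. This is a minimal illustration using only the standard library: make_net and recognize are hypothetical callables standing in for the real pycaffe code, and the comments mark where caffe.set_mode_gpu() and the net construction would go.

```python
import threading

_local = threading.local()  # each thread sees its own attributes on this object


def get_net(make_net):
    """Return the calling thread's own net, building it on first use."""
    if not hasattr(_local, "net"):
        # With pycaffe, this is where caffe.set_mode_gpu() would be called,
        # followed by building the recognition net for this thread.
        _local.net = make_net()
    return _local.net


def recognize_all(textlines, make_net, recognize):
    """Run recognize(net, line) for every line, one thread per line."""
    results = [None] * len(textlines)

    def worker(i, line):
        results[i] = recognize(get_net(make_net), line)

    threads = [threading.Thread(target=worker, args=(i, line))
               for i, line in enumerate(textlines)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Stand-ins for the real model, for illustration only.
print(recognize_all(["abc", "de"], lambda: object(), lambda net, line: line.upper()))
```

Because the net is kept in thread-local storage, no thread ever touches a net that another thread created.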

How a subset of pretrained caffe model can be saved?

I am working with a pretrained Caffe model (in Python) which has 3 layers. I want to decompose this model and create a new model that is the same as its first layer. For example:
Original Caffe model
data -> conv1_1 -> conv1_2 -> conv2_1 -> conv2_2 -> conv3_1 -> conv3_2
New Caffe model
data -> conv1_1 -> conv1_2
Can anybody help me?
Python exposes the data inside the .caffemodel file. It can be accessed as an array. For example,
net = caffe.Net('path/to/conv.prototxt', 'path/to/conv.caffemodel', caffe.TEST)
W = net.params['con_1'][0].data[...]
b = net.params['con_1'][1].data[...]
You can copy this data into a new file and save it as a .caffemodel file. Have a look at this and this.
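A minimal sketch of that copy step follows. It assumes you have written a second prototxt containing only the layers you want to keep; the paths and layer names passed in are placeholders, and copy_layer_params and save_subnet are hypothetical helpers, not part of pycaffe.

```python
def copy_layer_params(src_params, dst_params, layer_names):
    """Copy every parameter blob (weights, biases) of the named layers."""
    for name in layer_names:
        for src_blob, dst_blob in zip(src_params[name], dst_params[name]):
            # In-place assignment keeps the destination blob's own storage.
            dst_blob.data[...] = src_blob.data


def save_subnet(full_proto, full_weights, sub_proto, out_weights, layer_names):
    """Load the full net, copy the chosen layers into the sub-net, save it."""
    import caffe  # imported lazily so copy_layer_params works without pycaffe

    full_net = caffe.Net(full_proto, full_weights, caffe.TEST)
    sub_net = caffe.Net(sub_proto, caffe.TEST)
    copy_layer_params(full_net.params, sub_net.params, layer_names)
    sub_net.save(out_weights)
```

For the example in the question, layer_names would be ['conv1_1', 'conv1_2'], provided those are the names used in both prototxt files.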

Caffe: Why in Imagenet model activation dimension of fc6 is 4096?

I was looking at the pool5 layer in the ImageNet-trained model (ILSVRC challenge), whose output size is 256*6*6 (= 9216, i.e. ~9000), after which there is an fc6 layer whose num_output is 4096. Can anyone please explain how 4096 was chosen?

caffe - Input data, train and test set [duplicate]

I started with Caffe and the mnist example ran well.
I have my training data and labels in data.mat: 300 training samples with 30 features each, and labels in {-1, +1}.
However, I don't quite understand how to use Caffe with my own dataset.
Is there a step-by-step tutorial I can follow?
Many thanks! Any advice would be appreciated.
I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.
First, save your data in Matlab to an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30 and the labels are stored in y, a 300-by-1 vector:
hdf5write('my_data.h5', '/X', ...
    single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
hdf5write('my_data.h5', '/label', ...
    single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ), ...
    'WriteMode', 'append' );
Note that the data is saved as a 4D array: as Caffe reads it, the first dimension is the number of samples (feature vectors), the second is the dimension of each feature vector, and the last two are 1 (representing no spatial dimensions). Also note that the names given to the data in the HDF5 file are "X" and "label"; these names should be used as the "top" blobs of the input data layer.
Why permute? Please see this answer for an explanation.
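In short, Matlab lays arrays out in column-major order while Caffe reads blobs in row-major order, so the dimensions have to be reversed. The sketch below checks this with plain index arithmetic; both helper functions are hypothetical illustrations and touch neither Matlab nor HDF5.

```python
def col_major_offset(index, shape):
    """Flat offset of index in a column-major (Matlab-style) array."""
    offset, stride = 0, 1
    for i, n in zip(index, shape):
        offset += i * stride
        stride *= n
    return offset


def row_major_offset(index, shape):
    """Flat offset of index in a row-major (C-style) array."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


# After the permute, Matlab holds a 1 x 1 x 30 x 300 array (column-major).
# Read in row-major order, the same memory is a 300 x 30 x 1 x 1 blob.
matches = all(
    col_major_offset((0, 0, feature, sample), (1, 1, 30, 300))
    == row_major_offset((sample, feature, 0, 0), (300, 30, 1, 1))
    for sample in range(300)
    for feature in range(30)
)
print(matches)  # True
```

So feature j of sample i ends up at the position where a row-major reader expects element (i, j, 0, 0), which is why the batch axis comes first on the Caffe side.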
You also need to prepare a text file listing the names of all the HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should contain a single line:
/path/to/my_data.h5
Now you can add an input data layer to your train_val.prototxt
layer {
  type: "HDF5Data"
  name: "data"
  top: "X" # note: same name as in HDF5
  top: "label" #
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}
For more information on the HDF5 input layer, see this answer.