I started with Caffe and the mnist example ran well.
I have the train and label data as data.mat. (I have 300 training data with 30 features and labels are (-1, +1) that have saved in data.mat).
However, I don't quite understand how I can use caffe to implement my own dataset?
Is there a step by step tutorial can teach me?
Many thanks!!!! Any advice would be appreciated!
I think the most straight forward way to transfer data from Matlab to caffe is via HDF5 file.
First, save your data in Matlab in an HDF5 file using hdf5write. I assume your training data is stored in a variable name X of size 300-by-30 and the labels are stored in y a 300-by-1 vector:
hdf5write('my_data.h5', '/X',
single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
hdf5write('my_data.h5', '/label',
single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ),
'WriteMode', 'append' );
Note that the data is saved as a 4D array: the first dimension is the number of features, second one is the feature's dimension and the last two are 1 (representing no spatial dimensions). Also note that the names given to the data in the HDF5 are "X" and "label" - these names should be used as the "top" blobs of the input data layer.
Why permute? please see this answer for an explanation.
You also need to prepare a text file listing the names of all hdf5 files you are using (in your case, only my_data.h5). File /path/to/list/file.txt should have a single line
/path/to/my_data.h5
Now you can add an input data layer to your train_val.prototxt
layer {
type: "HDF5Data"
name: "data"
top: "X" # note: same name as in HDF5
top: "label" #
hdf5_data_param {
source: "/path/to/list/file.txt"
batch_size: 20
}
include { phase: TRAIN }
}
For more information regarding hdf5 input layer, you can see in this answer.
Related
My question may sound easy, however, I have difficulty in creating an Hdf5 dataset from my 3d medical images that have been saved in nii format for both images and manual segmentation (label files). My questions are:
The blob shape in pycaffe is N*C*W*H, is it different order in matcaffe? for example in pycaffe the data blob shape will be 1*1*60*320*320 for a grayscale volume with width and height 320-by-320 and 60 slices. I tried to use a Matlab code to create HDF5 dataset for the 3D data, and the order of blob in hdf5info file is 320*320*60*1*1 for both data and label. How should I change the orders in the Matlab code to be readable in Pycaffe?
Is there any python code for creating the hdf5 database for 3D data?
if I create the hdf5 data in Matlab and use the list the pycaffe, will it raise issue?
Thanks
Matlab arranges elements in ND arrays from the first dimension to the last (like fortran), while caffe and hdf5 stores the arrays from last dimension to first ("col-major").
This answer explains how to simply modify Matlab's code to write "col-major" arrays to HDF5 files.
You might also find this answer useful for your problem.
I'm totally new in caffe and I'm try to convert a tensorflow model to caffe.
I have a tuple which's shape is a little complex for it's stored some word vector.
This is the shape of the tuple data——
data[0]: a list, [684, 84], stores the sentence vector;
data[1]: a list, [684, 84], stores the position vector;
data[2]: a matrix, [684, 10], stores the aspects of the sentence;
data[3]: a matrix, [1, 684], stores the label of each sentence;
data[4]: a number, stores the max length of sentences;
Each row represents a sentences, which is also a sample of the dataset.
In tf, I return the whole tuple from a function which is wrote by myself.
train_data = read_data(FLAGS.train_data, source_count, source_word2idx)
I noticed that caffe always requires a data layer before training the data, but I don't have ideas how to convert my data to lmdb type or just sent them as a tuple or matrix into the model.
By the way, I'm using pycaffe.
Counld anyone help?
Thanks a lot!
There's no particular magic; all you need to do is to write an input routine that reads the file and returns the data in the format expected for train_data. You do not need to pre-convert your data to LMDB or any other format; just write read data to accept your current input format, and give the model the format it requires.
We can't help you from there: you haven't specified the model's format at all, and you've given us only the shape for the input data (no internal structure or semantics). Simply treat the data as if you were figuring out how to organize the input data for a given output format.
I wanted to run the [FCN code][1] for semantic segmentation. However, I am beginner in Caffe and I did not know from which point should I start running the code.
Is there any step by step guidance for running?
Since I could not get that much help here, I am posting the steps here. It might be helpful for those who are inexperienced (like me). It took long time for me to figure out how to run it and get the results. you may be able to run it successfully, but similar to my case, the results was blank image for long time and finally found out that how setting should be.
I could perform successfully FCN8s on my data and I did the following steps:
Divide the data into two sets (train, validation) and the labels as well for the corresponding images in both train and validation (4 folder altogether: train_img_lmdb, train_label_lmdb, val_img_lmdb and val_label_lmdb)
Convert your data (each of them separately) into LMDB format (if it is not RGB, convert it using cv2 function), you will have 4 lmdb folders including data.mdb and lock.mdb. the sample code is available here.
Download the .caffemodel from the url that authors have provided,
Change the path to the path of your lmdb files in the train_val.ptototxt file, you should have 4 data layer that source is the path to the train_img_lmdb, train_label_lmdb, val_img_lmdb and val_label_lmdb, similar to this link
Add a convolution layer after this line (here, I have five classes, then change the num_output based on the number of classes in ground truth images):
layer {
name: "score_5classes"
type: "Convolution"
bottom: "score"
top: "score_5classes"
convolution_param {
num_output: 5
pad: 0
kernel_size: 1
}
}
change loss layer as follows(just according what name you have in bottom layer):
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "score_5classes"
bottom: "label"
top: "loss"
loss_param {
normalize: true
}
}
Run the model to start training in you have pycaffe and installed caffe environment.
caffe train -solver=/path/to/solver.prototxt -weights /path/to/pre-trained/model/fcn8s-heavy-pascal.caffemodel 2>&1 | tee /path/to/save/training/log/file/fcn8_exp1.log
I hope it is helpful. Thanks for #Shai's helps
I would like to modify the ImageNet caffe model as described bellow:
As the input channel number for temporal nets is different from that
of spatial nets (20 vs. 3), we average the ImageNet model filters of
first layer across the channel, and then copy the average results 20
times as the initialization of temporal nets.
My question is how can I achive the above results? How can I open the caffe model to be able to do those changes to it?
I read the net surgery tutorial but it doesn't cover the procedure needed.
Thank you for your assistance!
AMayer
The Net Surgery tutorial should give you the basics you need to cover this. But let me explain the steps you need to do in more detail:
Prepare the .prototxt network architectures: You need two files: the existing ImageNet .prototxt file, and your new temporal network architecture. You should make all layers except the first convolutional layers identical in both networks, including the names of the layers. That way, you can use the ImageNet .caffemodel file to initialize the weights automatically.
As the first conv layer has a different size, you have to give it a different name in your .prototxt file than it has in the ImageNet file. Otherwise, Caffe will try to initialize this layer with the existing weights too, which will fail as they have different shapes. (This is what happens in the edit to your question.) Just name it e.g. conv1b and change all references to that layer accordingly.
Load the ImageNet network for testing, so you can extract the parameters from the model file:
net = caffe.Net('imagenet.prototxt', 'imagenet.caffemodel', caffe.TEST)
Extract the weights from this loaded model.
conv_1_weights = old_net.params['conv1'][0].data
conv_1_biases = old_net.params['conv1'][1].data
Average the weights across the channels:
conv_av_weights = np.mean(conv_1_weights, axis=1, keepdims=True)
Load your new network together with the old .caffemodel file, as all layers except for the first layer directly use the weights from ImageNet:
new_net = caffe.Net('new_network.prototxt', 'imagenet.caffemodel', caffe.TEST)
Assign your calculated average weights to the new network
new_net.params['conv1b'][0].data[...] = conv_av_weights
new_net.params['conv1b'][1].data[...] = conv_1_biases
Save your weights to a new .caffemodel file:
new_net.save('new_weights.caffemodel')
I'm trying to understand how data is interpreted in Caffe.
For that I've taken a look at the Minst Tutorial
Looking at the input data definition:
layers {
name: "mnist"
type: DATA
data_param {
source: "mnist_train_lmdb"
backend: LMDB
batch_size: 64
scale: 0.00390625
}
top: "data"
top: "label"
}
I've now looked at the mnist_train_lmdb and taken one of the entries (shown in hex):
0801101C181C229006
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000054B99F973C2400000000000000000000000000000000
000000000000DEFEFEFEFEF1C6C6C6C6C6C6C6C6AA34000000000000
00000000000043724872A3E3FEE1FEFEFEFAE5FEFE8C000000000000
000000000000000000000011420E4343433B15ECFE6A000000000000
00000000000000000000000000000000000053FDD112000000000000
000000000000000000000000000000000016E9FF5300000000000000
000000000000000000000000000000000081FEEE2C00000000000000
000000000000000000000000000000003BF9FE3E0000000000000000
0000000000000000000000000000000085FEBB050000000000000000
00000000000000000000000000000009CDF83A000000000000000000
0000000000000000000000000000007EFEB600000000000000000000
00000000000000000000000000004BFBF03900000000000000000000
0000000000000000000000000013DDFEA60000000000000000000000
00000000000000000000000003CBFEDB230000000000000000000000
00000000000000000000000026FEFE4D000000000000000000000000
00000000000000000000001FE0FE7301000000000000000000000000
000000000000000000000085FEFE3400000000000000000000000000
000000000000000000003DF2FEFE3400000000000000000000000000
0000000000000000000079FEFEDB2800000000000000000000000000
0000000000000000000079FECF120000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
2807
(I've added the line breaks here to be able to see the '7' digit.)
Now my question is, where this format is described? Or put differently where is defined that the first 36 bytes are some sort of header and the last 8 bytes have some label correspondence?
How would I go about constructing my own data?
Neither Blob Tutorial nor Layers Definition give much away about required formats. My intention is not to use image data, but time series
Thanks!
I realized that protocol buffers must come into play here. So I tried to deserialize it against some of the types defined in caffe.proto.
Datum seems to be the perfect fit:
{Caffe.Datum}
Channels: 1
Data: {byte[784]}
Encoded: false
FloatData: Count = 0
Height: 28
Label: 7
Width: 28
So the answer is simply: It's a serialized representation of a 'Datum' typed instance as defined per caffe.proto
Btw. since english is not my native language I had to first realize that "Datum" is a singular form of "data"
When it comes to using your own data, it's structured as follows:
The conventional blob dimensions for data are number N x channel
K x height H x width W. Blob memory is row-major in layout so the last
/ rightmost dimension changes fastest. For example, the value at index
(n, k, h, w) is physically located at index ((n * K + k) * H + h) * W
+ w.
See Blobs, Layers, and Nets: anatomy of a Caffe model for reference
I can try to answer your second question. Since Caffe only takes data in a bunch of selected formats like lmdb, hdf5 etc., it is best to convert (or generate - in case of synthetic data) your data to these formats. Following links can help you in this. If you have trouble with import hdf5in Python, then you may refer to this page.
Creating an LMDB file in Python
Writing an HDF5 file in Python
HDF5 more examples