Is it possible to directly give an image and its segmentation mask as the input for training in Caffe?
Does the original implementation support this?
If so, where can I find an appropriate prototxt file?
Yes. It is possible.
Have a look at Fully Convolutional Networks for Semantic Segmentation and SegNet. They are both fully convolutional networks and are trained for semantic segmentation. The prototxt and caffemodel files are available on GitHub.
You can run FCN with the original implementation, but SegNet uses some layers that are not part of the original Caffe. The authors provide a modified version of Caffe on GitHub, so you can use that.
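If you want to feed an image and its segmentation mask directly with stock Caffe, one common route is an HDF5Data layer. Below is a minimal sketch of packing image/mask pairs into HDF5; the shapes, class count, and file names are assumptions to adapt to your data.

import h5py
import numpy as np

# N images with 3 channels at HxW; masks hold one integer class ID per pixel.
# Random arrays stand in for your real data here.
images = np.random.rand(10, 3, 256, 256).astype(np.float32)
masks = np.random.randint(0, 21, size=(10, 1, 256, 256)).astype(np.float32)

with h5py.File('train.h5', 'w') as f:
    f.create_dataset('data', data=images)   # dataset names must match the
    f.create_dataset('label', data=masks)   # top blob names of the data layer

# The HDF5Data layer's "source" parameter points at a text file listing
# the .h5 files to read.
with open('train_h5_list.txt', 'w') as f:
    f.write('train.h5\n')

Your training prototxt would then use an HDF5Data layer with source: "train_h5_list.txt" and top blobs data and label, and the loss layer (e.g. SoftmaxWithLoss) compares the network output against the per-pixel labels.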
I am a newbie to Caffe.
I am trying to use a convolutional neural network trained on the MNIST dataset with the Caffe deep learning framework.
I am following the official tutorial.
Steps taken successfully:
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
./examples/mnist/train_lenet.sh
The model was trained, and the run stopped with the following message:
I1203 solver.cpp:133] Snapshotting solver state to lenet_iter_10000.solverstate
I1203 solver.cpp:78] Optimization Done.
Now I am not sure how to take a test image and use the existing trained model, which I believe has been snapshotted under the name lenet_iter_10000.solverstate, to see the predicted scores for each class.
Use the test function of caffe:
<path to caffe root>/caffe test -model <val filename>.prototxt -weights lenet_iter_10000.caffemodel
As you want to test only one image, give that image as input to your test data layer. Also supply the mean image in your <val filename>.prototxt. The test batch size is 1 in this case.
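If you prefer the Python interface, a minimal sketch for scoring a single digit looks like this; the deploy file name, image path, and output blob name are assumptions based on the stock LeNet example.

import caffe
import numpy as np

# Deploy-style definition plus the trained weights.
net = caffe.Net('examples/mnist/lenet.prototxt',
                'examples/mnist/lenet_iter_10000.caffemodel',
                caffe.TEST)

# Load a 28x28 grayscale digit. caffe.io.load_image returns values in [0, 1],
# which matches the 0.00390625 scaling applied to raw pixels during training.
img = caffe.io.load_image('digit.png', color=False)

net.blobs['data'].reshape(1, 1, 28, 28)
net.blobs['data'].data[...] = img.transpose(2, 0, 1)  # HWC -> CHW

out = net.forward()
print('predicted class:', out['prob'].argmax())
print('scores per class:', out['prob'].flatten())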
Also note that lenet_iter_10000.solverstate is not your trained model. Your trained model is actually lenet_iter_10000.caffemodel. To learn about the difference between solverstate and caffemodel files, see here.
MatConvNet supports a convolution transpose layer ('convt'), but I cannot find an example in its source code or its documentation. Is there any guidance?
I noticed there is an example at https://github.com/vlfeat/matconvnet-fcn,
but it involves many unrelated things. I would like the example to be as simple as possible.
I am interested in convolutional neural networks (CNNs) as an example of a computationally intensive application that is suitable for acceleration using reconfigurable hardware (e.g. an FPGA).
To do that, I need to examine a simple CNN code that I can use to understand how they are implemented, how the computations in each layer take place, and how the output of each layer is fed to the input of the next one. I am familiar with the theoretical part (http://cs231n.github.io/convolutional-networks/).
However, I am not interested in training the CNN; I want complete, self-contained CNN code that is pre-trained, with all the weight and bias values known.
I know that there are plenty of CNN libraries, e.g. Caffe, but the problem is that there is no trivial example code that is self-contained. Even for the simplest Caffe example, "cpp_classification", many libraries are invoked, the architecture of the CNN is expressed as a .prototxt file, and other types of inputs such as .caffemodel and .binaryproto are involved. The OpenCV2 library is invoked too. There are layers upon layers of abstraction and different libraries working together to produce the classification outcome.
I know that those abstractions are needed to generate a "usable" CNN implementation, but for a hardware person who needs bare-bones code to study, this is too much unrelated work.
My question is: can anyone point me to a simple and self-contained CNN implementation that I can start with?
I can recommend tiny-cnn. It is simple, lightweight (header-only), and CPU-only, while providing several layers frequently used in the literature (for example pooling layers, dropout layers, and the local response normalization layer). This means you can explore an efficient implementation of these layers in C++ without requiring knowledge of CUDA and without digging through the I/O and framework code required by a framework such as Caffe. The implementation lacks some comments, but the code is still easy to read and understand.
The provided MNIST example is quite easy to use (I tried it myself some time ago) and trains efficiently. After training and testing, the weights are written to a file. You then have a simple pre-trained model to start from; see the provided examples/mnist/test.cpp and examples/mnist/train.cpp. The model can easily be loaded for testing (or recognizing digits), so you can step through the code while executing a learned model.
If you want to inspect a more complicated network, have a look at the Cifar-10 Example.
This is the simplest implementation I have seen: DNN McCaffrey
Also, the source code for this by Karpathy looks pretty straightforward.
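If all you are after is the bare arithmetic of a convolutional layer and how one layer's output feeds the next, here is a self-contained NumPy sketch with no framework involved. The shapes and random values are illustrative; in your case the weights would come from a pre-trained model.

import numpy as np

def conv_forward(x, w, b, stride=1):
    # x: (C_in, H, W) input, w: (C_out, C_in, kH, kW) filters, b: (C_out,) biases
    c_in, h, wd = x.shape
    c_out, _, kh, kw = w.shape
    out_h = (h - kh) // stride + 1
    out_w = (wd - kw) // stride + 1
    y = np.zeros((c_out, out_h, out_w))
    for co in range(c_out):          # each output channel
        for i in range(out_h):       # each output row
            for j in range(out_w):   # each output column
                patch = x[:, i*stride:i*stride+kh, j*stride:j*stride+kw]
                y[co, i, j] = np.sum(patch * w[co]) + b[co]
    return y

def relu(x):
    return np.maximum(x, 0)

# Two layers chained: the output of one is the input of the next.
x = np.random.rand(3, 32, 32)                     # e.g. an RGB image
w1, b1 = np.random.rand(8, 3, 5, 5), np.zeros(8)  # pre-trained values go here
w2, b2 = np.random.rand(16, 8, 3, 3), np.zeros(16)
out = conv_forward(relu(conv_forward(x, w1, b1)), w2, b2)
print(out.shape)  # (16, 26, 26)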
I want to compare the performance of a CNN and an autoencoder in Caffe. I am completely familiar with CNNs in Caffe, but I want to know whether the autoencoder also has a deploy.prototxt file, and whether there are any differences in using the two models other than the architecture.
Yes, it also has a deploy.prototxt.
Both train_val.prototxt and deploy.prototxt are CNN architecture description files. The sole difference between them is that train_val.prototxt takes training data and a loss as input/output, while deploy.prototxt takes a test image as input and produces predicted values as output.
Here is an example of a CNN and an autoencoder for MNIST: Caffe Examples. (I have not tried the examples.) Using the models is generally the same; learning rates etc. depend on the model.
You need to implement an autoencoder example using Python or MATLAB. The example in Caffe is not a true autoencoder, because it does not set up a layer-wise training stage and, during training, it does not enforce the tied-weight constraint W_{L->L+1} = W_{L+1->L+2}^T. It is easy to find a 1D autoencoder on GitHub, but a 2D autoencoder may be hard to find.
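For reference, that tied-weight constraint just means the decoder reuses the encoder's weight matrix transposed. A minimal NumPy illustration (the layer sizes are assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.random.rand(784)                 # e.g. a flattened 28x28 image
W = np.random.randn(256, 784) * 0.01    # a single encoder weight matrix
b_enc, b_dec = np.zeros(256), np.zeros(784)

h = sigmoid(W.dot(x) + b_enc)           # encode
x_hat = sigmoid(W.T.dot(h) + b_dec)     # decode with the SAME weights, transposed
loss = np.mean((x - x_hat) ** 2)        # reconstruction error, no softmax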
The main differences between autoencoders and conventional networks are:
In an autoencoder, the input image is also the label image during training.
An autoencoder tries to make its output approximate its input.
Autoencoders do not have a softmax layer during training (see the sketch below).
An autoencoder can be used as a pre-trained model for your network, which then converges faster than with other pre-trained models, because the network has already extracted the features of your data.
You can then perform conventional training and testing on the pre-trained autoencoder network for faster convergence and better accuracy.
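To make the "input is the label" and "no softmax" points concrete, here is a sketch of a tiny autoencoder definition using Caffe's Python net specification: the EuclideanLoss compares the reconstruction against the input itself. The layer sizes (28x28 input, 64 hidden units) are assumptions.

import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 1, 28, 28]))
n.flat = L.Flatten(n.data)
n.encode = L.InnerProduct(n.flat, num_output=64,
                          weight_filler=dict(type='xavier'))
n.act = L.Sigmoid(n.encode, in_place=True)
n.decode = L.InnerProduct(n.encode, num_output=784,
                          weight_filler=dict(type='xavier'))
n.loss = L.EuclideanLoss(n.decode, n.flat)  # target is the input itself

with open('autoencoder_train.prototxt', 'w') as f:
    f.write(str(n.to_proto()))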
I played with scikit-neuralnetwork, which is backed by the pylearn2 library. pylearn2 has functions to visualize the learned weights of the convolutional kernels. Can I somehow access the learned model inside the scikit wrapper and visualize the weights as well?
I am new to Python, so going through the source of scikit-nn did not really help me.
Thanks
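Once you can pull the raw weight array out of the wrapper, plotting the kernels is straightforward with matplotlib. Here is a sketch; the get_parameters() accessor is an assumption about the scikit-neuralnetwork API, so verify it against your installed version.

import numpy as np
import matplotlib.pyplot as plt

def show_kernels(weights, kh=5, kw=5, cols=8):
    # weights: any array reshapeable to (n_kernels, kh, kw)
    kernels = np.asarray(weights).reshape(-1, kh, kw)
    rows = int(np.ceil(len(kernels) / float(cols)))
    fig, axes = plt.subplots(rows, cols, figsize=(cols, rows))
    for ax in np.ravel(axes):
        ax.axis('off')
    for ax, k in zip(np.ravel(axes), kernels):
        ax.imshow(k, cmap='gray', interpolation='nearest')
    plt.show()

# Assumed extraction (verify for your version): get_parameters() returns
# (weights, biases, layer-name) tuples per layer in some releases.
# first = nn.get_parameters()[0]
# show_kernels(first.weights)

show_kernels(np.random.randn(32, 25))  # demo with 32 random 5x5 kernels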