What's the most elegant method to transform models between pytorch and caffe? - caffe

Just like the title says: converting models from PyTorch to Caffe, and from Caffe to PyTorch.
Not just for inference; the models should remain trainable after conversion.
Any good pointers?

Related

How to change inter-layer hidden size in pretrained BERT and Pytorch?

For a research project, I need to be able to give input to, say, the third encoder layer of a pre-trained BERT model.
I know that things like these are done using hooks in PyTorch, but that works when I'm defining the model myself.
I was hoping for some frameworks that implement that sort of thing for popular pre-trained models. Or if there was a way to do this in huggingface itself.
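One possible huggingface route, sketched under the assumption that `BertModel` exposes its encoder layers as `model.encoder.layer` (the tiny randomly initialized config below is only there to avoid downloading weights and is not from the original post):

```python
import torch
from transformers import BertConfig, BertModel

# Small random BERT so the sketch runs without downloading a checkpoint;
# with a real model you would use BertModel.from_pretrained(...) instead.
config = BertConfig(hidden_size=32, num_hidden_layers=4,
                    num_attention_heads=4, intermediate_size=64,
                    vocab_size=100)
model = BertModel(config)
model.eval()

# A custom hidden-state tensor fed directly into the third encoder layer,
# bypassing the embeddings and the first two layers.
hidden = torch.randn(1, 5, config.hidden_size)
with torch.no_grad():
    out = model.encoder.layer[2](hidden)[0]  # BertLayer returns a tuple
```

The output has the same `(batch, sequence, hidden)` shape as the input, so it could be passed on to `model.encoder.layer[3]` and so forth.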

How to use pytorch 0.4.1 model to init pytorch 0.2.0?

I have run into a problem: I trained my model in PyTorch 0.4.1, but I can't find a tool to convert it into a Caffe model.
How can I use a PyTorch 0.4.1 model to initialize PyTorch 0.2.0?
Or how can I convert a PyTorch 0.4.1 model to a Caffe model?
Assuming you are using Caffe2, you can use ONNX to convert models trained in one framework to another (with some limitations). In your case, you need to export your PyTorch-trained model to ONNX and then import the ONNX model into Caffe2.
Follow these tutorials:
PyTorch to ONNX export: link
ONNX to Caffe2 import: link
The tutorials are pretty straightforward; you don't need to write much code. But there are some limitations when exporting a PyTorch model to ONNX.
Check this link for an in-depth tutorial on PyTorch to ONNX.
Comment below if you have any doubts/questions.
EDIT: Try this repo; I have tested it on a toy model and it is working.

Is the Xception model in Keras the best model described in the paper?

I read the Xception paper, and section 4.7 mentions that the best results are achievable without any activation. Now I want to use this network on videos with the Keras toolbox, but the model in Keras uses the ReLU activation function. Does the Keras model correspond to the best model, or is it better to omit the ReLU layers?
You are confusing the normal activations used for convolutional and dense layers with the ones discussed in the paper. Section 4.7 only deals with varying the activation between the depth-wise and point-wise convolutions; the rest of the activations in the architecture are kept unchanged.

Caffe Autoencoder

I want to compare the performance of a CNN and an autoencoder in Caffe. I'm completely familiar with CNNs in Caffe, but I want to know: does the autoencoder also have a deploy.prototxt file? Are there any differences in using these two models other than the architecture?
Yes, it also has a deploy.prototxt.
Both train_val.prototxt and deploy.prototxt are CNN architecture description files. The sole difference between them is that train_val.prototxt takes training data and a loss as input/output, while deploy.prototxt takes a test image as input and produces the predicted value as output.
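As a sketch, a train_val.prototxt starts from a data layer and ends in a loss, while the deploy.prototxt for the same net replaces them with an input declaration and the raw prediction (layer names, shapes, and the LMDB path below are illustrative):

```protobuf
# train_val.prototxt (training): data + labels in, loss out
layer { name: "data" type: "Data" top: "data" top: "label"
        data_param { source: "train_lmdb" batch_size: 64 backend: LMDB } }
layer { name: "ip1" type: "InnerProduct" bottom: "data" top: "ip1"
        inner_product_param { num_output: 10 } }
layer { name: "loss" type: "SoftmaxWithLoss"
        bottom: "ip1" bottom: "label" top: "loss" }

# deploy.prototxt (inference): declared input in, prediction out
input: "data"
input_shape { dim: 1 dim: 1 dim: 28 dim: 28 }
layer { name: "ip1" type: "InnerProduct" bottom: "data" top: "ip1"
        inner_product_param { num_output: 10 } }
layer { name: "prob" type: "Softmax" bottom: "ip1" top: "prob" }
```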
Here is an example of a CNN and an autoencoder for MNIST: Caffe Examples. (I have not tried the examples.) Using the models is generally the same; learning rates etc. depend on the model.
You need to implement an auto-encoder example yourself using Python or MATLAB. The example in Caffe is not a true auto-encoder, because it doesn't set up a layer-wise training stage, and during training it doesn't enforce the tied weights W{L->L+1} = W{L+1->L+2}^T. It is easy to find a 1D auto-encoder on GitHub, but a 2D auto-encoder may be hard to find.
The main differences between auto-encoders and a conventional network are:
In an auto-encoder, the input image itself serves as the label during training.
An auto-encoder tries to approximate an output that is similar to its input.
Auto-encoders do not have a softmax layer during training.
An auto-encoder can be used as a pre-trained model for your network, which then converges faster than with other pre-trained models, because the network has already extracted the features of your data.
Conventional training and testing can then be performed on the pre-trained auto-encoder network for faster convergence and better accuracy.
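The points above can be illustrated with a tiny NumPy linear auto-encoder, where the input is its own target and there is no softmax (sizes, learning rate, and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))              # the data is also the "label"

# Tiny linear auto-encoder: 8 -> 3 -> 8
W1 = rng.normal(scale=0.1, size=(8, 3))   # encoder
W2 = rng.normal(scale=0.1, size=(3, 8))   # decoder
lr = 0.05

def loss(X, W1, W2):
    """Mean squared reconstruction error: output is compared to the input."""
    return ((X @ W1 @ W2 - X) ** 2).mean()

first = loss(X, W1, W2)
for _ in range(200):
    H = X @ W1                            # hidden code
    R = H @ W2 - X                        # reconstruction residual
    gW2 = 2 * H.T @ R / X.size            # gradient of the MSE w.r.t. W2
    gW1 = 2 * X.T @ (R @ W2.T) / X.size   # gradient of the MSE w.r.t. W1
    W1 -= lr * gW1
    W2 -= lr * gW2
```

After training, `W1` plays the role of learned features that could initialize the early layers of a conventional network.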

how to write caffe python layer with trainable parameters?

I want to learn how to write Caffe Python layers, but I have only found examples of very simple layers like pyloss.
How do I write a Caffe Python layer with trainable parameters?
For example, how would I write a fully connected Python layer?
Caffe stores a layer's trainable parameters as a vector of blobs. By default this vector is empty, and it is up to you to add parameter blobs to it in the layer's setup. There is a simple example of a layer with parameters in test_python_layer.py.
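Building on that, here is a hedged sketch of what a fully connected Python layer could look like; the `self.blobs.add_blob` call follows the pattern used in test_python_layer.py, while the output size and initialization are illustrative (the NumPy math is kept in standalone functions so it can be read on its own):

```python
import numpy as np

try:
    import caffe                      # only available inside a Caffe build
    Base = caffe.Layer
except ImportError:
    Base = object                     # lets the sketch be read without Caffe

def fc_forward(x, W, b):
    """y = x W^T + b for x of shape (N, D), W of shape (M, D), b of shape (M,)."""
    return x.dot(W.T) + b

def fc_backward(x, W, dy):
    """Gradients of the fully connected op w.r.t. input, weights, and bias."""
    return dy.dot(W), dy.T.dot(x), dy.sum(axis=0)

class FCPythonLayer(Base):
    """Fully connected "Python" layer; num_output is hard-coded for brevity."""
    M = 10

    def setup(self, bottom, top):
        D = bottom[0].data.shape[1]
        # self.blobs is the (initially empty) vector of trainable parameters;
        # blobs added here are updated by the solver like any other weights.
        self.blobs.add_blob(self.M, D)          # weights, shape (M, D)
        self.blobs.add_blob(self.M)             # bias, shape (M,)
        self.blobs[0].data[...] = np.random.randn(self.M, D) * 0.01
        self.blobs[1].data[...] = 0

    def reshape(self, bottom, top):
        top[0].reshape(bottom[0].data.shape[0], self.M)

    def forward(self, bottom, top):
        top[0].data[...] = fc_forward(bottom[0].data,
                                      self.blobs[0].data, self.blobs[1].data)

    def backward(self, top, propagate_down, bottom):
        dx, dW, db = fc_backward(bottom[0].data,
                                 self.blobs[0].data, top[0].diff)
        self.blobs[0].diff[...] = dW
        self.blobs[1].diff[...] = db
        if propagate_down[0]:
            bottom[0].diff[...] = dx
```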
See this post for more information about "Python" layers in caffe.