Weights used by Caffe layers with the same name in different phases

I have a prototxt like:
layer {
  name: "l1"
  bottom: "b1"
  top: "t1"
  include {
    phase: TRAIN
  }
}
layer {
  name: "l1"
  bottom: "b1"
  top: "t2"
  include {
    phase: TEST
  }
}
There are two layers with:
the same name
different top blobs
different phases
What weights will be used in the test phase?
1.) The weights learned in the train phase (because the layers have the same name)
2.) Random initial weights

The weights learned in the train phase will be used in the test phase.
However, an error will stop testing unless both of the conditions below are satisfied:
the two layers have the same number of blobs
the shapes (size in every dimension) of the corresponding blobs are consistent
In fact, a layer in the test net always tries to copy the weights from the layer with the same name in the trained net, and it checks the number and shape of the weight blobs to make sure it uses the proper weights.
Details can be found in the function template <typename Dtype> void Net<Dtype>::ShareTrainedLayersWith(const Net* other), which the test net object calls to copy the weights from the trained net at the beginning of testing.
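You can observe the same name-based sharing from pycaffe; here is a minimal sketch (the prototxt path is a placeholder) of a test net picking up a train net's weights:
import caffe

# Instantiate both phases from the same prototxt (placeholder path).
train_net = caffe.Net('train_val.prototxt', caffe.TRAIN)
test_net = caffe.Net('train_val.prototxt', caffe.TEST)

# share_with wraps Net::ShareTrainedLayersWith: parameters are matched by
# layer name, and blob count and shape are checked before sharing.
test_net.share_with(train_net)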

How can you increase the accuracy of ResNet50?

I'm using a ResNet50 model to classify images into two classes: normal cells and cancer cells.
I want to increase the accuracy but I don't know what to modify.
# We are using ResNet50 for transfer learning here, so we import it.
from tensorflow.keras.applications import resnet50
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import Adamax

# img_shape and class_count are defined elsewhere in the asker's code.
# Initialize the model with weights='imagenet', i.e. we carry over its original weights.
model_name = 'resnet50'
base_model = resnet50.ResNet50(include_top=False, weights="imagenet",
                               input_shape=img_shape, pooling='max')
last_layer = base_model.output  # take the last layer of the base model
# Add a flatten layer to extend the network.
flatten = layers.Flatten()(last_layer)
# Add a dense layer (note: dense1 is defined but unused below; the output layer
# is wired directly to flatten).
dense1 = layers.Dense(100, activation='relu')(flatten)
# Add the final output layer.
output_layer = layers.Dense(class_count, activation='softmax')(flatten)
# Create the model with input and output layers.
model = Model(inputs=base_model.inputs, outputs=output_layer)
model.compile(Adamax(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])
There were 48 errors in 534 test cases (model accuracy = 91.01%).
Also, what do you think about the results shown in the graph?
This is the classification report.
I got good results, but is there any possibility of increasing the accuracy beyond that?
This is a broad question, as there are many ways one can try to improve a network's accuracy. Some of them are:
Increase the dimension of the layers that are learned in transfer learning (making sure not to overfit)
Use transfer learning with convolution layers rather than an MLP head
Let the optimization algorithm choose the learning rate on its own
Play with additional augmentations of the dataset (a minimal sketch follows below)
and the list goes on.
Also, if possible, I would suggest comparing your results to other publicly available benchmarks; that way you might get a better sense of the achievable upper bound on accuracy.
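For the augmentation point above, here is a minimal Keras sketch; the particular transforms and their ranges are illustrative choices, not a tuned recipe:
from tensorflow.keras import layers, Sequential

# Random augmentations; these are only active while training.
augmentation = Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])
# Apply inside the model graph before the base model, e.g. x = augmentation(inputs)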

Randomness in VGG16 prediction

I am using the VGG-16 network available in PyTorch out of the box to predict an image's class index. I found that for the same input file, if I predict multiple times, I get different outcomes. This seems counter-intuitive to me: once the weights are fixed (since I am using the pretrained model), there should not be any randomness at any step, so multiple runs with the same input file should return the same prediction.
Here is my code:
import torch
from torch.nn.functional import softmax
import torchvision.models as models
from torchvision import transforms
from PIL import Image

VGG16 = models.vgg16(pretrained=True)

def VGG16_predict(img_path):
    # Center-crop to the 224x224 input size VGG expects and convert to a tensor.
    transformer = transforms.Compose([transforms.CenterCrop(224), transforms.ToTensor()])
    data = transformer(Image.open(img_path))
    # Add a batch dimension, run the network, and take the most likely class.
    output = softmax(VGG16(data.unsqueeze(0)), dim=1).argmax().item()
    return output  # predicted class index

VGG16_predict(image)
Recall that many modules have two states for training vs evaluation: "Some models use modules which have different training and evaluation behavior, such as batch normalization. To switch between these modes, use model.train() or model.eval() as appropriate. See train() or eval() for details." (https://pytorch.org/docs/stable/torchvision/models.html)
In this case, the classifier layers include dropout, which is stochastic during training. Run VGG16.eval() if you want the evaluations to be non-random.
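For completeness, here is a minimal deterministic-inference sketch reusing the question's function (image is the asker's image path variable):
VGG16.eval()  # switch dropout (and any batch norm) to evaluation behavior
with torch.no_grad():  # gradients are not needed for inference
    print(VGG16_predict(image))  # now returns the same class index on every run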

How did you run FCN code semantic segmentation?

I wanted to run the FCN code for semantic segmentation. However, I am a beginner in Caffe and I did not know where to start.
Is there any step-by-step guidance for running it?
Since I could not get much help here, I am posting the steps myself. It might be helpful for those who are inexperienced (like me). It took me a long time to figure out how to run it and get results; you may be able to run it successfully right away, but in my case the result was a blank image for a long time until I finally found out how the settings should be.
I managed to run FCN-8s successfully on my data by following these steps:
1.) Divide the data into two sets (train, validation), and do the same with the labels for the corresponding images (4 folders altogether: train_img_lmdb, train_label_lmdb, val_img_lmdb and val_label_lmdb).
2.) Convert your data (each set separately) into LMDB format (if it is not RGB, convert it using a cv2 function); you will end up with 4 LMDB folders, each containing data.mdb and lock.mdb. Sample code is available here, and a rough pycaffe sketch also appears after these steps.
3.) Download the .caffemodel from the URL the authors have provided.
4.) Change the paths in the train_val.prototxt file to the paths of your LMDB files; you should have 4 data layers whose source points to train_img_lmdb, train_label_lmdb, val_img_lmdb and val_label_lmdb respectively, similar to this link.
5.) Add a convolution layer after this line (here I have five classes; change num_output based on the number of classes in your ground-truth images):
layer {
  name: "score_5classes"
  type: "Convolution"
  bottom: "score"
  top: "score_5classes"
  convolution_param {
    num_output: 5
    pad: 0
    kernel_size: 1
  }
}
6.) Change the loss layer as follows (the bottom name must match the layer you just added):
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score_5classes"
  bottom: "label"
  top: "loss"
  loss_param {
    normalize: true
  }
}
7.) Run the model to start training (assuming you have Caffe and pycaffe installed):
caffe train -solver=/path/to/solver.prototxt -weights /path/to/pre-trained/model/fcn8s-heavy-pascal.caffemodel 2>&1 | tee /path/to/save/training/log/file/fcn8_exp1.log
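As mentioned in step 2, here is a rough pycaffe sketch of writing images to LMDB; the function name, paths and array shapes are placeholders for your own data:
import lmdb
import caffe

def write_lmdb(db_path, images):
    # images: an iterable of HxWxC uint8 numpy arrays (e.g. loaded with cv2)
    env = lmdb.open(db_path, map_size=1 << 40)  # generous map size
    with env.begin(write=True) as txn:
        for i, img in enumerate(images):
            # Caffe datums are CxHxW, so move the channel axis to the front.
            datum = caffe.io.array_to_datum(img.transpose(2, 0, 1))
            txn.put('{:08d}'.format(i).encode('ascii'), datum.SerializeToString())
Label images go into their own LMDB the same way; for single-channel labels, add a singleton channel axis first (img[:, :, None]).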
I hope this is helpful. Thanks to @Shai for the help.

How to modify the Imagenet Caffe Model?

I would like to modify the ImageNet Caffe model as described below:
As the input channel number for temporal nets is different from that
of spatial nets (20 vs. 3), we average the ImageNet model filters of
first layer across the channel, and then copy the average results 20
times as the initialization of temporal nets.
My question is: how can I achieve the above? How can I open the caffemodel to be able to make those changes to it?
I read the net surgery tutorial, but it doesn't cover this procedure.
Thank you for your assistance!
AMayer
The Net Surgery tutorial should give you the basics you need to cover this. But let me explain the steps you need to do in more detail:
Prepare the .prototxt network architectures: You need two files: the existing ImageNet .prototxt file, and your new temporal network architecture. You should make all layers except the first convolutional layer identical in both networks, including the names of the layers. That way, you can use the ImageNet .caffemodel file to initialize the weights automatically.
As the first conv layer has a different size, you have to give it a different name in your .prototxt file than it has in the ImageNet file. Otherwise, Caffe will try to initialize this layer with the existing weights too, which will fail as they have different shapes. (This is what happens in the edit to your question.) Just name it e.g. conv1b and change all references to that layer accordingly.
Load the ImageNet network for testing, so you can extract the parameters from the model file:
import caffe
import numpy as np

old_net = caffe.Net('imagenet.prototxt', 'imagenet.caffemodel', caffe.TEST)
Extract the weights from this loaded model.
conv_1_weights = old_net.params['conv1'][0].data
conv_1_biases = old_net.params['conv1'][1].data
Average the weights across the channels:
conv_av_weights = np.mean(conv_1_weights, axis=1, keepdims=True)  # shape: (num_output, 1, kh, kw)
Load your new network together with the old .caffemodel file, as all layers except for the first layer directly use the weights from ImageNet:
new_net = caffe.Net('new_network.prototxt', 'imagenet.caffemodel', caffe.TEST)
Assign your calculated average weights to the new network:
new_net.params['conv1b'][0].data[...] = conv_av_weights  # broadcasts the averaged channel across all 20 input channels
new_net.params['conv1b'][1].data[...] = conv_1_biases
Save your weights to a new .caffemodel file:
new_net.save('new_weights.caffemodel')
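As a quick sanity check (the layer name conv1b and the 20 input channels follow the example above), you can inspect the shape of the new filter blob:
print(new_net.params['conv1b'][0].data.shape)  # the second dimension should now be 20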

caffe - Input data, train and test set [duplicate]

I started with Caffe and the mnist example ran well.
I have the training and label data as data.mat (300 training samples with 30 features each, and labels in {-1, +1}), but I don't quite understand how I can use Caffe with my own dataset.
Is there a step-by-step tutorial that can teach me?
Many thanks! Any advice would be appreciated!
I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.
First, save your data in Matlab in an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30 and the labels in y, a 300-by-1 vector:
hdf5write('my_data.h5', '/X', ...
    single( permute(reshape(X,[300, 30, 1, 1]), [4:-1:1]) ) );
hdf5write('my_data.h5', '/label', ...
    single( permute(reshape(y,[300, 1, 1, 1]), [4:-1:1]) ), ...
    'WriteMode', 'append' );
Note that the data is saved as a 4D array: the first dimension is the number of examples, the second is the feature dimension, and the last two are 1 (representing no spatial dimensions). Also note that the names given to the data in the HDF5 file are "X" and "label"; these names should be used as the "top" blobs of the input data layer.
Why permute? Please see this answer for an explanation.
You also need to prepare a text file listing the names of all the HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should have a single line:
/path/to/my_data.h5
Now you can add an input data layer to your train_val.prototxt:
layer {
  type: "HDF5Data"
  name: "data"
  top: "X"     # note: same names as in the HDF5 file
  top: "label" #
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}
For more information regarding the HDF5 input layer, see this answer.
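If you prefer Python over Matlab, here is an equivalent minimal sketch using h5py (assuming X is a 300x30 and y a 300x1 numpy array; Caffe expects float32 data):
import h5py
import numpy as np

with h5py.File('my_data.h5', 'w') as f:
    # Same 4D layout as above: num x channels x height x width.
    f.create_dataset('X', data=X.reshape(300, 30, 1, 1).astype(np.float32))
    f.create_dataset('label', data=y.reshape(300, 1, 1, 1).astype(np.float32))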