I am trying to use resnet18 and densenet121 as the pretrained models and have added 1 FC layer at the end of each network two change dimensions to 512 of both network and then have concatenated the outputs and passed to two final FC layers.
this can be seen here:
class classifier(nn.Module):
def __init__(self,num_classes):
super(classifier,self).__init__()
self.resnet=models.resnet18(pretrained=True)
self.rfc1=nn.Linear(512,512)
self.densenet=models.densenet121(pretrained=True)
self.dfc1=nn.Linear(1024,512)
self.final_fc1=nn.Linear(1024,512)
self.final_fc2=nn.Linear(512,num_classes)
self.dropout=nn.Dropout(0.2)
def forward(self,x):
y=x.detach().clone()
x=self.resnet.conv1(x)
x=self.resnet.bn1(x)
x=self.resnet.relu(x)
x=self.resnet.maxpool(x)
x=self.resnet.layer1(x)
x=self.resnet.layer2(x)
x=self.resnet.layer3(x)
x=self.resnet.layer4(x)
x=self.resnet.avgpool(x)
x=x.view(x.size(0),-1)
x=nn.functional.relu(self.rfc1(x))
y=self.densenet.features(y)
y=y.view(y.size(0),-1)
y=nn.functional.relu(self.dfc1(y))
x=torch.cat((x,y),0)
x=nn.functional.relu(self.final_fc1(x))
x=self.dropout(x)
x=self.final_fc2(x)
return x
Error:
size mismatch, m1: [1048576 x 1], m2: [1024 x 512] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:283
but as i could see the densenet outputs 1024 feature to the last layer
Questions:
Is the implementation correct ?
Can this implementation work as a Ensemble
Related
I have a Convolutional Neural Network, and it's trying to resolve a classification problem using images (2 classes, so binary classification), using sigmoid.
To evaluate the model I use:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
path_dir = '../../dataset/train'
parth_dir_test = '../../dataset/test'
datagen = ImageDataGenerator(
rescale=1./255,
validation_split = 0.2)
test_set = datagen.flow_from_directory(parth_dir_test,
target_size= (150,150),
batch_size = 64,
class_mode = 'binary')
score = classifier.evaluate(test_set, verbose=0)
print('Test Loss', score[0])
print('Test accuracy', score[1])
And it outputs:
When I try to print the classification report I use:
yhat_classes = classifier.predict_classes(test_set, verbose=0)
yhat_classes = yhat_classes[:, 0]
print(classification_report(test_set.classes,yhat_classes))
But now I get this accuracy:
If I print the test_set.classes, it shows the first 344 numbers of the array as 0, and the next 344 as 1. Is this test_set shuffled before feeding into the network?
I think your model is doing just fine both in "training" and "evaluating".Evaluation accuracy comes on the basis of prediction so maybe you are making some logical mistake while using model.predict_classes().Please check if you are using the trained model weights and not any randomly initialized model while evaluating it.
what "evaluate" does: The model sets apart this fraction of data while training, and will not train on it, and will evaluate loss and any other model's metrics on this data after each "epoch".so, model.evaluate() is for evaluating your trained model. Its output is accuracy or loss, not prediction to your input data!
predict: Generates output predictions for the input samples. model.predict() actually predicts, and its output is target value, predicted from your input data.
FYI: if your accurscy in Binary Classification problem is less than 50%, it's worse than the case that you randomly predict one of those classes (acc = 50%)!
I needed to add a shuffle=False. The code that work is:
test_set = datagen.flow_from_directory(parth_dir_test,
target_size=(150,150),
batch_size=64,
class_mode='binary',
shuffle=False)
What should be the format/data types of y-labels for training if the actual y-labels cab be any decimal number between 0-9 (4.1,8.5 etc) and the last output layer is defined as:
out = Dense(10,activation='softmax')(previous_layer)
and the loss fuction to use is:
def earth_mover_loss(y_true, y_pred):
cdf_ytrue = K.cumsum(y_true, axis=-1)
cdf_ypred = K.cumsum(y_pred, axis=-1)
samplewise_emd = K.sqrt(K.mean(K.square(K.abs(cdf_ytrue - cdf_ypred)), axis=-1))
return K.mean(samplewise_emd)
How should I feed the labels in my network. I want to mimic this model
I wish to train a RNN model such that I can predict for T steps ahead in a time series model. Most of the examples that I have seen so far are centred around text.
The toy example that I have is to predict 3 sine waves as shown below:
x = torch.arange(0,30,0.05)
y = [torch.sin(x), torch.sin(x-np.pi), torch.sin(x-np.pi/2)]
y = torch.stack(y)
y = y.t()
y is of shape 600,3. However in order for the LSTM to accept it the input needs to be of shape (seq_len, batch, input_size). I was wondering if there is a function in pytorch that converts them to required format. Suppose that in my case I want seq_len=50 and batch_size=32.
This snippet of code from machinelearningmastery was the only snippet of code I found.
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
dataX, dataY = [], []
for i in range(len(dataset)-look_back-1):
a = dataset[i:(i+look_back), 0]
dataX.append(a)
dataY.append(dataset[i + look_back, 0])
return numpy.array(dataX), numpy.array(dataY)
Does pad_packed_sequence or anything similar in pytorch natively do this.
If anyone is interested, this is my LSTM model:
class LSTM(nn.Module):
def __init__(self, n_features, h, num_layers=2):
super().__init__()
self.lstm = nn.LSTM(n_features, h, num_layers)
self.linear = nn.Linear(h, n_features)
def forward(self, input, h=None):
lstm_out, self.hidden = self.lstm(input, h)
return self.linear(lstm_out)
[optional Q] For whatever solution that I end up with, is there a way to ensure that I can do stateful training?
I am learning about designing Convolutional Neural Networks using Keras. I have developed a simple model using VGG16 as the base. I have about 6 classes of images in the dataset. Here are the code and description of my model.
model = models.Sequential()
conv_base = VGG16(weights='imagenet' ,include_top=False, input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
conv_base.trainable = False
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(6, activation='sigmoid'))
Here is the code for compiling and fitting the model:
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.RMSprop(lr=1e-4),
metrics=['acc'])
model.summary()
callbacks = [
EarlyStopping(monitor='acc', patience=1, mode='auto'),
ModelCheckpoint(monitor='val_loss', save_best_only=True, filepath=model_file_path)
]
history = model.fit_generator(
train_generator,
steps_per_epoch=10,
epochs=EPOCHS,
validation_data=validation_generator,
callbacks = callbacks,
validation_steps=10)
Here is the code for prediction of a new image
img = image.load_img(img_path, target_size=(IMAGE_SIZE, IMAGE_SIZE))
plt.figure(index)
imgplot = plt.imshow(img)
x = image.img_to_array(img)
x = x.reshape((1,) + x.shape)
prediction = model.predict(x)[0]
# print(prediction)
Often model.predict() method predicts more than one class.
[0 1 1 0 0 0]
I have a couple of questions
Is it normal for a multiclass classification model to predict more than one output?
How is accuracy measured during training time if more than one class was predicted?
How can I modify the neural network so that only one class is predicted?
Any help is appreciated. Thank you so much!
You are not doing multi-class classification, but multi-label. This is caused by the use of a sigmoid activation at the output layer. To do multi-class classification properly, use a softmax activation at the output, which will produce a probability distribution over classes.
Taking the class with the biggest probability (argmax) will produce a single class prediction, as expected.
In PyTorch, we can define architectures in multiple ways. Here, I'd like to create a simple LSTM network using the Sequential module.
In Lua's torch I would usually go with:
model = nn.Sequential()
model:add(nn.SplitTable(1,2))
model:add(nn.Sequencer(nn.LSTM(inputSize, hiddenSize)))
model:add(nn.SelectTable(-1)) -- last step of output sequence
model:add(nn.Linear(hiddenSize, classes_n))
However, in PyTorch, I don't find the equivalent of SelectTable to get the last output.
nn.Sequential(
nn.LSTM(inputSize, hiddenSize, 1, batch_first=True),
# what to put here to retrieve last output of LSTM ?,
nn.Linear(hiddenSize, classe_n))
Define a class to extract the last cell output:
# LSTM() returns tuple of (tensor, (recurrent state))
class extract_tensor(nn.Module):
def forward(self,x):
# Output shape (batch, features, hidden)
tensor, _ = x
# Reshape shape (batch, hidden)
return tensor[:, -1, :]
nn.Sequential(
nn.LSTM(inputSize, hiddenSize, 1, batch_first=True),
extract_tensor(),
nn.Linear(hiddenSize, classe_n)
)
According to the LSTM cell documentation the outputs parameter has a shape of (seq_len, batch, hidden_size * num_directions) so you can easily take the last element of the sequence in this way:
rnn = nn.LSTM(10, 20, 2)
input = Variable(torch.randn(5, 3, 10))
h0 = Variable(torch.randn(2, 3, 20))
c0 = Variable(torch.randn(2, 3, 20))
output, hn = rnn(input, (h0, c0))
print(output[-1]) # last element
Tensor manipulation and Neural networks design in PyTorch is incredibly easier than in Torch so you rarely have to use containers. In fact, as stated in the tutorial PyTorch for former Torch users PyTorch is built around Autograd so you don't need anymore to worry about containers. However, if you want to use your old Lua Torch code you can have a look to the Legacy package.
As far as I'm concerned there's nothing like a SplitTable or a SelectTable in PyTorch. That said, you are allowed to concatenate an arbitrary number of modules or blocks within a single architecture, and you can use this property to retrieve the output of a certain layer. Let's make it more clear with a simple example.
Suppose I want to build a simple two-layer MLP and retrieve the output of each layer. I can build a custom class inheriting from nn.Module:
class MyMLP(nn.Module):
def __init__(self, in_channels, out_channels_1, out_channels_2):
# first of all, calling base class constructor
super().__init__()
# now I can build my modular network
self.block1 = nn.Linear(in_channels, out_channels_1)
self.block2 = nn.Linear(out_channels_1, out_channels_2)
# you MUST implement a forward(input) method whenever inheriting from nn.Module
def forward(x):
# first_out will now be your output of the first block
first_out = self.block1(x)
x = self.block2(first_out)
# by returning both x and first_out, you can now access the first layer's output
return x, first_out
In your main file you can now declare the custom architecture and use it:
from myFile import MyMLP
import numpy as np
in_ch = out_ch_1 = out_ch_2 = 64
# some fake input instance
x = np.random.rand(in_ch)
my_mlp = MyMLP(in_ch, out_ch_1, out_ch_2)
# get your outputs
final_out, first_layer_out = my_mlp(x)
Moreover, you could concatenate two MyMLP in a more complex model definition and retrieve the output of each one in a similar way.
I hope this is enough to clarify, but in case you have more questions, please feel free to ask, since I may have omitted something.