This is my code. I tried to build an 11-layer VGG-style network with a mix of ReLU and ELU activations and many regularizers on kernels and activities. The result is really confusing: at the 10th epoch, the loss on both train and val has decreased from 2000 to 1.5, but the accuracy on both train and val has stayed at 50%. Can somebody explain this to me?
# VGG 11
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Activation
from keras.layers.advanced_activations import ELU
from keras.regularizers import l2
from keras.optimizers import Adam

model = Sequential()
model.add(Conv2D(64, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
input_shape=(1, 96, 96), activation='relu'))
model.add(Conv2D(64, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001),activity_regularizer=l2(0.0001),
activation='relu'))
model.add(Conv2D(128, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(Conv2D(256, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(512, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(Conv2D(512, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(Conv2D(512, (3, 3), kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),
activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# flatten the convolutional feature maps so they can be fed to the fully connected layers
model.add(Flatten())
model.add(Dense(2048, kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.01)))
model.add(ELU(alpha=1.0))
model.add(Dropout(0.5))
model.add(Dense(1024, kernel_initializer='he_normal',
kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.01)))
model.add(ELU(alpha=1.0))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('softmax'))
adammo = Adam(lr=0.0008, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy', optimizer=adammo, metrics=['accuracy'])
hist = model.fit(X_train, y_train, batch_size=48, epochs=20, verbose=1, validation_data=(X_val, y_val))
This is not a defect; in fact, it is entirely possible!
Categorical cross-entropy loss does not require accuracy to go up while the loss goes down. This is not a bug in Keras or Theano, but rather a network or data problem.
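To see why, here is a small numeric sketch with made-up probabilities (not your data): the loss keeps falling because the model becomes more confident in the predictions it already makes, while the predicted classes, and therefore the accuracy, stay exactly the same.
import numpy as np

# made-up predicted probabilities of class 1 for three samples with true labels [1, 0, 1]
y_true = np.array([1, 0, 1])

def mean_crossentropy(p_class1, y):
    # mean cross-entropy given the predicted probability of class 1
    p_true = np.where(y == 1, p_class1, 1.0 - p_class1)
    return -np.mean(np.log(p_true))

def acc(p_class1, y):
    return np.mean((p_class1 > 0.5).astype(int) == y)

early = np.array([0.55, 0.45, 0.40])   # hesitant predictions, 2 of 3 correct
late = np.array([0.95, 0.05, 0.45])    # more confident, same predicted class for every sample

print(mean_crossentropy(early, y_true), acc(early, y_true))   # ~0.70 loss, 0.667 accuracy
print(mean_crossentropy(late, y_true), acc(late, y_true))     # ~0.30 loss, still 0.667 accuracy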
This network structure is probably over-complicated for what you are trying to do. You should remove some of the regularization, use only ReLU, use fewer layers, use the standard Adam optimizer, a larger batch size, etc. Try first using one of Keras' default models, like VGG16.
If you want to see that implementation in order to adapt it into a VGG11-like structure, it is here:
def VGG_16(weights_path=None):
    model = Sequential()
    model.add(ZeroPadding2D((1,1),input_shape=(3,224,224)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))
    if weights_path:
        model.load_weights(weights_path)
    return model
You can see it is much simpler. It only uses ReLU (which has become popular these days), has no regularization, a different convolution structure, etc. Modify that to your needs!
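For reference, recent Keras releases also ship a ready-made VGG16 under keras.applications. A minimal sketch of using it as an untrained base for a 2-class problem (this assumes a channels-last, 3-channel 96x96 input, which is not exactly your data layout) could look like this:
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model

# untrained VGG16 convolutional base plus a small 2-class head (a sketch, not your exact setup)
base = VGG16(weights=None, include_top=False, input_shape=(96, 96, 3))
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
out = Dense(2, activation='softmax')(x)
model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])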
Related
I'm building a CNN for detection of pneumonia in chest X-ray images. The problem I have is that when training, the RAM usage just blows up: I'm using a Kaggle Notebook, and the RAM usage slowly grows until it hits the 30 GB limit and the notebook shuts down, even before printing a single epoch.
I tried casting the data from float64 to float32, but with no major improvement; it only slows the rate at which it reaches the 30 GB limit.
I don't really know where the problem is, because I've seen similar code trained on the same dataset in Kaggle Notebooks, so I know the data itself isn't the problem. My dataset has 5216 training images and just 16 validation images. Here's the code:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator

image_size = (128, 128)
data_train = tf.keras.utils.image_dataset_from_directory(train_path, color_mode ='grayscale', image_size = image_size)
data_train = data_train.map(lambda x, y: (tf.cast(x, tf.float32), y))
data_train_iterator = data_train.as_numpy_iterator()
batch_data_train = data_train_iterator.next()
data_val = tf.keras.utils.image_dataset_from_directory(val_path,color_mode ='grayscale', image_size = image_size)
data_val = data_val.map(lambda x, y: (tf.cast(x, tf.float32), y))
data_val_iterator = data_val.as_numpy_iterator()
batch_data_val = data_val_iterator.next()
data_augmentation = ImageDataGenerator(
rescale = 1./255,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=False,
zoom_range=0.2,
fill_mode='nearest'
)
data_aug_train = data_augmentation.flow_from_directory(train_path, target_size=image_size, batch_size=32, class_mode='categorical')
data_aug_val = data_augmentation.flow_from_directory(val_path, target_size=image_size, batch_size=32, class_mode='categorical')
def pneumonia_model():
    model = Sequential()
    model.add(Conv2D(16, (3, 3), activation='relu', padding="same", input_shape=(128, 128, 1)))
    model.add(BatchNormalization())
    model.add(Conv2D(16, (3, 3), padding="same", activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(32, (3, 3), activation='relu', padding="same", input_shape=(128, 128, 1)))
    model.add(BatchNormalization())
    model.add(Conv2D(32, (3, 3), padding="same", activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
    model.add(BatchNormalization())
    model.add(Conv2D(64, (3, 3), padding="same", activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(96, (3, 3), dilation_rate=(2, 2), activation='relu', padding="same"))
    model.add(BatchNormalization())
    model.add(Conv2D(96, (3, 3), padding="valid", activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(128, (3, 3), dilation_rate=(2, 2), activation='relu', padding="same"))
    model.add(BatchNormalization())
    model.add(Conv2D(128, (3, 3), padding="valid", activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(2, activation='softmax'))
    print(model.summary())
    return model
model = pneumonia_model()
model.compile('adam', loss=tf.losses.BinaryCrossentropy(), metrics=['accuracy'])
hist = model.fit(data_aug_train, epochs=20, validation_data=tuple(data_aug_val))
I used a 3D U-Net with residual blocks to segment a CT image, with an input of torch.Size([1, 1, 96, 176, 176]), but it throws the following error:
RuntimeError: Sizes of tensors must match except in dimension 2. Got 55 and 54 (The offending index is 0)
So I traced it back and found that the error comes from
outputs = self.decoder_stage2(torch.cat([short_range6, long_range3], dim=1)) + short_range6
short_range6 has torch.Size([1, 64, 24, 55, 40]) while long_range3 has torch.Size([1, 128, 24, 54, 40]). I think this is because some dimension is not a power of 2, but I cannot find where to modify it.
Below is the complete structure of the network. Thanks a lot for any help!
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResUNet(nn.Module):
    def __init__(self, in_channel=1, out_channel=2, training=True):
        super().__init__()
        self.training = training
        self.dorp_rate = 0.2
        self.encoder_stage1 = nn.Sequential(
            nn.Conv3d(in_channel, 16, 3, 1, padding=1),
            nn.PReLU(16),
            nn.Conv3d(16, 16, 3, 1, padding=1),
            nn.PReLU(16),
        )
        self.encoder_stage2 = nn.Sequential(
            nn.Conv3d(32, 32, 3, 1, padding=1),
            nn.PReLU(32),
            nn.Conv3d(32, 32, 3, 1, padding=1),
            nn.PReLU(32),
            nn.Conv3d(32, 32, 3, 1, padding=1),
            nn.PReLU(32),
        )
        self.encoder_stage3 = nn.Sequential(
            nn.Conv3d(64, 64, 3, 1, padding=1),
            nn.PReLU(64),
            nn.Conv3d(64, 64, 3, 1, padding=2, dilation=2),
            nn.PReLU(64),
            nn.Conv3d(64, 64, 3, 1, padding=4, dilation=4),
            nn.PReLU(64),
        )
        self.encoder_stage4 = nn.Sequential(
            nn.Conv3d(128, 128, 3, 1, padding=3, dilation=3),
            nn.PReLU(128),
            nn.Conv3d(128, 128, 3, 1, padding=4, dilation=4),
            nn.PReLU(128),
            nn.Conv3d(128, 128, 3, 1, padding=5, dilation=5),
            nn.PReLU(128),
        )
        self.decoder_stage1 = nn.Sequential(
            nn.Conv3d(128, 256, 3, 1, padding=1),
            nn.PReLU(256),
            nn.Conv3d(256, 256, 3, 1, padding=1),
            nn.PReLU(256),
            nn.Conv3d(256, 256, 3, 1, padding=1),
            nn.PReLU(256),
        )
        self.decoder_stage2 = nn.Sequential(
            nn.Conv3d(128 + 64, 128, 3, 1, padding=1),
            nn.PReLU(128),
            nn.Conv3d(128, 128, 3, 1, padding=1),
            nn.PReLU(128),
            nn.Conv3d(128, 128, 3, 1, padding=1),
            nn.PReLU(128),
        )
        self.decoder_stage3 = nn.Sequential(
            nn.Conv3d(64 + 32, 64, 3, 1, padding=1),
            nn.PReLU(64),
            nn.Conv3d(64, 64, 3, 1, padding=1),
            nn.PReLU(64),
            nn.Conv3d(64, 64, 3, 1, padding=1),
            nn.PReLU(64),
        )
        self.decoder_stage4 = nn.Sequential(
            nn.Conv3d(32 + 16, 32, 3, 1, padding=1),
            nn.PReLU(32),
            nn.Conv3d(32, 32, 3, 1, padding=1),
            nn.PReLU(32),
        )
        self.down_conv1 = nn.Sequential(
            nn.Conv3d(16, 32, 2, 2),
            nn.PReLU(32)
        )
        self.down_conv2 = nn.Sequential(
            nn.Conv3d(32, 64, 2, 2),
            nn.PReLU(64)
        )
        self.down_conv3 = nn.Sequential(
            nn.Conv3d(64, 128, 2, 2),
            nn.PReLU(128)
        )
        self.down_conv4 = nn.Sequential(
            nn.Conv3d(128, 256, 3, 1, padding=1),
            nn.PReLU(256)
        )
        self.up_conv2 = nn.Sequential(
            nn.ConvTranspose3d(256, 128, 2, 2),
            nn.PReLU(128)
        )
        self.up_conv3 = nn.Sequential(
            nn.ConvTranspose3d(128, 64, 2, 2),
            nn.PReLU(64)
        )
        self.up_conv4 = nn.Sequential(
            nn.ConvTranspose3d(64, 32, 2, 2),
            nn.PReLU(32)
        )
        # 256*256
        self.map4 = nn.Sequential(
            nn.Conv3d(32, out_channel, 1, 1),
            nn.Upsample(scale_factor=(1, 1, 1), mode='trilinear', align_corners=False),
            nn.Softmax(dim=1)
        )
        # 128*128
        self.map3 = nn.Sequential(
            nn.Conv3d(64, out_channel, 1, 1),
            nn.Upsample(scale_factor=(2, 2, 2), mode='trilinear', align_corners=False),
            nn.Softmax(dim=1)
        )
        # 64*64
        self.map2 = nn.Sequential(
            nn.Conv3d(128, out_channel, 1, 1),
            nn.Upsample(scale_factor=(4, 4, 4), mode='trilinear', align_corners=False),
            nn.Softmax(dim=1)
        )
        # 32*32
        self.map1 = nn.Sequential(
            nn.Conv3d(256, out_channel, 1, 1),
            nn.Upsample(scale_factor=(8, 8, 8), mode='trilinear', align_corners=False),
            nn.Softmax(dim=1)
        )

    def forward(self, inputs):
        long_range1 = self.encoder_stage1(inputs) + inputs
        short_range1 = self.down_conv1(long_range1)
        long_range2 = self.encoder_stage2(short_range1) + short_range1
        long_range2 = F.dropout(long_range2, self.dorp_rate, self.training)
        short_range2 = self.down_conv2(long_range2)
        long_range3 = self.encoder_stage3(short_range2) + short_range2
        long_range3 = F.dropout(long_range3, self.dorp_rate, self.training)
        short_range3 = self.down_conv3(long_range3)
        long_range4 = self.encoder_stage4(short_range3) + short_range3
        long_range4 = F.dropout(long_range4, self.dorp_rate, self.training)
        short_range4 = self.down_conv4(long_range4)
        outputs = self.decoder_stage1(long_range4) + short_range4
        outputs = F.dropout(outputs, self.dorp_rate, self.training)
        output1 = self.map1(outputs)
        short_range6 = self.up_conv2(outputs)
        outputs = self.decoder_stage2(torch.cat([short_range6, long_range3], dim=1)) + short_range6
        outputs = F.dropout(outputs, self.dorp_rate, self.training)
        output2 = self.map2(outputs)
        short_range7 = self.up_conv3(outputs)
        outputs = self.decoder_stage3(torch.cat([short_range7, long_range2], dim=1)) + short_range7
        outputs = F.dropout(outputs, self.dorp_rate, self.training)
        output3 = self.map3(outputs)
        short_range8 = self.up_conv4(outputs)
        outputs = self.decoder_stage4(torch.cat([short_range8, long_range1], dim=1)) + short_range8
        output4 = self.map4(outputs)
        if self.training is True:
            return output1, output2, output3, output4
        else:
            return output4
You can pad your image's dimensions to be a multiple of 32. By doing this, you won't have to change the 3D U-Net's parameters.
Here is a simple snippet to show the way.
import math
import torch.nn as nn

# I assume that you named your input image tensor img
padding1_mult = math.floor(img.shape[3] / 32) + 1
padding2_mult = math.floor(img.shape[4] / 32) + 1
pad1 = (32 * padding1_mult) - img.shape[3]
pad2 = (32 * padding2_mult) - img.shape[4]
# ReplicationPad3d takes (W_left, W_right, H_top, H_bottom, D_front, D_back) for a 5D tensor
padding = nn.ReplicationPad3d((0, pad2, pad1, 0, 0, 0))
img = padding(img)
After this operation, your image shape should be torch.Size([1, 1, 96, 192, 192]).
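One caveat: if a dimension is already a multiple of 32, floor(x / 32) + 1 still adds a full extra 32 of padding. A variant of the same idea using math.ceil (just a sketch, continuing from the snippet above) avoids that:
# the pads become 0 when the dimension is already a multiple of 32
pad1 = 32 * math.ceil(img.shape[3] / 32) - img.shape[3]
pad2 = 32 * math.ceil(img.shape[4] / 32) - img.shape[4]
img = nn.ReplicationPad3d((0, pad2, pad1, 0, 0, 0))(img)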
I have been trying to train a DCGAN implementation in PyTorch. I am getting the following error:
RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (4 x 4). Kernel size can't be greater than actual input size
I have modified the discriminator network defined in this implementation.
The modified discriminator network is as follows:
self.main = nn.Sequential(
    nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 2),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 4),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 8),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 8, ndf * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 8),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 8, ndf * 16, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 16),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 16, ndf * 16, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 16),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 16, 1, 4, 1, 0, bias=False),
    nn.Sigmoid()
)
Thanks in advance!
Here are my blob shapes:
data 4096 4.10e+03 (1, 2, 1, 2048)
Convolution1 130944 1.31e+05 (1, 64, 1, 2046)
ReLU1 130944 1.31e+05 (1, 64, 1, 2046)
Convolution2 130816 1.31e+05 (1, 64, 1, 2044)
ReLU2 130816 1.31e+05 (1, 64, 1, 2044)
ReLU2_ReLU2_0_split_0 130816 1.31e+05 (1, 64, 1, 2044)
ReLU2_ReLU2_0_split_1 130816 1.31e+05 (1, 64, 1, 2044)
Pooling1 65408 6.54e+04 (1, 64, 1, 1022)
Convolution3 130560 1.31e+05 (1, 128, 1, 1020)
ReLU3 130560 1.31e+05 (1, 128, 1, 1020)
Convolution4 130304 1.30e+05 (1, 128, 1, 1018)
ReLU4 130304 1.30e+05 (1, 128, 1, 1018)
ReLU4_ReLU4_0_split_0 130304 1.30e+05 (1, 128, 1, 1018)
ReLU4_ReLU4_0_split_1 130304 1.30e+05 (1, 128, 1, 1018)
Pooling2 65152 6.52e+04 (1, 128, 1, 509)
What are the two lines "ReLU2_ReLU2_0_split_0" and "ReLU2_ReLU2_0_split_1"? Where do they come from?
Your ReLU layer's output is used as a "bottom" by two layers. Therefore, Caffe automatically adds a "Split" layer that creates two copies of the ReLU output and feeds each copy to one of the top layers. These two copies are the blobs named ReLU2_ReLU2_0_split_0 and ReLU2_ReLU2_0_split_1.
There is a List like this:
List(List(9, 1, 9),
List(9, 1, 8),
List(8, 2, 7),
List(8, 2, 6),
List(7, 3, 5),
List(7, 3, 4))
I want to group the list by its second element, like this:
List(List(List(9, 1, 9), List(9, 1, 8)),   // second elem is "1"
     List(List(8, 2, 7), List(8, 2, 6)),   // second elem is "2"
     List(List(7, 3, 5), List(7, 3, 4)))   // second elem is "3"
I can do it with a for loop, but it's too verbose.
I also tried to use span, but I failed to get the second element.
What is the best way?
val l = List(List(9, 1, 9),
List(9, 1, 8),
List(8, 2, 7),
List(8, 2, 6),
List(7, 3, 5),
List(7, 3, 4))
l.groupBy { case first :: second :: tail => second }.values.toList
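Note that the pattern match only covers lists with at least two elements and will throw a MatchError on shorter ones; if every inner list is guaranteed to have a second element, l.groupBy(_(1)).values.toList is an equivalent, shorter form. In both cases the order of the resulting groups is not guaranteed, since groupBy returns a Map.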