I am trying to implement the architecture I have attached.
The output of my DataLoader has size: torch.Size([128, 192, 9, 1])
I am using a batch size of 128
View just reshapes the output of the dense layer
model = nn.Sequential(nn.Conv2d(192, 32, 3),
nn.ReLU(),
nn.Conv2d(32, 64, 3),
nn.ReLU(),
nn.Conv2d(64, 64, 3),
nn.MaxPool2d(2),
nn.Dropout(0.25),
nn.Flatten(),
nn.Linear(5952, 128),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(128, 126),
View((6, 21)),
nn.Softmax(dim=1))
this is the architecture I currently have and I don't know if my input to the Conv2d are correct
I keep getting errors with my dimensions and kernel sizes. I am unsure how to proceed.
CNN Architecture
Related
Loss Output
Epoch 996 Output
I have been working on a deep convolutional generative adversarial network (DCGAN) that generates pictures of cats (RGB, 64x64 pixels). It seems to learn rather quickly, as it is clear the images are cats by around the 300th epoch. For some reason, even after 1000 epochs, they have a good amount of blur on them, which is preventing them from being their full resolution. I am almost certain the issue is in my generator network structure, so I have attached it below.
model = tf.keras.Sequential()
model.add(layers.Dense(8*8*256, use_bias=False, input_shape=(100,)))
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
model.add(layers.Reshape((8, 8, 256)))
assert model.output_shape == (None, 8, 8, 256)
model.add(layers.Conv2DTranspose(256, (5, 5), strides=(1, 1), padding='same', use_bias=False))
assert model.output_shape == (None, 8, 8, 256)
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
model.add(layers.Conv2DTranspose(128, (5, 5), strides=(2, 2), padding='same', use_bias=False))
assert model.output_shape == (None, 16, 16, 128)
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
assert model.output_shape == (None, 32, 32, 64)
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
# activation is tanh to not squash out the negatives thats we've been keeping through leakyReLU
model.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
print("model.output_shape=", model.output_shape)
assert model.output_shape == (None, 64, 64, 3)[[enter image description here](https://i.stack.imgur.com/eICCQ.png)](https://i.stack.imgur.com/pfMY5.png)
I suspect the problems results from the artifacts generated due to my use of Conv2DTranspose layers, but is it worth it to switch to Upscaling followed by a Convolutional layer? I feel like it would do less learning this way.
I am trying to train the VGG16 model code, but the loss is not optimized and seems that model's parameters are not updated.
here is the model :
import torch
import torch.nn as nn
import math
import torch.nn.functional as F
from utils import AvgPoolConv
cfg = {
'VGG11': [16, 'M', 32, 'M', 64, 64, 'M', 128, 128, 'M', 128, 128, 'M'],
'VGG13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
'VGG16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
'VGG19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],}
class VGG(nn.Module):
def __init__(self, vgg_name, use_bn, num_class=100):
super(VGG, self).__init__()
self.features = self._make_layers(cfg[vgg_name], use_bn)
self.classifier = nn.Sequential(
nn.Flatten(),
nn.Linear(512,4096),
nn.ReLU(inplace=True),
nn.Dropout(p=0.5),
nn.Linear(4096,4096),
nn.ReLU(inplace=True),
nn.Dropout(p=0.5),
nn.Linear(4096, num_class)
)
#self.classifier = nn.Linear(512, num_class)
for m in self.modules():
if isinstance(m, nn.Conv2d):
n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
m.weight.data.normal_(0, math.sqrt(2. / n))
elif isinstance(m, nn.BatchNorm2d):
m.weight.data.fill_(1)
m.bias.data.zero_()
elif isinstance(m, nn.Linear):
n = m.weight.size(1)
m.weight.data.normal_(0, 1.0/float(n))
m.bias.data.zero_()
def forward(self, x):
out = self.features(x)
out = self.classifier(out)
return out
def _make_layers(self, cfg, use_bn=True):
layers = []
in_channels = 3
for x in cfg:
if x == 'M':
layers += [nn.AvgPool2d(2)]
#layers += [AvgPoolConv(kernel_size=2, stride=2, input_channel=in_channels)]
else:
layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1),
nn.BatchNorm2d(x) if use_bn else nn.Dropout(0.25),
nn.ReLU(inplace=True)]
in_channels = x
#layers += [nn.AvgPool2d(kernel_size=1, stride=1)]
return nn.Sequential(*layers)
but if I delete the first 2 FC layers from the classifier as shown below, the model is trained and loss can be optimized ??
self.features = self._make_layers(cfg[vgg_name], use_bn)
self.classifier = nn.Linear(512, num_class)
Why this happens?
First, it would be good to verify if the parameters are really not updated or just that the change is small.
Different architectures might require different tuning (learning rate, weight decay if you use it etc.). A good thing to try when debugging is a test "can I overfit it"; use a single batch (or a single sample even) and check if you can get it to 0; you might need to tweak optimization parameters mentioned before.
Assuming everything is correct and the gradient flows, I'd say - tune the learning rate and try adding batch normalization between your linear and relu layers (should make the training much faster).
input_shape=(100,100,6)
input_tensor=keras.Input(input_shape)
model.add(Conv2D(32, 3, padding='same', activation='relu', input_shape=input_shape))
model.add((Conv1D(filters=32, kernel_size=2, activation='relu', padding='same')))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(64, 3, padding='same', activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(128, 3, padding='same', activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
training_set = train_datagen.flow_from_directory('/content/gdrive/My Drive/Data/training_set',
target_size = (128, 128),
batch_size = 32,
class_mode = 'categorical')
history=model.fit(training_set,
steps_per_epoch=nb_train_images//batch_size,
epochs=100,
validation_data=test_set,
validation_steps=nb_test_images//batch_size,
callbacks=callbacks)
history=model.fit(training_set,
steps_per_epoch=nb_train_images//batch_size,
epochs=40,
validation_data=test_set,
validation_steps=nb_test_images//batch_size,
callbacks=callbacks)
I have 6 different types of set to classify. where am i going wrong? i have add the input shape in above where i mentioned 1001006 ,can someone help to understand this issue.
This was happening to me too. The following is my code. The way that I fixed it was that instead of having some different input shape, I just made the input shape the training data's image shape.
train = image_gen.flow_from_directory(
train_path,
target_size=(500, 500
),
color_mode='grayscale',
class_mode='binary',
batch_size=16
)
#and then later, when I build the model
model.add(Conv2D(filters[0], (5, 5), padding='same',kernel_regularizer=12(0.001), activation='relu', input_shape=train.image_shape))
#the important part is the input_shape=train.image_shape
I've created a CNN designed to recognize objects.
from keras.preprocessing.image import img_to_array, load_img
img = load_img('newimage.jpg')
x = img_to_array(img)
x = x.reshape( (1,) + x.shape )
scores = model.predict(x, verbose=1)
print(scores)
However I'm getting:
expected convolution2d_input_1 to have shape (None, 3, 108, 192) but got array with shape (1, 3, 192, 108)
My model:
def create_model():
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
return model
I've looked at related answers and the documentation, but at a loss as to how to reshape the array to match what's expected?
I guess the problem is with setting up the image width and height. As the error says:
expected convolution2d_input_1 to have shape (None, 3, 108, 192) # expected width = 108 and height = 192
but got array with shape (1, 3, 192, 108) # width = 192, height = 108
Update: I tested your code with a small change and it worked!
I am giving just changed lines:
img_width, img_height = 960, 717
model.add(Convolution2D(32, 3, 3, input_shape=(img_height, img_width, 3)))
This is the main change - input_shape=(img_height, img_width, 3)
The image i used to run this code was of width = 960 and height = 717. I have updated my previous answer as some part of the answer was wrong! Sorry for that.
I'm trying to build an autoencoder using Keras, based on [this example][1] from the docs. Because my data is large, I'd like to use a generator to avoid loading it into memory.
My model looks like:
model = Sequential()
model.add(Convolution2D(16, 3, 3, activation='relu', border_mode='same', input_shape=(3, 256, 256)))
model.add(MaxPooling2D((2, 2), border_mode='same'))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(MaxPooling2D((2, 2), border_mode='same'))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(MaxPooling2D((2, 2), border_mode='same'))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(UpSampling2D((2, 2)))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(UpSampling2D((2, 2)))
model.add(Convolution2D(16, 3, 3, activation='relu'))
model.add(UpSampling2D((2, 2)))
model.add(Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same'))
model.compile(optimizer='adadelta', loss='binary_crossentropy')
My generator:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory('IMAGE DIRECTORY', color_mode='rgb', class_mode='binary', batch_size=32, target_size=(256, 256))
And then fitting the model:
model.fit_generator(
train_generator,
samples_per_epoch=1,
nb_epoch=1,
verbose=1,
)
I'm getting this error:
Exception: Error when checking model target: expected convolution2d_76 to have 4 dimensions, but got array with shape (32, 1)
That looks like the size of my batch rather than a sample. What am I doing wrong?
The error is most likely due to the class_mode='binary'. It makes the generator produce binary classes, so the output has shape (batch_size, 1), while your model produces a four dimensional output (since the last layer is a convolution).
I guess that you want your label to be the image itself. Based on the source of the flow_from_directory and the DirectoryIterator it uses, it is impossible to do by just changing the class_mode. A possible solution would be along the lines of:
train_generator_ = train_datagen.flow_from_directory('IMAGE DIRECTORY', color_mode='rgb', class_mode=None, batch_size=32, target_size=(256, 256))
def train_generator():
for x in train_iterator_:
yield x, x
Note that I set class_mode to None. It makes the generator to return just the image instead of tuple(image, label). I then define a new generator, that returns the image as both the input and the label.