How can I get my TensorFlow GAN output to not be fuzzy? - deep-learning

Loss Output
Epoch 996 Output
I have been working on a deep convolutional generative adversarial network (DCGAN) that generates pictures of cats (RGB, 64x64 pixels). It seems to learn rather quickly; it is clear the images are cats by around the 300th epoch. For some reason, though, even after 1000 epochs the images still have a good amount of blur on them, which keeps them from reaching full sharpness. I am almost certain the issue is in my generator network structure, so I have attached it below.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential()
model.add(layers.Dense(8*8*256, use_bias=False, input_shape=(100,)))
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
model.add(layers.Reshape((8, 8, 256)))
assert model.output_shape == (None, 8, 8, 256)
model.add(layers.Conv2DTranspose(256, (5, 5), strides=(1, 1), padding='same', use_bias=False))
assert model.output_shape == (None, 8, 8, 256)
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
model.add(layers.Conv2DTranspose(128, (5, 5), strides=(2, 2), padding='same', use_bias=False))
assert model.output_shape == (None, 16, 16, 128)
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
assert model.output_shape == (None, 32, 32, 64)
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())
# activation is tanh so we don't squash the negative values we've been preserving through LeakyReLU
model.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
print("model.output_shape=", model.output_shape)
assert model.output_shape == (None, 64, 64, 3)
I suspect the problem results from the checkerboard artifacts that my Conv2DTranspose layers generate, but is it worth switching to upsampling followed by a convolutional layer? I feel like the network would do less learning that way.
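For reference, here is a minimal sketch of that alternative (assuming the same layer widths as the generator above, not code from the question): nearest-neighbour UpSampling2D followed by a stride-1 Conv2D, which avoids the checkerboard pattern that strided Conv2DTranspose layers can introduce. For example, the 8x8 -> 16x16 block could become:
model.add(layers.UpSampling2D(size=(2, 2)))  # 8x8 -> 16x16, no learned upsampling
model.add(layers.Conv2D(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))  # the learning happens here
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU())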

Related

Pytorch CNN Input for Guitar Tab CNN

I am trying to implement the architecture I have attached.
The output of my DataLoader has size torch.Size([128, 192, 9, 1]) (I am using a batch size of 128). View is a custom layer that just reshapes the output of the dense layer.
model = nn.Sequential(nn.Conv2d(192, 32, 3),
                      nn.ReLU(),
                      nn.Conv2d(32, 64, 3),
                      nn.ReLU(),
                      nn.Conv2d(64, 64, 3),
                      nn.MaxPool2d(2),
                      nn.Dropout(0.25),
                      nn.Flatten(),
                      nn.Linear(5952, 128),
                      nn.ReLU(),
                      nn.Dropout(0.5),
                      nn.Linear(128, 126),
                      View((6, 21)),
                      nn.Softmax(dim=1))
This is the architecture I currently have, and I don't know if my inputs to the Conv2d layers are correct. I keep getting errors about my dimensions and kernel sizes, and I am unsure how to proceed.
CNN Architecture
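One way to see where the dimensions break (a small sketch, not code from the question) is to push a dummy batch with the DataLoader's reported shape through the layers one at a time and print the resulting shapes:
import torch
import torch.nn as nn

x = torch.randn(128, 192, 9, 1)  # batch shape reported by the DataLoader
layers_to_check = [nn.Conv2d(192, 32, 3), nn.ReLU(), nn.Conv2d(32, 64, 3), nn.ReLU(), nn.Conv2d(64, 64, 3)]
for layer in layers_to_check:
    try:
        x = layer(x)
        print(type(layer).__name__, tuple(x.shape))
    except RuntimeError as err:
        # A 3x3 kernel cannot slide over a 9x1 spatial map, so the first Conv2d fails here.
        print(type(layer).__name__, "failed:", err)
        break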

InvalidArgumentError: input depth must be evenly divisible by filter depth: 3 vs 6

input_shape = (100, 100, 6)
input_tensor = keras.Input(input_shape)
model = Sequential()
model.add(Conv2D(32, 3, padding='same', activation='relu', input_shape=input_shape))
model.add((Conv1D(filters=32, kernel_size=2, activation='relu', padding='same')))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(64, 3, padding='same', activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(128, 3, padding='same', activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
training_set = train_datagen.flow_from_directory('/content/gdrive/My Drive/Data/training_set',
                                                 target_size=(128, 128),
                                                 batch_size=32,
                                                 class_mode='categorical')
history = model.fit(training_set,
                    steps_per_epoch=nb_train_images//batch_size,
                    epochs=100,
                    validation_data=test_set,
                    validation_steps=nb_test_images//batch_size,
                    callbacks=callbacks)
history = model.fit(training_set,
                    steps_per_epoch=nb_train_images//batch_size,
                    epochs=40,
                    validation_data=test_set,
                    validation_steps=nb_test_images//batch_size,
                    callbacks=callbacks)
I have 6 different classes to classify. Where am I going wrong? I have added the input shape above, where I mentioned (100, 100, 6). Can someone help me understand this issue?
This was happening to me too. The way I fixed it was to stop hard-coding a different input shape and instead use the training data's image shape as the model's input shape. The following is my code.
train = image_gen.flow_from_directory(
    train_path,
    target_size=(500, 500),
    color_mode='grayscale',
    class_mode='binary',
    batch_size=16
)
#and then later, when I build the model
model.add(Conv2D(filters[0], (5, 5), padding='same', kernel_regularizer=l2(0.001), activation='relu', input_shape=train.image_shape))
#the important part is the input_shape=train.image_shape
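A quick sanity check (a sketch reusing the names above) is to compare what the generator actually yields with the shape the first Conv2D expects:
x_batch, y_batch = next(train)      # one batch from flow_from_directory
print(x_batch.shape)                # e.g. (16, 500, 500, 1) for grayscale at target_size=(500, 500)
print(train.image_shape)            # (500, 500, 1), the value passed as input_shape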

Low accuracy of a Deep CNN

I have generated a dataset using EMNIST and mathematical symbols that has one character per image or two characters per image. There are 72 possible characters in the dataset. Each image is sized at 28x56 (h x w).
Ex: a single-character image and a double-character image (sample images attached).
There are 5256 (72*73) possible classes considering all the combinations of the characters. This is from 72 possible characters in the first part and 73 possible characters(including a blank) in the second part of the label. I have made sure that each class has around 540-600 images. The total dataset has around 3 million images.
The CNN models I have tried:
input_shape = (28, 56, 1)
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), padding='same', use_bias=False,
                 input_shape=input_shape))
model.add(Activation('relu'))
model.add(Conv2D(filters=32, kernel_size=(3, 3), padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(.2))
model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same', use_bias=False))
model.add(Activation('relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(.2))
model.add(Conv2D(filters=128, kernel_size=(3, 3), padding='same', use_bias=False))
model.add(Activation('relu'))
model.add(Conv2D(filters=128, kernel_size=(3, 3), padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(.3))
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(4096, activation='relu'))
model.add(Dense(units=5256, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='sparse_categorical_crossentropy', optimizer=sgd, metrics=['sparse_categorical_accuracy'])
I even tried a model with two Dense layers of 10512 units as well. I was only able to achieve an accuracy of around 66%. I have tried various batch sizes (32, 64, 256) and the Adam optimizer with various learning rates as well. It would be great if someone could point out what I am doing wrong here, or give some tips on increasing the accuracy.
Following Ruslan S.'s recommendation, I trained the model on ResNet50 (retraining the whole network, not just the last layers). I was able to achieve a significant improvement in accuracy, reaching around 96%.
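For reference, a rough sketch of what that ResNet50 retraining could look like (not the answerer's actual code; it assumes the 28x56 grayscale images are resized and stacked to 3 channels so they meet ResNet50's 32-pixel minimum side length):
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

# Randomly initialised ResNet50 backbone, trained end to end (no frozen layers).
base = ResNet50(weights=None, include_top=False, input_shape=(56, 112, 3))
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5256, activation='softmax'),  # 72 * 73 combined-label classes
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])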

Reshaping image data in Keras to match CNN requirements

I've created a CNN designed to recognize objects.
from keras.preprocessing.image import img_to_array, load_img
img = load_img('newimage.jpg')
x = img_to_array(img)
x = x.reshape( (1,) + x.shape )
scores = model.predict(x, verbose=1)
print(scores)
However I'm getting:
expected convolution2d_input_1 to have shape (None, 3, 108, 192) but got array with shape (1, 3, 192, 108)
My model:
def create_model():
    model = Sequential()
    model.add(Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Convolution2D(32, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Convolution2D(64, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
I've looked at related answers and the documentation, but I'm at a loss as to how to reshape the array to match what's expected.
I guess the problem is with setting up the image width and height. As the error says:
expected convolution2d_input_1 to have shape (None, 3, 108, 192) # expected width = 108 and height = 192
but got array with shape (1, 3, 192, 108) # width = 192, height = 108
Update: I tested your code with a small change and it worked!
Here are just the changed lines:
img_width, img_height = 960, 717
model.add(Convolution2D(32, 3, 3, input_shape=(img_height, img_width, 3)))
This is the main change - input_shape=(img_height, img_width, 3)
The image I used to run this code was of width 960 and height 717. I have updated my previous answer, as part of it was wrong. Sorry for that.
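Alternatively (a small sketch, assuming the same file and model as above), letting load_img resize the image to the dimensions the model expects avoids any manual axis shuffling:
from keras.preprocessing.image import img_to_array, load_img
import numpy as np

img = load_img('newimage.jpg', target_size=(img_height, img_width))  # resize on load
x = img_to_array(img)             # (img_height, img_width, 3), channels last
x = np.expand_dims(x, axis=0)     # add the batch dimension
scores = model.predict(x, verbose=1)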

Keras ImageDataGenerator not working as expected

I'm trying to build an autoencoder using Keras, based on this example from the docs. Because my data is large, I'd like to use a generator to avoid loading it all into memory.
My model looks like:
model = Sequential()
model.add(Convolution2D(16, 3, 3, activation='relu', border_mode='same', input_shape=(3, 256, 256)))
model.add(MaxPooling2D((2, 2), border_mode='same'))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(MaxPooling2D((2, 2), border_mode='same'))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(MaxPooling2D((2, 2), border_mode='same'))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(UpSampling2D((2, 2)))
model.add(Convolution2D(8, 3, 3, activation='relu', border_mode='same'))
model.add(UpSampling2D((2, 2)))
model.add(Convolution2D(16, 3, 3, activation='relu'))
model.add(UpSampling2D((2, 2)))
model.add(Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same'))
model.compile(optimizer='adadelta', loss='binary_crossentropy')
My generator:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory('IMAGE DIRECTORY', color_mode='rgb', class_mode='binary', batch_size=32, target_size=(256, 256))
And then fitting the model:
model.fit_generator(
    train_generator,
    samples_per_epoch=1,
    nb_epoch=1,
    verbose=1,
)
I'm getting this error:
Exception: Error when checking model target: expected convolution2d_76 to have 4 dimensions, but got array with shape (32, 1)
That looks like the size of my batch rather than a sample. What am I doing wrong?
The error is most likely due to class_mode='binary'. It makes the generator produce binary class labels, so the output has shape (batch_size, 1), while your model produces a four-dimensional output (since the last layer is a convolution).
I guess that you want your label to be the image itself. Based on the source of flow_from_directory and the DirectoryIterator it uses, that is not possible to achieve just by changing class_mode. A possible solution would be along the lines of:
train_iterator_ = train_datagen.flow_from_directory('IMAGE DIRECTORY', color_mode='rgb', class_mode=None, batch_size=32, target_size=(256, 256))
def train_generator():
    for x in train_iterator_:
        yield x, x
Note that I set class_mode to None. It makes the iterator return just the image instead of a (image, label) tuple. I then define a new generator that yields the image as both the input and the target.
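With that wrapper in place, fitting would look roughly like this (a sketch that keeps the question's old Keras 1 style arguments):
model.fit_generator(
    train_generator(),   # note the call: pass the generator object, not the function
    samples_per_epoch=32,
    nb_epoch=1,
    verbose=1,
)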