Why is my accuracy high from the beginning of training? - deep-learning

I am training a neural network to recognize some attributes of .png pictures. When I start training I get output like the following, and the accuracy keeps increasing until the end of the epoch:
32/4817 [..............................] - ETA: 167s - loss: 0.6756 - acc: 0.5
64/4817 [..............................] - ETA: 152s - loss: 0.6214 - acc: 0.7
96/4817 [..............................] - ETA: 145s - loss: 0.6169 - acc: 0.7
128/4817 [.............................] - ETA: 142s - loss: 0.5972 - acc: 0.7
160/4817 [.............................] - ETA: 140s - loss: 0.5734 - acc: 0.7
192/4817 [>............................] - ETA: 138s - loss: 0.5604 - acc: 0.7
224/4817 [>............................] - ETA: 137s - loss: 0.5427 - acc: 0.7
256/4817 [>............................] - ETA: 135s - loss: 0.5160 - acc: 0.7
288/4817 [>............................] - ETA: 134s - loss: 0.5492 - acc: 0.7
320/4817 [>............................] - ETA: 133s - loss: 0.5574 - acc: 0.7
352/4817 [=>...........................] - ETA: 131s - loss: 0.5559 - acc: 0.7
384/4817 [=>...........................] - ETA: 129s - loss: 0.5550 - acc: 0.7
416/4817 [=>...........................] - ETA: 128s - loss: 0.5504 - acc: 0.7
448/4817 [=>...........................] - ETA: 127s - loss: 0.5417 - acc: 0.7
480/4817 [=>...........................] - ETA: 126s - loss: 0.5425 - acc: 0.7
My question is: why is the starting accuracy so high? I would expect it to be somewhere around 0.1 at first and then to increase as the network learns.
Also, at the end I get:
('Test loss:', 0.42451223436727564)
('Test accuracy:', 0.82572614112830256)
Is that test loss too high?
This is my network:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

input_shape = x_train[0].shape
print(input_shape)

model = Sequential()
stoplearn = keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0,
                                          patience=0, verbose=0, mode='auto')

model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[stoplearn])

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
It is written in Python using Keras.

You classify your data into two classes (your output layer has size 2), so an accuracy of 0.5 is not high. In fact, it means your network is behaving randomly, which is exactly what you expect at the beginning. Regarding the loss, there is no absolute answer: it depends on the task. Your test accuracy is not bad, and you could try playing with some of the parameters (for example, a smaller fully connected layer, as sketched below) to see whether you can improve it.
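
For example, shrinking the fully connected layer is a one-line change (a sketch; 64 units is an arbitrary choice to illustrate the idea):

model.add(Dense(64, activation='relu'))  # instead of Dense(128, activation='relu')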

You have two classes, so random choice will lead to 50% accuracy; this is what you get in the beginning, hence your result is expected.
The reason it then jumps directly to 70% accuracy could be that your problem is simple.
If you want to double-check this (a quick baseline check is sketched below), you could:
- use other classifiers,
- check how many examples are used to calculate the accuracy,
- serialize the trained classifier and manually feed it new examples to check their results.
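
For instance, the random/majority baseline can be read directly off the labels (a minimal sketch, assuming y_train is one-hot encoded with shape (N, 2) as in the question):

import numpy as np

# Count the samples in each class; always predicting the most frequent
# class already achieves this accuracy, so the network has to beat it.
class_counts = y_train.sum(axis=0)
majority_baseline = class_counts.max() / class_counts.sum()
print('majority-class baseline accuracy:', majority_baseline)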

Related

Why does my training loss first decrease and then increase regularly in each epoch?

In each epoch, my training loss first decreases (compared with the loss at the end of the previous epoch) but then increases until the end of the epoch. Below is an example:
Epoch [29][250/940] lr: 3.298e-04, top1_acc: 0.4903, top5_acc: 0.7323, loss: 2.2106, grad_norm: 1.8732
Epoch [29][500/940] lr: 3.298e-04, top1_acc: 0.4844, top5_acc: 0.7267, loss: 2.2330, grad_norm: 1.8848
Epoch [29][750/940] lr: 3.298e-04, top1_acc: 0.4850, top5_acc: 0.7247, loss: 2.2491, grad_norm: 1.8910
Epoch [30][250/940] lr: 3.055e-04, top1_acc: 0.4985, top5_acc: 0.7384, loss: 2.1676, grad_norm: 1.9173
Epoch [30][500/940] lr: 3.055e-04, top1_acc: 0.4971, top5_acc: 0.7356, loss: 2.1739, grad_norm: 1.9099
Epoch [30][750/940] lr: 3.055e-04, top1_acc: 0.4962, top5_acc: 0.7341, loss: 2.1861, grad_norm: 1.9269
Epoch [31][250/940] lr: 2.816e-04, top1_acc: 0.5119, top5_acc: 0.7454, loss: 2.1087, grad_norm: 1.9586
Epoch [31][500/940] lr: 2.816e-04, top1_acc: 0.5071, top5_acc: 0.7435, loss: 2.1329, grad_norm: 1.9674
Epoch [31][750/940] lr: 2.816e-04, top1_acc: 0.5076, top5_acc: 0.7436, loss: 2.1296, grad_norm: 1.9738
The training loss curve is: (figure omitted)
The decrease-then-increase pattern of the training loss within each epoch is very stable, but it only appears in the later part of training, so it should not be a bug in the code.
Can anyone explain this?

Why is the accuracy of my pretrained ResNet-152 model so low?

I am fairly new to deep learning and neural networks. I recently built a facial emotion recognition classifier using the FER-2013 dataset. I am using the pretrained ResNet-152 model for classification, but both my training and validation accuracies are very low, around 36%, which is not good. I assumed that with transfer learning the accuracies would be high, so why am I getting such a low accuracy? Should I change the hyperparameters? Here is my code.
import copy
import time
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

model = models.resnet152(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
print(model)

classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(2048, 512)),
    ('relu', nn.ReLU()),
    ('dropout1', nn.Dropout(p=0.5)),
    ('fc2', nn.Linear(512, 7)),
    ('output', nn.LogSoftmax(dim=1)),
]))
model.fc = classifier
print(classifier)

def train_model(model, criterion, optimizer, scheduler, num_epochs=10):
    since = time.time()
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0
    for epoch in range(1, num_epochs + 1):
        print('Epoch {}/{}'.format(epoch, num_epochs))
        print('-' * 10)
        for phase in ['train', 'validation']:
            if phase == 'train':
                scheduler.step()
                model.train()
            else:
                model.eval()
            running_loss = 0.0
            running_corrects = 0
            # `dataloaders` and `dataset_sizes` are assumed to be defined
            # elsewhere in the script.
            for inputs, labels in dataloaders[phase]:
                inputs, labels = inputs.to(device), labels.to(device)
                optimizer.zero_grad()
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)
                    _, preds = torch.max(outputs, 1)
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))
            if phase == 'validation' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best valid accuracy: {:4f}'.format(best_acc))
    model.load_state_dict(best_model_wts)
    return model

use_gpu = torch.cuda.is_available()
device = torch.device('cuda' if use_gpu else 'cpu')  # device used by train_model above
num_epochs = 10
if use_gpu:
    print('Using GPU: ' + str(use_gpu))
    model = model.cuda()
criterion = nn.NLLLoss()
optimizer = optim.SGD(model.fc.parameters(), lr=0.0006, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

model_ft = train_model(model, criterion, optimizer, exp_lr_scheduler, num_epochs=10)
Can someone please guide me? I am a beginner at this, and I could really use some help.
Preprocess the dataset.
Get more data; low accuracy can be the result of a small dataset.
Try data augmentation if you have little data (a sketch follows below).
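
A minimal augmentation/preprocessing pipeline for this setup might look like the following (a sketch, assuming torchvision and the 224x224 inputs with ImageNet normalization that resnet152 expects; the 'train/' and 'validation/' paths are placeholders). It also builds the dataloaders and dataset_sizes dictionaries used by train_model above:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Augment only the training split; validation gets deterministic preprocessing.
data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(256),
        transforms.RandomCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],   # ImageNet means
                             [0.229, 0.224, 0.225]),  # ImageNet stds
    ]),
    'validation': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],
                             [0.229, 0.224, 0.225]),
    ]),
}
image_datasets = {phase: datasets.ImageFolder(phase + '/', data_transforms[phase])
                  for phase in ['train', 'validation']}
dataloaders = {phase: DataLoader(image_datasets[phase], batch_size=32,
                                 shuffle=(phase == 'train'))
               for phase in ['train', 'validation']}
dataset_sizes = {phase: len(image_datasets[phase]) for phase in ['train', 'validation']}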

Training the PNet of MTCNN: the bounding box regression accuracy is very low. How can I increase it, or is my label wrong?

I use the FDDB face pictures to train MTCNN for face detection. In the PNet, the bounding box regression accuracy stays around 60%. Is there something wrong?
Epoch 397/400
1200/1200 [==============================] - 0s 131us/sample - loss: 0.5068 - conv4_1_loss: 5.4316e-06 - conv4_2_loss: 0.5052 - conv4_1_accuracy: 1.0000 - conv4_2_accuracy: 0.5850
Epoch 398/400
1200/1200 [==============================] - 0s 131us/sample - loss: 0.4350 - conv4_1_loss: 3.8598e-06 - conv4_2_loss: 0.4358 - conv4_1_accuracy: 1.0000 - conv4_2_accuracy: 0.5950
Epoch 399/400
1200/1200 [==============================] - 0s 131us/sample - loss: 0.8905 - conv4_1_loss: 5.0222e-06 - conv4_2_loss: 0.8863 - conv4_1_accuracy: 1.0000 - conv4_2_accuracy: 0.5817
Epoch 400/400
1200/1200 [==============================] - 0s 124us/sample - loss: 1.8505 - conv4_1_loss: 3.0373e-04 - conv4_2_loss: 1.8358 - conv4_1_accuracy: 1.0000 - conv4_2_accuracy: 0.5817
import tensorflow as tf
from tensorflow import keras

class P_Net(keras.Model):
    def __init__(self):
        super(P_Net, self).__init__(name="P_Net")
        # Define layers here.
        self.conv1 = keras.layers.Conv2D(10, (3, 3), name="conv1")
        self.prelu1 = keras.layers.PReLU(alpha_initializer=tf.constant_initializer(0.25),
                                         shared_axes=[1, 2], name="prelu1")
        self.pool1 = keras.layers.MaxPooling2D((2, 2), name="pool1")
        self.conv2 = keras.layers.Conv2D(16, (3, 3), name="conv2")
        self.prelu2 = keras.layers.PReLU(alpha_initializer=tf.constant_initializer(0.25),
                                         shared_axes=[1, 2], name="prelu2")
        self.conv3 = keras.layers.Conv2D(32, (3, 3), name="conv3")
        self.prelu3 = keras.layers.PReLU(alpha_initializer=tf.constant_initializer(0.25),
                                         shared_axes=[1, 2], name="prelu3")
        self.cls_output = keras.layers.Conv2D(2, (1, 1), activation="softmax", name="conv4_1")
        self.bbox_pred = keras.layers.Conv2D(4, (1, 1), name="conv4_2")
        # self.landmark_pred = keras.layers.Conv2D(10, (1, 1), name="conv4_3")

    def call(self, inputs):
        # Forward pass, using the layers defined in `__init__`.
        x = self.conv1(inputs)
        x = self.prelu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.prelu2(x)
        x = self.conv3(x)
        x = self.prelu3(x)
        return [self.cls_output(x), self.bbox_pred(x)]  # , self.landmark_pred(x)]

    def get_summary(self, input_shape):
        inputs = keras.Input(input_shape)
        model = keras.Model(inputs, self.call(inputs))
        return model
For the bounding box label, in the positive training set I use the x1,y1,x2,y2 coordinates that the FDDB data supplies, just resized according to the scale of the picture. Is that wrong?
For the negative training set, I set the box label to [0,0,0,0].
A label looks like this:
['./pos/20020816bigimg_932.jpg',
1,
['0.2857142857142857',
'2.2857142857142856',
'12.285714285714286',
'14.285714285714286']]
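
For reference, a compile step matching the two outputs above might look like the following (a minimal sketch, not the asker's actual training code: the loss choices are assumptions, 12x12x3 is the canonical PNet input size, and note that 'accuracy' is not a very meaningful metric for a regression output such as conv4_2):

# Build a functional model so the outputs keep their layer names
# (conv4_1, conv4_2), matching the training log above.
pnet = P_Net().get_summary((12, 12, 3))
pnet.compile(optimizer=keras.optimizers.Adam(),
             loss=['categorical_crossentropy',  # conv4_1: face / non-face
                   'mse'],                      # conv4_2: bounding box regression
             metrics=['accuracy'])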

Low validation accuracy in parallel DenseNet

I've taken the code from https://github.com/flyyufelix/cnn_finetune and remodeled it so that there are now two DenseNet-121 models in parallel, with the layers after each model's last global average pooling removed.
Both models were joined together like this:
print("Begin model 1")
model = densenet121_model(img_rows=img_rows, img_cols=img_cols, color_type=channel, num_classes=num_classes)
print("Begin model 2")
model2 = densenet121_nw_model(img_rows=img_rows, img_cols=img_cols, color_type=channel, num_classes=num_classes)
mergedOut = Add()([model.output,model2.output])
#mergedOut = Flatten()(mergedOut)
mergedOut = Dense(num_classes, name='cmb_fc6')(mergedOut)
mergedOut = Activation('softmax', name='cmb_prob')(mergedOut)
newModel = Model([model.input,model2.input], mergedOut)
adam = Adam(lr=1e-3, decay=1e-6, amsgrad=True)
newModel.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
# Start Fine-tuning
newModel.fit([X_train,X_train], Y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
shuffle=True,
verbose=1,
validation_data=([X_valid,X_valid],Y_valid)
)
The first model has its layers frozen, and the one in parallel is supposed to learn additional features on top of the first model to improve accuracy.
However, even at 100 epochs, the training accuracy is almost 100% while the validation accuracy floats around 9%.
I'm not quite sure what the reason could be or how to fix it, considering I've already changed the optimizer from SGD (same concept: two DenseNets, the first pretrained on ImageNet, the second starting with no weights; same results) to Adam (two DenseNets, both pretrained on ImageNet).
Epoch 101/1000
1000/1000 [==============================] - 1678s 2s/step - loss: 0.0550 - acc: 0.9820 - val_loss: 12.9906 - val_acc: 0.0900
Epoch 102/1000
1000/1000 [==============================] - 1703s 2s/step - loss: 0.0567 - acc: 0.9880 - val_loss: 12.9804 - val_acc: 0.1100
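
Freezing the first model's layers, as the question describes, is typically done like this in Keras before compiling (a minimal sketch using the model object from the snippet above):

# Freeze every layer of the first DenseNet so that only the second
# DenseNet and the merged head are updated during training.
for layer in model.layers:
    layer.trainable = False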

Keras: batch training for multiple large datasets

This question regards the common problem of training in Keras on multiple large files that are jointly too large to fit in GPU memory.
I am using Keras 1.0.5 and I would like a solution that does not require 1.0.6.
One way to do this was described by fchollet here and here:
import pickle

# Create a generator that yields (current features X, current labels y)
def BatchGenerator(files):
    for file in files:
        current_data = pickle.load(open(file, "rb"))  # open the file named by the loop variable
        X_train = current_data[:, :-1]
        y_train = current_data[:, -1]
        yield (X_train, y_train)

# Train the model on each dataset in turn
for epoch in range(n_epochs):
    for (X_train, y_train) in BatchGenerator(files):
        model.fit(X_train, y_train, batch_size=32, nb_epoch=1)
However, I fear that the state of the model is not saved, and that the model is instead reinitialized not only between epochs but also between datasets. Each "Epoch 1/1" below represents training on a different dataset:
~~~~~ Epoch 0 ~~~~~~
Epoch 1/1
295806/295806 [==============================] - 13s - loss: 15.7517
Epoch 1/1
407890/407890 [==============================] - 19s - loss: 15.8036
Epoch 1/1
383188/383188 [==============================] - 19s - loss: 15.8130
~~~~~ Epoch 1 ~~~~~~
Epoch 1/1
295806/295806 [==============================] - 14s - loss: 15.7517
Epoch 1/1
407890/407890 [==============================] - 20s - loss: 15.8036
Epoch 1/1
383188/383188 [==============================] - 15s - loss: 15.8130
I am aware that one can use model.fit_generator, but since the method above was repeatedly suggested as a way of batch training, I would like to know what I am doing wrong.
Thanks for your help,
Max
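
One direct way to test the fear above (a minimal sketch, assuming numpy is imported as np): print a checksum of the weights between fit calls; if the model were really being reinitialized between datasets, the checksum would keep resetting.

def weight_checksum(model):
    # Sum of absolute weight values across all layers.
    return sum(np.abs(w).sum() for w in model.get_weights())

for epoch in range(n_epochs):
    for (X_train, y_train) in BatchGenerator(files):
        model.fit(X_train, y_train, batch_size=32, nb_epoch=1)
        print('weight checksum:', weight_checksum(model))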
It has been a while since I faced that problem, but I remember that I used Keras's functionality to provide data through Python generators, i.e. model = Sequential(); model.fit_generator(...).
An example code snippet (it should be self-explanatory):
import pickle
import numpy as np
from keras.models import Sequential

def generate_batches(files, batch_size):
    counter = 0
    while True:
        fname = files[counter]
        print(fname)
        counter = (counter + 1) % len(files)
        data_bundle = pickle.load(open(fname, "rb"))
        X_train = data_bundle[0].astype(np.float32)
        y_train = data_bundle[1].astype(np.float32)
        y_train = y_train.flatten()
        for cbatch in range(0, X_train.shape[0], batch_size):
            yield (X_train[cbatch:(cbatch + batch_size), :, :],
                   y_train[cbatch:(cbatch + batch_size)])

model = Sequential()
# ... add your layers here before compiling ...
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

train_files = [train_bundle_loc + "bundle_" + cb.__str__() for cb in range(nb_train_bundles)]
gen = generate_batches(files=train_files, batch_size=batch_size)
history = model.fit_generator(gen, samples_per_epoch=samples_per_epoch,
                              nb_epoch=num_epoch, verbose=1, class_weight=class_weights)
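
One detail worth checking in this setup: in Keras 1.x, samples_per_epoch controls when fit_generator considers an epoch finished, so setting it to the total number of samples across all bundles makes each reported epoch walk every file exactly once. A minimal sketch, assuming each pickled bundle stores the feature array in element 0 as in the generator above:

# Total sample count across all training bundles.
samples_per_epoch = sum(
    pickle.load(open(f, "rb"))[0].shape[0] for f in train_files)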