I am working on a CNN model in Google Colab for a 3-class classification problem. I have a balanced dataset with 1680 training samples (560 per class) and 420 test samples (140 per class). I built a shallow network to train on this data, but the model starts overfitting after 7-8 epochs. I have applied all the usual techniques such as Dropout, L2 regularization, data augmentation, Batch Normalization, and reducing the learning rate, but there is no change in the model's performance.
After 30 epochs I achieved loss: 0.7746 - accuracy: 0.6673 - val_loss: 0.8310 - val_accuracy: 0.6071.
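For reference, the data augmentation and learning-rate reduction are set up roughly like this (a simplified sketch: the directory paths, generator names, and exact augmentation values are placeholders, not the exact notebook cell):

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Augment only the training images; validation/test images are just rescaled.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

train_generator = train_datagen.flow_from_directory(
    'data/train', target_size=(28, 28), batch_size=8, class_mode='categorical')
test_generator = test_datagen.flow_from_directory(
    'data/test', target_size=(28, 28), batch_size=8, class_mode='categorical',
    shuffle=False)

# Halve the learning rate when val_loss stops improving.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-5)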
# model architecture
from tensorflow.keras import layers, models

model = models.Sequential()
# Block 1: 8 filters, 3x3, on 28x28 RGB input
model.add(layers.Conv2D(8, (3, 3), activation='relu', input_shape=(28, 28, 3)))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2, 2)))
# Block 2: 16 filters, 3x3
model.add(layers.Conv2D(16, (3, 3), activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.5))
# Classifier head: flatten and project to the 3 classes
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(3, activation='softmax'))

model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_8 (Conv2D) (None, 26, 26, 8) 224
batch_normalization_8 (Bat (None, 26, 26, 8) 32
chNormalization)
max_pooling2d_4 (MaxPoolin (None, 13, 13, 8) 0
g2D)
conv2d_9 (Conv2D) (None, 11, 11, 16) 1168
batch_normalization_9 (Bat (None, 11, 11, 16) 64
chNormalization)
max_pooling2d_5 (MaxPoolin (None, 5, 5, 16) 0
g2D)
dropout_7 (Dropout) (None, 5, 5, 16) 0
flatten_4 (Flatten) (None, 400) 0
dropout_8 (Dropout) (None, 400) 0
dense_5 (Dense) (None, 3) 1203
=================================================================
Total params: 2691 (10.51 KB)
Trainable params: 2643 (10.32 KB)
Non-trainable params: 48 (192.00 Byte)
Training configuration: learning_rate=0.001, batch_size=8, optimizer=Adam, loss='categorical_crossentropy', target_size=(28, 28).
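In code, the compile/fit step looks roughly like this (the generator and callback names come from the sketch above and are placeholders):

from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_generator,
                    epochs=30,
                    validation_data=test_generator,
                    callbacks=[reduce_lr])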
Classification Report

                     precision    recall  f1-score   support
Moderately Tolerant       0.33      0.66      0.43       140
Susceptible               0.33      0.13      0.18       140
Tolerant                  0.39      0.23      0.29       140

accuracy                                      0.34       420
macro avg                 0.35      0.34      0.30       420
weighted avg              0.35      0.34      0.30       420
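The report above is generated roughly like this (a sketch assuming the test generator defined earlier, created with shuffle=False so the predictions line up with test_generator.classes):

import numpy as np
from sklearn.metrics import classification_report

# Predicted class = index of the highest softmax probability.
probs = model.predict(test_generator)
y_pred = np.argmax(probs, axis=1)
y_true = test_generator.classes

print(classification_report(y_true, y_pred,
                            target_names=list(test_generator.class_indices.keys())))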
Although the Moderately Tolerant class has the highest f1-score, the model still predicts unseen images as the Susceptible class.
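The check on an unseen image is done roughly like this (the file path is a placeholder; the image is rescaled the same way as the generator input):

import numpy as np
from tensorflow.keras.preprocessing import image

img = image.load_img('unseen_sample.jpg', target_size=(28, 28))   # placeholder path
x = image.img_to_array(img) / 255.0           # same rescaling as the generators
x = np.expand_dims(x, axis=0)                 # shape (1, 28, 28, 3)

pred = model.predict(x)
class_names = list(test_generator.class_indices.keys())
print(class_names[np.argmax(pred)])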
Here are the plots of training and validation accuracy and loss:

[plots of training/validation accuracy and training/validation loss]
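The curves are plotted from the History object returned by model.fit, roughly like this:

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Accuracy curves
ax1.plot(history.history['accuracy'], label='train')
ax1.plot(history.history['val_accuracy'], label='validation')
ax1.set_title('Accuracy')
ax1.legend()

# Loss curves
ax2.plot(history.history['loss'], label='train')
ax2.plot(history.history['val_loss'], label='validation')
ax2.set_title('Loss')
ax2.legend()

plt.show()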
Can anyone help me understand the reason behind this and suggest how to improve the model so that it generalizes well? Thank you.