I’ve been trying to do transfer learning with EfficientNet on a binary classification task.
My directory structure looks like this:
training
├── label0
└── label1
validation
├── label0
└── label1
and I use this to create the image datagen:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.0,
    height_shift_range=0.0,
    shear_range=0.0,
    zoom_range=0.0,
    horizontal_flip=True,
    fill_mode='nearest'
)
No augmentations are applied to the validation data:
validation_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'training',
    target_size=(240, 240),
    batch_size=128,
    class_mode='binary',  # Binary classification
    shuffle=True
)
validation_generator = validation_datagen.flow_from_directory(
    'validation',
    target_size=(240, 240),
    batch_size=128,
    class_mode='binary',
    shuffle=True
)
With the output:
Found 19747 images belonging to 2 classes.
Found 4938 images belonging to 2 classes.
This is the code for building the model:
import tensorflow as tf
from tensorflow.keras.applications import EfficientNetB1
from tensorflow.keras import layers
NUM_CLASSES = 2
IMG_SIZE = 240
size = (IMG_SIZE, IMG_SIZE)
def build_model():
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    model = EfficientNetB1(include_top=False, input_tensor=inputs, weights="imagenet")
    model.trainable = False  # freeze the pretrained base

    x = layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = layers.BatchNormalization()(x)
    # x = layers.Dense(128, activation="relu")(x)
    # top_dropout_rate = 0.2
    # x = layers.Dropout(top_dropout_rate, name="dropout")(x)
    outputs = layers.Dense(1, activation="sigmoid", name="pred")(x)

    model = tf.keras.Model(inputs, outputs, name="EfficientNet")
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer,
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
    return model
You can see that I’m trying lots of things just to make something work, but to no avail. I set model.trainable to False since I am following this implementation of transfer learning. I still haven’t finished the first (frozen-base) step: I was planning to move on to unfreezing once I reached 60-70% accuracy, but I can’t even make it to 52%.
Here is how I start the training:
model = build_model()
epochs = 50
hist = model.fit(
    train_generator,
    epochs=epochs,
    steps_per_epoch=len(train_generator),
    validation_data=validation_generator,
    validation_steps=len(validation_generator)
)
The accuracy and val_accuracy stay at roughly 50% through the first ten epochs, which is no better than random guessing.
I have tried lowering and raising the learning rate (1e-1 to 1e-6), varying the batch size (32-256), adding dropout (0.1-0.5), adding Dense ReLU layers (32-128 units), and making sure the images are in the right classes (see the sanity check below). What can I do to make the model actually learn?
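For reference, the class-mapping sanity check looks roughly like this (an illustrative snippet using the generators defined above, not part of the training code):
import numpy as np

# How flow_from_directory mapped the folder names to integer labels
print(train_generator.class_indices)        # e.g. {'label0': 0, 'label1': 1}

# Per-class image counts, to rule out a heavy class imbalance
print(np.bincount(train_generator.classes))
print(np.bincount(validation_generator.classes))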
In your code you use the line:
model.trainable = False
This freezes the whole EfficientNet base, so only the new head on top of it is being trained. If you want transfer learning to work here, you have to unfreeze some of the base layers as well.
Here is an example of how to unfreeze the last 2 layers; try experimenting with different values:
# Unfreeze the top layers of the base model
# (base_model is the EfficientNetB1 instance; keep a separate
#  reference to it inside build_model instead of overwriting it)
for layer in base_model.layers[-2:]:
    if not isinstance(layer, layers.BatchNormalization):
        layer.trainable = True
    else:
        layer.trainable = False
The BatchNormalization layers stay frozen because, on a small dataset, updating their statistics can make training unstable and hurt convergence. A fuller sketch of the fine-tuning step is below.
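Also note that changes to trainable only take effect after you call compile() again. Here is a minimal sketch of the two-phase workflow, assuming build_model() is modified to also return the base model (the base_model name, the number of unfrozen layers, and the epoch counts are mine, not from your code):
import tensorflow as tf
from tensorflow.keras import layers

# Phase 1: train only the new head on top of the frozen base
model, base_model = build_model()  # assumes build_model returns both
model.fit(train_generator, epochs=10, validation_data=validation_generator)

# Phase 2: unfreeze the top of the base, keeping BatchNormalization frozen
for layer in base_model.layers[-20:]:
    if not isinstance(layer, layers.BatchNormalization):
        layer.trainable = True

# Re-compile so the new trainable flags take effect,
# and use a lower learning rate for fine-tuning
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.fit(train_generator, epochs=10, validation_data=validation_generator)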