Hi, I have trained a CNN model with Keras, similar to the example model on their website but with slightly smaller layers and an extra dropout layer at the end. The model build function looks a little something like this:
" --------- Model Params ---------"
epochs = 800
image_size = (384, 256)
batch_size = 128
number_of_layers = 4
drop_out = 0.25
num_dropouts = 2
learn_rate = 0.00001
layer_sizes = [64, 128, 256, 512, 728]
class_weight = {0:1, 1:3}
image_rotation = 0.1
def make_model(input_shape, num_classes, layer_num=3, drop_out=0.25, dropouts=1):
    inputs = keras.Input(shape=input_shape)

    # Entry block
    x = layers.Rescaling(1.0 / 255)(inputs)
    x = layers.Conv2D(128, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    previous_block_activation = x  # Set aside residual

    # Use the first layer_num entries of the global layer_sizes list
    layer_use = layer_sizes[:layer_num]
    for size in layer_use:
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)

        # Project residual
        residual = layers.Conv2D(size, 1, strides=2, padding="same")(
            previous_block_activation
        )
        x = layers.add([x, residual])  # Add back residual
        previous_block_activation = x  # Set aside next residual

    x = layers.SeparableConv2D(1024, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.GlobalAveragePooling2D()(x)

    if num_classes == 2:
        units = 1
    else:
        units = num_classes

    if dropouts == 1:
        x = layers.Dropout(drop_out)(x)
        # We specify activation=None so as to return logits
        outputs = layers.Dense(units, activation=None)(x)
    elif dropouts == 2:
        x = layers.Dropout(0.2)(x)
        x = layers.Dense(units, activation=None)(x)
        outputs = layers.Dropout(drop_out)(x)

    return keras.Model(inputs, outputs)
model = make_model(input_shape=image_size + (3,), num_classes=2,
                   layer_num=number_of_layers, drop_out=drop_out,
                   dropouts=num_dropouts)
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=learn_rate),
    loss=keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=[keras.metrics.BinaryAccuracy(name="acc")],
)

# train_ds, val_ds and callbacks are defined elsewhere in my script
model.fit(
    train_ds,
    epochs=epochs,
    callbacks=callbacks,
    validation_data=val_ds,
    class_weight=class_weight,
)
The model trains fine, reaching around 97% validation accuracy, and when using the predict function the model generally gives sensible outputs for the input data I give it. The issue I am having is that the outputs from the predict_on_batch function are not consistent and will often vary by ±0.15 for the same input data. What is the cause of this? Shouldn't the predictions be identical for the same input data once the model has been trained and is only being used for inference?
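For reference, here is roughly how I am checking the outputs (a minimal sketch; sample_batch is a stand-in for one batch of images, and I am assuming val_ds yields (images, labels) tuples as it does when built with image_dataset_from_directory):

import tensorflow as tf

# Grab a single batch of images from the validation set
sample_batch = next(iter(val_ds))[0]

# Run the same batch through both prediction paths a few times
for _ in range(3):
    p1 = model.predict(sample_batch, verbose=0)
    p2 = model.predict_on_batch(sample_batch)
    # Convert logits to probabilities for easier comparison
    print(tf.sigmoid(p1[:5]).numpy().ravel())
    print(tf.sigmoid(p2[:5]).numpy().ravel())

I would expect the printed values to match on every pass, but the predict_on_batch numbers move around by up to ±0.15 between runs.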
Thanks