I’ve been working extensively with TensorFlow’s pretrained models and have noticed a recurring issue in many tutorials. These tutorials typically fine-tune the model, or freeze it and use it as a feature extractor, appending different output layers depending on the task. However, many of them seem to skip the data preprocessing step, even though each model expects its input in a different format; often the data are simply scaled to the 0–1 range regardless of which model is used.
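For example, the same raw pixels come out very differently depending on the model family (MobileNetV2 and ResNet50 chosen purely for illustration):

```python
import numpy as np
import tensorflow as tf

# One "pixel" with raw RGB values; shape (1, 1, 1, 3)
raw = np.array([[[[0.0, 127.5, 255.0]]]], dtype=np.float32)

# MobileNetV2 rescales pixels to [-1, 1]
print(tf.keras.applications.mobilenet_v2.preprocess_input(raw.copy()))
# -> [[[[-1. 0. 1.]]]]

# ResNet50 flips RGB to BGR and subtracts the ImageNet channel means,
# with no rescaling to a fixed range
print(tf.keras.applications.resnet50.preprocess_input(raw.copy()))
```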
For each model, a preprocess_input function exists, but it seems to be applied inconsistently across tutorials: sometimes it’s used only during prediction and not during training, and sometimes not at all. This leads me to my questions:
- Should the data be preprocessed in the same way for training as it is for prediction?
- Could this inconsistency be avoided by attaching the preprocess_input function to the input layer via a Lambda layer, assuming the input to all my models is always RGB values in the 0–255 range? For example (a fuller end-to-end sketch follows the snippet):
```python
x = tf.keras.layers.Lambda(preprocess_input)(input_layer)
```
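To make the idea concrete, here is a minimal sketch of what I have in mind, assuming MobileNetV2 as the backbone and a hypothetical 10-class head (both just illustrative choices):

```python
import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input

# Input is raw RGB in [0, 255]; preprocessing is baked into the graph,
# so training and prediction are guaranteed to apply the same transform.
inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.Lambda(preprocess_input)(inputs)

# Frozen backbone used as a feature extractor
backbone = MobileNetV2(include_top=False, weights="imagenet", pooling="avg")
backbone.trainable = False
x = backbone(x, training=False)

# Hypothetical task-specific head (10 classes, purely illustrative)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```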
I would appreciate any insights or recommendations on this matter. Thank you in advance!