Goal:
Use transfer learning with MobileNetV2 (native input size 224×224 and its own preprocessing: resize + central crop + normalization) as the encoder of a U-Net with input size 512×512, in PyTorch.
What I’ve done:
- Created the architecture of such a U-Net
- Loaded MobileNet_V2_Weights.IMAGENET1K_V2
- Started training and saw that the loss is high and stays flat
- Loaded MobileNet_V2_Weights.IMAGENET1K_V1
- Started training and saw that everything is fine and the model converges
All training runs were done without the bundled preprocessing, because it operates on 224×224 images. The only thing I did was scale the inputs to [0, 1].
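For context, here is roughly how the encoder side is wired up (a simplified sketch; the skip-connection indices shown are illustrative and my decoder is omitted):

```python
import torch
from torchvision.models import mobilenet_v2, MobileNet_V2_Weights

# Either IMAGENET1K_V1 or IMAGENET1K_V2; only V1 converges in my runs.
weights = MobileNet_V2_Weights.IMAGENET1K_V1
encoder = mobilenet_v2(weights=weights).features  # 19 sequential blocks

x = torch.rand(1, 3, 512, 512)  # already scaled to [0, 1], no other preprocessing
skips = []
for i, block in enumerate(encoder):
    x = block(x)
    if i in (1, 3, 6, 13, 18):  # end of each resolution stage (strides 2, 4, 8, 16, 32)
        skips.append(x)

# The decoder (upsampling + concatenation with `skips`) is my own code, omitted here.
print([s.shape for s in skips])
```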
Questions:
- Why does training fail to converge with the V2 weights?
- Do I lose accuracy by dropping the ImageNet normalization from the preprocessing step? (See the snippet after this list.)
- Should I add one extra conv+pooling stage to double the receptive field, since the input size was doubled?
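For reference, re-adding the normalization I currently skip would look roughly like this (the mean/std values are the standard ImageNet statistics the torchvision weights were trained with; no resize or crop to 224 is applied):

```python
import torch
import torchvision.transforms as T

# Standard ImageNet statistics used for the torchvision MobileNetV2 weights,
# applied after scaling pixels to [0, 1]; no resize/crop to 224 is done.
normalize = T.Normalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])

img = torch.rand(3, 512, 512)  # dummy image already in [0, 1]
img = normalize(img)
```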