Problem Description:
I am training a convolutional autoencoder for image reconstruction, but the model outputs are blurry and low-contrast compared to the expected results. Here’s the setup:
Python Version: 3.7
TensorFlow Version: 1.15.8 (DirectML)
GPU: AMD Radeon RX 6700XT
Model Type: Convolutional Autoencoder
Despite data normalization and augmentation (rotation, brightness adjustment, horizontal flipping), the model struggles to generate high-quality reconstructions. The issue seems to be linked to the convolutional layers or the loss function.
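For reference, the normalization and augmentation are along these lines (a simplified sketch; the exact ranges and batch size here are placeholders, not my real values):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Pixel values rescaled to [0, 1]; augmentation ranges are placeholders.
datagen = ImageDataGenerator(
    rescale=1.0 / 255.0,          # normalize to [0, 1]
    rotation_range=15,            # random rotation
    brightness_range=(0.8, 1.2),  # brightness adjustment
    horizontal_flip=True,         # horizontal flipping
)

def autoencoder_batches(x, batch_size=32):
    # Yield each augmented batch as both input and target so they stay aligned.
    for batch in datagen.flow(x, batch_size=batch_size, shuffle=True):
        yield batch, batch
```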
What I’ve Tried:
- Reducing the learning rate.
- Normalizing the dataset to the [0, 1] range.
- Adjusting the number of filters in the encoder and decoder.
- Using MSE as the loss function.
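For context, here is a simplified stand-in for the model and training setup (the real filter counts, depth, and image size differ; this is only representative):

```python
from tensorflow.keras import layers, models

inp = layers.Input(shape=(128, 128, 3))  # placeholder image size

# Encoder
x = layers.Conv2D(32, 3, activation='relu', padding='same')(inp)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
x = layers.MaxPooling2D(2)(x)

# Decoder
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D(2)(x)
x = layers.Conv2D(32, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D(2)(x)
out = layers.Conv2D(3, 3, activation='sigmoid', padding='same')(x)  # outputs in [0, 1]

autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer='adam', loss='mse')  # current loss: plain MSE
```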
Questions:
1-) Could doubling the filters in the encoder/decoder layers help address the blurriness, as it does for GAN critics?
2-) Is it possible to combine MAE loss with MSE during training to mitigate the desaturation/low-contrast issue? (A rough sketch of what I mean is below this list.)
3-) Are there specific architectural or learning adjustments to improve the output quality and avoid blurry/faded results?
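For question 2, this is roughly what I have in mind for mixing MAE into the loss (the weight `alpha` is an arbitrary placeholder); I am unsure whether this is a sound approach:

```python
import tensorflow as tf

def mse_plus_mae(alpha=0.5):
    """Weighted combination of MSE and MAE; alpha is a placeholder mixing weight."""
    def loss(y_true, y_pred):
        mse = tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
        mae = tf.reduce_mean(tf.abs(y_true - y_pred), axis=-1)
        return alpha * mse + (1.0 - alpha) * mae
    return loss

# autoencoder.compile(optimizer='adam', loss=mse_plus_mae(0.5))
```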
Images:
Input, target (expected output), and predicted outputs are attached for comparison.
Example 1: (input / target / predicted comparison image attached)
Example 2: (input / target / predicted comparison image attached)
I would greatly appreciate any advice or insights to tackle this problem effectively. Thank you in advance!