Why do my contrastive learning model's loss and gradients explode during training?
I am fine-tuning an embedding model using contrastive learning. For the loss function, I'm using torch.nn.CrossEntropyLoss.
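For context, here is a minimal sketch of this kind of setup, assuming an InfoNCE-style loss over in-batch negatives; the L2 normalization and the temperature value shown are illustrative assumptions, not necessarily my exact code:

```python
import torch
import torch.nn.functional as F

loss_fn = torch.nn.CrossEntropyLoss()

def contrastive_loss(anchor_emb, positive_emb, temperature=0.07):
    # L2-normalize so dot products become cosine similarities
    # (the normalization and temperature=0.07 are assumptions)
    anchor = F.normalize(anchor_emb, dim=-1)
    positive = F.normalize(positive_emb, dim=-1)

    # Similarity of every anchor against every positive in the batch;
    # off-diagonal entries act as in-batch negatives
    logits = anchor @ positive.T / temperature

    # The matching pair for row i sits at column i
    labels = torch.arange(logits.size(0), device=logits.device)
    return loss_fn(logits, labels)
```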