ORPOTrainer Error: Calculated loss must be on the original device: cuda:0 but device in use is cuda:3
I am trying to train Phi-3 on an ORPO dataset using the ORPOTrainer from the Hugging Face TRL library. My machine has 4 GPUs, so I would like to run multi-GPU training.
This is my ORPOConfig: