Tag Archive for large-data, pytorch-dataloader, multi-gpu, huggingface-trainer

Evaluation speed is too low and takes a lot of time using the HF Trainer

I’m training a huge self-supervised model. When I tried to train on the complete dataset, it threw CUDA OOM errors; to fix that I decreased the batch size and added gradient accumulation along with eval accumulation steps. It no longer throws CUDA OOM errors, but the evaluation speed has dropped significantly.
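
For reference, a minimal sketch of the kind of configuration described above, assuming the standard transformers Trainer / TrainingArguments API; the output directory, numeric values, and the model/dataset variables are placeholders, not the original setup:

```python
from transformers import Trainer, TrainingArguments

# Hypothetical settings illustrating the setup described above:
# a smaller per-device batch size plus gradient accumulation to avoid
# CUDA OOM during training, and eval_accumulation_steps during evaluation.
training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,    # reduced to fit in GPU memory
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,    # effective train batch size of 32
    eval_accumulation_steps=16,       # move accumulated predictions to CPU every 16 eval steps
    evaluation_strategy="steps",
    eval_steps=500,
)

trainer = Trainer(
    model=model,                      # the self-supervised model (assumed defined elsewhere)
    args=training_args,
    train_dataset=train_dataset,      # placeholder dataset objects
    eval_dataset=eval_dataset,
)
trainer.train()
```

One plausible contributor to the slowdown: eval_accumulation_steps causes the accumulated prediction tensors to be moved from GPU to CPU every N steps, so evaluation pays repeated host-device transfer and CPU memory costs that it would not otherwise incur.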