I am using `accelerate launch` with DeepSpeed ZeRO stage 2 for multi-GPU training and inference, and I am struggling to free up GPU memory.
Basically, my programme has three parts:

1. Load first model…
2. Remove all memory occupied by part 1, then load second model…
3. Remove all memory occupied by part 2, then load third model…

Between parts 1 and 2, and again between parts 2 and 3, I would like to free all GPU memory occupied by the previous part (otherwise I get an out-of-memory error). I have tried `del`, `gc.collect()`, and `release_memory()`, but nothing works. For an example, see below:
import gc
import torch
from accelerate import Accelerator

...
model = ...
accelerator = Accelerator()
model, dataloader, optimizer = accelerator.prepare(model, dataloader, optimizer)
...
# Attempted cleanup:
del model
accelerator.free_memory()   # resets the accelerator's internal state
del accelerator
gc.collect()
torch.cuda.empty_cache()
Any help is appreciated! Thanks in advance!
I ran the cleanup shown above (`del model`, `accelerator.free_memory()`, `del accelerator`, `gc.collect()`, `torch.cuda.empty_cache()`), but I am still running out of memory, and `torch.cuda.memory_allocated()` is unaffected.
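My suspicion is that something (perhaps the Accelerator or the DeepSpeed engine) still holds a reference to the model, so `del` alone never makes it collectable. A minimal framework-free sketch of that effect (no accelerate/DeepSpeed involved; `Model` here is a hypothetical stand-in class, and `extra_ref` simulates a reference held elsewhere):

```python
import gc
import weakref

class Model:
    """Stand-in for a large model (hypothetical, no GPU involved)."""
    pass

model = Model()
probe = weakref.ref(model)  # lets us check whether the object is gone
extra_ref = model           # simulates a reference held elsewhere
                            # (e.g. by the Accelerator / DeepSpeed engine)

del model
gc.collect()
print(probe() is None)      # False: extra_ref still keeps the object alive

del extra_ref
gc.collect()
print(probe() is None)      # True: only now can the memory be reclaimed
```

If this is what happens in my case, the question becomes which internal references accelerate/DeepSpeed keep after `prepare()`, and how to drop all of them.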
I found this question (How to Properly Manage GPU Memory Between Successive PyTorch Training Phases using accelerate?), which describes my problem without DeepSpeed, but it has no answer either.