I’ve discovered a memory leak in torch.load() when loading a saved tensor. Despite explicitly deleting the loaded object and invoking garbage collection, the memory usage does not decrease as expected.
Below is a minimal reproducible example that demonstrates the issue.

Code to Reproduce:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import gc
import torch


def create_and_save_image():
    # ~17 MiB tensor (30 * 384 * 384 * 4 bytes), saved as a checkpoint
    image = torch.randn(30, 384, 384)
    ckpt = {"image": image}
    torch.save(ckpt, "test_image.pt")


@profile  # injected by memory_profiler at runtime
def load_and_del():
    ckpt = torch.load("test_image.pt")
    del ckpt
    gc.collect()


if __name__ == "__main__":
    create_and_save_image()
    load_and_del()
Steps to Reproduce:
python3 -m memory_profiler mem_leak.py
Observe the memory usage before and after loading and deleting the tensor.
Expected Behavior:
The memory usage should increase when torch.load() is called and decrease after the object is deleted and garbage collection is invoked.
Observed Behavior:
The memory usage increases by about 17 MiB when torch.load() is called, but does not decrease at all after deleting the object and calling gc.collect(), which looks like a memory leak.
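One thing worth checking is whether the bytes are really still reachable from Python (a true leak) or merely retained by the allocator: memory_profiler reports process RSS, and memory freed back to the allocator is often not returned to the OS. The standard-library tracemalloc module tracks Python-level allocations independently of RSS. This is a minimal sketch of that check, using a plain bytearray of roughly the checkpoint's size as a stand-in for the loaded tensor (the helper name alloc_and_del and the stand-in buffer are illustrative, not part of the original script):

```python
import gc
import tracemalloc

def alloc_and_del(num_bytes):
    """Allocate a buffer, free it, and report traced Python heap usage."""
    tracemalloc.start()
    buf = bytearray(num_bytes)  # stand-in for the loaded checkpoint
    during, _ = tracemalloc.get_traced_memory()  # (current, peak)
    del buf
    gc.collect()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return during, after

# ~17 MiB, roughly the size of the saved tensor in the question
during, after = alloc_and_del(17 * 1024 * 1024)
print(f"traced during: {during} bytes, after del: {after} bytes")
```

If the traced usage drops back near zero after the del while memory_profiler still shows a flat ~370 MiB RSS, the memory was released by Python but kept by the allocator, which is retention rather than a leak.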
System Information:
PyTorch version: 2.3.0
Python version: 3.8.5
OS: Linux
CUDA version: 12.0
GPU: NVIDIA RTX A6000 (48 GB)
Additional Information:
I have used memory_profiler to profile the memory usage, and it confirms that memory is not being freed as expected. Here’s the output of the memory profiler:
Line #    Mem usage    Increment    Occurrences   Line Contents
    11    353.602 MiB  353.602 MiB            1   @profile
    12                                            def load_and_del():
    13    370.574 MiB   16.973 MiB            1       ckpt = torch.load("test_image.pt")
    14    370.574 MiB    0.000 MiB            1       del ckpt
    15    370.574 MiB    0.000 MiB            1       gc.collect()
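A single load/delete cycle cannot distinguish an unbounded leak from one-off allocator retention, so another check is to repeat the cycle and watch RSS: a real leak should grow by roughly the checkpoint size on every iteration, while retention plateaus after the first. This is a Linux-only sketch that reads RSS from /proc/self/statm; the helper names rss_bytes and check_for_growth are illustrative, and the workload would be the same torch.load call as in the script above:

```python
import gc
import os

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")

def rss_bytes():
    """Current resident set size, read from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1]) * PAGE_SIZE  # field 1 = resident pages

def check_for_growth(workload, iterations=5):
    """Run workload repeatedly, freeing its result each time; return RSS per cycle."""
    readings = []
    for _ in range(iterations):
        obj = workload()
        del obj
        gc.collect()
        readings.append(rss_bytes())
    return readings

# With the checkpoint from the question, the workload would be:
#     check_for_growth(lambda: torch.load("test_image.pt"))
# Readings that climb by ~17 MiB per iteration indicate a genuine leak;
# readings that are flat after the first iteration indicate retention.
```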