When I run my PyTorch script for Llama 3 8B-Instruct (first window), I get an error saying PyTorch could not allocate 112 MB of GPU memory (see the red line). In the window below, however, you can see that Python manages to allocate all of the available 2.4 GB of VRAM. This error confuses me, and it prevents me from running Llama 3 locally.
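For reference, this is roughly how the memory numbers can be inspected from Python (a minimal sketch using `torch.cuda.mem_get_info`; this is not my full script, just the check that produced the numbers in the second window):

```python
import torch

# Free and total VRAM on the current CUDA device, in bytes
free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"free:  {free_bytes / 1024**3:.2f} GiB")
print(f"total: {total_bytes / 1024**3:.2f} GiB")

# Memory already held by PyTorch's caching allocator
print(f"allocated: {torch.cuda.memory_allocated() / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**3:.2f} GiB")
```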
Any idea how to fix this? Is 3 GB of VRAM not enough for Llama 3? And if so, why does the error only mention 112 MB?