CUDA out of memory while using Llama3.1-8B for inference
I have written a simple Python script that uses the HuggingFace transformers library along with torch to run Llama3.1-8B-instruct purely for inference, after feeding in some long-ish bits of text (about 10k-20k tokens). It runs fine on my laptop, which has a GPU with 12 GB of VRAM but can apparently also access up to 28 GB total (I guess shared from main system RAM?).
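For context, here is a stripped-down sketch of the kind of script I mean, using the standard transformers API. The model id, file path, and generation settings below are stand-ins rather than my exact values:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in model id; assumes access to the gated Llama 3.1 weights on the Hub.
model_id = "meta-llama/Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights (~16 GB for 8B params)
    device_map="auto",           # let accelerate place layers on the GPU (and spill to CPU)
)

# Placeholder for the long-ish document (10k-20k tokens in my real runs).
long_text = open("input.txt").read()

messages = [
    {"role": "user", "content": "Summarise the following text:\n\n" + long_text},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```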