I am facing an issue with my Django project, which runs inside a Docker container alongside Redis and Celery. I use Redis and Celery to manage task queues and other background processes.
The problem is that the Django application initializes the model only once at startup, but each Celery worker attempts to load the model again when it starts. Since GPU memory is limited, this causes the application to crash with out-of-memory errors.
Here is the code from my model.py file:
from unsloth import FastLanguageModel

# max_seq_length, dtype, load_in_4bit and TOKEN are defined earlier in this file (omitted here).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="test-model",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    token=TOKEN,
)
FastLanguageModel.for_inference(model)
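For context, my Celery tasks import this module, so every worker process appears to re-run the from_pretrained call when it starts. The task module looks roughly like this (the file name and task body below are simplified placeholders, not my exact code):

# tasks.py (simplified sketch of my task module)
from celery import shared_task

# Importing model.py at module level means that every Celery worker process
# which loads this tasks module also re-executes FastLanguageModel.from_pretrained,
# allocating another copy of the model on the GPU.
from .model import model, tokenizer

@shared_task
def generate_text(prompt: str) -> str:
    # Uses the module-level model/tokenizer imported above.
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)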
I want to ensure that the model is initialized only once and that all workers use this pre-initialized model. How can I achieve this?
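To make the goal concrete, this is roughly the access pattern I am hoping for. Note that get_model is a hypothetical helper I do not have working, and as far as I understand a guard like this would still load one copy per worker process, which is exactly what I want to avoid:

# Hypothetical lazy accessor: load the model once and hand the same
# instance to every caller instead of re-running from_pretrained.
from unsloth import FastLanguageModel

# Same settings as in model.py; the values here are placeholders.
max_seq_length = 2048
dtype = None
load_in_4bit = True
TOKEN = "hf_..."  # placeholder

_model = None
_tokenizer = None

def get_model():
    global _model, _tokenizer
    if _model is None:
        # With my current setup this branch still runs once per worker
        # process, which is what exhausts the GPU memory.
        _model, _tokenizer = FastLanguageModel.from_pretrained(
            model_name="test-model",
            max_seq_length=max_seq_length,
            dtype=dtype,
            load_in_4bit=load_in_4bit,
            token=TOKEN,
        )
        FastLanguageModel.for_inference(_model)
    return _model, _tokenizer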
I attempted to initialize the model in the Django application at startup, using the same from_pretrained code from model.py shown above.
I expected that the model would be initialized once when the Django application starts, and that this pre-initialized model would be used by all Celery workers without being reinitialized.
However, the actual result is that each Celery worker tries to reinitialize the model upon starting, leading to GPU memory exhaustion and application crashes.