I’m using Django with celery on Heroku with Redis.
I have two queues, called 'celery' and 'letters'.
I have two worker types on Heroku
- worker (dyno with lots of RAM): it can run tasks from both queues.
- letter-worker (a smaller dyno): it should only process tasks from the 'letters' queue.
My heroku procfile looks as follows:
worker: REMAP_SIGTERM=SIGQUIT celery -A shareforce.taskapp worker --loglevel=info --concurrency=3 --prefetch-multiplier=2 -Q letters,celery
letter-worker: REMAP_SIGTERM=SIGQUIT celery -A shareforce.taskapp worker --loglevel=info --concurrency=2 --prefetch-multiplier=2 -Q letters -X celery
In my config I have this setting
CELERY_TASK_DEFAULT_QUEUE = "celery"
I have the following celery task:
@shared_task(queue="celery")
def generate_export(**kwargs):
pass
I call it like this:
generate_export.delay()
Occasionally (about 1 in 50 calls) it runs on the letter-worker, causing that dyno to run out of memory; the resulting restart kills any running task on either worker.
I don't understand how this is possible or how to fix it.
Any help or suggestions on how to fix or debug this would be appreciated.
The problem you're having is probably caused either by a task-routing configuration error or by the -X celery flag failing to properly exclude the celery queue on the letter-worker.
Here is a more thorough look at potential causes and solutions.
- Configuring Task Routing
With CELERY_TASK_DEFAULT_QUEUE = "celery" set, the celery queue is the destination for tasks without a designated queue. It doesn't hurt to double-check, but since you assign queue="celery" directly when defining the task, that setting shouldn't be what matters here.
Verify the task's queue assignment: confirm that generate_export actually has queue="celery" applied to it. Given that it occasionally runs on the letter-worker, there may be a race condition or an error in Celery's routing.
- Celery's Exclusion (-X) Flag
The -X celery option in your letter-worker configuration is meant to exclude the celery queue. Note, however, that -X only removes queues from the set a worker would otherwise consume; combined with an explicit -Q letters it is redundant, and a misconfiguration elsewhere could still let the letter-worker pick up jobs from the celery queue.
Solution: configure explicit routing rules and route each task to the appropriate queue instead of relying on -X celery.
- Put Explicit Task Routing into Practice
To guarantee that jobs are routed to the appropriate queues, you can define an explicit task-routing map.
Add the following to your settings.py:
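A minimal sketch (the dotted task paths are assumptions; replace them with wherever your tasks actually live):

```python
# settings.py
# Hypothetical module paths; adjust to your project layout.
CELERY_TASK_ROUTES = {
    "shareforce.taskapp.tasks.generate_export": {"queue": "celery"},
    # Route letter tasks explicitly as well, e.g.:
    # "shareforce.taskapp.tasks.send_letter": {"queue": "letters"},
}
```

With the CELERY_ settings namespace (which your CELERY_TASK_DEFAULT_QUEUE setting suggests you are using), this maps to Celery's task_routes option.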
In this manner, jobs will always be sent to the correct queue in accordance with the routing rules you have specified.
- Distinct Task Categories in Procfile
The Heroku Procfile specifies which workers consume which queues. Rather than depending on -X, you can separate the task types explicitly.
Make the following changes to your Heroku Procfile:
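Based on your original entries, with the queue lists tightened (note the trade-off: the large worker will no longer help drain the letters queue):

```procfile
worker: REMAP_SIGTERM=SIGQUIT celery -A shareforce.taskapp worker --loglevel=info --concurrency=3 --prefetch-multiplier=2 -Q celery
letter-worker: REMAP_SIGTERM=SIGQUIT celery -A shareforce.taskapp worker --loglevel=info --concurrency=2 --prefetch-multiplier=2 -Q letters
```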
This way the worker processes only tasks from the celery queue, and the letter-worker only tasks from the letters queue. Since the queue assignment handles the separation, the -X flag is no longer required.
- Manage Concurrency in Worker Queues
Limiting concurrency reduces the number of tasks executing at once and can help keep the letter-worker from running out of memory. You already have concurrency set to 2 there; if memory problems persist, consider reducing it further so the smaller instance isn't overloaded by individual jobs.
- Extra Troubleshooting
Examine the task logs: turn on more verbose logging in Celery to see which workers pick up which tasks. This will help you determine why certain tasks end up on the wrong worker.
You can increase log verbosity by setting the log level to debug on the worker command:
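For example, on the letter-worker (all other flags copied from your Procfile entry):

```shell
REMAP_SIGTERM=SIGQUIT celery -A shareforce.taskapp worker --loglevel=debug --concurrency=2 --prefetch-multiplier=2 -Q letters
```

At debug level the worker logs each message it receives, so a task landing on the wrong worker will show up immediately.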
Monitor Redis: stale configuration or conflicting task assignments can occasionally leave jobs in a queue where the wrong worker picks them up. Watch the Redis queues to make sure they behave as expected.
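With the Redis broker, each Celery queue is by default a Redis list named after the queue, so you can inspect queue lengths directly (REDIS_URL here is assumed to be your Heroku config var):

```shell
redis-cli -u "$REDIS_URL" llen celery
redis-cli -u "$REDIS_URL" llen letters
```

If the letters list keeps growing while celery stays near zero (or vice versa), that points at a routing problem rather than a consumption problem.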
Complete Setup:
- Modify the Procfile to split the workers between the celery and letters queues.
- Declare task routing explicitly in your settings.py.
- Adjust concurrency on memory-intensive workers so smaller instances aren't overloaded.
- Use debugging and logging to keep an eye on task execution and routing.
Following these steps should guarantee that your letter-worker only processes jobs from the letters queue and stays away from tasks on the celery queue.