I’m migrating my Flask application from Heroku to AWS using EC2 instances. The setup includes Flask, Celery, Celery Beat, and Redis, all of which worked perfectly on Heroku.
However, after deploying on AWS EC2, Celery stops executing new tasks sent from the Flask app after a few hours. It continues to execute tasks scheduled by Celery Beat, but tasks dispatched from the Flask application are no longer picked up or executed. The Flask-dispatched tasks are long-running and normally keep running until the user stops them.
Here’s an overview of my setup:
- Flask is used for the web application.
- Celery is used for handling asynchronous tasks.
- Celery Beat is for scheduling periodic tasks.
- Redis is used as the message broker.
Here's the Celery initialization in the Flask app:

from celery import Celery, Task, signals
from dotenv import load_dotenv
from flask import Flask

from config.common import REDIS_BROKER_URL, REDIS_RESULT_URL


def celery_init_app(flask_app) -> Celery:
    class FlaskTask(Task):
        def __call__(self, *args: object, **kwargs: object) -> object:
            # Run every task inside the Flask application context
            with flask_app.app_context():
                return self.run(*args, **kwargs)

    celery_app = Celery("app_tasks", task_cls=FlaskTask)
    celery_app.config_from_object(flask_app.config["CELERY"])
    celery_app.set_default()
    flask_app.extensions["celery"] = celery_app
    celery_app.conf.update(flask_app.config)
    return celery_app


load_dotenv()

flask_app = Flask(__name__)
flask_app.config.from_mapping(
    CELERY=dict(
        broker_url=REDIS_BROKER_URL,
        result_backend=REDIS_RESULT_URL,
        task_ignore_result=True,
        task_track_started=True,
        task_acks_late=True,
        worker_prefetch_multiplier=1,
    ),
)

celery_app = celery_init_app(flask_app)
if celery_app is not None:
    print("Connected to Celery Worker", flush=True)
else:
    print("Failed to connect to Celery Worker", flush=True)

# Register service tasks
import tasks.tasks
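For context, tasks are dispatched from Flask roughly like this (simplified; `process_upload` and the route are placeholders, not my real names):

```python
# tasks/tasks.py (simplified; process_upload is a placeholder task name)
from celery import shared_task

@shared_task(bind=True)
def process_upload(self, upload_id):
    # long-running work that keeps going until the user stops it
    ...


# In a Flask view (also simplified)
from tasks.tasks import process_upload

@flask_app.route("/process/<int:upload_id>", methods=["POST"])
def start_processing(upload_id):
    result = process_upload.delay(upload_id)  # enqueue on Redis for the worker
    return {"task_id": result.id}, 202
```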
Dockerfile for the Flask app:

FROM python:3.9
ENV PYTHONUNBUFFERED 1
RUN apt update && apt install -y libmagic1
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Set environment variables
ENV FLASK_APP=app.py
ENV FLASK_ENV=production
ENV FLASK_RUN_HOST=0.0.0.0
ENV FLASK_RUN_PORT=7000
ADD . /app
WORKDIR /app
EXPOSE 7000
CMD ["gunicorn", "wsgi:app", "-w", "4", "-b", "0.0.0.0:7000", "--timeout", "120", "--max-requests", "5000"]
Dockerfile for the Celery worker:

FROM python:3.9
ENV PYTHONUNBUFFERED 1
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
RUN pip install --no-cache-dir -r requirements.txt
CMD ["celery", "-A", "tasks.celery", "worker", "-c", "20", "--loglevel=info"]
Things I’ve Tried:
- Restarting the Celery worker and Redis server: this temporarily resolves the issue, but it happens again after a few hours.
- Checking for network connectivity issues: everything seems fine, yet the problem persists.
- Monitoring logs: no obvious errors appear before the worker stops receiving tasks (see the inspection snippet after this list).
- Checking system resources (CPU, memory, disk): everything looks fine, with no signs of resource exhaustion.
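For completeness, this is roughly how I check whether the worker is still reachable once it stops picking up tasks (using Celery's ping/inspect API):

```python
# Quick health check run from the Flask app's environment (simplified)
from app import celery_app

insp = celery_app.control.inspect(timeout=5)
print("ping:", celery_app.control.ping(timeout=5))  # is any worker responding?
print("active:", insp.active())                     # tasks currently executing
print("reserved:", insp.reserved())                 # tasks prefetched but not started
```

What could cause the worker to stop consuming tasks published by Flask while the Celery Beat tasks keep running?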