I have an application that is essentially a web server (with its own Dockerfile and dependencies) that includes a custom workflow engine. In addition, I have around 20 Docker images with AI models, each with a FastAPI wrapper exposing the model's API. When a user makes a request, the web server builds a DAG (not an Airflow DAG, but a DAG in this custom workflow engine), where each 'component' of the DAG calls the web API of a specific AI model container.
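To give an idea of what a 'component' does, it essentially boils down to an HTTP call to the FastAPI wrapper of one model container, something like this (hostnames, routes, and payloads are simplified placeholders):

```python
import requests

def run_component(payload: dict) -> dict:
    # A component POSTs to the FastAPI wrapper of one specific model container
    # and returns the model's response; the URL below is a placeholder.
    response = requests.post("http://model-a:8000/predict", json=payload, timeout=300)
    response.raise_for_status()
    return response.json()
```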
What I want to do is replace my custom workflow engine with Airflow, running some PythonOperators inside the main web server and others inside the AI model containers. However, I have run into a few problems:
How to create Airflow workers of different types? Each AI model has a different Docker image, so, as I understand it, I need to install an Airflow worker in each of these images. Certain PythonOperators must be executed on a worker inside a container of a specific type; the sketch below shows the kind of setup I have in mind.
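If Celery queues are the right mechanism for this, my rough understanding is that each model container would run `airflow celery worker --queues model_a` next to its FastAPI app, and the task would be pinned to that queue, roughly like this (DAG, task, and queue names are made up):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def call_model_a(**context):
    # Placeholder: would run code that only exists inside the model-a image
    # (heavy ML dependencies the web server does not have).
    ...

with DAG(dag_id="inference_pipeline", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    run_model_a = PythonOperator(
        task_id="run_model_a",
        python_callable=call_model_a,
        queue="model_a",  # only the worker started inside the model-a container listens on this queue
    )
```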
How to make these workers visible to Airflow?
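My guess is that 'visible' simply means every worker is configured against the same Celery broker and Airflow metadata database as the scheduler, e.g. environment variables along these lines (Airflow 2.3+ config keys; connection strings are placeholders), but I'm not sure that's all there is to it:

```
AIRFLOW__CORE__EXECUTOR=CeleryExecutor
AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
```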
How to define PythonOperators inside containers so that they are visible in the main web app's DAG? Can I register them in Airflow somehow and reference them in the main web server's DAG by ID or something similar? I have read about the KubernetesPodOperator and DockerOperator, but as I understand it, they only start and stop containers. I want to keep each container running, with a PythonOperator and a worker inside it. The reason I can't keep all the operator code in one project is that the dependencies (even Python versions) sometimes differ drastically.
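To make the question concrete, this is roughly what I would like to be able to write in the main web server's DAG, with the model-specific imports only resolvable inside the corresponding images (all names below are made up), though I don't know whether each worker would also need a copy of this DAG file:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_ocr(**context):
    # Desired behaviour: this import only works inside the "ocr" image,
    # so the task has to run on the worker living in that container.
    from ocr_model import predict  # hypothetical package baked into the ocr image
    return predict(context["params"]["document"])

def run_summarizer(**context):
    from summarizer_model import summarize  # hypothetical package in the summarizer image
    return summarize(context["ti"].xcom_pull(task_ids="run_ocr"))

with DAG(dag_id="user_request_pipeline", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    ocr = PythonOperator(task_id="run_ocr", python_callable=run_ocr, queue="ocr")
    summary = PythonOperator(task_id="run_summarizer", python_callable=run_summarizer, queue="summarizer")
    ocr >> summary
```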