I use Airflow to schedule my ETL jobs, and the project is under version control.
The environment is built from the Airflow Docker image, and the code is bind-mounted into the container,
so any changes (including those pulled in via `git pull`) are synchronized with the container.
The file structure is as follows:
```
airflow_home/
├── dags/
├── plugins/
├── custom_module/
```
The current situation: if code in dags/ or plugins/ changes, the Airflow scheduler reads the latest code.
However, if there is a code change in the custom module, the scheduler does not detect it;
it keeps using the old module instead of reading the latest code.
I think this is because the custom module is already loaded into memory once the Airflow scheduler starts.
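As a minimal illustration of the caching I suspect is at play (standard Python behavior, nothing Airflow-specific; this assumes `custom_module` is importable):

```python
import importlib
import sys

import custom_module  # first import: source is read and the module is cached

# A repeated import is a no-op; Python hands back the cached object
# from sys.modules instead of re-reading the file.
assert "custom_module" in sys.modules
import custom_module  # does NOT re-read the source

# Only an explicit reload re-executes the module's source from disk.
custom_module = importlib.reload(custom_module)
```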
Is there any way to dynamically load a custom module into the Airflow scheduler?
I’ve tried:
- setting PYTHONPATH to point at the module path
- using setuptools to create an editable package (`pip install -e`)

Both approaches failed; the scheduler still runs the stale code.
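For context, the editable-package attempt was roughly the following (an illustrative sketch, not my exact setup):

```python
# setup.py, placed in the directory that contains custom_module/
from setuptools import setup, find_packages

setup(
    name="custom_module",
    version="0.1.0",
    packages=find_packages(include=["custom_module", "custom_module.*"]),
)
```

It was installed inside the container with `pip install -e .`. In hindsight the failure makes sense: an editable install only changes where Python finds the source on disk; it does nothing about a module that is already cached in the scheduler process's memory.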
Is there any way to dynamically load the module when a DAG task is running?
(Using importlib.reload works, but it adds a lot of boilerplate to each .py file, so I don't plan to use it for now.)
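For completeness, the reload workaround that does work looks roughly like this; `do_etl` is a hypothetical entry point standing in for my actual code, and the DAG arguments assume Airflow 2.4+ syntax:

```python
import importlib

import pendulum
from airflow import DAG
from airflow.operators.python import PythonOperator


def run_etl():
    # Import (or fetch from cache) and then force a re-read of the source,
    # so the long-lived worker process uses the latest code at run time.
    import custom_module
    importlib.reload(custom_module)
    custom_module.do_etl()  # hypothetical entry point in custom_module


with DAG(
    dag_id="etl_with_reload",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
) as dag:
    PythonOperator(task_id="run_etl", python_callable=run_etl)
```

This is exactly the boilerplate problem: every task callable that touches the custom module needs the same import-and-reload preamble.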