A .airflowignore file specifies the directories or files in DAG_FOLDER that Airflow should intentionally ignore. Each line in .airflowignore specifies a regular expression pattern, and directories or files whose names (not DAG id) match any of the patterns would be ignored (under the hood, re.findall() is used to match the pattern). Overall it works like a .gitignore file.
According to the docs.
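To make the quoted behavior concrete, here is a minimal sketch of regex-style matching with re.findall(). The helper name regex_ignored and the sample patterns and paths are illustrative assumptions, not Airflow's actual code.

```python
import re


def regex_ignored(path, patterns):
    """Return True if any regex pattern matches anywhere in the path.

    Hypothetical helper: it only mirrors the "re.findall() is used"
    description from the quoted docs, not Airflow's real implementation.
    """
    return any(re.findall(pattern, path) for pattern in patterns)


# Illustrative .airflowignore lines, read as regular expressions.
patterns = ["project_a", r"tenant_[\d]"]

paths = [
    "project_a_dag_1.py",
    "TESTING_project_a.py",
    "tenant_1.py",
    "project_a/dag_1.py",
    "tenant_1/dag_1.py",
    "other/dag_2.py",
]

ignored = [p for p in paths if regex_ignored(p, patterns)]
print(ignored)
```

Because the match may occur anywhere in the name, the pattern project_a also catches TESTING_project_a.py, which is unlike how a .gitignore line would behave.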
However, a .gitignore file does not use only regular expressions. It uses a specific combination of special gitignore grammar, supports glob patterns, and supports an everything-except syntax:
*.swp # ------------------- glob pattern to ignore vi swap files
data/item_[0-9]*[.]tar # -- regex pattern to ignore matching items
build/* # ----------------- glob pattern to ignore everything in the build folder...
!build/my-app # ----------- except the one binary that is tracked by git-lfs
What part of the .gitignore syntax does Airflow support? In the same paragraph, the docs claim both that it works like .gitignore (implying all of the syntax) and that each line is a regular expression (implying only the regex part).
If it is truly “all of it”, then the complete .airflowignore file is:
*/
!dags/*
But I am thinking that will not work given the docs’ description.
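One way to test that hunch is to treat each line as a Python regular expression, which is what the docs' description implies. The helper regex_status below is a hypothetical name used only for this sketch.

```python
import re


def regex_status(line):
    """Report whether a line compiles as a Python regular expression."""
    try:
        re.compile(line)
        return "compiles"
    except re.error as exc:
        return f"invalid regex: {exc.msg}"


# The two lines of the proposed .airflowignore, read as regexes.
for line in ["*/", "!dags/*"]:
    print(repr(line), "->", regex_status(line))

# The second line compiles, but as a regex it means a literal "!dags"
# followed by zero or more slashes, so it never matches a normal DAG path.
print(re.search("!dags/*", "dags/my_dag.py"))
```

Under the regex reading, the first line is not a valid pattern at all (a leading * has nothing to repeat), and the second is valid but does not express "everything except dags".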
Airflow does support “everything EXCEPT” syntax with globs.
It also supports regexes and normal expressions, as far as I can tell from the unit tests.
So a valid default .airflowignore file should be:
*/
!dags/
(assuming dags/ at the top of the repo is the convention), based on their unit tests.
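The environment variable must be set (it maps to the dag_ignore_file_syntax option in the [core] section of the Airflow config):
AIRFLOW__CORE__DAG_IGNORE_FILE_SYNTAX=glob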
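The "everything except" behavior in glob mode can be pictured with a toy model in which the last matching line wins and a leading ! un-ignores. This is a deliberately simplified sketch: it only looks at top-level directory names, the function glob_ignored is a hypothetical name, and it is not how Airflow implements glob matching.

```python
from fnmatch import fnmatchcase


def glob_ignored(top_level_dir, lines):
    """Toy gitignore-style check for a single top-level directory name.

    Assumption for illustration: every line is a directory pattern,
    optionally prefixed with "!" to negate it; the last match decides.
    """
    ignored = False
    for line in lines:
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        pattern = pattern.rstrip("/")  # drop the trailing directory marker
        if fnmatchcase(top_level_dir, pattern):
            ignored = not negated
    return ignored


lines = ["*/", "!dags/"]
for name in ["dags", "scripts", "tests"]:
    print(name, "ignored" if glob_ignored(name, lines) else "kept")
```

In this model every directory is ignored by the first line, and dags is then re-included by the second, which is the intent of the two-line file above.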
Worth noting: it may be difficult to reach the latest stable version of the docs from Google, and there may be no indication other than the URL that you are on an older version.
Instead, consider searching directly from the main site. It’s either that, or you run the risk of hitting a stale version with the current doc header, which (in this link) points to version 1.10.1.
I was only able to find the main docs through happenstance, with the help of another SO contributor and an erroneous relative link from that page.