I have a DAG with multiple SQL tasks (and .sql files) referencing the same template variable called refdate
(each file may reference this variable multiple times within the query code). Let’s assume the sql file looks like this:
select '{{params.refdate}}';
Now I want to use the SQLExecuteQueryOperator
to set the parameter refdate
and execute the query:
sql_task = SQLExecuteQueryOperator(
task_id="sql-task",
conn_id="connection",
sql="query.sql",
params={
"refdate": "{{ macros.ds_add(ds, -2) }}"
}
)
The problem is that this operator will not render the parameter properly. It will just replace the the refdate
parameter with the plain text “{{ macros.ds_add(ds, -2) }}”.
From this question, I noticed that we can simply replace the parameter with the jinja template in the query file. Thus I can directly write my sql query file like this:
select '{{ macros.ds_add(ds, -2) }}';
Now, there is no need to provide the refdate
parameter anymore. We can simply:
sql_task = SQLExecuteQueryOperator(
task_id="sql-task",
conn_id="connection",
sql="query.sql"
)
This change works fine!
My concern, however, is that in my real business case I have multiple sql files and multiple references to refdate
. I’m concerned because the definition of refdate
may change in the future, for example to something like refdate="{{ macros.ds_add(ds, -1) }}"
. In such a case I would need to change multiple SQL files at the same time. Of course this is not a good practice in terms of writing clean and easily mantainable DAG codes.
Any suggestions?