Airflow's GCSToBigQueryOperator has a parameter, max_id_key, which the docs describe as: "If set, the name of a column in the BigQuery table that's to be loaded. This will be used to select the MAX value from BigQuery after the load occurs. The results will be returned by the execute() command, which in turn gets stored in XCom for future operators to use. This can be helpful with incremental loads--during future executions, you can pick up from the max ID."
I've added this parameter to my task, and I can see the return value in the XCom list in the admin UI. What I can't work out is how to pull that value and use it in subsequent queries, so that instead of SELECT * FROM tblFooBar
I can run SELECT * FROM tblFooBar WHERE ID > {max_id_xcom}
Any idea how to go about this?
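
To make the goal concrete, here is a minimal sketch of what I think the query side could look like. It rests on some assumptions: the task id `gcs_to_bq_load` and both helper functions are hypothetical names of my own, the ID column is an integer, and the max ID is stored under XCom's default `return_value` key, so `ti.xcom_pull(task_ids=...)` inside a templated field should pick it up. The sketch only builds the SQL strings and does not import Airflow itself.

```python
# Hypothetical task id of the GCSToBigQueryOperator task that sets max_id_key.
LOAD_TASK_ID = "gcs_to_bq_load"


def incremental_sql_template(table: str, id_column: str, load_task_id: str) -> str:
    """Return a Jinja-templated query for a templated operator field (e.g. sql).

    Airflow renders the {{ ... }} expression at run time. Calling xcom_pull
    with only task_ids reads the default 'return_value' key, which is where
    the value returned by execute() is stored.
    """
    xcom_expr = "{{ ti.xcom_pull(task_ids='" + load_task_id + "') }}"
    return f"SELECT * FROM {table} WHERE {id_column} > {xcom_expr}"


def incremental_sql(table: str, id_column: str, max_id) -> str:
    """Build the same query in plain Python.

    Intended for a Python callable that has already done something like
    max_id = ti.xcom_pull(task_ids=LOAD_TASK_ID). Assumes an integer ID.
    """
    if max_id is None:
        # No previous load recorded: fall back to a full select.
        return f"SELECT * FROM {table}"
    # int() guards against a non-numeric value ending up in the SQL text.
    return f"SELECT * FROM {table} WHERE {id_column} > {int(max_id)}"


print(incremental_sql_template("tblFooBar", "ID", LOAD_TASK_ID))
print(incremental_sql("tblFooBar", "ID", "42"))
```

The first helper is meant for passing the string to an operator's templated query field; the second is for the case where the value is pulled inside Python code instead.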