I am working with an Airflow environment managed through Amazon Managed Workflows for Apache Airflow (MWAA). My task logs contain non-critical errors that clutter the output, which I consider a code quality issue. The errors come from the SecretsManagerBackend when Airflow tries to retrieve variables from AWS Secrets Manager.
Here is a snippet of the logs where these errors occur:
*** Reading remote log from Cloudwatch log_group: airflow-SnowflakeAirflow-DEV-MwaaEnvironment-Task log_stream: dag_id=CloudRDBMSOrchestration/run_id=ci_acct__XXXX/task_id=stg-load.execute-orchestration-step.execute-stg-load/attempt=1.log.
...
[2024-08-06T15:37:33.332+0000] {{variable.py:283}} ERROR - Unable to retrieve variable from secrets backend (SecretsManagerBackend). Checking subsequent secrets backend.
Traceback (most recent call last):
File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/variable.py", line 279, in get_variable_from_secrets
var_val = secrets_backend.get_variable(key=key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/secrets/secrets_manager.py", line 278, in get_variable
return self._get_secret(self.variables_prefix, key, self.variables_lookup_pattern)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/secrets/secrets_manager.py", line 311, in _get_secret
response = self.client.get_secret_value(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 553, in _api_call
return self._make_api_call(operation_name, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 1009, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the GetSecretValue operation: User: arn:aws:sts::XXXX:assumed-role/XXXX/AmazonMWAA-airflow is not authorized to perform: secretsmanager:GetSecretValue on resource: cloudrdbms_variables because no identity-based policy allows the secretsmanager:GetSecretValue action
...
These errors do not cause the tasks to fail. As the log message says, Airflow moves on to the subsequent secrets backend, the task runs to completion, and the final task status is SUCCESS. The errors are still logged on every lookup, though, which clutters the logs. The project team has labeled this issue as non-fixable and non-resolvable since it does not impact task success or the overall workflow.
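To make the behavior concrete, here is a minimal sketch of the fall-through pattern as I understand it from the traceback. This is not Airflow's actual code: `DeniedBackend`, `DictBackend`, and `get_variable_from_backends` are hypothetical names, and `PermissionError` only stands in for the botocore `ClientError` shown above.

```python
import logging

log = logging.getLogger("variable_lookup_sketch")


class DeniedBackend:
    """Hypothetical stand-in for a backend whose lookups are rejected."""

    def get_variable(self, key):
        # Stands in for the AccessDeniedException seen in the traceback.
        raise PermissionError(f"not authorized to read {key!r}")


class DictBackend:
    """Hypothetical stand-in for a later backend that does hold the value."""

    def __init__(self, values):
        self._values = values

    def get_variable(self, key):
        return self._values.get(key)


def get_variable_from_backends(key, backends):
    """Try each backend in order; log the failure and move on to the next."""
    for backend in backends:
        try:
            value = backend.get_variable(key)
        except Exception:
            # Logged at ERROR level with a traceback, but the lookup continues.
            log.exception(
                "Unable to retrieve variable from secrets backend (%s). "
                "Checking subsequent secrets backend.",
                type(backend).__name__,
            )
            continue
        if value is not None:
            return value
    return None
```

In this sketch, calling `get_variable_from_backends("cloudrdbms_variables", [DeniedBackend(), DictBackend({"cloudrdbms_variables": "x"})])` logs an error with a traceback for the first backend and then still returns the value from the second, which matches what I see: an ERROR entry followed by a successful task.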
From a code quality perspective, I believe these logs should not contain errors unless they indicate a genuine problem that needs attention. I would like to know if others in the community have encountered similar issues and how they have dealt with them. Should I push for a resolution, or is it indeed acceptable to leave this as is?
Is there a potential impact I might be overlooking if these errors are left unaddressed, even though the task completes successfully?