We changed the authentication type for our MSK clusters from SASL/SCRAM to IAM role-based auth. As part of that change we added a trust relationship and attached a policy granting full access to the IAM role used by the ECS service, but we still see the errors below.
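For reference, the "all access" statement on the task role looks essentially like this (a simplified sketch; the real policy can scope `Resource` down to specific cluster, topic, and group ARNs):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "kafka-cluster:*",
      "Resource": "*"
    }
  ]
}
```

The trust relationship on the role allows `ecs-tasks.amazonaws.com` to assume it via `sts:AssumeRole`.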
Overview of the application: it consumes from an upstream Kafka cluster, processes each Kafka message, and produces it to a downstream Kafka cluster using the confluent-kafka Python library (2.3.0).
The odd behavior we see: the application throws no errors while it is actively processing messages, but after sitting idle for around 5 hours it starts throwing the errors below. If a message comes in, it then runs error-free for the next ~5 hours.
Error: %3|1714955097.400|FAIL|8b68559c-dbf7-401b-ac6c-807523ee37ee#producer-1| [thrd:sasl_ssl://b-3.clusteranme.stinjb.c7.kafka.region.]: sasl_ssl://b-3.clusteranme.stinjb.c7.region.amazonaws.com:9098/3: SASL authentication error: [6ad8e7d6-f5f0-41c9-930f-26cc577779ed]: Access denied (after 346ms in state AUTH_REQ)
We have also updated the policy to allow all Apache Kafka API actions for MSK.
FYI: to generate the auth token we use the aws_msk_iam_sasl_signer library from AWS, which generates a token for the given region; we pass it via the oauth_cb config parameter of the Producer.
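Roughly, the token callback and producer config look like this (a simplified sketch; the region and bootstrap server are placeholders for our real values, and the ms-to-seconds conversion on the expiry follows the signer library's README example):

```python
from confluent_kafka import Producer
from aws_msk_iam_sasl_signer import MSKAuthTokenProvider


def oauth_cb(oauth_config):
    # generate_auth_token returns the token and its expiry in ms since epoch;
    # confluent-kafka expects the expiry in seconds, hence the division.
    token, expiry_ms = MSKAuthTokenProvider.generate_auth_token("us-east-1")
    return token, expiry_ms / 1000


producer = Producer({
    # Placeholder bootstrap server; the real one points at our MSK cluster.
    "bootstrap.servers": "b-1.mycluster.xxxxxx.c7.kafka.us-east-1.amazonaws.com:9098",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "OAUTHBEARER",
    "oauth_cb": oauth_cb,
})
```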