We have a Spring Boot application deployed in DC1 that connects to a Kafka cluster whose leader is in DC2. DC1 lost its connection to Kafka for 9 seconds, and during that time the application could not produce messages with the Spring Kafka transactional producer. After 9 seconds the connection was restored and a new leader election took place. However, the transactional producer still could not publish messages and failed with a ProducerFencedException. It took around 60 minutes for the application to recover.
My suspicion is that, because Spring Kafka caches its producers, the cached producer did not refresh its metadata when it reconnected and kept using an epoch that had already expired on the server side, which led to the exception.
How can I configure my Spring Kafka producer to handle this situation effectively?
We cannot make the producer non-transactional, and idempotence is required for our use cases.
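
For context, here is a minimal sketch of the kind of producer settings involved, written as a plain property map so it runs without any Kafka dependency. The property keys are standard Kafka producer config names, but the class name, broker address, transactional ID, and timeout value are hypothetical placeholders, not our actual configuration. In a Spring Kafka application the `transactional.id` is normally derived from a transaction ID prefix set on the producer factory rather than set directly.

```java
import java.util.HashMap;
import java.util.Map;

public class TransactionalProducerProps {

    // Builds a plain map of Kafka producer properties.
    // All values are illustrative assumptions, not a recommended tuning.
    static Map<String, Object> producerProps(String bootstrapServers, String transactionalId) {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", bootstrapServers);
        // Idempotence is required by our use cases.
        props.put("enable.idempotence", true);
        // Idempotent producers require acknowledgement from all in-sync replicas.
        props.put("acks", "all");
        // A transactional.id is what makes the producer transactional.
        props.put("transactional.id", transactionalId);
        // Maximum time the coordinator waits before aborting an open transaction (assumed value).
        props.put("transaction.timeout.ms", 60_000);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical broker address and transactional ID.
        Map<String, Object> props = producerProps("dc2-broker:9092", "orders-tx-0");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

I would like to know which of these settings, or which Spring Kafka producer factory settings, are relevant to recovering from the fenced state.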