We are running a Kafka cluster with 6 active nodes. The Kafka version is 2.13-3.2.0 (that is, Kafka 3.2.0 built for Scala 2.13). There are only a little over 10 topics in the system and around 50 consumers, spread across different consumer groups. Recently we found that the server log file is growing rapidly, reaching more than 500 GB in under 7 days. The log line below is printed repeatedly and frequently in the server log.
**WARN: [LeaderEpochCache topic-my-topic-12] New epoch entry EpochEntry(epoch=53, startOffset=83104) caused truncation of conflicting entries ListBuffer(EpochEntry(epoch=54, startOffset=83103)). Cache now contains 16 entries. (kafka.server.epoch.LeaderEpochFileCache)**
The above log is printed for only one topic, but for many of that topic's partitions, for example:
topic-my-topic-12,
topic-my-topic-18,
topic-my-topic-14,
The topic in question has 30 partitions and a replication factor of 4.
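To see how the warnings are distributed across the 30 partitions, the server log could be tallied per partition. The following is only a minimal sketch: the function name `count_truncation_warnings` is a placeholder introduced here, and the pattern assumes the warning always has the same shape as the line quoted above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the WARN line quoted above:
# "[LeaderEpochCache <topic>-<partition>] New epoch entry EpochEntry(epoch=N, startOffset=M) caused truncation ..."
WARNING_PATTERN = re.compile(
    r"\[LeaderEpochCache (?P<tp>[^\]]+)\] New epoch entry "
    r"EpochEntry\(epoch=(?P<epoch>\d+), startOffset=(?P<offset>\d+)\) "
    r"caused truncation of conflicting entries"
)


def count_truncation_warnings(lines):
    """Count truncation warnings per topic-partition.

    `lines` is any iterable of log lines, for example an open file object.
    Lines that do not match the assumed warning shape are ignored.
    """
    counts = Counter()
    for line in lines:
        match = WARNING_PATTERN.search(line)
        if match:
            counts[match.group("tp")] += 1
    return counts
```

Passing an open handle to the broker's server log (iterated line by line, so the 500 GB file is never loaded into memory) and printing `counts.most_common()` would show whether a few partitions account for most of the volume or whether it is spread evenly.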
Seven days ago there was a data center outage and we had to restart all the Kafka nodes. All nodes are currently active and running fine. For the last seven days no consumers have been consuming messages from the topic: after the outage we couldn’t start the consumer for it. One producer was still sending messages to the topic, though not very frequently. We have now stopped the producer, but the WARN logs continue to appear. Kafka message retention is configured for 7 days.
Any help troubleshooting this would be appreciated.