Tag Archive for apache-kafka

message delivery guarantee in Kafka

My question comes from a misunderstanding about Kafka delivery guarantees: I have looked everywhere for information, and the sources contradict each other. As I understand it, a delivery guarantee describes how messages are delivered to the target (the Kafka broker or the consumer). One article said that before the 2017 Kafka update, Kafka supported only at-least-once and at-most-once delivery. Is that guarantee defined on the producer side or on the consumer side? And which parameter controls it? My assumption is that it is the acks parameter, which determines whether the producer waits for an acknowledgement.

The same article said that transactions were introduced in Kafka in 2017 and that they help achieve exactly-once delivery. Another article said that transactions work only at the topic level and have nothing to do with exactly-once: they only let a producer write messages to two topics within one transaction, and consumers can read those messages only after the transaction is committed. I have also heard about producer-side idempotence: Kafka can deduplicate by storing some message ID in the topic, which prevents duplicates when a message is re-sent after a producer or broker failure.

So how can we achieve full exactly-once, with a guarantee that the whole process completes: sending from the producer to Kafka, then reading and processing on the consumer?
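To make the idempotence part of the question concrete, here is a minimal toy model of broker-side deduplication. As far as I know, the real idempotent producer (enabled with enable.idempotence=true, which requires acks=all) does not store a message ID in the topic; the broker tracks a producer ID and a per-partition sequence number. The class and field names below are hypothetical and only illustrate the idea, they are not Kafka's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy model of a single partition (hypothetical, not the real broker)."""
    records: list = field(default_factory=list)
    # producer_id -> highest sequence number already appended
    last_seq: dict = field(default_factory=dict)

    def append(self, producer_id, seq, value):
        """Append the record unless this sequence number was already accepted."""
        if seq <= self.last_seq.get(producer_id, -1):
            # A retry of an already-written record: acknowledge, do not re-append.
            return False
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True


log = PartitionLog()
log.append(producer_id=7, seq=0, value="order-1")
log.append(producer_id=7, seq=0, value="order-1")  # retry after a lost acknowledgement
log.append(producer_id=7, seq=1, value="order-2")
print(log.records)
```

This only covers the producer-to-broker hop. For the rest of the pipeline, transactions (transactional.id on the producer, isolation.level=read_committed on the consumer) cover read-process-write flows that stay inside Kafka; when the consumer writes to an external system, the processing itself usually has to be idempotent or store its offsets together with its results.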

Duplicated Kafka messages in output of Embulk collection

We are using Embulk v0.10.12.
We collect files with the sftp input and push them to Kafka using embulk-output-kafka.
From time to time, we see duplicated Kafka messages in our output Kafka topic, although the Embulk logs show that each file is processed only once and the Embulk Kafka producer pushes each message only once.
What could be the reason for such duplication?
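One possible cause, which I have not confirmed for embulk-output-kafka, is a producer-level retry: the broker writes the message, the acknowledgement is lost, and the client re-sends it. The application still logs a single send. The sketch below simulates that with a hypothetical ToyBroker and send_with_retries helper; neither is part of Embulk or the Kafka client.

```python
class ToyBroker:
    """Hypothetical broker that appends every request it receives."""

    def __init__(self, drop_ack_for):
        self.log = []
        self.attempts = 0
        # Attempt numbers (1-based) whose acknowledgement is lost in transit.
        self.drop_ack_for = set(drop_ack_for)

    def produce(self, value):
        self.attempts += 1
        self.log.append(value)  # the write itself succeeds
        if self.attempts in self.drop_ack_for:
            # The producer never hears about the successful write.
            raise TimeoutError("acknowledgement lost")


def send_with_retries(broker, value, retries=3):
    """Send once from the caller's point of view, retrying on timeouts."""
    for _ in range(retries + 1):
        try:
            broker.produce(value)
            return
        except TimeoutError:
            continue
    raise RuntimeError("gave up after retries")


broker = ToyBroker(drop_ack_for={1})
send_with_retries(broker, "file-0001.csv")  # one logical send
print(broker.log)  # the same value appears twice in the log
```

If this is what happens, the usual remedies are enabling producer idempotence (enable.idempotence=true with acks=all), provided the plugin lets you pass producer settings, or deduplicating by key on the consumer side.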