I am writing a Flink application where the order of the output records is paramount. Within a given batch of records it's OK if they arrive out of order, but batch 2 should never arrive before batch 1. I need to output them to an AWS Kinesis Stream and, for each key (which should always be written to the same shard), I cannot afford to lose the order of the records. I am using the KinesisStreamsSink for this, and the way I create one looks like this:
KinesisStreamsSink.<Rec>builder()
        .setKinesisClientProperties(kinesisProducerConfig)
        .setSerializationSchema((SerializationSchema<Rec>) rec ->
                rec.Json.getBytes(StandardCharsets.UTF_8))
        .setStreamName(streamName)
        .setPartitionKeyGenerator(element -> element.key)
        .setMaxInFlightRequests(1)
        .build();
Rec contains two fields, key and Json. The former is used to decide which shard the element should go to, while the latter is the payload I want to send through Kinesis. The order in which the records show up on the other side should be precisely the order in which I'm injecting them into Kinesis.

The problem seems to happen when there are errors, specifically throughput-exceeded errors. On those occasions the order seems to be lost; when there are no throughput errors, order seems to be preserved. I set MaxInFlightRequests to 1 in an attempt to have the sink not start handling another request (batch) without fully sending the current one, i.e. only after having successfully retried all the records that failed. But that doesn't seem to be what's happening: if a record fails, it seems to be added to the next batch and thus mixed in with the next batch of records.
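For reference, Rec is essentially just those two fields (shown here as a plain POJO; the String types are inferred from how the sink uses them):

public class Rec {
    public String key;   // partition key: decides which shard the record goes to
    public String Json;  // JSON payload that is written to the Kinesis stream
}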
Is there any way to configure KinesisStreamsSink so that the order is preserved in the face of errors? I don't care if the order is lost within a single batch, but batches should go out in the right order. Basically: never start sending records belonging to batch #3 before all the records from batch #2 have been sent, which, in turn, should not have a single record sent before batch #1 has been fully sent to the Kinesis Stream, and so on.
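For completeness, this is the full shape of the configuration I am experimenting with. Apart from setMaxInFlightRequests, the only other buffering-related knobs I have found on the builder are setMaxBatchSize, setMaxBufferedRequests and setFailOnError; I am not sure whether any combination of them gives the per-batch ordering guarantee described above, so treat the values below as guesses rather than a working answer (buildSink is just a name I use for this sketch):

import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.connector.kinesis.sink.KinesisStreamsSink;

public class SinkFactory {

    static KinesisStreamsSink<Rec> buildSink(Properties kinesisProducerConfig, String streamName) {
        return KinesisStreamsSink.<Rec>builder()
                .setKinesisClientProperties(kinesisProducerConfig)
                .setSerializationSchema((SerializationSchema<Rec>) rec ->
                        rec.Json.getBytes(StandardCharsets.UTF_8))
                .setStreamName(streamName)
                .setPartitionKeyGenerator(element -> element.key)
                // only one in-flight PutRecords request at a time
                .setMaxInFlightRequests(1)
                // guesses: maybe a smaller batch / buffer limits how far a retried record can drift?
                .setMaxBatchSize(500)
                .setMaxBufferedRequests(1000)
                // false = keep retrying failed records (the behaviour that seems to break ordering)
                .setFailOnError(false)
                .build();
    }
}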