Flink streaming pipeline with one kafka partition with 5 parallelism with keyed event windows with 1 minute tumbling window. We are using ascending timestamp watermarks with default periodic generator.
We are having lot of late records from different keyed windows. I’m not entirely sure if this is related to flink not having keyed watermarks. The events incoming are in timestamp order so I’m thinking it is something to do with advancing watermark.
Is it something do with global watermarks which is marking records as late even though they are consumed in kafka source before the watermark is generated and before they are assigned windows? In other words can watermark advancement mark the records as late by closing the window prematurely while we have events pending window assignment ?
Please ask me if you need more details.