While investigating high system latency in a Python Apache Beam pipeline running on Dataflow, I noticed that for some steps the reported StartBundle and/or FinishBundle wall times can totally dominate the ProcessElement time.
At first glance that does not look healthy to me, and I'm wondering whether it indicates a deeper issue.
Examples of processing in these steps:
- Reading bytes from Kafka and parsing them with FastAvro
- Stateful DoFn with cache and timer
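
To make the comparison concrete, here is a minimal, Beam-free sketch of how wall time can end up attributed to the bundle hooks rather than to per-element processing. It is not the actual pipeline code. `BufferingFn` and `run_bundle` are hypothetical names that only mirror the `start_bundle` / `process` / `finish_bundle` method names of a Beam `DoFn`, and the `time.sleep` call stands in for whatever deferred work (a flush, a commit, a cache write) might happen at the end of a bundle. It assumes that work buffered in `process` and completed in `finish_bundle` is one possible reason for the skew, which may or may not apply here.

```python
import time
from collections import defaultdict


class BufferingFn:
    """Hypothetical DoFn stand-in: buffers in process, does the real work in finish_bundle."""

    def start_bundle(self):
        self.buffer = []

    def process(self, element):
        self.buffer.append(element)  # cheap per-element work
        return []

    def finish_bundle(self):
        time.sleep(0.02)  # placeholder for deferred work such as a flush or commit
        return [e.upper() for e in self.buffer]


def run_bundle(fn, elements):
    """Run one bundle and return (outputs, wall seconds per lifecycle phase)."""
    wall = defaultdict(float)
    outputs = []

    def timed(phase, call, *args):
        start = time.perf_counter()
        result = call(*args)
        wall[phase] += time.perf_counter() - start
        return result

    timed("start_bundle", fn.start_bundle)
    for element in elements:
        outputs.extend(timed("process", fn.process, element))
    outputs.extend(timed("finish_bundle", fn.finish_bundle))
    return outputs, dict(wall)


outputs, wall = run_bundle(BufferingFn(), ["a", "b", "c"])
```

In this toy setup the `finish_bundle` entry in `wall` is much larger than the `process` entry even though every element passed through `process`, which is the same shape as what the Dataflow step metrics show.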