I’m trying to understand an unexpected :timeout during :ssh_sftp.start_channel/3. This is the flow of events:
- A Broadway pipeline is running, using a BroadwayKafka producer
- The application receives SIGINT and starts to shut down
- Broadway batchers execute Broadway.handle_batch/4, which kicks off a series of synchronous steps
- The last of those steps writes a file to a remote SFTP server
The precise timing:
- SIGINT received: 15:39:26.382
- First SFTP connection attempted: 15:39:27.618
- First SFTP connection timeout: 15:39:32.620
- The last Kafka consumer leaves the group and the app exits with 0: 15:39:52.711
So the connection “times out” roughly 20 seconds before the app fully shuts down.
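To make the gaps explicit, here is a quick sketch that recomputes the intervals from the timestamps above with Time.diff/3. The variable names are just mine for illustration.

```elixir
# Timestamps copied from the log above (all on the same day, so Time is enough).
sigint   = ~T[15:39:26.382]
attempt  = ~T[15:39:27.618]
timeout  = ~T[15:39:32.620]
app_exit = ~T[15:39:52.711]

# Time.diff/3 returns the first argument minus the second, in the given unit.
sigint_to_attempt  = Time.diff(attempt, sigint, :millisecond)
attempt_to_timeout = Time.diff(timeout, attempt, :millisecond)
timeout_to_exit    = Time.diff(app_exit, timeout, :millisecond)

IO.inspect({sigint_to_attempt, attempt_to_timeout, timeout_to_exit})
# prints {1236, 5002, 20091}
```

So the attempt starts about 1.2 s after SIGINT, fails 5.002 s later, and the app keeps running for another 20 s after that.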
I have two threads I’m not sure how to pull on any further:
- The timeout occurs almost exactly 5s after the attempt, which makes me think there’s some other underlying timeout. But why would we only hit this timeout on SIGINT?
- Maybe :timeout is a misleading error here and there’s something else going on entirely
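One way I could probe the first thread, as a rough sketch rather than anything I have confirmed: pass explicit, much larger timeouts to :ssh_sftp.start_channel/3 and measure how long the call really takes. SftpProbe and its arguments are hypothetical names, and the option list reflects my understanding of the :ssh client options, so treat it as an assumption.

```elixir
defmodule SftpProbe do
  # Hypothetical helper for debugging only, not part of the real pipeline.
  # host, user and password must be charlists, e.g. ~c"sftp.example.com".
  def connect(host, port, user, password) do
    # Make sure the OTP :ssh application (and its dependencies) is running.
    {:ok, _started} = Application.ensure_all_started(:ssh)

    opts = [
      user: user,
      password: password,
      silently_accept_hosts: true,
      user_interaction: false,
      # Both values are deliberately far above 5 s (assumed option names).
      connect_timeout: 30_000,
      timeout: 30_000
    ]

    # :timer.tc/1 returns {elapsed_microseconds, result_of_fun}.
    {micros, result} =
      :timer.tc(fn -> :ssh_sftp.start_channel(host, port, opts) end)

    # Return elapsed milliseconds alongside whatever start_channel returned.
    {div(micros, 1000), result}
  end
end
```

If a call like this still returns {:error, :timeout} after about 5000 ms during shutdown, the timeout is presumably coming from somewhere other than these two options.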
Anybody have any ideas? I’m hoping there’s a magic “Ah you forgot about !” solution here.
I would expect the SFTP connections to succeed and the writes to finish before the application exits successfully.