This is an opinion-based question. My use case is real-time and needs sub-second end-to-end processing.
I have an external MongoDB that holds information about all of the users, for persistence and visibility, and one of my Flink operators has state that holds the same information, keyed by <accountId, userId>.
On demand, I send a message containing an accountId to a Kafka topic, which my Flink app consumes.
My application takes that message, does a lookup join against Mongo to fetch all of the users in that account, and fans out one message per userId to the keyed operator I mentioned above.
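To make the fan-out step concrete, here is a minimal, Flink-free sketch of what that operator does logically. The names (`fan_out`, `lookup_users`, `users_by_account`) are illustrative, not from my actual code, and the Mongo lookup join is stood in for by a plain function:

```python
def fan_out(account_id, lookup_users):
    """Simulate the lookup join + fan-out: emit one keyed message
    (accountId, userId) per user found in the account."""
    return [(account_id, user_id) for user_id in lookup_users(account_id)]

# Toy stand-in for the Mongo collection: one account with three users.
users_by_account = {"acct-1": ["u1", "u2", "u3"]}

# An account with N users produces N downstream messages -- this N is
# exactly what blows up for large accounts.
messages = fan_out("acct-1", lambda a: users_by_account.get(a, []))
```

The point is that a single input record becomes N output records, so both checkpointing and downstream throughput scale with the size of the largest account.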
The issue is that this solution doesn't scale for accounts with many users.
Flink's aligned checkpoint mechanism waits for the fan-out to finish completely, while unaligned checkpoints store all of the in-flight messages in state, which doesn't scale either.
I thought about doing the join outside of Flink and then sending the already-fanned-out messages into Flink, but producing that many messages is a problem for me.
I also thought about using the broadcast mechanism, but it's impossible to send a message from the broadcast stream to all of the relevant keys.
What is the best practice here? Is there something I’m missing?
Any help will be greatly appreciated!