We have a system built on GCP Pub/Sub with a main “events” topic that receives different messages from many sources and fans them all out to many different subscribers. Each subscriber only acts on certain messages.
When a message on a subscription fails, it goes to a DLQ Pub/Sub topic; we have one of these for every subscriber.
The problem is that in GCP it seems you cannot replay a message to an individual Pub/Sub subscriber; you must send it back to the topic. This causes problems when the same message was sent to multiple subscribers and only one failed. If you replay it, every subscriber that already processed it successfully receives it a second time.
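To make the duplicate concrete, here is a minimal in-memory sketch of the setup. It is a toy model, not the real Pub/Sub client: the `ToyTopic` class, the subscriber names "A" and "B", and the "order-created" message are all made up for illustration, and failures are simulated by passing a `failing` set.

```python
from collections import defaultdict


class ToyTopic:
    """Toy stand-in for a topic: every publish fans out to all subscriptions."""

    def __init__(self, subscriptions):
        self.subscriptions = list(subscriptions)
        self.delivered = defaultdict(list)    # subscription -> messages handled OK
        self.dead_letter = defaultdict(list)  # subscription -> messages that failed

    def publish(self, message, failing=()):
        # Fan-out: each subscription gets its own copy of the message.
        for name in self.subscriptions:
            if name in failing:
                self.dead_letter[name].append(message)
            else:
                self.delivered[name].append(message)


topic = ToyTopic(["A", "B"])

# First attempt: subscriber A fails, subscriber B succeeds.
topic.publish("order-created", failing={"A"})

# Replay from DLQ A by republishing to the main topic.
for message in topic.dead_letter.pop("A"):
    topic.publish(message)

# A now has the message once, but B has processed it twice.
print("A:", topic.delivered["A"])
print("B:", topic.delivered["B"])
```

In this model the replay has no way to target only A, which is the behaviour I seem to be hitting in the real system.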
When this system was designed, the original developer assumed it would behave like SNS fanning out to multiple SQS queues in AWS, where you can just move a message from the SQS DLQ back to the queue it failed on. Unfortunately this doesn’t seem possible in GCP.
I have drawn a diagram below of the problem. When a message fails on subscriber A, it eventually goes to the DLQ A subscriber. I want it to then follow the green line, but it seems only the orange line is possible:
How have you handled this scenario in GCP? Any suggestions for fixing the system architecture would be welcome.