I use MassTransit 8 with Amazon SQS/SNS; the app runs in a Windows Docker container.
I have a single consumer in which the following exceptions can occur: EndpointNotFoundException, TimeoutException, SocketException.
To handle these exceptions I use 3 separate retry policies, each with a different time interval:

    builder.Services
        .Configure<AmazonSqsTransportOptions>(configuration.GetSection("MassTransit:Transports:AmazonSqs"))
        .AddMassTransit(bus =>
        {
            bus.AddDelayedMessageScheduler();
            bus.AddConsumer<MySingleConsumer>();
            bus.UsingAmazonSqs((context, sqs) =>
            {
                var entityNameFormatter = new MessageNameFormatterEntityNameFormatter(new AmazonSqsMessageNameFormatter());
                var options = context.GetRequiredService<IOptions<AmazonSqsTransportOptions>>().Value;
                sqs.MessageTopology.SetEntityNameFormatter(new PrefixEntityNameFormatter(entityNameFormatter, options.Scope));
                sqs.ConfigureEndpoints(context, new KebabCaseEndpointNameFormatter(options.Scope));
                sqs.UseDelayedMessageScheduler();
                sqs.UseMessageRetry(r =>
                {
                    r.Handle(typeof(EndpointNotFoundException));
                    r.Interval(3, TimeSpan.FromSeconds(10));
                });
                sqs.UseMessageRetry(r =>
                {
                    r.Handle(typeof(TimeoutException));
                    r.Interval(3, TimeSpan.FromSeconds(20));
                });
                sqs.UseMessageRetry(r =>
                {
                    r.Handle(typeof(SocketException));
                    r.Interval(3, TimeSpan.FromSeconds(30));
                });
            });
        });

Now, if an exception of one type occurs 3 times in a row, the retry policy handles it correctly, i.e. the exception is finally thrown after 3 retry attempts in a row.
The problem arises when exceptions occur in random order, for instance first EndpointNotFoundException, then TimeoutException, then SocketException, and then EndpointNotFoundException again.
In this case the attempt counter is reset to 0, and for EndpointNotFoundException to be re-thrown you have to wait for it to occur 3 times in a row again.
Here are some examples:
- EndpointNotFoundException occurs 3 times in a row => the next EndpointNotFoundException is re-thrown: CORRECT.
- EndpointNotFoundException occurs 3 times in a row, then TimeoutException occurs once. This resets the EndpointNotFoundException counter to 0, and EndpointNotFoundException has to occur 3 times in a row again before it is re-thrown: INCORRECT.
- SocketException occurs 3 times, TimeoutException occurs 3 times, then SocketException occurs 3 times again; only after this is SocketException re-thrown: INCORRECT.
- SocketException occurs once, TimeoutException occurs twice, EndpointNotFoundException occurs twice; only if SocketException then occurs 3 times is it finally re-thrown: INCORRECT.
In other words, it 'remembers' only the attempts of the most recent exception type in a row. If a different exception occurs, the attempt counter is reset. This could potentially lead to an endless loop of retries, say EndpointNotFoundException once, TimeoutException once, SocketException once, then EndpointNotFoundException once again, TimeoutException once again, and so on, because every exception that differs from the previous one resets the attempt counter to 0.
I noticed that the counter works incorrectly when a separate sqs.UseMessageRetry() call is used for each exception type.
If all exceptions are declared within a single sqs.UseMessageRetry() policy, the counter works correctly for any order of exceptions.
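For reference, the single-policy variant I mean looks roughly like the sketch below. It is a fragment that replaces the three sqs.UseMessageRetry() calls inside UsingAmazonSqs, not a complete program. The shared 10-second interval is only a placeholder assumption: with a single Interval call the separate 10/20/30-second intervals per exception type are lost.

```csharp
// Sketch only: one retry policy that handles all three exception types,
// so there is a single attempt counter regardless of the order of exceptions.
sqs.UseMessageRetry(r =>
{
    r.Handle(typeof(EndpointNotFoundException));
    r.Handle(typeof(TimeoutException));
    r.Handle(typeof(SocketException));

    // Assumption: one shared interval (placeholder value) instead of the
    // per-exception 10/20/30 second intervals used in the three-policy setup.
    r.Interval(3, TimeSpan.FromSeconds(10));
});
```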
Probably I have misunderstood something about how the retry policy works; could you please advise? It is easy to reproduce. Should I file it as a bug on GitHub?
Thanks,
Evgeny.