While testing whether quorum queues work well with our existing infrastructure, I noticed that our current setup is not as resilient to broker failures/restarts as I had previously assumed (NB: the problem also occurs with mirrored classic queues; I verified that after noticing the problem).
For context: We are using the Bitnami chart for our RabbitMQ deployment (see: https://github.com/bitnami/charts/tree/main/bitnami/rabbitmq), with three replicas. We opted for a Service of type LoadBalancer, so that external connections to the broker are possible. This means my application uses spring.rabbitmq.addresses=<LOADBALANCER IP>.
Now, when the broker is restarted (e.g. by issuing kubectl rollout restart statefulset/rabbitmq, or by draining nodes for maintenance) and my application is connected to one of the restarting pods while publishing messages, I observe some rather unexpected behavior: depending on various factors, I lose up to 20,000 messages out of a million. I then, of course, tried to understand the problem and find solutions, but I feel quite stuck at the moment.
What I really want:
On temporary broker connection errors, like the ones that happen during a restart, I want to make sure all my messages are delivered to the broker after the connection is restored.
If that is not possible I would want to get a log message for each message that could not be delivered.
For reference, I built a small project to reproduce my “current” problem; you can find it at https://github.com/Linus9000/spring-amqp-demo/tree/master
Now, what did I actually find and learn?
First: My code publishing a message looked something like this:
package com.example.rabbitmqdemo;

import lombok.extern.apachecommons.CommonsLog;
import org.springframework.amqp.core.Message;
import org.springframework.amqp.core.MessageDeliveryMode;
import org.springframework.amqp.core.MessageProperties;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/")
@CommonsLog
public class RabbitController {

    private final RabbitTemplate rabbitTemplate;

    public RabbitController(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @GetMapping
    public ResponseEntity<String> index() {
        // publish one million small messages
        for (int i = 0; i < 1_000_000; i++) {
            try {
                this.sendMessage("myexchange", "myrouting", String.valueOf(i));
            } catch (Exception e) {
                log.error("Could not send message", e);
            }
        }
        return ResponseEntity.noContent().build();
    }

    private void sendMessage(String exchange, String routingKey, Object content) {
        MessageProperties properties = new MessageProperties();
        // persistent, so messages should survive a broker restart
        properties.setDeliveryMode(MessageDeliveryMode.PERSISTENT);
        Message message = new Message(content.toString().getBytes(), properties);
        this.rabbitTemplate.convertAndSend(exchange, routingKey, message);
    }
}
Now, when I restart the broker while I am in the for loop (i.e. waiting for my controller call to finish) I get one(!) error log:
2024-06-18T09:15:30.370+02:00 ERROR 25988 --- [rabbitmq-demo] [nio-8080-exec-1] c.example.rabbitmqdemo.RabbitController : Could not send message
org.springframework.amqp.AmqpIOException: java.net.SocketException: Connection reset by peer
Caused by: java.net.SocketException: Connection reset by peer
But when I check the broker (via the management UI) I see only 959,372 messages in my queue. WTF happened here?
Naturally, this concerns me. After some digging I find the spring.rabbitmq.template.retry configs, leading to:
spring:
  rabbitmq:
    template:
      retry:
        enabled: true
        max-attempts: 100
        multiplier: 2
        max-interval: 120000
Do note the quite absurd values here.
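As far as I understand, these properties simply put a Spring Retry RetryTemplate on the auto-configured RabbitTemplate, so each (blocking) send is retried with exponential backoff. Roughly this, done programmatically (a sketch based on my reading of the Boot docs; the 1-second initial interval is Boot's default as far as I know):

import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.retry.backoff.ExponentialBackOffPolicy;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;

// Rough programmatic equivalent of the spring.rabbitmq.template.retry.* properties above.
void configureRetry(RabbitTemplate rabbitTemplate) {
    SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
    retryPolicy.setMaxAttempts(100);             // max-attempts: 100

    ExponentialBackOffPolicy backOff = new ExponentialBackOffPolicy();
    backOff.setInitialInterval(1_000);           // initial-interval: Boot default of 1s (assumption)
    backOff.setMultiplier(2.0);                  // multiplier: 2
    backOff.setMaxInterval(120_000);             // max-interval: 120000

    RetryTemplate retryTemplate = new RetryTemplate();
    retryTemplate.setRetryPolicy(retryPolicy);
    retryTemplate.setBackOffPolicy(backOff);
    rabbitTemplate.setRetryTemplate(retryTemplate);
}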
Unfortunately, this only leads to more missing messages; I’m down to 909,813 messages that actually make it into the queue.
So, next experiment. How about spring.rabbitmq.template.mandatory?
spring:
  rabbitmq:
    template:
      mandatory: true
      retry:
        enabled: true
        max-attempts: 100
        multiplier: 2
        max-interval: 120000
Still, only 947,848 messages.
Okay then, time for the next experiment: spring.rabbitmq.publisher-confirm-type=correlated. Due to performance penalties I now only try to put 150,000 messages into the queue:
spring:
  rabbitmq:
    publisher-confirm-type: correlated
    template:
      mandatory: true
      retry:
        enabled: true
        max-attempts: 100
        multiplier: 2
        max-interval: 120000
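If I read the Spring AMQP docs correctly, the intended way to use correlated confirms is to attach a CorrelationData to each send and register a confirm callback that reports acks/nacks asynchronously. Something like this inside my RabbitController (just a sketch, not part of the demo project; I'm simply using the payload as the correlation id):

    public RabbitController(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
        // Register once; called asynchronously with an ack/nack for every publish.
        this.rabbitTemplate.setConfirmCallback((correlationData, ack, cause) -> {
            if (!ack) {
                log.error("Broker nacked message "
                        + (correlationData != null ? correlationData.getId() : "?")
                        + ": " + cause);
            }
        });
    }

    // Attach a CorrelationData to every send so the callback can identify the message.
    // (needs import org.springframework.amqp.rabbit.connection.CorrelationData)
    private void sendMessage(String exchange, String routingKey, Object content) {
        MessageProperties properties = new MessageProperties();
        properties.setDeliveryMode(MessageDeliveryMode.PERSISTENT);
        Message message = new Message(content.toString().getBytes(), properties);
        this.rabbitTemplate.convertAndSend(exchange, routingKey, message,
                new CorrelationData(content.toString()));
    }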
Now we’re getting somewhere: 149,970 messages in the queue. Only 30 messages missing! Time for the last puzzle piece, spring.rabbitmq.publisher-returns:
spring:
  rabbitmq:
    publisher-confirm-type: correlated
    publisher-returns: true
    template:
      mandatory: true
      retry:
        enabled: true
        max-attempts: 100
        multiplier: 2
        max-interval: 120000
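My understanding is that publisher-returns together with mandatory means the broker sends unroutable messages back to the client, and I would need a returns callback to actually see them. A sketch of what I would register, e.g. right next to the confirm callback in the constructor (setReturnsCallback is the newer Spring AMQP API, if I remember correctly; older versions had setReturnCallback instead):

        // Called for every message the broker returns as unroutable.
        // (needs import org.springframework.amqp.core.ReturnedMessage)
        this.rabbitTemplate.setReturnsCallback((ReturnedMessage returned) ->
                log.error("Message returned by broker: replyText=" + returned.getReplyText()
                        + ", exchange=" + returned.getExchange()
                        + ", routingKey=" + returned.getRoutingKey()));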
Now, we get additional error logs:
2024-06-18T09:39:53.980+02:00 ERROR 22904 --- [rabbitmq-demo] [nio-8080-exec-1] o.s.a.r.c.CachingConnectionFactory : Could not configure the channel to receive publisher confirms
java.io.IOException: null
Caused by: com.rabbitmq.client.ShutdownSignalException: connection error
Caused by: java.net.SocketException: Connection reset
And also:
2024-06-18T09:39:54.198+02:00 ERROR 22904 --- [rabbitmq-demo] [nio-8080-exec-1] c.example.rabbitmqdemo.RabbitController : Could not send message
org.springframework.amqp.AmqpException: PublisherCallbackChannel is closed
And still, only 149,972 messages.
Maybe it's wrong that I catch the exception when sending? Maybe the retry config already handles that?
Let’s try!
    @GetMapping
    public ResponseEntity<String> index() {
        for (int i = 0; i < 150_000; i++) {
            this.sendMessage("myexchange", "myrouting", String.valueOf(i));
        }
        return ResponseEntity.noContent().build();
    }
Now it just stops after getting disconnected once, which in this case means only 30,524 messages made it into the queue.