I have a springboot service at the place i work in. In a nutshell, all this service does is fire rabbitMQ Consumers to start consuming messages from different queues it is subscribed to. And we can toggle these consumers to start or stop using a controller. Assuming that we have 5 services being listened by the support service (let me call it: SS), we have 5 controllers configured to toggle these consumer starts/stops.
Note: Our publishers are upstream systems which we have no control over.
An abstraction of the queues bound to each service exchange is as follows:
- <SERVICE_NAME>_QUEUE
- <SERVICE_NAME>_DEADLETTER
- <SERVICE_NAME>_PARKING_LOT
The way these are configured is that when a message comes to the exchange, it maps it to the default queue. If there is any processing issue, the message is routed to a deadletter queue and it stays there for 3 retries. After the 4th retry (1 normal flow + 3 retry flows), the message is routed to the parkingLot queue for storage. These messages will then be consumed by the SS and will also store these records in a database for persistence and later analysis.
The issue here is that we are getting an x-death error which is in turn discarding the message and not allowing the SS to store the payload into the DB. Here are the error logs:
These logs come in groups of 4 as shown in a comment i added.
As the logs suggest, there seems to be an issue while trying to consume messages with an x-death header & the cause for failure being a Fatal Message Conversion Error. But this does not seem to be the case. I tried to recreate this in my dev environment but this error is so inconsistent that it rarely gets reproduced. Some notable points are mentioned below:
- For a correct payload, nothing seems to happen.
- Even if there is a wrong payload, the message goes to deadletter (which appends an x-death header). Sometimes the SS consumes it & stores the payload in the DB with no hiccups. But on rare occasions, it throws this x-death header error.
- Whenever this x-death error gets thrown, neither is the record consumed, nor it gets stored in the DB. Additionally, the payload is not printed in the logs (Important for RCA in Prod as this issue was caught in prod).
I have tried sending multiple permutations and combinations of wrong payloads directly onto the producer exchanges in hopes of recreating these issues. 404, 500, incorrect api urls, wrong payload format, etc but it was all consumed and data was stored in DB with relevant error logs.
One notable finding is that i was able to recreate this error only once when i did these specific set of steps.
- Started the Service and let it connect to all the relevant queues.
- Waited for sometime – giving the service some idle time.
- Published the messages directly into the exchange (with an incorrect API URL which throws a 500 Internal Server error upon hitting it). Sent 31 of the same messages.
- Waited for some more time for the messages to go into the DeadLetter and then to the parkingLot (Around 3 minutes – TTL in DL is 1 minute).
- After the messages reached the parkingLot queue, i waited for another 5 minutes.
- Hit the SS API with start parameter to toggle the Consumer Start.
This is when for 27 of the 31 messages, the x-death error was thrown and only 4 of these persisted in the database with relevant errors. But when i tried these same steps later on in the day (around 2 hours after i recreated it), i was able to consume all the messages; even though i followed the same steps mentioned above.
Could there be any other reason as to why this error is occurring? Any help would be greatly appreciated.
Note: The dev application (SS) is deployed on AKS (Azure) and the API’s are also AKS URL’s.
PS: IF you need any other information, feel free to request.
StackOverflow wouldn’t allow me to paste relevant logs flagging them as spam. Here are the logs as mentioned in the original question:
{"@timestamp":"2024-08-13T04:25:55.666340167Z","@version": "1","message": "Execution of Rabbit message listener failed.","logger_name": "org.springframework.amqp.rabbit.listener.ConditionalRejectingErrorHandler","thread_name": "order-2","level": "WARN","level_value": 30000,"stack_trace": "org.springframework.amqp.rabbit.support.ListenerExecutionFailedException: Failed to convert messagentat org.springframework.amqp.rabbit.listener.adapter.MessagingMessageListenerAdapter.onMessage(MessagingMessageListenerAdapter.java:158)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:1663)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.actualInvokeListener(AbstractMessageListenerContainer.java:1582)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:1570)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:1561)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListenerAndHandleException(AbstractMessageListenerContainer.java:1506)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.lambda$executeListener$8(AbstractMessageListenerContainer.java:1484)ntat io.micrometer.observation.Observation.lambda$observe$0(Observation.java:493)ntat io.micrometer.observation.Observation.observeWithContext(Observation.java:603)ntat io.micrometer.observation.Observation.observe(Observation.java:492)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:1484)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:994)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:941)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.mainLoop(SimpleMessageListenerContainer.java:1323)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1225)ntat java.base/java.lang.Thread.run(Thread.java:833)nCaused by: org.springframework.amqp.support.converter.MessageConversionException: failed to resolve class name. Class not found [com.xyz.sourcing.purchaseorder.domain.Event]ntat org.springframework.amqp.support.converter.DefaultJackson2JavaTypeMapper.getClassIdType(DefaultJackson2JavaTypeMapper.java:189)ntat org.springframework.amqp.support.converter.DefaultJackson2JavaTypeMapper.fromTypeHeader(DefaultJackson2JavaTypeMapper.java:146)ntat org.springframework.amqp.support.converter.DefaultJackson2JavaTypeMapper.toJavaType(DefaultJackson2JavaTypeMapper.java:122)ntat org.springframework.amqp.support.converter.AbstractJackson2MessageConverter.convertContent(AbstractJackson2MessageConverter.java:394)ntat org.springframework.amqp.support.converter.AbstractJackson2MessageConverter.doFromMessage(AbstractJackson2MessageConverter.java:364)ntat org.springframework.amqp.support.converter.AbstractJackson2MessageConverter.fromMessage(AbstractJackson2MessageConverter.java:321)ntat org.springframework.amqp.support.converter.AbstractJackson2MessageConverter.fromMessage(AbstractJackson2MessageConverter.java:304)ntat org.springframework.amqp.rabbit.listener.adapter.AbstractAdaptableMessageListener.extractMessage(AbstractAdaptableMessageListener.java:342)ntat org.springframework.amqp.rabbit.listener.adapter.MessagingMessageListenerAdapter$MessagingMessageConverterAdapter.extractPayload(MessagingMessageListenerAdapter.java:385)ntat org.springframework.amqp.support.converter.MessagingMessageConverter.fromMessage(MessagingMessageConverter.java:132)ntat org.springframework.amqp.rabbit.listener.adapter.MessagingMessageListenerAdapter.toMessagingMessage(MessagingMessageListenerAdapter.java:258)ntat org.springframework.amqp.rabbit.listener.adapter.MessagingMessageListenerAdapter.onMessage(MessagingMessageListenerAdapter.java:148)nt... 15 common frames omittednCaused by: java.lang.ClassNotFoundException: com.xyz.sourcing.purchaseorder.domain.Eventntat java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)ntat java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:592)ntat org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:149)ntat java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)ntat java.base/java.lang.Class.forName0(Native Method)ntat java.base/java.lang.Class.forName(Class.java:467)ntat org.springframework.util.ClassUtils.forName(ClassUtils.java:291)ntat org.springframework.amqp.support.converter.DefaultJackson2JavaTypeMapper.getClassIdType(DefaultJackson2JavaTypeMapper.java:185)nt... 26 common frames omittedn"}
{"@timestamp":"2024-08-13T04:25:55.666573996Z","@version":"1","message":"Fatal message conversion error; message rejected; it will be dropped or routed to a dead letter exchange, if so configured: (Body:'[B@4bec94c9(byte[430])' MessageProperties [headers={tracestate=1798415@nr=0-0-1542405-710958860-4d829bd199bb0fa9-fa07f6961c58cb8f-1-1.004289-1723491004180, x-first-death-exchange=gis-order-dl-exchange, traceparent=00-0a0f1f42ed84ec0490c4f08bae1962f8-4d829bd199bb0fa9-01, x-death=[{reason=expired, count=3, exchange=gis-order-dl-exchange, routing-keys=[GIS.ORDER.DEADLETTER], time=Mon Aug 12 19:28:02 GMT 2024, queue=gis_booking_order_deadletter_queue}], x-first-death-reason=expired, x-first-death-queue=gis_booking_order_deadletter_queue, custom_header=[CustomHeader(count=1, errorMessage=404 Not Found from GET https://dev.api.azeus.xyz.com/sourcing/order-repository-service/order-receipt-linkage?orderNumber=XYZ123, eventId=81833449-0b32-4d24-92e1-b7c5f510f3a3, timestamp=Mon Aug 12 19:27:02 GMT 2024), CustomHeader(count=2, errorMessage=404 Not Found from GET https://dev.api.azeus.xyz.com/sourcing/order-repository-service/order-receipt-linkage?orderNumber=XYZ123, eventId=81833449-0b32-4d24-92e1-b7c5f510f3a3, timestamp=Mon Aug 12 19:28:03 GMT 2024), CustomHeader(count=3, errorMessage=404 Not Found from GET https://dev.api.azeus.xyz.com/sourcing/order-repository-service/order-receipt-linkage?orderNumber=XYZ123, eventId=81833449-0b32-4d24-92e1-b7c5f510f3a3, timestamp=Mon Aug 12 19:29:03 GMT 2024)], __TypeId__=com.xyz.sourcing.purchaseorder.domain.Event, newrelic=eyJkIjp7ImFjIjoiMTU0MjQwNSIsInByIjoxLjAwNDI4OSwidHgiOiJmYTA3ZjY5NjFjNThjYjhmIiwidGkiOjE3MjM0OTEwMDQxODAsInR5IjoiQXBwIiwidGsiOiIxNzk4NDE1IiwiaWQiOiI0ZDgyOWJkMTk5YmIwZmE5IiwidHIiOiIwYTBmMWY0MmVkODRlYzA0OTBjNGYwOGJhZTE5NjJmOCIsInNhIjp0cnVlLCJhcCI6IjcxMDk1ODg2MCJ9LCJ2IjpbMCwxXX0=}, timestamp=Mon Aug 12 19:27:00 GMT 2024, contentType=application/json, contentEncoding=UTF-8, contentLength=0, priority=0, redelivered=false, receivedExchange=gis-order-dl-exchange, receivedRoutingKey=GIS.ORDER.PARK, deliveryTag=3, consumerTag=amq.ctag-snqN38JmZjVIkqyeAQ7TkA, consumerQueue=gis_booking_order_parkinglot_queue])","logger_name":"org.springframework.amqp.rabbit.listener.ConditionalRejectingErrorHandler$DefaultExceptionStrategy","thread_name":"order-2","level":"WARN","level_value":30000}
{"@timestamp":"2024-08-13T04:25:55.66666626Z","@version":"1","message":"x-death header detected on a message with a fatal exception; perhaps requeued from a DLQ? - discarding: (Body:'[B@4bec94c9(byte[430])' MessageProperties [headers={tracestate=1798415@nr=0-0-1542405-710958860-4d829bd199bb0fa9-fa07f6961c58cb8f-1-1.004289-1723491004180, x-first-death-exchange=gis-order-dl-exchange, traceparent=00-0a0f1f42ed84ec0490c4f08bae1962f8-4d829bd199bb0fa9-01, x-death=[{reason=expired, count=3, exchange=gis-order-dl-exchange, routing-keys=[GIS.ORDER.DEADLETTER], time=Mon Aug 12 19:28:02 GMT 2024, queue=gis_booking_order_deadletter_queue}], x-first-death-reason=expired, x-first-death-queue=gis_booking_order_deadletter_queue, custom_header=[CustomHeader(count=1, errorMessage=404 Not Found from GET https://dev.api.azeus.xyz.com/sourcing/order-repository-service/order-receipt-linkage?orderNumber=XYZ123, eventId=81833449-0b32-4d24-92e1-b7c5f510f3a3, timestamp=Mon Aug 12 19:27:02 GMT 2024), CustomHeader(count=2, errorMessage=404 Not Found from GET https://dev.api.azeus.xyz.com/sourcing/order-repository-service/order-receipt-linkage?orderNumber=XYZ123, eventId=81833449-0b32-4d24-92e1-b7c5f510f3a3, timestamp=Mon Aug 12 19:28:03 GMT 2024), CustomHeader(count=3, errorMessage=404 Not Found from GET https://dev.api.azeus.xyz.com/sourcing/order-repository-service/order-receipt-linkage?orderNumber=XYZ123, eventId=81833449-0b32-4d24-92e1-b7c5f510f3a3, timestamp=Mon Aug 12 19:29:03 GMT 2024)], __TypeId__=com.xyz.sourcing.purchaseorder.domain.Event, newrelic=eyJkIjp7ImFjIjoiMTU0MjQwNSIsInByIjoxLjAwNDI4OSwidHgiOiJmYTA3ZjY5NjFjNThjYjhmIiwidGkiOjE3MjM0OTEwMDQxODAsInR5IjoiQXBwIiwidGsiOiIxNzk4NDE1IiwiaWQiOiI0ZDgyOWJkMTk5YmIwZmE5IiwidHIiOiIwYTBmMWY0MmVkODRlYzA0OTBjNGYwOGJhZTE5NjJmOCIsInNhIjp0cnVlLCJhcCI6IjcxMDk1ODg2MCJ9LCJ2IjpbMCwxXX0=}, timestamp=Mon Aug 12 19:27:00 GMT 2024, contentType=application/json, contentEncoding=UTF-8, contentLength=0, priority=0, redelivered=false, receivedExchange=gis-order-dl-exchange, receivedRoutingKey=GIS.ORDER.PARK, deliveryTag=3, consumerTag=amq.ctag-snqN38JmZjVIkqyeAQ7TkA, consumerQueue=gis_booking_order_parkinglot_queue])","logger_name":"org.springframework.amqp.rabbit.listener.ConditionalRejectingErrorHandler","thread_name":"order-2","level":"ERROR","level_value":40000}
{"@timestamp":"2024-08-13T04:25:55.666727385Z","@version":"1","message":"Execution of Rabbit message listener failed, and the error handler threw an exception","logger_name":"org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer","thread_name":"order-2","level":"ERROR","level_value":40000,"stack_trace":"org.springframework.amqp.ImmediateAcknowledgeAmqpException: Fatal and x-death presentntat org.springframework.amqp.rabbit.listener.ConditionalRejectingErrorHandler.handleError(ConditionalRejectingErrorHandler.java:143)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeErrorHandler(AbstractMessageListenerContainer.java:1453)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.handleListenerException(AbstractMessageListenerContainer.java:1751)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListenerAndHandleException(AbstractMessageListenerContainer.java:1527)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.lambda$executeListener$8(AbstractMessageListenerContainer.java:1484)ntat io.micrometer.observation.Observation.lambda$observe$0(Observation.java:493)ntat io.micrometer.observation.Observation.observeWithContext(Observation.java:603)ntat io.micrometer.observation.Observation.observe(Observation.java:492)ntat org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:1484)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:994)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:941)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.mainLoop(SimpleMessageListenerContainer.java:1323)ntat org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1225)ntat java.base/java.lang.Thread.run(Thread.java:833)n"}