I have an existing Scala application that I am trying to refactor in order to use Akka. One of the problems I have is how to manage error-checking in actor-based applications.
Usually error-checking is done through one of two mechanisms:
- either one returns a value indicating an error condition, such as Option[A] or Failure[A], or
- by the use of exceptions.
Neither of these styles seems particularly useful here. On the one hand, actor messages are usually “fire and forget”, hence there are no return values. [One can have a return value using Futures, but it is certainly not customary to ask for futures on every message.] On the other hand, processing of the message usually happens on another thread, so that one cannot catch an exception arising from the processing of a message.
One could simulate the first mechanism by sending back error or confirmation messages, such as
    import akka.actor.Actor

    class FooActor extends Actor {
      def receive = {
        case Foo =>
          // ... process the message
          // on failure, report back to whoever sent Foo
          if (errorCondition) sender() ! ErrorMessage
      }
    }
But if one has to do this for every actor, it becomes a lot of boilerplate and it seems a poor man’s simulation of stack unwinding.
What is a good strategy to recover from errors in actor-based applications?
Based on your example with the database connection failing, I think you need to introduce another actor. This actor holds a queue of the messages that contain the data that needs to be written to the database. You then have the actor that holds the database connection request the next message to be saved to the database by sending a message to the queue actor. Then, a message containing the data to be saved is sent to the database actor and upon successful save, the database actor sends a message to the queue actor signifying a successful save. At this point, the message that was just saved can be removed from the queue and the next message sent to the database actor, if there is one. However, if the database actor crashes, the message signifying the successful save is never sent, and thus the message is still in the queue actor. When the database actor starts back up, it will simply tell the queue actor it is ready to process again, and the previous message that was not saved to the database will be sent again.
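The protocol described above could be sketched roughly as follows with classic Akka actors. The message names (Ready, Enqueue, Save, Saved) and the write function are hypothetical placeholders rather than anything from the original application, and the sketch assumes the default supervisor strategy, which restarts an actor that throws an exception.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import scala.collection.immutable.Queue

// Hypothetical protocol messages (names are assumptions for this sketch).
case object Ready                       // database actor -> queue actor: ready for work
final case class Enqueue(data: String)  // producer -> queue actor: new data to persist
final case class Save(data: String)     // queue actor -> database actor: please write this
final case class Saved(data: String)    // database actor -> queue actor: write succeeded

class QueueActor extends Actor {
  private var pending = Queue.empty[String]
  // Set only while the database actor is idle and waiting for work.
  private var idleWorker: Option[ActorRef] = None

  def receive: Receive = {
    case Enqueue(data) =>
      pending = pending.enqueue(data)
      dispatch()
    case Ready =>
      idleWorker = Some(sender())
      dispatch()
    case Saved(data) =>
      // Only after the acknowledgement is the head removed from the queue.
      if (pending.headOption.contains(data)) pending = pending.tail
      idleWorker = Some(sender())
      dispatch()
  }

  // Sends the head of the queue to the worker, if both exist.
  private def dispatch(): Unit =
    for (worker <- idleWorker; head <- pending.headOption) {
      worker ! Save(head)
      idleWorker = None // at most one unacknowledged message in flight
    }
}

// `write` stands in for the real database call and may throw.
class DatabaseActor(queue: ActorRef, write: String => Unit) extends Actor {
  // Runs on first start and again after every restart (default postRestart calls preStart).
  override def preStart(): Unit = queue ! Ready

  def receive: Receive = {
    case Save(data) =>
      write(data)             // an exception here crashes the actor; no Saved is sent
      sender() ! Saved(data)  // reached only after a successful write
  }
}
```

Note that this gives at-least-once behaviour: if the database actor crashes after the write has gone through but before Saved is delivered, the same data is written again, so the write should be idempotent. A message that fails on every attempt would also be retried indefinitely unless the supervisor limits the number of restarts.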
Fault tolerance is one of the key concepts of Akka and “let it crash” is one of their mottos. Akka’s mechanism for handling failures is called supervision, and it’s right there on akka.io/#supervision:
Actors form a tree with actors being parents to the actors they’ve created. As a parent, the actor is responsible for handling its children’s failures (so-called supervision), forming a chain of responsibility, all the way to the top. When an actor crashes, its parent can either restart or stop it, or escalate the failure up the hierarchy of actors. This enables a clean set of semantics for managing failures in a concurrent, distributed system and allows for writing highly fault-tolerant systems that self-heal.
Hence, you don’t need to explicitly fire failure messages in case of an exception: the supervisor strategy of an actor’s parent (or the parent’s parent and so on) is invoked automatically. You can of course still send failure messages yourself if you want to.
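As a minimal sketch of what such a strategy can look like with classic Akka actors: the Worker and Supervisor classes below, and the particular mapping from exception types to directives, are made up for illustration. The child simply throws, and the parent decides what happens next.

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Escalate, Restart, Stop}
import scala.concurrent.duration._

// Hypothetical child: it throws on failure instead of replying with an error message.
class Worker extends Actor {
  def receive: Receive = {
    case n: Int => sender() ! 100 / n // n == 0 throws ArithmeticException
    case other  => throw new IllegalArgumentException(s"unexpected message: $other")
  }
}

class Supervisor extends Actor {
  // Invoked automatically when a child throws; picks a directive per exception type.
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
      case _: ArithmeticException      => Restart  // fresh instance, mailbox is kept
      case _: IllegalArgumentException => Stop     // terminate the child permanently
      case _: Exception                => Escalate // let this actor's own parent decide
    }

  private val worker = context.actorOf(Props[Worker](), "worker")

  def receive: Receive = {
    // Forward keeps the original sender, so replies bypass the supervisor.
    case msg => worker.forward(msg)
  }
}
```

With this in place, sending 0 to the supervisor makes the worker crash and restart, and a subsequent valid message is handled by the restarted instance. The error handling lives in one place per parent rather than in every receive block.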
Also take a look at Supervision, Lifecycle Monitoring and Restart Hooks.