This is a classic problem which I’m sure has been solved many times by many different people. I don’t have any formal training (I’ve not studied computer science or any other such academic subject) and so I’m not sure of the best way to solve the problem I’m about to describe.
Imagine the diagram below is an example of a banker’s dilemma: two users, Foo and Bar, have access to a single bank account, Baz. What is the expected behaviour when following one of the paths shown?
Note: I’m assuming we’re using a mutex (or some other form of synchronisation) on the Baz variable.
Example 1: Baz initially holds the value 10. If Foo writes a new value (the result of removing 5 from the current value) before Bar, then Bar will end up taking 10 from the new value 5, leaving a negative balance (i.e. the final value will be -5). This means more money has been taken than was available.
Example 2: Baz initially holds the value 10. If Bar writes a new value (the result of removing 10 from the current value) before Foo, then Foo will end up taking 5 from the new value 0, leaving a negative balance (i.e. the final value will be -5). Again, more money has been taken than was available.
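To make the arithmetic in the two examples concrete, here is a minimal sketch in Python. The function name `apply_in_order` is made up for illustration; it applies each withdrawal with no funds check at all, which is exactly the situation both examples describe.

```python
def apply_in_order(balance, withdrawals):
    """Apply each withdrawal in turn, with no check for sufficient funds."""
    for amount in withdrawals:
        balance -= amount
    return balance


# Example 1: Foo (-5) writes first, then Bar (-10).
foo_first = apply_in_order(10, [5, 10])

# Example 2: Bar (-10) writes first, then Foo (-5).
bar_first = apply_in_order(10, [10, 5])

print(foo_first, bar_first)
```

Either ordering ends at the same overdrawn balance, so serialising the writes alone is not enough; something also has to check the balance before each write.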
Both actions (Foo (-5) and Bar (-10)) are triggered at the same time. So how do we ensure that either Foo or Bar is alerted to the fact that their transaction cannot be completed (as there are not enough funds for it to succeed)?
It seems a potential solution is to ensure the caller executes a method that uses a mutex internally to lock the value first; then once the value is locked we can read the value; and then check if the action is valid. If the condition passes then we update the value and release the lock on the value. Meaning the next caller will be able to lock the value down and run through the same steps.
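A minimal single-process sketch of that idea, assuming Python's standard `threading.Lock` as the mutex. The `Account` class and `attempt` helper are hypothetical names used only for illustration. The lock covers the read, the check and the write together, so no other caller can slip in between them.

```python
import threading


class Account:
    """Balance guarded by a mutex; the check and the update happen under one lock."""

    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        """Return True if the withdrawal was applied, False if funds were insufficient."""
        with self._lock:                    # lock the value first
            if amount > self._balance:      # read it and check the action is valid
                return False
            self._balance -= amount         # update, then release on leaving the block
            return True

    @property
    def balance(self):
        with self._lock:
            return self._balance


baz = Account(10)
results = {}


def attempt(user, amount):
    results[user] = baz.withdraw(amount)


threads = [
    threading.Thread(target=attempt, args=("Foo", 5)),
    threading.Thread(target=attempt, args=("Bar", 10)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results, baz.balance)
```

Whichever thread gets the lock first succeeds, and the other is told its withdrawal failed. Which of the two wins depends on scheduling, but the balance never goes negative.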
But how would this approach work with a distributed system? You could suggest using a global data store, but it would have to be one that guarantees consistency (e.g. a service such as AWS’s DynamoDB defaults to “eventually consistent” reads, which on their own wouldn’t work for a banking institution); and guaranteed consistency is generally considered to be slow (depending on the number of distributed nodes, I assume).
So how do we attempt to solve this design problem?
I understand the finance industry uses a system of “after-the-fact” checking and fix-ups to resolve errors.
That is, each system carries out its part of a transaction independently (so that you know each system is correct on its own) and writes the details of each transaction to a log. These logs are compared later, and if an error occurred on one side, the other side is instructed to roll back its transaction.
So, Bank A successfully withdraws money, but bank B fails to credit it. Later on, the transaction lists from both are compared and Bank A gets a credit to make things right.
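As a rough sketch of that comparison step, assuming each bank logs a shared transaction id: the `reconcile` function, the field names and the sample logs below are all hypothetical and not taken from any real banking system. It finds debits at Bank A that have no matching credit at Bank B and produces a compensating credit for each.

```python
def reconcile(debits_at_a, credits_at_b):
    """Return compensating credits for Bank A: one per debit with no matching credit at B."""
    credited_ids = {entry["txn_id"] for entry in credits_at_b}
    return [
        {
            "txn_id": debit["txn_id"],
            "account": debit["account"],
            "amount": debit["amount"],
            "reason": "credit missing at Bank B",
        }
        for debit in debits_at_a
        if debit["txn_id"] not in credited_ids
    ]


# Bank A debited two transfers; Bank B only managed to credit the first.
bank_a_log = [
    {"txn_id": "t1", "account": "alice", "amount": 20},
    {"txn_id": "t2", "account": "alice", "amount": 5},
]
bank_b_log = [
    {"txn_id": "t1", "account": "bob", "amount": 20},
]

fixups = reconcile(bank_a_log, bank_b_log)
print(fixups)
```

Neither bank holds a lock on the other here. Each side commits locally, and the mismatch is repaired afterwards.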
You cannot implement a distributed “lock” in such systems, as the other system may not respond in the way you expect. You also do not want to lock someone’s account while you make a withdrawal if you cannot tell how long the other system involved in the transaction will take to complete; you might end up with a lock that stays held, blocking other transactions on that account.
For a distributed system, you would either:
a) Use “subtract amount or return error if you can’t”, where the code responsible for baz returns an error if the result would have been negative (or returns “success” if there wasn’t an error)
b) Use the equivalent of locking, where the code responsible for baz has an “acquire baz” and “release baz” that need to be used before and after.
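The two options can be sketched as two styles of interface on whatever code owns baz. This is only an illustration: `BazService` and its method names are hypothetical, the method calls stand in for network requests, and the sketch assumes the service handles one request at a time.

```python
class BazService:
    """Hypothetical sole owner of the baz balance; assumed to handle one request at a time."""

    def __init__(self, balance):
        self.balance = balance
        self.holder = None  # caller currently holding baz under option (b)

    # Option (a): a single request that checks and subtracts in one step.
    def subtract(self, amount):
        if self.balance - amount < 0:
            return "error"
        self.balance -= amount
        return "success"

    # Option (b): explicit acquire and release around the caller's own read and write.
    def acquire(self, caller):
        if self.holder is not None:
            return False
        self.holder = caller
        return True

    def write(self, caller, new_balance):
        if self.holder != caller:
            return False
        self.balance = new_balance
        return True

    def release(self, caller):
        if self.holder != caller:
            return False
        self.holder = None
        return True


# Option (a): Foo asks for -5, then Bar asks for -10.
service = BazService(10)
foo_result = service.subtract(5)
bar_result = service.subtract(10)

# Option (b), on a separate instance: Foo acquires baz, so Bar is refused until Foo releases it.
locked = BazService(10)
foo_got_it = locked.acquire("Foo")
bar_got_it = locked.acquire("Bar")
if locked.balance >= 5:
    locked.write("Foo", locked.balance - 5)
locked.release("Foo")

print(foo_result, bar_result, foo_got_it, bar_got_it, locked.balance)
```

Option (a) keeps the check next to the data and needs one round trip. Option (b) lets the caller do more work while holding baz, at the cost of having to deal with callers that never release it.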
Note that this is typically just the tip of the iceberg. More likely is that you’ve got 2 or more bank accounts, and want to transfer funds from one to the others such that either all accounts are updated or none are updated. In this case you might (e.g.) end up with a combination.
For example, if there are two accounts “Fred” and “Jane” and you want to transfer $5 from Fred to Jane; then you might end up with a sequence like:
- From you to Fred’s account: “If Fred’s account is 5 or greater lock Fred’s account and tell me I can proceed, else tell me I can’t proceed”
- From Fred’s account to you: “You may proceed”
- From you to Jane’s account: “If Jane’s account can be increased by 5 lock Jane’s account and tell me I can proceed, else tell me I can’t proceed”
- From Jane’s account to you: “You may proceed”
- From you to Fred’s account: “Subtract 5 from Fred’s account and release the lock you gave me previously”
- From you to Jane’s account: “Add 5 to Jane’s account and release the lock you gave me previously”
Note that for this example; you, Fred’s account and Jane’s account may all be running on completely different computers communicating with messages/packets (with no shared memory at all).
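The message sequence above can be sketched roughly as follows. Everything here is an assumption for illustration: `AccountNode`, `transfer` and the optional `limit` are made-up names, plain method calls stand in for the messages, and lost messages or crashed participants are not handled. The unlock of Fred’s account when Jane’s account refuses is not in the sequence above; it is added so that a refused transfer does not leave Fred locked.

```python
class AccountNode:
    """Stand-in for an account living on its own machine; method calls play the role of messages."""

    def __init__(self, name, balance, limit=None):
        self.name = name
        self.balance = balance
        self.limit = limit        # hypothetical maximum balance; None means no cap
        self.locked_by = None     # id of the transfer currently holding the lock

    def lock_if_can_debit(self, transfer_id, amount):
        if self.locked_by is None and self.balance >= amount:
            self.locked_by = transfer_id
            return True           # "You may proceed"
        return False

    def lock_if_can_credit(self, transfer_id, amount):
        fits = self.limit is None or self.balance + amount <= self.limit
        if self.locked_by is None and fits:
            self.locked_by = transfer_id
            return True           # "You may proceed"
        return False

    def apply_and_release(self, transfer_id, delta):
        if self.locked_by != transfer_id:
            raise RuntimeError(f"{self.name} is not locked by {transfer_id}")
        self.balance += delta
        self.locked_by = None

    def release(self, transfer_id):
        if self.locked_by == transfer_id:
            self.locked_by = None


def transfer(transfer_id, source, target, amount):
    """Lock both accounts conditionally, then apply both updates; return True on success."""
    if not source.lock_if_can_debit(transfer_id, amount):
        return False
    if not target.lock_if_can_credit(transfer_id, amount):
        source.release(transfer_id)   # assumption: undo the first lock on refusal
        return False
    source.apply_and_release(transfer_id, -amount)
    target.apply_and_release(transfer_id, amount)
    return True


fred = AccountNode("Fred", 7)
jane = AccountNode("Jane", 0)
first = transfer("t1", fred, jane, 5)    # Fred has 7, so this can proceed
second = transfer("t2", fred, jane, 5)   # Fred now has 2, so this is refused
print(first, second, fred.balance, jane.balance)
```

In a real deployment each call would be a request that can time out, so the coordinator would also need a rule for what to do when a reply never arrives.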