I’m currently reading C++ Concurrency in Action by Anthony Williams and I’m facing an obstacle in thought.
First he describes deadlocks as occurring when two threads lock simultaneously (at least, that’s how I understood it), which makes sense.
However, he goes on to explain how you can lock two mutexes at the same time. Obviously this has a purpose, but with the above (presumably wrong) understanding, that would instantly deadlock both threads.
From that obstacle arises a new one: obviously it doesn’t deadlock, which means mutexes must have a more advanced use than just allowing a single thread at a time to do work in a process. In the book, mutexes are given to particular objects.
This leads me to my overall question(s): Are mutexes assigned to specific memory regions, such as objects? If so, how?
The book does a great job of explaining how mutexes can be used, but it never really describes what mutexes are at a low level. I realize they’re implementation-specific, but I never grasped what exactly they do or how the locking functions use them.
A mutex is not a specific memory region; it’s a signal, basically a special type of object, and it works purely “on the honor system,” so to speak. A mutex protects a region of code because calls to it surround the code in question, not because it has any inherent relationship to that code. (Even when it’s a lock associated with an object, like with C#’s lock keyword, locks only work when they’re invoked by some piece of code.)
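To make the “honor system” concrete, here’s a minimal C++ sketch (the names are just illustrative): the mutex only protects the shared counter because every function that touches it agrees to lock first; nothing stops code from skipping the lock.

```cpp
#include <mutex>

int counter = 0;           // shared data
std::mutex counter_mutex;  // protects counter only by convention

void safe_increment() {
    std::lock_guard<std::mutex> guard(counter_mutex); // acquire; released at end of scope
    ++counter;             // protected, because every well-behaved caller locks first
}

void unsafe_increment() {
    ++counter;             // the mutex is bypassed entirely; nothing enforces it
}
```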
There are two things that can happen when a piece of code attempts to acquire a lock: either it finds the lock available and acquires it, or it finds it already in use and fails. The most common response to failing is to wait until the lock becomes available. And a deadlock doesn’t happen because two threads are both trying to use a lock at the same time; it happens because two threads are each waiting on the other at the same time.
Imagine that two locks exist, named A and B, and two threads exist, called 1 and 2. Both threads are doing things that require locked access to both A and B, but thread 1 holds A and thread 2 holds B. If they both try to acquire the other lock, and then (as is the norm) wait forever until it becomes available, they’ll both be stuck waiting forever: deadlocked.
The best way to prevent a situation like this is to identify situations in which your code may need to hold more than one lock at a time, and impose some sort of general ordering rule across the entire codebase, such that no thread can acquire multiple concurrent locks out of order. If no thread that needs A and B can acquire B without already holding A, for example, then it’s impossible to get stuck in the deadlock described above.
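In C++ you can follow that rule by convention (always A before B), or let the standard library acquire both in one step, which is what the book most likely means by locking two mutexes “at the same time” (std::lock, or std::scoped_lock in C++17). A sketch of both approaches:

```cpp
#include <mutex>

std::mutex A, B;

// Option 1: a fixed ordering rule enforced by convention: always lock A, then B.
void ordered() {
    std::lock_guard<std::mutex> lockA(A);
    std::lock_guard<std::mutex> lockB(B);
    // ... work that needs both A and B ...
}

// Option 2 (C++17): std::scoped_lock acquires all the given mutexes together,
// using a deadlock-avoidance algorithm, so no thread ends up half-locked.
void both_at_once() {
    std::scoped_lock lock(A, B);
    // ... work that needs both A and B ...
}
```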
At a low level, a mutex can be thought of as containing two pieces of data: a variable identifying the current holder, such as a thread ID number (or 0 if it’s currently unlocked), and some sort of list of the threads that are currently waiting on it. Access to this data is controlled by atomic operations, to ensure that multiple threads attempting to acquire the mutex at once don’t overwrite or corrupt each other’s updates.
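Here’s a toy sketch of that idea in C++, purely illustrative: it tracks the owning thread with an atomic compare-and-swap, but it spins instead of keeping a wait list and asking the OS to put waiters to sleep, which a real mutex implementation would do.

```cpp
#include <atomic>
#include <thread>

class ToyMutex {
    // A default-constructed std::thread::id plays the role of "0 / unlocked".
    std::atomic<std::thread::id> owner{std::thread::id{}};

public:
    void lock() {
        std::thread::id unlocked{};  // the "nobody owns it" value
        // Atomically install our own id, but only if the mutex is currently unlocked.
        while (!owner.compare_exchange_weak(unlocked, std::this_thread::get_id())) {
            unlocked = std::thread::id{};  // compare_exchange overwrote it; reset and retry
            std::this_thread::yield();     // spin-wait; a real mutex queues waiters and sleeps them
        }
    }

    void unlock() {
        owner.store(std::thread::id{});    // mark as unlocked again
    }
};
```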