Consider the following code, with adjacent mutex sections containing shared memory accesses:
    #include <mutex>

    std::mutex mutex;
    int x;

    void func() {
        {
            std::lock_guard lock{mutex};  // first critical section
            x++;
        }
        {
            std::lock_guard lock{mutex};  // second critical section
            x++;
        }
    }
Without the mutexes, the compiler could obviously coalesce the increments, like so:
    void func() {
        x += 2;
    }
This got me thinking about what might prevent that optimisation. We know that even if x is std::atomic<int>, the optimisation is still legal (though perhaps not performed in practice).
But what about with mutexes? Is it legal for the compiler to transform func into this?
    void func() {
        std::lock_guard lock{mutex};
        x += 2;
    }
It’s clear that the writes to x must be visible side effects to other threads due to acquire-release semantics, but I’m not sure whether that forbids the optimisation. Perhaps you can argue under the as-if rule that no observer could tell whether or not the optimisation was performed, and therefore that it is legal (since the difference between “unlock then lock again” and “keep holding the lock” depends only on thread scheduling timing).
On the other hand, it seems plausible that this optimisation is the start of a slippery slope: by the same argument, every program containing a mutex could be optimised into “lock mutex, do entire program, unlock mutex”. Such behaviour would obviously be unhelpful for concurrent applications, so it might reasonably be forbidden.
What aspect of the C++ memory model, and wording in the Standard, allows or forbids this optimisation?
(Currently, Clang and GCC do not perform this optimisation, as can be seen here. However, this is not proof that the optimisation is or should necessarily be illegal.)