Volatile and optimization with different compilation units
Please have a look at this more or less standard example that i see people use when talking about the usage of volatile in the context of embedded c++ firmware development on a baremetal target: