For large legacy C++ code bases, notions like Herb Sutter’s “const means threadsafe” don’t seem to help much, because there can be an overwhelming amount of code in const functions which are modifying state with no synchronization. And even if legacy code wasn’t a problem, for a class like a ThreadSafeQueue, you wouldn’t want the push_back function to be const just because it is threadsafe.
Is there a method for keeping track of which functions are intended to be threadsafe, ideally leveraging the compiler to help enforce it? Is there some way to fake an introduction of a new keyword like “threadsafe” that works similarly to const (i.e. compiler will give an error if a “threadsafe” labeled function calls any function not tagged as threadsafe)?
Perhaps my best bet is to just tag functions with a standard comment, perhaps using something like doxygen? So a special “threadsafe” tag in a comment instead of the const keyword would mean “immutable or internally synchronized”.
Anyone have experience with the process of adding multithreading to a large legacy code base that could share what worked and what didn’t?
One way to add multithreading to an older application is to use multiple processes and inter-process communication. This lets you isolate the code in question and serialize all communication with it, and if the code has bugs, memory leaks, etc., you can kill the process and restart it automatically. This works if the extra safety is worth the IPC overhead.
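As a rough illustration of the process-isolation idea, here is a minimal POSIX-only sketch. `legacy_square` and `run_isolated` are hypothetical names invented for this example, and forking one child per request is a simplification; a real setup would more likely keep a long-lived worker process and restart it on failure.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for legacy code we do not trust to run in-process.
int legacy_square(int x) { return x * x; }

// Runs legacy_square in a child process and reads the result back over a pipe.
// Returns false if the child crashed or failed, so the caller can retry.
bool run_isolated(int input, int& output) {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    if (pid == 0) {
        // Child: do the work, send the raw bytes of the result, exit.
        close(fds[0]);
        int result = legacy_square(input);
        ssize_t written = write(fds[1], &result, sizeof result);
        _exit(written == static_cast<ssize_t>(sizeof result) ? 0 : 1);
    }

    // Parent: read the reply, then reap the child and check how it ended.
    close(fds[1]);
    int result = 0;
    ssize_t got = read(fds[0], &result, sizeof result);
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    if (got != static_cast<ssize_t>(sizeof result)) return false;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return false;

    output = result;
    return true;
}
```

A crash inside `legacy_square` only takes down the child; the parent sees a short read or a bad exit status and can decide whether to try again. Note that calling `fork` from an already multithreaded parent has its own pitfalls, so this kind of helper is best started early or from a dedicated launcher.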
Another option is to write a wrapper API that is thread safe but communicates with a non-thread-safe inner library, using serialization or queuing techniques. Watch out for deadlock.
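A minimal sketch of such a wrapper, assuming the simplest form of serialization: one mutex held around every call into the inner object. `LegacyCounter`, `SafeCounter`, and `hammer` are hypothetical names made up for illustration, not part of any real library.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a legacy class that is NOT thread safe.
class LegacyCounter {
public:
    void add(int value) { values_.push_back(value); total_ += value; }
    long total() const { return total_; }
    std::size_t count() const { return values_.size(); }
private:
    std::vector<int> values_;
    long total_ = 0;
};

// Thread-safe facade: every call into the legacy object holds the same mutex.
class SafeCounter {
public:
    void add(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        inner_.add(value);
    }
    long total() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return inner_.total();
    }
    std::size_t count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return inner_.count();
    }
private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    LegacyCounter inner_;
};

// Calls add(1) from several threads at once to exercise the wrapper.
void hammer(SafeCounter& counter, int threads, int adds_per_thread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, adds_per_thread] {
            for (int i = 0; i < adds_per_thread; ++i) counter.add(1);
        });
    }
    for (std::thread& th : pool) th.join();
}
```

The wrapper never calls back into caller-supplied code while holding the lock, and it only ever takes one lock, which is the easiest way to stay clear of the deadlock risk mentioned above. It also only protects individual calls: a sequence such as "read `total()`, then `add()`" is still a race unless the wrapper exposes it as a single operation.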
I don’t think C++ is going to help you improve the thread safety of the code, just by using the compiler. You have to review and potentially rewrite every line.
One technique if you’ve got a lot of thread-unsafe code is to only let one thread have access to it. If the other threads don’t touch, the lack of safety is a non-issue. To make this work, you can then communicate with the rest of the application via message queues: the thread-safe parts can ask the thread-unsafe part to do some work by sending a message to it, and the thread-unsafe part can respond by sending a message back (which the thread-safe part might or might not stall waiting for, depending on what’s really going on in the application).
The message queues must be appropriately guarded against multi-threaded access, of course, but they’re quite a small piece of code and can have textbook lock semantics so you can get them right without too much trouble.
This is conceptually similar to running the code in another process and communicating via pipes, except you’re not restricted to sending byte-serialized messages and can use something more structured instead.
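Here is a minimal sketch of that arrangement, assuming C++11 or later. `LegacyEngine` and `EngineThread` are hypothetical names for illustration; the request message is a `std::packaged_task` and the reply message is the `std::future` it produces, which is one possible choice among many.

```cpp
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical legacy component: stateful and not thread safe.
class LegacyEngine {
public:
    int accumulate(int x) { total_ += x; return total_; }
private:
    int total_ = 0;
};

// Owns the legacy object on one dedicated thread; everyone else sends messages.
class EngineThread {
public:
    EngineThread() : worker_([this] { run(); }) {}

    ~EngineThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();  // pending requests are drained before the thread exits
    }

    // Posts a request. The caller may wait on the future or ignore it.
    std::future<int> accumulate(int x) {
        Task task([x](LegacyEngine& engine) { return engine.accumulate(x); });
        std::future<int> reply = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(task));
        }
        ready_.notify_one();
        return reply;
    }

private:
    using Task = std::packaged_task<int(LegacyEngine&)>;

    void run() {
        LegacyEngine engine;  // only ever touched on this thread
        for (;;) {
            Task task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                task = std::move(queue_.front());
                queue_.pop();
            }
            task(engine);  // run outside the lock so senders are never blocked
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Task> queue_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};
```

The only shared state is the queue and the stop flag, both guarded by one mutex, so the locking stays small and conventional. A caller that needs the answer calls `get()` on the returned future and blocks; one that does not can simply drop it.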
The compiler cannot work out how to make code thread-safe by itself. It's just this software, y'know? It can give you tools to make it easier to do, less onerous, but it doesn't really understand what the code is supposed to do. All it can see is exactly what is written, and it's pretty much the ultimate in language lawyers. Its idea of what's right is not yours.