What is the actual reason that locks (sentinels) in OO are hard to reason about? [closed]

In this talk, Rich Hickey introduces Clojure. He gives an Ants demo and talks about his motivations for implementing a Software Transactional Memory (STM) system.


His reasoning for STM is that "locks are hard to reason about".

I took that to mean the deadlock situation, where two threads trying to update the same item each hold a resource the other needs.
So to remain atomic, one thread has to roll back and retry while the other rolls forward.

To make this work in an Object Oriented Language, you have to either use an STM framework or roll it yourself by hand.

That’s my understanding, but I suspect I’ve only covered one problem case and I would imagine there is a more general principle that I’m missing. Perhaps the issue is that locks don’t compose or something.

My question is: What is the actual reason that locks (sentinels) in OO are hard to reason about?

6

It has nothing to do with OO in particular. The problems are problems with the “explicit locking” paradigm as an approach to concurrency, regardless of other paradigms used in other parts of the problem.

I haven’t watched that two hour talk, so I won’t comment on Hickey’s reasoning. Here are the criticisms of locks I hear most commonly, summarized:

  • Locks are not composable. If you have two pieces of code A and B that work correctly (in particular, atomically) on their own, you can’t write A; B to perform the two actions together, atomically. At least not without either mucking around with the locks A and B use (which are often private, for good reasons) or introducing yet another lock and ensuring it is adhered to everywhere it needs to be.

  • Locks, as usually implemented, are not formalized in what they protect and when they must be acquired. If the system is any good, it will be documented, but programmers get no automated assistance in adhering to the rules.

  • Locks are low-level. To properly understand their interactions and what they mean for synchronization, you have to consider all possible ways the concurrent executions might interleave, and all possible ways the code might be reordered. Flow-dependent reasoning is already tricky for a single thread of execution, and concurrency makes the number of paths to consider explode.
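To make that last point concrete, here is a small Python sketch (the names and the two-step increment model are my own, not from the answer) that enumerates every schedule of two "threads", each doing a non-atomic read-then-write increment of a shared counter. Even this tiny program has six possible interleavings, and some of them lose an update:

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each
    sequence's internal order, i.e. the possible thread schedules."""
    if not a:
        yield b
        return
    if not b:
        yield a
        return
    for rest in interleavings(a[1:], b):
        yield (a[0],) + tuple(rest)
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + tuple(rest)

def run(schedule):
    """Execute a schedule where each thread increments a shared
    counter in two non-atomic steps: read it, then write value+1."""
    counter = 0
    local = {}          # per-thread register holding the value it read
    for thread, op in schedule:
        if op == "read":
            local[thread] = counter
        else:           # "write"
            counter = local[thread] + 1
    return counter

t1 = (("t1", "read"), ("t1", "write"))
t2 = (("t2", "read"), ("t2", "write"))

results = [run(s) for s in interleavings(t1, t2)]
print(len(results), sorted(set(results)))   # 6 schedules, outcomes [1, 2]
```

With n threads of k steps each, the number of schedules grows combinatorially, which is why reasoning about interleavings case by case does not scale.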

Functional programming advocates like to argue that OOP also needs far more synchronization because it uses far more shared mutable state, but this is beside the point here. How much synchronization is needed is orthogonal to how you choose to synchronize.

1

It’s not the locking that is hard to understand. Locks are easy. They work just like the locks in real life. It’s the concurrency that’s hard to understand.

The only reason locking makes people’s heads spin is that we only ever use it in concurrent systems. But imagine for a moment that we had a highly concurrent system with no synchronization whatsoever; the system’s state would over time change to hilariously invalid configurations. Many of those configurations are just as hard to understand; they make people go “Huh? How on Earth could this ever happen?” The reason is that we are not built to understand the weird mix of simultaneous and alternating changes that modern time-slicing systems yield. Little else in real life works that way, so our brains have a really hard time with it.

7

I don’t have any experience with STM but I’ll address the locking part.

Getting single threaded code right is hard. It’s pretty telling that the prevalent methodology is to create a large battery of unit tests and hope that if they all pass, nothing is wrong. For code that won’t result in loss of life or billions of dollars, that’s a reasonable approximation, but the point I’m trying to make is that proving code correct is hard. Harder than most people are willing to put effort into.

Now consider concurrent code using locks. To prove it works, you have to prove that you’ve manually synchronized the threads properly for every possible interleaving/scheduling. That’s a monumental task. And if you screw up, because the code is non-deterministic, the buggy behavior is non-deterministic too! It’s not unlike manual memory management, except that tracking resources in single-threaded code is easier (yet still too hard to get right).

Even worse, code that uses locks isn’t composable. Suppose you have two collections, and you make every operation on them thread-safe. Now suppose you want to remove an item from one collection and add it to the other. Just because removing the element is thread-safe and adding the element is thread-safe, it doesn’t mean the entire operation is thread-safe. If you wrote the code for the thread-safe collections and your coworker is the one who has to write the program, it’s on them to make sure whatever combination of actions they perform is safe. Contrast that with the single-threaded scenario, where proving that the insertion and deletion code is correct also proves that moving an element between collections is correct. Also compare to manual memory management, where freeing allocated memory is the function caller’s job.
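A Python sketch of that collection example (the `SafeSet` class and the event scaffolding are mine, used only to force the unlucky schedule deterministically): every individual operation takes the collection’s lock, yet an observer can catch the item mid-move, present in neither collection:

```python
import threading

class SafeSet:
    """A set whose individual operations are each thread-safe."""
    def __init__(self, items=()):
        self._lock = threading.Lock()
        self._items = set(items)

    def add(self, x):
        with self._lock:
            self._items.add(x)

    def remove(self, x):
        with self._lock:
            self._items.discard(x)

    def __contains__(self, x):
        with self._lock:
            return x in self._items

src, dst = SafeSet({"job"}), SafeSet()
removed = threading.Event()   # scaffolding to force the bad interleaving
resume = threading.Event()

def move():                   # remove + add: each step is safe, the pair is not
    src.remove("job")
    removed.set()
    resume.wait()             # another thread gets scheduled right here
    dst.add("job")

t = threading.Thread(target=move)
t.start()
removed.wait()
# Mid-move, an observer sees the item in *neither* collection:
lost = ("job" not in src) and ("job" not in dst)
resume.set()
t.join()
print(lost)   # True
```

In real code the window between the two calls is opened by the scheduler rather than an event, so the failure is rare and nondeterministic, which is exactly what makes it nasty.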

I’m not really sure how OO fits into this. Lock-based concurrency is hard no matter what paradigm you choose.

3

The particular issue with OOP and concurrency is that programmers are often tempted to hide the thread-safety implementation away using encapsulation.

When this is done by baking locks right into objects, then it is almost always a leaky abstraction, because of the side effects that locks cause.

  1. If you have a lot of these hidden locks, then you’re unlikely to run into race conditions, but it becomes really easy to get deadlocks (which arguably themselves are actually just race conditions, but I think you get my point). So then you have to reason about the order in which you call certain methods to avoid deadlocks. Not easy to reason about. And the encapsulation you wanted simply isn’t there.

  2. Or you use locks more sparingly, which makes it easier to avoid deadlocks. But then it’s hard to reason about where the locks should go to avoid weird/unexpected behavior. It also means that you will all too often have to intimately couple an object’s implementation to when (as opposed to just how) the outside world is going to interact with it. So there’s no real encapsulation there either.

So it’s not that having objects makes dealing with concurrency harder. It’s that trying to have objects deal with concurrent access to themselves makes dealing with those objects harder. You can always, for example, have a model that is not the least bit thread-safe, and just guarantee that all interaction is routed through a facade that deals with thread safety (e.g. through a reactor pattern). Then said facade has the single responsibility of dealing with concurrency.
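A sketch of that facade idea in Python (the class names are illustrative, not a standard API): the model has no locks at all; every access is a message handled by a single worker thread, in the spirit of a reactor or actor setup:

```python
import queue
import threading

class Model:
    """Plain object: not thread-safe, and it doesn't need to be."""
    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n

class Facade:
    """Routes every interaction with the model through one worker
    thread, so the model is only ever touched sequentially."""
    def __init__(self, model):
        self._model = model
        self._inbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._inbox.get()
            if item is None:
                return
            action, box, done = item
            box.append(action(self._model))   # runs on the worker thread
            done.set()

    def submit(self, action):
        """Run `action(model)` on the worker thread; return its result."""
        box, done = [], threading.Event()
        self._inbox.put((action, box, done))
        done.wait()
        return box[0]

    def close(self):
        self._inbox.put(None)
        self._worker.join()

facade = Facade(Model())

# Four client threads hammer the facade concurrently; the model
# itself never sees concurrent access.
clients = [
    threading.Thread(
        target=lambda: [facade.submit(lambda m: m.add(1)) for _ in range(100)]
    )
    for _ in range(4)
]
for c in clients:
    c.start()
for c in clients:
    c.join()

total = facade.submit(lambda m: m.total)
facade.close()
print(total)   # 400: no lost updates, and no locks anywhere in the model
```

The design choice is that concurrency lives in exactly one place; the cost is that every interaction pays a message-passing round trip.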

1

To understand the correctness of a given line of lock-based concurrency code, you have to understand the entire program.

Now, you can save on this a bit. If you manage to prove to yourself that everything accessed in that bit of lock-based concurrency code is only accessed within a particular lock, you now only have to understand every bit of code in the program that also accesses that lock, its state when it starts the lock access, and everything it does while holding it.

This is because lock-based concurrency only maintains “thread safety” by limiting access to data to a restricted set of threads at once (sometimes one, sometimes only readers, etc.). So you have to understand all of the locking behavior, and the behavior while locking, in order to know if your lock system is actually doing what you want it to.

Next, because locks are subject to deadlocks, in order to understand multiple locks you have to understand not only the code in a lock, but every possible sequence you can acquire locks in.
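The classic discipline for that last problem is a global lock order: every code path acquires locks in the same order, so a cycle of waiting threads cannot form. A Python sketch (the `Account` class and ordering by `id()` are illustrative choices, not the only ones):

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    """Avoid deadlock by always acquiring the two locks in one global
    order (here: by id()), no matter which way the transfer goes."""
    first, second = sorted((src, dst), key=id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

a, b = Account(100), Account(100)

# Two threads transferring in opposite directions: with the naive
# "lock src, then lock dst" ordering, this workload can deadlock.
t1 = threading.Thread(target=lambda: [transfer(a, b, 1) for _ in range(1000)])
t2 = threading.Thread(target=lambda: [transfer(b, a, 1) for _ in range(1000)])
t1.start(); t2.start()
t1.join(); t2.join()
print(a.balance + b.balance)   # 200: money conserved, and no deadlock
```

Note that the ordering rule is exactly the kind of whole-program convention the answer is talking about: nothing in the language or type system stops a new call site from ignoring it.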

And our languages and type systems do not help all that much with these tasks.

It gets worse. Like other common, hard-to-track-down errors, deadlock and race-condition problems usually do not happen reliably. You can write the code, do some simple tests, even run it in production, and have it work almost always. Except sometimes it fails spectacularly.

Really rare spectacular failures are hard to debug regardless of how easy they are to reason about. Here we have rare, spectacular failures that are difficult to reason about.

And it gets worse. When learning how to program, people typically run into a number of hurdles. Understanding state/assignment/data flow is a hurdle some don’t get over. Understanding conditionals and function calling is another. Recursion and iteration throws some people for a loop. Indirection and pointers cut off an entire set of would-be programmers from becoming useful. Resource management. And concurrency and synchronization is another such hurdle.

We can generate more useful programmers by removing some of these hurdles “in the way” of being productive. There are languages that hide indirection and pointers from the programmer, and handle the big resource management task for you (memory), and you can learn to become a productive programmer in those languages without getting over that hurdle.

In almost every language you can become a “productive programmer” without getting over the concurrency hurdle. You can produce useful code, be a professional developer, graduate from university — all of them without ever getting concurrency.

The ones that didn’t get flow control, data flow and iteration never became professional programmers. The ones who don’t get concurrency are productive senior developers at many a company.

Locks aren’t hard to understand, and concurrency isn’t particularly difficult to understand either (almost everything going on around us in life is concurrent). It is the scheduling of the locks that is hard to understand. Almost nothing in real life blocks, but in sequential computing with a single thread everything does, and all we do by splitting threads or processes is pretend there are several when in fact we are constantly blocking each other. Scheduling that blocking is really hard to get right. It’s hard to schedule the CPU time for one, and hard to schedule shared, stateful resources as well, but doing both can be a real beast.

Enter things like the Erlang runtime. There concurrency is easy and works the way you expect. This is because the runtime handles the scheduling for you, freeing you up to think about the problem you’re actually trying to solve. Scheduling is the big win, not the other stuff that gets hyped all the time. The cost for all that easy concurrency is that no state can be shared. It appears this is not something that most programmers feel comfortable with; there are extremely high incidences of folks doing the “read half a book on Erlang and gave up” dance and winding up right back where they started.

Whatever language or environment you choose to solve problems in, it is important to recognize that the nature of the problem is not the locks, the state, or the language; it is the scheduling of locks whenever state is shared.
