When reading about the various options for working with things like ARC / GC, I often come across explicit wording about which weak references are zeroing (i.e., your reference becomes nil/null/0 when the object is collected) and which are non-zeroing (you get a dangling pointer). This leads me to ask: What possible use could there be for a non-zeroing weak pointer? You can't use it for anything if you're not sure whether it's any good, can you? And how would you check its validity without risking a core dump / segfault?
Addendum: I understand they’re useful in reference-counted environments, and you may want to use them when writing library code that may be used therein, but I can’t see a good reason for them with a smart-enough ARC-alike or relocating collector. What good is a reference that can’t be relied on?
Weak references are important in the context of a reference-counting memory-management scheme. For example, we might have a tree data structure where each node knows its parent node. The result is a circular data structure. When we give up all known references to the root of the tree, the tree isn't freed, because the references from the child nodes to the root node prevent the refcount from dropping to zero.
This can be avoided with weak references. When each reference from a child to its parent is a weak reference (a reference that does not affect the reference count), then a reference from one of our variables to the root of the tree is the only counting reference. If we remove that reference, the refcount hits zero (or not, if other references were made), and the memory can be reclaimed.
The implication here is that as long as we are interested in nodes of the tree, we will also keep the root of the tree itself around. Therefore, the weak references to a parent node will always stay valid.
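The tree described above can be sketched in C++, where std::shared_ptr plays the role of the counting reference and std::weak_ptr the non-counting one. This is only an illustration under stated assumptions: Node, add_child, build_and_release_tree and the live_nodes counter are names made up for this example, and std::weak_ptr happens to be a zeroing weak reference rather than a dangling one.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical counter so we can observe when nodes are reclaimed.
static int live_nodes = 0;

struct Node {
    std::weak_ptr<Node> parent;                   // non-counting back reference
    std::vector<std::shared_ptr<Node>> children;  // counting (owning) references

    Node() { ++live_nodes; }
    ~Node() { --live_nodes; }
};

// Attaches a fresh child to `parent` and returns it.
std::shared_ptr<Node> add_child(const std::shared_ptr<Node>& parent) {
    auto child = std::make_shared<Node>();
    child->parent = parent;             // does not change the parent's use_count
    parent->children.push_back(child);
    return child;
}

// Builds root -> middle -> leaf, drops our references,
// and returns how many nodes are still alive afterwards.
int build_and_release_tree() {
    auto root = std::make_shared<Node>();
    auto leaf = add_child(add_child(root));

    assert(root.use_count() == 1);            // child-to-parent links are not counted
    assert(leaf->parent.lock() != nullptr);   // parent is alive while the root is held

    leaf.reset();
    root.reset();                             // the only counting reference to the root
    return live_nodes;
}
```

If parent were a std::shared_ptr instead, the root and its children would keep each other alive and build_and_release_tree() would report leaked nodes.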
An interesting example of a tree with such properties is the Document Object Model for XML or HTML documents. A node in the DOM cannot exist separately from the document (or document fragment) which it belongs to. The DOM contains accessors that allow an implementation to maintain referential integrity even when implemented with weak references or pointers.
There are a few important observations.
- Weak references are largely unnecessary with more sophisticated memory management schemes like garbage collection. For example, we could halt the execution of a program and find all disconnected subgraphs in the graph of all references. All of these subgraphs except the main one can be collected (although implementing the semantics of destruction can be difficult, and this makes the RAII idiom difficult or impossible).

  As pointed out in the comments, some notifier implementations can benefit greatly from (zeroing) weak references even in the context of GC'd languages.

- Weak references are basically equivalent to ordinary pointers. You have to structure your program in a way that guarantees they always point to something valid. While sometimes difficult, experience has shown that this isn't exactly impossible (e.g. by keeping the context around).

- An implementation that sets weak references to null when the referenced object is freed can be highly inefficient, as we need a reference from the referenced object back to the reference itself, so that it can be reset on freeing.

  The other option would be to access each object only through a proxy, where weak references only increase the proxy's refcount, and normal references increase both the refcount of the proxy and that of the actual object. When the inner object's refcount reaches zero, the proxy has to be notified about the destruction so that it can set its internal pointer to null. This implies that a weak reference of this design needs deep integration with the runtime's memory management and can't really be added later as a library. The proxy design is more memory-efficient, at the expense of adding one additional pointer level to each access.
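As a rough illustration of the proxy idea, C++'s std::shared_ptr and std::weak_ptr behave much like it: in common implementations both point at a shared control block that outlives the object while weak references remain. The sketch below only shows the observable behaviour; the function name is invented for this example, and nothing here claims to describe how ARC or any particular runtime implements zeroing.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Returns true if the weak reference reads as "null" once its target is freed.
bool weak_reference_is_zeroed_after_free() {
    auto strong = std::make_shared<std::string>("payload");
    std::weak_ptr<std::string> weak = strong;  // not an owning reference

    assert(strong.use_count() == 1);           // the weak reference is not counted

    if (auto locked = weak.lock()) {           // temporary strong reference
        assert(*locked == "payload");          // safe: object is kept alive here
    }

    strong.reset();                            // last strong reference: object destroyed

    // The weak reference now reports that the object is gone instead of dangling.
    return weak.expired() && weak.lock() == nullptr;
}
```

The extra indirection mentioned above shows up here as the lock() call: every access through the weak reference first has to consult the shared bookkeeping before it can touch the object.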
You are right – YOU shouldn’t use non-zeroing weak references. You may encounter ancient code that didn’t use automatic reference counting, and any reference in such code effectively behaves as if it was a non-zeroing weak reference. So if you directly use such ancient code, this models the ancient (and undesirable) behaviour correctly.
You might use a non-zeroing weak reference if efficiency is more important than safety. (I'm not saying that is a good idea, but that would be the reason.) It's reasonably acceptable if A has a strong reference to B and nothing else does, and B has a weak reference to A, so you know that when the weak reference becomes invalid, the object B that contains the weak reference goes away anyway. But it has to be done carefully, and I personally would want significant gains in efficiency before I'd consider it.
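The A/B arrangement can be sketched in C++ with a raw pointer standing in for the non-zeroing weak reference. Owner (A), Part (B) and read_owner_through_part are hypothetical names for this example; the safety argument rests entirely on the assumption that nothing else ever holds the Part.

```cpp
#include <cassert>
#include <memory>

struct Owner;  // "A" in the text

struct Part {  // "B" in the text
    Owner* owner;  // non-owning, non-zeroing back reference to A
    explicit Part(Owner* o) : owner(o) {}
    int owner_id() const;
};

struct Owner {
    int id;
    std::unique_ptr<Part> part;  // the only strong reference to the Part

    explicit Owner(int i) : id(i), part(std::make_unique<Part>(this)) {}

    // Copying (or moving) would leave part->owner pointing at the old Owner,
    // so both are disabled to keep the back reference valid.
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;
};

// Safe only because a Part never outlives the Owner that holds it.
int Part::owner_id() const { return owner->id; }

// Reads the Owner's id by going through the Part's back reference.
int read_owner_through_part() {
    Owner a(7);
    return a.part->owner_id();
}
```

The raw pointer costs nothing to copy or dereference, which is the efficiency argument; the price is that handing the Part to any other owner would silently break the guarantee.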
In Objective-C, non-zeroing weak references exist primarily for compatibility reasons. When executing ARC code on iOS 4, a runtime called ARC-lite is used which supports the majority of ARC except for zeroing weak references.
If you set your project to target iOS 4, Xcode will refuse to compile code that uses the weak keyword.
Apple also explicitly recommends unsafe_unretained rather than weak when working with certain classes, again for compatibility reasons with older code.
There is also a small performance cost to accessing weak references, although for most purposes this is negligible.