Question:
Why can’t Java/C# implement RAII?
Clarification:
I am aware the garbage collector is not deterministic. So with the current language features it is not possible for an object’s Dispose() method to be called automatically on scope exit. But could such a deterministic feature be added?
My understanding:
I feel an implementation of RAII must satisfy two requirements:
1. The lifetime of a resource must be bound to a scope.
2. Implicit. The freeing of the resource must happen without an explicit statement by the programmer. Analogous to a garbage collector freeing memory without an explicit statement. The “implicitness” only needs to occur at point of use of the class. The class library creator must of course explicitly implement a destructor or Dispose() method.
Java/C# satisfy point 1. In C# a resource implementing IDisposable can be bound to a “using” scope:
void test()
{
    using(Resource r = new Resource())
    {
        r.foo();
    }//resource released on scope exit
}
This does not satisfy point 2. The programmer must explicitly tie the object to a special “using” scope. Programmers can (and do) forget to explicitly tie the resource to a scope, creating a leak.
In fact, the compiler converts “using” blocks to try-finally-dispose() code, so they are just as explicit as the try-finally-dispose() pattern. Without an implicit release, the hook to a scope is only syntactic sugar.
void test()
{
    //Programmer forgot (or was not aware of the need) to explicitly
    //bind Resource to a scope.
    Resource r = new Resource();
    r.foo();
}//resource leaked!!!
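The same trade-off can be sketched in Python, whose with statement plays roughly the role of C#’s using. This is only an illustration: the Resource class and its released flag are hypothetical stand-ins, and the hand-written try/finally is a simplification of what the scoped form expands to.

```python
class Resource:
    """Hypothetical resource that records whether it has been released."""

    def __init__(self):
        self.released = False

    def foo(self):
        if self.released:
            raise RuntimeError("resource already released")
        return "working"

    def close(self):
        self.released = True

    # Context-manager protocol: lets the object be bound to a `with` scope.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions


def scoped():
    with Resource() as r:  # explicit scope, comparable to C#'s using
        r.foo()
    return r


def by_hand():
    r = Resource()
    try:  # roughly what the scoped form expands to
        r.foo()
    finally:
        r.close()
    return r


def forgotten():
    r = Resource()  # no scope requested, so nothing ever calls close()
    r.foo()
    return r
```

In this sketch scoped() and by_hand() both hand back a released resource, while forgotten() hands back one that was never released, which is the leak described above.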
I think it is worth creating a language feature in Java/C# allowing special objects that are hooked to the stack via a smart-pointer. The feature would allow you to flag a class as scope-bound, so that it always is created with a hook to the stack. There could be options for different types of smart pointers.
class Resource - ScopeBound
{
    /* class details */
    void Dispose()
    {
        //free resource
    }
}
void test()
{
    //class Resource was flagged as ScopeBound so the tie to the stack is implicit.
    Resource r = new Resource(); //r is a smart-pointer
    r.foo();
}//resource released on scope exit.
I think implicitness is “worth it”. Just as the implicitness of garbage collection is “worth it”. Explicit using blocks are refreshing on the eyes, but offer no semantic advantage over try-finally-dispose().
Is it impractical to add such a feature to the Java/C# languages? Could it be introduced without breaking old code?
10
Such a language extension would be significantly more complicated and invasive than you seem to think. You can’t just add
“if the lifetime of a variable of a stack-bound type ends, call Dispose on the object it refers to”
to the relevant section of the language spec and be done. I’ll ignore the problem of temporary values (new Resource().doSomething()), which can be solved by slightly more general wording; this is not the most serious issue. For example, this code would be broken (and this sort of thing probably becomes impossible to do in general):
File openSavegame(string id) {
    string path = ... id ...;
    File f = new File(path);
    // do something, perhaps logging
    return f;
} // f goes out of scope, caller receives a closed file
Now you need user-defined copy constructors (or move constructors) and start invoking them everywhere. Not only does this carry performance implications, it also makes these things effectively value types, whereas almost all other objects are reference types. In Java’s case, this is a radical deviation from how objects work. In C# less so (already has structs, but no user-defined copy constructors for them AFAIK), but it still makes these RAII objects more special. Alternatively, a limited version of linear types (cf. Rust) may also solve the problem, at the cost of prohibiting aliasing including parameter passing (unless you want to introduce even more complexity by adopting Rust-like borrowed references and a borrow checker).
It can be done technically, but you end up with a category of things which are very different from everything else in the language. This is almost always a bad idea, with consequences for implementers (more edge cases, more time/cost in every department) and users (more concepts to learn, more possibility of bugs). It’s not worth the added convenience.
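The failure mode in the openSavegame example can be reproduced today in Python, where a with block binds the file to a scope. This is a minimal sketch, not a claim about how a Java/C# extension would behave; open_savegame and demo are hypothetical names, and the temporary file exists only to keep the example self-contained.

```python
import os
import tempfile


def open_savegame(path):
    # Hypothetical helper mirroring openSavegame: the file is bound to the
    # scope of the with block, so it is closed before the caller sees it.
    with open(path, "w") as f:
        f.write("header\n")
        return f


def demo():
    # Create a throwaway file so the example does not touch real data.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open_savegame(path)
        was_closed = f.closed
        try:
            f.write("more data")  # writing to a closed file raises ValueError
            error = None
        except ValueError as exc:
            error = str(exc)
        return was_closed, error
    finally:
        os.remove(path)
```

Here demo() reports that the returned file is already closed and that writing to it fails. To hand a live resource to the caller, something has to say that ownership moves out of the scope, which is the extra machinery (copy/move constructors or linear types) discussed above.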
11
The biggest difficulty in implementing something like this for Java or C# would be defining how resource transfer works. You would need some way to extend the life of the resource beyond the scope. Consider:
class IWrapAResource
{
    private readonly Resource resource;
    public IWrapAResource()
    {
        // Where Resource is scope bound
        Resource builder = new Resource(args, args, args);
        this.resource = builder;
    } // Uh oh, resource is destroyed
} // Crap, there's no scope for IWrapAResource we can bind to!
What’s worse is that this may not be obvious to the implementer of IWrapAResource:
class IWrapSomething<T>
{
    private readonly T resource; // What happens if T is Resource?
    public IWrapSomething(T input)
    {
        this.resource = input;
    }
}
Something like C#’s using statement is probably as close as you’re going to come to having RAII semantics without resorting to reference counting resources or forcing value semantics everywhere like C or C++. Because Java and C# have implicit sharing of resources managed by a garbage collector, the minimum a programmer would need to be able to do is choose the scope to which a resource is bound, which is exactly what using already does.
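As an illustration of “choosing the scope”, here is a Python sketch of the wrapper problem. It uses contextlib.ExitStack and its pop_all() method to move a pending cleanup out of the constructor’s scope and into the wrapper, which the caller then binds to a scope of their choosing. Resource and WrapsAResource are hypothetical classes written for this example.

```python
from contextlib import ExitStack


class Resource:
    """Hypothetical resource; `closed` records whether it was released."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False


class WrapsAResource:
    """Owns a Resource for longer than the constructor's scope."""

    def __init__(self, name):
        with ExitStack() as stack:
            resource = stack.enter_context(Resource(name))
            # ... further setup that may raise would go here; if it did,
            # leaving the with block would release the resource ...
            # Setup succeeded: move the pending cleanup out of this scope.
            self._stack = stack.pop_all()
        self.resource = resource

    def close(self):
        self._stack.close()  # runs the transferred cleanup

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False
```

The transfer of ownership is an explicit step (pop_all()), and the wrapper must in turn be bound to some scope by its user. Nothing here happens implicitly, which is the point made above.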
10
The reason RAII can’t work in a language like C#, but works in C++, is that in C++ you can decide whether an object is truly temporary (by allocating it on the stack) or long-lived (by allocating it on the heap using new and using pointers).
So, in C++, you can do something like this:
void f()
{
    Foo f1;
    Foo* f2 = new Foo();
    Foo::someStaticField = f2;
    // f1 is destroyed here, the object pointed to by f2 isn't
}
In C#, you can’t differentiate between the two cases, so the compiler would have no idea whether to finalize the object or not.
What you could do is introduce some special kind of local variable that you can’t put into fields etc.* and that would be automatically disposed when it goes out of scope. This is exactly what C++/CLI does. In C++/CLI, you write code like this:
void f()
{
    Foo f1;
    Foo^ f2 = gcnew Foo();
    Foo::someStaticField = f2;
    // f1 is disposed here, the object pointed to by f2 isn't
}
This compiles to basically the same IL as the following C#:
void f()
{
    using (Foo f1 = new Foo())
    {
        Foo f2 = new Foo();
        Foo.someStaticField = f2;
    }
    // f1 is disposed here, the object pointed to by f2 isn't
}
To conclude, if I were to guess why the designers of C# didn’t add RAII, it’s because they thought that having two different types of local variables is not worth it, mostly because in a language with GC, deterministic finalization is not useful that often.
* Not without the equivalent of the & operator, which in C++/CLI is %. Though doing so is “unsafe” in the sense that after the method ends, the field will reference a disposed object.
1
If what bothers you with using blocks is their explicitness, perhaps we can take a small baby-step towards less explicitness, rather than changing the C# spec itself. Consider this code:
public void ReadFile ()
{
    string filename = "myFile.dat";
    local Stream file = File.Open(filename);
    file.Read(blah blah blah);
}
See the local keyword I added? All it does is add a bit more syntactic sugar, just like using, telling the compiler to call Dispose in a finally block at the end of the variable’s scope. That is all. It’s totally equivalent to:
public void ReadFile ()
{
    string filename = "myFile.dat";
    using (Stream file = File.Open(filename))
    {
        file.Read(blah blah blah);
    }
}
but with an implicit scope, rather than an explicit one. It’s simpler than the other suggestions since I don’t have to have the class defined as scope-bound. Just cleaner, more implicit syntactic sugar.
There might be issues here with hard-to-resolve scopes, though I can’t see any right now, and I’d appreciate it if anyone who finds one would point it out.
Update: Since this was first written, C# has gained this feature in C# 8. Instead of local as suggested here, the keyword is simply using, as it replaces a using block.
24
For an example of how RAII does work in a garbage-collected language, check the with keyword in Python. Instead of relying on deterministically-destroyed objects, it lets you associate __enter__() and __exit__() methods with a given lexical scope. A common example is:
with open('output.txt', 'w') as f:
    f.write('Hi there!')
As with C++’s RAII style, the file would be closed when exiting that block, no matter if it’s a ‘normal’ exit, a break, an immediate return or an exception.
Note that the open() call is the usual file opening function. To make this work, the returned file object includes two methods, roughly equivalent to:
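def __enter__(self):
    return self
def __exit__(self, exc_type, exc_value, traceback):
    # Called on every exit path; the three arguments describe the
    # exception, if any, and are all None on a normal exit.
    self.close()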
This is a common idiom in Python: objects that are associated with a resource typically include these two methods.
Note that the file object could still remain allocated after the __exit__() call; the important thing is that it is closed.
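To make the “no matter how the block exits” claim concrete, here is a small sketch with a user-defined context manager. Tracked and the three helper functions are hypothetical names made up for this example; the only real API involved is the __enter__/__exit__ protocol itself.

```python
class Tracked:
    """Hypothetical resource that logs how its scope was exited."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None unless the block is left via an exception.
        kind = "normal" if exc_type is None else exc_type.__name__
        self.events.append("exit:" + kind)
        return False  # let any exception propagate


def normal_exit():
    t = Tracked()
    with t:
        pass
    return t.events


def early_return():
    t = Tracked()

    def body():
        with t:
            return "done"  # __exit__ still runs before the return completes

    body()
    return t.events


def exception_exit():
    t = Tracked()
    try:
        with t:
            raise KeyError("boom")
    except KeyError:
        pass
    return t.events
```

In all three cases __exit__() runs exactly once, and in all three the Tracked object is still alive afterwards (its events list can be read). The scope controls release of the resource, not deallocation of the object.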
11