Suppose I have a constructor that performs an expensive IO operation that takes a noticeable amount of time. I don’t like it for a few reasons (first of all, it’s simply wrong, but there are practical considerations too).
What should I do?
I see two options:
- Extract the expensive operation into a separate method. That method may return some async wrapper (CompletableFuture, Mono). The clients that call the constructor should be updated to call the new method right after the constructor (in case they relied on a fully initialized state)
MyClass myClass = new MyClass();
CompletableFuture<MyClass /* or some specific component that is expensively initialized */> future = myClass.initAsync();
- Remove the constructor from the public API altogether and replace it with a factory method returning an async wrapper. I would have to identify all the places where the “expensive” state is used and move that logic into the callbacks passed to the async wrapper. This feels like the more correct option
MyClass.createAsync().thenAccept(/* consumer code */);
What do you think?
4
- Don’t put I/O in a constructor. It leads to a lot of trouble: for example, how will you test it? There are valid cases for this in other languages (e.g. RAII in C++ and Rust), but I don’t think that applies to Java (which I don’t know that well), just as it doesn’t apply to C# (which I do know well).
- Don’t make “init” methods. That’s even worse. It leads to a situation where you can actually have invalid objects if you forget to init them. Ultimately this will lead to trouble, because someone will use such a class without calling init at some point. Murphy’s law, man. Don’t rely on developers’ discipline if the programming language alone can solve your problem.
- Use builders/factories that return valid objects. These can do any I/O and async stuff under the hood, anything you want. Ultimately it is just a way of bypassing constructor limitations, but it has some other advantages (e.g. caching).
- As an alternative, pass the I/O to the constructor as a lazy dependency, which does the I/O only when you use the object, not during construction. I recommend builders/factories though, because their behavior is easier to predict and debug.
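To make the lazy-dependency alternative from the last bullet concrete, here is a minimal Java sketch. The names `Lazy` and `Report` are hypothetical and only for illustration; the idea is simply that the constructor stores a supplier and the expensive work runs at most once, on first use.

```java
import java.util.function.Supplier;

// Hypothetical helper: wraps a supplier and runs it at most once, on first use.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public synchronized T get() {
        if (!loaded) {
            value = loader.get(); // the expensive I/O happens here, not in a constructor
            loaded = true;
        }
        return value;
    }
}

// Hypothetical class standing in for MyClass from the question.
final class Report {
    private final Supplier<String> contents;

    // Cheap constructor: it only stores the dependency.
    Report(Supplier<String> contents) {
        this.contents = contents;
    }

    int length() {
        return contents.get().length(); // triggers the load on the first call only
    }
}
```

The trade-off mentioned above is visible here: the cost (and any failure) moves to whichever call happens to touch `contents` first, which is harder to predict than a factory that fails up front.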
8
This is very similar to an ancient StackOverflow question. I suggest reading my answer there, as it hasn’t changed.
Arguing that the constructor must not be slow is a mindset that comes from real-time systems (e.g. those with GUIs that shouldn’t block), but it is not a hard-and-fast rule.
At a minimum, your constructor needs to get the object configured to the point that its invariants are true.
Be careful of premature optimization. Be careful of blanket application of good rules from one domain to all domains.
Overall, it depends.
11
This is a scenario where dependency injection is a good idea. Instead of doing the IO in the constructor, do the IO separately and then provide the handle (File handle, Buffer handle, whatever…) of that IO job to your object’s constructor.
Advantages:
- You are no longer doing a heavy task in the constructor. This results in quick object construction;
- Unit testing the IO job will be easy since it can be moved to another class whose object you will be passing to your constructor;
- Unit testing your constructor code will be easy because you won’t have to wait for the constructor to run IO, and you can provide a mock object to the constructor for testing; and,
- Most importantly, any possible IO or network operation failure can now be gracefully handled instead of having an object left in a garbage state.
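A rough Java sketch of this kind of injection might look as follows. `TextSource`, `FileTextSource`, and `WordCounter` are made-up names, and word counting is only a placeholder for whatever the object really does with the data; the point is that the I/O lives in its own class and the constructor just receives it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical abstraction over the I/O job, so it can be tested and replaced on its own.
interface TextSource {
    String read() throws IOException;
}

// Real implementation: the only class that touches the file system.
final class FileTextSource implements TextSource {
    private final Path path;

    FileTextSource(Path path) {
        this.path = path;
    }

    @Override
    public String read() throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }
}

// Hypothetical consumer: the constructor only stores the injected dependency.
final class WordCounter {
    private final TextSource source;

    WordCounter(TextSource source) {
        this.source = source;
    }

    // I/O failures surface here as a checked exception the caller can handle.
    int countWords() throws IOException {
        String text = source.read().trim();
        return text.isEmpty() ? 0 : text.split("\\s+").length;
    }
}
```

In a unit test, something like `new WordCounter(() -> "one two three")` can stand in for the file-backed source, so no real file is needed.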
2
Consume the completed operation
Oddthinking’s diagnosis is correct:
At a minimum, your constructor needs to get the object configured to the point that its invariants are true.
If it is impossible to create a valid object (with true invariants) from the constructor’s parameters without performing expensive I/O, then I’d say that the constructor’s parameters are wrong.
Modify the constructor so that it takes the completed form of the expensive operation, and then provide an async helper method (or factory) which both does the I/O and constructs the object, to replace existing calls to the constructor.
This has the following advantages:
- It doesn’t require you to hide any constructors – the constructor is a real legitimate constructor which produces a valid object if called.
- It decouples the I/O from the business logic. If you later add another way of getting the data for the object, you don’t have to change the code of the object itself.
- If you need to construct multiple copies of the object from the same source data, you’ll only need to read the source data once.
- It’s unit-testable (if the data object itself can be constructed in your tests)
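As a sketch of what this could look like in Java: the constructor takes the already loaded data, and a static helper does the expensive load asynchronously before constructing. `Catalog`, `loadAsync`, and the `Supplier` parameter are illustrative assumptions, not names from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical stand-in for MyClass; `lines` is the completed result of the expensive operation.
final class Catalog {
    private final List<String> lines;

    // A real, legitimate constructor: no I/O, always yields a valid object.
    Catalog(List<String> lines) {
        this.lines = Collections.unmodifiableList(new ArrayList<>(lines));
    }

    int size() {
        return lines.size();
    }

    // Async helper: runs the expensive load off the calling thread, then constructs.
    static CompletableFuture<Catalog> loadAsync(Supplier<List<String>> expensiveLoad) {
        return CompletableFuture.supplyAsync(expensiveLoad).thenApply(Catalog::new);
    }
}
```

Existing callers of the old constructor would switch to `Catalog.loadAsync(...)`, while tests and code that already hold the data can call `new Catalog(data)` directly.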
7
The main focal point here is that constructors are not the right place for any kind of expensive operation, let alone async IO calls.
- Extract the expensive operation into a separate method
The main point of concern here is that what you’re really doing is breaking the constructor in two parts. That introduces as many problems as it appears to solve, because now you have a second method that everyone should remember to always run right after the constructor. If they don’t, then your object will not be in the initialized state that it should be post-initialization.
- Remove the constructor from the public API altogether, replace it with a factory method returning an async wrapper.
MyClass.createAsync().thenAccept(/* consumer code */)
I agree with the overarching goal here. However, you seem to be taking the static route by calling MyClass.createAsync().
I would opt for an instanced factory rather than a static class method. In other words, create a MyClassFactory which has a CreateMyClass method (names are obviously open for improvement based on context that I don’t have in this question), which performs the async work and then calls the MyClass constructor with the results already received from those asynchronous operations.
This achieves the best of both worlds:
- Your constructor remains inexpensive and synchronous, focused only on initializing the instance and nothing more.
- Your expensive logic can still be encapsulated in an appropriate class (i.e. whose responsibility it is to perform this task)
- The factory is instanced and therefore can be both injected as a dependency and mocked as needed.
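A minimal Java sketch of such an instanced factory, under some assumptions: the method is spelled `createMyClass` to follow Java naming, and `LoadingMyClassFactory`, `Greeter`, and the `Supplier<String>` standing in for the expensive operation are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Stand-in for MyClass: cheap, synchronous constructor.
final class MyClass {
    private final String data;

    MyClass(String data) {
        this.data = data;
    }

    String data() {
        return data;
    }
}

// Instanced factory, declared as an interface so it can be injected and mocked.
interface MyClassFactory {
    CompletableFuture<MyClass> createMyClass();
}

// Hypothetical production implementation: does the async work, then calls the constructor.
final class LoadingMyClassFactory implements MyClassFactory {
    private final Supplier<String> expensiveLoad;

    LoadingMyClassFactory(Supplier<String> expensiveLoad) {
        this.expensiveLoad = expensiveLoad;
    }

    @Override
    public CompletableFuture<MyClass> createMyClass() {
        return CompletableFuture.supplyAsync(expensiveLoad).thenApply(MyClass::new);
    }
}

// Hypothetical consumer that depends on the factory, not on how MyClass gets its data.
final class Greeter {
    private final MyClassFactory factory;

    Greeter(MyClassFactory factory) {
        this.factory = factory;
    }

    CompletableFuture<String> greet() {
        return factory.createMyClass().thenApply(mc -> "Hello, " + mc.data());
    }
}
```

Because `MyClassFactory` has a single method, a test can replace it with a lambda such as `() -> CompletableFuture.completedFuture(new MyClass("test"))` without touching any real I/O.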
5