I am currently working on numerical computations in a specific research domain and have inherited a Java package that my team implemented a long time ago. My task now is to optimize its performance. The package makes extensive use of `ArrayList<Double>`. As far as I know, Java, at least up to version 9, which I am using, does not support `ArrayList` of primitive types due to type erasure in Java generics.
There are a few issues with this:
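- Storage overhead: Double is a wrapper class for double. Since Double is an object, every instance carries an object header in addition to the 8-byte value. With an 8-byte header that comes to 16 bytes per Double; on a typical 64-bit HotSpot JVM the header is 12 bytes and alignment padding brings each instance to 24 bytes. On top of that, the list's array holds a 4- or 8-byte reference to each object, whereas a primitive double needs 8 bytes in total.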
- Non-contiguous memory: `ArrayList` internally uses an array to store `Double` objects, conceptually `Double[] data` (the JDK source declares it as an `Object[]`). This array holds references to `Double` instances. These instances are not guaranteed to be stored contiguously in the heap. For example, if `Double[] data = {1.0, 2.0}`, the references in `data` are contiguous, but the actual `Double` objects that these references point to need not be. This leads to poor locality and frequent cache misses due to the need to dereference these pointers.
- Unboxing and autoboxing: There is additional overhead due to unboxing and autoboxing when performing comparisons and calculations with Double.
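To make the last two points concrete, here is a minimal sketch of the same summation over boxed and primitive storage. The class and method names (`BoxingDemo`, `sumBoxed`, `sumPrimitive`) are made up for illustration, and the sketch only shows where boxing and dereferencing happen; it is not a benchmark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
final class BoxingDemo {
    // Each get() returns a reference to a Double object, which is then
    // dereferenced and unboxed before the addition.
    static double sumBoxed(List<Double> values) {
        double total = 0.0;
        for (int i = 0; i < values.size(); i++) {
            total += values.get(i); // implicit values.get(i).doubleValue()
        }
        return total;
    }

    // The values live directly in the array, so the loop reads them
    // sequentially without following any references.
    static double sumPrimitive(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Double> boxed = new ArrayList<>();
        double[] primitive = new double[1000];
        for (int i = 0; i < primitive.length; i++) {
            boxed.add((double) i); // autoboxing: wraps the value in a Double
            primitive[i] = i;
        }
        System.out.println(sumBoxed(boxed));
        System.out.println(sumPrimitive(primitive));
    }
}
```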
My questions are:
What are the best practices to avoid the mentioned overhead? I welcome any answers regarding the questions below.
- Are there any new features in the latest versions of Java that allow the use of ArrayList with primitive types?
- Are there any experimental features, frameworks, or libraries that support this?
- If not, would it be feasible to create a personal `DoubleArrayList` by replacing all Object in the `ArrayList` source code with primitive double? What would be the potential pitfalls of this approach?
PS:
- We are on a very tight schedule and do not have the time to rewrite the entire package in C++, where `vector<double>` is available.
- In our use case, the container will change size during computation, and we don’t know the size beforehand. Therefore, we must use a resizable array (such as `ArrayList`) instead of `double[]`.
Use `double[]`.
The “free” resizing of `ArrayList` is just a convenience: when more space is needed, the internal backing array is replaced by a larger one (about 1.5 times the old capacity in the OpenJDK implementation) and the elements are copied across with `Arrays.copyOf`.
It’s a trivial implementation to code yourself.
Create a class with a `double[]` field and basically copy what you need from `ArrayList`, but add a getter method for the array and use that for your computations.
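A minimal sketch of such a class might look like the following. The names (`DoubleArrayList`, `array()`, `toArray()`), the default capacity of 10, and the 1.5x growth step are my own choices for illustration (the growth step mirrors what OpenJDK's `ArrayList` does); adapt them to whatever your package needs.

```java
import java.util.Arrays;

// Illustrative sketch of a growable list backed by a primitive double[].
final class DoubleArrayList {
    private double[] data;
    private int size;

    DoubleArrayList() {
        this(10); // assumed default capacity
    }

    DoubleArrayList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        }
        data = new double[initialCapacity];
    }

    void add(double value) {
        if (size == data.length) {
            grow();
        }
        data[size++] = value;
    }

    double get(int index) {
        checkIndex(index);
        return data[index];
    }

    void set(int index, double value) {
        checkIndex(index);
        data[index] = value;
    }

    int size() {
        return size;
    }

    // Exposes the backing array for tight loops. Only indices [0, size())
    // hold valid elements, and the reference goes stale once add() grows
    // the array.
    double[] array() {
        return data;
    }

    // Returns a trimmed copy that is safe to hand to other code.
    double[] toArray() {
        return Arrays.copyOf(data, size);
    }

    private void grow() {
        // Grow by about 1.5x; the +1 lets capacities 0 and 1 grow too.
        // Overflow handling for very large arrays is omitted in this sketch.
        int newCapacity = data.length + (data.length >> 1) + 1;
        data = Arrays.copyOf(data, newCapacity);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

In a hot loop, fetch the array once with `array()`, iterate up to `size()`, and fetch it again after any `add`, since growth replaces the backing array.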
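If you can get away with the reduced precision, consider `float[]` instead. A `float` takes half the memory of a `double`, so memory-bound loops and vectorized code can run up to roughly twice as fast; scalar arithmetic on its own is often about the same speed.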
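Before switching, it is worth checking whether your computation tolerates single precision. The sketch below uses a made-up class name (`FloatPrecisionDemo`) and an arbitrary test value; it accumulates the same number in `float` and in `double` so you can compare the drift, and it shows that a `float` cannot represent every integer above 2^24.

```java
// Hypothetical demo class; names and test values are illustrative only.
final class FloatPrecisionDemo {
    // Naive accumulation in single precision.
    static float sumFloat(float value, int count) {
        float total = 0f;
        for (int i = 0; i < count; i++) {
            total += value;
        }
        return total;
    }

    // The same accumulation in double precision.
    static double sumDouble(double value, int count) {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        System.out.println("float:  " + sumFloat(0.1f, n));
        System.out.println("double: " + sumDouble(0.1, n));
        // float has a 24-bit significand, so 2^24 + 1 rounds to 2^24.
        System.out.println((float) 16_777_217 == 16_777_216f); // prints true
    }
}
```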