If I want to create an infinite list of integers in Java like so:
ArrayList<Integer> list = new ArrayList<Integer>();
for(int i = 0;;i++){
list.add(i);
}
I run out of memory. My understanding is that we allocate some memory for i, list, and every time we add i to the list, we allocate more memory for it.
But, by using a thunk, I can create an infinite list which will not eat all of my memory:
infList(final int start){
return new InfiniteList (
start,
return new InfiniteList.LazyTail(){eval(){return infList(start+1);}}
);
}
(I adapted source from here). How does this work? I know I am delaying the evaluation of the expressions : start+1, start+1, start+1 … but do I end up with a list of something like this:
[start, eval(){start+1}, eval(){start+1}, eval(){start+1} … ]
Are the ‘eval(){…}’ object references? Do they take up space in memory? Will I eventually run out of memory with this implementation (I tried, but memory allocation was not budging)?
Do I only start using memory, say, if I wanted to system.out.print integers, then they would have to be evaluated, and some memory allocated?
4
The idea behind all simple lists is the following:
- Make sure you can tell if you have an empty list or not.
- If it’s not the empty list, you have 2 parts:
- the head of the list
- and the tail (rest of the list)
Since Java supports closures (well, kind of), one can implement lazy lists with the following trick: Instead of having a reference directly to the tail, have a reference to an Object which is able to compute the tail. Such an Object could be a Callable<ZList>
, say, if ZList
is our list type. And yes, such a closure constructed with
// assuming constructor ZList(int, Callable<ZList>);
public Callable<ZList> nextNode(final int current) {
return new Callable<ZList> {
ZList call() {
return new ZList(current, nextNode(current+1));
}
};
}
takes up some finite amount of memory. Certainly more than a singe list node, but less than many list nodes.
As for the memory requirement: It is posible to go through such an infinite list, and still have constant memory usage as long as the start of the list is stored nowhere. Because, in that case, the head node you’re done with can and will be garbage collected.
If, however, you have the head of the infinite list somewhere (in a static variable, say) memory usage will increase as you traverse the list.
Because such things are painful to write in Java, there are new JVM languages like Scala (functional/OO), Clojure (List) or Frege (Haskell like) that make it easy to have such things in the JVM.
Frege may be interesting for you because a) it is non-strict (i.e. lazy) and b) it spits out java source code, hence you could study and play the java sources created by simple functional programs.
This is classic lazy initialisation as in the objects that are never referenced are never created; only when they are first referenced does the initialization occur.
FYI to create a infinite list in java you can do the following:
public class InfList extends AbstractList<Integer> implements RandomAccess{
public Integer get(int i){
return i;
}
public int size(){
return Integer.MAX_value;
}
}
or to use a more traditional predicate based list:
public class FibList extends AbstractSequentialList<BigInteger> {
public ListIterator<BigInteger> listIterator(int i){
ListIterator<BigInteger> res = new FibIterator(0,1);
for(;i>0;i++)res.next();
return res;
}
private FibIterator implements ListIterator<BigInteger>{
BigInteger prev,curr;int ind=0;
FibIterator(int pr,int c){
prev=BigInteger.valueOf(pr);
curr=BigInteger.valueOf(c);
}
public boolean hasNext(){return true;}
public boolean hasPrevious(){return true;}
public void remove(){throw new UnsupportedOperationException();}
public void set(BigInterger I){throw new UnsupportedOperationException();}
public void add(BigInterger I){throw new UnsupportedOperationException();}
public BigInteger next(){
BigInteger tmp = curr;
curr=curr.add(prev);
prev=tmp;
ind++;
return tmp;
}
public BigInteger previous(){
BigInteger tmp = prev;
prev=curr.subtract(prev);
curr=tmp;
ind--;
return tmp;
}
public int nextIndex(){return ind;}
public int previousIndex(){return ind-;}
}
}
(yeah iterator based list are a pain)
1
In order to get an infinite list of items, you need two things:
- The ability to get the first item
- The ability to get a list containing the rest of the items.
There different names for this pattern such as first/rest, head/tail.
I think you are looking for something more along the lines of the following. It does not implement java’s List interfaces, but it would be possible to.
import java.math.BigInteger;
public class InfiniteList {
private BigInteger _current;
public InfiniteList(BigInteger start){
_current = start;
}
public BigInteger first(){
return _current;
}
public InfiniteList rest(){
return new InfiniteList(_current.add(BigInteger.ONE));
}
}
It has a way to get the first element, but then it also has a way to get a list containing the rest. The list only allocates enough memory for one element, and then allows you to get a new list, containing the rest of the elements, but of course that next list only allocates one element, but gives you a way to get the rest.
If you call rest
99 times and then call first, it will only have allocated 100 elements total. If you do not hold on to any of the items when you’re traversing the list, the ones in the beginning can get garbage collected.
The classic example of the infinite sequence is that of a lazy sequence from a functional programing.
From Wikibooks Clojure Programming: Lazy Fibonacci there are a number of lazy sequences for calculating the Fibonacci sequence. One example:
(def fib-seq
((fn rfib [a b]
(lazy-seq (cons a (rfib b (+ a b)))))
0 1))
user> (take 20 fib-seq)
(0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181)
In this, the sequence is not evaluated until one asks for it and can keep being queried to get the next value in the sequence.
Note that this is a sequence and not an array.
The idea of a structure that is hiding calculation behind it is certainly not unique to pure functional languages.
In perl one example of an array where one asks for $primes[199]
returning the 200th prime can be found at Math::Prime::TiedArray. Similarly, in Python, this stack overflow question creates a iterator/generator (I’m not a pythonian, I’m not exactly sure of the termonology) that will generate an infinite stream of primes.
Note that with the primes it still allocates memory as it is generates the numbers to make the sieve more optimal. If you request more objects than can fit in memory, you will fill up memory. However, it doesn’t calculate the value until you ask for it.
Specifically for Java, one could do what is desired in clojure and then call the clojure code from Java — they are both jvm languages. Its an option, though I don’t recommend it.
For Java, iterator is an interface for something that moves in one direction. The key point being that this is an interface – you can implement it.
Consider:
package com.michaelt.se;
import java.util.Iterator;
public class EvenIter implements Iterator<Integer>
{
private int value = 0;
@Override
public boolean hasNext() {
return true;
}
@Override
public Integer next() {
return (value++ * 2);
}
@Override
public void remove() { }
}
and the demo to run it
package com.michaelt.se;
public class IterDemo
{
public static void main(String[] args)
{
EvenIter evens = new EvenIter();
for(int i = 0; i < 10; i++) {
System.out.println(i + ": " + evens.next());
}
}
}
This is an infinite “list” of numbers that takes a small, constant amount of memory. I did this with the iterable, it could be done with other interfaces too. The slides from the presentation on infinite streams does it as a stream.
The thunk specifically is a closure to be invoked later when needed. Closures are something that are in Java 7.
Again, one can do similar things in Java as code by creating an object with a getter that has an associated method that is then memoized. The idea being “yea, I’ll calculate this value when you actually want it, but until that time, hang on to this.”
The question of the infinite list of integers integers (be they natural numbers, primes, Fibonacci, or some other sequence) can be implemented lazy in a variety of ways in Java. It really boils down to what one wants to do with it beyond the “it should be lazy and not eat all memory as I get big numbers”.
No matter how you cut the cake, an integer takes up space in memory (32-bits in Java to be exact) and an infinite number of them will take up an infinite amount of space. This will cause you to run out of memory(1). Any solution that claims to create an infinite list is not actually creating a list, just an illusion. For example, your loop would make System.out.println(list.get(10000);
print out 10000. I could very easily devise a simple object with a method called get(int x)
that just returns x
. This would create the illusion of an infinite list of integers.
Side Note (1): Depending on how much memory you can allocate to the list, you may hit the maximum integer size of ~2 billion before running out of memory. This would require something in the ballpark of 15 GB of memory to hit this limit.
2