I’ve learned from a textbook how to implement binary search trees recursively in Java, and am working on implementing them nonrecursively. I’ve found a simple and elegant way to implement an insert method in C/C++ by traversing the tree using a pointer of type Node**, but can’t find an elegant way to do this in Java.
My C++ code is as follows:
class BST {
class Node {
public:
int key;
int val;
Node *left = nullptr;
Node *right = nullptr;
Node(int k, int v) : key(k), val(v) {}
};
Node *root = nullptr;
public:
void insert(int key, int val);
};
void BST::insert(int k, int v) {
Node **link = &root;
while (*link != nullptr && (*link)->key != k) {
if (k < (*link)->key)
link = &(*link)->left;
else if (k > (*link)->key)
link = &(*link)->right;
}
if (*link == nullptr)
*link = new Node(k, v);
else
(*link)->val = v;
}
Which I think expresses the idea pretty directly. My first attempt in Java was the following.
public class BST {
private class Node {
int key;
int val;
Node left = null;
Node right = null;
Node (int k, int v) {
key = k; val = v;
}
}
Node root = null;
void insert(int k, int v) {
if (root == null) {
root = new Node(k, v);
return;
}
Node node = root;
while (node.key != k) {
if (k < node.key) {
if (node.left == null) {
node.left = new Node(k, v);
return;
} else {
node = node.left;
}
} else {
if (node.right == null) {
node.right = new Node(k, v);
return;
} else {
node = node.right;
}
}
}
node.val = v;
}
}
This seems much less clear to me – it has to deal with lots of different cases separately, and the code for creating new nodes is mingled with the code for traversing the tree, whereas the C++ version goes through two clear stages. Then it occurred to me that I could replicate the C++ version by storing the Node pointers in objects of a Link class:
public class BST {
private class Link {
Node node = null;
}
private class Node {
int key;
int val;
Link left = new Link();
Link right = new Link();
Node (int k, int v) {
key = k; val = v;
}
}
Link root = new Link();
void insert(int k, int v) {
Link link = root;
while (link.node != null && link.node.key != k) {
if (k < link.node.key)
link = link.node.left;
else if (k > link.node.key)
link = link.node.right;
}
if (link.node == null)
link.node = new Node(k, v);
else
link.node.val = v;
}
}
This allows the algorithm to be expressed elegantly, but the overhead of adding an extra layer of indirection to every node of the data structure seems rather a heavy price to pay. Is there a better way of doing this?
14
Part of the problem is that you’re not following good OO principles.
In this case, the one that’s probably most applicable is “Tell, don’t ask“: If you want a Node
to add a child, you should tell it to add a child, you shouldn’t ask it to expose all the relevant information you need then go ahead and add the child for it. Likewise, if (because you’re doing an iterative solution), you need to traverse a level down the tree, tell the Node
you want to traverse to the appropriate child, don’t ask it for everything you need to know to do the calculation yourself.
Consider:
private class Node {
int key;
int val;
Node left = null;
Node right = null;
Node (int k, int v) {
key = k; val = v;
}
public Node traverseNext(int targetKey) {
if (key==target) {
return null;
}
return key < target ? left : right;
}
public void setChild(int key, int value) {
if (key == this.key) {
return;
}
if (key < this.key) {
left = new Node(key, value);
}
else {
right = new Node(key, value);
}
}
Then a caller wanting to add a child would just have to do:
Node node;
Node nextNode = root;
while(nextNode != null) {
node = next;
nextNode = node.traverseNext(key);
}
node.setChild(key, value);
(Excuse any mistakes, Java isn’t my native language)
That’s not exactly stellar code because I had to go out of my way to work around the obvious recursive solution. For example, you wouldn’t realistically want to expose a method that cheerfully replaces one child with another if asked. But it should give the basic idea.
Note: I realise that this doesn’t very directly address the issue which caused the difference between those two code samples- the capability that a pointer has which a reference doesn’t. But I’m also of the opinion that it’s no coincidence that as soon as you solve this problem in a Java-idiomatic way, that difference ceases to be relevant.
0
You can avoid the repetition by creating the new child Node
s earlier and defaulting them to empty, rather than using null
to represent an empty Node
. The body of your while
loop might look something like:
Node child = node.right;
if (k < node.key)
child = node.left;
if (child.isEmpty()) {
child.set(k, v);
child.left = new Node(); //Could be done inside set
child.right = new Node();
return;
} else {
node = child;
}
This basically preallocates the next level of leaves, but shouldn’t be too much overhead. Another option could be replacing left
and right
with a children[2]
, then you can use the array index to avoid the repetition.
1
There is no way to do what you want to do in Java. Java doesn’t offer the freedom of C when dealing with pointers. The only pointers Java has are object references.
Your first attempt is therefore as good as you can do in Java. You have to distinguish the case of left, right and root insertion. If creating and initializing the node is complex, you can move the node creation code to a procedure.
Adding a Link object is a clever idea but it complexifies the data structure as you noted. It also affects all other methods that manipulate or traverse the tree, and that just to make your insert method more elegant. A bit silly if you think of it.
So the answer is you can’t. This being said, I am sure it is just an exercise, so experiment as you like. But in real life, before writing your own binary sort tree, make sure no Java collection fills your needs. And if they don’t, check out open source frameworks like Google guava.
Ben Aaronson’s answer is much better than either of my attempts, but I think the main reason is the technique he uses to iterate through the tree, using two Node
pointers instead of one, rather than the more object-orientated style. The whole thing can be inlined, which allows a direct comparison with my version and shows that it is indeed more elegant, cleanly separating the tree traversal from the node creation (even if it does still take twice as much code as the C++ version):
void insert(int k, int v) {
if (root == null) {
root = new Node(k, v);
return;
}
Node node = root;
Node next = root;
while (next != null) {
node = next;
if (k == node.key)
next = null;
else
next = k < node.key ? node.left : node.right;
}
if (k == node.key)
node.val = v;
else if (k < node.key)
node.left = new Node(k, v);
else if (k > node.key)
node.right = new Node(k, v);
}
I do take on board Ben’s broader points about the value of using OO principles, and clearly the code is improved further by dividing it into separate methods as he has done.