I recently created a red-black tree in C# to better understand how it works. I implemented an in-order enumerator for it, but quickly realized that enumerating a tree can have destructive results.
Consider this code:
RedBlackTree<Person> tree = new RedBlackTree<Person>();
tree.Add(new Person("Randy"));
// add more people
...
foreach (var person in tree)
{
    if (person.Name == "Randy")
    {
        person.Name = "Cheeseburger eater";
    }
}
The tree requires its generic parameter to implement the IComparable interface in order to determine the ordering of items. If the Person class is compared by name, then changing the name during enumeration may break the ordering. For example, randyNode.CompareTo(randyNode.Right) < 0 might become false, which breaks the fundamental BST property that a left child is smaller than its parent and a right child is larger.
In this case, is it better to remove tree enumeration altogether, or should I keep it in, with a huge warning to not modify the items during enumeration?
You’ll find that with the existing C# collections, attempting to mutate the collection while iterating over it typically throws an InvalidOperationException. It would be idiomatic for your collection to throw an exception in that scenario as well.
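One common way to get that fail-fast behaviour is a version counter that every structural change bumps and every enumerator checks. The following is a minimal sketch of the idea, not the asker's tree: VersionedSortedCollection is a hypothetical name, and a sorted List<T> stands in for the red-black tree to keep the example short.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for the tree: a sorted list keeps the sketch short.
public sealed class VersionedSortedCollection<T> : IEnumerable<T>
    where T : IComparable<T>
{
    private readonly List<T> _items = new List<T>();
    private int _version; // bumped on every structural change

    public void Add(T item)
    {
        // BinarySearch returns the bitwise complement of the
        // insertion point when the item is not found.
        int index = _items.BinarySearch(item);
        if (index < 0) index = ~index;
        _items.Insert(index, item);
        _version++;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Remember the version when enumeration starts.
        int snapshot = _version;
        for (int i = 0; i < _items.Count; i++)
        {
            // Fail fast if the collection changed since the snapshot.
            if (snapshot != _version)
                throw new InvalidOperationException(
                    "Collection was modified during enumeration.");
            yield return _items[i];
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

With this in place, calling Add inside a foreach over the same collection throws on the next step of the loop. A check of this kind only sees changes made through the collection's own methods; it cannot see a property being changed on a stored item.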
Throwing on modification doesn’t help here, though, because the problem exists whether or not you’re iterating. If someone changes person.Name, your RBT is already out of order, iteration or not.
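One way to rule this out is to make the property used for comparison read-only, so that a "rename" has to go through the tree as a remove followed by an add. The sketch below assumes a Person compared by ordinal name and uses .NET's SortedSet<T> as a stand-in for the asker's RedBlackTree<T>; the Person shape and the RenameDemo helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Person whose sort key cannot change after construction.
public sealed class Person : IComparable<Person>
{
    public string Name { get; } // get-only: no setter to break the ordering

    public Person(string name) { Name = name; }

    public int CompareTo(Person other) =>
        other is null ? 1 : string.CompareOrdinal(Name, other.Name);
}

public static class RenameDemo
{
    // Renames by removing the old item and inserting a new one,
    // so the set re-sorts instead of holding an item with a stale key.
    public static bool Rename(SortedSet<Person> people, string oldName, string newName)
    {
        if (!people.Remove(new Person(oldName))) return false;
        people.Add(new Person(newName));
        return true;
    }
}
```

The trade-off is that callers can no longer write person.Name = "..." directly, but the tree can never end up holding an item whose position no longer matches its key.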
It depends on whether any users (including yourself) will use the enumerator feature of the library. If there are use cases where it would make the job better or easier, it will be nice to have. Many collection libraries, including .NET’s, require that the underlying collection not change during enumeration.
If, on the other hand, the feature is not used much, and only leads to frustration, then take it out. If it’s useful to you internally as the implementor of the library, then hide it from public use. You know the ins and outs of its use and will not shoot yourself in the foot with it (hopefully).