I program mostly in statically typed languages, like C++ and Java. A common strategy in these languages for handling collections of related objects that need type-specific behavior is to use polymorphism and inheritance hierarchies. But polymorphism only works when the Base class has a method (a virtual function) that every Derived class can override. There are often cases where a Derived class needs a method that makes no sense in the Base class, because it is very specific to that Derived class.
Let me give a concrete example that many of us can probably relate to: the HTML DOM.
JavaScript programmers are quite used to iterating over collections of DOM nodes. They take JavaScript’s dynamic typing for granted when they do something like:
var node = document.getElementById("foo");
for (var i = 0; i < node.childNodes.length; ++i)
    doSomething(node.childNodes[i]);
Something like this would be very messy in a statically typed language like Java, because inheritance-based polymorphism doesn’t lend itself well to situations where Derived classes have methods that don’t exist in the Base class. For example, when iterating over the HTML nodes above, we might find a DIV node, which has certain methods, a Text node, which has different methods, and a TABLE node, which has its own unique methods. It wouldn’t make much sense for ALL of these methods to exist in a common Base class “Node”, because they are highly specific to each Derived type (e.g. a DIV node has no need of a “cells” method, but a TABLE node does).
So, in a statically typed language, you’d be forced to fall back on something like Type Enums and a lot of down-casting to check what type of Node you have. You end up with messy code like:
Node node = document.getElementById("foo");
for (int i = 0; i < node.childNodes().length(); ++i)
{
    Node n = node.childNodes().elementAt(i);
    if (n.type() == Node.DIV_NODE) doSomethingWithDiv((DivNode) n);
    else if (n.type() == Node.TABLE_NODE) doSomethingWithTable((TableNode) n);
    /* etc... */
}
This is similar to what C programmers do, and it’s seemingly the sort of thing virtual functions were designed to avoid. And yet, when a Derived class has methods that make no sense in a common base class, you seem to have no recourse but something like Type Enums.
So, how is this problem usually addressed in statically-typed languages? Is there a better design pattern than Type Enums?
This isn’t a limitation of statically typed languages. The solution is to declare a common Interface, shared by the various classes, for the operations that you care about. This allows you to do whatever you need to do while still being type safe.
The best fit would probably be the Visitor pattern.
And if you need to do things like this, then you should rethink your object model, because behavior that works only on a specific type should be part of that type.
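To make the Visitor suggestion a bit more concrete, here is a minimal Java sketch. Only Node, DivNode and TableNode come from the question; NodeVisitor, accept, Describer and cellCount are names invented for this example, and the class bodies are placeholders rather than any real DOM API.

```java
import java.util.List;

class VisitorDemo {
    // Walks the children without any type enum or down-cast.
    static String describeAll(List<Node> children) {
        StringBuilder out = new StringBuilder();
        NodeVisitor describer = new Describer(out);
        for (Node child : children) {
            child.accept(describer); // each node calls back the matching visit()
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeAll(List.of(new DivNode(), new TableNode())));
    }
}

// Hypothetical visitor interface: one method per concrete node type.
interface NodeVisitor {
    void visit(DivNode div);
    void visit(TableNode table);
}

interface Node {
    void accept(NodeVisitor visitor);
}

class DivNode implements Node {
    public void accept(NodeVisitor visitor) { visitor.visit(this); }
}

class TableNode implements Node {
    int cellCount() { return 4; } // placeholder for a table-only method
    public void accept(NodeVisitor visitor) { visitor.visit(this); }
}

// One operation over the hierarchy, kept outside the node classes.
class Describer implements NodeVisitor {
    private final StringBuilder out;
    Describer(StringBuilder out) { this.out = out; }
    public void visit(DivNode div) { out.append("div;"); }
    public void visit(TableNode table) {
        out.append("table(").append(table.cellCount()).append(");");
    }
}
```

The table-only method is reached through a parameter that is already typed as TableNode, so the compiler checks the call. The trade-off is that adding a new node type means adding a method to NodeVisitor and to every visitor.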
This is what interfaces are for. You shouldn’t be using a base class for this. Base classes signal a relationship of the “A is a B” kind; interfaces say “A can do X”.
Using the traditional animal example: a Bee is an Insect, which is an Animal. A Sparrow is a Bird, which is also an Animal. Both bees and sparrows can fly, but not all animals can fly, and what’s worse, not even all insects or all birds can fly. So where does the fly() method go? Animal? Or should we restructure our class hierarchy so that Bee and Sparrow have a common ancestor? But if we do that, we cannot have a common ancestor for Penguin and Sparrow that provides a beak without either making penguins fly or equipping bees with beaks (draw the class hierarchies out on paper if you don’t get it).
The solution is to have a FlyingAnimal interface, which provides the signature for fly(); whenever you need a collection of flying animals, declare it as Collection<FlyingAnimal>.
Using interface inheritance, you can group interfaces into ‘super-interfaces’, e.g., you can have a FlyingAndWalkingAnimal which inherits fly() from FlyingAnimal and walk() from WalkingAnimal.
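The hierarchy described above could be sketched in Java roughly as follows. The class and interface names are the ones used in this answer; the String return values, the beak() method body and the FlyingDemo helper are assumptions added only so the example has something to print.

```java
import java.util.List;

class FlyingDemo {
    // Accepts only things that can fly; passing a Penguin would not compile.
    static String launchAll(List<FlyingAnimal> flock) {
        StringBuilder log = new StringBuilder();
        for (FlyingAnimal animal : flock) {
            log.append(animal.fly()).append(' ');
        }
        return log.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(launchAll(List.of(new Bee(), new Sparrow())));
    }
}

interface FlyingAnimal { String fly(); }
interface WalkingAnimal { String walk(); }

// A 'super-interface' that inherits both signatures.
interface FlyingAndWalkingAnimal extends FlyingAnimal, WalkingAnimal { }

abstract class Animal { }
abstract class Insect extends Animal { }
abstract class Bird extends Animal {
    String beak() { return "beak"; } // shared by all birds, flying or not
}

class Bee extends Insect implements FlyingAnimal {
    public String fly() { return "bee buzzes"; }
}

class Sparrow extends Bird implements FlyingAndWalkingAnimal {
    public String fly() { return "sparrow flaps"; }
    public String walk() { return "sparrow hops"; }
}

class Penguin extends Bird implements WalkingAnimal {
    public String walk() { return "penguin waddles"; }
}
```

The base classes still say what each animal is, while the interfaces say what it can do, so Penguin keeps its beak without gaining fly().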
To me it seems your problem is either a flaw in your understanding or a flaw in your design. Or both. :)
What your example attempts to do (and does in JS) is to execute a method with the same name on each of a bunch of objects of unrelated types. That’s one of those hacks you do when you’re knocking up a quick throw-away tool in some scripting language, and it has its place there.
But static languages are statically typed so that the compiler can catch simple errors, like calling a non-existent method, at compile time. When you are doing development in such a language, the assumption naturally is that you mean business, that your program is going to run at customers’ sites, and that you want to weed out as many bugs as possible as early as possible.
One of the things a statically typed language requires in order to be able to hold your hand is that you express in your design that a group of types supports the same operation. You do this by deriving them from the same base class or interface, which declares that this operation is available in all derived classes. If you don’t do this, your code will have to invent numerous hacks in order to achieve the same goal. (A wise man once said that a switch over types, which is what your if(n.type() == Node.SOME_TYPE) ... else if(n.type() == Node.SOME_OTHER_TYPE) ... really amounts to, is just unreasonable fear of virtual functions.)
So: if those types support the same operation, and if you want to execute that operation on objects of any of those types without really caring which concrete type they are, then you need to declare those operations in some common base class or interface. (For the corner case where you have only a relatively small number of types but a number of unrelated operations, which would unreasonably fatten a base class’s interface, there are idioms like the Visitor Pattern, which emulates double dispatch to solve that problem.)
That you need this at all is a failure in your design. Subtypes exist to act as a refinement of your base type. If you need specific behaviors, then work with the specific type.
The better pattern is to let the collection supply the collection of specific types you need (à la CSS/jQuery selectors).
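One way to read that suggestion in Java is a query helper that hands back an already-typed list. This is only a sketch: childrenOfType and SelectorDemo are hypothetical names, and the empty Node classes stand in for a real DOM.

```java
import java.util.ArrayList;
import java.util.List;

class SelectorDemo {
    // Hypothetical helper: returns only the children that are instances of 'type'.
    static <T extends Node> List<T> childrenOfType(List<? extends Node> children, Class<T> type) {
        List<T> matches = new ArrayList<>();
        for (Node child : children) {
            if (type.isInstance(child)) {
                matches.add(type.cast(child)); // the only cast, kept in one place
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Node> children = List.of(new DivNode(), new TableNode(), new DivNode());
        List<DivNode> divs = childrenOfType(children, DivNode.class);
        System.out.println(divs.size());
    }
}

class Node { }
class DivNode extends Node { }
class TableNode extends Node { }
```

Callers then work with List<DivNode> or List<TableNode> directly, and the run-time type check lives inside the collection rather than in every loop.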
There’s always type checking, either at compile time or at run time; you just don’t always call it that. For example, the JavaScript fragment looks very impressive, but it hides the details of doSomething. Yes, it can accept any object, and any object can have a very specific set of methods, but the function has to access methods and properties, so it assumes the ones it needs are there; otherwise you get a run-time error. So either you filter the objects beforehand so that they are safe to pass to doSomething, or it does the filtering inside. Statically typed languages have the luxury (or burden, depending on the POV) of doing much of this work at compile time, leaving part of it to run time with a little help from RTTI.
Many statically typed languages have a built-in mechanism that is roughly equivalent, such as instanceof in Java or dynamic_cast in C++. DOM elements do have some shared interfaces which you could put in a base class. However, there are also statically typed tools that do not rely on inheritance, for example boost::variant and boost::any, which can provide dynamic typing between elements that are not related by inheritance.
There are several ways of handling this in statically typed languages. Probably the most elegant is pattern matching. Here’s Haskell:
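doSomething :: Node -> IO ()
doSomething (DivNode n)   = putStrLn "div"   -- placeholder: do something with a div node
doSomething (TableNode n) = putStrLn "table" -- placeholder: do something with a table node
doSomething _             = return ()        -- do nothing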
The same, or similar, mechanisms exist in other statically typed languages (F#, Scala as far as I know). In C++, the same can be emulated (albeit more verbosely) with Boost.Variant and its static visitor:
#include <boost/variant.hpp>

struct visit_node : boost::static_visitor<void> {
    void operator ()(DivNode const& n) const {
        // Do something with a div node.
    }
    void operator ()(TableNode const& n) const {
        // Do something with a table node.
    }
    template <typename T>
    void operator ()(T const& n) const {
        // Do nothing for any other node type.
    }
};
(It’s worth noting that this looks like compile-time overload resolution but thanks to the magic of Boost.Variant, this will work at runtime, and be completely type safe.)
More generally, a visitor pattern can be used to implement it, but hand-written visitors are even more verbose than Boost’s static_visitor helper (which essentially abstracts away the cruft), and that makes them very unresponsive to changes in the class hierarchy.
And yes, you can obviously hard-code the dispatch via dynamic casts as some other answers have shown. That’s easily the worst possible solution, though. Like the above methods, it hard-codes the type hierarchy, but it does so in a particularly verbose way.
In Java, you can use reflection to determine if an object has a particular method on it. But it’s better to collect your interface methods into an interface, and if you need to see if an object provides that interface, it’s just a cast away.
Similarly, while C++ does not provide reflection, it does provide RTTI and dynamic_cast (or, better yet, if you’re using shared_ptr, dynamic_pointer_cast), which you can use to query a C++ object for an interface (which is a pattern expressed via multiple inheritance, rather than a specific language feature).
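For the Java half of this answer, the two options might look like the sketch below. HasCells, cellsViaInterface and hasMethod are hypothetical names invented for illustration; Class.getMethod is the standard reflection lookup for public methods.

```java
class InterfaceQueryDemo {
    // Preferred: ask whether the object provides the interface, then use it.
    static int cellsViaInterface(Object node) {
        if (node instanceof HasCells) {
            return ((HasCells) node).cells();
        }
        return -1; // assumption for this sketch: -1 means "no cells"
    }

    // Reflection alternative: look up a public no-argument method by name at run time.
    static boolean hasMethod(Object node, String name) {
        try {
            node.getClass().getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(cellsViaInterface(new TableNode()));
        System.out.println(hasMethod(new DivNode(), "cells"));
    }
}

// Hypothetical interface collecting the table-specific operation.
interface HasCells { int cells(); }

class DivNode { }

class TableNode implements HasCells {
    public int cells() { return 6; } // placeholder value
}
```

With the interface, a misspelled method name is a compile error; with reflection it only shows up as a false result or an exception at run time.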
I don’t see where you have a different problem in statically typed languages versus dynamically typed ones. The only difference in a dynamically typed language is that you don’t have to declare the interface, and that an error is only detected at run time.
However in both cases you have to either:
- Check the actual type of the instance and only call the method you know it has. The only difference is that statically typed language will need a cast.
- Have all the methods in all the types, even the ones where they don’t make sense.
- Use a Visitor Pattern, which is quite simple in statically typed languages, while it requires introspection in most dynamic ones.
There is a handful of languages that have dynamic multiple dispatch, but IIRC there are both dynamic and static ones that do (and none widely used).
Your example begs the question: How do you KNOW that you are iterating over a set of DOMs? How do you know something else didn’t sneak in there? How do you know the new hire from PoliticallyCorrectUniversity read the documentation closely enough, and didn’t call your routine with the wrong list of stuff?
You want the compiler to catch these things for you. That’s what strong typing and static typing are all about. (Recall the ALGOL committee’s reply to Wirth’s proposal for default type rules for ALGOL.)
You can tell the compiler what you are doing in one of two ways: put a virtual method in the base class (or in an interface), with an error barf for the default case that everyone EXCEPT the privileged derived class will inherit, or, as you said, “fall back on something like Type Enums and a lot of messy down-casting to check what type of Node you have”.
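The first option, a base-class method whose default implementation fails, might look like this in Java. It is a sketch only: cells() is borrowed from the question’s TABLE example, and tryCells and the returned strings are made up for the demonstration.

```java
class DefaultThrowDemo {
    // Calls cells() on any node and reports what happened.
    static String tryCells(Node node) {
        try {
            return "cells=" + node.cells();
        } catch (UnsupportedOperationException e) {
            return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCells(new TableNode()));
        System.out.println(tryCells(new DivNode()));
    }
}

class Node {
    // Default inherited by every node type that has no cells.
    int cells() {
        throw new UnsupportedOperationException(getClass().getSimpleName() + " has no cells");
    }
}

class DivNode extends Node { }

class TableNode extends Node {
    @Override
    int cells() { return 6; } // placeholder value
}
```

Note that this moves the error for a wrong call from compile time to run time, which is the price of putting the method in the base class.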
Here are some rules for using inheritance:
- Separate your code into two parts: interface and implementation
- the implementation must publish all of its data via the interface
- the implementation can implement only the functions available in the interface. No other implementation is allowed.
- constructor parameters of a derived class are part of the “interface” of an object: the interface is divided into two parts, the common base class and the constructor parameters of the derived class
- All specific behaviour is handled via constructor parameters
First, if you want something similar to what you’d have written in JavaScript, the if statements would probably go in the body of the doSomething method rather than outside. It’s not cleaner, just closer to your JavaScript example.
That being said, you can do much the same thing in a static language using method overloading. Actually, it’s even better: you’d remove all the if statements altogether!
Let’s say both DivNode and TableNode inherit from a Node type.
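Then, if you write the following three functions, you don’t need any if statements or casting, as long as the argument’s static type at the call site is already the specific node type (Java picks the overload at compile time):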
private void doSomething(Node node) {
    throw new RuntimeException("This method should never be called!");
}
private void doSomething(DivNode divNode) {
    // function body
}
private void doSomething(TableNode tableNode) {
    // function body
}
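One caveat is worth checking before relying on this: Java selects among overloads using the static (declared) type of the argument, not its run-time type. The sketch below uses the same three overloads, changed to return a label instead of throwing so the result can be observed; OverloadDemo and the empty node classes are assumptions for illustration.

```java
class OverloadDemo {
    static String doSomething(Node node) { return "node"; }
    static String doSomething(DivNode divNode) { return "div"; }
    static String doSomething(TableNode tableNode) { return "table"; }

    public static void main(String[] args) {
        Node n = new DivNode();
        System.out.println(doSomething(n));             // declared type is Node
        System.out.println(doSomething(new DivNode())); // declared type is DivNode
    }
}

class Node { }
class DivNode extends Node { }
class TableNode extends Node { }
```

When iterating over a collection declared as holding Node, every call therefore goes to the Node overload, so overloading alone does not replace the dispatch; it helps only where the specific type is already known to the compiler.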