In C# it is legal to write
Animal[] a = new Giraffe[4]; //with obvious relationships
This works because arrays are covariant. However, the covariance is unsafe: storing any other kind of Animal in this array throws at run time (an ArrayTypeMismatchException), because the backing store is still a Giraffe[].
a[0] = new Cat(); //KABOOM
Why doesn’t the language create a covariant backing store? This doesn’t appear to be a limitation of the language because I can create this type of data structure myself:
class Program
{
    static void Main(string[] args)
    {
        A[] a = new B[2];
        a[0] = new B();
        a[1] = new C(); // runtime exception
        var containerLegal = new Container<B, A>();
        containerLegal.Add(new B());
        containerLegal.Add(new C()); // works fine
    }
}
class A { }
class B : A { }
class C : A { }
class Container<T, S> where T : class, S
{
    S[] s = new S[4];
    int currentIndex = 0;
    public void Add(S t)
    {
        s[currentIndex] = t;
        currentIndex++;
    }
}
I’m cheating a little bit because I’m not using T in Container<>, but I think it’s a fair comparison anyhow.
The problem isn’t that you couldn’t store a reference to a C in that array (though I’m pretty sure it’s nontrivial to implement). The problem is, if you permit that, code like this (note that this can be spread out over several unrelated methods, due to arrays being reference types) becomes wrong:
B[] bs = new B[1];
A[] asArray = bs; /* "as" is a reserved word in C#, so it cannot be used as a variable name */
asArray[0] = new C();
B b = bs[0]; /* It's a C, but we use it as a B, and a C is not a B! */
How does it go wrong? Well, if the runtime doesn’t notice and assumes the wrong object layout, you get all kinds of fun usually reserved for C, C++, and unsafe code:
- Re-interpreting an int as a float (or a reference to T as a reference to U, for recursive instances of such fun).
- Accessing uninitialized memory (or another object’s memory).
- Calling a method with the wrong type or number of arguments.
- Using padding bytes as if they had any meaning.
- Overwriting metadata the GC/memory management system needs to operate.
Alternatively, you could restore safety by making all code reading from arrays do run-time type checks (as opposed to doing that check when writing to the array, as it currently does) and throwing an exception when an item has an unexpected type. That’s slow and you’d just get the same exception in a different place (when you access the reference vs. when you store it), so you didn’t “fix” anything.
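To make the write-time check concrete, here is a minimal sketch of the current behavior. The class name StoreCheckDemo and the returned strings are made up for illustration; the A, B, C hierarchy mirrors the one in the question.

```csharp
using System;

class A { }
class B : A { }
class C : A { }

static class StoreCheckDemo
{
    // Describes what happens when a C is stored into a B[] through an A[] reference.
    public static string Run()
    {
        B[] bs = new B[1];
        A[] view = bs; // legal thanks to array covariance; both names refer to the same array

        try
        {
            view[0] = new C(); // the runtime checks the real element type on this write
            return "stored";
        }
        catch (ArrayTypeMismatchException)
        {
            // The write was rejected, so the B[] never saw a C.
            return bs[0] == null ? "rejected, bs[0] is still null" : "rejected";
        }
    }
}
```

Calling StoreCheckDemo.Run() should take the catch branch, which is exactly the "KABOOM" from the question: the check happens when the reference is stored, so reads from bs stay safe.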
That’s because you can’t fix it: such code is inherently broken. Your Container would exhibit the same problem if it permitted getting references back out of the S[] (e.g., through an indexer), unless it always returned the most general type (an S, not a T).
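As a rough sketch of that point, suppose the Container from the question grew two read methods. GetAsS and GetAsT are hypothetical names that are not in the original code, and ContainerDemo is only there to exercise them.

```csharp
using System;

class A { }
class B : A { }
class C : A { }

// Same shape as the question's Container, plus two hypothetical read methods.
class Container<T, S> where T : class, S
{
    private readonly S[] items = new S[4];
    private int count = 0;

    public void Add(S item)
    {
        items[count] = item;
        count++;
    }

    // Safe: everything that was added is an S by construction.
    public S GetAsS(int index)
    {
        return items[index];
    }

    // Unsafe: the element may be some other subtype of S, so this cast can fail at run time.
    public T GetAsT(int index)
    {
        return (T)(object)items[index];
    }
}

static class ContainerDemo
{
    public static string Run()
    {
        var container = new Container<B, A>();
        container.Add(new B());
        container.Add(new C());

        A general = container.GetAsS(1); // fine, a C is an A

        try
        {
            B specific = container.GetAsT(1); // a C is not a B
            return "no error";
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }
}
```

Reading the element back as a T moves the failure from the write to the read, which is the trade-off described above; reading it back as an S avoids the failure but gives up the more specific type.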
Because that’s what you told it to do. Your statement is the equivalent of:
Giraffe[] g = new Giraffe[4];
Animal[] a = g; // the Giraffe array has already been created by this point
Your statement is explicit about every type. If the left side were allowed to determine the type of the right side, you’d run into problems. Consider:
Giraffe[] g = GetGArray();
Animal[] a = GetGArray();
Does a equal g? If it doesn’t, then a new backing store would have to be created. If it does, then the compiler would have to behave differently depending on whether the right-hand side is a new expression or a function call. But then there is inlining…
Simplest to just do exactly what the line says.
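A small sketch of what "exactly what the line says" means in practice. The body of GetGArray is an assumption (the answer never shows it); here it simply hands back one shared array, and SameArrayDemo is a made-up name.

```csharp
using System;

class Animal { }
class Giraffe : Animal { }

static class SameArrayDemo
{
    private static readonly Giraffe[] shared = new Giraffe[4];

    // Assumed implementation: always returns the same Giraffe[] instance.
    static Giraffe[] GetGArray()
    {
        return shared;
    }

    // True when the Animal[] and Giraffe[] variables refer to one and the same array.
    public static bool SameObject()
    {
        Giraffe[] g = GetGArray();
        Animal[] a = GetGArray(); // no copy, no new backing store
        return object.ReferenceEquals(a, g);
    }

    // The element type the array was created with, regardless of the variable's static type.
    public static Type RuntimeElementType()
    {
        Animal[] a = new Giraffe[4];
        return a.GetType().GetElementType();
    }
}
```

Under that assumption SameObject() returns true, and RuntimeElementType() returns typeof(Giraffe): the assignment only changes the static type of the reference, never the array itself.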
As noted, arrays bind to whatever type is specified when they are created. The fact that even mutable arrays may be passed covariantly is a consequence of a few things:
- Arrays predate generics, and until covariant generics were added to .NET there was no immutable type which could behave covariantly, so the arrays (which were all mutable) were the only thing that could be covariant.
- To efficiently sort an array, one must be able to read items out of the array and write them back to it. If one has a reference of type Animal[], even if it points to a Cat[], one may without any possibility of run-time error read out any item of that array into a reference of type Animal and later store it back to the array from which it came. The ability to read out items might not be totally necessary if Array contained some methods like CopyItem and SwapItem, but being able to read out an item and write it back after performing many other changes to an array is more efficient than would be having to decompose every sort into a sequence of swaps. While one could design a covariant interface which would allow an array or other collection to be efficiently sorted without requiring the sorting code to directly write anything into the collection itself, it’s much easier to simply allow array covariance.
- An array knows the type of element it’s supposed to contain. This allows it to enforce type-safety at runtime. Although the generic collections added in .NET 2.0 also know their element type, most generic collections in Java don’t. If one tried to pass an ArrayList<Cat> to code expecting an ArrayList<Animal>, and if the latter code tried to add a Dog to it, there would be no way the ArrayList<Cat> could know that it was created as an ArrayList<Cat> and should reject the request.
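The sorting argument can be sketched roughly like this. The Weight property, the SortByWeight method and the CovariantSortDemo class are invented for illustration; the point is only that the sort is written against Animal[] yet runs safely on a Cat[], because every element it writes came out of that same array.

```csharp
using System;

class Animal
{
    public int Weight { get; set; }
}

class Cat : Animal { }

static class CovariantSortDemo
{
    // Insertion sort written against Animal[]; it only stores elements it read from the same array.
    public static void SortByWeight(Animal[] animals)
    {
        for (int i = 1; i < animals.Length; i++)
        {
            Animal current = animals[i]; // read out as the general type
            int j = i - 1;
            while (j >= 0 && animals[j].Weight > current.Weight)
            {
                animals[j + 1] = animals[j]; // write back: passes the runtime element check
                j--;
            }
            animals[j + 1] = current;
        }
    }

    // Sorts a Cat[] through an Animal[] parameter and returns the weights in their new order.
    public static string Run()
    {
        Cat[] cats =
        {
            new Cat { Weight = 5 },
            new Cat { Weight = 3 },
            new Cat { Weight = 4 }
        };

        SortByWeight(cats); // Cat[] is passed as Animal[] via array covariance

        return string.Join(",", Array.ConvertAll(cats, c => c.Weight.ToString()));
    }
}
```

The element check still runs on each write, but it can never fail here, since a Cat read out of a Cat[] is always acceptable when stored back.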
If one declared arrays using syntax like Array<Dog> myDogs = new Array<Dog>(34); rather than Dog[] myDogs = new Dog[34];, then it might be possible to have multiple kinds of array references, so that e.g. Array<Animal> myDogs = new Array<Dog>(34); wouldn’t compile, but SortableArray<Animal> myDogs = new Array<Dog>(34); would. Neither Java nor .NET supports multiple array types, however.
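Arrays have no such split, but the generic collection interfaces give a loose analogue of "several kinds of reference to the same storage". This is a sketch assuming .NET 4.5 or later, where IReadOnlyList<out T> is available; the Name property and ReadOnlyViewDemo class are invented for illustration.

```csharp
using System.Collections.Generic;

class Animal
{
    public string Name { get; set; }
}

class Dog : Animal { }

static class ReadOnlyViewDemo
{
    // Accepts any read-only list of animals; the interface offers no way to store into it.
    public static string FirstName(IReadOnlyList<Animal> animals)
    {
        return animals[0].Name;
    }

    public static string Run()
    {
        var dogs = new List<Dog> { new Dog { Name = "Rex" } };

        // Compiles: IReadOnlyList<out T> is covariant because it only hands elements out.
        IReadOnlyList<Animal> view = dogs;

        // Would not compile: IList<T> is invariant because it also accepts writes.
        // IList<Animal> writable = dogs;

        return FirstName(view);
    }
}
```

Here the unsafe conversion is rejected at compile time instead of at run time, which is roughly what the Array<Animal> versus SortableArray<Animal> distinction above would buy.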