In C++ a reference argument to a function allows the function to make the reference refer to something else:
#include <iostream>

int replacement = 23;

void changeNumberReference(int& reference) {
    reference = replacement;
}

int main() {
    int i = 1;
    std::cout << "i=" << i << "\n"; // prints i=1
    changeNumberReference(i);
    std::cout << "i=" << i << "\n"; // prints i=23
}
Analogously, a constant reference argument to a function produces a compile-time error if we try to change the reference:
void changeNumberReference(const int& reference) {
    reference = replacement; // compile-time error: assignment of read-only reference 'reference'
}
Now, with Java, the docs say that function arguments of non-primitive types are references. Example from the official docs:
public void moveCircle(Circle circle, int deltaX, int deltaY) {
    // code to move origin of circle to x+deltaX, y+deltaY
    circle.setX(circle.getX() + deltaX);
    circle.setY(circle.getY() + deltaY);
    // code to assign a new reference to circle
    circle = new Circle(0, 0);
}
Then circle is assigned a reference to a new Circle object with x = y = 0. This reassignment has no permanence, however, because the reference was passed in by value and cannot change.
To me this doesn’t look at all like C++ references. It doesn’t resemble regular C++ references because you cannot make it refer to something else, and it doesn’t resemble C++ const references because in Java, the code that would change (but really doesn’t) the reference does not throw a compile-time error.
This is more similar in behavior to C++ pointers. You can use it to change the pointed-to object's values, but you cannot change the caller's pointer itself from inside a function. Also, as with C++ pointers (but not with C++ references), in Java you can pass “null” as the value for such an argument.
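To make the comparison concrete, here is a minimal C++ sketch of the pointer behavior I mean. The names writeThenReseat, demoPointerPassedByValue, and other are made up for illustration; the point is only that the write through the pointer persists, while reseating the pointer inside the function does not reach the caller.

```cpp
#include <cassert>

int other = 99; // hypothetical second target for the reseated pointer

// Hypothetical helper: writes through the pointer, then reseats its local copy.
void writeThenReseat(int* p) {
    if (p == nullptr) {
        return;          // a pointer argument may be null, like a Java reference
    }
    *p = 23;             // changes the caller's int
    p = &other;          // changes only this function's copy of the pointer
}

void demoPointerPassedByValue() {
    int i = 1;
    int* ptr = &i;
    writeThenReseat(ptr);
    assert(i == 23);     // the write through the pointer persisted
    assert(ptr == &i);   // the reseating did not reach the caller
    writeThenReseat(nullptr); // accepted by the compiler; the helper returns early
}
```

This mirrors the moveCircle example: the setter calls correspond to the write through the pointer, and circle = new Circle(0, 0) corresponds to reseating the local copy.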
So my question is: Why does Java use the notion of “reference”? Is it to be understood that they don’t resemble C++ references? Or do they indeed really resemble C++ references and I’m missing something?
Why? Because, although consistent terminology is generally good for the entire profession, language designers don’t always respect the language use of other language designers, particularly if those other languages are perceived as competitors.
But really, neither use of ‘reference’ was a very good choice. “References” in C++ are simply a language construct to introduce aliases (alternative names for exactly the same entity) explicitly. Things would have been much clearer if they had simply called the new feature “aliases” in the first place. However, at that time the big difficulty was to make everyone understand the difference between pointers (which require dereferencing) and references (which don’t), so the important thing was that it was called something other than “pointer”, and not so much specifically what term to use.
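As a rough illustration of the alias reading, here is a small C++ sketch (the function name demoReferenceIsAlias is made up). It assumes nothing beyond standard C++: a reference shares the address of the thing it names, and assigning to it copies a value into that thing rather than making the reference name something else.

```cpp
#include <cassert>

// Hypothetical demo: a reference is a second name for an existing object.
void demoReferenceIsAlias() {
    int a = 1;
    int b = 2;
    int& r = a;        // r is another name for a
    assert(&r == &a);  // same object, same address

    r = b;             // copies b's value into a; r is not rebound to b
    assert(a == 2);
    assert(&r == &a);  // r still names a

    b = 3;             // changing b afterwards does not affect a or r
    assert(a == 2 && r == 2);
}
```

So the assignment in the question's changeNumberReference changes the value of i; it never makes the reference refer to replacement.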
Java doesn’t have pointers, and is proud of it, so using “pointer” as a term was no option. However, the “references” that it does have behave quite a bit as C++’s pointers do when you pass them around – the big difference is that you can’t do the nastier low-level operations (casting, adding…) on them, but they result in exactly the same semantics when you pass around handles to entities that are identical vs. entities that merely happen to be equal. Unfortunately, the term “pointer” carries so many negative low-level associations that it’s unlikely ever to be accepted by the Java community.
The result is that both languages use the same vague term for two rather different things, both of which might profit from a more specific name, but neither of which is likely to be replaced any time soon. Natural language, too, can be frustrating sometimes!
A reference is a thing that refers to another thing. Various languages have ascribed a more specific meaning to the word “reference”, usually something like “a pointer without all the bad aspects”. C++ ascribes a specific meaning, as do Java and Perl.
In C++, references are more like aliases (which can be implemented via a pointer). This allows pass by reference or out arguments.
In Java, references are pointers except that this isn’t a reified concept of the language: every variable of an object type holds a reference, while primitive types like numbers are stored directly. They don’t want to say “pointer” because there is no pointer arithmetic and there are no reified pointers in Java, but they want to make clear that when you pass an Object as an argument, the object is not copied. This also isn’t pass by reference, but something more like pass by sharing.
It largely goes back to Algol 68, and partly to a reaction against the way C defines pointers.
Algol 68 defined a concept called a reference. It was pretty much the same as (for one example) a pointer in Pascal. It was a cell that contained either NIL or the address of some other cell of some specified type. You could assign to a reference, so a reference could refer to one cell at one time, and a different cell after being reassigned. It did not, however, support anything analogous to C or C++ pointer arithmetic.
At least as Algol 68 defined things, however, references were a fairly clean concept that was fairly clumsy to use in practice. Most variable definitions were actually of references, so they defined a shorthand notation to keep it from getting completely out of hand, but any more than trivial use could get clumsy pretty quickly anyway.
For example, a declaration like INT j := 7; was really treated by the compiler as a declaration like REF INT j = NEW LOC INT := 7. So, what you declared in Algol 68 was normally a reference, which was then initialized to refer to something that was allocated on the heap, and that was (optionally) initialized to contain some specified value. Unlike Java, however, they at least tried to let you maintain sane syntax for that instead of constantly having things like foo bar = new foo(); or trying to tell fibs about “our pointers aren’t pointers.”
Pascal and most of its descendants (both direct and…spiritual) renamed the Algol 68 reference concept to “pointer” but kept the concept itself essentially the same: a pointer was a variable that held either nil, or the address of something you allocated on the heap (i.e., at least as originally defined by Jensen and Wirth, there was no “address-of” operator, so there was no way for a pointer to refer to a normally defined variable). Despite being “pointers”, no pointer arithmetic was supported.
C and C++ added a few twists to that. First, they do have an address-of operator, so a pointer can refer not only to something allocated on the heap, but to any variable, regardless of how it’s allocated. Second, they define arithmetic on pointers–and define array subscripts as basically just a shorthand notation for pointer arithmetic, so pointer arithmetic is ubiquitous (next to unavoidable) in most C and C++.
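Both twists are easy to see in a few lines. The sketch below uses made-up function names and plain standard C++; it shows a pointer to an ordinary local variable obtained with the address-of operator, and an array subscript agreeing with the equivalent pointer arithmetic.

```cpp
#include <cassert>

// Hypothetical demo: a pointer may refer to a normally defined variable.
void demoAddressOf() {
    int local = 7;      // not heap-allocated
    int* p = &local;    // address-of operator
    *p = 8;
    assert(local == 8);
}

// Hypothetical demo: a[i] is shorthand for *(a + i).
void demoSubscriptIsPointerArithmetic() {
    int a[4] = {10, 20, 30, 40};
    for (int i = 0; i < 4; ++i) {
        assert(a[i] == *(a + i));  // same value
        assert(&a[i] == a + i);    // same element
    }
}
```

Neither operation has a counterpart for Java references or for the Pascal pointers described above.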
When Java was being invented, Sun apparently thought “Java doesn’t have pointers” was a simpler, cleaner marketing message than: “nearly everything in Java is a pointer, but these pointers are mostly like Pascal’s instead of C’s.” Since they’d decided “pointer” wasn’t an acceptable term, they needed something else, and dredged up “reference” instead, even though their references were subtly (and in some cases not so subtly) different from Algol 68 references.
Though it came out somewhat differently, C++ was stuck with roughly the same problem: the word “pointer” was already known and understood, so they needed a different word for this different thing they were adding that referred to something else, but was otherwise quite a bit different from what people understood “pointer” to mean. So, even though it’s also noticeably different from an Algol 68 reference, they re-used the term “reference” as well.
The notion of a ‘reference type’ is a generic one; it can express both a pointer and a reference (in C++ terms), as opposed to a ‘value type’.
Java’s references are really pointers without the syntactic overhead of -> and * dereferencing.
If you look into the JVM’s implementation in C++, they’re really bare pointers.
Also, C# has a notion of references similar to Java’s, but it also has pointers and ‘ref’ qualifiers on function parameters, which let you pass value types by reference and avoid copying, just like & in C++.