Every competent Java programmer knows that you need to use String.equals() to compare a string, rather than == because == checks for reference equality.
When I’m dealing with strings, most of the time I’m checking for value equality rather than reference equality. It seems to me that it would be more intuitive if the language allowed string values to be compared by just using ==.
As a comparison, C#’s == operator checks for value equality for strings. And if you really need to check for reference equality, you can use String.ReferenceEquals.
Another important point is that Strings are immutable, so there is no harm to be done by allowing this feature.
Is there any particular reason why this isn’t implemented in Java?
16
I guess it’s just consistency, or “principle of least astonishment”. String is an object, so it would be surprising if it were treated differently than other objects.
At the time when Java came out (~1995), merely having something like String was a total luxury to most programmers, who were accustomed to representing strings as null-terminated arrays. String’s behavior is now what it was back then, and that’s good; subtly changing the behavior later on could have surprising, undesired effects in working programs.
As a side note, you could use String.intern() to get a canonical (interned) representation of the string, after which comparisons could be made with ==. Interning takes some time, but after that, comparisons will be really fast.
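A minimal sketch of what that looks like in practice. The class name InternDemo and the helper runtimeCopy are made up for illustration; only String.intern() itself comes from the answer above.

```java
final class InternDemo {
    // Hypothetical helper: forces a String object distinct from any literal.
    static String runtimeCopy(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        String literal = "hello";
        String copy = runtimeCopy("hello");

        System.out.println(copy == literal);          // false: two different objects
        System.out.println(copy.equals(literal));     // true: same characters
        System.out.println(copy.intern() == literal); // true: intern() returns the canonical instance
    }
}
```

Note that == only becomes safe if every string involved in the comparison has been interned, which is easy to get wrong.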
Addition: contrary to what some answers suggest, it’s not about supporting operator overloading. The + operator (concatenation) works on Strings even though Java doesn’t support operator overloading; it’s simply handled as a special case in the compiler, resolving to StringBuilder.append(). Similarly, == could have been handled as a special case.
Then why astonish with a special case for + but not for ==? Because + simply doesn’t compile when applied to non-String objects, so that’s quickly apparent. The different behavior of == would be much less apparent and thus much more astonishing when it hits you.
10
James Gosling, the creator of Java, explained it this way back in July 2000:
I left out operator overloading as a fairly personal
choice because I had seen too many people abuse it in C++. I’ve spent
a lot of time in the past five to six years surveying people about
operator overloading and it’s really fascinating, because you get the
community broken into three pieces: Probably about 20 to 30 percent of
the population think of operator overloading as the spawn of the
devil; somebody has done something with operator overloading that has
just really ticked them off, because they’ve used like + for list
insertion and it makes life really, really confusing. A lot of that
problem stems from the fact that there are only about half a dozen
operators you can sensibly overload, and yet there are thousands or
millions of operators that people would like to define — so you have
to pick, and often the choices conflict with your sense of intuition.
14
Consistency within the language. Having an operator that acts differently can be surprising to the programmer. Java doesn’t allow users to overload operators – therefore reference equality is the only reasonable meaning for == between objects.
Within Java:
- Between numeric types, == compares numeric equality
- Between boolean types, == compares boolean equality
- Between reference types, == compares reference equality
  - Use .equals(Object o) to compare values
That’s it. Simple rule and simple to identify what you want. This is all covered in section 15.21 of the JLS. It comprises three subsections that are easy to understand, implement, and reason about.
Once you allow overloading of ==, the exact behavior is no longer something where you can look to the JLS, put your finger on a specific item, and say “that’s how it works”; the code can become difficult to reason about. The exact behavior of == may be surprising to a user. Every time you see it, you have to go back and check to see what it actually means.
Since Java doesn’t allow for overloading of operators, one needs a way to have a value equality test whose base definition can be overridden. Thus, it was mandated by these design choices. == in Java tests numeric equality for numeric types, boolean equality for boolean types, and reference equality for everything else (which can override .equals(Object o) to do whatever they want for value equality).
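As a rough illustration of that split, here is a sketch of a small value class. GridPoint is a hypothetical class invented for this example; the point is only that == stays fixed by the language while equals is whatever the class defines.

```java
import java.util.Objects;

// Hypothetical value class, not part of any library.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Value equality is defined by the class itself.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;               // same object, trivially equal
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    // Must be consistent with equals.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        System.out.println(a == b);      // false: reference equality, fixed by the language
        System.out.println(a.equals(b)); // true: value equality, defined by the class
    }
}
```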
This is not an issue of “is there a use case for a particular consequence of this design decision” but rather “this is a design decision to facilitate these other things, this is a consequence of it.”
String interning is one such example of this. According to JLS 3.10.5, all string literals are interned. Other strings are interned if one invokes .intern() on them. That "foo" == "foo" is true is a consequence of design decisions made to minimize the memory footprint taken up by String literals. Beyond that, String interning is something at the JVM level that has a little bit of exposure to the user but, in the overwhelming majority of cases, should not be something that concerns the programmer (and use cases for programmers weren’t high on the designers’ list when considering this feature).
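To make the literal case concrete, here is a small sketch. LiteralDemo and concatAtRuntime are names invented for this example. It relies on two things I believe the JLS specifies: constant expressions such as "fo" + "o" are folded at compile time and interned like literals, while concatenation of non-constant values produces a newly created String.

```java
final class LiteralDemo {
    // Hypothetical helper: the operands are not compile-time constants here,
    // so the concatenation happens at run time and yields a new String.
    static String concatAtRuntime(String prefix, String suffix) {
        return prefix + suffix;
    }

    public static void main(String[] args) {
        String a = "foo";
        String b = "fo" + "o";                 // constant expression, folded by the compiler
        String c = concatAtRuntime("fo", "o"); // built at run time

        System.out.println(a == b);      // true: both refer to the interned literal
        System.out.println(a == c);      // false: c is a separate object
        System.out.println(a.equals(c)); // true: same contents
    }
}
```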
People will point out that + and += are overloaded for String. However, that is neither here nor there. It remains the case that if == had a value equality meaning for String (and only String), one would need a different method (that only exists in String) for reference equality. Furthermore, this would needlessly complicate methods that take Object and expect == to behave one way and .equals() to behave another, requiring users to special-case all those methods for String.
The addition of autoboxing/unboxing of primitive wrappers (e.g. java.lang.Integer, etc.) muddles things a bit because comparing an int with an Integer (or vice-versa) with == will indeed compare the integer values of each of those things, but if both variables are of type Integer then using == may give an unexpected answer where the integer values are equal but the references are different, yielding a false comparison.
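A short sketch of that muddle. BoxingDemo and its two helper methods are hypothetical names for this example. The one subtlety I am relying on is that boxing values in the range -128 to 127 is specified to reuse cached Integer objects; outside that range the result depends on the JVM, so no output is claimed for that line.

```java
final class BoxingDemo {
    // int vs Integer: the Integer operand is unboxed, so values are compared.
    static boolean mixedCompare(int primitive, Integer boxed) {
        return primitive == boxed;
    }

    // Integer vs Integer: no unboxing happens, so references are compared.
    static boolean referenceCompare(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(mixedCompare(1000, 1000));    // true: numeric comparison
        System.out.println(referenceCompare(100, 100));  // true: small values are boxed from a cache
        // Usually false on a default JVM, but it depends on the Integer cache size:
        System.out.println(referenceCompare(1000, 1000));
        System.out.println(Integer.valueOf(1000).equals(1000)); // true: value equality
    }
}
```

The safe habit is to use .equals() (or unbox explicitly) whenever both operands are wrapper types.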
The consistent contract for == on Objects is that it is reference equality only and that .equals(Object o) exists for all objects which should test for value equality. Complicating this complicates far too many things.
9
Java doesn’t support operator overloading, which means == only applies to primitive types or references. Anything else requires invocation of a method. Why the designers did this is a question only they can answer. If I had to guess, it’s probably because operator overloading brings complexity they weren’t interested in adding.
I’m no expert in C#, but the designers of that language appear to have set it up such that every primitive is a struct and every struct is an object. Because C# allows operator overloading, that arrangement makes it very easy for any class, not just String, to make itself work in the “expected” way with any operator. C++ allows the same thing.
15
This has been made different in other languages.
In Object Pascal (Delphi/Free Pascal) and C#, the equality operator is defined to compare values, not references, when operating on strings.
Particularly in Pascal, string is a primitive type (one of the things I really love about Pascal; getting a NullReferenceException just because of an uninitialized string is simply irritating) and has copy-on-write semantics, thus making string operations very cheap most of the time (in other words, the cost is only noticeable once you start concatenating multi-megabyte strings).
So, it’s a language design decision for Java. When they designed the language they followed the C++ way (like std::string), so strings are objects, which is IMHO a hack to compensate for C lacking a real string type, instead of making strings a primitive (which they are).
As for the reason why, I can only speculate that they did it to make things easy on their side, by not coding an exception for strings into the compiler’s handling of the operator.
3
In Java, there is no user-defined operator overloading whatsoever, and that’s why the comparison operators are only defined by the language itself, for the primitive types and for references.
The String class is not a primitive, thus it does not have an overload for == and uses the default of comparing object references.
Note that no Java release, including Java 7 and 8, makes the compiler treat str1 == str2 as str1.equals(str2); == on two String references still compares references.
11
Java seems to have been designed to uphold a fundamental rule that the == operator should be legal any time one operand can be converted to the type of the other, and should compare the result of such conversion with the non-converted operand.
This rule is hardly unique to Java, but it has some far-reaching (and IMHO unfortunate) effects on the design of other type-related aspects of the language. It would have been cleaner to specify the behaviors of == with regard to particular combinations of operand types, and forbid combinations of types X and Y where x1==y1 and x2==y1 wouldn’t imply x1==x2, but languages seldom do that [under that philosophy, double1 == long1 would either have to indicate whether double1 is not an exact representation of long1, or else refuse to compile; int1==Integer1 should be forbidden, but there should be a convenient and efficient non-throwing means of testing whether an object is a boxed integer with particular value (comparison with something that isn’t a boxed integer should simply return false)].
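To show the kind of non-transitivity being described, here is a small sketch using long and double. MixedEqualityDemo and longEqualsDouble are names made up for this example. It relies on the fact that a double cannot represent 2^53 + 1 exactly, so converting that long to double rounds it to 2^53.

```java
final class MixedEqualityDemo {
    // The long operand is converted to double before the comparison.
    static boolean longEqualsDouble(long l, double d) {
        return l == d;
    }

    public static void main(String[] args) {
        long x1 = 1L << 53;        // 2^53, exactly representable as a double
        long x2 = (1L << 53) + 1;  // 2^53 + 1, rounds to 2^53 when converted to double
        double y1 = (double) x1;

        System.out.println(longEqualsDouble(x1, y1)); // true
        System.out.println(longEqualsDouble(x2, y1)); // true: x2 is rounded before comparing
        System.out.println(x1 == x2);                 // false: yet both "equal" y1
    }
}
```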
With regard to applying the == operator to strings, if Java had forbidden direct comparisons between operands of type String and Object, it could have pretty well avoided surprises in the behavior of ==, but there’s no behavior it could implement for such comparisons that wouldn’t be astonishing. Having two string references kept in type Object behave differently from references kept in type String would have been far less astonishing than having either of those behaviors differ from that of a legal mixed-type comparison. If String1==Object1 is legal, that would imply that the only way for the behaviors of String1==String2 and Object1==Object2 to match String1==Object1 would be for them to match each other.
9
In general, there is very good reason to want to be able to test if two object references point to the same object. I’ve had plenty of times that I’ve written
Address oldAddress;
Address newAddress;
... populate values ...
if (oldAddress==newAddress)
... etc ...
I may or may not have an equals function in such cases. If I do, the equals function may compare the entire contents of both objects. Often it just compares some identifier. “A and B are references to the same object” and “A and B are two different objects with the same content” are, of course, two very different ideas.
It’s probably true that for immutable objects, like Strings, this is less of an issue. With immutable objects, we tend to think of the object and the value as being the same thing. Well, when I say “we”, I mean “I”, at least.
Integer three=new Integer(3);
Integer triangle=new Integer(3);
if (three==triangle) ...
Of course that returns false, but I can see someone thinking it should be true.
But once you say that == compares reference handles and not contents for Objects in general, making a special case for Strings would be potentially confusing. As someone else on here said, what if you wanted to compare the handles of two String objects? Would there be some special function to do it only for Strings?
And what about …
Object x=new String("foo");
Object y=new String("foo");
if (x==y) ...
Is that false because they are two different objects, or true because they are Strings whose contents are equal?
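For reference, this is how Java as it stands today answers that question. The sketch below uses a hypothetical class StaticTypeDemo with two helpers: == on Object-typed variables compares references, while equals is dispatched on the run-time type and so still reaches String.equals.

```java
final class StaticTypeDemo {
    // Reference comparison; the run-time type of the arguments is irrelevant.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // Virtual call: dispatches to String.equals when a is a String at run time.
    static boolean sameValue(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Object x = new String("foo");
        Object y = new String("foo");

        System.out.println(sameReference(x, y)); // false: two different objects
        System.out.println(sameValue(x, y));     // true: String.equals compares contents
    }
}
```

A special case for String in == would have to pick one of these two answers for Object-typed operands, and either choice would surprise someone.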
So yes, I understand how programmers get confused by this. I’ve done it myself, I mean I wrote if myString == "foo" when I meant if myString.equals("foo"). But short of redesigning the meaning of the == operator for all objects, I don’t see how to address it.
6
This is a valid question for Strings, and not only for strings, but also for other immutable Objects representing some “value”, e.g. Double, BigInteger, and even InetAddress.
For making the == operator usable with Strings and other value-classes, I see three alternatives:
- Have the compiler know about all these value-classes and the way to compare their contents. If it were just a handful of classes from the java.lang package, I’d consider that, but that doesn’t cover cases like InetAddress.
- Allow operator overloading so a class defines its == comparison behaviour.
- Remove the public constructors and have static methods returning instances from a pool, always returning the same instance for the same value. To avoid memory leaks, you need something like SoftReferences in the pool, which didn’t exist in Java 1.0. And now, to maintain compatibility, the String() constructors can’t be removed anymore.
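The third alternative could look roughly like the sketch below. PooledLabel is a hypothetical class invented for this example, not anything from the JDK, and the pool is deliberately simplified: it never removes map entries whose SoftReference has been cleared, which a real implementation would need to do.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical value class with no public constructor.
final class PooledLabel {
    // Pool of canonical instances; soft references let the GC reclaim unused ones.
    private static final Map<String, SoftReference<PooledLabel>> POOL = new HashMap<>();

    private final String text;

    private PooledLabel(String text) {
        this.text = text;
    }

    // Static factory: returns the same instance for the same value.
    static synchronized PooledLabel of(String text) {
        SoftReference<PooledLabel> ref = POOL.get(text);
        PooledLabel cached = (ref == null) ? null : ref.get();
        if (cached == null) {
            cached = new PooledLabel(text);
            POOL.put(text, new SoftReference<>(cached));
        }
        return cached;
    }

    String text() {
        return text;
    }

    public static void main(String[] args) {
        PooledLabel a = PooledLabel.of(new String("home"));
        PooledLabel b = PooledLabel.of("home");
        System.out.println(a == b); // true: both come from the pool
    }
}
```

With such a class, == is a valid value comparison, but only because every instance is forced through the factory.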
The only thing that could still be done today would be to introduce operator overloading, and personally I wouldn’t like Java to go that route.
To me, code readability is most important, and a Java programmer knows that the operators have a fixed meaning, defined in the language specification, whereas methods are defined by some code, and their meaning has to be looked up in the method’s Javadoc. I’d like to stay with that distinction even if it means that String comparisons will not be able to use the == operator.
There’s just one aspect of Java’s comparisons that’s annoying to me: the effect of auto-boxing and -unboxing. It hides the distinction between the primitive and the wrapper type. But when you compare them with ==, they are VERY different.
int i=123456;
Integer j=123456;
Integer k=123456;
System.out.println(i==j); // true or false? Do you know without reading the specs?
System.out.println(j==k); // true or false? Do you know without reading the specs?