I read code far more often than I write code, and I assume most programmers working on industrial software do the same. The advantage of type inference, I assume, is less verbosity and less code to write. But on the other hand, if you read code more often, you’ll probably want readable code.
The compiler infers the type; there are long-established algorithms for this. But the real question is: why would I, the programmer, want to infer the type of my variables when I read the code? Isn’t it faster for anyone to just read the type than to work out what type is there?
Edit: In conclusion, I do understand why it is useful. But among language features I put it in the same bucket as operator overloading – useful in some cases, but hurting readability if abused.
4
Let’s take a look at Java. In Java 8, local variables cannot have inferred types, which means I frequently have to spell out the type even when it is perfectly obvious to a human reader:
int x = 42; // yes I see it's an int, because it's a bloody integer literal!
// Why the hell do I have to spell the name twice?
SomeObjectFactory<OtherObject> obj = new SomeObjectFactory<>();
And sometimes it’s just plain annoying to spell out the whole type.
// this code walks through all entries in an "(int, int) -> SomeObject" table
// represented as two nested maps
// Why are there more types than actual code?
for (Map.Entry<Integer, Map<Integer, SomeObject<SomeObject, T>>> row : table.entrySet()) {
    Integer rowKey = row.getKey();
    Map<Integer, SomeObject<SomeObject, T>> rowValue = row.getValue();
    for (Map.Entry<Integer, SomeObject<SomeObject, T>> col : rowValue.entrySet()) {
        Integer colKey = col.getKey();
        SomeObject<SomeObject, T> colValue = col.getValue();
        doSomethingWith(rowKey, colKey, colValue);
    }
}
This verbose static typing gets in the way of me, the programmer. Most type annotations are repetitive line-filler, content-free regurgitations of what we already know. However, I do like static typing, as it can really help with discovering bugs, so using dynamic typing isn’t always a good answer. Type inference is the best of both worlds: I can omit the irrelevant types, but still be sure that my program (type-)checks out.
While type inference is really useful for local variables, it should not be used for public APIs which have to be unambiguously documented. And sometimes the types really are critical for understanding what’s going on in the code. In such cases, it would be foolish to rely on type inference alone.
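As a small, hedged illustration (the settings object and its requestTimeout method are hypothetical), here is a case where spelling out a local type carries information that inference would hide:
// The explicit type answers a question the reader would otherwise have to chase down:
// is the timeout a raw number of milliseconds, or a java.time.Duration?
Duration timeout = settings.requestTimeout();
// var timeout = settings.requestTimeout();   // legal with inference, but the representation is now invisible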
There are many languages that support type inference. For example:
- C++. The auto keyword triggers type inference. Without it, spelling out the types for lambdas or for entries in containers would be hell.
- C#. You can declare variables with var, which triggers a limited form of type inference. It still manages most cases where you want type inference. In certain places you can leave out the type completely (e.g. in lambdas).
- Haskell, and any language in the ML family. While the specific flavour of type inference used here is quite powerful, you still often see type annotations for functions, for two reasons: the first is documentation, and the second is a check that type inference actually found the types you expected. If there is a discrepancy, there’s likely some kind of bug.
And since this answer was originally written, type inference has become more popular. E.g. Java 10 has finally added C#-style inference. We’re also seeing more type systems on top of dynamic languages, e.g. TypeScript for JavaScript, or mypy for Python, which make heavy use of type inference in order to keep the overhead of type annotations manageable.
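To make that concrete, here is a sketch of the nested-map loop from above as it could be written with Java 10’s var (same hypothetical types and doSomethingWith method as before):
// Java 10+: the entry and value types are inferred, so the actual data flow stands out
for (var row : table.entrySet()) {
    var rowKey = row.getKey();                        // inferred as Integer
    for (var col : row.getValue().entrySet()) {
        doSomethingWith(rowKey, col.getKey(), col.getValue());
    }
}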
14
It’s true that code is read far more often than it is written. However, reading also takes time, and two screens of code are harder to navigate and read than one screen of code, so we need to prioritize, aiming for the best ratio of useful information to reading effort. This is a general UX principle: too much information at once overwhelms the reader and actually degrades the effectiveness of the interface.
And it is my experience that often, the exact type isn’t (that) important. Surely you sometimes nest expressions: x + y * z, monkey.eat(bananas.get(i)), factory.makeCar().drive(). Each of these contains sub-expressions that evaluate to a value whose type is not written out. Yet they are perfectly clear. We’re okay with leaving the type unstated because it’s easy enough to figure out from the context, and writing it out would do more harm than good (clutter the understanding of the data flow, take valuable screen and short-term memory space).
One reason not to nest expressions like there’s no tomorrow is that lines get long and the flow of values becomes unclear. Introducing a temporary variable helps with this: it imposes an order and gives a name to a partial result. However, not everything that benefits from these aspects also benefits from having its type spelled out:
user = db.get_poster(request.post['answer'])
name = db.get_display_name(user)
Does it matter whether user is an entity object, an integer, a string, or something else? For most purposes, it does not; it’s enough to know that it represents a user, comes from the HTTP request, and is used to fetch the name to display in the lower right corner of the answer.
And when it does matter, the author is free to write out the type. This is a freedom that must be used responsibly, but the same is true for everything else that can enhance readability (variable and function names, formatting, API design, white space).
And indeed, the convention in Haskell and ML (where everything can be inferred without extra effort) is to write out the types of non-local functions, and also of local variables and functions whenever appropriate. Only novices let every type be inferred.
3
I think type inference is quite important and should be supported in any modern language. We all develop in IDEs, and they can help a lot when you want to know the inferred type; only a few of us still hack in vi.
Think of the verbosity and ceremony in Java code, for instance:
Map<String,HashMap<String,String>> map = getMap();
You could say that’s fine, my IDE will help me, and that may be a valid point. However, some features wouldn’t exist at all without type inference, C# anonymous types for instance:
var person = new {Name="John Smith", Age = 105};
LINQ wouldn’t be as nice as it is now without the help of type inference; take Select, for instance:
var result = list.Select(c=> new {Name = c.Name.ToUpper(), Age = c.DOB - CurrentDate});
This anonymous type will be inferred neatly to the variable.
I dislike type inference on return types in Scala, because I think your point applies there: it should be clear what a function returns so we can use the API more fluently.
2
I think the answer to this is really simple: it saves reading and writing redundant information, particularly in object-oriented languages where you have a type on both sides of the equals sign.
Which also tells you when you should or should not use it — when the information isn’t redundant.
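A minimal Java sketch of that distinction (the repository object and its method are made up for illustration):
// Redundant: the type already appears on the right-hand side, so inference loses nothing
var names = new ArrayList<String>();
// Not redundant: the right-hand side does not reveal the type, so writing it out helps the reader
List<String> admins = repository.findAdminNames();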
2
Suppose one sees the code:
someBigLongGenericType variableName = someBigLongGenericType.someFactoryMethod();
If someBigLongGenericType is assignable from the return type of someFactoryMethod, how likely would someone reading the code be to notice if the types don’t match precisely, and how readily could someone who did notice the discrepancy recognize whether it was intentional or not?
By allowing inference, a language can suggest to someone who is reading code that when the type of a variable is explicitly stated, the person should try to find a reason for it. This in turn allows people who are reading code to better focus their efforts. If, by contrast, the vast majority of the times when a type is specified it happens to be exactly the same as what would have been inferred, then someone who is reading code may be less likely to notice the times that it is subtly different.
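For example (a small Java sketch, with a hypothetical factory), an explicit type then reads as a deliberate signal:
// Inferred: nothing special is going on; the variable simply has the factory’s return type
var parser = ParserFactory.createDefault();   // ParserFactory is hypothetical
// Explicit: the author deliberately chose the wider interface type, and the reader should notice
Collection<String> names = new ArrayList<>();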
I see that there are a number of fine answers already. Some of them I will be repeating, but sometimes you just want to put things in your own words. I will comment with some examples from C++, because that is the language with which I have the most familiarity.
What is necessary is never unwise. Type inference is necessary to make other language features practical. In C++ it is possible to have unutterable types.
struct {
    double x, y;
} p0 = { 0.0, 0.0 };
// there is no name for the type of p0
auto p1 = p0;
C++11 added lambdas, whose types are also unutterable.
auto sq = [](int x) {
    return x * x;
};
// there is no name for the type of sq
Type inference also underpins templates.
template <class x_t>
auto sq(x_t const& x)
{
    return x * x;
}
// x_t is not known until it is inferred from an expression
sq(2); // x_t is int
sq(2.0); // x_t is double
But your questions were “why would I, the programmer, want to infer the type of my variables when I read the code? Isn’t it faster for anyone to just read the type than to work out what type is there?”
Type inference removes redundancy. When it comes to reading code, it may sometimes be faster and easier to have redundant information in the code, but redundancy can overshadow the useful information. For example:
std::vector<int> v;
std::vector<int>::iterator i = v.begin();
It does not take much familiarity with the standard library for a C++ programmer to identify that i is an iterator from i = v.begin(), so the explicit type declaration is of limited value. By its presence it obscures details that are more important (such as that i points to the beginning of the vector). The fine answer by @amon provides an even better example of verbosity overshadowing important details. In contrast, using type inference gives greater prominence to the important details.
std::vector<int> v;
auto i = v.begin();
While reading code is important, it is not sufficient; at some point you will have to stop reading and start writing new code. Redundancy in code makes modifying code slower and harder. For example, say I have the following fragment of code:
std::vector<int> v;
std::vector<int>::iterator i = v.begin();
If I need to change the value type of the vector to double, the code becomes:
std::vector<double> v;
std::vector<double>::iterator i = v.begin();
In this case I have to modify the code in two places. Contrast this with type inference, where the original code is:
std::vector<int> v;
auto i = v.begin();
And the modified code:
std::vector<double> v;
auto i = v.begin();
Note that I now only have to change one line of code. Extrapolate this to a large program and type inference can propagate changes to types much more quickly than you can with an editor.
Redundancy in code creates the possibility of bugs. Any time your code depends on two pieces of information being kept equivalent, there is a possibility of mistakes. For example, there is an inconsistency between the two types in this statement, which is probably not intended:
int pi = 3.14159;
Redundancy makes intention harder to discern. In some cases type inference can be easier to read and understand because it is simpler than explicit type specification. Consider the fragment of code:
int y = sq(x);
In the case that sq(x) returns an int, it is not obvious whether y is an int because it is the return type of sq(x) or because it suits the statements that use y. If I change other code such that sq(x) no longer returns int, it is uncertain from that line alone whether the type of y should be updated. Contrast with the same code but using type inference:
auto y = sq(x);
In this case the intent is clear: y must be the same type as returned by sq(x). When the code changes the return type of sq(x), the type of y changes to match automatically.
In C++ there is a second reason why the above example is simpler with type inference: type inference cannot introduce an implicit type conversion. If the return type of sq(x) is not int, the compiler will silently insert an implicit conversion to int. If the return type of sq(x) is a complex type which defines operator int(), this hidden function call may be arbitrarily complex.
1