I saw a conference talk by Herb Sutter where he encourages every C++ programmer to use auto.
I had to read C# code some time ago where var was extensively used and the code was very hard to understand: every time var was used I had to check the return type of the right side. Sometimes more than once, because I forgot the type of the variable after a while!
I know the compiler knows the type and I don’t have to write it, but it is widely accepted that we should write code for programmers, not for compilers.
I also know that it is easier to write:
auto x = GetX();
than:
someWeirdTemplate<someOtherVeryLongNameType, ...>::someOtherLongType x = GetX();
But this is written only once, and the GetX() return type is checked many times to understand what type x has.
This made me wonder: does auto make C++ code harder to understand?
16
Short answer: More completely, my current opinion on auto is that you should use auto by default unless you explicitly want a conversion. (Slightly more precisely, “… unless you want to explicitly commit to a type, which nearly always is because you want a conversion.”)
Longer answer and rationale:
Write an explicit type (rather than auto) only when you really want to explicitly commit to a type, which nearly always means you want to explicitly get a conversion to that type. Off the top of my head, I recall two main cases:
- (Common) The initializer_list surprise that auto x = { 1 }; deduces initializer_list. If you don’t want initializer_list, say the type, i.e., explicitly ask for a conversion.
- (Rare) The expression templates case, such as that auto x = matrix1 * matrix2 + matrix3; captures a helper or proxy type not meant to be visible to the programmer. In many cases, it’s fine and benign to capture that type, but sometimes if you really want it to collapse and do the computation then say the type, i.e., again explicitly ask for a conversion.
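To make the initializer_list case concrete, here is a minimal sketch. The function names are made up for illustration; the only thing it relies on is the standard deduction rule for auto with a braced initializer.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper: shows what auto deduces from a braced initializer.
inline std::size_t braces_with_auto() {
    auto x = { 1 };  // deduces std::initializer_list<int>, not int
    static_assert(std::is_same<decltype(x), std::initializer_list<int>>::value,
                  "auto with '= { ... }' deduces an initializer_list");
    return x.size();  // number of elements in the list
}

// Hypothetical helper: saying the type asks for the conversion explicitly.
inline int braces_with_explicit_type() {
    int y = { 1 };  // a plain int holding 1
    return y;
}
```

Calling braces_with_auto() gives the element count of the list, while braces_with_explicit_type() gives the value itself, which is the difference the bullet above describes.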
Routinely use auto by default otherwise, because using auto avoids pitfalls and makes your code more correct, more maintainable and robust, and more efficient. Roughly in order from most to least important, in the spirit of “write for clarity and correctness first”:
- Correctness: Using auto guarantees you’ll get the right type. As the saying goes, if you repeat yourself (say the type redundantly), you can and will lie (get it wrong). Here’s a usual example: void f( const vector<int>& v ) { for( /*…* At this point, if you write the iterator’s type explicitly, you want to remember to write const_iterator (did you?), whereas auto just gets it right.
- Maintainability and robustness: Using auto makes your code more robust in the face of change, because when the expression’s type changes, auto will continue to resolve to the correct type. If you instead commit to an explicit type, changing the expression’s type will inject silent conversions when the new type converts to the old type, or needless build breaks when the new type still works-like the old type but doesn’t convert to the old type (for example, when you change a map to an unordered_map, which is always fine if you aren’t relying on order, using auto for your iterators you’ll seamlessly switch from map<>::iterator to unordered_map<>::iterator, but using map<>::iterator everywhere explicitly means you’ll be wasting your valuable time on a mechanical code fix ripple, unless an intern is walking by and you can foist off the boring work on them).
- Performance: Because auto guarantees no implicit conversion will happen, it guarantees better performance by default. If instead you say the type, and it requires a conversion, you will often silently get a conversion whether you expected it or not.
- Usability: Using auto is your only good option for hard-to-spell and unutterable types, such as lambdas and template helpers, short of resorting to repetitive decltype expressions or less-efficient indirections like std::function.
- Convenience: And, yes, auto is less typing. I mention that last for completeness because it’s a common reason to like it, but it’s not the biggest reason to use it.
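One well-known instance of the silent conversion mentioned under Performance is iterating over a std::map with a hand-written element type. The sketch below is only an illustration with made-up function names: a map's element type is std::pair<const Key, T>, so spelling it without the const makes each iteration bind the reference to a converted temporary, while const auto& refers to the stored element.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: true if the loop variable is the element stored in the map.
inline bool explicit_type_refers_to_element(const std::map<std::string, int>& m) {
    assert(!m.empty());
    // The real element type is std::pair<const std::string, int>. This spelling
    // drops the const, so a temporary pair is created and the reference binds to it.
    for (const std::pair<std::string, int>& entry : m) {
        return static_cast<const void*>(&entry) ==
               static_cast<const void*>(&*m.begin());
    }
    return false;
}

// Hypothetical helper: same check, but the element type is deduced.
inline bool auto_refers_to_element(const std::map<std::string, int>& m) {
    assert(!m.empty());
    for (const auto& entry : m) {  // deduces the exact element type, no conversion
        return static_cast<const void*>(&entry) ==
               static_cast<const void*>(&*m.begin());
    }
    return false;
}
```

For a non-empty map the first function reports that the loop variable is not the stored element (it is a copy), and the second reports that it is.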
Hence: Prefer to say auto by default. It offers so much simplicity and performance and clarity goodness that you’re only hurting yourself (and your code’s future maintainers) if you don’t. Only commit to an explicit type when you really mean it, which nearly always means you want an explicit conversion.
Yes, there is (now) a GotW about this.
14
It’s a case-by-case situation.
It sometimes makes code harder to understand, sometimes not. Take, for instance:
void foo(const std::map<int, std::string>& x)
{
for ( auto it = x.begin() ; it != x.end() ; it++ )
{
//....
}
}
is definitely easy to understand and definitely easier to write than the actual iterator declaration.
I’ve been using C++ for a while now, but I can guarantee that I’d get a compiler error at my first shot at this because I’d forget about the const_iterator and would initially go for the iterator… 🙂
I’d use it for cases like this, but not where it actually obfuscates the type (like your situation), but this is purely subjective.
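As a small check of the const_iterator point, the sketch below (count_entries is a made-up name) asserts at compile time what auto deduces when begin() is called on a const map:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical helper: walks a const map with an auto-deduced iterator.
inline int count_entries(const std::map<int, std::string>& x) {
    int n = 0;
    for (auto it = x.begin(); it != x.end(); ++it) {
        // x is const, so begin() returns a const_iterator and auto picks that up.
        static_assert(
            std::is_same<decltype(it),
                         std::map<int, std::string>::const_iterator>::value,
            "auto deduces const_iterator for a const map");
        ++n;
    }
    assert(n == static_cast<int>(x.size()));
    return n;
}
```

Writing std::map<int, std::string>::iterator it = x.begin(); in the same place would not compile, which is the error described above.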
11
Look at it another way. Do you write:
std::cout << (foo() + bar()) << "\n";
or:
// it is important to know the types of these values
int f = foo();
size_t b = bar();
size_t total = f + b;
std::cout << total << "\n";
Sometimes it doesn’t help to spell the type out explicitly.
The decision whether you need to mention the type isn’t the same as the decision whether you want to split the code across multiple statements by defining intermediate variables. In C++03 the two were linked; you can think of auto as a way to separate them.
Sometimes making the types explicit can be useful:
// seems legit
if (foo() < bar()) { ... }
vs.
// ah, there's something tricky going on here, a mixed comparison
if ((unsigned int)foo() < bar()) { ... }
In cases where you declare a variable, using auto lets the type go unspoken just as it is in many expressions. You should probably try to decide for yourself when that helps readability and when it hinders.
You can argue that mixing signed and unsigned types is a mistake to begin with (indeed, some argue further that one should not use unsigned types at all). The reason it’s arguably a mistake is that it makes the types of the operands vitally important because of the different behaviour. If it’s a bad thing to need to know the types of your values, then it probably isn’t also a bad thing not to need to know them. So provided the code isn’t already confusing for other reasons, that makes auto OK, right? 😉
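To show why the mixed comparison above is "tricky", here is a minimal sketch. The bodies of foo() and bar() are invented for the example (the originals are not shown); the behaviour relies only on the usual arithmetic conversions, and compilers typically warn about the first comparison.

```cpp
#include <cassert>

inline int foo() { return -1; }           // hypothetical: a signed result
inline unsigned int bar() { return 1u; }  // hypothetical: an unsigned result

// "Seems legit": the int is silently converted to unsigned int before comparing,
// so -1 becomes a very large value.
inline bool implicit_comparison() { return foo() < bar(); }

// Same comparison with the conversion spelled out, as in the second snippet.
inline bool explicit_unsigned_comparison() {
    return static_cast<unsigned int>(foo()) < bar();
}

// Comparing as signed values instead gives the arithmetically expected answer.
inline bool explicit_signed_comparison() {
    return foo() < static_cast<int>(bar());
}
```

With these stand-in values the first two functions return false and the third returns true, so the spelled-out cast does not change the result; it only makes the conversion visible to the reader.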
Particularly when writing generic code, there are cases where the actual type of a variable shouldn’t be important; what matters is that it satisfies the required interface. So auto provides a level of abstraction where you ignore the type (but of course the compiler doesn’t, it knows). Working at a suitable level of abstraction can help readability quite a lot; working at the “wrong” level makes reading the code a slog.
2
There are several reasons why I dislike auto for general use:
- You can refactor code without modifying it. Yes, this is one of the things often listed as a benefit of using auto. Just change the return type of a function, and if all of the code that calls it uses auto, no additional effort is required! You hit compile, it builds – 0 warnings, 0 errors – and you just go ahead and check your code in without having to deal with the mess of looking through and potentially modifying the 80 places the function is used.
  But wait, is that really a good idea? What if the type mattered in a half dozen of those use-cases, and now that code actually behaves differently? This applies especially to low-level hardware programming. This can also implicitly break “Encapsulation” by modifying not just the input values, but the behavior itself of the private implementation of other classes that call the function.
  1a. I’m a believer in the concept of “Self-Documenting Code”. The reasoning behind self-documenting code is that comments tend to become out-of-date, no longer reflecting what the code is doing, whereas the code itself – if written in an explicit manner – is self-explanatory, always stays up to date on its intent, and won’t leave you confused with stale comments. If types can be changed without needing to modify the code itself though, then the code/variables themselves can become stale. For example:
  auto bThreadOK = CheckThreadHealth();
  Except the problem is that CheckThreadHealth() at some point was refactored to return an enum value indicating the error status, if any, instead of a bool. But the person who made that change missed inspecting this particular line of code, and the compiler was of no help because it compiled without warnings or errors.
- You may never know what the actual types are. This is also often listed as a primary “Benefit” of auto. Why learn what a function is giving you, when you can just say, “It still compiles!”
  It even kind of works, probably. I say kind of works, because even though you’re making a copy of a 500 byte struct for every loop iteration, so you can inspect a single value on it, the code is still completely functional. So even your unit tests don’t help you realize that bad code is hiding behind that simple and innocent-looking auto. Most other people scanning through the file won’t notice it on first glance either.
  This also can be made worse if you don’t know what the type is, but you choose a variable name that makes a wrong assumption about what it is, in effect achieving the same result as in “1a”, but from the very beginning rather than post-refactor.
- Typing the code when initially writing it isn’t the most time consuming part of programming. Yes, auto makes writing some code faster initially. As a disclaimer, I do type > 100 WPM, so maybe it doesn’t bother me as much as others. But if all I had to do was write new code all day, I’d be a happy camper. The most time consuming part of programming is diagnosing hard-to-reproduce, edge-case bugs in the code, which often result from subtle non-obvious problems, such as the kind overuse of auto is likely to introduce (reference vs. copy, signed vs. unsigned, float vs. int, Boolean vs. pointer, etc.).
- It seems obvious to me that auto was introduced primarily as a workaround for terrible syntax with standard library template types. Rather than try to fix the template syntax that people are already familiar with – which may also be nearly impossible to do because of all of the existing code it could break – add in a keyword that basically hides the problem. Essentially what you might call a “hack”.
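Here is a minimal sketch of the bThreadOK scenario from point 1a. Everything in it is an assumption made for illustration: the ThreadStatus enum, its values, and the body of CheckThreadHealth() are invented, since the original function is not shown.

```cpp
#include <cassert>

// Hypothetical status type after the refactoring: 0 means healthy.
enum ThreadStatus { THREAD_OK = 0, THREAD_STALLED = 1, THREAD_DEAD = 2 };

// Hypothetical: imagine this used to return bool (true = healthy).
inline ThreadStatus CheckThreadHealth() { return THREAD_OK; }

// Caller written against the old bool-returning version and never revisited.
inline bool caller_thinks_thread_is_ok() {
    auto bThreadOK = CheckThreadHealth();  // now deduces ThreadStatus, not bool
    // An unscoped enum converts to bool implicitly, so this still compiles,
    // but THREAD_OK (0) is treated as false.
    if (bThreadOK) {
        return true;
    }
    return false;
}
```

With these stand-in definitions the healthy thread is reported as not OK, and nothing fails to compile. If the status type were a scoped enum (enum class), the condition would not compile, so the change would at least be noticed.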
I actually don’t have any disagreement with the use of auto with Standard Library containers. It’s obviously what the keyword was created for in C++11, and functions in the standard library are not likely to fundamentally change in purpose (or type for that matter), making auto relatively safe to use. At the same time, some people feel that C++ doesn’t look like C++ anymore. I would be cautious about using it with your own code and interfaces that may be more volatile, and potentially subject to more fundamental changes.
Another useful application of auto that enhances the capability of the language is creating temporaries in type-agnostic macros. This is something you couldn’t really do before, but you may do it now.
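A rough sketch of what such a type-agnostic macro could look like follows. SWAP_VALUES and swap_tmp_ are made-up names, and in real code std::swap would usually be the better tool; the point is only that the temporary needs no spelled-out type.

```cpp
#include <cassert>

// Hypothetical macro: swaps two lvalues of the same copyable type.
// The temporary's type is deduced, so the macro works for any such type.
#define SWAP_VALUES(a, b)         \
    do {                          \
        auto swap_tmp_ = (a);     \
        (a) = (b);                \
        (b) = swap_tmp_;          \
    } while (0)
```

For example, with int x = 1, y = 2; the statement SWAP_VALUES(x, y); leaves x holding 2 and y holding 1, and the same macro works unchanged for doubles.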
4
IMO, you’re looking at this pretty much in reverse.
It’s not a matter of auto leading to code that’s unreadable or even less readable. It’s a matter of (hoping that) having an explicit type for the return value will make up for the fact that it’s (apparently) not clear what type would be returned by some particular function.
At least in my opinion, if you have a function whose return type isn’t immediately obvious, that’s your problem right there. What the function does should be obvious from its name, and the type of the return value should be obvious from what it does. If not, that’s the real source of the problem.
If there’s a problem here, it’s not with auto. It’s with the rest of the code, and chances are pretty good that the explicit type is just enough of a band-aid to keep you from seeing and/or fixing the core problem. Once you’ve fixed that real problem, readability of the code using auto will generally be just fine.
I suppose in fairness I should add: I’ve dealt with a few cases where such things weren’t nearly as obvious as you’d like, and fixing the problem was fairly untenable as well. Just for one example, I did some consulting for a company a couple years ago that had previously merged with another company. They ended up with a code base that was more “shoved together” than really merged. The constituent programs had started out using different (but quite similar) libraries for similar purposes, and though they were working to merge things more cleanly, they still did. In a fair number of cases, the only way to guess what type would be returned by a given function was to know where that function had originated.
Even in such a case, you can help make quite a few things clearer. In that case, all the code started out in the global namespace. Simply moving a fair amount into some namespaces eliminated the name clashes and eased type-tracking quite a bit as well.
5
Yes, it makes it easier to know the type of your variable if you don’t use auto. The question is: do you need to know the type of your variable to read the code? Sometimes the answer will be yes, sometimes no. For example, when getting an iterator from a std::vector<int>, do you need to know that it’s a std::vector<int>::iterator or would auto iterator = ...; suffice? Everything that anybody would want to do with an iterator is given by the fact it’s an iterator – it just doesn’t matter what the type is specifically.
Use auto in those situations when it doesn’t make your code harder to read.
1
Personally I use auto only when it’s absolutely obvious for the programmer what it is.
Example 1
std::map <KeyClass, ValueClass> m;
// ...
auto I = m.find (something); // OK, find returns an iterator, everyone knows that
Example 2
MyClass myObj;
auto ret = myObj.FindRecord (something); // NOT OK, everyone needs to go and check what FindRecord returns
8
Many good answers so far, but to focus on the original question, I do think Herb goes too far in his advice to use auto liberally. Your example is one case where using auto obviously hurts readability. Some people insist it is a non-issue with modern IDEs where you can hover over a variable and see the type, but I disagree: even people that always use an IDE sometimes need to look at snippets of code in isolation (think of code reviews, for instance) and an IDE won’t help.
Bottom line: use auto when it helps, e.g. for iterators in for loops. Don’t use it when it makes the reader struggle to find out the type.
This question solicits opinion, which will vary from programmer to programmer, but I would say no. In fact in many cases just the opposite, auto can help to make code easier to understand by allowing the programmer to focus on the logic rather than the minutiae.
This is especially true in the face of complex template types. Here is a simplified & contrived example. Which is easier to understand?
for( std::map<std::pair<Foo,Bar>, std::pair<Baz, Bot>, std::less<BazBot>>::const_iterator it = things_.begin(); it != things_.end(); ++it )
.. or…
for( auto it = things_.begin(); it != things_.end(); ++it )
Some would say the second is easier to understand, others may say the first. Yet others might say that a gratuitous use of auto may contribute to a dumbing-down of the programmers that use it, but that’s another story.
8
I’m quite surprised that no one pointed out yet that auto helps if there is no clear type. In this case, you either work around this problem by using a #define or a typedef in a template to find the actual usable type (and this is sometimes not trivial), or you just use auto.
Suppose you have a function that returns something with a platform-specific type:
#ifdef PLATFORM1
__int256 getStuff();
#else //PLATFORM2
__int128 getStuff();
#endif
Which usage would you prefer?
#ifdef PLATFORM1
__int256 stuff = getStuff();
#else
__int128 stuff = getStuff();
#endif
or just simply
auto stuff = getStuff();
Sure, you can write
#define StuffType (...)
as well somewhere, but does
StuffType stuff = getStuff();
actually tell anything more about stuff’s type? It tells you it is whatever is returned from there, but that is exactly what auto says. This is just redundant – ‘stuff’ is written 3 times here – which in my opinion makes it less readable than the ‘auto’ version.
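If a named type is still wanted somewhere (for a struct member, say), one alternative to a #define is to derive the alias from the function itself with decltype. This is only a sketch: the getStuff() bodies and the long long / int return types are stand-ins for the platform-specific types above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the platform-specific declarations in the text.
#ifdef PLATFORM1
inline long long getStuff() { return 42; }
#else
inline int getStuff() { return 42; }
#endif

// The alias follows whatever getStuff() returns, so no #ifdef is repeated here.
using StuffType = decltype(getStuff());

// Hypothetical helper: both spellings give a variable of the same type.
inline bool auto_matches_alias() {
    auto stuff = getStuff();
    StuffType same = getStuff();
    static_assert(std::is_same<decltype(stuff), StuffType>::value,
                  "auto and the decltype alias name the same type");
    return stuff == same;
}
```

Either way the reader learns no more than "whatever getStuff() returns", which is the redundancy argument made above.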
5
Readability is subjective; you’ll need to look at the situation and decide what’s best.
As you pointed out, without auto, long declarations can produce a lot of clutter. But as you also pointed out, short declarations can remove type information which may be valuable.
On top of this, I’d also add this: be sure you’re looking at readability and not writeability. Code that’s easy to write is generally not easy to read and vice versa. For instance, if I were writing, I’d prefer auto. If I were reading, maybe the longer declarations.
Then there’s consistency; how important is that to you? Would you want auto in some parts and explicit declarations in others, or one consistent method throughout?
1
I have two guidelines:
- If the type of the variable is obvious, tedious to write, or hard to determine, use auto.
auto range = 10.0f; // Obvious
for (auto i = collection.cbegin(); i != collection.cend(); ++i) // Tedious if collection type is really long
template <typename T> ...
T t;
auto result = t.get(); // Hard to determine as get() might return various stuff
- Use an explicit type if you need a specific conversion, or if the result type is not obvious and might cause confusion.
class B : public A {};
A* foo = new B(); // 'Convert'
class Factory { public: int foo(); float bar(); };
Factory factory;
int f = factory.foo(); // Not obvious
1
Yes.
It decreases verbosity but the common misunderstanding is that verbosity decreases readability.
This is only true if you consider readability to be aesthetic rather than your actual ability to interpret code – which is not increased by using auto.
In the most commonly cited example, vector iterators, it may appear on the surface that using auto increases the readability of your code.
On the other hand, you don’t always know what the auto keyword will give you. You have to follow the same logical path as the compiler does to make that internal reconstruction, and a lot of the time, particularly with iterators, you’re going to make the wrong assumptions.
At the end of the day ‘auto’ sacrifices readability of code and clarity, for syntactic and aesthetic ‘cleanliness’ (which is only necessary because iterators have needlessly convoluted syntax) and the ability to maybe type 10 fewer characters on any given line.
It’s not worth the risk, or the effort involved long-term.
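One concrete case where the deduced type differs from the natural assumption is std::vector<bool>, whose operator[] returns a proxy object rather than a bool. The sketch below uses made-up function names to contrast the two spellings.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: writes through an auto-deduced element.
inline bool flag_after_write_through_auto() {
    std::vector<bool> flags(1, false);
    auto first = flags[0];  // std::vector<bool>::reference, a proxy, not bool
    static_assert(!std::is_same<decltype(first), bool>::value,
                  "auto does not deduce bool here");
    first = true;           // writes through the proxy into the vector
    return flags[0];
}

// Hypothetical helper: same code with the type spelled out.
inline bool flag_after_write_through_bool() {
    std::vector<bool> flags(1, false);
    bool first = flags[0];  // explicit type converts the proxy to a real bool
    first = true;           // changes only the local copy
    (void)first;
    return flags[0];
}
```

The first function returns true because the assignment reaches the vector; the second returns false because only the local bool changed.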
I will take the point of less readable code as an advantage, and will encourage the programmer to use it more and more. Why? Clearly, if the code using auto is difficult to read, then it will be difficult to write too. The programmer is forced to use meaningful variable names to do his/her job better.
Maybe in the beginning the programmer will not write meaningful variable names. But eventually, while fixing bugs, or in code review when he/she has to explain the code to others, or in the not-so-near future when explaining the code to maintenance people, the programmer will realize the mistake and will use meaningful variable names in future.
1