This has become a large frustration with the codebase I’m currently working in; many of our variable names are short and undescriptive. I’m the only developer left on the project, and there isn’t documentation as to what most of them do, so I have to spend extra time tracking down what they represent.
For example, I was reading over some code that updates the definition of an optical surface. The variables set at the start were as follows:
double dR, dCV, dK, dDin, dDout, dRin, dRout;
dR = Convert.ToDouble(_tblAsphere.Rows[0].ItemArray.GetValue(1));
dCV = Convert.ToDouble(_tblAsphere.Rows[1].ItemArray.GetValue(1));
... and so on
Maybe it’s just me, but the names told me essentially nothing about what the variables represented, which made the code further down difficult to understand. All I knew was that each one was a value parsed out of a specific row of a specific table, somewhere. After some searching, I found out what they meant:
dR = radius
dCV = curvature
dK = conic constant
dDin = inner aperture
dDout = outer aperture
dRin = inner radius
dRout = outer radius
I renamed them to essentially what I have up there. It lengthens some lines, but I feel like that’s a fair trade-off. This kind of naming scheme is used throughout a lot of the code, however. I’m not sure if it’s an artifact from developers who learned by working with older systems, or if there’s a deeper reason behind it. Is there a good reason to name variables this way, or am I justified in updating them to more descriptive names as I come across them?
It appears that these variable names are based on the abbreviations you’d expect to find in a physics textbook working through various optics problems. This is one of the situations where short variable names are often preferable to longer ones. If you have physicists (or people accustomed to working the equations out by hand) who are used to common abbreviations like Rin, Rout, etc., the code will be much clearer with those abbreviations than it would be with longer variable names. It also makes it much easier to compare formulas from papers and textbooks with the code to make sure that the code is actually doing the computations properly.
Anyone that is familiar with optics will immediately recognize something like Rin as the inner radius (in a physics paper, the "in" would be rendered as a subscript), Rout as the outer radius, etc. Although they would almost certainly be able to mentally translate something like innerRadius to the more familiar nomenclature, doing so would make the code less clear to that person. It would make it more difficult to spot cases where a familiar formula had been coded incorrectly and it would make it more difficult to translate equations in code to and from the equations they would find in a paper or a textbook.
If you are the only person who ever looks at this code, you never need to translate between the code and a standard optics equation, and it is unlikely that a physicist will ever need to look at the code in the future, then perhaps it does make sense to refactor, because the benefit of the abbreviations no longer outweighs the cost. If this were new development, however, it would almost certainly make sense to use the same abbreviations in the code that you would find in the literature.
Variables with short lifetimes should have short names. As an example, nobody writes for(int arrayCounter = 0; arrayCounter < 10; arrayCounter++) { ... when for(int i ... will do.
As a general rule of thumb, the shorter a variable’s scope, the shorter its name should be. Loop counters are often only single letters, say i, j and k. Local variables are something like base, or from and to. Global variables are then somewhat more elaborate, for example EntityTablePointer.
Perhaps a rule like this isn’t being followed in the codebase you work with. It’s a good reason for doing some refactoring though!
The problem with the code is not the short names, but rather the lack of a comment which would explain the abbreviations, or point to some helpful materials about the formulas from which the variables are derived.
The code simply assumes familiarity with the problem domain.
That is fine, since problem-domain familiarity is probably required to understand and maintain the code, especially in the role of someone who “owns” it, so it behooves you to acquire the familiarity rather than to go around lengthening names.
But it would be nice if the code provided some hints to serve as springboards. Even a domain expert could forget that dK is a conic constant. Adding a little “cheat sheet” in a comment block wouldn’t hurt.
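As a rough illustration of such a cheat sheet, here is a minimal C# sketch. The table (tblAsphere), its sample values, and the idea that dK lives in row 2 are assumptions made up for the example; only the dR and dCV rows come from the question.

```csharp
using System;
using System.Data;

// Hypothetical stand-in for _tblAsphere: column 0 holds a label, column 1 the value.
var tblAsphere = new DataTable();
tblAsphere.Columns.Add("Name", typeof(string));
tblAsphere.Columns.Add("Value", typeof(double));
tblAsphere.Rows.Add("R", 50.0);   // sample values, not real optical data
tblAsphere.Rows.Add("CV", 0.02);
tblAsphere.Rows.Add("K", -1.0);   // row index for K is an assumption

// Cheat sheet for the abbreviations used below:
//   dR  = radius
//   dCV = curvature
//   dK  = conic constant
double dR  = Convert.ToDouble(tblAsphere.Rows[0].ItemArray.GetValue(1));
double dCV = Convert.ToDouble(tblAsphere.Rows[1].ItemArray.GetValue(1));
double dK  = Convert.ToDouble(tblAsphere.Rows[2].ItemArray.GetValue(1));

Console.WriteLine($"dR={dR}, dCV={dCV}, dK={dK}");
```

The short names stay as they are, so the formulas further down still match the textbook, and the comment block gives the next reader somewhere to start.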
For certain variables which are well-known in the problem domain, like the case you have here, terse variable names are reasonable. If I’m working on a game, I want my game entities to have position variables x and y, not horizontalPosition and verticalPosition. Likewise, for loop counters that don’t have any semantics beyond indexing, I expect to see i, j, k.
According to “Clean Code”:
Variable names should:
- Be intention revealing
- Avoid disinformation
- Make meaningful distinctions
- Be pronounceable
- Be searchable
Exceptions are the proverbial i, j, k, m, n used in for loops.
The variable names you, rightfully, complain about do none of the above. Those names are bad names.
Also, since every method must be short, prefixes that indicate either scope or type are no longer in use.
These names are better:
radius
curvature
conicConstant
innerAperture
outerAperture
innerRadius
outerRadius
A commenter says that this would be too complex with long variable names:
Short variable names don’t make it simple either:
fnZR = (r^2/fnR(1+Math.sqrt((1+k) * (r^2/R^2)))) + a[1]*r^2 + a[1]*r^4 + a[1]*r^6 ...;
The answer is long names and intermediate results until you get this at the end:
thisNiceThing = ( thisOKThing / thisGreatThing ) + thisAwsomeThing;
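To make the “long names and intermediate results” idea concrete, here is a sketch in C#. It assumes the formula being discussed is the usual sag equation for a conic surface with even polynomial terms, z(r) = r²/(R(1 + √(1 − (1 + k)r²/R²))) + a₁r² + a₂r⁴ + …; the function name, parameter names and sample values are hypothetical, not taken from the original code.

```csharp
using System;

// Sag (surface height) at a given radial distance, built up from named
// intermediate results instead of one long expression.
double SurfaceSag(double radialDistance, double radiusOfCurvature,
                  double conicConstant, double[] evenCoefficients)
{
    double radialDistanceSquared = radialDistance * radialDistance;
    double curvature = 1.0 / radiusOfCurvature;

    // Conic part: c*r^2 / (1 + sqrt(1 - (1 + k) * c^2 * r^2))
    double conicDiscriminant =
        1.0 - (1.0 + conicConstant) * curvature * curvature * radialDistanceSquared;
    double conicSag =
        curvature * radialDistanceSquared / (1.0 + Math.Sqrt(conicDiscriminant));

    // Polynomial part: a[0]*r^2 + a[1]*r^4 + a[2]*r^6 + ...
    double polynomialSag = 0.0;
    double radialPower = radialDistanceSquared;
    foreach (double coefficient in evenCoefficients)
    {
        polynomialSag += coefficient * radialPower;
        radialPower *= radialDistanceSquared;
    }

    return conicSag + polynomialSag;
}

// Sample call with made-up numbers.
double sag = SurfaceSag(1.0, 2.0, -1.0, new[] { 0.5 });
Console.WriteLine(sag);
```

Whether this reads better than the one-line version is exactly what the other answers dispute: someone who knows the equation by heart may prefer r, R and k, while someone new to the domain gets more from the named steps.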
There are two good reasons not to rename variables in legacy code.
(1) Unless you’re using an automated refactoring tool, the possibility of introducing bugs is high. Hence, “if it’s not broken, don’t fix it.”
(2) You will make it impossible to compare current versions with past versions in order to see what changed. This will make future maintenance of the code more difficult.
Is there a good reason to name variables this way, or am I justified in updating them to more descriptive names as I come across them?
The reason to use smaller names is if the original programmer finds them easier to work with. Presumably they have the right to find that to be the case, and the right not to have the same personal preferences that you have. Personally, I’d find…
dDin better than dDinnerAperture
dDout better than dDouterAperture
…if I were using them in long, complex calculations. The smaller the math expression, the easier it often is to see the whole thing at once. Though if that were the case, they might be better as dIn and dOut, so there wasn’t a repetitive D that could lead to an easy typo.
On the other hand, if you find them harder to work with, then knock yourself out and rename them to their longer form. Especially if you are responsible for that code.
Generally, I believe the rule should be that you can use extremely short variable names where you know that people who are “skilled in the art” of your particular code will immediately understand what each name refers to, and where the localized usage of the variables can easily be discerned from the context in which they are used. (You always have comments for the exceptions anyway.)
To expand on this: you shouldn’t go out of your way to obfuscate your variable names, but you can use abbreviations where you know that only people who understand the underlying concept of your code are likely to read it anyway.
To use a real-world example, I was recently creating a JavaScript class that would take a latitude and tell you the amount of sunlight that you would expect on a given date.
To create this Sundial class, I referred to perhaps half a dozen resources, the Astronomy Almanac, and snippets from other languages (PHP, Java, C, etc.).
Almost all of these used similar or identical abbreviations, which on the face of it mean absolutely nothing.
K, T, EPS, deltaPsi, eot, LM, RA
However, if you have knowledge of physics you can understand what they were. I wouldn’t expect anyone else to be touching this code, so why use verbose variable names?
julianTime, nutationOfEclipticalLongitudeExpressedInDegrees, equationOfTime, longitudeMean, rightAscension.
Additionally, when variable names are transitory, that is to say, they are only used to temporarily hold some value, it frequently doesn’t make sense to use a verbose version, especially when the context of the variable explains its purpose.
There absolutely is; often times a short variable name is all that is necessary.
In my case, I am doing waypoint navigation in my senior Robotics class, and we program our robots in KISS-C. We need variables for current and destination (x, y) co-ordinates, (x, y) distances, current and destination headings, as well as turn angles.
Especially in the case of x and y co-ordinates, a long variable name is completely unnecessary, and names such as xC (current x), yD (destination y), and pD (destination phi), suffice and are the easiest to understand in this case.
You might argue that these aren’t ‘descriptive variable names’ as programmer protocol would dictate, but since the names are based on a simple code (d = destination, c = current), a very simple comment at the outset is all the description they require.
Is there a good reason to name variables this way, or am I justified in updating them to more descriptive names as I come across them?
Usually, complex algorithms are implemented in MATLAB (or a similar language). What I have seen is people simply carrying the variable names over. That way, it is simple to compare implementations.
All the other answers are almost correct. These abbreviations can be found in math and physics, except that they do not begin with d (as in your example). Variables beginning with d are usually named that way to represent differentiation.
Coding guidelines generally advise against naming variables with a first letter that represents the type (as in your case), because modern IDEs make it so easy to look the type up.
I can think of a reason for variable names to be reasonably short.
Short names can be read with a shorter eye span, and hence with a shorter attention span.
For example, once I get used to the fact that svdb means “save to the database”, the rate of scanning the source code gets better, as I only have to quickly scan 4 characters instead of reading SaveToDatabase (14 characters; things get worse for more complex operation names). I say “scanning”, not “reading”, because scanning makes up a major part of analyzing source code.
When scanning through large amounts of source code, this can provide good performance gains.
Also, it just helps the lazy programmer to type out these short names when writing code.
Of course, all these “shorthands” are expected to be listed in some standard location in the source code.
To frame what @zxcdw said in a slightly different way, and elaborate on it in terms of approach:
Functions should be pure, brief and a perfect encapsulation of some functionality. This is black box logic: regardless of whether it has been sitting untouched for 40 years, it will continue to do the job it was designed for, because its interface (in and out) is sound, even if you know nothing of its internals.
This is the kind of code you want to write: this is code that lasts, and is simple to port.
Where necessary, compose functions out of other (inline) function calls, to keep code fine-grained.
Now, with a suitably descriptive function name (verbose if necessary!), we minimise any chance of misinterpretation of those shortened variable names, since the scope is so small.
Variable names should be as descriptive as possible to help the readability of the program. You experienced it yourself: you had a lot of trouble identifying what the program did because of the poor naming.
There is no good reason not to use descriptive names. They will help you and everyone else who works, or will work, on the project. Well, actually, there is a single valid use for short names: loop counters.
Surely this is what // comments are for?
If the assignments have descriptive comments, you get the best of both worlds: a description of each variable, and equations that are easily comparable with their textbook counterparts.
For interfaces (e.g., method signatures, function signatures) I tend to solve this by annotating the parameter declarations. For C/C++ this decorates the .h file as well as the code of the implementation.
I do the same for variable declarations where the usage of the variable is not obvious from the context and the naming. (This applies in languages that don’t have strong typing also.)
There are many things that we don’t want to clog the variable name with. Is the angle in radians or degrees, is there some tolerance on precision or range, etc. This information can provide valuable assertions about characteristics that must be dealt with correctly.
I am not religious about it. I am simply interested in clarity and in making sure that I know what something is the next time my forgetful self visits the code. And anyone looking over my shoulder has what they need to know to see where something is off and what is essential (the last being critical for proper maintenance).
Is there an excuse for excessively short variable names?
First: naming a variable for energy e while calculating formulas such as E=mc² is NOT excessively short naming. Using symbols as an argument for short names is not valid.
This question was rather interesting to me, and I can only think of one excuse and that is money.
Say, for instance, you’re writing JavaScript for a client who knows that the file is to be downloaded many times a second.
It would be cheaper and make the experience of the users better if the file (in number of bytes) is as small as possible.
(Just to keep the example ‘realistic’, you were not allowed to use a minifier tool, why? Security issues, no external tools allowed to touch the code base.)
I notice that the other answers don’t mention the use of Hungarian Notation. This is orthogonal to the length debate, but relevant to naming schemes in general.
double dR, dCV, dK, dDin, dDout, dRin, dRout;
The “d” at the beginning of all these variables is meant to indicate that they’re doubles; but the language is enforcing this anyway. Despite the terseness of these names, they’re up to 50% redundant!
If we’re going to use a naming convention to reduce errors, we’re much better off encoding information which isn’t being checked by the language. For example, no language will complain about dK + dR in the above code, even though it’s meaningless to add a dimensionless number to a length.
A good way to prevent such errors is to use stronger types; however, if we’re going to use doubles then a more appropriate naming scheme might be:
// Dimensions:
// l = length
// rl = reciprocal length
double lR, rlCV, K, lDin, lDout, lRin, lRout;
The language will still allow us to write K + lR, but now the names give us a hint that this might be incorrect.
This is the difference between Systems Hungarian (generally bad) and Apps Hungarian (possibly good)
http://en.wikipedia.org/wiki/Hungarian_notation
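For comparison, the “stronger types” route mentioned above could look roughly like the following C# sketch. The Length type, its Millimetres property and the sample values are hypothetical and only for illustration; the point is that adding a length to a dimensionless number stops compiling, rather than merely looking suspicious.

```csharp
using System;

var radius = new Length(50.0);
var innerRadius = new Length(5.0);
double conicConstant = -1.0;   // dimensionless, so it stays a plain double

Length sum = radius + innerRadius;          // fine: length + length
// Length bad = radius + conicConstant;     // does not compile: no operator + for (Length, double)

Console.WriteLine(sum.Millimetres);
Console.WriteLine(conicConstant);

// Hypothetical wrapper type; the choice of millimetres as the unit is an assumption.
readonly struct Length
{
    public double Millimetres { get; }

    public Length(double millimetres) => Millimetres = millimetres;

    // Only length + length is defined, so mixing in a bare double is rejected by the compiler.
    public static Length operator +(Length a, Length b) =>
        new Length(a.Millimetres + b.Millimetres);
}
```

This costs more ceremony than a prefix such as l or rl, which is probably why a naming convention can still be attractive when everything has to remain a double.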
The only case where illegibly short variable names are acceptable in modern software engineering is when they exist in a script and the throughput of that script (over a network generally) is important. Even then, keep the script with long names in source control, and minify the script in production.