I have searched Programmers and Stackoverflow and was not able to come up with a satisfying answer, even though I’m quite sure it must have been asked many times before. The only question I found has answers only dealing with readability. The code excerpts below are from actual C# code I was writing.
When dealing with a local variable that has some calculated value when a condition is true but a default value otherwise, do I:
First set the value, then change the value when the condition is true?
short version = 1; // <-- set the value here.
if (separator != -1)
{
version = Int16.Parse(filename.Substring(separator + 1));
filename = filename.Substring(0, separator);
}
Or not set the value, and add an else when the condition is false?
short version;
if (separator != -1)
{
version = Int16.Parse(filename.Substring(separator + 1));
filename = filename.Substring(0, separator);
}
else
version = 1; // <-- set the value here.
Could a programmer have wrong expectations when confronted with either solution (e.g. when the if-clause is quite big and the else-clause is way down, out of view)? Do you have any experiences where the difference mattered? Is one particular style enforced at your company, and why? Are there any technical reasons why one would be preferred over the other? The first one sets a variable twice, but I’m not sure whether this would matter in modern-day optimizing compilers.
There’s a school of thought that says a variable should not come into scope until you can assign it a valid value. In other words, the variable should never contain either a wrong value or a garbage value. This prevents the possibility of, for example, a maintainer later putting a call after the variable declaration but before the variable is properly initialized.
In your short example, it’s fairly obvious those statements are a group, but over time there is nothing preventing people from putting several unrelated statements in between.
That means both of your examples are wrong. The first way has a short period of time when it incorrectly holds the default value, and the second way has a short period of time when it incorrectly holds no value. The best solution from a readability and maintainability point of view is something like:
short version = getVersion(filename, separator);
This has the added bonus of conforming to the “functions should do exactly one thing” school of thought as well.
I would encapsulate the functionality in its own method and use a guard clause to take care of the special case:
private short getVersion(String filename, int separator)
{
if (separator == -1)
{
return 1;
}
return Int16.Parse(filename.Substring(separator + 1));
}
I left out the filename modification because that probably deserves its own method too:
private String removeVersion(String filename, int separator)
{
if (separator == -1)
{
return filename;
}
return filename.Substring(0, separator);
}
My C# is rusty but hopefully it’s correct enough to make sense.
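For what it's worth, here is a rough, self-contained sketch of how the two helpers above might be called together. The `;` separator character, the sample filename, and the `VersionDemo` class are assumptions made up for illustration (the question does not say how `separator` is computed), and the method names are switched to PascalCase only to follow the usual C# convention.

```csharp
using System;

public static class VersionDemo
{
    // Guard clause: no separator means the default version 1.
    public static short GetVersion(string filename, int separator)
    {
        if (separator == -1)
        {
            return 1;
        }
        return Int16.Parse(filename.Substring(separator + 1));
    }

    // Guard clause: no separator means the name is returned unchanged.
    public static string RemoveVersion(string filename, int separator)
    {
        if (separator == -1)
        {
            return filename;
        }
        return filename.Substring(0, separator);
    }

    public static void Main()
    {
        string filename = "report.txt;3";           // hypothetical input
        int separator = filename.LastIndexOf(';');  // assumed separator character

        // Both locals receive a valid value at the point of declaration.
        short version = GetVersion(filename, separator);
        string baseName = RemoveVersion(filename, separator);

        Console.WriteLine(baseName + " v" + version); // prints "report.txt v3"
    }
}
```

Neither `version` nor `baseName` ever exists in a half-initialized state, which is the point the earlier answer was making.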
You left out a third possibility:
short version;
if (separator == -1)
{
version = 1; // <-- set the value here.
}
else
{
version = Int16.Parse(filename.Substring(separator + 1));
filename = filename.Substring(0, separator);
}
As you note, the compiler really doesn’t care which way you go. The way I’ve put it here, at least you don’t have the separation that you were concerned about. The setting of version between true and false is obvious at a glance, the top one will always be nice and short, and it makes obvious the setting of version in each case–if that’s what you are worried about.
I think you’ll find (with the several votes to close) that it’s really just a matter of preference.
It is clearer to put the assignment inside an if-else block when it is relevant to the condition itself. Not only does this hint to the compiler that the assignment can be skipped when the condition fails, it also tells the programmer that the two things (the assignment and the condition) are related. Similarly, if the two are not related, separating them (assigning before or after the if block) is probably the more intuitive approach. Of course, this all depends on context and is case-specific. In general, though, keep related things together and be explicit with the logic.
When I go to the machine, which button should I press: Coffee or Tea? Well, it depends on what I want to drink: coffee or tea… 🙂
- When I want to express that the variable “var” should get the value “A” by default, but can be “B” or “C” in some cases, I initialize it to “A”.
- When the code is responsible for setting “var” to a specific value with no default, I leave it uninitialized; that will even give a nice compiler warning if a path is forgotten. Although I have never used this technique, the ‘final’ keyword (in Java) can ensure that the variable is assigned only once.
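In C#, at least, the second case is stricter than a warning: reading a local variable that is not definitely assigned on every path is a compile error (CS0165). Below is a minimal sketch of the "leave it uninitialized" style; the `DefiniteAssignmentDemo` class, the `ParseVersion` helper, and the `;` separator are assumptions for illustration only.

```csharp
using System;

public static class DefiniteAssignmentDemo
{
    public static short ParseVersion(string filename)
    {
        int separator = filename.LastIndexOf(';'); // assumed separator character

        short version; // deliberately left uninitialized
        if (separator != -1)
        {
            version = Int16.Parse(filename.Substring(separator + 1));
        }
        else
        {
            version = 1;
        }

        // If the else branch above were deleted, this line would no longer
        // compile: error CS0165, use of unassigned local variable 'version'.
        return version;
    }

    public static void Main()
    {
        Console.WriteLine(ParseVersion("report.txt;7")); // prints 7
        Console.WriteLine(ParseVersion("report.txt"));   // prints 1
    }
}
```

Initializing `version` to 1 up front would silence that check, so a forgotten path would quietly fall back to the default instead of being flagged.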
But what I suggest you never ever omit is the { } around ANY block (even if it “can be a single statement” by the language definition). I have already wasted too much time on things like “I thought it would be a single line” or “I did not know there would be another if-else around/inside this”, in my own code and in others’. Indentation means nothing to the compiler; it is only eye candy, even if your IDE supports it. When you create a block, tell the compiler about it… typing two more characters does not hurt that much.
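A small sketch of the kind of mistake meant here, modelled on the brace-less `else` in the question. The `BraceDemo` class, the log messages, and the stand-in value 7 are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class BraceDemo
{
    public static List<string> Run(int separator)
    {
        var log = new List<string>();
        short version;

        if (separator != -1)
        {
            version = 7; // stand-in for the parsed value
        }
        else
            version = 1;
            log.Add("using default version"); // indented like the else body, but runs on every call

        log.Add("version=" + version);
        return log;
    }

    public static void Main()
    {
        // A separator was "found", yet the default-version message is still logged.
        foreach (string line in Run(5))
        {
            Console.WriteLine(line);
        }
    }
}
```

With braces around the `else` body, the second statement would either be inside the block or visibly outside it, and the intent would be unambiguous.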
PLEASE SHARE MORE CODE – a complete class if it fits. Otherwise, there is no way to give you decent advice.
I do not know exactly what you are doing, so I will guess. Here is one way … what does the separator variable do? Ideally the separator would be passed in as a parameter, but then it does not make sense because if separator is -1, then the method should not have been called in the first place.
// Initialize some stuff inside of the constructor.
...
private void ParseVersionAndFileName()
{
if (this.Separator < 0)
{
Debug.Assert(this.Separator == -1, "Negative value but not -1? How?");
return;
}
this.Version = Int16.Parse(this.Filename.Substring(this.Separator + 1));
this.Filename = this.Filename.Substring(0, this.Separator);
}
public string Filename
{
get;
private set;
}
public int Version
{
get;
private set;
}
...
There is also the hardware architecture perspective. By initializing the variable first and avoiding the branch to the else clause, you avoid gratuitously polluting the branch prediction table. For your relatively trivial example, it doesn’t matter, because no matter what you do, you have to do the comparison and branch.