I often get to positions in my code where I find myself checking a specific condition over and over again.
I want to give you a small example: suppose there is a text file which contains lines starting with “a”, lines starting with “b” and other lines and I actually only want to work with the first two sort of lines. My code would look something like this (using python, but read it as pseudocode):
# ...
clear_lines() # removes every other line than those starting with "a" or "b"
for line in lines:
if (line.startsWith("a")):
# do stuff
elif (line.startsWith("b")):
# magic
else:
# this else is redundant, I already made sure there is no else-case
# by using clear_lines()
# ...
You can imagine I won’t only check this condition here, but maybe also in other functions and so on.
Do you think of it as noise or does it add some value to my code?
8
This is an excedingly common practice and the way of dealing with it is via higher-order filters.
Essentially, you pass a function to the filter method, along with the list/sequence that you want to filter against and the resulting list/sequence contains only those elements that you want.
I’m unfamiliar with python syntax (though, it does contain such a function as seen in the link above), but in c#/f# it looks like this:
c#:
var linesWithAB = lines.Where(l => l.StartsWith("a") || l.StartsWith("b"));
foreach (var line in linesWithAB)
{
/* line is guaranteed to ONLY start with a or b */
}
f# (assumes ienumerable, otherwise List.filter would be used):
let linesWithAB = lines
|> Seq.filter (fun l -> l.StartsWith("a") || l.StartsWith("b"))
for line in linesWithAB do
/* line is guaranteed to ONLY start with a or b */
So, to be clear: if you use tried and tested code/patterns, it is bad style. That, and mutating the list in-memory the way you appear to via clear_lines() loses you thread safety and any hopes of parallelism that you could have had.
5
I recently had to implement a firmware programmer using the Motorola S-record format, very similar to what you describe. Since we had some time pressure, my first draft ignored redundancies and made simplifications based on the subset I actually needed to use in my application. It passed my tests easily, but failed hard as soon as someone else tried it. There was no clue what the problem was. It got all the way through but failed at the end.
So I had no choice but to implement all the redundant checks, in order to narrow down where the issue was. After that, it took me around two seconds to find the issue.
It took me maybe two hours extra to do it the right way, but wasted a day of other people’s time as well in troubleshooting. It’s very rare that a few processor cycles are worth a day of wasted troubleshooting.
That being said, where reading files are concerned, it’s often beneficial to design your software to work with reading it and processing it one line at a time, rather than reading the entire file into memory and processing it in memory. That way it will still work on very large files.
1
You can raise an exception in else
case. This way it’s not redundant. Exceptions aren things that aren’t supposed to happen but are checked anyway.
clear_lines() # removes every other line than those starting with "a" or "b"
for line in lines:
if (line.startsWith("a)):
# do stuff
if (line.startsWith("b")):
# magic
else:
throw BadLineException
# ...
4
In design by contract, one guesses each function must do its job as described in its documentation. So, each function has a list of pre-conditions, that is, conditions on the function’s inputs as well as post-conditions, that is, conditions of the function’s output.
The function must guarantee to its clients that, if the inputs respect the pre-conditions, then the output will be as described by the post-conditions. If at least one of the pre-conditions is not respected, the function can do what it wants (crash, return any result, …). Therefore pre and post-conditions are a semantic description of the function.
Thanks to contract, a function is sure its clients use it correctly and a client is sure the function does its job correctly.
Some languages handle contracts natively or through a dedicated framework. For the others, the best is to check the pre and post-conditions thanks to asserts, as @Lattyware said. But I would not call that defensive programming, since in my mind this concept is more focused on the protection against the (human) user’s inputs.
If you exploit contracts, you can avoid the redundantly checked condition since either the called function perfectly works and you don’t need the double check, or the called function is dysfunctional and the calling function can behave as it wants.
The harder part is then to define which function is responsible of what, and to strictly document these roles.
You actually don’t need the clear_lines() at the start. If the line is neither “a” or “b”, the conditionals simply won’t trigger. If you want to get rid of those lines then make the else into a clear_line(). As it stands you are doing two passes through your document. If you skip the clear_lines() at the beginning and do it as part of the foreach loop then you cut your processing time in half.
It’s not only bad style, it’s bad computationally.
1
If you actually want to do anything if you find an invalid string (output debug text for example) then I’d say that’s absolutely fine. A couple of extra lines and a few months down the line when it stops working for some unknown reason you can look at the output to find out why.
If, however, it’s safe to just ignore it, or you know for certain you will never get an invalid string then there’s no need for the extra branch.
Personally I’m always for putting in at least a trace output for any unexpected condition – it makes life much easier when you have a bug with output attached telling you exactly what went wrong.
… suppose there is a text file which contains lines starting with “a”, lines starting with “b” and other lines and I actually only want to work with the first two sort of lines. My code would look something like this (using python, but read it as pseudocode):
# ...
clear_lines() # removes every other line than those starting with "a" or "b"
for line in lines:
if ...
I hate if...then...else
constructions. I would avoid the whole issue:
process_lines_by_first_character (lines,
'a' => { |line| ... a code ... },
'b' => { |line| ... b code ... } )