I have a method where all logic is performed inside a foreach loop that iterates over the method’s parameter:
public IEnumerable<TransformedNode> TransformNodes(IEnumerable<Node> nodes)
{
foreach(var node in nodes)
{
// yadda yadda yadda
yield return transformedNode;
}
}
In this case, sending in an empty collection results in an empty collection, but I’m wondering if that’s unwise.
My logic here is that if somebody is calling this method, then they intend to pass data in, and would only pass an empty collection to my method in erroneous circumstances.
Should I catch this behaviour and throw an exception for it, or is it best practice to return the empty collection?
17
Utility methods should not throw on empty collections. Your API clients would hate you for it.
A collection can be empty; a “collection-that-must-not-be-empty” is conceptually a much more difficult thing to work with.
Transforming an empty collection has an obvious outcome: the empty collection. (You may even save some garbage by returning the parameter itself.)
There are many circumstances in which a module maintains lists of stuff that may or may not be already filled with something. Having to check for emptiness before each and every call to transform
is annoying and has the potential to turn a simple, elegant algorithm into an ugly mess.
Utility methods should always strive to be liberal in their inputs and conservative in their outputs.
For all these reasons, for God’s sake, handle the empty collection correctly. Nothing is more exasperating than a helper module that thinks it knows what you want better than you do.
30
I see two important questions which determine the answer to this:
- Can your function return anything meaningful and logical when passed an empty collection (including null)?
- What is the general programming style in this applicaton/library/team? (Specifically, how FP are you?)
1. Meaningful return
Essentially, if you can return something meaningful, then do not throw an exception. Let the caller deal with the result. So if your function…
- Counts the number of elements in the collection, return 0. That was easy.
- Searches for elements which match particular criteria, return an empty collection. Please do not throw anything. The caller may have a huge collection of collections, some of which are empty, some not. The caller wants any matching elements in any collection. Exceptions only make the caller´s life harder.
- Is looking for the largest/smallest/best-fit-to-the-criteria in the list. Oops. Depending on the style question, you might throw an exception here or you could return null. I hate null (very much an FP person) but it is arguably more meaningful here and it lets you reserve exceptions for unexpected errors within your own code. If the caller doesn´t check for null, a fairly clear exception will result anyway. But you leave that to them.
- Asks for the nth item in the collection or the first/last n items. This is the best case for an exception yet and the one that is least likely to cause confusion and difficulty for the caller. A case can still be made for null if you and your team are used to checking for that, for all the reasons given in the previous point, but this is the most valid case yet for throwing a DudeYouKnowYouShouldCheckTheSizeFirst exception. If your style is more functional, then either null or read on to my Style answer.
In general, my FP bias tells me ¨return something meaningful¨ and null can have valid meaning in this case.
2. Style
Does your general code (or the project´s code or the team´s code) favour a functional style? If no, exceptions will be expected and dealt with. If yes, then consider returning an Option Type. With an option type, you either return a meaningful answer or None/Nothing. In my third example from above, Nothing would be a good answer in FP style. The fact that the function returns an Option type signals clearly to the caller than a meaningful answer may not be possible and that the caller should be prepared to deal with that. I feel it gives the caller more options (if you will forgive the pun).
F# is where all the cool .Net kids do this kind of thing but C# does support this style.
tl;dr
Keep exceptions for unexpected errors in your own code path, not entirely foreseeable (and legal) input from somebody else´s.
6
As ever, it depends.
Does it matter that the collection is empty?
Most collection-handling code would probably say “no”; a collection can have any number of items in it, including zero.
Now, if you have some sort of collection where it is “invalid” to have no items in it, then that’s a new requirement and you have to decide what to do about it.
Borrow some testing logic from the Database world: test for zero items, one item and two items. That caters for the most important cases (flushing out any badly-formed inner or cartesian join conditions).
5
As a matter of good design, accept as much variation in your inputs as practical or possible. Exceptions should only be thrown when (unacceptable input is presented OR unexpected errors happen while processing) AND the program cannot continue in a predictable way as a result.
In this case, it should be expected that an empty collection will be presented, and your code needs to handle it (which it already does). It would be a violation of all that is good if your code threw an exception here. This would be akin to multiplying 0 by 0 in mathematics. It is redundant, but absolutely necessary for it to work the way it does.
Now, on to the null collection argument. In this case, a null collection is a programming mistake: the programmer forgot to assign a variable. This is a case where an exception could be thrown, because you cannot meaningfully process that into an output, and attempting to do so would introduce unexpected behavior. This would be akin to dividing by zero in mathematics – it is completely meaningless.
3
The right solution is much harder to see when you are just looking at your function in isolation. Consider your function as one part of a larger problem. One possible solution to that example looks like this (in Scala):
input.split("\D")
.filterNot (_.isEmpty)
.map (_.toInt)
.filter (x => x >= 1000 && x <= 9999)
First, you split the string up by non-digits, filter out the empty strings, convert the strings to integers, then filter to keep only the four-digit numbers. Your function might be the map (_.toInt)
in the pipeline.
This code is fairly straightforward because each stage in the pipeline just handles an empty string or an empty collection. If you put an empty string in at the start, you get an empty list out at the end. You don’t have to stop and check for null
or an exception after every call.
Of course, that’s presuming an empty output list doesn’t have more than one meaning. If you need to differentiate between an empty output caused by empty input and one caused by the transform itself, that completely changes things.
1
This question is actually about exceptions. If you look at it that way and ignore the empty collection as an implementation detail, the answer is straightforward:
1) A method should throw an exception when it can’t proceed: either unable to do the designated task, or return the appropriaate value.
2) A method should catch an exception when it is able to proceed despite the failure.
So, your helper method should not be “helpful” and throw an exception unless it is unable to do it’s job with an empty collection. Let the caller determine whether the results can be handled.
Whether it returns an empty collection or null, is a bit more difficult but not much: nullable collections should be avoided if possible. The purpose of a nullable collection would be to indicate (as in SQL) that you don’t have the information — a collection of children for example might be null if you don’t know if someone has any, but you don’t know that they don’t. But if that is important for some reason, it’s probably worth an extra variable to track it.
The method is named TransformNodes
. In case of an empty collection as input,
getting back an empty collection is natural and intuitive,
and makes prefect mathematical sense.
If the method was named Max
and designed to return the maximum element, then it would be natural to throw NoSuchElementException
on an empty collection, as the maximum of nothing makes no mathematical sense.
If the method was named JoinSqlColumnNames
and designed to return a string where the elements are joined by a comma to use in SQL queries, then it would make sense to throw IllegalArgumentException
on an empty collection, as the caller would eventually get an SQL error anyway if he used the string in an SQL query directly without further checks, and instead of checking for a returned empty string he really should have checked for empty collection instead.
1
Let’s step back and use a different example which computes the arithmetic mean of an array of values.
If the input array is empty (or null), can you reasonably fulfill the caller’s request? No. What are your options? Well, you could:
- present/return/throw an error. using your codebase’s convention for that class of error.
- document that a value such as zero will be returned
- document that a designated invalid value will be returned (e.g. NaN)
- document that a magic value will be returned (e.g. a min or max for the type or some hopefully-indicative value)
- declare the result is unspecified
- declare the action is undefined
- etc.
I say give them the error if they have given you invalid input and the request cannot be completed. I mean a hard error from day one so they understand your program’s requirements. After all, your function is not in a position to respond. If the operation could fail (e.g. copy a file), then your API should give them an error they can deal with.
So that can define how your library handles malformed requests and requests which may fail.
It’s very important for your code to be consistent in how it handles these classes of errors.
The next category is to decide how your library handles nonsense requests. Getting back to an example similar to yours — let’s use a function which determines whether a file exists at a path: bool FileExistsAtPath(String)
. If the client passes an empty string, how do you handle this scenario? How about an empty or null array passed to void SaveDocuments(Array<Document>)
? Decide for your library/codebase, and be consistent. I happen to consider these cases errors, and forbid clients from making nonsense requests by flagging them as errors (via an assertion). Some people will strongly resist that idea/action. I find this error detection very helpful. It is very good for locating issues in programs — with good locality to the offending program. Programs are much clearer and correct (consider evolution of your codebase), and don’t burn cycles within functions which do nothing. Code’s smaller/cleaner this way, and the checks are generally pushed out to the locations where the issue may be introduced.
9
As a rule of thumb, a standard function should be able to accept the broadest list of inputs and give feedback on it, there are many examples where programmers use functions in ways that the designer had not planned, with this in mind i believe the function should be able to accept not only empty collections but a wide range of input types and gracefully return feedback, be it even an error object from whatever operation was performed on the input…
4