This is something that’s been bugging me for a bit now. In some cases you see code that is a series of overloads, but when you look at the actual implementation you realize they do logically different things. However writing them as overloads allows the caller to ignore this and get the same end result. But would it be more sound to name the methods more explicitly then to write them as overloads?
public void LoadWords(string filePath)
{
var lines = File.ReadAllLines(filePath).ToList();
LoadWords(lines);
}
public void LoadWords(IEnumerable<string> words)
{
// loads words into a List<string> based on some filters
}
Would these methods better serve future developers to be named as LoadWordsFromFile()
and LoadWordsFromEnumerable()
? It seems unnecessary to me, but if that is better what programming principle would apply here?
On the flip side it’d make it so you didn’t need to read the signatures to see exactly how you can load the words, which as Uncle Bob says would be a double take. But in general is this type of overloading to be avoided then?
2
I really wonder why the version with actual IO has been added to the class in the first place. What’s the point here? How is that the responsibility of the class?
Why not add a method to load the words from a zipped file, or from an .xml or a .json or a .doc or an SQL database or over http or ftp or whatever? Because it’s not the responsibility of the class, that’s why!
Particularly the file name based method adds a dependency against a global object. Which is anything but ideal if you ask me.
So I would argue that the programming principle that would apply here is composition.
loader.LoadWords(someList)
loader.LoadWords(File.ReadAllLines(somePath).ToList())
loader.LoadWords(Http.get(someUrl).split(NEWLINE))
- …
Tying together a particular method of loading the data and a particular method of processing it should not be done in the class itself, but in another module, whose responsibility it is to carry out just that composition. Could be an IoC container or something, but it doesn’t have to be as fancy as that.
13
This violates at least three principles of object-oriented programming, and potentially a fourth.
First, this violates the Principle of Least Astonishment. Let’s assume we are given documentation for only one of the methods, and we have to infer what the other does. First:
// Reads words from a file
public void LoadWords(string)
// ???
public void LoadWords(IEnumerable<string>)
It would be very easy to look at the first, and think that the second would read words from a list of files. That would be consistent behavior. It doesn’t, though. Now the reverse:
// ???
public void LoadWords(string)
// Loads the IEnumerable of strings into words
public void LoadWords(IEnumerable<string>)
It would now be very easy to look at the second, and think that the first would load a single word. Instead, I would get IO errors or potentially something totally unexpected (if the word happened to match a real file name) if I tried to use it as such. This method is hard to reason about because the behaviors are so different.
Second, this code violates the Law of Demeter. In a method, I now have to reach out and interact with the file system, which is a totally unrelated object at a completely different level of abstraction. Where before I was dealing with filtering and saving words, now I have to worry about file system code. I have to know how to get to the file system, open a file, read a file, close a file, tokenize the file contents, catch any exception any of those might incur, etc. This should not be this class’s responsibility, which also shows it is a violation of the Single Responsibility Principle.
Finally, we can also violate the Open-Closed Principle. Say we need to overload the method to read from a network stream. We have to open up this class to add the additional functionality. This also exacerbates the Demeter and SRP violations.
If the methods are named more explicitly, we start to leak out implementation details about our class. In this case, the reason is that the responsibilities should not lie in our class. We should push those implementation decisions out to the class’s clients. This allows them to make the decision, as they were already doing, but offers them far more flexibility and control, while alleviating your class of the burden of handling all those details internally.
back2dos’ answer shows how to accomplish this through client code.
Elaborating on @back2dos answer, there are two things going on in your code.
- loading words into a List based on some filters
- parsing words (in this case, from a file, splitting on new lines)
So I’d vote for two methods. Note – I’m a Java guy so I’m guessing about the types…
void filterWords(IEnumerable<string> rawwords) // note - why return void instead of List?
List<string> parseWords(InputStream in)
Whether these belong in separate classes or one class depends on how specific splitting on new lines is to this particular class. If that happens other places in your code (or you can easily imagine it happening in the future) that should be split off.
By using a stream instead of a filename in parseWords you gain some flexibility and abstraction (e.g., you could use an Http GET). Perhaps stream is not the right abstraction, if so, think about what is right.