I and a friend are having an argument on what is the better technique to use in the following scenario:
An application we need to test has a number of components (lets call them Eggs); each component is a bag of predefined sub-components. These components are the input to our application, there is no way we can change/annotate these components at their source.
Now, we have a set of validators for each sub-component. It is not programatically possible to associate an Egg with its validators, so it has to be done manually for each new Egg (there are a finite number of eggs). Once the mapping is done, we can run each validator for each Egg to see if the Egg is good or rotten.
Now, my way of doing it is to have a config file which consists of the mapping between each Egg and its validators. Eg:
{
Egg1: [validator1, validator2]
Egg2: [validator5, validator7, validator2]
}
Now we can read this file and use reflections to select the appropriate validator (from the list of all classes implementing IValidator) for each Egg.
My friend is averse to using reflections, and prefers to add a new C# class for each Egg everytime a new type of Egg is added. The class represents the Egg and runs the appropriate validators for that Egg.
Which way is better? I prefer the reflections method because everytime a new type of Egg shows up all we need to do is edit the config.
To clarify, an Egg is a selection from predefined list of sub-components. Assume that there will be no more sub-components, and hence there is no need for new validators. We cannot enumerate all possible Eggs because there are too many.
8
Which way is better?
Based on what you’ve described, adding a new class is way better.
why?
Because editing a config is just as hard/painful as editing code, except it’s tons more error prone and requires more work to setup – and nobody coming into your company knows what the hell this config is, or what format it’s in, or where to find it.
Now, if you could use reflection to inspect the Eggs and see what sub-components they have (and you don’t have significant performance requirements), that seems like an even better solution…
1
I think the other answers are being a bit too simplistic.
My approach would probable by to put it in configuration in code. I.e. keep the same data structure you were thinking for your configuration file, but don’t put it in an external configuration file. Instead, build it in code.
In a python world, I’d do something like (I don’t know c# well enough to know what it looks like there.):
VALIDATORS_BY_TYPE = {
"Egg1": [Validator1(), Validator2(), Validator3()],
"Egg2": [Validator2(), Validator4(), Validator5()]
}
def validate_egg(egg):
for validator in VALIDATORS_BY_TYPE[egg.type]:
validator.validate(egg)
What do the alternatives look like? Let’s suppose you wrote classes for each type of egg. You’d get something like:
class Egg(object):
def __init__(self, egg):
self.egg = egg
class Egg1(Egg):
def validate(self):
validator1.validate(self.egg)
validator2.validate(self.egg)
validator2.validate(self.egg)
class Egg2(Egg):
def validate(self):
validator2.validate(self.egg)
validator4.validate(self.egg)
validator5.validate(self.egg)
def MakeEgg(raw_egg):
if raw_egg.type == 'Egg1':
return Egg1(raw_egg)
elif raw_egg.type == 'Egg2':
return Egg2(raw_egg)
else:
raise "Unknown egg type!"
def validate_egg(raw_egg):
MakeEgg(raw_egg).validate()
That’s considerably more code. Its also going to be much harder to change. Suppose that we decide that we’d actually like to accumulate the errors for each error type. If the first validator fails with the code above, we’d just bail. But instead, suppose that we want to add all the errors into a list and throw them together. With the configuration method, you just have to change validate_egg
. If you wrote classes, you’ll have to rewrite all of the Egg
subclasses to add accumulation logic.
What about using an external configuration file? An external configuration file adds complexity. Here, I can construct my data structure directly, complete with references to validator objects. If I wanted to implement a configuration file I’d have to do something like:
validator_config = json.load(config_files.get_path("validator_config_file.json"))
VALIDATORS_BY_TYPE = {egg_type: [getattr(validators, name) for name in names]
for egg_type, names in validator_config}
And this is going to break down and get more complex if you start needing to add parameters to validators, or other metadata.
On the other hand, configuration files do have the advantage of being easier to manipulate programmatically. Let’s suppose that we want to give a description to the EggTypes, we’d need to rewrite our configuration block to be something like:
VALIDATORS_BY_TYPE = {
"Egg1": EggType("Scrambled", [Validator1(), Validator2(), Validator3()),
"Egg2": EggType("Sunny-side up", [Validator2(), Validator4(), Validator5()])
}
That’s going to be annoying rearrangement you’ll have to do by hand. On the other hand, writing a throwaway script that parses a json configuration file, manipulates it to the new format and outputting a replacement is straightforward.
However, you do need to be careful in adopting a configuration based approach. It is often difficult to come up with a configuration that handles all the cases. When you can’t handle a particular case through your configuration, the tendency is to add crazy hacks all over your code to handle it.
2