I am programming in Python. I have several complicated, hard-to-understand XML files that describe the structure of an election, and I am trying to write a Python wrapper that makes this data easy to access.
The structure of the entities I am trying to model is: there is an election. Elections have races. Races consist of candidates who get results in precincts. Precincts are organized into counties.
I first built this so that the Election class holds a list of Race objects, each of which holds lists of Candidate objects and Result objects. Now that I am starting to write unit tests, I find it hard to test the candidates or races in isolation, because they are coupled to the election.
The election has information about the races, which it feeds to each race’s constructor. The race has information about the candidates (i.e. what they are running for), which it feeds to their constructors.
I know that it is generally a good idea to have decoupled classes, but in this case there is a natural hierarchy in the domain, and I am wondering to what extent it makes sense for my code to reflect that hierarchy. The ultimate goal is to write out “results” in JSON (for a web app) that are much, much easier to parse than the XML. The results will change every few minutes as they come in (each time in a new election XML file). I’d love some design advice.
Premise 1: Many applications are strictly Data Processing in the 1960s definition. The Swiss Army Knife of Data Processing is the Relational Database model.
Lemma 1: Third Normal Form is an efficient way of representing data.
A Precinct has-a County, but no other relation cares about that. Make a mapping (dict) between Precinct and County which is unrelated to any other bindings.
The Election needs to have a collection of Races but it doesn’t need any other information.
A Race needs (Candidate, Office, Party?) but that is decoupled from the Vote.
A Vote needs (Race, Candidate, Precinct) but no more.
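To make the relations above concrete, here is a minimal sketch using frozen dataclasses. The field names, the string IDs, and the `count` field on `Vote` are my assumptions for illustration; they are not taken from the XML files in the question.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    name: str
    party: Optional[str] = None  # party is optional, mirroring the "Party?" above


@dataclass(frozen=True)
class Race:
    race_id: str
    office: str
    candidate_ids: Tuple[str, ...]  # references by ID, not nested objects


@dataclass(frozen=True)
class Vote:
    race_id: str
    candidate_id: str
    precinct_id: str
    count: int  # assumption: each row carries a tally for one precinct


@dataclass(frozen=True)
class Election:
    race_ids: Tuple[str, ...]  # the election only knows which races it contains


# Precinct -> County lives in its own mapping, unrelated to the other records.
precinct_to_county = {"P1": "Adams", "P2": "Adams", "P3": "Baker"}

# Each record can be built on its own; nothing needs an Election to exist first.
alice = Candidate("c1", "Alice", party="Independent")
mayor = Race("r1", "Mayor", candidate_ids=(alice.candidate_id,))
vote = Vote(mayor.race_id, alice.candidate_id, "P1", count=120)
election = Election(race_ids=(mayor.race_id,))
```

Because the records refer to each other only by ID, a `Candidate` or a `Vote` can be constructed in a test without building the rest of the election.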
I decoupled the data classes, oh noes! However, I did normalize the data, which makes composition into well-structured data much easier to think about, and that will yield more sensible JSON. Unless you are voting for hoopiest frood, you’ll probably want to store this stuff in a database, and it is already in the proper form for doing that.
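As a rough illustration of that composition step, the sketch below rolls flat vote rows up into a nested structure and serializes it with the standard `json` module. The row layout `(race, candidate, precinct, count)`, the sample data, and the function name `summarize` are all hypothetical.

```python
import json
from collections import defaultdict


def summarize(votes, precinct_to_county):
    """Roll flat (race, candidate, precinct, count) rows up to county totals."""
    totals = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for race, candidate, precinct, count in votes:
        county = precinct_to_county[precinct]  # the standalone mapping
        totals[race][candidate][county] += count
    return totals


# Hypothetical sample rows; real rows would come from the parsed XML.
votes = [
    ("mayor", "alice", "P1", 120),
    ("mayor", "alice", "P2", 80),
    ("mayor", "bob", "P1", 95),
    ("mayor", "bob", "P3", 40),
]
precinct_to_county = {"P1": "Adams", "P2": "Adams", "P3": "Baker"}

summary = summarize(votes, precinct_to_county)
# defaultdict is a dict subclass, so json.dumps can serialize it directly.
payload = json.dumps(summary, indent=2, sort_keys=True)
print(payload)
```

The nesting order (race, then candidate, then county) is a choice made at output time, so a different view for the web app only needs a different roll-up function, not a different data model.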
The design as you described it relies on a god-object, Election, and that’s rarely a good idea. If you find a class constructor needs a bunch of data to pass to a bunch of other constructors, you are on your way to a god-object. You’ll gain more flexibility and easier unit-testing by decoupling pieces of implementation.
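To show what the easier unit-testing might look like, here is a small sketch in which a candidate is parsed from its own XML fragment, with no Election or Race involved. The element and attribute names (`candidate`, `id`, `name`) are invented for the example, since the real schema is not shown in the question.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    name: str


def parse_candidate(element):
    """Build a Candidate from one XML element (hypothetical schema)."""
    return Candidate(
        candidate_id=element.get("id"),
        name=element.findtext("name"),
    )


def test_parse_candidate():
    # The test feeds in a tiny fragment instead of a whole election file.
    element = ET.fromstring('<candidate id="c1"><name>Alice</name></candidate>')
    assert parse_candidate(element) == Candidate("c1", "Alice")


test_parse_candidate()
```

The parsing function takes only the data it needs, so the test does not have to construct a parent object just to reach a candidate.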
Moral: Don’t ignore traditional approaches when faced with a traditional problem.