Given a mutable property, it generally makes sense to only hold/store that property in a single place. When the data needs to change you only need to update a single field, and you can completely avoid bugs where the different fields get out of sync.
An extremely simple example could be:
The ‘right’ approach:
    class Owner {
        String name;
    }
    class Dog {
        Owner owner;
        String getOwnersName() {
            return owner.name;
        }
    }
The ‘wrong’ approach:
    class Owner {
        String name;
    }
    class Dog {
        Owner owner;
        String ownerName; // duplicates owner.name and can go stale
        String getOwnersName() {
            return ownerName;
        }
    }
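To make the risk concrete, here is a minimal sketch of how the duplicated field goes stale. The class names `PersonOwner`, `DelegatingDog`, `CachedDog` and `OutOfSyncDemo`, and the constructors, are assumptions added for illustration; they are not part of the example above.

```java
// Hypothetical demo classes; names and constructors are illustrative assumptions.
class PersonOwner {
    String name;
    PersonOwner(String name) { this.name = name; }
}

// Keeps the name in one place (the owner) and delegates on every read.
class DelegatingDog {
    final PersonOwner owner;
    DelegatingDog(PersonOwner owner) { this.owner = owner; }
    String getOwnersName() { return owner.name; }
}

// Copies the name into a second field at construction time.
class CachedDog {
    final PersonOwner owner;
    String ownerName;
    CachedDog(PersonOwner owner) {
        this.owner = owner;
        this.ownerName = owner.name; // second copy of the same fact
    }
    String getOwnersName() { return ownerName; }
}

class OutOfSyncDemo {
    public static void main(String[] args) {
        PersonOwner owner = new PersonOwner("Alice");
        DelegatingDog rex = new DelegatingDog(owner);
        CachedDog fido = new CachedDog(owner);

        owner.name = "Alicia"; // the single source of truth changes

        System.out.println(rex.getOwnersName());  // Alicia
        System.out.println(fido.getOwnersName()); // Alice (stale copy)
    }
}
```

Nothing in `CachedDog` is wrong in isolation; the bug only appears once some other code updates the owner and forgets the copy.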
Experience has taught me that it’s very seldom a good idea to break this rule of thumb. The risk of introducing bugs and the extra effort required to understand the code almost always outweigh any benefit.
My question is, is there a name for this rule/principle?
Bonus points for linking to articles/blogs/etc. which make the argument for this clearly. Double bonus points for counter-arguments!
It’s called a Single Source of Truth. As far as counterarguments go, that article points out its main drawback, which is difficulty scaling. However, even in large distributed systems, you want to have a single source of truth locally.
This is a facet of Don’t repeat yourself, and arguably the single most important meta-principle of computer programming.
Database people call it Normalization. Good system designs tend to start with only normalized data and only make exceptions as needed.
One counter-example not mentioned in the Normalization article is concurrent programming. When two or more processes access the same piece of mutable data, all sorts of issues arise. If process A starts a read and process B then performs a write, which value does process A read: the old one or the new one? If your data is a complex object, A may get a pointer to a new but uninitialized object and start using it before it is fully created.
When you have to share mutable data across processes, it is often better to make (immutable) defensive copies of your data. That way process A can guarantee that process B reads a correct, fully initialized value, and that process B cannot then accidentally or intentionally change the value of A’s data (or vice versa).
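One way to sketch the defensive-copy idea in Java is with `List.copyOf` (available since Java 10), which returns an unmodifiable copy. The example uses threads rather than separate processes, and the `SnapshotDemo` class, the `snapshot` helper and the variable names are hypothetical. Note that the copy is shallow, so it only gives full protection when the elements themselves are immutable, as `String` is.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative assumptions.
class SnapshotDemo {
    // Returns an unmodifiable copy; later changes to 'source' do not affect it.
    static List<String> snapshot(List<String> source) {
        return List.copyOf(source);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> owned = new ArrayList<>(List.of("Rex", "Fido"));
        List<String> shared = snapshot(owned);

        owned.add("Spot"); // the owner keeps mutating its own list

        // The reader only ever sees the fully built snapshot.
        Thread reader = new Thread(() -> System.out.println(shared)); // [Rex, Fido]
        reader.start();
        reader.join();

        try {
            shared.add("Rover"); // the reader side cannot change the data
        } catch (UnsupportedOperationException e) {
            System.out.println("snapshot is read-only");
        }
    }
}
```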
This is the reason that Java has the convention of using those annoying get/set methods. If you expose a field with a get/set, then later discover an issue where a client of your class is changing your class’s underlying behavior in an unsafe way, then without changing your class’s interface you can:
- Make the underlying data immutable
- Return a defensive copy from the get() method
- Add synchronization to the get/set methods
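As a rough illustration of the last two options, the hypothetical `Kennel` class below keeps the same `getDogNames`/`setDogNames` interface while copying on the way in and out and synchronizing access. The class and method names are assumptions made for this sketch; a caller written against plain get/set would not need to change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; callers only ever see getDogNames()/setDogNames().
class Kennel {
    private List<String> dogNames = new ArrayList<>();

    // Synchronized so a reader never observes a half-finished update.
    synchronized List<String> getDogNames() {
        return new ArrayList<>(dogNames); // defensive copy on the way out
    }

    synchronized void setDogNames(List<String> names) {
        this.dogNames = new ArrayList<>(names); // defensive copy on the way in
    }
}

class KennelDemo {
    public static void main(String[] args) {
        Kennel kennel = new Kennel();
        List<String> names = new ArrayList<>(List.of("Rex"));
        kennel.setDogNames(names);

        names.add("Fido");                // caller mutates its own list
        kennel.getDogNames().add("Spot"); // caller mutates the returned copy

        System.out.println(kennel.getDogNames()); // [Rex]
    }
}
```

Had `dogNames` been a public field, neither change could have been made without touching every caller.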
Immutable data is the simplest and most reliable solution when it’s practical to design your class so that the data in it does not change. If the object has a small memory footprint and will not be created so many times as to fill memory with tiny copies of itself, then a defensive copy is probably more efficient than synchronization. If the underlying object is expensive to create, has a large memory footprint, or needs to be created so many times that it would fill memory quickly, then synchronization may be better than defensive copies.
Languages like Ruby and Scala can generate get/set methods for you (Ruby through attr_accessor, Scala for fields declared with var), so that it looks like you are accessing the fields directly, but you can later override the default get/set methods as described above. Functional languages like Clojure and Haskell treat data as immutable unless you specify otherwise, and Scala encourages the same style.
Even within a single process, it’s very easy to pass myObject to different methods, or to use it in different parts of the same procedure, so that one place sets myObject.color = BLUE and another sets myObject.color = RED, with each place expecting myObject to retain its color. It’s easy to make this kind of programming error whenever code becomes complicated enough.
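The single-process case can be sketched in a few lines. The `Shape` class, the `Color` enum and the `drawSky`/`drawRose` methods are hypothetical names chosen for this illustration.

```java
// Hypothetical types for illustration only.
enum Color { RED, BLUE }

class Shape {
    Color color;
}

class AliasingDemo {
    // Each method assumes it is the only one deciding the color.
    static void drawSky(Shape s)  { s.color = Color.BLUE; }
    static void drawRose(Shape s) { s.color = Color.RED; }

    public static void main(String[] args) {
        Shape myObject = new Shape();
        drawSky(myObject);
        drawRose(myObject); // silently overwrites the color set above

        System.out.println(myObject.color); // RED, not the BLUE drawSky expected
    }
}
```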