I have been programming for a couple of years and have often found myself facing a dilemma.
There are two possible solutions –
- one uses a simple approach: easier to understand and maintain, but it involves some redundancy and some extra work (extra I/O, extra processing), so it is not the most efficient solution.
- the other uses a complex approach: difficult to implement, often involving interaction between a lot of modules, but performance-efficient.
Which solution should I strive for when I do not have a hard performance SLA to meet, and even the simple solution can meet the performance SLA? I have sensed disdain for the simple solution among my fellow developers.
Is it good practice to come up with the most optimal, complex solution if your performance SLA can be met by a simple solution?
Which solution should I strive for when I do not have a hard performance SLA to meet, and even the simple solution can meet the performance SLA?
The simple one. It meets spec, it’s easier to understand, it’s easier to maintain, and it’s probably a whole lot less buggy.
What you are doing in advocating the performance efficient solution is introducing speculative generality and premature optimization into your code. Don’t do it! Performance goes against the grain of just about every other software engineering ‘ility’ there is (reliability, maintainability, readability, testability, understandability, …). Chase performance when testing indicates that there truly is a need to chase after performance.
Do not chase performance when performance doesn’t matter. Even if it does matter, you should only chase performance in those areas where the testing indicates that a performance bottleneck exists. Do not let performance problems be an excuse to replace simple_but_slow_method_to_do_X()
with a faster version if that simple version doesn’t show up as a bottleneck.
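To make that concrete, here is a minimal C++ sketch of the measure-first habit; the body of simple_but_slow_method_to_do_X() is invented for illustration, since the answer only uses the name as a placeholder:

    #include <chrono>
    #include <cstdio>
    #include <numeric>
    #include <vector>

    // Stand-in for the hypothetical simple_but_slow_method_to_do_X() above.
    long long simple_but_slow_method_to_do_X(const std::vector<int>& data) {
        return std::accumulate(data.begin(), data.end(), 0LL);
    }

    int main() {
        std::vector<int> data(1000000, 1);

        // Measure under a realistic workload before assuming this is a bottleneck.
        const auto start = std::chrono::steady_clock::now();
        const long long result = simple_but_slow_method_to_do_X(data);
        const auto elapsed = std::chrono::steady_clock::now() - start;

        std::printf("result=%lld, took %lld us\n",
                    result,
                    static_cast<long long>(
                        std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()));

        // Only if this routine shows up as a hotspot in profiling does a faster,
        // more complex replacement earn its keep.
        return 0;
    }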
Enhanced performance is almost inevitably encumbered with a host of code smell problems. You’ve mentioned several in the question: A complex approach, difficult to implement, higher coupling. Are those really worth dragging in?
Short answer: prefer simple solutions over complex ones, and remember the KISS and YAGNI principles.
Because initial project requirements and software are never perfect, changes are required as the application is developed and used. An iterative approach to development is a very good fit: start things simple and extend them as needed. The simplest solutions leave room for flexibility and are easier to maintain.
In addition, trying to be clever by putting in ad-hoc optimizations while still building your application is not good practice and may over-complicate your solution. As the saying goes, “premature optimization is the root of all evil”
– Donald Knuth
Take a lesson from Knuth here: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”.
Think about your solutions in this order: First, always, correctness. Second, improve clarity and simplicity. Third, and only when you can demonstrate the need, efficiency.
Adding efficiency will almost always cost you something important, and so should only be pursued when you know you need to.
Simplicity is a prerequisite for reliability. If you have a simple solution that works, by all means go for it! It is much easier to optimize a working program than to make an optimized program work. Also, do not forget about Moore’s law: if your simple solution meets the performance goals today, it will probably crush them¹ in a year or two.
¹ There’s no guarantee there, because, as Jimmy Hoffa noted in his comment below, Moore’s law has its limits.
Is it good practice to come up with the most optimal, complex solution if your performance SLA can be met by a simple solution?
“Optimal” is an ambiguous word!
Ultimately, if there is much risk in having to maintain the complex one, and if the simple one is “good enough”, I’d always err on the side of the simple one.
Add in any risk of the complex one not being good enough, and KISS is probably the right answer.
Which one costs less?
Most of the time, a simple solution that’s slightly slower will be perfectly acceptable in terms of performance, and simplicity makes it cheaper to develop, maintain, and eventually replace.
On the other hand, sometimes speed is really important, and the financial gain that comes from even small speed improvements can be far greater than the increased cost of a more complicated solution. For example, shaving 0.01s off the time to complete a transaction can make a securities trading system much more profitable. A 10% improvement in efficiency of a system that supports several million users could mean a significant reduction in server costs.
So, the question you have to ask yourself is: Does using the complex solution have enough of an impact on the bottom line to pay for its additional cost? Actually, you should probably ask your client to decide since they’re paying the bills and reaping the potential benefits. One good option is to go with the simple solution first, and offer the more complex solution as a possible improvement. That lets you get your system up and running and gives your client something to start testing, and that experience may inform the decision to implement (or not implement) the more complicated solution.
When evaluating two approaches, one being simpler but less efficient and the other being more complex but more efficient, you have to consider the problem and the project domain.
Consider a multi-billion-dollar software project for the healthcare industry with a planned lifetime of over 15 years of maintenance and 20+ years of use. In such a project, performance is definitely not going to be a concern, but project complexity and structure can cause major problems for the maintenance of the project, which lasts for those 15 years at minimum. Maintainability and simplicity come before anything else.
Then consider another example: a console game engine that is supposed to power the company’s upcoming games for the next 5+ years. Because games are extremely resource-constrained programs, efficiency comes before maintainability in many cases. Writing your very own, very specific data structures and algorithms for some task can be very important, even if it goes against any kind of “best practices” of software development. A good example of this is Data-Oriented Design, in which you store your data in parallel arrays of similar data rather than in actual objects, as sketched below. This increases locality of reference and, with it, CPU cache efficiency. Not practical in general, but crucial in that domain.
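As a rough illustration (not from the original answer, with made-up field names), the array-of-structs versus struct-of-arrays contrast at the heart of Data-Oriented Design looks something like this in C++:

    #include <cstddef>
    #include <vector>

    // Array-of-structs: the conventional "object" layout. A pass that only updates
    // positions still pulls velocities and health through the cache.
    struct Particle {
        float x, y, z;
        float vx, vy, vz;
        float health;
    };
    using ParticlesAoS = std::vector<Particle>;

    // Struct-of-arrays: the data-oriented layout. Fields that are processed together
    // sit contiguously in memory, improving locality of reference.
    struct ParticlesSoA {
        std::vector<float> x, y, z;
        std::vector<float> vx, vy, vz;
        std::vector<float> health;
    };

    void integrate(ParticlesSoA& p, float dt) {
        // Tight loop over contiguous float arrays: cache- and SIMD-friendly.
        for (std::size_t i = 0; i < p.x.size(); ++i) {
            p.x[i] += p.vx[i] * dt;
            p.y[i] += p.vy[i] * dt;
            p.z[i] += p.vz[i] * dt;
        }
    }

The point is not this particular struct, but that the memory layout is chosen for how the data is traversed rather than for how the domain is modeled.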
I’d prefer the simple one. In my opinion, premature optimizations cause as many problems as they solve. In many cases, good design allows you to change a given implementation in the future if it becomes a bottleneck.
So, the bottom line: I’ll design it to be as flexible as possible, but I won’t sacrifice simplicity for flexibility too much.
This is always a hard question, and I see the answers swinging one way, so I’ll play the game for the other side, though I don’t claim either answer is correct; it’s a very soft, case-by-case topic.
One thing about a complex but high-performance solution is that you can always just document the ever-living heck out of it. I’m generally a fan of self-documenting code, but I’m also a fan of software that responds in an amount of time that makes me feel like it’s not slowing me down. If you do go with the complex but high-performance solution, consider what you can do to make it not so bad:
Wrap it in an interface, put it in an assembly on its own, possibly even a process all its own. Make it as loosely coupled as possible, with as thick an abstraction wall around it as possible to avoid leaks. Write lots of unit tests for it to guard against regressions in the future.
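As a rough sketch of what that wrapping might look like, with every name invented for illustration: the hairy implementation hides behind a small interface, and the simple version sticks around as a reference that unit tests can compare the fast one against.

    #include <vector>

    // A small, stable interface: callers depend on this, never on the hairy internals.
    class XCalculator {
    public:
        virtual ~XCalculator() = default;
        virtual long long computeX(const std::vector<int>& input) const = 0;
    };

    // Simple, obviously-correct version. It doubles as a reference implementation
    // that unit tests can compare the fast version against.
    class SimpleXCalculator : public XCalculator {
    public:
        long long computeX(const std::vector<int>& input) const override {
            long long sum = 0;
            for (int v : input) sum += v;
            return sum;
        }
    };

    // The complex, performance-tuned version lives behind the same interface,
    // ideally in its own module/assembly, so its internals cannot leak outward.
    class FastXCalculator : public XCalculator {
    public:
        long long computeX(const std::vector<int>& input) const override {
            // ...heavily optimized implementation would go here; same observable
            // behaviour as SimpleXCalculator, verified by the shared test suite.
            long long sum = 0;
            for (int v : input) sum += v;
            return sum;
        }
    };

    // Callers see only the abstraction, so swapping implementations stays cheap.
    long long doWork(const XCalculator& calc, const std::vector<int>& input) {
        return calc.computeX(input);
    }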
Document it in the code, and even consider writing some real documentation. Think about complex data structures and how they’re documented; imagine trying to understand the implementation of one of them from the code alone, without a data structures book or Wikipedia article to explain it. And yet we all accept that these complex data structures are in fact good things, and that it’s beneficial that someone did implement them in our languages.
Remember that we’re all sending messages over a TCP/IP stack that is likely as hairy as code can get if any of us were to look at it, expressly so that it performs the way we all require it to. Perhaps your problem doesn’t require this level of optimization, perhaps it does, but beware when tackling this question, as we all have to from time to time: there be dragons there.
I’m coming at this from working in areas where there is no performance SLA. When it comes to offline renderers in computer graphics, there is no “satisfactory performance” for users, because they’re already dishing out enormous sums of money to distribute computing across clouds and render farms, even with state-of-the-art renderers, to output production-quality images and frames for films, for example.
But I have to say, as someone who has worked in this domain for many years, that any solution which significantly degrades maintainability in favor of efficiency is actually working against the ever-shifting performance requirements. If you can’t effectively maintain your solution for years to come as things shift under your feet (both in terms of the surrounding code and in terms of what users expect as competitors keep outperforming each other), then your solution is already working towards obsolescence and is in need of wholesale replacement.
I don’t see the ultimate purpose of profilers like VTune as a way to make my code run faster. Their ultimate value is to make sure I’m not degrading my productivity to meet ever-escalating performance demands. If I absolutely have to apply some gross-looking micro-optimization, then the profiler, combined with running against real-world user cases (and not some test case I imagine might be important), makes sure I apply such inevitably gross-looking optimizations very, very judiciously, only to the top hotspots that show up, and that I document them very carefully, because I’ll inevitably have to revisit, maintain, tweak, and change them for years to come if that solution remains viable.
And especially if your optimized solution involves more coupling, I’d really be reluctant to use it. One of the most valuable metrics I’ve come to appreciate in the most performance-critical areas of the codebase is decoupling (as in minimizing the amount of information something needs in order to work, which likewise minimizes the probability that it will require changes unless it directly needs them), because those critical areas significantly multiply the reasons for things to change. The less information something requires to work, the fewer reasons it has to change, and minimizing the reasons for change is a huge part of improving productivity in my particular areas of focus, because things are going to have to change constantly anyway (we’d become obsolete in a year otherwise), and it helps to get that down to the bare minimum and to reduce the cost of such changes.
To me, the greatest and most effective solutions I’ve found are the ones where efficiency, maintainability, and productivity are not diametrically opposed to each other. The quest, for me, is to make these concepts as harmonious as I possibly can.