When designing an application for which scalability is going to be important, it would be useful if it could be designed to “scale out” on cheap commodity servers. However, there are many scale-up applications (see my previous question).
Is it always possible to design an application to scale out, or do some requirements make scale out impossible, even if all software can be designed from scratch and no legacy software is involved?
I am looking for cases where scale out is an inferior solution to scale up, and where good application design could not mitigate this. In other words, the application cannot be designed to scale out effectively, no matter how hard one tries.
I am not looking for cases where legacy software or data structures — or even entire architectures (as in the answer to the previous question) — were not designed to scale out and could not be changed.
Alternatively, if there is reason to believe that a well-designed application can always scale out, I would love to know that as well.
Edit: I am looking for an explicit example of such a system, if there is one.
TLDR: scaling out is not possible for sequential algorithms or atomic data, and can be a bad way to spend a development budget.
As others have pointed out, sequential algorithms cannot be scaled out. An example is financial transaction processing, where transactions must be applied in the right order. However, in typical CRUD scenarios you are unlikely to run into long-lived sequential algorithms, so my answer is directed towards cases where you are not dealing with one.
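To make the dependency concrete, here is a minimal sketch of why ordered transaction processing resists parallelization (the account logic is hypothetical, not any particular system): each step reads the state left by the previous one.

```python
def apply_transactions(balance, transactions):
    """Apply signed amounts in arrival order, rejecting overdrafts.

    Whether transaction N is accepted depends on the balance left by
    transactions 1..N-1, so the loop cannot be split across servers
    without synchronizing on `balance` after every step.
    """
    for amount in transactions:  # order matters: a credit may enable a later debit
        if balance + amount < 0:
            raise ValueError("insufficient funds")
        balance += amount
    return balance
```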
The key to scaling out is avoiding synchronization work between servers, because it’s the synchronization work that introduces bottlenecks as you scale out. For example, application servers can avoid storing data locally, fetching everything fresh from the database layer for each request. This way no data needs to be synchronized between application servers (e.g. session data). Another example is sharding of databases, where different database servers hold different parts of the data, which means no synchronization work is needed between those servers.
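As an illustration of the sharding idea, here is a minimal sketch of hash-based shard routing. The key and shard count are hypothetical, and production systems typically use consistent hashing so shards can be added without rehashing everything:

```python
import hashlib

def shard_for(user_id: str, num_shards: int) -> int:
    """Map a key to a shard deterministically.

    Every application server computes the same answer independently,
    so no cross-server synchronization is needed to locate the data.
    """
    digest = hashlib.sha1(user_id.encode()).hexdigest()
    return int(digest, 16) % num_shards

print(shard_for("alice", 4))  # same result on every node, no coordination
```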
So the answer is: even in cases where there is no algorithmic bottleneck to scaling out, there are situations where it is not possible, and these have to do with the atomicity of data. If you are dealing with a dataset that cannot be sharded and must be shared across nodes to implement your logic (e.g. a non-divisible social graph), then your only option is scaling up: one big server that holds the whole dataset, instead of a bunch of smaller ones that each hold a subsegment. These situations are pretty rare, however, so in most cases software can be designed to scale out, provided the design takes this into account from the beginning.
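A hypothetical sketch of why such a graph resists sharding: a friend-of-friend query touches edges all over the graph, so however you partition it, most hops become network round trips instead of local lookups.

```python
def friends_of_friends(graph: dict[str, set[str]], user: str) -> set[str]:
    """Two-hop query on a densely connected graph (toy in-memory version).

    If `graph` were sharded, each access to graph[friend] could land on a
    different server, turning one query into a fan-out of remote calls.
    """
    result = set()
    for friend in graph[user]:   # hop 1: friends may live on any shard
        result |= graph[friend]  # hop 2: fans out across all shards
    return result - graph[user] - {user}
```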
Another thing to take into account is design cost. Designing for scaling out increases development cost: database sharding, for example, requires additional application-level logic to deal with the shards. For that reason it makes sense in some cases to consciously choose to scale up, especially at the DB layer. For an example, see the site you are looking at right now.
No; the history of computing is a history of doing things that are [were] “impossible.”
Cheek aside, at a more practical level: it sounds like you’re essentially asking your audience to prove a negative. So again, the answer is pretty easily “no.”
No, it is not always possible to “scale out”. As Amdahl’s law shows us, your ability to parallelize a task is limited by the sections of your algorithm that must be executed sequentially.
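As a rough illustration of the law (the 10% sequential fraction is a made-up number), the speedup from n servers is 1 / (s + (1 - s) / n), which is capped at 1/s no matter how large n gets:

```python
def amdahl_speedup(s: float, n: int) -> float:
    """Max speedup with sequential fraction s on n servers: 1 / (s + (1 - s) / n)."""
    return 1 / (s + (1 - s) / n)

print(amdahl_speedup(0.10, 16))    # 6.4
print(amdahl_speedup(0.10, 1000))  # ~9.9, approaching the 1 / 0.10 = 10x ceiling
```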
I think there are few CRUD applications where you cannot get significant performance benefits from scaling out. However, there are situations where you have significant sequential computing requirements, such as many scientific applications, where you are fundamentally required to compute one thing before another. This CS stack exchange question shows a few examples.
Here’s an example of a design that doesn’t scale out, for the reason the other answers explain: the work has to be done sequentially.
An exchange orderbook for a particular product, such as a futures contract. Users are counting on getting filled according to a price-time priority rule, so the server must process each entry in sequence.
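A deliberately tiny sketch of that constraint (single price level, time priority only; nothing like a real matching engine): whether order N fills depends on how orders 1..N-1 changed the book, which is exactly the sequential dependency described above.

```python
from collections import deque

def run_engine(incoming):
    """Match orders ('buy' or 'sell') in strict arrival order.

    Each fill mutates the book that the next order sees, so the loop
    cannot be parallelized without breaking the time-priority guarantee.
    """
    resting_buys, resting_sells = deque(), deque()
    fills = []
    for side in incoming:  # strict arrival order
        mine, theirs = ((resting_buys, resting_sells) if side == "buy"
                        else (resting_sells, resting_buys))
        if theirs:
            fills.append(theirs.popleft())  # oldest resting order has priority
        else:
            mine.append(side)               # no match: rest in the book
    return fills
```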
No. Any design will only survive scaling out as far as was conceived at the design phase. Eventually the system hits a wall or an inefficiency where it cannot be scaled out any further, or where further scaling out decreases efficiency, without a new architecture.