Almost every cloud instance I can find defaults to one CPU. Why is only one CPU the default now, and should I expect this to increase in the future?
Should this constraint impact my code design, so that I exclude technologies like the Task Parallel Library?
This is on topic for Programmers.SE because it impacts the long-term scalability of multi-threaded code on cloud platforms.
Cloud computing deals with embarrassingly parallel problems by default, like serving up resources from a URL. There are several ways to achieve parallelism regardless of the number of cores you have, and you should build your application knowing how you intend to take advantage of it. You can get cloud instances with multiple cores and lots of RAM, but they cost more.
Most web services run within an embedded web server (Spring Boot services, for example). The parallelism you need is taken care of by the server, so as long as you don’t add points of contention your service remains embarrassingly parallel and you don’t have to think about threads at all.
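To illustrate, here is a minimal sketch in C# using ASP.NET Core minimal APIs (chosen to match the question’s .NET context rather than Spring Boot; the route and handler are invented for illustration):

    var builder = WebApplication.CreateBuilder(args);
    var app = builder.Build();

    // Kestrel dispatches each incoming request on a thread-pool thread.
    // Because this handler touches no shared mutable state, concurrent
    // requests run in parallel with no threading code in the app at all.
    app.MapGet("/greet/{name}", (string name) => $"Hello, {name}!");

    app.Run();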
That said, one service can only handle so many clients at once. That’s why cloud solutions typically bring another instance online and distribute traffic between the instances of your service. Many times it is much cheaper to have another instance for a short burst of traffic than it is to have one instance with multiple cores.
What you aren’t seeing is that your service is usually hosted on a server with multiple cores, but it only looks like one to you. When you have multiple copies of your web service running, you are also using multiple cores.
The point is that the parallelism is there; you just need to know how not to mess it up. For that, you need to understand how parallelism works.
You mentioned the Task Parallel Library, and that is a key feature in Microsoft’s approach to web services, particularly when paired with async and await. Understanding how that works will really help your application handle more concurrent users. It is time well spent.
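For instance, here is a minimal async/await sketch (the class name, method, and URL parameter are invented for illustration); while the HTTP call is awaited, the thread goes back to the pool instead of blocking:

    using System.Net.Http;
    using System.Threading.Tasks;

    public class QuoteService
    {
        // HttpClient is thread-safe and meant to be shared across requests.
        private static readonly HttpClient Client = new HttpClient();

        // While the HTTP call is in flight, no thread is blocked: the caller's
        // thread returns to the pool to serve other requests, and the method
        // resumes on a pool thread when the response arrives.
        public async Task<string> GetQuoteAsync(string url)
        {
            string body = await Client.GetStringAsync(url);
            return body.Trim();
        }
    }

The payoff is that a small thread pool can keep many requests in flight at once, which matters most on instances with few cores.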
No, you should not exclude such technologies; you actually should use them. I would even go as far as to say that everyone should.
Parallel programming is actually a huge problem for the whole industry; it’s a topic that universities, tutorials, project managers and architects usually shy away from. This is bad, very bad, and should be fixed ASAP.
Parallel programming is not actually that hard, but it needs a different mindset that most people are not comfortable with. Decent multi-threaded apps can be written just by knowing and abiding by the rule that all functions should be reentrant and all shared objects protected. This solves most of the problems.
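As a minimal sketch of the “protect shared objects” half of that rule (the class is hypothetical):

    public class HitCounter
    {
        private readonly object _gate = new object();
        private long _count;

        // The shared counter is only touched while holding the lock, so two
        // concurrent callers cannot interleave the read-modify-write and
        // lose an update. The method is safe to call from any thread.
        public long Increment()
        {
            lock (_gate)
            {
                return ++_count;
            }
        }
    }

For a plain counter, Interlocked.Increment would do the same job more cheaply; the point is that every access to shared state goes through one agreed-upon synchronization mechanism.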
But getting used to that is problematic. Documentation is not always threading-aware, or does not explicitly talk about threads. Bread-and-butter safety nets like unit tests fail miserably at catching concurrency bugs. Thread joining and abnormal termination/cancellation are a huge issue. And deadlocks are one scary thing as soon as people start getting overly creative.
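For the record, the classic deadlock needs nothing more than two locks acquired in opposite orders; a hypothetical sketch (all names invented):

    using System.Threading;

    public class DeadlockDemo
    {
        private readonly object _lockA = new object();
        private readonly object _lockB = new object();

        // If one thread calls Forward() while another calls Backward(), each
        // can grab its first lock and then block forever waiting for the
        // other's second lock.
        public void Forward()
        {
            lock (_lockA)
            {
                Thread.Sleep(10); // widens the race window for demonstration
                lock (_lockB) { /* work with both resources */ }
            }
        }

        public void Backward()
        {
            lock (_lockB)
            {
                Thread.Sleep(10);
                lock (_lockA) { /* work with both resources */ }
            }
        }
    }

The standard cure is boring discipline: define a single global order in which locks may be acquired and never deviate from it.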
This extra complexity is enough that most people just forget about extra threads; heck, even I used to, and even now I sometimes “forget” about threads. This also gives cloud providers a perfect excuse to forget about multi-cores and run more VMs on the same base hardware.
But this is wrong: computing power now grows by adding cores and threads, not by making single cores faster. We as programmers have to get used to this new world, and we should try to use threads whenever possible.
For sure, you should plan for the execution environment you are expecting to use.
The cloud platform I use lets me define VMs with multiple CPUs. If explicit parallelism is important to you, select a vendor that offers it.
A single CPU certainly hampers concurrency prospects, but a program running on a single core can still benefit from multiple threads: every time one thread waits for something (e.g. blocking IO), another thread can use the core.
Let’s say this program executes a set of similar tasks, each involving C compute time and W wait time. A single thread then keeps the core busy only C/(C+W) of the time, so to keep the core fully utilised you need (C+W)/C = 1 + W/C threads.
Here’s the general formula, given those definitions:
Ncpu = number of CPUs
Ucpu = target CPU utilization; 0 <= Ucpu <= 1
W/C = ratio of wait time to compute time
The optimal number of threads to keep the CPUs at the desired utilisation (this is the sizing formula popularized by Java Concurrency in Practice) is:
Nthreads = Ncpu * Ucpu * (1 + W/C)
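Plugging in assumed numbers (the values are invented for illustration: full utilisation, and tasks that wait nine times longer than they compute):

    using System;

    class PoolSizing
    {
        static void Main()
        {
            int nCpu = Environment.ProcessorCount; // 1 on a single-core instance
            double uCpu = 1.0;                     // target: fully utilised
            double waitOverCompute = 9.0;          // assumed: 90 ms waiting per 10 ms computing

            // Nthreads = Ncpu * Ucpu * (1 + W/C)
            int nThreads = (int)Math.Ceiling(nCpu * uCpu * (1 + waitOverCompute));
            Console.WriteLine($"Optimal pool size: {nThreads}"); // prints 10 when ProcessorCount is 1
        }
    }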
This means that additional threads stop improving performance once there are more runnable (i.e. not waiting) threads than available CPUs.
Generally in the cloud you want to “scale horizontally” instead of vertically. Elastically adding VMs to handle peak loads reduces your cost, whereas CPUs/cores generally can’t be added or removed elastically.