In one of my interviews I was asked a vague question, and I'm still not sure of the answer. We have a client and a server placed far apart. The network that connects them has high latency, which dominates the performance. How can we improve the performance? Note that we can't change the network's topology. I was thinking of caching, breaking the request into multiple smaller requests, and opening multiple connections to the server. Any ideas?
Please note that the question description is vague and I wasn’t supplied with more information about the situation.
Clarified question: How should the client-server communication be designed in order to get the best performance on a network with high latency?
If high latency is screwing up performance, I’d do the exact opposite of what you’re suggesting: find ways to combine multiple requests into a single request.
Let’s say latency is 1 second, and you need to process 100 items, and the actual processing time is 0.01 seconds per item.
100 requests
============
Processing time = 0.01 * 100 = 1 second
Latency = 1 * 100 = 100 seconds
Total time = 101 seconds
But if you can find a way to send two items in a single request:
50 requests
============
Processing time = 0.01 * 100 = 1 second
Latency = 1 * 50 = 50 seconds
Total time = 51 seconds
Congratulations! You just cut the runtime for this batch in half. Or, if you can find a way to send all 100 items in a single request:
1 request
============
Processing time = 0.01 * 100 = 1 second
Latency = 1 * 1 = 1 second
Total time = 2 seconds
As a few commenters have noted, this only makes sense if the problem actually is latency, and not other network-related issues. But since that’s what you were asked about, this is the right way to handle it.
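To make that arithmetic concrete, here is a minimal simulation sketch; the 1-second latency and 0.01-second per-item processing figures are the assumptions from the example above, not measured values:

```python
LATENCY = 1.0            # assumed per-request round-trip overhead, in seconds
PROCESS_PER_ITEM = 0.01  # assumed server-side processing time per item
N_ITEMS = 100

def simulate(batch_size: int) -> float:
    """Total wall-clock time to process N_ITEMS when sending batch_size items per request."""
    n_requests = -(-N_ITEMS // batch_size)   # ceiling division
    processing = PROCESS_PER_ITEM * N_ITEMS  # total processing is the same either way
    latency = LATENCY * n_requests           # latency is paid once per request
    return processing + latency

for batch in (1, 2, 100):
    print(f"batch size {batch:3d}: {simulate(batch):6.2f} s total")
# batch size   1: 101.00 s total
# batch size   2:  51.00 s total
# batch size 100:   2.00 s total
```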
A few options:
- Load more data up front so you don't need to go back again until there's an update.
- Use a local data store and sync when appropriate, if the data is mostly used locally (see the sketch below).
- Make lots of small, targeted requests, so that only small amounts of data go across the wire.
- Use compression.
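For the local-data-store idea, a rough sketch might look like the following; the `fetch_from_server` placeholder and the one-hour sync interval are invented for illustration, not part of the original question:

```python
import json
import time
from pathlib import Path

CACHE_FILE = Path("local_store.json")  # hypothetical on-disk store
SYNC_INTERVAL = 3600                   # assumed: re-sync at most once per hour

def fetch_from_server() -> dict:
    # Placeholder for the real (high-latency) network call.
    return {"items": [1, 2, 3]}

def load_data() -> dict:
    """Serve from the local store when it is fresh enough; otherwise sync once."""
    if CACHE_FILE.exists():
        cached = json.loads(CACHE_FILE.read_text())
        if time.time() - cached["synced_at"] < SYNC_INTERVAL:
            return cached["data"]        # no network round trip at all
    data = fetch_from_server()           # pay the latency only when we must
    CACHE_FILE.write_text(json.dumps({"synced_at": time.time(), "data": data}))
    return data
```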
I probably would’ve thought of a funny answer… personality goes a long way.
"I'll just use my iPhone, i've got unlimited data and these 4g speeds are GREAT!"
A few ideas that come to mind:
- Compression – is the data being sent back and forth compressed to make it as small as possible?
- Caching – could caches be set up on the client to make things appear faster? Could the system be set up so the client can operate disconnected from the server for a time?
- Network connectivity – while you can't change the topology, it may be worth identifying whether there is something else here that could be changed.
While it is an open question, there is something to be said for what is in this client/server system. Are there database calls? Is there synchronization to be set up? Are there alternative architectures that may work better?
The question then becomes how much compression can affect things. If compression means a 100th or a 1000th as much data has to be sent over the wire, that could be a significant factor. The issue with the question is that there are many unknowns.
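As a rough illustration of how much compression can shave off the wire, here is a sketch using Python's built-in gzip module; the repetitive sample payload is made up, and real savings depend entirely on how compressible the data actually is:

```python
import gzip
import json

# A deliberately repetitive, made-up payload; real data will compress differently.
payload = json.dumps([{"id": i, "status": "OK", "value": 0.0} for i in range(1000)]).encode()

compressed = gzip.compress(payload)
print(f"raw:        {len(payload):7d} bytes")
print(f"compressed: {len(compressed):7d} bytes "
      f"({len(compressed) / len(payload):.1%} of original)")

# The receiving side just reverses it:
assert gzip.decompress(compressed) == payload
```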
Latency and bandwidth are two different things: high latency means there's a time overhead in sending messages back and forth, while low bandwidth means the pipe limits how much data can be transferred at once. Both of these can mean it takes a while to get data, but for different reasons.
The solution for high latency, when bandwidth is not an issue, is to use larger requests: ask for as much as you know you'll need, to minimize the additive impact of the overhead delay. The solution for low bandwidth, when latency is not an issue, is to use smaller requests, as little as you can get away with at one time, to reduce the delay that retrieving a large dataset in full would add. In both cases you should also work as asynchronously as possible, so that you can give the user feedback while the data pull is occurring. Caching is an option if the data is relatively static and not sensitive (i.e. anyone else who could get into the system could see exactly the same data: stock ticker = cache, bank account balances = don't cache).
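As a sketch of the "cache only what is static and not sensitive" point, a tiny time-to-live cache might look like this; the keys, TTL, and example data are invented for illustration:

```python
import time
from typing import Callable

class TTLCache:
    """Cache responses for relatively static data; sensitive data bypasses it."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str, fetch: Callable[[], object], cacheable: bool) -> object:
        if not cacheable:                        # e.g. account balances: always fetch
            return fetch()
        entry = self._store.get(key)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]                      # fresh enough: skip the round trip
        value = fetch()                          # e.g. ticker symbols: safe to cache
        self._store[key] = (time.time(), value)
        return value

cache = TTLCache(ttl_seconds=300)                # assumed 5-minute TTL
quotes = cache.get("tickers", lambda: ["AAPL", "MSFT"], cacheable=True)
balance = cache.get("balance", lambda: 1234.56, cacheable=False)
```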
On a network with high latency, the first thing to do is organize your protocols asynchronously. That means trying to work without waiting for the server's response.
Send the requests without waiting for the responses, and then have a separate thread (or a similar mechanism) process the responses as they come in.
This way you use the full speed of the network, and each response arrives delayed only by the round-trip time (roughly twice the one-way latency).
Of course, this is only possible if the requests do not depend on each other, and it is your task to design the application's workflow that way.
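A minimal sketch of that idea with Python's asyncio, assuming the requests really are independent; the fake 1-second `send_request` stands in for the real high-latency call:

```python
import asyncio

LATENCY = 1.0   # assumed round-trip delay of the slow network

async def send_request(i: int) -> str:
    """Stand-in for one request/response over the high-latency link."""
    await asyncio.sleep(LATENCY)          # the network delay, not CPU work
    return f"response {i}"

async def main() -> None:
    # Fire all requests without waiting for each response in turn;
    # responses are then handled as they come back.
    responses = await asyncio.gather(*(send_request(i) for i in range(10)))
    for r in responses:
        print(r)

# Ten requests overlap, so the whole batch takes ~1 round trip instead of ~10.
asyncio.run(main())
```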