I’m in the early stages of designing a client/server application. The clients will be batch programs that read a file of customer contact data (name, address, email address, phone numbers) and pass these components to the server, which will add them (if not already present) to the corresponding MySQL tables and return an ID for each component. To boost performance, the server will have spawned four “manager” servers, each tasked with looking up and possibly adding a new row; the server will pass the four components via IPC to those managers so that they can work concurrently.
In other words:
- One “master server” that does nothing but listen for a connection from a batch job and fork/exec a “slave” process (sketched after this list),
- Four “managers” that read from a socket, do a table lookup and possibly add a row, and write back an ID,
- One “slave” process for each connected client, spawned by the master when a new connection arrives. The slave converses with the batch job: it receives a customer contact record, sends the components to each of the managers, waits for all of the managers to respond with the IDs they’ve computed, and sends a summary record back to the client before looping back to receive the next record.
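To make the shape of the master concrete, here is a minimal sketch of its accept loop. The port number, the `./slave` executable name, and handing the client socket to the slave via stdin/stdout are all illustrative assumptions, not a spec:

```c
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    signal(SIGCHLD, SIG_IGN);            /* reap exited slaves automatically */

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);         /* example port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 16) < 0) {
        perror("bind/listen");
        return 1;
    }

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd < 0)
            continue;
        if (fork() == 0) {               /* child: become the slave */
            close(lfd);                  /* the slave never accepts connections */
            dup2(cfd, STDIN_FILENO);     /* hand the client socket to the slave */
            dup2(cfd, STDOUT_FILENO);
            execl("./slave", "slave", (char *)NULL);
            _exit(1);                    /* reached only if exec fails */
        }
        close(cfd);                      /* parent: drop its copy, keep listening */
    }
}
```

The point is how little the master does: accept, fork, exec, repeat.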
(The processing is a bit more involved than I’ve described: there are actually eight managers, and the results from the first four need to be completely gathered before invoking the next three managers, which in turn must all complete before calling the final manager. But that’s just a simple process with several sequential stages, each of which involves farming out concurrent work and waiting for it all to finish.)
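One of those stages might look like the sketch below: issue all the requests so the managers run concurrently, then gather every reply before returning. The already-connected manager sockets and the small numeric reply format are assumptions for illustration; the real framing protocol isn’t specified here.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Send one component to each of n managers, then gather every ID
 * before returning -- the barrier between the stages described above. */
int run_stage(int mgr_fd[], const char *component[], long id_out[], int n)
{
    /* Phase 1: issue all requests so the managers work concurrently. */
    for (int i = 0; i < n; i++) {
        size_t len = strlen(component[i]);
        if (write(mgr_fd[i], component[i], len) != (ssize_t)len)
            return -1;
    }
    /* Phase 2: collect the replies.  Blocking reads in order are fine
     * here: the stage cannot finish until every manager has answered. */
    for (int i = 0; i < n; i++) {
        char buf[64];
        ssize_t r = read(mgr_fd[i], buf, sizeof buf - 1);
        if (r <= 0)
            return -1;                   /* manager died or protocol error */
        buf[r] = '\0';
        id_out[i] = atol(buf);
    }
    return 0;
}
```

Chaining three calls to `run_stage()` with 4, 3, and 1 managers would give the eight-manager pipeline.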
In discussing this with another member of the team, I was asked, “Why have the master server and the slave servers? Why not have the client make separate connections directly to those manager processes?”
I haven’t got a really good objection: each client could implement the slave logic itself and open eight simultaneous connections to the manager servers. I have a feeling that’s not the best approach; somehow it seems important to have some centralized control to deal robustly with failures and errors, or to accumulate statistics about the server as a whole. But I’ve no prior experience building a full-scale, production-worthy client/server app.
I’d be very interested in hearing the opinions of those with prior experience building apps of this nature.
UPDATE 1: One advantage: If the client process abruptly crashed or was canceled, the slave process would stay alive, could detect that the client had gone away, would have complete knowledge of the job state, and could ensure data integrity by completing (or backing out) the current unit of work. It could bring an orderly end to a failure.
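Concretely, on a TCP socket the slave sees the client’s disappearance as read() returning 0 (the kernel closes the socket when the process dies). A minimal sketch, where back_out_current_unit_of_work() is a hypothetical placeholder for whatever cleanup applies (for instance a mysql_rollback() on the slave’s connection):

```c
#include <stdio.h>
#include <unistd.h>

/* Placeholder for the application-specific cleanup -- e.g. rolling back
 * the in-flight database transaction and writing an audit-log entry. */
static void back_out_current_unit_of_work(void)
{
    fprintf(stderr, "client vanished; backing out current unit of work\n");
}

/* Returns 1 when data arrives, 0 after an orderly shutdown following a
 * client disappearance, -1 on a read error. */
int read_from_client(int client_fd, char *buf, size_t cap, size_t *len_out)
{
    ssize_t n = read(client_fd, buf, cap);
    if (n == 0) {
        /* read() returning 0 means the peer closed -- either deliberately
         * or because the client process died and the kernel closed the
         * socket on its behalf. */
        back_out_current_unit_of_work();
        return 0;
    }
    if (n < 0)
        return -1;
    *len_out = (size_t)n;
    return 1;
}
```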
In short, the answer to your colleague is “encapsulation.”
What are you going to do when you need to spin up a ninth manager? Or run more slave processes per client? Or adjust the interaction logic just enough that it would force updating, and invalidating, all of the existing clients? Do you have full control over all of the clients in order to do that?
It doesn’t sound like any of the processing needs to be done by the clients, so none of it should be.
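That argument is easiest to see from the client’s side. With the slave encapsulating the fan-out, the batch client reduces to roughly the sketch below (the line-oriented framing is an assumption, not your actual protocol); nothing in it would change if the server grew a ninth manager or reordered its stages.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* The entire client-side protocol: one connection, one request and
 * one reply per contact record.  The managers are invisible here. */
void process_file(FILE *in, int server_fd)
{
    char record[1024], summary[1024];
    while (fgets(record, sizeof record, in)) {
        write(server_fd, record, strlen(record));   /* one contact record */
        ssize_t n = read(server_fd, summary, sizeof summary - 1);
        if (n <= 0)
            break;                                  /* server went away */
        summary[n] = '\0';
        printf("%s", summary);                      /* the IDs for this record */
    }
}
```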
And, in theory, you have control over the master, managers, and slaves, so you can trust that code a little more than you can trust the client. Even though you may be writing and publishing the clients yourself, you should look at this from a security point of view as well: clients are generally less trustworthy than the server components, because they are deployed on systems outside of your control. You may also have to support down-level client versions, which can really foul up your ability to make changes if the logic resides in the client.
You’re on the right path for a client/server environment. Stick to the design you’ve started and don’t give the clients inappropriate access.