I am constructing a program that does several things.
- Listens to a port. Every second, a string is received.
- When new data is received, it has to be logged. No errors or skipped entries are allowed
- When data is received, a plot on the screen has to be updated. This is an expensive computation. Missing some updates is not critical.
- Once a month, an archiving function shall be called to back up the logs. This might take an unknown amount of time.
I am new to threading and would like to ask for advice on the threading infrastructure. On one hand, it seems that I need at least 1 additional thread besides the main thread, for the port listener.
On the other hand, I am afraid that if I spawn too many threads:
- Sync problems may occur
- The app may become resource-hungry
- The complexity of the code might rocket up
Here is some proposed code for the listener.
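A minimal sketch of the port listener, assuming a TCP sender and newline-terminated strings (neither is stated above, so adjust to the real protocol). The names `read_records`, `listen` and `handle_record` are hypothetical, chosen only for illustration:

```python
import socket


def read_records(conn, encoding="utf-8"):
    """Yield one newline-terminated string at a time from a connected socket."""
    buffer = b""
    while True:
        chunk = conn.recv(4096)  # blocks until data arrives
        if not chunk:  # empty read: the peer closed the connection
            break
        buffer += chunk
        # A single recv may carry several records, or only part of one.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode(encoding)


def listen(host, port, handle_record):
    """Accept one connection and pass every received record to handle_record."""
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn:
            for record in read_records(conn):
                handle_record(record)
```

Running `listen` in its own thread keeps the blocking `recv` call away from the main thread; `handle_record` is where logging and the hand-off to the plot would go.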
Look up the producer-consumer problem for general solutions to the sync issue.
I think your best bet is a circular buffer that pops the oldest entry when it is full instead of blocking.
In pseudocode, the port listener thread will be (using an interface like Java's ArrayBlockingQueue):
while(alive){
    record = read(port) // only blocking operation in the loop
    logger.logRecord(record)
    while(!plotqueue.offer(record)){ // try to push to plotter thread
        plotqueue.poll() // discard oldest if full
    }
}
Then the plotter thread will be:
while(alive){
    record = plotqueue.take() // block until one is available
    plot(record)
}
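Since the question is about Python, the same idea can be sketched with the standard `queue.Queue`, which has no direct equivalent of `offer`, so the drop-oldest step is written by hand. This is only a sketch under assumptions: `offer_drop_oldest`, `listener`, `plotter`, `start_plotter` and the `STOP` sentinel are hypothetical names, and `records` stands in for whatever actually reads the port.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel telling the plotter to exit


def offer_drop_oldest(q, item):
    """Add item without blocking; if q is full, discard the oldest entry first."""
    while True:
        try:
            q.put_nowait(item)
            return
        except queue.Full:
            try:
                q.get_nowait()  # drop the oldest queued record
            except queue.Empty:
                pass  # the plotter drained the queue in between; just retry


def listener(records, log_record, plot_queue):
    """Producer: log every record, then offer it to the plotter."""
    for record in records:  # stands in for the blocking read(port)
        log_record(record)  # logged before plotting, so no entry is skipped
        offer_drop_oldest(plot_queue, record)
    offer_drop_oldest(plot_queue, STOP)


def plotter(plot_queue, plot):
    """Consumer: block until a record is available, then plot it."""
    while True:
        record = plot_queue.get()
        if record is STOP:
            break
        plot(record)


def start_plotter(plot_queue, plot):
    """Run the plotter in a background thread and return the thread."""
    thread = threading.Thread(target=plotter, args=(plot_queue, plot), daemon=True)
    thread.start()
    return thread
```

With `plot_queue = queue.Queue(maxsize=10)`, a slow plot only ever causes old plot updates to be dropped; the listener never waits on the plotter, and logging is unaffected.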
This is the sort of problem in which Node.js thrives. I realize you’re asking for a Python library, but if you have the liberty to change programming language, you should probably do so in this case.
Node.js works well with WebSockets, so clients connected to your web site don’t have to poll for new information every second; the server pushes new data to them as it arrives, which scales efficiently.
How you use that information is entirely up to you, but there are many JavaScript libraries that will plot information from arrays of data or by repeated calls to add new data. You could initially provide an array and then, upon receiving new information, simply add it to your plot. Depending on the library, that is likely an efficient way of updating the plot without having to redraw everything.
If you use Node.js, you can choose to use a NoSQL database like MongoDB. That can be local to your server, but it can also be external, which could take care of your archiving problem.
If scalability is truly your concern, then I would highly recommend you consider switching to Node.js for this. It is built primarily with these types of web applications in mind. If you aren’t familiar with Node.js, then I highly recommend you learn it anyway, even if you decide not to go this route.