This might be flagged as a duplicate, or maybe as irrelevant. But I genuinely believe this question is important, both for me and for future inexperienced Python developers.
The concept of a local worker queue for CPU-bound tasks is essential in Python because of the GIL. There are well-known answers about this on SE: subprocesses are used to work around the lack of true CPU-bound parallelism, and in Python we can use the multiprocessing.Pool class to achieve exactly that. I wanted to build a generic worker-queue singleton that can accept any function with any kind of arguments and process it asynchronously, alongside my main process, on a different CPU core, as a worker queue should.
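To make the goal concrete, here is roughly the shape I have in mind, as a toy sketch with made-up names (WorkerQueue, submit), not my actual code:

import multiprocessing


class WorkerQueue:
    # Rough sketch of the singleton I am after; names are made up
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # One shared pool of worker processes (one per CPU core by default)
            cls._instance._pool = multiprocessing.Pool()
        return cls._instance

    def submit(self, func, *args, **kwargs):
        # Run func(*args, **kwargs) in a worker process; returns an AsyncResult
        return self._pool.apply_async(func, args, kwargs)


def square(x):
    return x * x


if __name__ == "__main__":
    wq = WorkerQueue()
    print(wq.submit(square, 7).get())  # 49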
It is that simple.
Except that it really isn't.
At the root of the issue lies the pickle problem. Pickle, as I found out, is limited (see this question), since one can have code along the lines of:
def NetworkClassType(topologyClass):
    # The base class is only known at runtime, so Network exists only inside this function
    class Network(topologyClass):
        ...
    return Network
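This toy reproduction fails under plain pickle on my machine, because (as far as I understand) pickle stores instances by referencing their class's qualified name, and NetworkClassType.<locals>.Network is not importable:

import pickle


def NetworkClassType(topologyClass):  # same factory as above
    class Network(topologyClass):
        pass
    return Network


try:
    pickle.dumps(NetworkClassType(object)())
except Exception as exc:
    # On my machine this is a PicklingError complaining that the class
    # cannot be looked up as a module-level attribute
    print(type(exc).__name__, exc)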
So pickle is not an option for a generic worker queue, and I found out that I can use dill instead (one way to hook it into multiprocessing.Pool is sketched below). Yet I kept encountering a series of issues that I had to figure out.
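For reference, routing work through dill looks roughly like this; a simplified sketch with made-up helper names (_run_dilled, submit), assuming dill is installed. The caller serialises the work item with dill and the worker deserialises it, so the stock pickler only ever transports bytes:

import multiprocessing

import dill  # third-party: pip install dill


def _run_dilled(payload):
    # Worker side: rebuild the callable and its arguments with dill
    func, args, kwargs = dill.loads(payload)
    return func(*args, **kwargs)


def submit(pool, func, *args, **kwargs):
    # Caller side: dill does the real serialisation; the stock pickler
    # only ever sees a bytes payload plus a module-level function
    payload = dill.dumps((func, args, kwargs))
    return pool.apply_async(_run_dilled, (payload,))


if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # A lambda would choke plain pickle, but goes through fine here
        print(submit(pool, lambda x: x * x, 6).get())  # 36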
It started with this issue. I have to use this setting because I am aiming for a generic WQ, and sometimes things are not in the context that the subprocess expects.
Then I had an issue where this code from the pickle package:
if reduce is not None:
    rv = reduce(self.proto)
raises TypeError: 'NoneType' object is not callable, which was a nightmare to track down. It turns out that the following code breaks that check when you put it in a class being pickled (even with dill):
def __getattr__(self, prop):
    return None
which basically means that you don't want any exception when accessing an attribute that is not defined in the __dict__ of the class. It turns out that pickle and dill cannot handle this code. There goes the generic WQ idea already.
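As far as I can tell, the mechanism is that pickle probes objects for optional hooks (things like __getstate__ or __getnewargs__) with getattr and expects an AttributeError when they are missing; the blanket __getattr__ hands it None instead, which pickle then tries to call. A minimal reproduction (the exact traceback may vary between Python versions):

import pickle


class Broken:
    def __getattr__(self, prop):
        # Swallows every missing-attribute lookup, including pickle's
        # probes for optional hooks, which then come back as None
        return None


try:
    pickle.dumps(Broken())
except Exception as exc:
    # On my setup: TypeError: 'NoneType' object is not callable
    print(type(exc).__name__, exc)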
But OK, let's get rid of this weirdness by removing the above code from our classes.
Then I met this error, which has multiple suggested solutions, none of which actually worked in my case.
At this point, I stopped. I thought to myself, “I am doing something wrong here. Python is a mature, data-oriented language with a well-established ecosystem. A generic worker queue can’t be this complicated. Someone must have done this before me.”
So, to make it clear, I am blaming myself here. I must have missed something. Is there a well-known package for WQs in Python? Maybe multiprocessing isn’t the answer? Maybe dill isn’t the answer? Maybe there is a way to bypass the GIL (without changing the interpreter)? Maybe I missed a single concise answer to all of this somewhere on SE?
Now, I can keep going at this, and I will, solving one issue at a time until I achieve my goal. But I will be happy if someone shows me that I am wrong, and that there is in fact a working, efficient, generic worker queue (or even better, an actor queue) in Python that handles all of this.
I can only hope.
Thank you.
I am still working on the project. I have tried all of the links above (and then some), and will keep trying to work out how to make my WQ package work.