I’m using Python 3.9.19 on Rocks 7.0 (Manzanita), and when I leave the number of workers at its default instead of setting it explicitly, the process pool crashes. I have 64 CPU cores. Could the issue be that it is simply using too much memory and then getting killed? While debugging I set the number_workers parameter to 2 and it worked fine, just too slowly, so using only 2 is not practical for me.
I am currently running it with 8 workers and it is running fine for now.
“Default value of max_workers is changed to min(32, os.cpu_count() + 4).” (quoted from the official documentation)
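On a 64-core machine that formula works out to min(32, 64 + 4) = 32, so leaving max_workers unset already spawns 32 worker processes. A quick check:

```python
import os

# The documented default for ProcessPoolExecutor on Python >= 3.8.
# os.cpu_count() can return None, hence the "or 1" fallback.
default_workers = min(32, (os.cpu_count() or 1) + 4)
print(default_workers)  # capped at 32 on any machine with 28+ cores
```

So "default" here is effectively 32 processes, which is consistent with the crash appearing at 16 or more but not at 2 or 8.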
In my case, a value of 16 or more for max_workers leads to the error message below.
EDIT: Tried 16 workers; it leads to the same issue.
Traceback (most recent call last):
  File "/data1/xyz/Displace2024_baseline/speaker_diarization/SHARC_check/wespeaker/diar/spectral_clusterer.py", line 264, in main
    for (subsegs, labels) in zip(subsegs_list,
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/data1/xyz/.conda/envs/wespeaker/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Above is the error message I receive when max_workers is left at its default or set to 16 or more.
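If the workers really are being killed for memory (e.g. by the OOM killer), the parent sees exactly this BrokenProcessPool. One workaround I’m considering is catching it and retrying with a smaller pool; this is a sketch under that assumption, with `work` as a hypothetical stand-in for the real job, not the project’s actual code:

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

def work(x):
    # Hypothetical stand-in for the real per-segment job.
    return x + 1

def run_with_fallback(items, pool_sizes=(16, 8, 2)):
    """Try progressively smaller pools: a worker killed mid-task
    surfaces as BrokenProcessPool, so shrink the pool and retry."""
    for n in pool_sizes:
        try:
            with ProcessPoolExecutor(max_workers=n) as pool:
                return list(pool.map(work, items))
        except BrokenProcessPool:
            continue  # pool died; retry with fewer workers
    raise RuntimeError("all pool sizes failed")

if __name__ == "__main__":
    print(run_with_fallback(range(4)))
```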
Thanks.