I’m using Python multiprocessing and then submitting the job with sbatch. The .sh script is as follows:
#!/bin/bash
##----------------------- Start job description -----------------------
#SBATCH --partition=standard
#SBATCH --job-name=multi_Leaspy
#SBATCH --nodes=1
#SBATCH --ntasks=40
#SBATCH --mem-per-cpu=4096
#SBATCH --time=160:00:00
#SBATCH --mail-type=ALL
#SBATCH --output=out-%j.log
#SBATCH --error=err-%j.log
##------------------------ End job description ------------------------
This is the snippet where I parallelize the code:
import os
from concurrent.futures import ProcessPoolExecutor

# Number of workers is taken from the Slurm allocation (defaults to 1 outside Slurm).
num_workers = ntasks = int(os.environ.get('SLURM_NTASKS', 1))

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=num_workers - 1) as executor:
        arg_list = [(i, row, df_test, id_test, btstrp, classes, classes_auc,
                     results_table, MAE_table, features_set, vector_range)
                    for i, row in feat_sub.iterrows()]
        for result, MAE_row in executor.map(multiLeaspyIter, arg_list):
            if result is not None and MAE_row is not None:
                results_table = results_table.append(result, ignore_index=True)
                MAE_table = MAE_table.append(MAE_row, ignore_index=True)
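For reference, here is a minimal stand-in for multiLeaspyIter that only shows the call/return shape the loop relies on; the body is a placeholder, not the real implementation (the actual Leaspy work happens in the real function):

def multiLeaspyIter(args):
    # Unpack the tuple built in arg_list above (same order).
    (i, row, df_test, id_test, btstrp, classes, classes_auc,
     results_table, MAE_table, features_set, vector_range) = args
    # The real function fits/evaluates a Leaspy model here; this stub only
    # returns the (result, MAE_row) pair the main loop expects.
    return None, None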
Then I SSH into the node and run htop to check resource utilization: there are more running processes than I requested, and many waiting processes appear, which slows everything down.
htop results:
Tasks = 1627, 501 thr, 40 running
Load average = 201.3, 180.94, 117.36
Each node has 40 CPUs, with hyper-threading enabled.
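In case it is relevant, a quick way to check from inside the job what the allocation actually looks like to Python (os.sched_getaffinity is Linux-only; SLURM_CPUS_PER_TASK is only set when --cpus-per-task is used):

import os

# Quick diagnostic: compare the Slurm allocation with what the process can see.
print("SLURM_NTASKS        =", os.environ.get("SLURM_NTASKS"))
print("SLURM_CPUS_PER_TASK =", os.environ.get("SLURM_CPUS_PER_TASK"))
print("os.cpu_count()      =", os.cpu_count())                # logical CPUs on the node
print("affinity CPUs       =", len(os.sched_getaffinity(0)))  # CPUs this process may actually use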
Previously, I was using ProcessPoolExecutor with the same code, and it was far slower.
I’m not sure whether multiprocessing is appropriate for a Slurm environment, or whether I’m missing something.
When I configure the .sh with --ntasks=1 and the .py with max_workers=1, the code runs faster. With --ntasks=5 and max_workers=5, it runs slower and creates additional waiting processes.
Apart from that, I noticed that even with just one worker process, many threads are created.
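One thing I am considering, though I am not sure it is the right fix, is capping per-process library threading before numpy/pandas/Leaspy are imported, to test whether the extra threads come from BLAS/OpenMP oversubscription. A minimal sketch:

import os

# Hypothetical test, not the actual fix: force common numeric backends to use a
# single thread per process. This has to run before numpy/pandas are imported.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS",
            "MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS"):
    os.environ.setdefault(var, "1")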
I’ve tried the same code on Windows and it runs faster there.