I am new to multiprocessing, so this might be a stupid question.
I am using Ubuntu 20.04.6 LTS (64-bit) with a 12th Gen Intel(R) Core(TM) i7-12700K processor and 16 GB of RAM under Python 3.9.19. When I run the following code, the htop command shows that all 10 cores are being utilized as expected:

    import time

    def fun(ele):
        ele ** 1000

    if __name__ == '__main__':
        import multiprocessing
        input_list = [1000] * 10000000
        start_time = time.time()
        pool = multiprocessing.Pool(10)
        result = pool.map(func=fun, iterable=input_list)
        pool.close()
        pool.join()
        end_time = time.time()
        print(end_time - start_time)

However, when I import some additional libraries (like PIL, numpy, etc.), it seems that only 2 cores are working, even though 11 processes (1 main process and 10 child processes) are created. Here is the modified code:

    import time
    from PIL import Image
    import numpy as np
    import pickle
    import os
    import torch

    def fun(ele):
        ele ** 1000

    if __name__ == '__main__':
        import multiprocessing
        input_list = [1000] * 1000000
        start_time = time.time()
        pool = multiprocessing.Pool(10)
        result = pool.map(func=fun, iterable=input_list)
        pool.close()
        pool.join()
        end_time = time.time()
        print(end_time - start_time)

Using the command ps aux | grep [p]ython | grep $name_of_program$ | wc -l, I confirmed that 11 processes are indeed created (1 main process and 10 child processes). However, it seems that these processes are not being distributed across all CPU cores.
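One way to check whether the workers are restricted to a subset of cores, rather than just idle, is to print the CPU affinity mask of the parent and of each worker. This is only a diagnostic sketch: `report_affinity` is a name made up for illustration, `os.sched_getaffinity` is available on Linux, and the idea that the extra imports narrow the mask is an assumption to be tested, not something established above.

```python
import multiprocessing
import os


def report_affinity(_):
    """Return this process's PID and the sorted list of CPUs it may run on."""
    # os.sched_getaffinity is Linux-specific; 0 means "the calling process".
    return os.getpid(), sorted(os.sched_getaffinity(0))


if __name__ == '__main__':
    # Add the suspect imports (numpy, torch, ...) at the top to compare outputs.
    print('parent:', sorted(os.sched_getaffinity(0)))
    with multiprocessing.Pool(10) as pool:
        # A worker may handle several tasks; dict() keeps one entry per PID.
        per_worker = dict(pool.map(report_affinity, range(100)))
    for pid, cpus in sorted(per_worker.items()):
        print(pid, cpus)
```

If the lists printed with the extra imports are shorter than those printed without them, the workers are being pinned to fewer cores; if they are identical, the cause lies elsewhere.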
After setting the start method to spawn with multiprocessing.set_start_method('spawn'), all 10 cores work correctly again:

    import time
    import numpy as np
    from PIL import Image
    import pickle
    import os
    import torch

    def fun(ele):
        ele ** 1000

    if __name__ == '__main__':
        import multiprocessing
        multiprocessing.set_start_method('spawn')
        input_list = [1000] * 1000000
        n = 250000
        start_time = time.time()
        pool = multiprocessing.Pool(10)
        result = pool.map(func=fun, iterable=input_list)
        pool.close()
        pool.join()
        end_time = time.time()
        print(end_time - start_time)

- Why are only 2 cores being utilized in the second case?
- Could this issue be related to memory limitations caused by the additional libraries I imported?
- What steps can I take to ensure that the processes are evenly distributed across the CPU cores in such cases (while keeping the fork start method)?
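
For the last question, one thing that might be worth trying is widening each worker's CPU affinity in a Pool initializer. This is a sketch under an unconfirmed assumption, namely that the forked workers inherit a narrowed affinity mask from the parent; `reset_affinity` is a hypothetical helper name, and `os.sched_setaffinity` is Linux-only.

```python
import multiprocessing
import os


def reset_affinity():
    """Pool initializer: allow this worker on every logical CPU the OS reports."""
    # Assumption: the parent's mask was narrowed and inherited across fork().
    os.sched_setaffinity(0, range(os.cpu_count() or 1))


def fun(ele):
    ele ** 1000


if __name__ == '__main__':
    # The initializer runs once in each worker before it receives any tasks.
    with multiprocessing.Pool(10, initializer=reset_affinity) as pool:
        pool.map(fun, [1000] * 100000)
```

If the assumption does not hold, the initializer has no effect and the slowdown has a different cause.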