I’m working on a Python script to calculate the sum of a large list of random integers. Here’s the code for the single-threaded version:
import time
import random
large_list = [random.randint(-100, 100) for _ in range(200000000)]
start_time = time.time()
total = 0
for n in large_list:
total += n
print(total)
print(time.time() - start_time)
This outputs:
-1153397 # total sum
9.442509889602661 # time taken
I then tried to parallelize this using multithreading with the following code:
from multiprocessing.pool import ThreadPool
start_time = time.time()
_num_threads = 10
total = 0
del chunks
def partial_sum(chunk):
total = 0
for n in chunk:
total += n
return total
chunk_size = len(large_list) // _num_threads
chunks = [large_list[i * chunk_size: (i + 1) * chunk_size] for i in range(_num_threads)]
if len(large_list) % _num_threads != 0:
chunks.append(large_list[_num_threads * chunk_size:])
# Calculate the partial sums in parallel
with ThreadPool(processes=_num_threads) as pool:
partial_sums = pool.map(partial_sum, chunks)
total_sum = sum(partial_sums)
print(total_sum)
print(time.time() - start_time)
Running it with 10 threads on a 12-core CPU, I get:
-1153397 # total sum (expected to be around zero)
8.065398454666138 # time taken
However, running it with just one thread gives:
-1153397 # total sum (same as before)
5.773534536361694 # time taken
My questions:
-
Why is the one thread version of the multi-threaded code much faster than the single-thread approach?
-
Why does the performance degrade when I increase the number of threads to 10, even though my CPU has 12 cores?
I expected the multi-threaded code to be faster, or at least not slower when using more threads. What could be causing this behavior?
I tried running the code above several times, and I got consistent results to the ones presented. However , I was expecting the one threaded version of the multi-thread code to be as fast (or even slower due to overhead) as the single-thread implementation, and I expected the 10-threaded version to be much faster than the other two approaches.
Héctor Balsells Roure is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
2