I’m a bit puzzled by this one. I was expecting the results with 20 threads and 100 threads to be very similar, with the 20-thread pool marginally faster if anything (due to reduced context switching). Just goes to show how surprising performance can be!
import java.util.concurrent.{Callable, Executors}
import scala.jdk.CollectionConverters._

object Test extends App {
  private val threadPool = Executors.newFixedThreadPool(20) // Change thread number here

  private def work(workId: Int): Callable[Unit] = () => {
    var counter = 0L
    while (counter < 10_000_000_000L) {
      counter += 1
      if (counter % (1_000 + counter) == 0) {
        counter += workId
      }
    }
  }

  private val tasks = (1 to 100).map(work)
  private val start = System.nanoTime()
  threadPool.invokeAll(tasks.asJava)
  private val end = System.nanoTime()
  println(s"Time taken: ${(end - start)/1_000_000} ms")
  threadPool.shutdown()
}
Running this 3 times with 20 threads in the pool: 202, 204, 205 seconds
Running this 3 times with 100 threads in the pool: 194, 192, 193 seconds
I’ve actually repeated this several times, and the percentage difference is consistent.
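To put a number on that difference, here is a small sketch that computes it from the six timings listed above. The names `Speedup`, `mean` and `percentFaster` are made up for illustration and are not part of the benchmark itself.

```scala
object Speedup {
  // Run times in seconds, copied from the measurements above.
  val pool20: Seq[Double] = Seq(202.0, 204.0, 205.0)
  val pool100: Seq[Double] = Seq(194.0, 192.0, 193.0)

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // How much lower the mean of `other` is than the mean of `baseline`, in percent.
  def percentFaster(baseline: Seq[Double], other: Seq[Double]): Double =
    (mean(baseline) - mean(other)) / mean(baseline) * 100.0

  def main(args: Array[String]): Unit = {
    val pct = percentFaster(pool20, pool100)
    println(f"100 threads is $pct%.1f%% faster on average")
  }
}
```

With these numbers the 100-thread pool comes out roughly 5% faster on average, which is small but well outside the run-to-run spread of a second or two.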
I have 14 cores and 20 logical processors: i7-12000H
JDK 17, HotSpot. Windows 10 Enterprise.
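In case it helps anyone reproduce this, a minimal sketch for checking what the JVM itself reports about the machine is below. The object name `EnvInfo` is hypothetical; the calls are standard `java.lang.Runtime` and `System.getProperty` lookups. I would assume it reports 20 logical processors here, but that is worth confirming rather than taking for granted.

```scala
object EnvInfo {
  // Number of logical processors the JVM believes it can schedule on.
  def logicalProcessors: Int = Runtime.getRuntime.availableProcessors()

  def main(args: Array[String]): Unit = {
    // Standard system properties describing the JVM and the OS.
    val vmName = System.getProperty("java.vm.name")
    val javaVersion = System.getProperty("java.version")
    val osName = System.getProperty("os.name")

    println(s"Logical processors visible to the JVM: $logicalProcessors")
    println(s"JVM: $vmName $javaVersion")
    println(s"OS: $osName")
  }
}
```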
Any ideas as to why this might be, or suggestions as to what I can look at to understand this? I’m keen to learn more about this sort of thing!