My colleagues and I work in CFD, and we usually run in-house programs parallelized with OpenMP.
We recently bought a new workstation with an Intel Core i9-13900K processor. This processor has 8 performance cores (with hyper-threading, 2 threads per core) and 16 efficiency cores (without hyper-threading).
Until now, on every server or workstation we have had, launching an executable with N OpenMP threads showed N×100% CPU usage in ‘top’, or N threads each at 100% in ‘htop’, which told us the program was running at full speed and using the CPU efficiently.
This does not happen on the new workstation (Linux, openSUSE Leap 15.5). Whenever we launch the executable asking for N threads, we see much lower CPU usage. The executable is usually a Fortran program compiled with ifort (with the -qopenmp option).
I tried setting OMP_PLACES=cores and OMP_PROC_BIND=close as environment variables but nothing actually changes.
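A minimal sketch of such a launch, assuming a bash shell, N=8, and a hypothetical executable name `./cfd_solver` (none of these come from our actual setup). `OMP_DISPLAY_ENV` is a standard OpenMP environment variable that asks the runtime to print the settings it actually picked up, which should help confirm whether the binding variables are being read at all:

```shell
# Sketch only: ./cfd_solver is a hypothetical executable name and N=8 is illustrative.
export OMP_NUM_THREADS=8        # number of threads requested (N)
export OMP_PLACES=cores         # one place per physical core
export OMP_PROC_BIND=close      # keep threads on neighbouring places
export OMP_DISPLAY_ENV=true     # runtime prints its OpenMP settings at startup

# Launch the solver only if it exists in the current directory.
if [ -x ./cfd_solver ]; then
    ./cfd_solver
fi
```

If the values printed at startup differ from the exported ones, the variables are probably not reaching the process.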
In general, the efficiency cores reach at most about 80% usage and the performance cores about 40%, which is something we would like to avoid. We obviously want our program to run as fast as possible by using the full capacity of the CPU.
The only way I found to make the program use the number of threads I ask for is to run ‘taskset --cpu-list n-m executable’ (where n-m is the range of chosen CPUs). This way the program does run on the specified number of threads. However, even then, the usage of those threads is capped at about 80% or 40%, depending on the type of core they run on.
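As a concrete illustration of that command, here is a sketch assuming bash, the hypothetical executable `./cfd_solver`, and the illustrative CPU range 0-7 (8 logical CPUs for 8 threads). The range is a placeholder, not a statement about which logical CPUs are performance or efficiency cores on this machine:

```shell
# Sketch only: ./cfd_solver and the CPU range 0-7 are illustrative placeholders.
cpu_list="0-7"                  # logical CPUs the process may run on (n-m)
export OMP_NUM_THREADS=8        # one thread per logical CPU in the range

# Pin the process to the chosen CPUs, if both taskset and the solver are available.
if [ -x ./cfd_solver ] && command -v taskset >/dev/null 2>&1; then
    taskset --cpu-list "$cpu_list" ./cfd_solver
fi
```

To see how logical CPU numbers map to physical cores, `lscpu --extended` lists each logical CPU with its core id and, where the kernel exposes it, its maximum frequency.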
Is there anything I can do to make this machine behave like the others, so that when I launch my CFD program with OMP_NUM_THREADS=N it actually uses N threads at full capacity?
Thanks in advance.