Parallelize operations on arrays and merge results into one array using OpenMP
I am trying to speed up a function that, given a complex-valued array `arr` with `n` entries, calculates the sum of `m` operations on that array using BLAS routines. Finally, it replaces the values of `arr` with the result.
For simplicity, assume each operation is a matrix-vector multiplication of a different square matrix from `mats` with the vector `arr`, followed by multiplication with the corresponding scalar from an array `scalars`. In reality, the operations are combinations of the functions from this question: "Apple’s dispatch vs OpenMP to parallelize a for loop on Apple MacBook Pro with M3Pro", but I will spare you those details unless they turn out to be necessary.
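To make the simplified problem concrete, here is a minimal sketch of what I mean, not my actual code. The function name `apply_sum`, the row-major flat storage of each matrix, and the hand-written inner loops are assumptions for illustration; in the real function the inner loops would be a BLAS call such as `cblas_zgemv`. Each thread accumulates into its own buffer, and the buffers are merged into one array inside a critical section before `arr` is overwritten.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper: arr <- sum_k scalars[k] * (mats[k] * arr).
// Assumption: each mats[k] is an n x n matrix stored row-major in a flat vector.
void apply_sum(std::vector<cplx>& arr,
               const std::vector<std::vector<cplx>>& mats,
               const std::vector<cplx>& scalars)
{
    const std::size_t n = arr.size();
    const std::size_t m = mats.size();
    assert(scalars.size() == m);

    // Shared result; only written inside the critical section below.
    std::vector<cplx> total(n, cplx(0.0, 0.0));

    #pragma omp parallel
    {
        // Per-thread partial sum, so threads never write to the same memory.
        std::vector<cplx> local(n, cplx(0.0, 0.0));

        #pragma omp for schedule(static)
        for (long k = 0; k < static_cast<long>(m); ++k) {
            const std::vector<cplx>& A = mats[static_cast<std::size_t>(k)];
            const cplx s = scalars[static_cast<std::size_t>(k)];
            // Stand-in for the BLAS matrix-vector product.
            for (std::size_t i = 0; i < n; ++i) {
                cplx row(0.0, 0.0);
                for (std::size_t j = 0; j < n; ++j) {
                    row += A[i * n + j] * arr[j];
                }
                local[i] += s * row;
            }
        }

        // Merge the per-thread partial sums into the single result array.
        #pragma omp critical
        {
            for (std::size_t i = 0; i < n; ++i) {
                total[i] += local[i];
            }
        }
    }

    // arr is only read inside the parallel region, so it is safe to replace it here.
    arr = total;
}
```

Compiled without OpenMP support, the pragmas are ignored and the same code runs serially, which makes it easy to compare results. If the BLAS library is itself multithreaded, combining it with an outer OpenMP loop may oversubscribe the cores, so that is something I would need to check as well.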