Most efficient way to parallelize a function that returns three 2D arrays

I have the following function: