I am running a simulation of random events happening in accordance with a given set of probabilities, until n events are hit, and repeating for as many iterations as desired. When I run this sequentially, it finishes a small test set of 24 iterations in 5 seconds. When I set a higher number of threads (16), the time involved becomes 55 seconds. I made these comparisons using time bash -c "..."
.
I am sorry for asking another one of these questions, but I have been searching for a solution for several hours now. I think the problem has to be some sort of race condition, but I just cannot seem to figure out on what. Variables target, n_probabilities and probabilities should all be private to the thread, and the majority of the time is spent between the print statements ‘Started thread %d’ and ‘Finished thread %d’. This suggests to me that mutation_iterations[i] = simulate_iteration(target, n_probabilities, probabilities);
then has to be the offending line, and mutation_iterations is a shared variable, but it should only be accessed a single time per thread so I don’t see how it could cause this much of a speed drop.
However, looking at gprof output I don’t really see an unusual amount of time spent on the function in question (although the 6.15 for main seems weird to me, as it is not the actual runtime):
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
98.86 6.08 6.08 24 253.33 253.33 simulate_iteration
0.98 6.14 0.06 _init
0.16 6.15 0.01 main
Admittedly this is my first time reading gprof, so I may be reading this entirely wrong. But as I see it, my program seems to be running for 49 seconds longer than gprof says it should, and I cannot figure out where this comes from.
The offending code:
int **mutation_iterations = malloc(iterations * sizeof(int *));
srand(0); // Set random number seed
#pragma omp parallel shared(mutation_iterations)
{
int target = desired_mutations;
int n_probabilities = simulation_size;
double probabilities[simulation_size];
for (int i = 0; i < simulation_size; i++) {
probabilities[i] = simulation_set[i].probability;
}
#pragma omp for
for (int i = 0; i < iterations; i++) {
mutation_iterations[i] = NULL;
printf("Started thread %dn", i);
mutation_iterations[i] = simulate_iteration(target, n_probabilities, probabilities);
// simulate_iteration(desired_mutations, simulation_size, probabilities, &mutation_iterations[i]);
printf("Finished thread %dn", i);
}
}
The simulate_iteration(…) function:
int * simulate_iteration(int target, int n_probabilities, double probabilities[]) {
int *counts = calloc(n_probabilities, sizeof(int));
int total = 0;
while (total < target) {
for (int i = 0; i < n_probabilities; i++) {
double rand_val = (double) rand() / RAND_MAX;
if (rand_val < probabilities[i]) {
counts[i]++;
total++;
}
}
}
return counts;
}
I must be overlooking something but I just can’t see what. Any and all help would be much appreciated.