Here is code I am trying to optimize:
import numpy as np
rng = np.random.default_rng()
num = np.fromiter((rng.choice([0, 1], size=n, p=[1-qEff, qEff]).sum()
for n in num), dtype='int')
What it does is it takes array num
, where each element represents a bucket with given number of items. For each item it rolls a dice (Bernoulli trial with probability qEff) and if successful, it keeps the item in the bucket, if not it removes it.
To avoid the loop, I came with idea to do all the rolling first and then sum number of trials corresponding to particular element in num
array. So it would go something like this:
rol = rng.choice([0, 1], size=num.sum(), p=[1-qEff, qEff]
num = some_smart_sum(rol, num)
So now I need to figure out how to do the some_smart_sum()
without a loop. So far it looks like np.add.reduceat()
should do the trick, but I am having difficulties building the indices
parameter.
To add buckets you can get the indices for reduceat
from a call to cumsum
, but you will have to add a 0
in front and get rid of the last entry, which is equal to the total number of items and is added automatically by reduceat
. Something like this should work:
indices = np.concatenate(([0], np.cumsum(num)[:-1]))
num = np.add.reduceat(roll, indices)
A less obvious approach, but probably faster for large inputs is to not do the sums at all… Your sums are the number of successes of n
independent Bernoulli trials with success probability p
, so they follow a binomial distribution. So you can do that directly instead:
rng = np.random.default_rng()
num = rng.binomial(num, qEff)
1