My end goal is to get c_indices
as described here. However, I'm unable to find a good optimization to speed it up. Is it even possible to use some kind of vectorization to reduce the runtime?
import numpy as np

_c = np.random.randint(0, 1000, 10000)                 # segment sizes
c_starts = np.concatenate([[0], np.cumsum(_c[:-1])])   # start index of each segment
c_indices = np.concatenate([np.tile(np.arange(start, start + size), _c[i])
                            for i, (start, size) in enumerate(zip(c_starts, _c))])
For example, for a given _c = [3, 4, 2], I would expect the following output:
>>> _c
[3, 4, 2]
>>> c_starts
array([0, 3, 7])
>>> c_indices
array([0, 1, 2, 0, 1, 2, 0, 1, 2, 3, 4, 5, 6, 3, 4, 5, 6, 3, 4,
5, 6, 3, 4, 5, 6, 7, 8, 7, 8])
#based on _c = [3, 4, 2]
#[0, 1, 2] -> repeated 3 times
#[3, 4, 5, 6] -> repeated 4 times
#[7, 8] -> repeated 2 times
What you can do is:

- Avoid concatenation: by assigning values directly into a pre-allocated array, you avoid the overhead of repeatedly concatenating arrays, which can be slow.
- Use memory efficiently: pre-allocating the c_indices array and then filling it avoids the inefficiency of repeated array creation and copying.
For that:

- Construct c_indices:
  - Use np.empty to allocate memory for c_indices.
  - The loop then fills c_indices by writing each segment's tiled range of indices into the appropriate slice of the output.
# Create the indices array.
# Each segment's range of `size` indices is tiled `size` times,
# so the length of c_indices is the sum of size * size over all segments.
num_elements = np.sum(np.square(_c))
c_indices = np.empty(num_elements, dtype=int)

# Fill the c_indices array segment by segment with slice assignment
current_start = 0
for i, size in enumerate(_c):
    end = current_start + size * size
    c_indices[current_start:end] = np.tile(np.arange(c_starts[i], c_starts[i] + size), size)
    current_start = end
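If you want to get rid of the Python loop entirely, here is a rough sketch of one possible fully vectorized approach (not a library call, just one way to express it with np.repeat; shown with the small _c = [3, 4, 2] example from the question so the result can be checked against the expected output):

import numpy as np

# Example values taken from the question; the same code applies to the full-size _c.
_c = np.array([3, 4, 2])
c_starts = np.concatenate([[0], np.cumsum(_c[:-1])])

# Each segment contributes a block of size * size output elements
block_lens = _c * _c
block_starts = np.concatenate([[0], np.cumsum(block_lens[:-1])])

# For every output position: which segment it belongs to, and its offset within that block
seg_ids = np.repeat(np.arange(len(_c)), block_lens)
offsets = np.arange(block_lens.sum()) - np.repeat(block_starts, block_lens)

# index = segment start + (offset modulo segment size)
c_indices = c_starts[seg_ids] + offsets % _c[seg_ids]

print(c_indices)
# [0 1 2 0 1 2 0 1 2 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 7 8 7 8]

Whether this actually beats the pre-allocated loop depends on the segment sizes involved, so it is worth timing both versions on your real data.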