How can I optimize the following structure as a vectorized operation?
My end goal is to get c_indices as described here. However, I have not been able to find an optimization that makes it faster. Is it even possible to vectorize this to reduce the time complexity?