I am working with PyTorch Geometric Batch objects, but for those unfamiliar with them, I will pose my question as follows:
I am given a 1d tensor Y and a 1d tensor batch. Y has the shape (num_of_nodes,) and is the nodes’ probabilities concatenated over the different graphs. batch has the shape (num_of_nodes,) and specifies the graph from which each node comes. Example:
Y = [0.2, 0.8, 1.]
batch = [0, 0, 1]
The first two nodes come from the first graph and the last from the second graph in the batch. Note that batch is always sorted.
I want to sample one node from each graph in the batch in a vectorized way. I am aware of a sequential approach (splitting the vector and creating one Categorical distribution per graph), but it is too inefficient.
Example output:
out = [0, 2] or out = [1, 2]
Restarting the count at each graph (out = [0, 0] or out = [1, 0]) is fine, but not desirable.
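For reference, the sequential approach mentioned above could look roughly like the following minimal sketch. The function name sample_nodes_sequential is hypothetical, and the sketch assumes that batch is sorted, that every graph has at least one node, and that each graph's entries in Y are non-negative with a positive sum (Categorical normalizes them per graph).

```python
import torch
from torch.distributions import Categorical


def sample_nodes_sequential(Y, batch):
    """Sample one global node index per graph with a Python loop (the slow baseline).

    Hypothetical helper for illustration; assumes `batch` is sorted and
    every graph contributes at least one node.
    """
    counts = torch.bincount(batch)                # number of nodes in each graph
    offsets = torch.cumsum(counts, dim=0) - counts  # index of each graph's first node
    out = []
    # Split Y into one chunk per graph and sample from each chunk separately.
    for probs, offset in zip(torch.split(Y, counts.tolist()), offsets.tolist()):
        local_idx = Categorical(probs=probs).sample().item()  # index within the graph
        out.append(local_idx + offset)            # shift to a global node index
    return torch.tensor(out)


Y = torch.tensor([0.2, 0.8, 1.0])
batch = torch.tensor([0, 0, 1])
out = sample_nodes_sequential(Y, batch)
print(out)  # one global node index per graph
```

The loop runs once per graph, which is the cost a vectorized version would need to avoid. Dropping the offset gives the per-graph (restarting) indices instead of the global ones.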