While training the model, I have to iterate over every item in the batch, which is clearly inefficient.
The code is as follows:

```python
for batch, indices in enumerate(topk_indices):
    indices = indices.tolist()
    # add n_random_expert extra experts that are not already selected
    for i in range(self.n_random_expert):
        random_expert_idx = -1
        while random_expert_idx == -1:
            random_expert_idx = random.randint(0, self.n_channels - 1)
            if random_expert_idx not in indices:
                indices.append(random_expert_idx)
            else:
                random_expert_idx = -1
    # run each selected expert on this batch item
    for idx, indice in enumerate(indices):
        net = getattr(self, self.ch_names[indice] + '_' + self.attr_type)
        if self.attr_type == 'ExpertBase':
            expert_output[batch, idx * self.n_expert_hidden_dim:(idx + 1) * self.n_expert_hidden_dim,
                          :] = self.dropout(net(x[batch, idx, :]))
```
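One small simplification, independent of the batching question: the rejection-sampling `while` loop above can be replaced by sampling directly from the experts that have not been chosen yet. A minimal sketch with a plain list of indices; the helper name `pick_random_experts` is made up for illustration:

```python
import random

def pick_random_experts(indices, n_channels, n_random_expert):
    # Sample without replacement from the experts NOT already in `indices`,
    # so no rejection loop is needed and duplicates are impossible.
    chosen = set(indices)
    remaining = [i for i in range(n_channels) if i not in chosen]
    return indices + random.sample(remaining, n_random_expert)
```

This also fails loudly (with a `ValueError`) if `n_random_expert` exceeds the number of unused experts, instead of looping forever.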
Is there a way to make this code faster?
Since each batch item may be routed to a different set of experts, I can't think of a way to parallelize it.
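One common way around this (the usual mixture-of-experts dispatch trick) is to invert the loop: instead of iterating over batch items and calling one expert per slot, iterate over experts and run each expert once on every `(row, slot)` position that routed to it. The sketch below uses NumPy and plain callables to show the idea; the function name `dispatch_by_expert` and the `experts` list are hypothetical stand-ins for the `getattr(self, ...)` modules in the question:

```python
import numpy as np

def dispatch_by_expert(x, topk_indices, experts, d_out):
    """x: (batch, k, d_in) inputs; topk_indices: (batch, k) expert ids.

    For each expert, gather all (row, slot) positions that selected it
    and apply the expert to that whole group in a single batched call,
    instead of one call per batch item.
    """
    batch, k, _ = x.shape
    out = np.zeros((batch, k, d_out))
    for e, expert in enumerate(experts):
        rows, slots = np.nonzero(topk_indices == e)  # positions routed to expert e
        if rows.size:
            out[rows, slots] = expert(x[rows, slots])  # one call over the whole group
    return out
```

The outer loop now runs `n_channels` times regardless of batch size, and each expert sees all of its inputs at once, which is exactly the shape frameworks like PyTorch vectorize well. The same gather/scatter pattern works with `torch.nonzero` and advanced indexing on tensors.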