import torch
from functorch import make_functional, vmap, jacrev

func_model, func_params = make_functional(self.model)

def fm(x, func_params):
    fx = func_model(func_params, x)
    return fx.squeeze(0).squeeze(0)

def floss(func_params, input):
    fx = fm(input, func_params)
    return fx

# Per-sample Jacobian w.r.t. all parameters: jacrev differentiates w.r.t. func_params,
# vmap maps over the batch dimension of input.
per_sample_grads = vmap(jacrev(floss), (None, 0))(func_params, input)

# Flatten each per-parameter block to (data_number, block_size) and stack the blocks column-wise.
J_d = None
for g in per_sample_grads:
    g = g.detach()
    J_d = g.reshape(len(g), -1) if J_d is None else torch.hstack([J_d, g.reshape(len(g), -1)])
result = J_d.detach()
In this code, per_sample_grads contains the gradients of the network's outputs with respect to all of its parameters, evaluated for a batch of input data. The model is
self.model = Network(self.input_size, self.hidden_size, self.output_size, self.depth, act=torch.nn.Tanh() )
I use this code to compute the Jacobian matrix (shape: data number × parameter number). However, what I actually need is only a small part of these gradients.
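For reference, here is a minimal self-contained sketch of the same computation (a toy Sequential MLP stands in for my Network class, whose definition is not shown, and the batch size of 5 is arbitrary); it shows that J_d ends up with shape data number × parameter number:

import torch
from functorch import make_functional, vmap, jacrev

model = torch.nn.Sequential(torch.nn.Linear(3, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1))
x = torch.randn(5, 3)  # 5 data points

func_model, func_params = make_functional(model)

def floss(func_params, xi):
    return func_model(func_params, xi).squeeze()

per_sample_grads = vmap(jacrev(floss), (None, 0))(func_params, x)

# Flatten each per-parameter block into (data_number, block_size) and concatenate column-wise.
J_d = torch.hstack([g.detach().reshape(len(g), -1) for g in per_sample_grads])
p_number = sum(p.numel() for p in model.parameters())
print(J_d.shape)  # (5, p_number): data number x parameter number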
params = torch.cat([p.view(-1) for p in self.model.parameters()], dim=0)
selected_columns = torch.randperm(p_number)[:opt_num]  # sample opt_num parameter indices without replacement
target_params = params[selected_columns]  # I only need to calculate the grads of these target_params
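In other words, once the full Jacobian has been assembled, the only part I actually use is the columns belonging to target_params (sketch, assuming J_d and selected_columns from the snippets above):

# Only these columns of the full (data_number x parameter_number) Jacobian are needed;
# everything else is computed and then thrown away.
J_target = J_d[:, selected_columns]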
Computing all the gradients and then discarding most of them wastes time in my code, and I would like to speed up this autograd step. What can I do to achieve that?
How can I compute the gradients for only part of the model's parameters to save time?