I am implementing the SQUAT algorithm from a research paper on adversarial attacks based on Sequential Quadratic Programming (SQP). My current task is computing the gradient ∇g of the constraint function

g(x_k) = (I_K − 1_K e_jᵀ) C(x_k) ≤ 0,

where C(x_k) is the classifier's output (a torch.FloatTensor of torch.Size([10])), I_K is the K×K identity, 1_K is the all-ones vector, and e_j is the canonical basis vector for the target class j.
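For context, here is roughly what g looks like in code (a simplified sketch: model stands in for my MNIST classifier, j = 3 is just an example target class, and the real version is wrapped in utils.g):

import torch

K = 10  # number of classes
j = 3   # target class (placeholder value)

# A = I_K - 1_K e_j^T, so (A @ C(x))_i = C_i(x) - C_j(x)
A = torch.eye(K)
A[:, j] -= 1.0

def g(x_flat):
    # model is a placeholder for my MNIST classifier; it takes a flattened
    # 28*28 image and returns the 10 class scores
    logits = model(x_flat).flatten()
    return A @ logits  # shape [10]; g(x) <= 0 forces class j to be the top score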
However, I’m running into problems computing this gradient with PyTorch’s torch.autograd.functional.jacobian function. I’m using jacobian because it is (apparently) the only way to compute the gradient in this situation, since g ∈ R^10. The call returns an all-zero tensor of shape [10, 784] (a matrix). Below is a snippet of my code where I attempt to compute the gradient of g:
import cvxpy as cp
import torch

# config and utils are project modules: config holds the hyper-parameters
# (N_iter, N_1, α) and utils wraps the classifier (g, f_gradient).

def squat_algorithm(x, x_k):
    n = 28 * 28
    d = cp.Variable(n)  # direction (variable of the optimization problem)
    I = torch.eye(n)
    for k in range(config.N_iter):
        if k < config.N_1:
            # linearized objective: f_grad^T d + 0.5 * d^T d
            f_grad = utils.f_gradient(x, x_k)
            objective = cp.Minimize(cp.matmul(f_grad.t(), d) + 0.5 * cp.quad_form(d, I))
            # linearized constraint: g_grad @ d + g(x_k) <= 0
            g = cp.Variable(utils.g(x_k).shape, value=utils.g(x_k).numpy())
            g_grad = torch.autograd.functional.jacobian(utils.g, x_k.flatten()).numpy()  # HERE: all zeros
            constraints = [g_grad @ d + g <= 0]  # HERE
            # solve the QP subproblem and step along the optimal direction
            problem = cp.Problem(objective, constraints)
            result = problem.solve(solver=cp.OSQP, verbose=False)
            optimal_d = d.value
            x_k = x_k.flatten() + (config.α * optimal_d)
            x_k = x_k.clamp(min=0.0, max=1.0).reshape(28, 28).to(torch.float32)
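For reference, here is the Jacobian call stripped down to a minimal form (using the sketched g from above and a dummy input in place of the current iterate):

x_k = torch.rand(28, 28)  # dummy input just to illustrate the call

# g maps R^784 -> R^10, so the Jacobian should be a [10, 784] matrix
J = torch.autograd.functional.jacobian(g, x_k.flatten())
print(J.shape)        # torch.Size([10, 784])
print(J.abs().max())  # with my real utils.g this prints tensor(0.)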
- How should I correctly compute the Jacobian of g(x_k) in this context?
- Is there a more efficient way to implement this computation in PyTorch, especially considering that g(x_k) involves operations on the output of a neural network?
Any suggestions or examples would be greatly appreciated as I work through this implementation challenge.