I’m using the overruling dataset, which is a small (n=2400, evenly split) binary classification task, with a tiny BERT model (4.4M params).
I’m also assuming (following this FAQ) that I don’t need to explicitly provide a target. My call to attribute() looks like this:

    attributions_ig, delta = lig.attribute(in_tensor, reference_indices,
                                           additional_forward_args=(ttype_tensor, attn_tensor),
                                           n_steps=500, return_convergence_delta=True)
-
It makes it to LayerIntegratedGradients.gradient_func (L#480) and then fails on this assertion:

    assert output[0].numel() == 1, (
        "Target not provided when necessary, cannot"
        " take gradient with respect to multiple outputs."
    )
-
Indeed, output[0].numel() == 1000 and output[0].shape == torch.Size([500, 2]).
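For reference, here’s a minimal sketch of why I think the assertion trips; the toy `output` tensor and the `n_steps`/`num_classes` names are mine, standing in for the real model output:

```python
import torch

# Stand-in for the model output at the n_steps interpolation points:
# shape [n_steps, num_classes], matching my [500, 2].
n_steps, num_classes = 500, 2
output = torch.randn(n_steps, num_classes)

# Without a target, the code path asserts output[0].numel() == 1;
# here output[0] has 2 elements (one per class), so it fails.
assert output[0].numel() == num_classes  # 2, not 1

# Passing a scalar target (e.g. the positive class) would select one
# scalar per interpolation step, which is what the gradient needs:
target = 1
selected = output[:, target]
assert selected.shape == (n_steps,)
```

So presumably passing target= to attribute() would avoid this, but my reading of the FAQ was that it shouldn’t be necessary for this setup.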
-
Looking back up the stack, _attribute() is getting the right inputs argument, and inputs[0].shape is [1, 128, 128], but the scaled_features_tpl passed on to gradient_func as inputs has shape [500, 128, 128]?! I also notice that the inputs have ALSO been prepended (redundantly?) onto additional_forward_args?
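For my own understanding, here’s a sketch of what I assume the [1, 128, 128] → [500, 128, 128] expansion is: IG evaluates the model at n_steps points along the straight line from the baseline to the input, so the batch dimension grows to n_steps. The tensors below are made up; only the shapes mirror my case:

```python
import torch

n_steps = 500
inp = torch.randn(1, 128, 128)       # toy input, shaped like mine
baseline = torch.zeros(1, 128, 128)  # toy all-zeros baseline

# Interpolation coefficients alpha in [0, 1], one per step, broadcast
# against the input shape.
alphas = torch.linspace(0, 1, n_steps).view(-1, 1, 1)
scaled = baseline + alphas * (inp - baseline)

# The batch dim is now n_steps: first row is the baseline, last is the input.
assert scaled.shape == (n_steps, 128, 128)
```

If that’s right, the [500, 128, 128] shape is expected, and the real question is just why no target is being applied to reduce the [500, 2] output.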
(Also posted as captum issue )