I am trying to evaluate different Integrated Gradients methods on my RoBERTa-based model, and I came across a paper introducing “Sequential Integrated Gradients”, along with its GitHub repo.
I understand that to compute integrated gradients we need to pass input embeddings, not token IDs, to the model so that it can compute gradients with respect to the inputs. But in the repo's code the author produces the embeddings like this:
```python
getattr(model, model_name).embeddings.word_embeddings(input_ids)
```
and later generates the positional and type embeddings (plus the attention mask) manually; the summation and the LayerNorm and dropout layers are then applied inside this wrapper before the result is passed to the model:
```python
from torch import nn


class ForwardModel(nn.Module):
    def __init__(self, model, model_name):
        super().__init__()
        self.model = model
        self.model_name = model_name

    def forward(
        self,
        input_embed,
        attention_mask=None,
        position_embed=None,
        type_embed=None,
        return_all_logits=False,
    ):
        # Sum the manually built embeddings
        embeds = input_embed + position_embed
        if type_embed is not None:
            embeds += type_embed
        # Get predictions
        embeds = getattr(self.model, self.model_name).embeddings.dropout(
            getattr(self.model, self.model_name).embeddings.LayerNorm(embeds)
        )
        pred = self.model(
            inputs_embeds=embeds,
            attention_mask=attention_mask,
        )[0]
        # Return all logits or just maximum class
        if return_all_logits:
            return pred
        else:
            return pred.max(1).values
```
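
For context, here is roughly how I reproduce the inputs the wrapper expects, following the repo's approach. This is a minimal sketch under my own assumptions: the checkpoint, the model_name value, and the sample text are placeholders, and I build the position ids with RoBERTa's padding-index offset:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder checkpoint; any HF RoBERTa classifier behaves the same way.
model = AutoModelForSequenceClassification.from_pretrained("roberta-base")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model_name = "roberta"  # so getattr(model, model_name) is the bare RobertaModel

enc = tokenizer("a sample sentence", return_tensors="pt")
input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]
base = getattr(model, model_name)

# Word embeddings only -- no position/type information added yet.
input_embed = base.embeddings.word_embeddings(input_ids)

# Position embeddings built manually; RoBERTa offsets position ids by its
# padding index, so positions start at padding_idx + 1.
seq_len = input_ids.shape[1]
position_ids = torch.arange(seq_len).unsqueeze(0) + base.embeddings.padding_idx + 1
position_embed = base.embeddings.position_embeddings(position_ids)

# Token type embeddings (all-zero type ids for a single sequence).
type_embed = base.embeddings.token_type_embeddings(torch.zeros_like(input_ids))

wrapped = ForwardModel(model, model_name)
logits = wrapped(input_embed, attention_mask, position_embed, type_embed)
```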
However, when I pass the model just the word embeddings via the inputs_embeds argument, I get the same output as when I pass the token IDs via the input_ids argument, which means the model adds the position and type embeddings itself. Adding them manually to the word embeddings therefore produces a different output. It seems the authors may have done this because they assumed the model bypasses those embeddings when inputs_embeds is passed instead of IDs. I would appreciate it if someone could clarify this.
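
For reference, this is the kind of check I ran (a minimal sketch reusing the variables from the snippet above; from_pretrained returns the model in eval mode, so dropout is off and the outputs match exactly):

```python
with torch.no_grad():
    # Passing ids: the model builds word + position + type embeddings internally.
    logits_from_ids = model(input_ids=input_ids, attention_mask=attention_mask)[0]

    # Passing only the raw word embeddings via inputs_embeds.
    word_embeds = base.embeddings.word_embeddings(input_ids)
    logits_from_embeds = model(inputs_embeds=word_embeds, attention_mask=attention_mask)[0]

# The two match, so the model must be adding position/type embeddings itself.
print(torch.allclose(logits_from_ids, logits_from_embeds))
```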