Multi-head attention for a 4-D tensor in PyTorch
I am trying to port a TensorFlow model to PyTorch, but I am having trouble with multi-head attention because of the dimensions of its input.
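To make the dimension problem concrete, here is a minimal sketch of one possible workaround. It rests on assumptions that are not stated above: that the 4-D tensor is laid out as (batch, time, seq, embed) and that attention should run over the `seq` axis only. `torch.nn.MultiheadAttention` accepts 3-D batched input, so the sketch folds the `time` axis into the batch axis before the call and unfolds it afterwards. The helper name `attend_over_axis2` and all of the sizes are made up for illustration.

```python
import torch
import torch.nn as nn


def attend_over_axis2(x: torch.Tensor, mha: nn.MultiheadAttention) -> torch.Tensor:
    """Self-attention over axis 2 of a (batch, time, seq, embed) tensor.

    Assumes `mha` was built with batch_first=True.
    """
    b, t, s, e = x.shape
    # Fold the time axis into the batch axis: (b, t, s, e) -> (b * t, s, e).
    flat = x.reshape(b * t, s, e)
    # Each of the b * t sequences is attended to independently.
    out, _ = mha(flat, flat, flat, need_weights=False)
    # Restore the original 4-D layout.
    return out.reshape(b, t, s, e)


torch.manual_seed(0)
# Illustrative sizes only: embed_dim must be divisible by num_heads.
mha = nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True)
mha.eval()

x = torch.randn(2, 3, 5, 16)  # (batch, time, seq, embed)
y = attend_over_axis2(x, mha)
print(y.shape)  # torch.Size([2, 3, 5, 16])
```

If the attention should instead run over a different axis, the same idea should apply after moving that axis into position 2 with `permute` and moving it back afterwards.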