import torch
import torch.nn as nn

test = nn.Linear(1440, 1440, bias=False)
hidden2 = torch.randn(400, 1440)
# Reuse the first 100 rows of hidden2 so the two outputs are comparable
hidden1 = hidden2[:100]
output1 = test(hidden1)
output2 = test(hidden2)
When I run the test above, shouldn't output1 and output2[:100, :] be exactly the same?
Some elements differ slightly. Does anyone know why?
This should be equivalent to a plain matmul, yet the results are not identical.
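
To put a number on the mismatch, here is a minimal sketch of how it could be measured. It assumes hidden1 is meant to be the first 100 rows of hidden2 (two independent torch.randn calls would give unrelated outputs). The seed and the tolerances are arbitrary choices for illustration, not values from my original run.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only so the comparison is repeatable

test = nn.Linear(1440, 1440, bias=False)
hidden2 = torch.randn(400, 1440)
hidden1 = hidden2[:100]  # assumption: same rows as the start of hidden2

with torch.no_grad():  # gradients are not needed for this comparison
    output1 = test(hidden1)
    output2 = test(hidden2)

# Bitwise equality: may be True or False depending on hardware and backend
exact_match = torch.equal(output1, output2[:100, :])

# Largest elementwise discrepancy between the two results
max_abs_diff = (output1 - output2[:100, :]).abs().max().item()

# Equality up to a tolerance (illustrative values, not a recommendation)
close = torch.allclose(output1, output2[:100, :], rtol=1e-4, atol=1e-4)

print("exact match:", exact_match)
print("max abs diff:", max_abs_diff)
print("allclose:", close)
```

If max_abs_diff turns out to be on the order of float32 rounding error, that would suggest the two calls accumulate the sums in a different order rather than a bug in the layer, though I have not confirmed which kernels are used for each shape.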