Analytical gradient of softmax entropy loss does not match the numerical gradient
I’m trying to implement the gradient of the softmax entropy loss in Python, but the analytical gradient does not match the numerical gradient. Here is my Python code: