Custom Parametric Activation Function Leading to NaN Loss and Weights
I am trying to create a custom non-linear activation function and use it in a 3-hidden-layer neural network for MNIST classification. When I use ReLU the network trains fine, but when I use my custom activation function (a Parametric SoftPlus), the network gives me NaN loss and weights.
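For context, here is a minimal, framework-agnostic NumPy sketch of the issue I suspect: assuming the common parametric form f(x) = log(1 + exp(βx))/β, the naive implementation overflows to inf for large βx, which then turns into NaN once it flows through the loss and gradients. A numerically stable rewrite using the identity log(1 + e^z) = max(z, 0) + log1p(e^(-|z|)) stays finite. (The function names and the β parameter here are illustrative, not from any particular framework.)

```python
import numpy as np

def parametric_softplus_naive(x, beta=1.0):
    # Naive form: exp(beta * x) overflows to inf for large inputs,
    # and inf in the forward pass typically becomes NaN in the backward pass.
    return np.log(1.0 + np.exp(beta * x)) / beta

def parametric_softplus_stable(x, beta=1.0):
    # Stable form: log(1 + e^z) = max(z, 0) + log1p(e^(-|z|)),
    # so the argument of exp is always <= 0 and can never overflow.
    z = beta * x
    return (np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))) / beta

x = np.array([-1000.0, 0.0, 1000.0])
print(parametric_softplus_naive(x))   # the 1000.0 entry overflows to inf
print(parametric_softplus_stable(x))  # finite everywhere
```

If the framework's activation is written the naive way (or β is trainable and grows large), this alone can explain the NaN loss and weights.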