I was thinking about messing around with activation functions that are a little less standard. I know the basics of making the class, implementing forward(), etc., but do more complicated functions require a backward function?
Is there a way to let torch handle it with autograd? Or, if the math I'm doing isn't covered by torch's built-in ops, should I just do the derivation by hand and put it in a backward function? (I guess I really just want to confirm whether I need a backward function and what it would look like to be properly inherited; I couldn't find an obvious documented example.)
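To make the question concrete, this is roughly what I have in mind. The activation itself is just a made-up placeholder, and I'm only assuming how the two approaches differ, which is exactly what I'd like confirmed:

```python
import torch
import torch.nn as nn

# Option A: plain nn.Module built only from existing torch ops.
# My understanding is autograd can differentiate this automatically,
# so no backward() would be needed.
class SwishLike(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(1.5 * x)

# Option B: explicit torch.autograd.Function with a hand-written backward.
# I assume this is only needed if the math can't be expressed with torch ops
# (or I want a custom gradient).
class SwishLikeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(1.5 * x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(1.5 * x)
        # d/dx [x * sigmoid(1.5x)] = sigmoid(1.5x) + 1.5 * x * sigmoid(1.5x) * (1 - sigmoid(1.5x))
        return grad_output * (s + 1.5 * x * s * (1 - s))

# Usage I'd expect: SwishLike()(x) for option A, SwishLikeFn.apply(x) for option B,
# and torch.autograd.gradcheck to sanity-check the hand-written backward.
```

Is option A really enough on its own, or does something about inheriting properly force me into option B?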