In the code below, the `.sum` call is valid for dLdB2 but not for dLdB1. Can someone explain the reason for this?
import numpy as np
from numpy import ndarray
from typing import Dict

def sigmoid(x: ndarray) -> ndarray:
    # Sigmoid helper (assumed; its definition is not shown in the original post)
    return 1 / (1 + np.exp(-x))

def loss_gradients(forward_info: Dict[str, ndarray],
                   weights: Dict[str, ndarray]) -> Dict[str, ndarray]:
    '''
    Compute the partial derivatives of the loss with respect to
    each of the parameters in the neural network.
    '''
    dLdP = -(forward_info['y'] - forward_info['P'])
    dPdM2 = np.ones_like(forward_info['M2'])
    dLdM2 = dLdP * dPdM2
    dPdB2 = np.ones_like(weights['B2'])
    dLdB2 = (dLdP * dPdB2).sum(axis=0)

    dM2dW2 = np.transpose(forward_info['O1'], (1, 0))
    dLdW2 = np.dot(dM2dW2, dLdP)

    dM2dO1 = np.transpose(weights['W2'], (1, 0))
    dLdO1 = np.dot(dLdM2, dM2dO1)

    dO1dN1 = sigmoid(forward_info['N1']) * (1 - sigmoid(forward_info['N1']))
    dLdN1 = dLdO1 * dO1dN1

    dN1dB1 = np.ones_like(weights['B1'])
    dN1dM1 = np.ones_like(forward_info['M1'])
    dLdB1 = (dLdN1 * dN1dB1).sum(axis=0)
    dLdM1 = dLdN1 * dN1dM1

    dM1dW1 = np.transpose(forward_info['X'], (1, 0))
    dLdW1 = np.dot(dM1dW1, dLdM1)

    loss_gradients: Dict[str, ndarray] = {}
    loss_gradients['W2'] = dLdW2
    loss_gradients['B2'] = dLdB2.sum(axis=0)
    loss_gradients['W1'] = dLdW1
    loss_gradients['B1'] = dLdB1.sum(axis=0)

    return loss_gradients
My thought was that the interpreter treats dLdB2 as an array whereas dLdB1 is not one.
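To illustrate what I mean about the shapes, here is a small sketch with toy shapes standing in for the real network (the actual sizes in my network may differ): after the first `.sum(axis=0)` inside the function, the bias gradient is already a 1-D array, so the second `.sum(axis=0)` in the dictionary assignment collapses it to a 0-D scalar.

```python
import numpy as np

# Toy shapes standing in for the real network: batch=4, output=1.
dLdP = np.ones((4, 1))   # like dLdP: (batch, output)
dPdB2 = np.ones((1,))    # like B2: (output,)

dLdB2 = (dLdP * dPdB2).sum(axis=0)
print(dLdB2.shape)       # (1,) -- already summed over the batch axis

# Summing a 1-D array along axis 0 again produces a 0-D scalar:
print(dLdB2.sum(axis=0).shape)  # ()
```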