While reading a book, I came across this statement:
Now, the reason we don’t initialize the weights to zero is that the learning rate (eta) only has an effect on the classification outcome if the weights are initialized to non-zero values. If all the weights are initialized to zero, the learning rate parameter eta affects only the scale of the weight vector, not the direction.
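To check that I at least read the claim correctly, here is a minimal sketch of what I think it says. It assumes the classic perceptron update rule with a threshold at 0 and labels of +1/-1; the tiny dataset, the function names `train_perceptron` and `unit_direction`, and the non-zero starting weights are all made up by me for illustration and do not come from the book.

```python
import math

def train_perceptron(X, y, eta, w_init, epochs=10):
    """Perceptron learning rule; w[0] is the bias weight (assumed setup)."""
    w = list(w_init)
    for _ in range(epochs):
        for xi, target in zip(X, y):
            net = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            pred = 1 if net >= 0.0 else -1
            update = eta * (target - pred)  # zero when the prediction is right
            w[0] += update
            for j, xj in enumerate(xi):
                w[j + 1] += update * xj
    return w

def unit_direction(w):
    """Divide the vector by its length so that only the direction is left."""
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]

# Made-up toy data: 2 features, labels +1 / -1.
X = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]

# Zero initialization with two different learning rates.
w_small = train_perceptron(X, y, eta=0.1, w_init=[0.0, 0.0, 0.0])
w_large = train_perceptron(X, y, eta=1.0, w_init=[0.0, 0.0, 0.0])
print(w_small, unit_direction(w_small))
print(w_large, unit_direction(w_large))

# Non-zero initialization (arbitrary values) with the same two learning rates.
v_small = train_perceptron(X, y, eta=0.1, w_init=[0.5, -0.3, 0.2])
v_large = train_perceptron(X, y, eta=1.0, w_init=[0.5, -0.3, 0.2])
print(v_small, unit_direction(v_small))
print(v_large, unit_direction(v_large))
```

On this toy data, the two zero-initialized runs end up with weight vectors that differ only by a factor of 10 and have the same unit direction, while the two runs that start from non-zero weights end up pointing in different directions.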
I followed the answer at this link:
https://datascience.stackexchange.com/questions/26134/initialize-perceptron-weights-with-zero#:~:text=If%20you%20initialize%20all%20weights,vector%2C%20not%20the%20direction%22.
But I still don't get the point.
Suppose I have 2 features, which means 3 weights (including the bias).
Does this mean the weight vector W = (w1, w2, w3) is represented as a single point in a 3-dimensional coordinate system?
In that case, the direction would be the line from (0, 0, 0) to (w1, w2, w3), and the magnitude would be the distance between those two points.
Or is each weight (w1, w2, and so on) represented on its own in the coordinate system, so that all three weights have their own direction and magnitude?
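To make the first interpretation concrete, this is a small sketch of what I mean by magnitude and direction when W is treated as one vector in 3-D space. The numbers in `W` are made up purely for illustration.

```python
import math

# Hypothetical weight vector (w1, w2, w3), treated as one point in 3-D space.
W = (3.0, 4.0, 12.0)

# Magnitude: distance from the origin (0, 0, 0) to the point W.
magnitude = math.sqrt(sum(w * w for w in W))

# Direction: W divided by its magnitude, giving a vector of length 1.
unit_W = tuple(w / magnitude for w in W)

# Multiplying every weight by the same positive number changes the
# magnitude but leaves the unit vector (the direction) unchanged.
scaled = tuple(2.0 * w for w in W)
scaled_magnitude = math.sqrt(sum(w * w for w in scaled))
unit_scaled = tuple(w / scaled_magnitude for w in scaled)

print(magnitude, unit_W)
print(scaled_magnitude, unit_scaled)
```

Under this reading, "only the scale changes" would mean that the whole vector gets longer or shorter while still pointing the same way.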
If the weights are initialized to 0, why does the direction not change? Why does only the scale change?
I am new to algebra, so it would be great if you could explain the whole scenario to me in a very simple way.