I am trying to add a learning rate scheduler to my PyTorch neural network. At the very start of training I get a loss of about 2.5 with a scheduler (every scheduler I tried gives similar results) and about 1.5 without one.
I tried several of the schedulers that PyTorch offers, and the gap in loss between training with and without a scheduler stays consistent.
So my question is: does the scheduler affect the learning process before it first changes the learning rate?
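One way I could check this is to read the optimizer's learning rate right after the scheduler is constructed, before any training step. The snippet below is only a minimal sketch: the `nn.Linear` model, the `lr_after_construction` helper, and all hyperparameter values are placeholders I made up for illustration, not my actual setup. It compares `StepLR` with `OneCycleLR`, which, as far as I understand, starts from `max_lr / div_factor` rather than from the optimizer's own `lr`.

```python
import torch
from torch import nn, optim

# Hypothetical stand-in model; the real network is not shown in this post.
model = nn.Linear(4, 2)


def lr_after_construction(make_scheduler, base_lr=0.1):
    """Return the optimizer's learning rate right after a scheduler is attached."""
    optimizer = optim.SGD(model.parameters(), lr=base_lr)
    make_scheduler(optimizer)  # constructing the scheduler may already rewrite lr
    return optimizer.param_groups[0]["lr"]


# StepLR only decays every `step_size` scheduler steps, so lr should still be base_lr.
step_lr = lr_after_construction(
    lambda opt: optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.1)
)

# OneCycleLR begins its warm-up at max_lr / div_factor (div_factor defaults to 25).
one_cycle_lr = lr_after_construction(
    lambda opt: optim.lr_scheduler.OneCycleLR(opt, max_lr=0.1, total_steps=100)
)

print("StepLR initial lr:    ", step_lr)
print("OneCycleLR initial lr:", one_cycle_lr)
```

If the printed value for a given scheduler differs from the `lr` passed to the optimizer, then that scheduler has already changed the learning rate before the first training step. That may or may not explain the gap I am seeing, since it also shows up with schedulers that should leave the initial rate untouched.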
Thanks