DQN gradients go to infinity
I tried to implement a DQN model myself. From what I have read, the target the network is trained towards should be the same as the output it already produces, except for the Q-value of the best action. I implemented it like this:
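The implementation itself is not shown in the post. The following is only a minimal sketch of the target construction described above, not the poster's code. It assumes the replaced entry is the one for the action actually taken in the stored transition (the usual DQN convention, which may differ from the "best action" wording above), and the names `build_targets`, `q_current`, `q_next`, `dones` and `gamma` are hypothetical.

```python
def build_targets(q_current, q_next, actions, rewards, dones, gamma=0.99):
    """Copy the network's own predictions and overwrite one entry per row.

    q_current: predicted Q-values for each state, one list per transition
    q_next:    predicted Q-values for each next state
    actions:   index of the action taken in each transition (assumption)
    """
    # Copy each row so the original predictions are not modified.
    targets = [row[:] for row in q_current]
    for i, action in enumerate(actions):
        # No bootstrapped value on terminal transitions.
        bootstrap = 0.0 if dones[i] else gamma * max(q_next[i])
        # Only this entry differs from the prediction, so the error
        # for all other actions is exactly zero.
        targets[i][action] = rewards[i] + bootstrap
    return targets


q_current = [[1.0, 2.0], [0.5, 0.1]]
q_next = [[0.0, 3.0], [4.0, 1.0]]
targets = build_targets(q_current, q_next, actions=[1, 0],
                        rewards=[1.0, -1.0], dones=[False, True], gamma=0.5)
print(targets)
```

With a target built this way, the loss only pushes on one output per sample. If the gradients still blow up, it may be worth checking that the targets are treated as constants (no gradient flowing through them) and that the rewards are on a reasonable scale.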
DQN algorithm does not converge
I’m a software engineering student.
I implemented the DQN algorithm myself for an autonomous driving project that I am doing as part of my studies.
For two weeks I have been trying to train the algorithm, but it fails to converge to a proper solution.
I am appealing to anyone who understands the field: if you are willing to help, either by going over the project with me in a Zoom call or by reviewing a ZIP file that I send you, it would help me a lot and allow me to continue with the project.
For contact: [email protected]