Tag Archive for python, numpy, reinforcement-learning

R_max algorithm doesn’t converge to the right policy

I have a task where I need to implement the R_max algorithm with modified policy iteration on the Frozen Lake problem.
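For reference, here is a minimal sketch of the two pieces such a setup usually needs. It is not the code from this question. It assumes the standard R_max construction, where any state-action pair visited fewer than `m` times is treated as unknown and redirected to a fictitious absorbing state that pays `r_max` forever. The names `build_rmax_model`, `modified_policy_iteration`, `counts`, `reward_sums`, `m` and `eval_sweeps` are hypothetical, chosen only for illustration.

```python
import numpy as np


def build_rmax_model(counts, reward_sums, r_max, m):
    """Build the optimistic R_max model from visit statistics.

    counts:      (S, A, S) array, number of observed s -> s' transitions per action
    reward_sums: (S, A) array, sum of rewards observed for each (s, a)
    Returns P of shape (S+1, A, S+1), R of shape (S+1, A), and the (S, A) known mask.
    The extra last state is the fictitious absorbing state (an assumption of this sketch).
    """
    n_states, n_actions, _ = counts.shape
    n_sa = counts.sum(axis=2)          # visits per (s, a)
    known = n_sa >= m

    # Default: every pair is "unknown", jumps to the fictitious state, earns r_max.
    P = np.zeros((n_states + 1, n_actions, n_states + 1))
    P[:, :, -1] = 1.0
    R = np.full((n_states + 1, n_actions), float(r_max))

    # Known pairs use the empirical transition and reward estimates instead.
    s_idx, a_idx = np.nonzero(known)
    visits = n_sa[s_idx, a_idx]
    P[s_idx, a_idx, :-1] = counts[s_idx, a_idx] / visits[:, None]
    P[s_idx, a_idx, -1] = 0.0
    R[s_idx, a_idx] = reward_sums[s_idx, a_idx] / visits
    return P, R, known


def modified_policy_iteration(P, R, gamma=0.95, eval_sweeps=5,
                              max_iters=1000, tol=1e-8):
    """Greedy improvement followed by a few partial evaluation sweeps."""
    n_states = P.shape[0]
    states = np.arange(n_states)
    V = np.zeros(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iters):
        # Policy improvement: act greedily with respect to the current V.
        Q = R + gamma * (P @ V)        # shape (S, A)
        policy = Q.argmax(axis=1)
        V_new = Q.max(axis=1)

        # Partial policy evaluation: a fixed number of backups under the policy.
        P_pi = P[states, policy]       # shape (S, S)
        R_pi = R[states, policy]       # shape (S,)
        for _ in range(eval_sweeps):
            V_new = R_pi + gamma * (P_pi @ V_new)

        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return policy, V
```

In use, the agent would update `counts` and `reward_sums` after each environment step, rebuild the model whenever a pair crosses the `m` threshold, and replan. Because unknown pairs look maximally rewarding, the greedy policy is pulled toward them until they become known. If the learned policy looks wrong, one thing worth checking is whether `r_max` really is an upper bound on the reward and whether `m` is large enough for a stochastic environment.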