I’m working on an RL problem where, in a nutshell, an agent has to go from point A to point B, in that order, in as few steps as possible. I’m using DQN with PyTorch to train the agent.
During training, the agent does learn how to perform the task. At the end of the training I save the model state dictionary.
But when I load the model state dictionary to “test” the agent (for example, to show other people that the agent/model works), the performance drops significantly: the agent doesn’t work.
I’ve searched the internet for the proper way to do this, but for some reason it’s not working for me.
Basically what I’m doing is as follows:
- Load the model state dictionary.
- Set the model to eval.
- Run inference inside with torch.no_grad().
- Get the prediction with model(…).
- Select the prediction with the highest “probability”.
Like this:
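import torch

# Load the weights saved at the end of training into an already constructed model
state_dict = torch.load('model_trained_state_dict.pth')
model.load_state_dict(state_dict)
model.eval()  # switch layers such as dropout and batch norm to evaluation mode

with torch.no_grad():  # no gradients are needed when only selecting actions
    actions = model(observation)
    action = torch.argmax(actions).item()  # index of the highest-scoring action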
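To rule out the save/load step itself, a round-trip check along these lines might help. This is only a minimal sketch: QNetwork, its layer sizes, the 4-dimensional observation, and the greedy_action helper are hypothetical stand-ins, not my actual model or environment. It saves a state dictionary, loads it into a fresh instance, and compares the outputs and the greedy action of both models on the same observation.

```python
import os
import tempfile

import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Hypothetical stand-in for the trained DQN; sizes are placeholders."""

    def __init__(self, obs_dim: int = 4, n_actions: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 32),
            nn.ReLU(),
            nn.Linear(32, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def greedy_action(model: nn.Module, observation: torch.Tensor) -> int:
    """Return the index of the largest output for a single observation."""
    model.eval()
    with torch.no_grad():
        q_values = model(observation.unsqueeze(0))  # add a batch dimension
    return int(torch.argmax(q_values, dim=1).item())


torch.manual_seed(0)
trained = QNetwork()  # stands in for the model at the end of training
observation = torch.randn(4)  # assumed observation shape

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_trained_state_dict.pth")
    torch.save(trained.state_dict(), path)

    restored = QNetwork()  # must be built with the same architecture
    restored.load_state_dict(torch.load(path))

# Compare raw outputs and the selected action of both models.
trained.eval()
restored.eval()
with torch.no_grad():
    same_outputs = torch.allclose(trained(observation), restored(observation))
same_action = greedy_action(trained, observation) == greedy_action(restored, observation)
print(same_outputs, same_action)
```

If both values come out True for the real model as well, the weights are being restored correctly, and the difference between training and testing would have to come from somewhere else, such as how the observation is built or preprocessed at test time.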