I want to understand the data flow in Tianshou. Does the library pre-process the data, or do we have to do it ourselves?
I have read the example here. The Tic-Tac-Toe PettingZoo environment has a Dict observation space. Each observation contains the keys 'observation' and 'action_mask':
Dict('action_mask': Box(0, 1, (9,), int8), 'observation': Box(0, 1, (3, 3, 2), int8))
But the policy only has a model for 'observation' (input dim = 18 = 3x3x2):
DQNPolicy(
  (model): Net(
    (model): MLP(
      (model): Sequential(
        (0): Linear(in_features=18, out_features=128, bias=True)
        (1): ReLU()
        (2): Linear(in_features=128, out_features=128, bias=True)
        (3): ReLU()
        (4): Linear(in_features=128, out_features=128, bias=True)
        (5): ReLU()
        (6): Linear(in_features=128, out_features=128, bias=True)
        (7): ReLU()
        (8): Linear(in_features=128, out_features=9, bias=True)
      )
    )
  )
  (model_old): Net(
    (model): MLP(
      (model): Sequential(
        (0): Linear(in_features=18, out_features=128, bias=True)
        (1): ReLU()
        (2): Linear(in_features=128, out_features=128, bias=True)
        (3): ReLU()
        (4): Linear(in_features=128, out_features=128, bias=True)
        (5): ReLU()
        (6): Linear(in_features=128, out_features=128, bias=True)
        (7): ReLU()
        (8): Linear(in_features=128, out_features=9, bias=True)
      )
    )
  )
)
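To make the question concrete, here is a minimal NumPy sketch of what I guess happens to a single observation. It is not Tianshou code. The two steps (flattening the board before the first Linear layer, and using 'action_mask' to exclude illegal moves from the 9 outputs) are my assumptions, and the Q-values are made-up placeholders.

```python
import numpy as np

# Shapes taken from the Dict space printed above; the values are placeholders.
observation = np.zeros((3, 3, 2), dtype=np.int8)
action_mask = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0], dtype=np.int8)

# Assumption: the 3x3x2 board is flattened to 18 values for Linear(in_features=18).
flat_obs = observation.reshape(-1)

# Stand-in for the network output: one Q-value per board cell (9 actions).
q_values = np.linspace(0.0, 1.0, num=9)

# Assumption: illegal actions (mask == 0) are excluded before taking the argmax.
masked_q = np.where(action_mask == 1, q_values, -np.inf)
best_action = int(np.argmax(masked_q))
```

If this guess is right, the 'action_mask' entry never reaches the network as an input, which would explain why the model only has 18 input features.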
I tried to understand this by reading the source code, but I couldn't find the relevant part. Can anyone explain it to me? Thanks.