Why does step not return done = True when a function it calls sets self.done = True?
For example, I pass action 4 to the _apply_discrete_action function. Why doesn't the step method return done = True?
def _apply_discrete_action(self, action):
    x, y = self.position
    self.previous_position = self.position
    if action == 0:  # up
        self.position = (x, y - 1)
    elif action == 1:  # down
        self.position = (x, y + 1)
    elif action == 2:  # left
        self.position = (x - 1, y)
    elif action == 3:  # right
        self.position = (x + 1, y)
    elif action == 4:  # up-left
        self.position = (x - 1, y - 1)
    elif action == 5:  # up-right
        self.position = (x + 1, y - 1)
    elif action == 6:  # down-left
        self.position = (x - 1, y + 1)
    elif action == 7:  # down-right
        self.position = (x + 1, y + 1)
    # Processing: beyond map, obstacle collision
    if (self.position[0] < 0 or self.position[0] >= self.map_size[0] or
            self.position[1] < 0 or self.position[1] >= self.map_size[1] or
            self.obstacles[self.position[1]][self.position[0]] == 1):
        self.done = True

def step(self, action):
    if self.done:
        raise Exception("Episode has finished. Please reset the environment.")
    discrete_action, continuous_action = action
    self._apply_discrete_action(discrete_action)
    self._apply_continuous_action(continuous_action)
    self.steps += 1
    time_step = self._calculate_time()
    self.total_time += time_step
    self._calculate_energy_consumption(time_step)
    reward = self._calculate_reward()
    self.done = self._check_done()
    return self._get_state(), reward, self.done
Test code:
import numpy as np

for _ in range(10):
    action = (np.random.randint(8), np.random.uniform(-1, 1))  # random action
    next_state, reward, done = env.step(action)
    print(f"Action: {action}, Next state: {next_state}, Reward: {reward}, Done: {done}")
    if done:
        break
After all, I have a check for going beyond the two-dimensional grid map. Starting from the initial position [0, 0], action 4 takes me off the map, yet the step function does not return done = True. My actor definitely goes beyond the map, because the next coordinate is [-1, -1].
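One possible cause, sketched here under a stated assumption: near the end of step, the line self.done = self._check_done() assigns a fresh value to self.done, so a True set earlier by _apply_discrete_action would be replaced. The body of _check_done is not shown above, so the sketch assumes it does not repeat the bounds check. TinyEnv, step_overwriting and step_preserving are hypothetical names used only for this illustration.

```python
class TinyEnv:
    """Hypothetical stand-in for the environment; only the done logic is kept."""

    def __init__(self):
        self.position = (0, 0)
        self.map_size = (5, 5)
        self.done = False

    def _apply_discrete_action(self, action):
        x, y = self.position
        if action == 4:  # up-left, as in the question
            self.position = (x - 1, y - 1)
        px, py = self.position
        if px < 0 or px >= self.map_size[0] or py < 0 or py >= self.map_size[1]:
            self.done = True

    def _check_done(self):
        # ASSUMPTION: the real _check_done is not shown; here it ignores bounds.
        return False

    def step_overwriting(self, action):
        self._apply_discrete_action(action)
        self.done = self._check_done()  # replaces the True set just above
        return self.done

    def step_preserving(self, action):
        self._apply_discrete_action(action)
        # Keep an earlier True instead of overwriting it.
        self.done = self.done or self._check_done()
        return self.done


overwritten = TinyEnv().step_overwriting(4)
preserved = TinyEnv().step_preserving(4)
print(overwritten, preserved)  # False True
```

If that assumption holds for the real code, combining the two flags with or (or moving the bounds check into _check_done) would keep the out-of-map result. Printing self.done right after the call to _apply_discrete_action in step would confirm whether the flag is being set and then lost.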