I'm working on a model that learns chess through DDQN reinforcement learning. The problem is in this specific snippet of the code:
state_tensor = torch.FloatTensor(state).unsqueeze(0) # Ensure state is a 2D tensor with shape (1, 768)
# Get Q-values for the current state
q_values = env.q_network(torch.FloatTensor(state_tensor))
# Mask out invalid actions
valid_actions = env.get_valid_actions() # This method should return a binary mask of valid actions
# Convert valid_actions to a tensor and reshape to match q_values shape
valid_actions = torch.tensor(valid_actions, dtype=torch.bool).unsqueeze(0) # Now valid_actions has shape [1, 4672]
# Assuming q_values is obtained from the neural network output
#q_values = torch.tensor(q_values, dtype=torch.float32) # Convert q_values to tensor if it isn't already
# Verify the shapes
print(f'q_values shape: {q_values.shape}') # Expected output: [1, 4672]
print(f'valid_actions shape: {valid_actions.shape}') # Expected output: [1, 4672]
# Apply the mask
q_values[~valid_actions] = float('-inf')
I keep getting this error:
IndexError: The shape of the mask [1, 4672] at index 1 does not match the shape of the indexed tensor [1, 1, 4672] at index 1
For some reason, the tensor always has one more dimension than the mask. I realize it is probably because of the "unsqueeze(0)", but I have tried countless combinations to give the tensor and the mask the same dimensions and none of them worked. In the end, both should have the shape [1, 4672].
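For reference, here is a minimal sketch that seems to reproduce the mismatch outside the chess environment. It rests on an assumption that is not confirmed above: that state already arrives with a batch dimension, i.e. shape (1, 768), so the extra unsqueeze(0) produces (1, 1, 768). The nn.Linear layer is only a hypothetical stand-in for env.q_network, and the mask is made up for illustration. The second half shows one possible way to get both shapes to [1, 4672], using reshape and masked_fill instead of in-place indexing.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for env.q_network: 768 inputs -> 4672 action values.
q_network = nn.Linear(768, 4672)

# Assumption: the state already carries a batch dimension, shape (1, 768).
state = torch.zeros(1, 768)

# An extra unsqueeze(0) then yields (1, 1, 768), and the output follows suit.
q_bad = q_network(state.unsqueeze(0))
print(q_bad.shape)  # torch.Size([1, 1, 4672])

# Reshaping to (1, -1) gives (1, 768) whether or not a batch dim was present.
state_tensor = torch.as_tensor(state, dtype=torch.float32).reshape(1, -1)
with torch.no_grad():
    q_values = q_network(state_tensor)  # shape (1, 4672)

# Made-up binary mask of legal moves, shape (1, 4672); first 10 moves legal.
valid_actions = torch.zeros(1, 4672, dtype=torch.bool)
valid_actions[0, :10] = True

# masked_fill returns a new tensor with the illegal actions set to -inf.
masked_q = q_values.masked_fill(~valid_actions, float('-inf'))
best_action = int(masked_q.argmax(dim=1))
```

If the real state turns out to be a flat array of 768 values, the reshape(1, -1) line still produces (1, 768), so the same masking step should apply either way.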