I have built a deep Q-network (DQN) that analyses the states of the asteroids and the rocket and outputs the best action to perform (left, right, shoot, idle). But the agent keeps spamming bullets.
Can anyone help me solve this problem?
I thought the agent would explore all 4 actions, but it hardly moves left or right: it stays almost at the centre and keeps spamming bullets.
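
For reference, here is a minimal sketch of the kind of exploration I expected, assuming epsilon-greedy action selection. The names `select_action` and `action_histogram` and the example Q-values are hypothetical illustrations, not taken from my actual network code. It shows that with a high epsilon all four actions get tried, while with epsilon near zero the agent only picks whichever action has the largest Q-value.

```python
import random
from collections import Counter

# The four actions from the question; the index order is an assumption.
ACTIONS = ["left", "right", "shoot", "idle"]

def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

def action_histogram(q_values, epsilon, steps=10_000, seed=0):
    """Count how often each action is chosen for fixed (hypothetical) Q-values."""
    rng = random.Random(seed)
    counts = Counter(select_action(q_values, epsilon, rng) for _ in range(steps))
    return {ACTIONS[i]: counts.get(i, 0) for i in range(len(ACTIONS))}

if __name__ == "__main__":
    q = [0.1, 0.2, 0.9, 0.3]  # made-up Q-values where "shoot" dominates
    print(action_histogram(q, epsilon=1.0))  # fully random: all actions appear
    print(action_histogram(q, epsilon=0.0))  # fully greedy: only "shoot"
```

Logging a histogram like this during training would show whether epsilon has already decayed too far, or whether the Q-value for shoot simply dominates the other three.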