How can we address the exploding gradient problem in DRL?
I am curious about how we can address the exploding gradient problem in deep reinforcement learning (DRL), particularly within Deep Q-Network (DQN) algorithms. Could you explain the exploding gradient problem in the context of DRL? Can DQN algorithms face the same exploding-gradient issues that are observed in supervised deep learning? One mitigation I have seen mentioned for deep networks is gradient clipping, but I am unsure whether it applies equally well to DQN training.
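For concreteness, here is a minimal sketch of the kind of DQN update step I have in mind, in PyTorch. The network sizes and the batch are arbitrary placeholders standing in for a real environment and replay buffer, and the `clip_grad_norm_` call at the end is just the gradient-clipping mitigation mentioned above, not something I have settled on:

```python
import torch
import torch.nn as nn

# Tiny Q-network and target network; layer sizes are placeholders.
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

# Fake batch standing in for samples drawn from a replay buffer.
states = torch.randn(32, 4)
actions = torch.randint(0, 2, (32, 1))
rewards = torch.randn(32, 1)
next_states = torch.randn(32, 4)
dones = torch.zeros(32, 1)

# Standard DQN temporal-difference loss against a frozen target net.
q_values = q_net(states).gather(1, actions)
with torch.no_grad():
    next_q = target_net(next_states).max(1, keepdim=True).values
    targets = rewards + gamma * (1 - dones) * next_q
loss = nn.functional.smooth_l1_loss(q_values, targets)

optimizer.zero_grad()
loss.backward()
# clip_grad_norm_ returns the total norm *before* clipping, so it can
# also be logged to watch for gradients blowing up during training.
total_norm = torch.nn.utils.clip_grad_norm_(q_net.parameters(), max_norm=10.0)
print(f"gradient norm before clipping: {total_norm:.3f}")
optimizer.step()
```

Is monitoring and clipping the gradient norm like this the right approach in DQN, or are there other techniques more specific to the RL setting?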