Changing how DQN agent explores
Hi,
I'm using a DQN agent with epsilon-greedy exploration. The problem is that my agent sees state 1 99% of the time, so it never learns to act in other states. By the time it learns to get to state 2 from state 1, epsilon has already decayed significantly and the agent gets stuck taking a sub-optimal action in state 2. Is there a way to implement some other form of exploration, like using a Boltzmann distribution? Thanks for your time.
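For reference, Boltzmann (softmax) exploration samples each action with probability proportional to exp(Q(s, a) / T), so exploration in a rarely visited state depends on how close its Q-values are rather than on a global epsilon that has already decayed. The snippet below is only a minimal, framework-agnostic sketch in plain Python. It is not the Reinforcement Learning Toolbox API, and the function names, the `temperature` parameter, and the example Q-values are assumptions made for illustration.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature=1.0):
    """Return softmax(Q / T) action probabilities for one state."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the largest Q-value before exponentiating for numerical stability.
    max_q = max(q_values)
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann distribution."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical Q-values for a single state with three actions.
    q_state = [1.0, 2.0, 0.5]
    print(boltzmann_probabilities(q_state, temperature=1.0))
    print(boltzmann_action(q_state, temperature=1.0))
```

A high temperature makes the choice close to uniform, and a low temperature makes it close to greedy. Because the probabilities are computed from the Q-values of the current state, a state whose Q-values are still similar keeps being explored even late in training.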
2 comments
Tanay Gupta
on 13 Jul 2021
Can you give a brief description of the states and the respective transitions?
Answers (0)