Training a DDPG agent for a MIMO system

Question

0 votes

I am trying to train a DDPG agent for a MIMO system. The problem is that the actor outputs (the action values) are very large and fall outside the action range. When these actions are passed to the environment, the resulting states are NaN. To work around this, I confined the actions to the desired bounds by applying a tanh activation at the actor's output layer and then scaling the result to the actual bounds. The action values are now within range, but they always sit at the upper bound, so I get a constant action throughout. I have not been able to solve this for a long time. Please help me with this.
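For reference, the tanh-and-scale step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the poster's actual network: the names `scale_action`, `tanh_gradient`, `low`, and `high` are hypothetical, and the bounds of -5 and 5 are made up. It also shows one possible (unconfirmed) reason for an action stuck at the upper bound: if the pre-activation value grows large, tanh saturates and its gradient is close to zero, so very little learning signal reaches the actor weights.

```python
import math


def scale_action(pre_activation, low, high):
    """Squash a raw actor output with tanh, then map [-1, 1] onto [low, high]."""
    squashed = math.tanh(pre_activation)
    return low + 0.5 * (squashed + 1.0) * (high - low)


def tanh_gradient(pre_activation):
    """Derivative of tanh; a value near zero means the layer is saturated."""
    return 1.0 - math.tanh(pre_activation) ** 2


if __name__ == "__main__":
    # Hypothetical bounds; larger pre-activations push the action to the upper bound.
    for z in (0.0, 2.0, 20.0):
        print(z, scale_action(z, -5.0, 5.0), tanh_gradient(z))
```

If the printed gradient is close to zero for typical pre-activation values, the inputs to the tanh layer are probably too large, which may point to unscaled observations or overly aggressive weight updates.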

1 comment

Yi Zhao on 22 Nov 2022

Hello, I have encountered the same problem. How did you solve it?


Answer 1

Hornett on 17 Sep 2024

0 votes

Hi Tanuja,

I understand that you are training a DDPG agent for a MIMO system, but the actions always sit at the upper bound and stay constant throughout. Here are a few things you can try to get sensible action values:

Exploration: DDPG works in a continuous action space, so it explores by adding noise to the actor output rather than by epsilon-greedy selection, which applies to discrete-action agents such as DQN. Try adjusting the noise settings (see NoiseOptions in rlDDPGAgentOptions), for example the standard deviation and its decay rate. This can allow the agent to explore a wider range of actions and potentially discover better control strategies.
Learning rate: If the learning rate is too high, the weights in your neural networks may change too drastically at each update, which can lead to instability and to a saturated, constant action. Try reducing the learning rate of the actor and the critic.
Reward function: The design of the reward function can significantly affect what the agent learns. Make sure that your reward function actually encourages the desired behaviour, for example by checking that it does not favour the largest possible action.
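The exploration suggestion above can be illustrated with a small sketch of noise-based exploration. This is generic Python, not the toolbox's implementation: the class `OUNoise`, the helper `noisy_action`, and all parameter values are hypothetical and chosen only for illustration. It assumes an Ornstein-Uhlenbeck process whose standard deviation decays over time, with the noisy action clipped to the action bounds.

```python
import math
import random


class OUNoise:
    """Ornstein-Uhlenbeck exploration noise with a decaying standard deviation."""

    def __init__(self, mean=0.0, attraction=0.15, std=0.3,
                 std_decay=1e-3, std_min=0.01, dt=1.0, seed=None):
        self.mean = mean
        self.attraction = attraction
        self.std = std
        self.std_decay = std_decay
        self.std_min = std_min
        self.dt = dt
        self.state = mean
        self.rng = random.Random(seed)

    def sample(self):
        # Pull the state toward the mean, then add a Gaussian kick.
        drift = self.attraction * (self.mean - self.state) * self.dt
        kick = self.std * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.state += drift + kick
        # Shrink the noise so the agent explores less as training goes on.
        self.std = max(self.std_min, self.std * (1.0 - self.std_decay))
        return self.state


def noisy_action(action, noise, low, high):
    """Add exploration noise to the actor output and clip to the action bounds."""
    return min(high, max(low, action + noise.sample()))


if __name__ == "__main__":
    noise = OUNoise(seed=0)
    for step in range(5):
        print(step, noisy_action(4.0, noise, -5.0, 5.0))
```

As a rough guide, the noise standard deviation is usually chosen as a small fraction of the action range. If it is too small relative to the range, the agent may never leave the saturated region; if it decays too quickly, exploration stops before the critic has learned anything useful.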

The following documentation pages may be useful for further reference:

Reinforcement Learning Agents: https://www.mathworks.com/help/reinforcement-learning/ug/create-agents-for-reinforcement-learning.html
DDPG Agents: https://www.mathworks.com/help/reinforcement-learning/ug/ddpg-agents.html
rlDDPGAgentOptions: https://www.mathworks.com/help/reinforcement-learning/ref/rl.option.rlddpgagentoptions.html
Define Reward and Observation Signals: https://www.mathworks.com/help/reinforcement-learning/ug/define-reward-and-observation-signals.html

Hope this helps!
