I'm working on creating an agent that can learn the sine function as a simple warm-up, but I'm already stuck. After a few iterations the agent improves to a certain level, then plateaus and stops learning no matter how long I let training run. My code:
obs_info = rlNumericSpec([1 1], LowerLimit=-pi, UpperLimit=pi);
obs_info.Name = "Sinus Value";
act_info = rlNumericSpec([1 1], LowerLimit=-1, UpperLimit=1);
act_info.Name = "Predicted Value";
reset_fcn_handle = @()reset_train();
step_fcn_handle = @(action, portfolio)step_train(action, portfolio);
sinus_train_env = rlFunctionEnv( ...
obs_info, act_info, step_fcn_handle, reset_fcn_handle);
function [initial_observation, portfolio] = reset_train()
    % Start each episode at a random angle in [-pi, pi)
    initial_observation = 2*pi*rand(1) - pi;
    portfolio.LastValue = initial_observation;
end
function [next_observation, reward, is_done, portfolioOut] = step_train(action, portfolio)
    % Reward is highest when the action matches sin() of the last observed angle
    expected_prediction = sin(portfolio.LastValue);
    reward = 1 / 100 / (0.01 + abs(action - expected_prediction));
    % Draw the next random angle; episodes never terminate on their own
    next_observation = 2*pi*rand(1) - pi;
    is_done = false;
    portfolioOut = portfolio;
    portfolioOut.LastValue = next_observation;
end
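Before opening the Designer, the environment can be sanity-checked from the command line; this is just a quick check I sketched (validateEnvironment, reset, and step are the standard toolbox calls for a MATLAB environment object):
% Quick sanity check of the custom environment (not part of the original setup)
validateEnvironment(sinus_train_env);              % verifies the specs against reset/step
obs0 = reset(sinus_train_env);                     % random angle in [-pi, pi)
[obs1, reward1, done1] = step(sinus_train_env, 0); % predict 0 for the first angle
fprintf("next obs = %.3f, reward = %.3f, done = %d\n", obs1, reward1, done1);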
I use the Reinforcement Learning Designer to construct the agent. The "Compatible algorithm" is set to TD3 (default option) and the number of hidden units is 32. The hyperparameters and Exploration Model settings:
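For reference, the command-line equivalent of that Designer setup should be roughly the following sketch (default TD3 options; the Exploration Model values from the Designer screenshot are not reproduced here):
% Approximate command-line equivalent of the Designer setup
init_opts = rlAgentInitializationOptions(NumHiddenUnit=32);
agent = rlTD3Agent(obs_info, act_info, init_opts);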
For training I use Max Episode Length = 1000, Average Window Length = 5, Stopping Criteria = AverageReward, and Stopping Value = 900. The result after 30 minutes:
The result after an hour of training:
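For reference, those training settings correspond roughly to this command-line setup (a sketch using rlTrainingOptions; everything not listed stays at its default):
% Command-line equivalent of the training settings (sketch)
train_opts = rlTrainingOptions( ...
    MaxStepsPerEpisode=1000, ...
    ScoreAveragingWindowLength=5, ...
    StopTrainingCriteria="AverageReward", ...
    StopTrainingValue=900);
training_stats = train(agent, sinus_train_env, train_opts);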
I tried to modify the reward function and let it run for two hours:
reward = 1 / (0.01 + abs(action - expected_prediction));
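The only difference to the original reward is the factor of 100, which a quick plot of both shapes over the prediction error makes explicit (illustrative snippet):
% Comparing the original and modified reward as a function of the prediction error
err = linspace(0, 2, 200);              % |action - sin(x)| can range over [0, 2]
r_original = 1/100 ./ (0.01 + err);     % peaks at 1 for a perfect prediction
r_modified = 1 ./ (0.01 + err);         % peaks at 100 for a perfect prediction
plot(err, r_original, err, r_modified);
legend("1/100/(0.01+|e|)", "1/(0.01+|e|)");
xlabel("|action - sin(x)|"); ylabel("reward");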
The result after 30 minutes of training:
Second try at modifying the reward function:
if (abs(action-expected_prediction) > 0.05)
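Written out in full, this thresholded reward looks roughly like the following; only the condition is shown above, so the branch values here are illustrative placeholders:
% Thresholded ("sparse") reward; branch values are illustrative placeholders
if (abs(action - expected_prediction) > 0.05)
    reward = 0;    % prediction outside the +/-0.05 tolerance band
else
    reward = 1;    % prediction within tolerance
end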
The result:
As you can see, none of the results show a sine wave. No matter how long I let it run (I even let it run overnight), the result is always one of the images above and the learning process always gets stuck at a certain level.