Using time as a negative reward in RL toolbox

Question

Amin Moradi on 24 Feb 2022

Direct link to this question

https://la.mathworks.com/matlabcentral/answers/1658035-using-time-as-a-negative-reward-in-rl-toolbox

Answered: Kartik Saxena on 30 Nov 2023

I want to use the RL Toolbox to train a DQN agent. Right now, I'm using the step function below to implement the reward function. The problem is that I don't know how to penalize the agent for taking too long to reach the objective. How should I add time to my reward function in this toolbox? Your help is appreciated.

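function [NextObs,Reward,IsDone,LoggedSignals] = WW6_StepFunction_genloss(Action,LoggedSignals)
% Custom step function: apply the action, update the state, return the reward.
a = Action;
obj = 4;                   % not used in this function
d = [1 2];
state = LoggedSignals.State;
% Advance the model; genloss is the third output of attack_eff_WW6.
[next_state, ~, genloss] = attack_eff_WW6(state, a, d);
LoggedSignals.State = next_state;
NextObs = LoggedSignals.State;
Down = nnz(~next_state);   % number of zero entries in the next state
IsDone = Down == 11;       % episode ends when 11 entries are zero
Reward = genloss;          % no time penalty yet
end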

Answer 1

Kartik Saxena on 30 Nov 2023

Direct link to this answer

https://la.mathworks.com/matlabcentral/answers/1658035-using-time-as-a-negative-reward-in-rl-toolbox#answer_1362957

Hi,

I understand that you want to add a time penalty to the reward function, so that the agent is punished for taking too long.

The following example in the MathWorks documentation should be useful for this purpose:

https://www.mathworks.com/help/reinforcement-learning/ug/create-matlab-environments-using-custom-functions.html

You can refer to it and introduce a penalty in your reward function by deducting from the reward as per your requirements, instead of adding '1' as the example does.
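As a rough illustration of that idea, here is a minimal sketch of a constant per-step penalty. The names stepPenalty and penalizedReward, the value 0.5, and the sample reward sequences are assumptions made up for this illustration. They are not part of the toolbox or of the original step function, and the penalty would need tuning against the scale of genloss.

```matlab
% Minimal sketch: subtract a fixed amount from the task reward on every step.
% stepPenalty and penalizedReward are hypothetical names, not toolbox API.
stepPenalty = 0.5;   % assumed value; tune it relative to the size of genloss
penalizedReward = @(taskReward) taskReward - stepPenalty;

% In the step function from the question, the last assignment would become:
%   Reward = penalizedReward(genloss);

% Illustration with made-up task rewards: both episodes collect the same
% task reward (10), but the longer one pays the penalty more often.
shortEpisode = [0 10];       % hypothetical 2-step episode
longEpisode  = [0 0 0 10];   % hypothetical 4-step episode
shortReturn = sum(arrayfun(penalizedReward, shortEpisode));
longReturn  = sum(arrayfun(penalizedReward, longEpisode));
fprintf('short: %g, long: %g\n', shortReturn, longReturn);
```

Because the step function is called once per agent step, a fixed deduction per call acts as a time penalty: the more steps the agent needs before IsDone becomes true, the lower its total reward for the episode.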

I hope this resolves your issue.
