Reinforcement Learning algorithm to tune PID parameters of a system
- Hello, I am trying to tune a PID controller for a second-order system model using Reinforcement Learning, following the example Tune PI Controller Using Reinforcement Learning in MATLAB ( https://it.mathworks.com/help/reinforcement-learning/ug/tune-pi-controller-using-td3.html ).
- I created the system model with a PID controller, “System_PID.slx”, and simulated it with satisfactory results.
- I also created the system model with the RL Agent block, “rl_PID_Tune_mod.slx”.
- I am running the code for 100 iterations.
- The ‘Agent’ loaded into the workspace after training also gives satisfactory performance when used as a standalone controller.
- However, while values of Kp, Ki, and Kd are generated, the proportional gain (Kp) extracted from training is nearly zero. As a result, the Kp, Ki, and Kd extracted from the ‘Agent’ do not give satisfactory results, and the closed-loop system becomes unstable.
- So where is the problem in the configuration of the code given below? Any help is highly appreciated.
simOpts = rlSimulationOptions('MaxSteps',maxsteps);
experiences = sim(env,agent,simOpts);
actor = getActor(agent);
parameters = getLearnableParameters(actor); % actor weights hold the gains
Ki = abs(parameters{1}(1))
Kp = abs(parameters{1}(2))
Kd = abs(parameters{1}(3))
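As a quick sanity check of the extracted gains, you can form a pid object from them and test closed-loop stability directly (a sketch; G(s) = 1/(s^2 + s) below is an assumed example plant, so substitute your actual second-order model):

```matlab
% Assumed example second-order plant; replace with your own model.
G  = tf(1, [1 1 0]);
C  = pid(Kp, Ki, Kd);     % controller built from the extracted gains
CL = feedback(C*G, 1);    % unity-feedback closed loop
isstable(CL)              % returns 0 if these gains destabilize the loop
```

Note also that the ordering [Ki Kp Kd] inside parameters{1} depends on how the actor network was defined; if your network layout differs from the example's, the nearly-zero value may simply be read from the wrong weight.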
Answers (1)
I didn't check everything, but I noticed that you reused the same quadratic cost function (reward, in RL terms) from the 'rlwatertankPIDTune.slx' example for your second-order plant. The water level control system in a tank is a first-order system. In that example, the cost is chosen as

J = ∫ ( q*e(t)^2 + r*u(t)^2 ) dt

where the weighing factors q and r penalize the tracking error e = ref − y and the control effort u, and a PI controller is sufficient. Since you want to tune a PID controller for a second-order system using RL, perhaps it is appropriate to define the cost as

J = ∫ ( q1*e(t)^2 + q2*(de/dt)^2 + r*u(t)^2 ) dt

where q1 and q2 weigh the error and its derivative, and r weighs the control effort. One challenge is that you need to estimate dy/dt, because only the output y of the second-order system is measurable.
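A per-step reward along those lines could be sketched as follows (q1, q2, and rw are assumed weighting factors, and the error derivative is approximated by a finite difference over the sample time Ts):

```matlab
function r = pidTuneReward(e, ePrev, u, Ts)
% Hypothetical quadratic cost, returned as a negative reward.
% q1, q2, rw are assumed weights; tune them for your plant.
q1 = 1; q2 = 0.1; rw = 0.01;
de = (e - ePrev)/Ts;               % finite-difference estimate of de/dt
r  = -(q1*e^2 + q2*de^2 + rw*u^2); % penalize error, its derivative, and effort
end
```

Called from the environment's step/reward logic, this makes the agent trade off tracking error, error rate, and control effort rather than error and effort alone.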
For example, for the second-order plant G(s) = 1/(s^2 + s):

a = 1; % tf numerator
b = [1 1 0]; % tf denominator: s^2 + s
% [A, B, C, D] = tf2ss(a, b)
sys = ss(tf(a, b)) % state-space model
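Since dy/dt is not measured directly, one common workaround is to approximate it with a filtered differentiator applied to the measured output (a sketch; the plant, filter time constant tau, and time grid below are assumptions for illustration):

```matlab
% Sketch: estimate dy/dt from the measured output with a filtered
% differentiator s/(tau*s + 1). tau is an assumed filter time constant,
% chosen faster than the plant dynamics.
G   = tf(1, [1 1 0]);           % example second-order plant
tau = 0.01;
Gd  = tf([1 0], [tau 1]);       % filtered differentiator
t   = 0:0.01:10;
y   = step(feedback(G, 1), t);  % example closed-loop output samples
dy  = lsim(Gd, y, t);           % estimate of dy/dt from y alone
```

The same estimate can be fed to the agent as an observation, or used to build the de/dt term of the reward.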
In continuous-time and in the absence of Gaussian noise, the PID controller gains may be tuned as follows:
Kp = 0.75;
Ki = 0;
Kd = 0.5;
N = 3;
Gpid = pid(Kp, Ki, Kd, 1/N) % derivative filter time constant Tf = 1/N
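To see how these gains behave on the example plant, you can close the loop and inspect the step response (assuming the same G(s) = 1/(s^2 + s) as above):

```matlab
G    = tf(1, [1 1 0]);          % assumed second-order plant
Gpid = pid(0.75, 0, 0.5, 1/3);  % Kp, Ki, Kd, Tf = 1/N with N = 3
CL   = feedback(Gpid*G, 1);     % unity-feedback closed loop
step(CL), grid on               % inspect overshoot and settling time
```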