Reinforcement Learning algorithm to tune PID parameters of a system
- Hello, I am trying to tune a PID controller for a second-order system model using Reinforcement Learning, following the example Tune PI Controller Using Reinforcement Learning in MATLAB ( https://it.mathworks.com/help/reinforcement-learning/ug/tune-pi-controller-using-td3.html ).
- I created the system model with a PID controller, “System_PID.slx”, and simulated it with satisfactory results.
- I also created the system model with the RL Agent block, “rl_PID_Tune_mod.slx”.
- I am running the code for 100 iterations.
- The ‘Agent’ loaded into the workspace after training also gives satisfactory performance when used as a standalone controller.
- However, while values of Kp, Ki, and Kd are generated, the proportional gain (Kp) extracted from training is nearly zero. As a result, the Kp, Ki, and Kd extracted from the ‘Agent’ do not give satisfactory results, and the closed-loop system becomes unstable.
- So where is the problem in the configuration of the code given below? Any help is highly appreciated.
simOpts = rlSimulationOptions('MaxSteps',maxsteps);
experiences = sim(env,agent,simOpts);
actor = getActor(agent);
parameters = getLearnableParameters(actor); % actor weights hold the gains
Ki = abs(parameters{1}(1))
Kp = abs(parameters{1}(2))
Kd = abs(parameters{1}(3))
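As a quick sanity check of the extracted gains, you can form a pid object from them and test closed-loop stability directly (a sketch; G(s) = 1/(s^2 + s) below is an assumed example plant, so substitute your actual second-order model):

```matlab
% Assumed example second-order plant; replace with your own model.
G  = tf(1, [1 1 0]);
C  = pid(Kp, Ki, Kd);     % controller built from the extracted gains
CL = feedback(C*G, 1);    % unity-feedback closed loop
isstable(CL)              % returns 0 if these gains destabilize the loop
```

Note also that the ordering [Ki Kp Kd] inside parameters{1} depends on how the actor network was defined; if your network layout differs from the example's, the nearly-zero value may simply be read from the wrong weight.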
Answers (1)
I didn't check everything, but I noticed that you reused the same quadratic cost function (reward, in RL terms) from the 'rlwatertankPIDTune.slx' example for your second-order plant. The water level control system in a tank is a first-order system. In that example, the cost is chosen as

J = ∫ ( q*e(t)^2 + r*u(t)^2 ) dt

where the weighing factors q and r penalize the tracking error e = ref − y and the control effort u, and a PI controller is sufficient. Since you want to tune a PID controller for a second-order system using RL, perhaps it is appropriate to define the cost as

J = ∫ ( q1*e(t)^2 + q2*(de/dt)^2 + r*u(t)^2 ) dt

where q1 and q2 weigh the error and its derivative, and r weighs the control effort. One challenge is that you need to estimate dy/dt, because only the output y of the second-order system is measurable.
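A per-step reward along those lines could be sketched as follows (q1, q2, and rw are assumed weighting factors, and the error derivative is approximated by a finite difference over the sample time Ts):

```matlab
function r = pidTuneReward(e, ePrev, u, Ts)
% Hypothetical quadratic cost, returned as a negative reward.
% q1, q2, rw are assumed weights; tune them for your plant.
q1 = 1; q2 = 0.1; rw = 0.01;
de = (e - ePrev)/Ts;               % finite-difference estimate of de/dt
r  = -(q1*e^2 + q2*de^2 + rw*u^2); % penalize error, its derivative, and effort
end
```

Called from the environment's step/reward logic, this makes the agent trade off tracking error, error rate, and control effort rather than error and effort alone.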
For example, for the second-order plant G(s) = 1/(s^2 + s):

a = 1; % tf numerator
b = [1 1 0]; % tf denominator: s^2 + s
% [A, B, C, D] = tf2ss(a, b)
sys = ss(tf(a, b)) % state-space model
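Since dy/dt is not measured directly, one common workaround is to approximate it with a filtered differentiator applied to the measured output (a sketch; the plant, filter time constant tau, and time grid below are assumptions for illustration):

```matlab
% Sketch: estimate dy/dt from the measured output with a filtered
% differentiator s/(tau*s + 1). tau is an assumed filter time constant,
% chosen faster than the plant dynamics.
G   = tf(1, [1 1 0]);           % example second-order plant
tau = 0.01;
Gd  = tf([1 0], [tau 1]);       % filtered differentiator
t   = 0:0.01:10;
y   = step(feedback(G, 1), t);  % example closed-loop output samples
dy  = lsim(Gd, y, t);           % estimate of dy/dt from y alone
```

The same estimate can be fed to the agent as an observation, or used to build the de/dt term of the reward.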
In continuous-time and in the absence of Gaussian noise, the PID controller gains may be tuned as follows:
Kp = 0.75;
Ki = 0;
Kd = 0.5;
N = 3;
Gpid = pid(Kp, Ki, Kd, 1/N) % derivative filter time constant Tf = 1/N
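To see how these gains behave on the example plant, you can close the loop and inspect the step response (assuming the same G(s) = 1/(s^2 + s) as above):

```matlab
G    = tf(1, [1 1 0]);          % assumed second-order plant
Gpid = pid(0.75, 0, 0.5, 1/3);  % Kp, Ki, Kd, Tf = 1/N with N = 3
CL   = feedback(Gpid*G, 1);     % unity-feedback closed loop
step(CL), grid on               % inspect overshoot and settling time
```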