My reinforcement learning simulation in Simulink terminates after 0 steps and 0 simulation time. I am not getting any error messages, so I cannot pinpoint the issue and decided to ask here.
Muhammad Ahmed
on 17 Dec 2023
Commented: Emmanouil Tzorakoleftherakis
on 28 Dec 2023
I am trying to run an RL training simulation in Simulink using a pre-trained agent, with no problems there. But for some reason, the initial values of my observations ALWAYS start from 0, even though I set non-zero values for some of them. Because of this, my isDone conditions are fulfilled right at simulation start, and the episode ends at time 0. I am using Data Store blocks to store and update the observations. Also, strangely, the Data Inspector and the Scope show different initial values for the observations; the Scope values are correct (to observe this behaviour, connect a constant 0 signal to the isDone port of the RL Agent block). I do not see any issue with my functions. The Simulink model file and agent files are attached (be sure to rename the imported agent to "TestAgent" in the MATLAB workspace).
A short description of the simulation:
The agent controls when to accelerate or decelerate a hypothetical car moving on a path. The agent can output signal 0, which accelerates the car up to a maximum velocity; signal 1, which decelerates the car with a light brake (supposed to be a regenerative brake); and signal 2, which decelerates the car using a traditional hard brake. The goal is to switch between these signals so that the car completes the path in the shortest possible time while providing the most comfortable ride and using the regen brake for maximum power regeneration.
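The action-to-acceleration mapping described above could be sketched in a MATLAB Function block roughly like this (the function name and all the gains are illustrative placeholders, not values from the attached model):

```matlab
function accel = actionToAccel(action, velocity)
% Illustrative mapping of the DQN agent's discrete actions {0,1,2}
% to a longitudinal acceleration command. All constants are made up.
maxVel     = 30;    % [m/s]   hypothetical top speed
accelGain  = 2;     % [m/s^2] throttle acceleration
regenDecel = -1.5;  % [m/s^2] light (regenerative) braking
hardDecel  = -5;    % [m/s^2] traditional hard braking

switch action
    case 0  % accelerate, but not beyond the velocity cap
        accel = accelGain * (velocity < maxVel);
    case 1  % light / regenerative brake
        accel = regenDecel;
    case 2  % hard brake
        accel = hardDecel;
    otherwise
        accel = 0;
end
end
```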
Observations:
Angular Velocity of a tire, Angular Displacement of the Tire, Distance of the path left to cover, Distance to Next Bump, Distance to Next Turn, No of Bumps, No of Turns.
Terminate Conditions:
Either the episode runs for a really long time, or Distance Left becomes 0.
Agent: a DQN agent that accepts an [8 1] observation vector and outputs an element of the set {0, 1, 2}. It is trained to act with a sample time of 0.2.
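For reference, the observation and action specifications described above would be defined in MATLAB (Reinforcement Learning Toolbox) roughly as follows; the variable names are illustrative:

```matlab
% Observation: an [8 1] numeric vector, as described above
obsInfo = rlNumericSpec([8 1]);
obsInfo.Name = 'observations';

% Action: one of the three discrete signals {0, 1, 2}
actInfo = rlFiniteSetSpec([0 1 2]);
actInfo.Name = 'driveCommand';

% Sample time used by the agent / Simulink environment
Ts = 0.2;
```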
I didn't want to saturate the question with code, so it is in the attached files. Also, I am new to Reinforcement Learning and Simulink, so apologies in advance for any stupid or inefficient coding and wasteful designs.
0 comments
Answers (1)
Emmanouil Tzorakoleftherakis
on 21 Dec 2023
It likely has to do with the execution priority of the Data Store blocks. I would look into it further, but honestly I think you should change the way you have set up your Simulink model. You don't need the Data Store blocks at all; you can feed the output of the step function directly to the RL Agent block. That way the setup will be cleaner.
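As a rough sketch of the cleaner setup: wire the observation signals straight into the RL Agent block and create the environment from the model directly. The model name, block path, and specs below are placeholders and should match your actual model:

```matlab
% Observation and action specs matching the agent (placeholder names)
obsInfo = rlNumericSpec([8 1]);
actInfo = rlFiniteSetSpec([0 1 2]);

% Create the environment directly from the Simulink model; the signals
% computed by the step logic feed the RL Agent block directly, with no
% Data Store Memory / Read / Write blocks in between.
env = rlSimulinkEnv('TestModel', 'TestModel/RL Agent', obsInfo, actInfo);

% Simulate the pre-trained agent (TestAgent) for one episode
simOpts = rlSimulationOptions('MaxSteps', 500);
experience = sim(env, TestAgent, simOpts);
```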
Hope this helps
2 comments
Emmanouil Tzorakoleftherakis
on 28 Dec 2023
You can change initial conditions and parameters programmatically using the reset function. See the bottom of this example.
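A minimal sketch of such a reset function, assuming an rlSimulinkEnv environment and a model variable named 'initialDistance' (both the variable name and the value are placeholders):

```matlab
% The reset function receives a Simulink.SimulationInput object at the
% start of every episode and can set model variables / initial
% conditions before the simulation runs.
env.ResetFcn = @(in) setVariable(in, 'initialDistance', 1000);
```

With this in place, every episode starts from the non-zero value you set here instead of the default 0, so the isDone condition is no longer triggered at time 0.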