How to compute the gradients of a SAC agent for custom training? In addition, are the target critics updated automatically by MATLAB, given that agent = rlSACAgent()?

6 views (last 30 days)
I'm trying to train multiple SAC agents using parallel computing, and I don't know how to compute the gradients of the agents using the dlfeval function, given that I have already created a minibatchqueue for data processing. In addition, since the agents were created as agent = rlSACAgent(actor1,[critic1,critic2],agentOpts), should I introduce the target critics myself, or are they handled internally by MATLAB when I specify the smoothing factor tau or the target-critic update frequency? And how can I update them?

Answers (1)

praguna manvi
praguna manvi on 4 Sep 2024
Edited: praguna manvi on 4 Sep 2024
The critic and actor networks are updated internally using the “train” function for agents defined as:
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
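Note that you do not construct or pass the target critics yourself: the agent creates them internally, and the agent options control how they are soft-updated. A minimal sketch, assuming actor1, critic1, and critic2 already exist:

agentOpts = rlSACAgentOptions( ...
    "TargetSmoothFactor",1e-3, ...     % soft-update factor tau
    "TargetUpdateFrequency",1);        % update targets every learning step
agent = rlSACAgent(actor1,[critic1,critic2],agentOpts);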
You can find an example of training an rlSACAgent in this documentation:
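As a rough sketch of that built-in workflow (env is assumed to be an existing environment, the option values are placeholders, and "UseParallel" enables the parallel training you mention):

trainOpts = rlTrainingOptions( ...
    "MaxEpisodes",1000, ...
    "MaxStepsPerEpisode",500, ...
    "UseParallel",true);               % toolbox manages the parallel workers
trainStats = train(agent,env,trainOpts);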
For custom training, you can refer to this documentation, which outlines the required functions:
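One point that matters for your question: in a fully custom loop the toolbox no longer maintains the target critics for you, so you keep your own copies and soft-update them with the smoothing factor tau. A sketch, assuming criticTarget1 is a copy of critic1 that you created and maintain yourself:

tau = 1e-3;                                      % smoothing factor
params       = getLearnableParameters(critic1);
targetParams = getLearnableParameters(criticTarget1);
for k = 1:numel(params)
    % Polyak averaging: target <- tau*online + (1-tau)*target
    targetParams{k} = tau*params{k} + (1-tau)*targetParams{k};
end
criticTarget1 = setLearnableParameters(criticTarget1,targetParams);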
Typically, you could use the “getValue” or “getAction” functions to extract outputs, calculate the loss, and compute gradients with “dlgradient”. Here is a link to another example of custom training using sampled minibatch experiences:
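For the gradient computation itself, one concrete route is to extract the underlying dlnetwork with “getModel”, evaluate a loss function through “dlfeval”, and differentiate with “dlgradient”. The sketch below assumes obsBatch, actBatch, and targetBatch are dlarray minibatches read from your minibatchqueue, that avgGrad and avgSqGrad were initialized to [] before the loop, and that iteration is your loop counter; the mse loss is only an illustrative TD-regression loss, not the full SAC objective:

criticNet = getModel(critic1);                    % underlying dlnetwork
[loss,grads] = dlfeval(@criticLoss,criticNet,obsBatch,actBatch,targetBatch);
[criticNet,avgGrad,avgSqGrad] = adamupdate( ...
    criticNet,grads,avgGrad,avgSqGrad,iteration); % Adam update step
critic1 = setModel(critic1,criticNet);            % write weights back

function [loss,grads] = criticLoss(net,obsBatch,actBatch,targetBatch)
    % Forward pass is traced by dlfeval; input order must match net.InputNames
    q = forward(net,obsBatch,actBatch);
    loss = mse(q,targetBatch);                    % TD-target regression loss
    grads = dlgradient(loss,net.Learnables);      % gradients w.r.t. weights
end

The same pattern applies to the actor: evaluate the action (for example with “getAction” or a forward pass on the actor network) inside the loss function and differentiate with respect to the actor's learnable parameters.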
