setup
Set up reinforcement learning environment to run multiple simulations
Description
When you define a custom training loop for reinforcement learning, you can simulate an agent or policy against an environment using the runEpisode function. Use the setup function to configure the environment for running simulations using multiple calls to runEpisode.
setup(env) sets up the specified reinforcement learning environment for running multiple simulations using runEpisode.
setup(env,Name=Value) specifies nondefault configuration options using one or more name-value pair arguments.
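In outline, a custom loop built on these functions looks like the following sketch, where policy stands for any agent or policy object supported by runEpisode.
setup(env)                       % configure the environment for repeated simulation
out = runEpisode(env,policy);    % simulate one episode; repeat as needed
cleanup(env)                     % release resources when finished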
Examples
Simulate Environment and Agent
Create a reinforcement learning environment and extract its observation and action specifications.
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
Create a discrete categorical actor with a neural network approximator.
actorNetwork = [
    featureInputLayer(obsInfo.Dimension(1),Normalization="none",Name="state")
    fullyConnectedLayer(24,Name="fc1")
    reluLayer(Name="relu1")
    fullyConnectedLayer(24,Name="fc2")
    reluLayer(Name="relu2")
    fullyConnectedLayer(2,Name="output")
    softmaxLayer(Name="actionProb")];
actorNetwork = dlnetwork(actorNetwork);
actor = rlDiscreteCategoricalActor(actorNetwork,obsInfo,actInfo);
Create a policy object using the function approximator.
policy = rlStochasticActorPolicy(actor);
Create an experience buffer.
buffer = rlReplayMemory(obsInfo,actInfo);
Set up the environment for running multiple simulations. For this example, configure the environment to log any errors rather than send them to the command window.
setup(env,StopOnError="off")
Simulate multiple episodes using the environment and policy. After each episode, append the experiences to the buffer. For this example, run 100 episodes.
for i = 1:100
    output = runEpisode(env,policy,MaxSteps=300);
    append(buffer,output.AgentData.Experiences)
end
Clean up the environment.
cleanup(env)
Sample a mini-batch of experiences from the buffer. For this example, sample 10 experiences.
batch = sample(buffer,10);
You can then learn from the sampled experiences and update the policy and actor.
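As a rough sketch only, a learning step might loop over the sampled experiences and use their fields to compute an update. This assumes the mini-batch is returned as a structure array whose fields match the experiences appended to the buffer (Observation, Action, Reward, NextObservation, IsDone); the actual update rule depends on your training algorithm.
% Sketch: inspect the sampled mini-batch. The gradient update
% itself depends on your custom training algorithm.
avgReward = 0;
for k = 1:numel(batch)
    experience = batch(k);
    avgReward = avgReward + experience.Reward/numel(batch);
    % experience.Observation, experience.Action,
    % experience.NextObservation, and experience.IsDone are also
    % available here for computing your policy update.
end
disp(avgReward)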
Input Arguments
env — Reinforcement learning environment
rlFunctionEnv object | SimulinkEnvWithAgent object | rlNeuralNetworkEnvironment object | rlMDPEnv object | ...
Reinforcement learning environment, specified as one of the following objects.
- rlFunctionEnv — Environment defined using custom functions.
- SimulinkEnvWithAgent — Simulink environment created using rlSimulinkEnv or createIntegratedEnv.
- rlMDPEnv — Markov decision process environment.
- rlNeuralNetworkEnvironment — Environment with deep neural network transition models.
- Predefined environment created using rlPredefinedEnv.
- Custom environment created from a template (rlCreateEnvTemplate).
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: StopOnError="on"
StopOnError — Option to stop episode when error occurs
"on" (default) | "off"
Option to stop an episode when an error occurs, specified as one of the following:
- "on" — Stop the episode when an error occurs and generate an error message in the MATLAB® command window.
- "off" — Log errors in the SimulationInfo output of runEpisode.
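For example, the following sketch logs errors instead of stopping, then inspects the logged information after the run; it assumes policy is a valid policy object for env.
setup(env,StopOnError="off")
out = runEpisode(env,policy,MaxSteps=300);
out.SimulationInfo   % inspect any logged simulation errors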
UseParallel — Option for using parallel simulations
false (default) | true
Option for using parallel simulations, specified as a logical value. Parallel computing lets you use multiple cores, processors, computer clusters, or cloud resources to speed up simulation.
When you set UseParallel to true, the output of a subsequent call to runEpisode is an rl.env.Future object, which supports deferred evaluation of the simulation.
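For instance, this sketch queues four parallel episodes and then collects their results; it assumes an available parallel pool and uses fetchOutputs to wait for and retrieve the outputs of the rl.env.Future objects.
setup(env,UseParallel=true)
for i = 1:4
    futures(i) = runEpisode(env,policy,MaxSteps=300);  % returns rl.env.Future
end
outputs = fetchOutputs(futures);   % block until the episodes finish
cleanup(env)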
SetupFcn — Function to run on each worker before running an episode
[] (default) | function handle
Function to run on each worker before running an episode, specified as a handle to a function with no input arguments. Use this function to perform any preprocessing required before running an episode.
CleanupFcn — Function to run on each worker when cleaning up the environment
[] (default) | function handle
Function to run on each worker when cleaning up the environment, specified as a handle to a function with no input arguments. Use this function to clean up the workspace or perform other processing after calling runEpisode.
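For example, the following sketch seeds the random number generator on each worker before episodes run and prints a message during cleanup; both handles are illustrative.
% Illustrative setup and cleanup handles for parallel workers.
setup(env, ...
    UseParallel=true, ...
    SetupFcn=@() rng(0), ...                     % seed each worker
    CleanupFcn=@() disp("Worker cleanup done"))  % runs during cleanup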
TransferBaseWorkspaceVariables — Option to send model and workspace variables to parallel workers
"on" (default) | "off"
Option to send model and workspace variables to parallel workers, specified as "on" or "off". When the option is "on", the client sends variables used in models and defined in the base MATLAB workspace to the workers.
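For example, to prevent the client from copying base-workspace variables to the workers, you might use a call like this sketch:
% Sketch: run parallel simulations without transferring
% base-workspace variables to the workers.
setup(env,UseParallel=true,TransferBaseWorkspaceVariables="off")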
AttachedFiles — Additional files to attach to the parallel pool
string | string array
Additional files to attach to the parallel pool before running an episode, specified as a string or string array.
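For example, assuming the workers need a helper file (the file name here is hypothetical):
% Attach a hypothetical helper file required by the workers.
setup(env,UseParallel=true,AttachedFiles="myEnvHelpers.m")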
WorkerRandomSeeds — Worker random seeds
-1 (default) | vector
Worker random seeds, specified as one of the following:
- -1 — Set the random seed of each worker to the worker ID.
- Vector with length equal to the number of workers — Specify the random seed for each worker.
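For example, assuming a four-worker parallel pool, this sketch gives each worker a distinct, reproducible seed:
% Seed each of four workers explicitly for reproducible episodes.
setup(env,UseParallel=true,WorkerRandomSeeds=[1 2 3 4])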