MATLAB Answers

How to use the GPU for the actor and critic while the environment simulation runs on multiple cores for RL training

krishna teja on 11 Mar 2020
Answered: Anh Tran on 27 Mar 2020
Hi,
I am new to GPU computing.
I am using Reinforcement Learning Toolbox, specifically rlACAgent.
Training runs normally on multiple cores within the system, but due to the large actor and critic networks, training gets slower. When I use the GPU for the actor and critic networks and start training, only the first N episodes (N = number of cores in the pool) run properly. Beyond that, all episodes settle at roughly zero reward (not exactly zero, but a small negative number close to zero).
Is there a way to use both the GPU and the CPU for RL training: the GPU for the networks and the CPU for the environment (Simulink) simulation?
Thanks in advance.
System specs:
CPU - Intel Xeon Gold 5220
RAM - 128 GB
GPU - NVIDIA RTX 2080
% Run the actor and critic networks on the GPU
device = 'gpu';
actorOpts = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',2,'UseDevice',device);
criticOpts = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',2,'UseDevice',device);
% actor, critic, agentOpts, env and Ts are created elsewhere (not shown);
% the actor and critic are assumed to be built with actorOpts and criticOpts
agent = rlACAgent(actor,critic,agentOpts);
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',3e4, ...
    'MaxStepsPerEpisode',1*ceil(4/(10*Ts)), ...
    'ScoreAveragingWindowLength',10, ...
    'Verbose',true, ...
    'Plots','training-progress', ...
    'StopTrainingCriteria','AverageReward', ...
    'StopTrainingValue',10005, ...
    'SaveAgentCriteria','AverageReward', ...
    'SaveAgentValue',2000);
% Simulate the environment on the parallel pool workers
trainOpts.UseParallel = true;
trainOpts.ParallelizationOptions.Mode = "sync";
trainOpts.ParallelizationOptions.DataToSendFromWorkers = "gradients"; % for A3C
trainOpts.ParallelizationOptions.StepsUntilDataIsSent = 20;
trainOpts.ParallelizationOptions.WorkerRandomSeeds = -1;
trainOpts.StopOnError = 'off';
trainingStats = train(agent,env,trainOpts);

Answers (1)

Anh Tran on 27 Mar 2020
We are continuing to improve GPU training performance with parallel computing in future releases. For now, I recommend one of the following options to improve training speed:
  1. Use parallel computing and not the GPU: Set StepsUntilDataIsSent to a higher value (e.g. 132, 256, etc.). This creates a bigger batch for each training step (similar to the NumStepsToLookAhead property of rlACAgentOptions when parallel computing is not used).
  2. Use the GPU and not parallel computing: Set the NumStepsToLookAhead property to a higher value.
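
The two options can be sketched roughly as follows. This is only an illustration built from the settings in the question, not a tested configuration: the switch useGpuOption and the variable batchSteps are hypothetical names introduced here, 256 is just one of the example values mentioned above, and actor, critic and env are assumed to be created as in the original script. Whether NumStepsToLookAhead and StepsUntilDataIsSent need to agree for a given release is worth checking in the documentation.

```matlab
% Hypothetical switch: false = option 1 (parallel, CPU), true = option 2 (GPU, no parallel)
useGpuOption = false;
batchSteps   = 256;   % illustrative value; tune for your problem

trainOpts = rlTrainingOptions('MaxEpisodes',3e4);

if useGpuOption
    % Option 2: networks on the GPU, no parallel workers.
    % A larger look-ahead gives a bigger batch per training step.
    device = 'gpu';
    agentOpts = rlACAgentOptions('NumStepsToLookAhead',batchSteps);
    trainOpts.UseParallel = false;
else
    % Option 1: parallel workers, networks on the CPU.
    % A larger StepsUntilDataIsSent gives a bigger batch per training step.
    device = 'cpu';
    agentOpts = rlACAgentOptions;   % default agent options
    trainOpts.UseParallel = true;
    trainOpts.ParallelizationOptions.Mode = "sync";
    trainOpts.ParallelizationOptions.DataToSendFromWorkers = "gradients";
    trainOpts.ParallelizationOptions.StepsUntilDataIsSent = batchSteps;
end

actorOpts  = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',2,'UseDevice',device);
criticOpts = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',2,'UseDevice',device);

% actor, critic and env are assumed to be built as in the question, using
% actorOpts and criticOpts (construction not shown here):
% agent = rlACAgent(actor,critic,agentOpts);
% trainingStats = train(agent,env,trainOpts);
```

Keeping the device and the parallel settings behind a single switch makes it easy to time both configurations on the same environment and keep whichever trains faster.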
