Thank you for your guidance. I revised the actor network, but the inputs still seem to be empty. I set the constant inputs w1, w2, and w3 using
featureInputLayer(w1,"Name","scalarInput1")
featureInputLayer(w2,"Name","scalarInput2")
featureInputLayer(w3,"Name","scalarInput3")
but I still get an error:
Error using rl.internal.validate.mapFunctionObservationInput
Unable to automatically specify deep neural network observation input layer names because some specifications have similar dimension. Specify "ObservationInputNames" name-value pair when creating function object.
modelInputMap = rl.internal.validate.mapFunctionObservationInput(model,observationInfo,nameValueArgs.ObservationInputNames);
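If I read the message correctly, the input layer names have to be passed explicitly because several of my observation specs have the same size. Below is a minimal sketch of what I think that would look like, reduced to two scalar channels. It assumes Reinforcement Learning Toolbox and Deep Learning Toolbox are installed; the layer names "fc_a", "fc_b", "relu_c", "meanOut", "stdFC" and "stdOut" are hypothetical and not part of my model, and the mean/standard-deviation heads are only there so the reduced example is complete.

```matlab
% Reduced sketch (not my full model): two observation channels of equal size.
obs1 = rlNumericSpec([1 1]);  obs1.Name = "scalarObservation1";
obs2 = rlNumericSpec([1 1]);  obs2.Name = "scalarObservation2";
obsInfo = [obs1, obs2];
actInfo = rlNumericSpec([2 1], LowerLimit=-1, UpperLimit=1);

% Each input layer takes one feature so that it matches the [1 1] specs.
lg = layerGraph();
lg = addLayers(lg, [featureInputLayer(1, Name="scalarInput1")
                    fullyConnectedLayer(10, Name="fc_a")]);
lg = addLayers(lg, [featureInputLayer(1, Name="scalarInput2")
                    fullyConnectedLayer(10, Name="fc_b")]);
lg = addLayers(lg, [concatenationLayer(1, 2, Name="concat")
                    reluLayer(Name="relu_c")]);

% Hypothetical output heads: one for the mean, one for the standard deviation.
lg = addLayers(lg, fullyConnectedLayer(2, Name="meanOut"));
lg = addLayers(lg, [fullyConnectedLayer(2, Name="stdFC")
                    softplusLayer(Name="stdOut")]);

lg = connectLayers(lg, "fc_a", "concat/in1");
lg = connectLayers(lg, "fc_b", "concat/in2");
lg = connectLayers(lg, "relu_c", "meanOut");
lg = connectLayers(lg, "relu_c", "stdFC");
net = dlnetwork(lg);

% Input names are listed in the same order as the specs in obsInfo.
actor = rlContinuousGaussianActor(net, obsInfo, actInfo, ...
    ObservationInputNames=["scalarInput1", "scalarInput2"], ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut");

% Quick check: sample one action from random observations.
act = getAction(actor, {rand(1,1), rand(1,1)});
```

In my model there would presumably be four names, one per entry of allObsInfo and in the same order, which also means the image input layer would need a name of its own.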
Here is the revised code:
global w1 w2 w3
w1 = 2;
w2 = 1;
w3 = 2;
% Build W after the assignments; before them the globals may still be empty.
W = [w1, w2, w3];
obsMat = [4 3; 5 3; 6 3; 7 3; 8 3; 9 3; 5 11; 6 11; 7 11; 8 11; 6 12; 7 12; 10 12];
sA0 = [2 5];
sB0 = [11 5];
sC0 = [3 2];
s0 = [sA0; sB0; sC0];
Ts = 0.1;
Tf = 100;
maxsteps = ceil(Tf/Ts);
mdl = "rlA";
open_system(mdl)
% Define observation specifications.
scalarObs1Info = rlNumericSpec([1 1]);
scalarObs1Info.Name ="scalarObservation1";
scalarObs2Info = rlNumericSpec([1 1]);
scalarObs2Info.Name ="scalarObservation2";
scalarObs3Info = rlNumericSpec([1 1]);
scalarObs3Info.Name ="scalarObservation3";
obsSize = [12 12 4];
oinfo = rlNumericSpec(obsSize);
oinfo.Name = "observations";
% Each spec was already named when it was created, so the array is not renamed here.
% (Renaming by index would label the scalar specs and the image spec with each other's names.)
allObsInfo = [scalarObs1Info, scalarObs2Info, scalarObs3Info, oinfo];
% The limits are set in the constructor, so no separate limit assignments are needed.
ActionInfo = rlNumericSpec([1, 2], 'LowerLimit', -1, 'UpperLimit', 1);
ainfo = ActionInfo;
ainfo.Name = "actions";
blks = mdl + ["/Agent A","/Agent B","/Agent C"];
env = rlSimulinkEnv(mdl,blks,{allObsInfo,allObsInfo,allObsInfo},{ainfo,ainfo,ainfo});
env.ResetFcn = @(in) resetMap(in, obsMat);
rng(0)
for idx = 1:3
lgraph = layerGraph();
tempLayers = [
featureInputLayer(w1,"Name","scalarInput1")
reluLayer("Name","relu_3")
fullyConnectedLayer(10,"Name","fc_4")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
featureInputLayer(w2,"Name","scalarInput2")
reluLayer("Name","relu_2")
fullyConnectedLayer(10,"Name","fc_3")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
featureInputLayer(w3,"Name","scalarInput3")
reluLayer("Name","relu_1")
fullyConnectedLayer(10,"Name","fc_2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
imageInputLayer(obsSize,Normalization="none")
convolution2dLayer(8,16, ...
Stride=1,Padding=1,WeightsInitializer="he")
reluLayer
convolution2dLayer(4,8, ...
Stride=1,Padding="same",WeightsInitializer="he")
reluLayer
fullyConnectedLayer(256,WeightsInitializer="he")
reluLayer
fullyConnectedLayer(128,WeightsInitializer="he")
reluLayer
fullyConnectedLayer(2,"Name","fc_1")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
concatenationLayer(2,4,"Name","concat")
softmaxLayer("Name","softmax")];
lgraph = addLayers(lgraph,tempLayers);
% clean up helper variable
clear tempLayers;
lgraph = connectLayers(lgraph,"fc_2","concat/in3");
lgraph = connectLayers(lgraph,"fc_3","concat/in2");
lgraph = connectLayers(lgraph,"fc_1","concat/in4");
lgraph = connectLayers(lgraph,"fc_4","concat/in1");
plot(lgraph);
actorNetwork=lgraph;
actorOptions = rlOptimizerOptions('LearnRate',0.1,'GradientThreshold',inf);
actor(idx) = rlContinuousGaussianActor(actorNetwork,allObsInfo,ainfo);
%Critic network