How to apply an action in an RL MATLAB environment

Question

Bryan on 27 Mar 2024

Direct link to this question

https://la.mathworks.com/matlabcentral/answers/2099736-how-to-apply-an-action-in-an-rl-matlab-environment

Commented: Bryan on 1 Apr 2024

Hello everyone,

I have a working environment that I want to make more complex, but I'm not sure how. The code below shows a summary of part of my environment. I have 5 objects on which I apply actions; each object's action is either 0 or 1, which gives 32 possible actions. These actions are applied at element 7 of the array "mult". Now I want to do the same for elements 7, 10, 13, and 16. In other words, one of the 32 possible actions should be chosen at each of these 4 instants, and the actions at different instants may be the same or different. For example: action 4 is performed at 7, action 24 at 10, action 4 at 13, and action 18 at 16. The only idea I came up with, although it does not seem feasible, is to define all possible combinations (20 objects, each with 0 or 1). Could you please provide some guidance?

% Observation information
ObservationInfo = rlNumericSpec([1 99]);
% Action information
ActionInfo = rlFiniteSetSpec({[0 0 0 0 0], ...
                              [0 0 0 0 1], ...
                              [0 0 0 1 0], ...
                              [0 0 0 1 1], ...
                              [0 0 1 0 0], ...
                              [0 0 1 0 1], ...
                              [0 0 1 1 0], ...
                              [0 0 1 1 1], ...
                              [0 1 0 0 0], ...
                              [0 1 0 0 1], ...
                              [0 1 0 1 0], ...
                              [0 1 0 1 1], ...
                              [0 1 1 0 0], ...
                              [0 1 1 0 1], ...
                              [0 1 1 1 0], ...
                              [0 1 1 1 1], ...
                              [1 0 0 0 0], ...
                              [1 0 0 0 1], ...
                              [1 0 0 1 0], ...
                              [1 0 0 1 1], ...
                              [1 0 1 0 0], ...
                              [1 0 1 0 1], ...
                              [1 0 1 1 0], ...
                              [1 0 1 1 1], ...
                              [1 1 0 0 0], ...
                              [1 1 0 0 1], ...
                              [1 1 0 1 0], ...
                              [1 1 0 1 1], ...
                              [1 1 1 0 0], ...
                              [1 1 1 0 1], ...
                              [1 1 1 1 0], ...
                              [1 1 1 1 1]});
function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
% Start communication with OpenDSS
% Load Info into loads and irradiance
% Apply action
DSSText.command = sprintf('New LoadShape.estado_cap_1   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(1));
DSSText.command = sprintf('New LoadShape.estado_cap_2   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(2));
DSSText.command = sprintf('New LoadShape.estado_cap_3   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(3));
DSSText.command = sprintf('New LoadShape.estado_cap_4   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(4));
DSSText.command = sprintf('New LoadShape.estado_cap_5   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(5));
% Configure OpenDSS simulation
% Solve the system
% Obtain NextObservation (voltages)
% Calculate reward
% Determine if the episode is done
IsDone = Reward ~= 0;
end
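
The 32 rows listed in ActionInfo above follow the binary counting order, so they could also be generated rather than typed out. The following is a minimal sketch under that assumption; the names nObjects, bitsMatrix, and actionCells are illustrative and not part of the original code, and the rlFiniteSetSpec call is left as a comment because it needs Reinforcement Learning Toolbox.

```matlab
% Enumerate all 5-bit on/off patterns, one row per action (32 rows).
nObjects = 5;
bitsMatrix = dec2bin(0:2^nObjects - 1, nObjects) - '0';  % 32-by-5 matrix of 0/1
actionCells = num2cell(bitsMatrix, 2);                   % 32-by-1 cell of 1-by-5 rows

% With Reinforcement Learning Toolbox available, the spec would then be:
% ActionInfo = rlFiniteSetSpec(actionCells);

disp(bitsMatrix(1:4, :))   % first four rows, same order as the hand-written list
```

Row k of bitsMatrix is the 5-bit pattern of k-1, most significant bit first, which matches the order of the hand-written list.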


Answer 1

Maneet Kaur Bagga on 29 Mar 2024

Direct link to this answer

https://la.mathworks.com/matlabcentral/answers/2099736-how-to-apply-an-action-in-an-rl-matlab-environment#answer_1433266

Hi,

To apply actions at multiple elements (7, 10, 13, and 16) of the "mult" array, you can use a multi-dimensional action space in which each dimension represents the action applied at one of those elements. Since each element can take one of 32 possible actions and the 4 elements are chosen independently, the total number of combinations is 32^4 = 1,048,576. You will then need to encode and decode the actions in your environment's step function.

Encoding Actions:

You can encode the actions as integers. For example, an action could be represented as a single integer in the range [0, 1048575], where 1048575 = 32^4 - 1. This integer can then be decoded into the 4 actions for elements 7, 10, 13, and 16.

To encode and decode actions, you can use base-32 representation since you have 32 possible states for each action.
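
As a rough illustration of the base-32 idea, the sketch below encodes four example indices into one integer and decodes them again. The variable names (idx, weights, code, decoded) are assumptions made for this example, and the digit order (element 7 as the least significant digit) is one possible convention, chosen so that decoding with mod and floor returns the indices in the same order.

```matlab
% Hypothetical example indices (0-31) for elements 7, 10, 13, and 16.
idx = [4 24 4 18];

% Encode: treat the four indices as base-32 digits, least significant first.
weights = 32 .^ (0:3);            % [1 32 1024 32768]
code = sum(idx .* weights);       % single integer in 0 .. 32^4 - 1

% Decode: peel off base-32 digits with mod and floor.
decoded = zeros(1, 4);
rest = code;
for i = 1:4
    decoded(i) = mod(rest, 32);
    rest = floor(rest / 32);
end

fprintf('code = %d, decoded = [%s]\n', code, num2str(decoded));
```

If the round trip is correct, decoded equals idx for every code in the range.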

Decoding Actions in the Step Function:

When you receive an action in your "StepFunctTest", it will be a single integer. You need to decode this integer into 4 separate actions. Here's how you could do it:

function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
    % Decode the single integer action (0 to 32^4 - 1) into 4 base-32 digits
    actions = zeros(1, 4); % decoded action index (0-31) for each element
    for i = 1:4
        actions(i) = mod(Action, 32); % least significant base-32 digit
        Action = floor(Action / 32);  % shift to the next digit
    end
    % actions(1), actions(2), actions(3), and actions(4) are the action indices
    % for elements 7, 10, 13, and 16 respectively
    % Map each index to its 5-bit pattern: row k = element k, column c = capacitor c
    bits = zeros(4, 5);
    for k = 1:4
        bits(k, :) = actionMap(actions(k));
    end
    % Apply actions: one LoadShape per capacitor, with entries at elements 7, 10, 13, and 16
    for c = 1:5
        DSSText.command = sprintf('New LoadShape.estado_cap_%d npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)', c, bits(1, c), bits(2, c), bits(3, c), bits(4, c));
    end
    % Continue with the rest of your step function
end
function binaryAction = actionMap(index)
    % Map the index (0-31) to its 5-bit binary action, most significant bit first
    binaryAction = bitget(index, 5:-1:1);
end

This workaround allows you to extend your RL environment to handle actions on multiple elements without explicitly defining every possible combination.

Hope this helps!


Bryan on 1 Apr 2024

Hi.

Thank you for your response. It allowed me to broaden my search. I'm not sure whether what I applied is what you were suggesting, but I arrived at an alternative. Could you check whether what I implemented is correct? I also have another question: with so many possible combinations (32^4 = 1,048,576), will the DQN agent be able to learn, and will training converge?

% 32 possible actions of 4 elements.
t1 = 0:31; % 7h
t2 = 0:31; % 10h
t3 = 0:31; % 13h
t4 = 0:31; % 16h
[T4, T3, T2, T1] = ndgrid(t4, t3, t2, t1);
actions = reshape(cat(5, T1, T2, T3, T4), [], 4);
ActionInfo = rlFiniteSetSpec(num2cell(actions, 2));
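
One way to gain confidence in the ndgrid construction above is to run it on a reduced grid and check that every combination appears exactly once. This is only a sketch: n = 3 is an arbitrary stand-in for 32, and the names small, nRows, and nUnique are introduced here for illustration.

```matlab
% Reduced version of the construction: n values at each of 4 hours.
n = 3;                       % stand-in for 32 to keep the check small
t = 0:n-1;
[T4, T3, T2, T1] = ndgrid(t, t, t, t);
small = reshape(cat(5, T1, T2, T3, T4), [], 4);

nRows = size(small, 1);                        % expected n^4
nUnique = size(unique(small, 'rows'), 1);      % expected n^4 as well
fprintf('%d rows, %d unique\n', nRows, nUnique);
```

With n = 32 the same construction yields 32^4 = 2^20 = 1,048,576 rows, which is the same count as the "20 objects, each 0 or 1" enumeration mentioned in the question.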

In my StepFunctTest:

function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
% rest of the step function
% process the actions and obtain the actions in binary number
[Action7h, Action10h, Action13h, Action16h] = BinaryAction(Action);
% apply the actions
DSSText.command = sprintf('New LoadShape.estado_cap_1   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(1), Action10h(1), Action13h(1), Action16h(1));
DSSText.command = sprintf('New LoadShape.estado_cap_2   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(2), Action10h(2), Action13h(2), Action16h(2));
DSSText.command = sprintf('New LoadShape.estado_cap_3   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(3), Action10h(3), Action13h(3), Action16h(3));
DSSText.command = sprintf('New LoadShape.estado_cap_4   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(4), Action10h(4), Action13h(4), Action16h(4));
DSSText.command = sprintf('New LoadShape.estado_cap_5   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(5), Action10h(5), Action13h(5), Action16h(5));
% rest of the step function
end
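
The five nearly identical sprintf lines above could also be produced in a loop. The sketch below assumes the decoded bits are available as a 4-by-5 matrix (rows for hours 7, 10, 13, and 16; columns for capacitors 1 to 5); the names bits, hours, and commands are hypothetical, and the strings are only built here, not sent to OpenDSS.

```matlab
% Hypothetical decoded bits: row k = hour (7, 10, 13, 16), column c = capacitor.
bits = [0 0 1 0 0;    % 7h
        1 1 0 0 0;    % 10h
        0 0 1 0 0;    % 13h
        1 0 0 1 0];   % 16h
hours = [7 10 13 16];

commands = cell(1, 5);
for c = 1:5
    mult = zeros(1, 24);
    mult(hours) = bits(:, c);     % place the four decisions of capacitor c
    multStr = strtrim(sprintf('%d ', mult));
    commands{c} = sprintf('New LoadShape.estado_cap_%d npts=24 interval=1 mult=(%s)', c, multStr);
    % In the step function this string would be assigned to DSSText.command.
end
disp(commands{1})
```

Building the 24-element vector first keeps the hour positions in one place (hours), so changing them does not require editing five format strings.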

Function to convert Action to binary and apply each action:

function [Action7h, Action10h, Action13h, Action16h] = BinaryAction(Action)
Action7hDec = Action(1);
Action10hDec = Action(2);
Action13hDec = Action(3);
Action16hDec = Action(4);
% Convert each index (0-31) to a 5-bit vector, most significant bit first, using bitget
Action7h = bitget(Action7hDec, 5:-1:1);
Action10h = bitget(Action10hDec, 5:-1:1);
Action13h = bitget(Action13hDec, 5:-1:1);
Action16h = bitget(Action16hDec, 5:-1:1);
end
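
As a worked example of this conversion, the sketch below takes the indices from the question's example (4, 24, 4, 18) and applies the same bitget call to each. It assumes the indices are zero-based (0 to 31), as in the t1 to t4 definitions; the matrix name bits is introduced here for illustration.

```matlab
% Example from the question: action 4 at 7h, 24 at 10h, 4 at 13h, 18 at 16h.
Action = [4 24 4 18];

% Same conversion as BinaryAction, written as a loop over the four hours.
bits = zeros(4, 5);
for k = 1:4
    bits(k, :) = bitget(Action(k), 5:-1:1);   % most significant bit first
end
disp(bits)
```

Each row of bits holds the on/off state of the five objects at one hour, so column c collects the four values written into the LoadShape of object c.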

Thank you again for your previous response. I look forward to your reply.

Bryan.
