
Doubt in the backpropagation algorithm

3 views (last 30 days)
jvbx on 8 Aug 2024
Answered: Umar on 9 Aug 2024
Hi
I'm studying neural networks and I'm building a NN with 2 hidden layers and one neuron in the output layer.
While studying and coding my NN, I ran into one doubt.
In the backward step, the math behind this is clear.
For the output layer we have
$\delta^{\text{out}} = (y - \hat{y}) \odot f'(z^{\text{out}})$,
where $\odot$ is the Hadamard product, and for the hidden layers we have
$\delta^{l} = \left( (W^{l+1})^{\top}\, \delta^{l+1} \right) \odot f'(z^{l})$.
My problem is that, when I code these formulas, I need to change the second equation when computing the gradient of the first hidden layer (the hidden layer next to the input layer) so that the matrix dimensions match, as shown below:
% Backpropagation
delta_saida = erro_estim.*selecionar_funcao(saida_in_estim,ativ_out,sig_a,tanh_a,tanh_b,'True');
delta_h2 = (w_out'*delta_saida).*selecionar_funcao(h2_in_estim,ativ_out,sig_a,tanh_a,tanh_b,'True');
delta_h1 = (w2*delta_h2')'.*selecionar_funcao(h1_in_estim,ativ_h1,sig_a,tanh_a,tanh_b,'True');
%update weights and biases
w_out = w_out + learning_rate*delta_saida*h2_out_estim';
b_out = b_out + learning_rate*delta_saida;
w2 = w2 + learning_rate*(delta_h2'*h1_out_estim)';
b2 = b2 + learning_rate*sum(delta_h2);
w1 = w1 + learning_rate*delta_h1'*enter_estim;
b1 = b1 + learning_rate*sum(delta_h2);
% I wrote this code partially in Portuguese, so let me explain a little.
% 'delta_saida' is the gradient of the output layer
% delta_h2 is the gradient of the second hidden layer
% delta_h1 is the gradient of the first hidden layer
% w_out, w2 and w1 are the weights of the output, second hidden and first hidden layers, respectively.
% b_out, b2 and b1 are the biases of the output, second hidden and first hidden layers, respectively.
% The function selecionar_funcao() just computes the derivative according to the activation function of the layer.
% As you can see, I need to change delta_h1 to make the matrix dimensions match.
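For reference, here is a quick shape check of what my code produces (assuming one training sample per iteration, so every activation is a row vector, and n_out = 1 as in my network):
% enter_estim                 : 1 x num_entradas (row vector)
% h1_in_estim, h1_out_estim   : 1 x n_h1
% h2_in_estim, h2_out_estim   : 1 x n_h2
% saida_in_estim, delta_saida : 1 x n_out = 1 x 1 (a scalar here)
% w1 : num_entradas x n_h1,  w2 : n_h1 x n_h2,  w_out : n_h2 x n_out
%
% delta_h2 = (w_out'*delta_saida) .* f'(h2_in_estim)  ->  (1 x n_h2) .* (1 x n_h2)
%            (this shape only works out because delta_saida is a scalar)
% delta_h1 = (w2*delta_h2')'      .* f'(h1_in_estim)  ->  (1 x n_h1) .* (1 x n_h1),
%            but only after adding the extra transposes.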
Is it right to change the formula the way I do in my code? I'm asking because, in my mind, the way we calculate the gradient of every hidden layer should be the same, but in my case it isn't. I'll share part of my code here so anyone can check whether I'm making a mistake.
% weights and biases initialization
w1 = randn(num_entradas, n_h1) * sqrt(2/num_entradas);
w2 = randn(n_h1, n_h2) * sqrt(2/n_h1);
w_out = randn(n_h2, n_out) * sqrt(2/n_h2);
b1 = randn(1, n_h1) * sqrt(2/num_entradas);
b2 = randn(1, n_h2) * sqrt(2/n_h1);
b_out = randn(1, n_out) * sqrt(2/n_h2);
% training loop (validation + backpropagation each epoch)
for epoch = 1:max_epocas
    soma_valid = 0;
    soma_estim = 0;
    % shuffle the data
    conj_estim = embaralhar(conj_estim);
    % conj_valid = embaralhar(conj_valid);
    % Validation
    for j = 1:size(conj_valid,1)
        enter_valid = conj_valid(j,2:end);
        h1_in_valid = [enter_valid,1]*[w1;b1];
        h1_out_valid = selecionar_funcao(h1_in_valid,ativ_h1,sig_a,tanh_a,tanh_b,'False');
        h2_in_valid = [h1_out_valid,1]*[w2;b2];
        h2_out_valid = selecionar_funcao(h2_in_valid,ativ_h2,sig_a,tanh_a,tanh_b,'False');
        saida_in_valid = [h2_out_valid,1]*[w_out;b_out];
        saida_out_valid = selecionar_funcao(saida_in_valid,ativ_out,sig_a,tanh_a,tanh_b,'False');
        erro_valid = conj_valid(j,1) - saida_out_valid;
        soma_valid = soma_valid + (erro_valid^2);
    end
    erro_atual_valid = (soma_valid/(2*size(conj_valid,1)));
    erros_epoca_valid = [erros_epoca_valid;erro_atual_valid];
    % Training
    for i = 1:size(conj_estim,1)
        enter_estim = conj_estim(i,2:end);
        h1_in_estim = [enter_estim,1]*[w1;b1];
        h1_out_estim = selecionar_funcao(h1_in_estim,ativ_h1,sig_a,tanh_a,tanh_b,'False');
        h2_in_estim = [h1_out_estim,1]*[w2;b2];
        h2_out_estim = selecionar_funcao(h2_in_estim,ativ_h2,sig_a,tanh_a,tanh_b,'False');
        saida_in_estim = [h2_out_estim,1]*[w_out;b_out];
        saida_out_estim = selecionar_funcao(saida_in_estim,ativ_out,sig_a,tanh_a,tanh_b,'False');
        erro_estim = conj_estim(i,1) - saida_out_estim;
        soma_estim = soma_estim + (erro_estim^2);
        % Backpropagation
        delta_saida = erro_estim.*selecionar_funcao(saida_in_estim,ativ_out,sig_a,tanh_a,tanh_b,'True');
        delta_h2 = (w_out'*delta_saida).*selecionar_funcao(h2_in_estim,ativ_out,sig_a,tanh_a,tanh_b,'True');
        delta_h1 = (w2*delta_h2')'.*selecionar_funcao(h1_in_estim,ativ_h1,sig_a,tanh_a,tanh_b,'True');
        % update weights and biases
        w_out = w_out + learning_rate*delta_saida*h2_out_estim';
        b_out = b_out + learning_rate*delta_saida;
        w2 = w2 + learning_rate*(delta_h2'*h1_out_estim)';
        b2 = b2 + learning_rate*sum(delta_h2);
        w1 = w1 + learning_rate*delta_h1'*enter_estim;
        b1 = b1 + learning_rate*sum(delta_h2);
    end
    erro_atual_estim = (soma_estim/(2*size(conj_estim,1)));
    erros_epoca_estim = [erros_epoca_estim;erro_atual_estim];
    if erros_epoca_estim(epoch) < limiar
        break
    end
end

Answers (1)

Umar on 9 Aug 2024
Hi @jvbx,
Based on the code snippet and your explanation, it seems that you are correctly calculating the gradient of the first hidden layer, and the dimensions of the matrices involved in the calculation appear to be consistent.

However, it's important to note that the calculation of the gradients in backpropagation can vary depending on the specific architecture and activation functions used in the neural network; different architectures may require different formulas for calculating the gradients of the hidden layers. So, as long as the dimensions of the matrices are consistent and the formula aligns with the mathematical principles of backpropagation, you should be on the right track.

To further validate your implementation, you can compare the results of your network with a known benchmark or test it on different datasets to ensure its accuracy.

Remember, the backpropagation algorithm is a complex process, and it's common to encounter challenges and uncertainties along the way. It's important to experiment, iterate, and validate your implementation to ensure the best performance of your neural network. I hope this explanation clarifies your doubts and helps you move forward with your implementation.
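For instance, a minimal sketch of one consistent convention (just an illustration, assuming the deltas are kept as 1-by-n row vectors to match the row-vector activations of your forward pass, and writing fprime as a stand-in for the derivative that selecionar_funcao returns with the 'True' flag) would use the same expression for every hidden layer:
delta_saida = erro_estim .* fprime(saida_in_estim);            % 1 x n_out
delta_h2    = (delta_saida * w_out') .* fprime(h2_in_estim);   % 1 x n_h2
delta_h1    = (delta_h2 * w2')       .* fprime(h1_in_estim);   % 1 x n_h1
% With this orientation the hidden-layer rule looks the same at every layer,
%   delta_l = (delta_{l+1} * W_{l+1}') .* f'(z_l)
% and each weight update can take the same outer-product form, e.g.
%   w2 = w2 + learning_rate * h1_out_estim' * delta_h2;        % n_h1 x n_h2
The column-vector convention (with the transposes on the other side) works equally well; what matters is applying one orientation uniformly across all layers and updates.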
If you have any further questions, feel free to ask!
