dlgradient returning zeros, loss dependent on another already trained network

Question

Marcos on 31 Jan 2024


https://la.mathworks.com/matlabcentral/answers/2076566-dlgradients-returning-zeros-loss-dependant-of-other-already-trained-network

Commented: Marcos on 12 Feb 2024

Hello,

I'm facing a problem that might be similar to the one here https://uk.mathworks.com/matlabcentral/answers/1844083-why-happens-all-the-gradients-of-the-generator-are-zero-from-the-beginning-to-the-end-when-traini, but I don't understand why.

The loss I use depends on another already trained shallow network. Then I use dlgradient and the calculated loss to find the gradients, but this only returns zeros...

Here's the function that computes the loss and gradients:

function [loss,gradients] = modelLoss(dlnet,X,net2)
    % Forward data through network.
    [Y] = forward(dlnet,X);
    % Data through trained network2.
    X_2 = [extractdata(X(1:2,:));extractdata(Y)];
    X_pred = net2(X_2);
    % Convert to dlarray.
    dlX_pred = dlarray(X_pred,'CB');
    % Calculate loss.
    loss = mean(mean((dlX_pred - X(3:end,:)).^2));
    % Calculate gradients of loss with respect to learnable parameters.
    gradients = dlgradient(loss,dlnet.Learnables);
end

And here is the call to dlfeval:

[loss,gradients] = dlfeval(@modelLoss,dlnet,dLXMiniBatch,net2);

Any idea on what's missing?

Thanks!


Answer 1

Avadhoot on 12 Feb 2024


https://la.mathworks.com/matlabcentral/answers/2076566-dlgradients-returning-zeros-loss-dependant-of-other-already-trained-network#answer_1406866

Hi Marcos,

I see that you are computing gradients of a custom loss function for your model. As mentioned in the example, you have included all the calculations involved in the loss computation inside the loss function, including the pretrained network. Still, a few issues could cause dlgradient to return zeros. Here are the things to check:

- Make sure you are not updating the pretrained network's weights while calculating the loss, as this can cause problems in the computation.
- The gradients might not be calculated if the pretrained network includes a non-differentiable step.
- dlnet.Learnables should contain all the parameters that you want to update. Check that every relevant parameter is included in it.
- X and Y need to be dlarray objects for dlgradient to work. If they are not, convert them to dlarray objects before passing them to the loss function.

Since you mentioned that you are using a shallow network, vanishing gradients should not be the issue. Please check each of the factors above to determine the cause of the problem.
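One way to test the "non-differentiable step" and dlarray points is a toy comparison. The sketch below is only an illustration and does not use the networks from the question: w, x, brokenLoss and tracedLoss are hypothetical names. It assumes Deep Learning Toolbox is available and that the code is saved as a script (local functions at the end). The first loss passes an intermediate value through extractdata and wraps it in a new dlarray, as the loss function in the question does; the second keeps everything as traced dlarray operations.

```matlab
% Hypothetical toy values, not the asker's data.
w = dlarray(2);           % stands in for a learnable parameter
x = dlarray([1; 2; 3]);   % stands in for the input data

gBroken = dlfeval(@brokenLoss, w, x);
gTraced = dlfeval(@tracedLoss, w, x);

disp(extractdata(gBroken))
disp(extractdata(gTraced))

function g = brokenLoss(w, x)
    y = w .* x;                        % traced, depends on w
    yConst = dlarray(extractdata(y));  % extractdata drops the trace, so this is a constant
    loss = sum((yConst - x).^2);       % still traced through x, but no longer through w
    g = dlgradient(loss, w);
end

function g = tracedLoss(w, x)
    y = w .* x;                        % traced, depends on w
    loss = sum((y - x).^2);            % the dependence on w is preserved
    g = dlgradient(loss, w);
end
```

If the first gradient comes back as zero while the second does not, the link between the loss and the parameters is probably being cut somewhere inside the loss function, which matches the symptom described in the question.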

I hope this helps.


Marcos on 12 Feb 2024

Hi Avadhoot,

Thanks for your response.

I managed to solve my problem by making net2 a dlnetwork as well, instead of using feedforwardnet. Y was a dlarray, and I was converting X_pred to a dlarray inside the loss function. I guess that was the problem.

Anyway, having two dlnetwork objects (a pretrained one and the one being trained) isn't a problem, so I'm using that and it seems to be working :)
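For anyone landing here later, a minimal sketch of that setup follows. It is not my actual code: the layer sizes, the random data and the name dlnet2 are made up for illustration, and dlnet2 is freshly initialized here rather than pretrained. It assumes a release where dlnetwork accepts a layer array and where local functions can sit at the end of a script.

```matlab
% Hypothetical sizes: X has 5 rows (2 shared inputs + 3 targets), dlnet outputs 4 values.
layers1 = [featureInputLayer(5) fullyConnectedLayer(8) tanhLayer fullyConnectedLayer(4)];
layers2 = [featureInputLayer(6) fullyConnectedLayer(8) tanhLayer fullyConnectedLayer(3)];
dlnet  = dlnetwork(layers1);   % network being trained
dlnet2 = dlnetwork(layers2);   % stands in for the already trained network

X = dlarray(rand(5, 16, 'single'), 'CB');   % made-up mini-batch
[loss, gradients] = dlfeval(@modelLoss, dlnet, X, dlnet2);

function [loss, gradients] = modelLoss(dlnet, X, dlnet2)
    % Forward data through the network being trained.
    Y = forward(dlnet, X);
    % Concatenate dlarrays directly, with no extractdata, so the trace is kept.
    X2 = [X(1:2,:); Y];
    % Pass through the second network; its weights are used but not updated.
    Xpred = predict(dlnet2, X2);
    % Same loss as in the question.
    loss = mean(mean((Xpred - X(3:end,:)).^2));
    % Gradients only with respect to the learnables of dlnet.
    gradients = dlgradient(loss, dlnet.Learnables);
end
```

Because only dlnet.Learnables is passed to dlgradient, the second network's parameters stay untouched during training.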

Thanks again!
