dlgradient returns all zeros

Question

The network and the loss function are defined as follows:

layers1 = [
    sequenceInputLayer([4 1 2],"Name","betaIn")
    convolution2dLayer([3 2],32,"Name","conv1_1","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu1_1")
    convolution2dLayer([3 1],64,"Name","conv1_2","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu1_2")
    maxPooling2dLayer([2 2],"Name","pool1")
    convolution2dLayer([3 2],128,"Name","conv2_1","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu2_1")
    convolution2dLayer([2 2],128,"Name","conv2_2","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu2_2")
    maxPooling2dLayer([2 2],"Name","pool2")
    convolution2dLayer([2 2],64,"Name","conv3_1","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu3_1")
    convolution2dLayer([3 3],32,"Name","conv3_2","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu3_2")
    convolution2dLayer([3 3],2,"Name","conv3_3","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","F")];
layers2 = [
    sequenceInputLayer([5 1 2],"Name","alpha")
    alphaMultiplyF("ComplexMultiply")
    ];
net=dlnetwork(layers1);
net=addLayers(net,layers2);
net=connectLayers(net,"F","ComplexMultiply/F");
net=initialize(net);
function [loss,gradients,state] = modelLoss(net,beta,alpha,T)
% Forward data through network.
[Y,state] = forward(net,beta,alpha);
% Calculate mean squared error loss.
loss = mse(Y,T);
% Calculate gradients of loss with respect to learnable parameters.
gradients = dlgradient(loss,net.Learnables);
end


Answer 1

arushi on 10 Sep 2024

When dlgradient returns zeros for every learnable parameter, the loss is locally insensitive to all of the parameters at the point where it was evaluated. The cause can lie in the network architecture, the loss function, the data, or the way the gradients are computed. Here are a few steps you can take to debug the issue:

Inspect Learnables: Check net.Learnables to ensure it contains the parameters you expect.
Test Custom Layer: If possible, isolate and test your custom layer (alphaMultiplyF) to ensure it correctly computes forward and backward passes.
Simplify the Model: Temporarily simplify your model to a minimal version that should be capable of learning (e.g., remove some layers). This can help identify if a specific part of the network is causing the issue.
Check Outputs: Before calculating the loss, inspect the outputs of the network (Y) to ensure they're reasonable and not all zeros or NaNs.
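
The checks above can be sketched on a small stand-in network. This is only an illustration under assumptions: the layers, input sizes, and the helper name diagnosticLoss are hypothetical and not taken from the question, and alphaMultiplyF is left out because its source is not shown. The sketch lists the learnables, inspects the output range, and prints a gradient norm per parameter so that you can see which layers, if any, receive a nonzero gradient.

```matlab
% Small stand-in network (hypothetical; substitute your own dlnetwork).
layers = [
    imageInputLayer([4 4 1],"Name","in","Normalization","none")
    convolution2dLayer(3,2,"Name","conv","Padding","same")
    reluLayer("Name","relu")
    fullyConnectedLayer(1,"Name","fc")];
net = dlnetwork(layers);

% Random formatted data: S = spatial, C = channel, B = batch.
X = dlarray(rand(4,4,1,8,"single"),"SSCB");
T = dlarray(rand(1,8,"single"),"CB");

% 1) Inspect learnables: every layer you expect to train should be listed.
disp(net.Learnables(:,["Layer","Parameter"]))

% 2) Check outputs: look for all-zero or NaN predictions before the loss.
Y = extractdata(predict(net,X));
fprintf("Output range: [%g, %g], any NaN: %d\n", ...
    min(Y,[],"all"),max(Y,[],"all"),any(isnan(Y),"all"));

% 3) Evaluate loss and gradients. dlgradient must run inside dlfeval.
[loss,gradients] = dlfeval(@diagnosticLoss,net,X,T);

% 4) Per-parameter gradient norms show where the gradient vanishes.
gradNorms = cellfun(@(g) sqrt(sum(extractdata(g).^2,"all")),gradients.Value);
disp(table(gradients.Layer,gradients.Parameter,gradNorms, ...
    'VariableNames',{'Layer','Parameter','GradNorm'}))

function [loss,gradients] = diagnosticLoss(net,X,T)
    % Forward pass, MSE loss, and gradients with respect to the learnables.
    Y = forward(net,X);
    loss = mse(Y,T);
    gradients = dlgradient(loss,net.Learnables);
end
```

If the norms are zero only from a certain layer backwards, the layer just after that point is a natural place to look. In the network from the question, that could mean checking whether the final ReLU ("F") outputs only zeros for your data, or whether the custom layer passes gradients through.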

Hope it helps!
