Value to differentiate is not traced. It must be a traced real dlarray scalar. Use dlgradient inside a function called by dlfeval to trace the variables.
I'm training a network that predicts 9 noise current values (the first 6 are magnitudes, the last 3 are phases) with a custom loss function, but I get this error when training the network.
Error using dlarray/dlgradient (line 105)
Value to differentiate is not traced. It must be a traced real dlarray scalar. Use dlgradient inside a function called by dlfeval to trace the variables.
The following is my custom loss function.
function [total_loss, gradients] = forwardLoss_3port_V4(dlnet, dX, dY8, dY18)
% Forward pass through the network head
rawOut = forward(dlnet, dX);
rawOut = extractdata(rawOut);
dY8 = extractdata(dY8);
dY18 = extractdata(dY18);
B = size(dY8 , 2);
%Division of predicted data
mags_pred = 1./(1 + exp(-rawOut(1:6,:)));
phases_pred = pi*tanh(rawOut(7:9,:));
%Creating minmax values for the true values
phase_tar12 = atan2(dY8(3,:), dY8(4,:));
re12 = dY8(3,:);
im12 = dY8(4,:);
mag_tar12 = hypot(re12, im12);
a = log10(dY8(1,:));
b = log10(dY8(7,:));
c = log10(max(mag_tar12, realmin('double')));
yLog = [a,b,c];
minmax_target = [min(yLog, [], 1).', max(yLog, [], 1).'];
%revamp the predicted magnitudes and build the Cy3
preC11 = mags_pred(1,:) .* (minmax_target(1,2) - minmax_target(1,1)) + minmax_target(1,1);
C11 = 10.^preC11;
preC22 = mags_pred(2,:) .* (minmax_target(2,2) - minmax_target(2,1)) + minmax_target(2,1);
C22 = 10.^preC22;
preC33 = mags_pred(3,:) .* (minmax_target(1,2) - minmax_target(1,1)) + minmax_target(1,1);
C33 = 10.^preC33;
pre_offdiag = mags_pred(4:6,:) .* (minmax_target(3,2) - minmax_target(3,1)) + minmax_target(3,1);
magC12 = 10.^pre_offdiag(1,:);
magC13 = 10.^pre_offdiag(2,:);
magC23 = 10.^pre_offdiag(3,:);
% Build complex off-diagonals from mag & phase
C12 = magC12 .* exp(1i * phases_pred(1,:)); % [1×B]
C13 = magC13 .* exp(1i * phases_pred(2,:)); % [1×B]
C23 = magC23 .* exp(1i * phases_pred(3,:));
Cy3 = dlarray(complex(zeros(3,3,B)));
% Diagonals
Cy3(1,1,:) = reshape(C11, 1,1,[]);
Cy3(2,2,:) = reshape(C22, 1,1,[]);
Cy3(3,3,:) = reshape(C33, 1,1,[]);
% Off-diagonals
Cy3(1,2,:) = reshape(C12, 1,1,[]); Cy3(2,1,:) = conj(Cy3(1,2,:));
Cy3(1,3,:) = reshape(C13, 1,1,[]); Cy3(3,1,:) = conj(Cy3(1,3,:));
Cy3(2,3,:) = reshape(C23, 1,1,[]); Cy3(3,2,:) = conj(Cy3(2,3,:));
% Y3 from flattened [Re(9) Im(9)]
Y3r_flat = reshape(dY18(1:9,:), 3,3,[]);
Y3i_flat = reshape(dY18(10:18,:), 3,3,[]);
Y3_all = Y3r_flat + 1i*Y3i_flat;
% Z_source (unscale from normalized features)
Zs = rescaleVector(dX(end-1,:), 10, 50, 0, 1) + 1j*rescaleVector(dX(end,:), -50, 50, -1, 1) ;
Ys = 1./(Zs); % 1x1xfreq admittance
% Source noise (targets include it)
G2 = 2*physconst('boltzmann')*290*real(Ys);
G2 = reshape(G2, 1,1,[]);
Y2 = reshape(Ys, 1,1,[]);
% Which node of the 3-port is connected to the 1-port
ports1 = 3; ports2 = 1;
% Cascade → external 2-port
Gc = cascadeNoiseCorrelation(Y3_all, Cy3, Y2, G2, ports1, ports2); % 2x2xfreq
% Cascade elements
y11n_pred = log10(real(Gc(1,1,:)));
y22n_pred = log10(real(Gc(2,2,:)));
y12n_pred = log10(hypot(real(Gc(1,2,:)), imag(Gc(1,2,:))));
y12p_pred = atan2(imag(Gc(1,2,:)), real(Gc(1,2,:)));
% 2x2 target
y11n_tar = log10(dY8(1,:));
y22n_tar = log10(dY8(7,:));
y12n_tar = c;
y12p_tar = phase_tar12;
deltaphi = y12p_pred - y12p_tar;
deltaphi = atan2(sin(deltaphi), cos(deltaphi));
mag12_term = (y12n_tar - y12n_pred).^2;
alpha = 0.1;
phase12_term = 0.5 .* (y12n_tar.^2) .* (1-cos(pi*deltaphi));
beta = 1.0;
mag11_term = (y11n_tar - y11n_pred).^2;
mag22_term = (y22n_tar - y22n_pred).^2;
per_ex = mag12_term + alpha.*phase12_term + beta.*(mag11_term + mag22_term);
total_loss = mean(per_ex, 'all');
total_loss = dlarray(total_loss);
% Backprop
gradients = dlgradient(total_loss, dlnet.Learnables);
end
The cascadeNoiseCorrelation function is written below (in case needed for analysis):
function [Gc] = cascadeNoiseCorrelation(Y1, G1, Y2, G2, ports1, ports2)
% Cascades current noise correlation matrices over frequency
%
% Inputs:
% Y1,Y2 m×m×f and n×n×f admittances, respectively
% G1,G2 m×m×f and n×n×f noise correlation matrices, respectively
% ports1,ports2 vector with connecting ports of each Network of same length c
%
% Outputs:
% Gc (m+n-c)×(m+n-c)×f current noise correlation matrix
% Preallocate:
% External Ports
ext1 = setdiff(1:size(Y1,1), sort(ports1));
ext2 = setdiff(1:size(Y2,1), sort(ports2));
rows1 = [ext1, ports1];
rows2 = [ext2, ports2];
freq = size(Y1,3);
N = numel(ext1) + numel(ext2);
Gc = complex(zeros(N,N,freq));
% block partitioning
m_ext = numel(ext1);
n_ext = numel(ext2);
for k = 1:freq
% Temporary 2D-Matrices
Y1k = Y1(:,:,k);
Y2k = Y2(:,:,k);
G1k = G1(:,:,k);
G2k = G2(:,:,k);
% Permuted matrices
Y1p = Y1k(rows1, rows1);
Y2p = Y2k(rows2, rows2);
G1p = G1k(rows1, rows1);
G2p = G2k(rows2, rows2);
% Y-Parameters from external to internal
Y_ei = [ Y1p(1:m_ext,m_ext+1:end); Y2p(1:n_ext,n_ext+1:end) ];
% Combined internal Yii parameters
Y_ii = Y1p(m_ext+1:end,m_ext+1:end) + Y2p(n_ext+1:end,n_ext+1:end);
% Noise contributions separated into:
% external to external
G_ee = blkdiag(G1p(1:m_ext,1:m_ext), G2p(1:n_ext,1:n_ext));
% internal to internal
G_ii = G1p(m_ext+1:end,m_ext+1:end) + G2p(n_ext+1:end,n_ext+1:end);
% external to internal
G_ei = [G1p(1:m_ext,m_ext+1:end) ; G2p(1:n_ext,n_ext+1:end)];
% internal to external
G_ie = [G1p(m_ext+1:end,1:m_ext) , G2p(n_ext+1:end,1:n_ext)];
% total noise correlation matrix
G_tot = [G_ii, G_ie; G_ei, G_ee];
% Noise current transition function from internal currents to external ports
Hj = -Y_ei/(Y_ii);
% Total noise transition matrix
Htot = [Hj, eye(N)];
% Total noise correlation matrix
Gc(:,:,k) = Htot*G_tot*Htot';
end
end
My training initialization is also shown:
function initDLnet(obj)
% Define DL Network
% define input layer
networkLayers = [featureInputLayer(obj.InputDim, 'Name', 'ParameterInputs')];
% define hidden layer
for i_layer = 1:obj.numHiddenLayer
networkLayers = [
networkLayers
fullyConnectedLayer(round(obj.InputDim*obj.hiddenLayerScaling), 'Name', "fc"+num2str(i_layer))
layerNormalizationLayer('Name',"ln"+num2str(i_layer))
reluLayer('Name', "relu"+num2str(i_layer))
];
end
% define output layer
networkLayers = [
networkLayers
fullyConnectedLayer(obj.OutputDim, 'Name', 'fcout');
% tanhLayer("Name", "dOutput")
% sigmoidLayer('Name', 'Output');
% reluLayer('Name', 'dOutput');
];
obj.DLnet = dlnetwork(networkLayers);
end
Answers (1)
Matt J
about 3 hours ago
Edited: Matt J
11 minutes ago
rawOut = extractdata(rawOut);
dY8 = extractdata(dY8);
dY18 = extractdata(dY18);
If you convert all your inputs to ordinary numeric arrays with extractdata at the top of your code, then none of the subsequent operations will be traced. For dlgradient to work, the sequence of operations leading from the network output to the loss must be a continuous chain of dlarray operations.
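As a rough illustration of that point, here is a minimal sketch of a loss function that keeps the chain intact. It is not the asker's model: the layer sizes, the random data, and the helper name tracedLoss are hypothetical, and a plain squared error stands in for the cascade-based loss. It assumes Deep Learning Toolbox is available. The two things to notice are that the output of forward is never passed through extractdata, and that the final loss is not re-wrapped with dlarray(...), since wrapping a value in a new dlarray would presumably also start a fresh, untraced array.

```matlab
% Minimal sketch; sizes, data, and the name tracedLoss are hypothetical.
layers = [
    featureInputLayer(4, 'Name', 'in')
    fullyConnectedLayer(9, 'Name', 'fcout')];
net = dlnetwork(layers);

X = dlarray(rand(4, 16, 'single'), 'CB');   % 4 features, batch of 16
T = rand(9, 16, 'single');                  % targets can stay plain numeric

% dlfeval switches tracing on; dlgradient is called inside tracedLoss.
[loss, gradients] = dlfeval(@tracedLoss, net, X, T);

function [loss, gradients] = tracedLoss(net, X, T)
    % stripdims drops the 'CB' labels but, unlike extractdata, keeps the trace.
    rawOut = stripdims(forward(net, X));
    mags   = 1 ./ (1 + exp(-rawOut(1:6, :)));   % same squashing as in the question
    phases = pi * tanh(rawOut(7:9, :));
    pred   = [mags; phases];
    % Stand-in loss: every operation here acts on a dlarray, so loss stays traced.
    loss   = sum((pred - T).^2, 'all') / numel(T);
    gradients = dlgradient(loss, net.Learnables);
end
```

The targets do not depend on the learnable parameters, so they can be ordinary numeric arrays; it is the path from rawOut to the loss that has to consist of dlarray operations throughout.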