The gradient of mini batches

Question

MAHSA YOUSEFI on 23 Nov 2020

Direct link to this question

https://la.mathworks.com/matlabcentral/answers/658543-the-gradient-of-mini-batches

Commented: Mahesh Taparia on 21 Dec 2020

Accepted Answer: Mahesh Taparia

Hi there.

I need your confirmation or rejection on this question.

In the following code, if the mini-batch size is h,

[grad,loss] = dlfeval(@modelGradients,dlnet,dlX_miniBatch,Y_miniBatch);

is grad the average of the gradients of the loss over the h samples? In other words, does it compute the per-sample gradients automatically and, at the end, return the following?

$$\text{grad} = \frac{1}{h} \sum_{i=1}^{h} \nabla\, \text{loss}(y_i, \hat{y}_i)$$

Following on from this: to compute the total loss and gradient for the full batch, should we average the losses and average the gradients over the number of mini-batches (say, 1000 mini-batches, each of size h)?

Answer 1

Mahesh Taparia on 14 Dec 2020

Direct link to this answer

https://la.mathworks.com/matlabcentral/answers/658543-the-gradient-of-mini-batches#answer_575280

Hi

The function dlfeval evaluates custom deep learning models. The loss is calculated according to what is defined in the modelGradients function, so if that function computes the average loss, dlfeval returns the averaged loss. For example, consider this modelGradients function: it computes the average cross-entropy loss, so it returns the average loss. The gradients are calculated with respect to the loss defined in that function for the network.
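To connect this to the formula in the question, here is a short derivation sketch. It assumes that the loss returned for each mini-batch is the mean over its h samples, that all K mini-batches have the same size h, and that every gradient is evaluated at the same parameters. The symbols \theta, \ell, L_k, K and N are introduced here for illustration and do not appear in the original posts.

```latex
% Assumption: each mini-batch loss L_k is the mean over its h samples,
% and all gradients are taken at the same parameters \theta.
\nabla_\theta L_k
  = \nabla_\theta \left( \frac{1}{h} \sum_{i=1}^{h} \ell(y_i, \hat{y}_i) \right)
  = \frac{1}{h} \sum_{i=1}^{h} \nabla_\theta\, \ell(y_i, \hat{y}_i)

% With K mini-batches of equal size h (N = K h samples in total):
\frac{1}{K} \sum_{k=1}^{K} \nabla_\theta L_k
  = \frac{1}{K h} \sum_{j=1}^{N} \nabla_\theta\, \ell(y_j, \hat{y}_j)
  = \nabla_\theta L_{\text{full}}
```

Under those assumptions, averaging the mini-batch losses and the mini-batch gradients over the K mini-batches gives the full-batch mean loss and its gradient. If the mini-batches differ in size (for example, a smaller final one), each term would need to be weighted by its batch size instead of using a plain average.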

2 Comments

MAHSA YOUSEFI on 19 Dec 2020

In the example you mentioned, there is a mistake.

function [gradients, loss] = modelGradients(parameters, dlX, T)
    % Forward data through the model function.
    dlY = model(parameters,dlX);
    % Compute loss.
    loss = crossentropy(dlX,T);
    % Compute gradients.
    gradients = dlgradient(loss,parameters);
end

dlY, not dlX, must be fed to crossentropy!

Mahesh Taparia on 21 Dec 2020

Yes, the cross-entropy loss should be calculated between dlY and T. The documentation page will be updated.
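For reference, a sketch of how the function would presumably read with the correction agreed on above applied. The only change is the first argument to crossentropy; model is the user-defined function from the quoted example and is not shown here, and dlY is assumed to be in a form that crossentropy accepts (for instance, a formatted dlarray).

```matlab
function [gradients, loss] = modelGradients(parameters, dlX, T)
    % Forward data through the model function (model is user-defined, not shown).
    dlY = model(parameters, dlX);
    % Compute the loss between the predictions dlY and the targets T.
    loss = crossentropy(dlY, T);
    % Compute gradients of the loss with respect to the learnable parameters.
    gradients = dlgradient(loss, parameters);
end
```

As in the question, this function is meant to be called through dlfeval so that automatic differentiation is enabled when dlgradient runs.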
