Compute gradients for custom training loops using automatic differentiation
Use dlgradient to compute derivatives using automatic differentiation for custom training loops.
Tip
For most deep learning tasks, you can use a pretrained network and adapt it to your own data. For an example showing how to use transfer learning to retrain a convolutional neural network to classify a new set of images, see Train Deep Learning Network to Classify New Images. Alternatively, you can create and train networks from scratch using layerGraph objects with the trainNetwork and trainingOptions functions.
If the trainingOptions function does not provide the training options that you need for your task, then you can create a custom training loop using automatic differentiation. To learn more, see Define Deep Learning Network for Custom Training Loops.
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk) returns the gradients of y with respect to the variables x1 through xk.
Call dlgradient from inside a function passed to dlfeval. See Compute Gradient Using Automatic Differentiation and Use Automatic Differentiation In Deep Learning Toolbox.
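For example, a minimal sketch of this pattern, using the Rosenbrock function as the objective (the variable names are illustrative):

function [y,dydx1,dydx2] = rosenbrock(x1,x2)
% Evaluate the Rosenbrock function and its gradients
% with respect to both inputs.
y = 100*(x2 - x1.^2).^2 + (1 - x1).^2;
[dydx1,dydx2] = dlgradient(y,x1,x2);
end

Evaluate the function and its gradients with dlfeval, passing dlarray inputs so that the computation is traced:

x1 = dlarray(-1);
x2 = dlarray(2);
[y,dydx1,dydx2] = dlfeval(@rosenbrock,x1,x2);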
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk,Name,Value) returns the gradients and specifies additional options using one or more name-value pairs. For example, dydx = dlgradient(y,x,'RetainData',true) causes the gradient to retain intermediate values for reuse in subsequent dlgradient calls. This syntax can save time, but uses more memory. For more information, see Tips.
The dlgradient function does not support calculating higher-order derivatives when using dlnetwork objects containing custom layers with a custom backward function.
The dlgradient function does not support calculating higher-order derivatives when using dlnetwork objects containing the following layers:
gruLayer
lstmLayer
bilstmLayer
The dlgradient
function does not support calculating higher-order
derivatives that depend on the following functions:
gru
lstm
embed
prod
interp1
A dlgradient call must be inside a function. To obtain a numeric value of a gradient, you must evaluate the function using dlfeval, and the argument to the function must be a dlarray. See Use Automatic Differentiation In Deep Learning Toolbox.
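For example, a minimal sketch (gradFun is a hypothetical helper name, not part of the toolbox):

function [y,dydx] = gradFun(x)
y = sum(x.^2,'all');    % traced computation on the dlarray input
dydx = dlgradient(y,x); % gradient of y with respect to x
end

x0 = dlarray([1 2 3]);
[y,dydx] = dlfeval(@gradFun,x0);
gradValue = extractdata(dydx)   % numeric gradient value: [2 4 6]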
To enable the correct evaluation of gradients, the y argument must use only supported functions for dlarray. See List of Functions with dlarray Support.
If you set the 'RetainData' name-value pair argument to true, the software preserves tracing for the duration of the dlfeval function call instead of erasing the trace immediately after the derivative computation. This preservation can cause a subsequent dlgradient call within the same dlfeval call to execute faster, but uses more memory. For example, when training an adversarial network, the 'RetainData' setting is useful because the two networks share data and functions during training. See Train Generative Adversarial Network (GAN).
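A minimal sketch of this pattern (the losses and the sharedGradients name are illustrative, not the GAN example itself):

function [gradA,gradB] = sharedGradients(x)
z = x.^2;                % intermediate values shared by both losses
lossA = sum(z,'all');
lossB = sum(z.^2,'all');
% Retain the trace so the second dlgradient call can reuse it.
gradA = dlgradient(lossA,x,'RetainData',true);
gradB = dlgradient(lossB,x);
end

[gradA,gradB] = dlfeval(@sharedGradients,dlarray([1 2 3]));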
When you need to calculate only first-order derivatives, ensure that the 'EnableHigherDerivatives' option is false, as this is usually quicker and requires less memory.
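For example, in a model gradients function for a custom training loop (the modelGradients name and its inputs are assumptions for illustration; net is a dlnetwork, X is input data, and T contains the targets):

function gradients = modelGradients(net,X,T)
Y = forward(net,X);        % forward pass of the dlnetwork
loss = crossentropy(Y,T);
% First-order training only: do not build the higher-order trace,
% which saves time and memory.
gradients = dlgradient(loss,net.Learnables,'EnableHigherDerivatives',false);
end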