
More detailed documentation on Deep Learning Toolbox normalization layers?

Below are excerpts from the Deep Learning Toolbox documentation describing several neural network layer types that perform different kinds of input data normalization. For me, the descriptions are too terse and textual to make clear the differences between the normalization operations these layers apply (both at training time and at test time). Is there any additional documentation somewhere with a more mathematical description, and perhaps illustrative figures?
imageInputLayer An image input layer inputs 2-D images to a network and applies data normalization.
batchNormalizationLayer A batch normalization layer normalizes a mini-batch of data across all observations for each channel independently. To speed up training of the convolutional neural network and reduce the sensitivity to network initialization, use batch normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
groupNormalizationLayer A group normalization layer normalizes a mini-batch of data across grouped subsets of channels for each observation independently. To speed up training of the convolutional neural network and reduce the sensitivity to network initialization, use group normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
instanceNormalizationLayer An instance normalization layer normalizes a mini-batch of data across each channel for each observation independently. To improve the convergence of training the convolutional neural network and reduce the sensitivity to network hyperparameters, use instance normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
layerNormalizationLayer A layer normalization layer normalizes a mini-batch of data across all channels for each observation independently. To speed up training of recurrent and multilayer perceptron neural networks and reduce the sensitivity to network initialization, use layer normalization layers after the learnable layers, such as LSTM and fully connected layers.
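For reference, here is a minimal sketch of the kind of network these descriptions seem to assume (the architecture itself is made up, following the "between convolutional layers and nonlinearities" advice above):

layers = [
    imageInputLayer([28 28 3])                   % input layer with optional data normalization
    convolution2dLayer(3, 16, 'Padding', 'same') % learnable layer
    batchNormalizationLayer                      % normalization between conv and nonlinearity
    reluLayer                                    % nonlinearity
    fullyConnectedLayer(10)
    softmaxLayer];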

Accepted Answer

Tish Sheridan on 4 May 2022
Hi! Did you find the Algorithms sections at the end of each of those layer pages (all except the input layer)? For example: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.batchnormalizationlayer.html#d123e18507
Does that help?
  3 comments
Ieuan Evans on 5 May 2022
For simplicity, we refer to the elements of the input data as x_i, where X is an N-D array. The dimensionality of the input depends on the type of data (e.g. 2-D images, 3-D images, sequences, etc.). For example, in these algorithm descriptions, x_i can denote a single channel of a pixel (e.g. the R value of a pixel in an RGB image).
Here, and in most deep learning data contexts, the term "time" refers to the temporal dimension of sequence data. For example, if the data is a numChannels-by-numObservations-by-numTimeSteps array representing a batch of sequences, then the time dimension is the third dimension.
For example, if you have video data represented as an H-by-W-by-C-by-numObservations-by-numTimeSteps array, you can normalize over the spatial dimensions (1 and 2), the channel dimension (3), and the time dimension (5) independently of the observation dimension (4).
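As a concrete sketch of which dimensions each layer pools its statistics over, here is some plain MATLAB (not Deep Learning Toolbox code) applied to a made-up video batch. Epsilon and the learnable scale and offset are omitted, and details such as whether the time dimension is pooled may differ in the actual Toolbox implementations:

X = rand(32, 32, 6, 8, 10);   % H x W x C x numObservations x numTimeSteps
epsilon = 1e-5;

% Batch normalization: one mu/sigma^2 per channel, pooled over
% space, observations, and time (dimensions 1, 2, 4, 5)
mu = mean(X, [1 2 4 5]);
s2 = var(X, 1, [1 2 4 5]);
Xbn = (X - mu) ./ sqrt(s2 + epsilon);

% Layer normalization: one mu/sigma^2 per observation, pooled over
% space, channels, and time (dimensions 1, 2, 3, 5)
mu = mean(X, [1 2 3 5]);
s2 = var(X, 1, [1 2 3 5]);
Xln = (X - mu) ./ sqrt(s2 + epsilon);

% Instance normalization: one mu/sigma^2 per channel AND per observation,
% pooled over space and time (dimensions 1, 2, 5)
mu = mean(X, [1 2 5]);
s2 = var(X, 1, [1 2 5]);
Xin = (X - mu) ./ sqrt(s2 + epsilon);

% Group normalization: like instance normalization, but channels are
% first split into groups (here, 6 channels into 2 groups of 3)
Xg = reshape(X, 32, 32, 3, 2, 8, 10);   % H x W x C/G x G x N x T
mu = mean(Xg, [1 2 3 6]);
s2 = var(Xg, 1, [1 2 3 6]);
Xgn = reshape((Xg - mu) ./ sqrt(s2 + epsilon), size(X));

In each case the result has zero mean and unit variance along the pooled dimensions, and separate statistics along the remaining ones.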
Group, instance, and layer normalization layers normalize mini-batches independently and calculate a fresh μ and σ² for each mini-batch, so they behave the same way at training and at inference time. Batch normalization layers behave differently: they use the mini-batch statistics at training time, but use the aggregated μ and σ² calculated over the training data at inference time.
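A rough sketch of that training/inference difference follows. The exponential moving average here is just one common way to aggregate the statistics; the Toolbox stores its aggregated values in the layer's TrainedMean and TrainedVariance properties, and may compute them differently:

X = rand(32, 32, 6, 8);       % H x W x C x numObservations
epsilon = 1e-5;
runningMu = zeros(1, 1, 6);   % aggregated statistics, one per channel
runningS2 = ones(1, 1, 6);
momentum = 0.9;               % illustrative value

% Training: normalize with the current mini-batch statistics
% and update the running estimates
mu = mean(X, [1 2 4]);
s2 = var(X, 1, [1 2 4]);
Xtrain = (X - mu) ./ sqrt(s2 + epsilon);
runningMu = momentum*runningMu + (1 - momentum)*mu;
runningS2 = momentum*runningS2 + (1 - momentum)*s2;

% Inference: reuse the aggregated statistics instead
Xtest = (X - runningMu) ./ sqrt(runningS2 + epsilon);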
Figure 2 of this paper has some handy diagrams of the different types of normalization layers.


More Answers (0)
