I've tracked normalization down to (I think?) four functions:
Trainer.m
->initializeNetworkNormalization
-->TrainerGPUStrategy.m
-->Precision.m
Within Trainer.m, the function initializeNetworkNormalization contains these two lines:
avgI = this.ExecutionStrategy.computeAverageImage(data, augmentations, executionSettings);
net.Layers{1}.AverageImage = precision.cast(avgI);
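I haven't dug into Precision.m, but I assume precision.cast is just a thin wrapper that converts the accumulated double average back to the network's training precision (e.g. single). A guess at its shape, not the actual source:

function data = cast(this, data)
    % My guess (not verified) at Precision.m's cast method: convert the
    % data to the stored precision type, e.g. 'single', via the builtin cast.
    data = cast(data, this.Type);
end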
Here, this.ExecutionStrategy is an nnet.internal.cnn.TrainerGPUStrategy:
classdef TrainerGPUStrategy < nnet.internal.cnn.TrainerExecutionStrategy
    % TrainerGPUStrategy   Execution strategy for running the Trainer on the
    % GPU

    %   Copyright 2016 The MathWorks, Inc.

    methods
        function Y = environment(~, X)
            Y = gpuArray(X);
        end

        function [avgI, numImages] = computeAccumImage(~, data, augmentations)
            data.start();
            avgI = gpuArray(0);
            numImages = 0;
            while ~data.IsDone
                X = data.next();                    % next mini-batch of images
                if ~isempty(X)
                    X = apply(augmentations, X);
                    X = gpuArray(double(X));
                    avgI = avgI + sum(X, 4);        % accumulate pixel sums across the batch
                    numImages = numImages + size(X, 4);  % count images in the batch
                end
            end
        end
    end
end
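Note that the method shown is computeAccumImage, while Trainer.m calls computeAverageImage, so I'm guessing the division happens in the shared base class TrainerExecutionStrategy. Roughly (my guess, not the actual source):

function avgI = computeAverageImage(this, data, augmentations, executionSettings)
    % Guessed wrapper: accumulate pixel sums over all images, then divide
    % once by the total image count. (executionSettings unused in this sketch.)
    [avgI, numImages] = this.computeAccumImage(data, augmentations);
    avgI = avgI / numImages;
end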
I'm no expert, but I'm assuming X = data.next() pulls the images in mini-batches while the loop keeps a running sum of pixel values and a count of images (to divide by at the end). If that's all it's doing, I can't see why it should take so long.
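For reference, here is the equivalent computation as I understand it in plain MATLAB, reading one image at a time from a hypothetical imageDatastore (folder path made up, augmentations ignored, all images assumed the same size):

imds = imageDatastore('C:\myTrainingImages');  % hypothetical image folder
avgI = 0;
numImages = 0;
while hasdata(imds)
    X = double(read(imds));       % read one image as double
    avgI = avgI + X;              % running sum of pixel values
    numImages = numImages + 1;    % count of images seen
end
avgI = avgI / numImages;          % average image

If the internal version is doing essentially this (plus GPU transfers), I'd expect disk I/O rather than the arithmetic to dominate the runtime.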
