Increase GPU Throughput During Training
John on 20 Aug 2018
Commented: Joss Knight on 21 Aug 2018
I have a single Tesla GP100 GPU with 16 GB of RAM. When I'm training my neural network, I have two issues:
- Using an imageDatastore spends a HUGE amount of time doing an fread (I'm using a custom ReadFcn because my data is asymmetric and that seemed easiest). I can overcome this by reading all the data into memory prior to training, but that will not scale.
- During training I am only using 2.2 GB of the 16 GB available on the GPU. When I use the exact same network and data with TensorFlow, I use all 16 GB. This is the case even if I preload all the data into memory as above. I'm guessing that is because TensorFlow is "queuing up" batches and MATLAB is not. Is there a way to increase this?
Here is my minimal example code:
function net = run_training_public(dims, nbatch, lr, nepoch)
% Load data from binary .dat files via a custom reader
ds = imageDatastore('./data/set3', 'IncludeSubfolders', true, ...
    'ReadFcn', @(x) reader_public(x, dims), ...
    'LabelSource', 'foldernames', ...
    'FileExtensions', '.dat');

% Load neural network structure
network = cnn1;

% Set up options for training and execute training
options = trainingOptions('adam', 'MaxEpochs', nepoch, ...
    'MiniBatchSize', nbatch, 'Shuffle', 'every-epoch', ...
    'InitialLearnRate', lr, ...
    'ExecutionEnvironment', 'gpu', 'Verbose', true);
net = trainNetwork(ds, network, options);
end

function data = reader_public(fileName, dims)
f = fopen(fileName, 'r');
data = fread(f, [dims(2) dims(1)], '*int16').';
fclose(f);
end
Accepted Answer
More Answers (1)
Joss Knight on 21 Aug 2018
Edited: Joss Knight on 21 Aug 2018
The problem with using a ReadFcn is that it prevents MATLAB from being able to do I/O in batch and in the background, because it has to use the MATLAB compute thread. If you just need to resize your images, you should use augmentedImageDatastore instead. You could see as much as a 100x improvement in performance.
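For files in a format MATLAB's built-in image readers can handle, that change might look like this (a minimal sketch: the folder path is taken from the question, and dims is assumed to be the [height width] target size):

```matlab
% Let imageDatastore use its fast built-in readers (no custom ReadFcn),
% and let augmentedImageDatastore do the resizing per mini-batch in the
% background instead of one file at a time in the interpreter.
ds = imageDatastore('./data/set3', 'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');

augDs = augmentedImageDatastore(dims, ds);

net = trainNetwork(augDs, network, options);
```

Note this only applies if imread can read your files; a custom binary format like the .dat files in the question still needs a custom reader.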
As for your second point, I wouldn't worry about it. TensorFlow basically preallocates all of your GPU memory up front whether you need it or not; MATLAB takes the approach that this is antisocial to other applications. It only reserves memory as you use it, up to a quarter of total memory. The rest it releases back to the system if it doesn't need it any more. If you increase your MiniBatchSize up to the point where you start to run out of memory, you should be using the GPU's memory with good efficiency.
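To see how close you are to the limit while tuning MiniBatchSize, you can query the device between runs (a small sketch; gpuDevice reports memory in bytes):

```matlab
% Inspect GPU memory headroom; raise MiniBatchSize until AvailableMemory
% approaches zero (or training errors with an out-of-memory message).
g = gpuDevice;
fprintf('Free GPU memory: %.1f GB of %.1f GB\n', ...
    g.AvailableMemory/2^30, g.TotalMemory/2^30);
```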
Sometimes you can get better performance by allowing MATLAB to reserve more memory. You can try this using the following command:
feature('GpuAllocPoolSizeKb', intmax('int32'));
Joss Knight on 21 Aug 2018
Interesting. You were essentially using ReadFcn exactly as intended, to support an unsupported file format. But it still has the effect of preventing imageDatastore from performing mass prefetching in background threads because it needs to use the MATLAB interpreter to read each file. I would have suggested creating a custom MiniBatchable Datastore so you can at least load a whole batch of data at once (and this also has the option of using a parallel pool to load in the background, which should mean you can hide all the I/O). However, it looks like you found a solution just as good.
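For reference, a skeleton of such a MiniBatchable datastore might look like the following (a hedged sketch: the class name is made up, and the file list, label handling, and binary reader are placeholders adapted from the question's reader_public):

```matlab
classdef BinaryMiniBatchDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.MiniBatchable
    % Sketch of a mini-batchable datastore for custom int16 binary files,
    % reading a whole mini-batch of files per call to read().
    properties
        MiniBatchSize
    end
    properties (SetAccess = protected)
        NumObservations
    end
    properties (Access = private)
        Files       % cell array of file paths
        Labels      % categorical labels, one per file
        Dims        % [height width] of each image
        CurrentIdx  % index of the next unread observation
    end
    methods
        function ds = BinaryMiniBatchDatastore(files, labels, dims, batchSize)
            ds.Files = files;
            ds.Labels = labels;
            ds.Dims = dims;
            ds.MiniBatchSize = batchSize;
            ds.NumObservations = numel(files);
            reset(ds);
        end
        function tf = hasdata(ds)
            tf = ds.CurrentIdx <= ds.NumObservations;
        end
        function [data, info] = read(ds)
            % Return a table: first column predictors, second responses.
            n = min(ds.MiniBatchSize, ds.NumObservations - ds.CurrentIdx + 1);
            images = cell(n, 1);
            for k = 1:n
                f = fopen(ds.Files{ds.CurrentIdx + k - 1}, 'r');
                images{k} = fread(f, [ds.Dims(2) ds.Dims(1)], '*int16').';
                fclose(f);
            end
            responses = ds.Labels(ds.CurrentIdx : ds.CurrentIdx + n - 1);
            data = table(images, responses(:));
            info = struct;
            ds.CurrentIdx = ds.CurrentIdx + n;
        end
        function reset(ds)
            ds.CurrentIdx = 1;
        end
    end
end
```

To hide the I/O behind training, the class could additionally inherit from matlab.io.datastore.PartitionableByIndex and set DispatchInBackground to true in trainingOptions, so a parallel pool prefetches batches.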