GPU utilization is not 100%.

DONGHYUN KIM on 22 May 2019
Commented: yan gao on 25 Sep 2021
[Attached image: gpu.PNG]
GPU utilization is only 40% while running my deep learning network.
It sometimes goes up to 80% for a while, but it usually stays at 40%.
I want to know why.
  1 Comment
Walter Roberson on 22 May 2019
A GPU can only run at full speed if the entire problem fits into memory. That is seldom the case for deep learning: those networks are updated incrementally, so transferring images in from disk and memory takes a fair bit of time.
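As a rough illustration of the point above, the following sketch models one training iteration as a data-loading phase followed by a GPU compute phase. The timings `tLoad` and `tGpu` are hypothetical numbers chosen for illustration, not measurements from the original poster's system, and the model ignores everything else that can stall a GPU.

```matlab
% Hypothetical per-iteration timings in seconds (illustrative only).
tLoad = 0.06;   % CPU time to read and preprocess one mini-batch
tGpu  = 0.04;   % GPU time spent running kernels for that mini-batch

% Serial pipeline: the GPU sits idle while the CPU loads data.
utilSerial = tGpu / (tLoad + tGpu);

% Overlapped pipeline: the next mini-batch is loaded in the background,
% so each iteration takes as long as the slower of the two stages.
utilOverlap = tGpu / max(tLoad, tGpu);

fprintf('Serial: %.0f%%, overlapped: %.0f%%\n', ...
    100 * utilSerial, 100 * utilOverlap);
```

With these assumed numbers the GPU is busy for only part of each iteration when loading is serial. Overlapping loading with compute raises that fraction, but it cannot reach 100% as long as loading is the slower stage.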


Answers (4)

Joss Knight on 31 May 2019
Your question is very hard to answer in its current form. You want to know why GPU utilisation is not 100%? The answer is, because the GPU isn't running kernels 100% of the time. Why? I don't know, because you haven't provided any information about what you're doing. Maybe, as Walter says, a lot of time is being spent doing file I/O, perhaps because you have a very slow disk or slow network file access. Maybe you have a transformed datastore, or an imageDatastore with a custom ReadFcn, and the data processing is very complex and takes place on the CPU, blocking GPU execution while it is carried out. Maybe you have a very small network, or a low-resolution network, or you don't have a high enough mini-batch size, and so you are not successfully occupying all the cores on the GPU. Maybe your network is so small that the amount of time spent running the MATLAB interpreter in order to generate the GPU kernels to do the computation outweighs the amount of time it takes to run those kernels.
If you want to know more, run the MATLAB profiler and find out where time is being spent during training.
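A minimal sketch of that profiling step follows. The matrix multiplication is only a stand-in workload (an assumption, so that the snippet runs on its own); replace it with your own training call, for example `trainNetwork` with your data, layers and options.

```matlab
profile on                 % start collecting timing data

% Stand-in workload (assumption): replace with your own training call.
A = rand(500);
B = A * A;

profile off                % stop collecting
stats = profile('info');   % results as a struct with a FunctionTable field
```

Running `profile viewer` afterwards opens the report, where the functions with the largest total time show whether training is dominated by data reading, preprocessing, or the network computation itself.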
  2 Comments
Ali Al-Saegh on 5 Dec 2020
Dear Joss,
I kindly invite you to help me by giving some advice on my question at
https://www.mathworks.com/matlabcentral/answers/680293-gpu-vs-cpu-in-training-time
yan gao on 25 Sep 2021
Dear Joss,
I kindly invite you to help me by giving some advice on my question at



Abolfazl Nejatian on 15 Dec 2019
Dear Joss,
Thank you for the information you provided.
The strange thing is that when I tested my code on Linux, building my network with Python, GPU utilization rose to around 100 percent, but on Windows with MATLAB it stays around 45 percent.
  1 Comment
Joss Knight on 15 Dec 2019
It's not strange. Windows is a different operating system, different file system, and completely different (and considerably slower at allocating memory) GPU driver. Do you have a different card in your Windows machine too? All could be a problem.
Plus, if you start with a model defined in a Python framework and optimized for that, and then adapt it, we've no idea how good a job you did. If you took a MATLAB example and then converted it to Python you might have the same problem with Python. Maybe you're not successfully prefetching your data from the file system. Maybe you're not using MEX acceleration when you should be. Maybe your GPU could be put in TCC mode. That's why it's so difficult to answer your question when you're not telling us what you're doing.
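Two of the suggestions above, a larger mini-batch and prefetching data, map onto options of `trainingOptions` in Deep Learning Toolbox. This is a sketch under assumptions: the solver and the values are illustrative and not taken from the poster's setup, and `'DispatchInBackground'` needs Parallel Computing Toolbox and a release that supports it.

```matlab
% Illustrative values only; tune them for your own network and GPU memory.
opts = trainingOptions('sgdm', ...
    'MiniBatchSize', 128, ...            % more work per iteration for the GPU
    'ExecutionEnvironment', 'gpu', ...   % train on the GPU
    'DispatchInBackground', true);       % read and preprocess data on background workers
```

Whether either setting helps depends on where the profiler shows the time going; if data loading is not the bottleneck, background dispatch may change little.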



Abolfazl Nejatian on 16 Dec 2019
Well, I know these are different operating systems, but what puzzles me is why, with the same hardware (both use a Tesla V100; I actually installed both operating systems on one machine), they can't use the GPU to a similar degree.
Yes, I definitely used MEX code in MATLAB.
Then I trained a ResNet with Python (all the initial settings were the same: input size, network layers, etc.).
There was no code conversion between MATLAB and Python: for MATLAB I used MATLAB functions and a pretrained network, and for Python I used Keras and PyCharm.
But on Windows with MATLAB my GPU utilization is around 45%, while with Python on Linux it was around 90%.
Now the question is: do you recommend that I reinstall MATLAB on Linux so that I can make fuller use of my hardware?
  2 Comments
Walter Roberson on 16 Dec 2019
The Windows driver for NVIDIA is slower, and also it reserves about 1 gigabyte of memory for communication purposes, whereas Linux does not need to do that.
Joss Knight on 16 Dec 2019
Hi Abolfazl. I can't really recommend anything until I've seen your code. It may be as simple as changing the way you access your data; it may be that you should move to Linux; or it may be that there's nothing you can do. Maybe your Python code is grotesquely inefficient with GPU resources or spins up a lot of worthless kernels during spare cycles! It's just impossible to say. Give us your code, and run the MATLAB profiler and show us the profile report.



Lamya Mohammad on 29 Feb 2020
Did you solve the problem? My utilization is 29% and I wish to increase it
