Deep Learning: higher training loss using GPU. Why?

2 views (last 30 days)
EK_47 on 25 Sep 2022
Answered: Piyush Dubey on 6 Sep 2023
Hi,
I am training a ResNet-50 network for object detection using about 3,000 images. I have trained it in two ways, on the CPU and on the GPU.
1 - CPU: Intel Xeon Processor E5-2687W v3 (10 cores); training took 70 hours; training and validation losses at epoch 40 were 0.0532 and 0.0004.
2 - GPU: NVIDIA GeForce RTX 3070 Ti 8GB; training took 6 hours; training and validation losses at epoch 40 were 0.0764 and 0.0013.
As you can see, training on the GPU takes much less time, but the training loss is higher. The model trained on the GPU also performs worse when predicting unseen data.
Why is this? How can I get the same accuracy on the GPU?
Thanks
2 comments
Walter Roberson on 25 Sep 2022
On the GPU, is it being trained in single precision or in double precision?
EK_47 on 25 Sep 2022
I do not know, but I think it is in single precision. I read somewhere that it is not possible to change the default, which is single precision, for deep learning in MATLAB?
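For what it's worth, here is a small sketch (not tied to any particular network) showing how accumulating the same numbers in single and in double precision gives slightly different results, which is the kind of numerical difference being asked about:

% Minimal sketch: compare single- and double-precision accumulation.
% Small rounding differences like these are one possible source of
% slightly different losses between runs on different devices.
x = rand(1e6,1);                          % some data in double precision
sumDouble = sum(x);                       % accumulate in double
sumSingle = sum(single(x));               % accumulate in single
fprintf('double:   %.10f\n', sumDouble);
fprintf('single:   %.10f\n', double(sumSingle));
fprintf('abs diff: %.3e\n', abs(sumDouble - double(sumSingle)));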


Answers (1)

Piyush Dubey on 6 Sep 2023
Hi @EK_47,
I understand that you are training a "ResNet-50" model for object detection on both the CPU and the GPU, and that training on the GPU is faster but results in higher training and validation losses than training on the CPU.
Please note that the training and validation sets are drawn randomly from your data, and the training process itself involves randomness, so the losses will vary slightly each time you train. The differences between your CPU and GPU runs may therefore not be significant. A better comparison is to average the training and validation losses over several training sessions, each with its own random split; those averages for the CPU and the GPU should turn out to be roughly equal.
To address this, you can also fix the random seed so that the training and validation sets stay the same across training sessions. With a fixed seed the results are reproducible, and the losses from CPU and GPU training can be compared more directly.
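Here is a minimal sketch of that idea (the variable imds and the use of splitEachLabel are assumptions for illustration, not taken from your setup; for object detection the datastore would also need to carry the box labels):

% Fix the random number generators so every run sees the same split
rng(0,'twister');   % CPU generator
gpurng(0);          % GPU generator (requires Parallel Computing Toolbox)

% imds is assumed to be an imageDatastore of the ~3,000 labelled images
[imdsTrain,imdsVal] = splitEachLabel(imds,0.8,'randomized');

% Reuse the same split for both runs; only the execution environment changes
optsCPU = trainingOptions('sgdm','ExecutionEnvironment','cpu', ...
    'ValidationData',imdsVal,'MaxEpochs',40);
optsGPU = trainingOptions('sgdm','ExecutionEnvironment','gpu', ...
    'ValidationData',imdsVal,'MaxEpochs',40);

With the split held fixed, any remaining gap between the two runs mostly reflects numerical differences between the devices rather than a different choice of training data.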
Additionally, you can use cross-validation to compare the networks trained on the CPU and on the GPU. Cross-validation splits the dataset into several subsets (folds) and trains and validates on different combinations of them, which gives a more reliable comparison than a single split.
For more information on cross-validation techniques, see the MathWorks documentation.
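As a rough sketch of how the folds could be fixed (cvpartition, training, and test are standard MATLAB functions; the variable labels, holding each image's class, is an assumption):

k = 5;
cvp = cvpartition(labels,'KFold',k);   % the same folds are reused for every run

for i = 1:k
    trainIdx = training(cvp,i);        % logical index of training images
    valIdx   = test(cvp,i);            % logical index of validation images
    % ... train once on the CPU and once on the GPU with these indices,
    % then average the per-fold losses before comparing the two devices.
end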
By applying seeding techniques and utilizing cross-validation, you can obtain more reliable and comparable results for training the ResNet-50 model on both CPU and GPU.
Hope this helps.

Categories

Find more on Image Data Workflows in Help Center and File Exchange.

Release

R2022a
