GPU out-of-memory errors when using predict inside parfor for deep learning inference
2 views (last 30 days)
Xiangxue Wang
on 1 Oct 2020
Commented: Xiaohao Sun on 24 Mar 2023
I have a trained deep learning model and now want to use it to predict on a batch of my images.
I have a 12 GB GPU and a CPU with 16 physical cores, and I want to use parfor around the predict function. The pseudocode looks like this:
parfor i = 1:100
    img = imread('xxx');          % placeholder file name
    mask = predict(net, img);
end
However, I get GPU out-of-memory errors when I use parfor. A plain for loop works fine. Forcefully turning off all GPUs by disabling them in Device Manager also works, so the loop runs with parfor on the 12-core CPU, which is even faster than using for with the GPU (I think because copying single images between devices is more expensive).
Does anyone have an idea how to use both parfor and the GPU for deep learning inference? I also tried limiting the number of parfor workers to 2 and even to 1. Even 1 worker gives the same GPU out-of-memory problem. I therefore assume parfor does not release the memory or resources promptly after each prediction. I even added the gpuDevice command after predict to try to forcefully clear the memory, but the error stays.
0 comments
Accepted Answer
David Willingham
on 5 Oct 2020
1. Set 'ExecutionEnvironment' to 'cpu' in your predict call. Documentation on this can be found here.
2. Using parfor with a single GPU will not work: parfor with 16 workers will try to access the one GPU in parallel 16 times (see the sketch after this list).
3. It is strange that you are receiving an out-of-memory error, but this may be because you are calling predict inside parfor.
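A quick way to see point 2 in action (a minimal sketch, assuming a parallel pool is already open and only one GPU card is installed): every worker selects the same physical device.
spmd
    g = gpuDevice;   % each worker selects a GPU; with one card they all get device 1
    fprintf('Worker %d is using GPU %d (%s)\n', labindex, g.Index, g.Name);
end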
I would recommend performing a small test on 10% of your images to find which method runs more quickly (a rough sketch follows the list below):
1. Predict all images using the single GPU. Put a tic/toc around it to test the timing.
2. Predict all images using parfor with the ExecutionEnvironment set to cpu. Use tic/toc again to test the timing.
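A minimal sketch of the two timing tests, assuming net is a SeriesNetwork or DAGNetwork, files is a cell array of image file names (a placeholder here), and that batched GPU prediction is done through an imageDatastore:
% Test 1: batched prediction on the single GPU
imds = imageDatastore(files);
tic
outGPU = predict(net, imds, 'MiniBatchSize', 16, ...   % lower MiniBatchSize if the 12 GB GPU runs out of memory
    'ExecutionEnvironment', 'gpu');
tGPU = toc;

% Test 2: parfor over CPU workers, one image per iteration
tic
outCPU = cell(1, numel(files));
parfor i = 1:numel(files)
    img = imread(files{i});
    outCPU{i} = predict(net, img, 'ExecutionEnvironment', 'cpu');
end
tCPU = toc;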
Regards,
4 comments
Xiaohao Sun
on 24 Mar 2023
Hi David,
There is another predict function for dlnetwork objects (https://www.mathworks.com/help/deeplearning/ref/dlnetwork.predict.html).
The documentation doesn't say how to run it on the CPU ('ExecutionEnvironment' cannot be used either). Could you explain how I can use the CPU with this predict function? Thank you!
Best,
Xiaohao
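For dlnetwork objects there is no 'ExecutionEnvironment' option; as far as I can tell, the execution device follows where the input data and the learnable parameters live, so keeping both on the CPU should run inference on the CPU. A minimal sketch, assuming a dlnetwork named dlnet with a single image input (file name is a placeholder):
dlnetCPU = dlupdate(@gather, dlnet);      % move learnable parameters to the CPU (no-op if already there)
img   = imread('xxx');                    % placeholder file name
dlImg = dlarray(single(img), 'SSC');      % spatial, spatial, channel; keep it as a CPU array
out   = predict(dlnetCPU, dlImg);         % runs on the CPU because data and parameters are CPU arrays

% For GPU inference instead, move the data to the GPU:
% out = predict(dlnet, gpuArray(dlImg));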
More Answers (0)