Overcoming VRAM limitations on Nvidia A100
Christopher McCausland on 13 Mar 2023
Commented: Joss Knight on 14 Mar 2023
I have access to a cluster with several Nvidia A100 40GB GPUs. I am training a deep learning network on these GPUs; however, trainNetwork() only uses around 10GB of the GPUs' VRAM. I believe this is a limitation of Nvidia CUDA, see here.
I have two related questions:
- Other cluster users are writing in Python with the 'DistributedDataParallel' module in PyTorch and are able to load 40GB of data (over the CUDA limitation) onto the GPUs. Is there a similar workaround for MATLAB?
- If not, is there any way to use Multi-Instance GPUs, essentially splitting the physical card into several smaller virtual GPUs and computing in parallel?
Ideally I would like to speed up computation, so having three quarters of the VRAM sit empty when it could otherwise be used for mini-batches is a little heartbreaking.
Accepted Answer
Joss Knight on 14 Mar 2023
        Just increase the MiniBatchSize and it'll use more memory.
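As a rough sketch of what that looks like in practice: the mini-batch size is set through trainingOptions and then passed to trainNetwork(). The value 512 below is only a hypothetical starting point, and `data` and `layers` stand in for your own datastore and network; neither comes from the original post.

```matlab
% Sketch only: the batch size is a hypothetical starting point, not a recommendation.
miniBatchSize = 512;    % trainingOptions defaults to 128

options = trainingOptions("sgdm", ...
    "MiniBatchSize", miniBatchSize, ...   % larger batches use more GPU memory per iteration
    "Verbose", true);

% Pass the options to training as usual, for example:
%   net = trainNetwork(data, layers, options);   % data and layers are your own

% Report memory use on the currently selected GPU (run during or after training).
if canUseGPU
    g = gpuDevice;
    fprintf("GPU memory in use: %.1f of %.1f GB\n", ...
        (g.TotalMemory - g.AvailableMemory) / 1e9, g.TotalMemory / 1e9);
end
```

Raising the value in steps and watching both memory use and time per epoch is probably the safest approach, since a larger batch can also change how training converges.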
6 comments
Joss Knight on 14 Mar 2023
You may never get that 10%, so don't get your hopes up! Also, the best utilization is not necessarily at the highest batch size.
Why not ask a new question where you show the code for your datastore, so one of us can help you make it partitionable?