Multiple GPUs used in parallel

I'm using R2014b and I'm fortunate enough to be in an environment where I have multiple GPUs available.
I have code that uses parfor loops to divide the work over multiple workers. It works fine.
I have code that uses a single GPU by using gpuArray to load the input variables, native MATLAB routines to do the processing, and gather to retrieve the output. It works fine.
I am familiar with Loren's blog, which seems to be a cookbook for my next step, using multiple GPUs: http://blogs.mathworks.com/loren/2013/06/24/running-monte-carlo-simulations-on-multiple-gpus/
The thing is, she doesn't push any existing data out onto the GPUs, and she uses arrayfun instead of gpuArray and native MATLAB routines.
Stay with me here. Computation within my parfor loop uses several variables from earlier in the code. These variables are all inputs that are not changed within the parfor. The code executes fine in that state. When I set my number of workers to 1, the code works fine.
When I start trying to use a single GPU (with one worker) and try to load data onto the GPU, I hit a snag.
workers = 1;
Zc=2.25;
parfor www = 1:workers;
    ftemp=zeros(nx,ny);
    ftemp = gpuArray(ftemp);
    Zc = gpuArray(Zc);
    ...
Loading Zc onto the GPU gives me the following error:
Error using testz (line 513)
An UndefinedFunction error was thrown on the workers for 'Zc'. This might be because the file containing 'Zc' is
not accessible on the workers. Use addAttachedFiles(pool, files) to specify the required files to be attached.
See the documentation for 'parallel.Pool/addAttachedFiles' for more details.
Caused by:
Undefined function or variable 'Zc'.
Going back to Loren's blog, it seems the best solution may be to refactor my code to use arrayfun instead of gpuArray, but I hate to get into that without really understanding the root of my problem and why my current approach is failing.
Advice is welcome. Other than the blog referenced above, there just isn't a lot of current information about using multiple GPUs. Thanks for reading.

5 comments

James Lebak on 29 Apr 2015
Does the behavior change if you give the GPU variables different names from the CPU variable? E.g. gZc instead of Zc.
Edric Ellis on 30 Apr 2015
Edited: Edric Ellis on 30 Apr 2015
Hm, I ran the following in R2014b:
workers = 1;
Zc=2.25;
nx = 1; ny = 1;
parfor www = 1:workers;
    ftemp=zeros(nx,ny);
    ftemp = gpuArray(ftemp);
    Zc = gpuArray(Zc);
end
and got an error like so:
Error: The temporary variable Zc in a parfor is uninitialized.
See Parallel for Loops in MATLAB, "Uninitialized Temporaries".
That is the expected error here. The problem is that Zc is used in two conflicting ways: on the right-hand side it acts as a parfor broadcast variable (the same value from outside the loop is sent to every iteration), while on the left-hand side it acts as a temporary variable (a new value is created in each iteration of the loop).
In short, try @James' suggestion to create gZc which will resolve that ambiguity.
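A minimal sketch of that rename, assuming a pool is open and the worker has a supported GPU. The names gZc and out and the placeholder addition are illustrative only; this shows the variable classification and has not been checked against the original testz code.

```matlab
workers = 1;
Zc = 2.25;                      % CPU value; only read inside the loop (broadcast)
nx = 1; ny = 1;
out = zeros(1, workers);        % sliced output so a result is visible after the loop
parfor www = 1:workers
    ftemp = gpuArray(zeros(nx, ny));   % temporary, created fresh in each iteration
    gZc = gpuArray(Zc);                % temporary with its own name; Zc is never assigned
    ftemp = ftemp + gZc;               % placeholder computation on the GPU
    out(www) = gather(ftemp(1));       % copy one element back to CPU memory
end
```

Because Zc now appears only on the right-hand side, parfor can treat it purely as a broadcast variable, and gZc purely as a temporary.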
David Short on 30 Apr 2015
James and Edric,
Thank you, but that gets me the same error that I had before.
For what it's worth, the error is thrown on the line of the parfor, but MATLAB flags the right-hand Zc as the error.
Does anybody have further references on using multiple GPUs other than the blog I mentioned in the original post?
David
David Short on 14 May 2015
All,
With some experimentation I have something close to what I need.
What I wanted to do:
    For i = 1,N
        Allocate GPU i
    End
    Parfor
        GPUARRAY (pass the data to the GPU)
        Computation
        Get (return data from GPU)
    End
What MATLAB actually needs:
    Parfor
        Allocate a GPU
        GPUARRAY
        Computation
        Get
    End
So I'll need to think about partially refactoring and rolling out some of the parfor loop. Much closer to what I need. Not what I desired, but I CAN get there from here.
David
Joss Knight on 28 May 2015
David, you won't be able to allocate GPU memory in one loop and then access it in another, especially when you have multiple GPUs. The host machine owns all arrays outside the loops, so any gpuArrays there must be stored on its GPU, and the mechanism for passing data back and forth to the workers uses CPU memory, so you gain nothing. Keep your data in CPU memory and send it to the device inside your loops for your computations.
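A minimal sketch of that pattern, assuming a parallel pool is already open and each worker has a GPU available. The names data, nChunks, and colSums are hypothetical, and the sum of squares is only a placeholder for the real computation.

```matlab
% Hypothetical input data; it stays in CPU memory outside the loop.
nChunks = 4;
data = rand(1000, nChunks);
colSums = zeros(1, nChunks);       % sliced output, also in CPU memory

parfor k = 1:nChunks
    g = gpuArray(data(:, k));      % send this iteration's slice to the worker's GPU
    s = sum(g .^ 2);               % computation runs on the GPU
    colSums(k) = gather(s);        % bring the result back to CPU memory
end
```

Which GPU each worker uses is left to the pool in this sketch; gpuDevice(index) can be called on a worker to select a specific device if that is needed.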


Answers (0)


Asked: 29 Apr 2015

Commented: 28 May 2015
