
Slice into gpuArray and perform functions on the GPU with arrayfun

Hamad
Hamad on 11 Jan 2015
Commented: Joss Knight on 23 Feb 2015
I would like to know how I can index into a given matrix to make pairwise combinations of column-vectors, and perform operations on these vectors - all on the GPU. So consider the simple function below:
function out = sum2Vecs(in1,in2) %in1 and in2 are (n x 1) vectors.
out = sum(in1,1) + sum(in2,1); %Output is a scalar "double".
end
Quick example: an array such as
fullMatrix = rand(3000,100);
Now I choose all pairwise column-vector combinations of "fullMatrix":
idxArray = nchoosek(1:100,2); %All possible pairwise index combinations of "fullMatrix".
nCombinations = size(idxArray,1); %Number of rows, i.e. number of column pairs.
And a simple for-loop performs the "sum2Vecs" function on each combination of two-column vectors:
outArray = zeros(1,nCombinations); %Preallocate the result.
for idx = 1 : nCombinations
outArray(idx) = sum2Vecs( fullMatrix(:,idxArray(idx,1)) , fullMatrix(:,idxArray(idx,2)) );
end
Also, a parfor-loop with slicing works fine:
parfor idx = 1 : nCombinations,
in1 = fullMatrix(:,idxArray(idx,1));
in2 = fullMatrix(:,idxArray(idx,2));
outArray(idx) = sum2Vecs(in1,in2);
end
My goal is to be able to perform this loop on the GPU using e.g. "arrayfun". But I am relatively inexperienced with this, so I would appreciate any helpful pointers. What I am particularly interested in learning is how to efficiently index into an array like "fullMatrix" and send parts of it to each GPU worker efficiently.
Thanks very much. Hamad.

Answers (1)

Matt J
Matt J on 11 Jan 2015
Edited: Matt J on 11 Jan 2015
In the generality that you've described, that computation doesn't look well-suited to the GPU. The GPU is for situations where you have lots of parallel tasks, each involving a small chunk of data. The chunks in your example, two 3000x1 vectors, likely wouldn't be small enough unless the operation can be subdivided further.
For that specific example, I would probably try to vectorize on the GPU as follows,
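idxArray = gpuArray( nchoosek(1:100,2).' ); % 2 x nCombinations: one column pair per column
A = gpuArray(fullMatrix);
m = size(A,1);
% A(:,idxArray) is m x (2*nCombinations); reshape stacks each pair into one 2*m column.
outArray = sum( reshape(A(:,idxArray), 2*m, []), 1 );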
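Because the example function is just a sum, one alternative is to sum each column once and then add the two per-column totals for every pair. This avoids building the large m x (2*nCombinations) intermediate. The following is only a sketch under that assumption: it applies when the pairwise operation decomposes this way, and the name colSums is introduced here for illustration rather than taken from the thread.

```matlab
fullMatrix = rand(3000,100);                      % example data from the question
idxArray   = nchoosek(1:size(fullMatrix,2), 2);   % nCombinations x 2 column pairs

A       = gpuArray(fullMatrix);                   % move the data to the GPU once
colSums = sum(A, 1);                              % 1 x 100 row of column sums (on the GPU)

% Each pair's result is the sum of its two column totals.
outArray = colSums(idxArray(:,1).') + colSums(idxArray(:,2).');

outArray = gather(outArray);                      % 1 x nCombinations, back on the CPU
```

For a pairwise operation that is not separable like this (a dot product between the two columns, say), this shortcut would not apply and the reshape-based form above is the closer template.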
  4 Comments
Joss Knight
Joss Knight on 23 Feb 2015
arrayfun can take a user-defined function, as long as that function carries out scalar operations. You can also index into arrays in that function as long as the array is passed in as an upvalue - see for instance here, the Mandelbrot example on this page and the Monte Carlo example here.
You need to remember that GPU cores are not like parallel workers. They cannot perform complex vector operations. Taken together, they perform complex vector operations, but not individually. In PCT a large number of complex algorithms have been implemented in such a way as to take maximum advantage of the GPU. If you are having trouble formulating your problem in a data-parallel way, then post your real code and we can have a look at whether it is inherently parallelisable. The example you gave - summing vectors - is easily vectorizable as Matt showed above.
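To illustrate the upvalue pattern described above, here is a minimal sketch of how the original loop might be written with arrayfun on the GPU. It assumes the code is saved as a function file; the names pairwiseColumnSums and onePair are hypothetical and chosen only for this example. The nested function performs scalar operations only and reads A from the parent workspace, one element at a time. It has not been benchmarked, and for a plain sum the vectorized form is likely the better choice.

```matlab
function outArray = pairwiseColumnSums(fullMatrix)
% Hypothetical helper: sum of every pair of columns, computed with GPU arrayfun.
A   = gpuArray(fullMatrix);            % upvalue, indexed (read-only) by the nested function
m   = size(A, 1);                      % number of rows to accumulate over
idx = nchoosek(1:size(A,2), 2);        % nCombinations x 2 column pairs
i1  = gpuArray(idx(:,1));
i2  = gpuArray(idx(:,2));

    function s = onePair(c1, c2, nRows)
        % Runs once per pair; only scalar arithmetic and scalar indexing.
        s = 0;
        for r = 1:nRows
            s = s + A(r, c1) + A(r, c2);
        end
    end

% The scalar m is expanded to match i1 and i2.
outArray = arrayfun(@onePair, i1, i2, m);   % nCombinations x 1 gpuArray
end
```

A call such as out = gather(pairwiseColumnSums(rand(3000,100))); returns one value per column pair. Each GPU thread loops over all rows serially, which is the kind of per-element work the comment above cautions about.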

