CUDA Kernels and long vectors

Raphael on 22 Feb 2013
Hello!
At the moment I'm writing my first programs with MATLAB and CUDA. I wrote a kernel in a .cu file, compiled it to PTX, and execute it with feval:
[dResultGPU]= feval(k_calc,dResultGPU,g_x,g_y,nLines,nColumns);
I have a ThreadBlockSize of 1024 and a maximum GridSize of [65535 65535].
Everything works fine so far, but I am having trouble with the indexing in my kernel.
When I want to add vectors of length 10e7, I cannot get the index in the kernel right.
In my M-file I set the grid size like this:
nBlocks = ceil(nPoints / k_calc.ThreadBlockSize(1));
if nBlocks <= 65535
    k_calc.GridSize = nBlocks;
else
    k_calc.GridSize = [65535 ceil(nBlocks/65535)];
end
In my example with a vector of 10e7 elements, 10e7 is bigger than 65535*1024, so I get a grid size of [65535 2].
In my .cu kernel I tried the index
int idx = blockDim.x * blockIdx.x + threadIdx.x;
but this is wrong for elements with an index greater than 65535*1024. Which CUDA variable tells me which row of the grid I am in?
As far as I know, gridDim.x gives me only the dimensions, not the current location.
Thank you very much, Raphael

Answers (1)

Ben Tordoff on 26 Feb 2013
Edited: Ben Tordoff on 26 Feb 2013
If you want to go down the "x" dimension first, you probably want
int const globalBlockIdx = blockIdx.y * gridDim.x + blockIdx.x;
int const globalThreadIdx = globalBlockIdx * blockDim.x + threadIdx.x;
or something similar. This assumes your grid is only 2D and your blocks are only 1D.
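
Put into a complete kernel, the two index lines above might look like the sketch below. The kernel name addVectors, the double element type, and the nPoints argument are assumptions made for illustration; they are not taken from the original code. The bounds check is there because the grid size is rounded up: with 10e7 elements, a [65535 2] grid of 1024-thread blocks launches 134,215,680 threads, so the surplus threads must not touch memory.

```cuda
// Hypothetical element-wise add kernel (name, types and nPoints are illustrative).
// Assumes a 2D grid of 1D thread blocks, as in the question.
__global__ void addVectors(double *result,
                           const double *x,
                           const double *y,
                           int nPoints)
{
    // Linear block index: walk along the grid's x dimension first, then y.
    int const globalBlockIdx = blockIdx.y * gridDim.x + blockIdx.x;

    // Linear thread index across the whole grid (blocks are 1D).
    int const globalThreadIdx = globalBlockIdx * blockDim.x + threadIdx.x;

    // The grid usually contains more threads than elements, so guard the access.
    if (globalThreadIdx < nPoints) {
        result[globalThreadIdx] = x[globalThreadIdx] + y[globalThreadIdx];
    }
}
```

An int index is enough for this example. A launch with 2^31 or more threads would need a wider index type such as size_t.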
