GPU CUDA kernel malloc error

Question

0 votes

Hello, I have a GeForce 425M card with compute capability 2.1. I wrote a kernel that calls malloc inside the kernel. At first the .cu file did not compile to PTX. Then I set the nvcc parameter -arch=sm_21 ( nvcc -I "D:\...VC\include" -arch=sm_21 -use_fast_math -ptx SR2.cu ). With this it compiled successfully; I was just wondering why I need to specify that. After that I tried to create the kernel in MATLAB:

ckernel=parallel.gpu.CUDAKernel('SR2.ptx', 'SR2.cu');

But I get the error:

    ??? Error using ==> parallel.gpu.CUDAKernel
    An error occurred during PTX compilation of <image>.
    The information log was:
    : Considering profile 'compute_20' for gpu='sm_21' in
    'cuModuleLoadDataEx_2a9
    The error log was:
    The CUDA error code was: CUDA_ERROR_INVALID_IMAGE.

Before I modified the kernel to use malloc, and without specifying -arch=sm_21 to nvcc, I was able to run my kernel from MATLAB without any problem.

I think that there is some configuration problem with CUDA. I hope someone has some idea how to solve this.

Thanks,

Gaszton

1 comment

Gaszton on 10 May 2011

It seems that cuModuleLoadDataEx has no option for compute capability 2.1:

The possible values of CUjit_target_enum are:

CU_TARGET_COMPUTE_10

CU_TARGET_COMPUTE_11

CU_TARGET_COMPUTE_12

CU_TARGET_COMPUTE_13

CU_TARGET_COMPUTE_20

http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/online/group__CUDA__MODULE_g9e8047e9dbf725f0cd7cafd18bfd4d12.html#g9e8047e9dbf725f0cd7cafd18bfd4d12

But in the CUDA Toolkit 3.2 release notes I found:

Added CU_TARGET_COMPUTE_21 to JIT options.
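One way to narrow this down outside MATLAB is to load the same PTX through the driver API and print the JIT logs yourself. The following is only a sketch under assumptions, not what Parallel Computing Toolbox does internally: the file name jitlog.cu and the default path SR2.ptx are placeholders, and cuDevicePrimaryCtxRetain and cuGetErrorName exist only in toolkits newer than 3.2 (on an old toolkit, cuCtxCreate would be the substitute).

```cuda
// jitlog.cu (hypothetical file name)
// Build, assuming a recent toolkit: nvcc jitlog.cu -o jitlog -lcuda
#include <cuda.h>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

int main(int argc, char** argv) {
    // Path of the PTX file to load; "SR2.ptx" is just the name used in the question.
    const char* path = argc > 1 ? argv[1] : "SR2.ptx";
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        std::fprintf(stderr, "cannot open %s\n", path);
        return 1;
    }
    std::stringstream ss;
    ss << in.rdbuf();
    const std::string ptx = ss.str();  // c_str() gives the NUL-terminated image

    // Initialize the driver API and make the primary context of device 0 current.
    CUdevice dev;
    CUcontext ctx;
    if (cuInit(0) != CUDA_SUCCESS ||
        cuDeviceGet(&dev, 0) != CUDA_SUCCESS ||
        cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS ||
        cuCtxSetCurrent(ctx) != CUDA_SUCCESS) {
        std::fprintf(stderr, "could not set up a CUDA context\n");
        return 1;
    }

    // Ask the JIT compiler to fill an information log and an error log.
    char infoLog[4096] = "";
    char errLog[4096] = "";
    CUjit_option opts[] = {
        CU_JIT_INFO_LOG_BUFFER, CU_JIT_INFO_LOG_BUFFER_SIZE_BYTES,
        CU_JIT_ERROR_LOG_BUFFER, CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES
    };
    void* vals[] = {
        infoLog, reinterpret_cast<void*>(static_cast<size_t>(sizeof(infoLog))),
        errLog, reinterpret_cast<void*>(static_cast<size_t>(sizeof(errLog)))
    };

    CUmodule mod;
    CUresult rc = cuModuleLoadDataEx(&mod, ptx.c_str(), 4, opts, vals);

    const char* name = "unknown";
    cuGetErrorName(rc, &name);  // leaves name untouched if the code is not recognized
    std::printf("result: %s\ninfo log: %s\nerror log: %s\n", name, infoLog, errLog);

    if (rc == CUDA_SUCCESS) {
        cuModuleUnload(mod);
    }
    cuDevicePrimaryCtxRelease(dev);
    return rc == CUDA_SUCCESS ? 0 : 1;
}
```

If this program loads the PTX without error while MATLAB still rejects it, that would point at the CUDA version MATLAB was built against rather than at the kernel itself.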

Answer 1

Edric Ellis on 11 May 2011

2 votes

You can get that error message if you have a mismatch between the CUDA runtime in use by Parallel Computing Toolbox and the version of nvcc that you're using. If you're using R2010b, you need to use CUDA-3.1; for R2011a, you can use CUDA-3.2. I was able to compile and use the following trivial kernel:

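    // simple.cu
    // Allocates from the device heap inside the kernel.
    __global__ void fcn( double * out ) {
        int * x = (int *) malloc( 1024 );
        // x[0] is never written, so out[0] receives whatever was on the heap.
        out[0] = x[0];
        free( x );
    }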

By compiling like so:

$ /usr/local/cuda32/cuda/bin/nvcc -arch compute_20 -ptx simple.cu

and then using within MATLAB R2011a like so:

>> k = parallel.gpu.CUDAKernel( 'simple.ptx' );
>> gather(k.feval(0))
ans =
       1.768515945000000e+09
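The value printed above is arbitrary, because the trivial kernel reads x[0] before anything has been written to it. As a sketch of a slightly more defensive variant, the kernel below checks the pointer returned by malloc and initializes the element before reading it. The file name simple_checked.cu, the kernel name fcn_checked, and the sentinel value -1 are made up for illustration; like the original, it assumes a device of compute capability 2.0 or higher and a toolkit that supports malloc in device code.

```cuda
// simple_checked.cu (hypothetical file name)
// Build in the same way as simple.cu, for example:
//   nvcc -arch compute_20 -ptx simple_checked.cu
__global__ void fcn_checked( double * out ) {
    // Device-side malloc returns a null pointer when the device heap is exhausted.
    int * x = (int *) malloc( 1024 );
    if ( !x ) {
        out[0] = -1.0;  // sentinel chosen for this sketch: allocation failed
        return;
    }
    x[0] = 42;          // initialize before reading
    out[0] = x[0];
    free( x );
}
```

Run with a single thread, as in the example above, out[0] holds 42 on success and -1 if the allocation failed. With more than one thread, every thread would allocate its own buffer and all of them would write to out[0].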

2 comments

Gaszton on 11 May 2011

Thank you for your help,

I have R2010b and CUDA Toolkit 3.2.

Everything worked until I specified the -arch option to nvcc.

If I don't specify it, what is the default? I wonder why it is not 2.1, given that my card has compute capability 2.1.

If I compile my .cu file with -arch compute_20 or sm_20, I still get the error from MATLAB.

Should I install CUDA Toolkit 3.1 and see whether it works?

With CUDA 3.1, am I able to use malloc in a kernel?

Thank you,

Gaszton

Gaszton on 11 May 2011

It seems that CUDA 3.1 does not support malloc in a kernel.

Otherwise, with 3.1 I am able to use sm_21 code in MATLAB.
