Borrar filtros
Borrar filtros

mexcuda compiler error: "__global__" does not apply here

33 visualizaciones (últimos 30 días)
Nathan
Nathan el 17 de Jun. de 2024
Comentada: Nathan el 24 de Jun. de 2024 a las 15:22
I'm writing a MEX CUDA function that performs the distance formula as: for each pixel of an image, where the pixel locations are in x and y respectively, and and contain the locations for an array of transducers. The function details aren't important for now, because I cannot compile when this kernel is in the code:
__global__ void distance_formula_index_units(uint32_t * delay, const double * x, const double * z, const double * x0, const double * z0, params * p) {
int x_px = threadIdx.x;
int y_px = threadIdx.y;
int sen = blockIdx.x;
int m_id = blockDim.x * blockDim.y + sen;
double x_dist = x[x_px] - x0[sen];
double y_dist = z[y_px] - z0[sen];
double distance = sqrt(x_dist*x_dist + y_dist*y_dist);
delay[m_id] = (uint32_t)(distance / p->c / p->dt);
}
Other explanations for context: delay is the output matrix of time delays, p is a typedef'd struct from earlier in the code that contains parameters about the image and system such as the wavespeed c and sample time resolution dt.
typedef struct params {
size_t n_xp;
size_t n_yp;
size_t n_sens;
double dt;
double c;
} params;
I run "mexcuda calculate_delays.cu", and instead of compiling, I get:
Error using mex
.\06_MEX_Functions\calculate_delays.cu(14): error: attribute "__global__" does not apply here
__declspec(__global__) void distance_formula_distance_units(uint32_t * delay, const double * xx, ...
^
I have only included this one function from the code, because it successfully compiles when I comment out this specific kernel, and the logic gate that activates it. There's another kernel in the code as well (distance_formula_time_units). The code that calls this function ALSO throws a different error:
if (*use_index_units) {
mxGPUArray * delaymatrix = mxGPUCreateGPUArray(3, output_dimensions, mxUINT32_CLASS, mxREAL, MX_GPU_DO_NOT_INITIALIZE);
uint32_t * delay_dvc_int = (uint32_t*) mxGPUGetData(delaymatrix);
distance_formula_index_units
<<<input_parameters->n_xp*input_parameters->n_yp, input_parameters->n_sens>>>
(delay_dvc_int, x_arr_dvc, y_arr_dvc, x0_dvc, z0_dvc, input_parameters);
cudaDeviceSynchronize();
// copy delay matrix from device back to host, set output
outputs[0] = mxGPUCreateMxArrayOnCPU(delaymatrix);
mxGPUDestroyGPUArray(delaymatrix);
} else { ...(call the other kernel)
Commenting out the destance formula index units kernel gives a compiler error on this block:
.\06_MEX_Functions\calculate_delays.cu(101): error: identifier "uint32_t" is undefined
uint32_t * delay_dvc_int = (uint32_t*) mxGPUGetData(delaymatrix);
^
This is wild, because uint32_t is completely defined in line 24 of stdint.h, and this is in the include tree of mex.h.
As far as I can tell, my code is completely valid, and intellisense in VSCode thinks so, too. ChatGPT doesn't find any code errors, either. What is going on here?

Respuesta aceptada

Joss Knight
Joss Knight el 24 de Jun. de 2024 a las 9:48
Editada: Joss Knight el 24 de Jun. de 2024 a las 9:50
You cannot call a function declared __global__ from another function declared __global__. Declare the second function as __device__ and that should work.
Your device functions and kernels need to be declared using supported CUDA device types, whereas you are using host-side types defined by the MEX headers. Try uint32_T instead (i.e. capital T).
  1 comentario
Nathan
Nathan el 24 de Jun. de 2024 a las 15:22
Excellent, great catch on the capital vs. lowercase T in uint32_T, that fixed the compiler issues completely! (Did not need to declare the second function as __device__)

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre Get Started with GPU Coder en Help Center y File Exchange.

Productos

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by