CUDA_ERROR_LAUNCH_FAILED and CUDA_ERROR_ILLEGAL_ADDRESS on Quadro RTX in TCC mode
Massimiliano Zanoli
on 11 Oct 2021
Commented: Massimiliano Zanoli
on 17 Jun 2022
We have a Quadro RTX 6000 in TCC mode on a fresh installation of Windows 10 and MATLAB R2021a Update 5. The output of gpuDevice is:
CUDADevice with properties:
    Name: 'Quadro RTX 6000'
    Index: 1
    ComputeCapability: '7.5'
    SupportsDouble: 1
    DriverVersion: 11.4000
    ToolkitVersion: 11
    MaxThreadsPerBlock: 1024
    MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
    MaxGridSize: [2.1475e+09 65535 65535]
    SIMDWidth: 32
    TotalMemory: 2.3916e+10
    AvailableMemory: 2.3584e+10
    MultiprocessorCount: 72
    ClockRateKHz: 1770000
    ComputeMode: 'Default'
    GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 0
    CanMapHostMemory: 1
    DeviceSupported: 1
    DeviceAvailable: 1
    DeviceSelected: 1
If I run the following simple code:
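gpu = gpuDevice;    % handle to the currently selected GPU
n = 0;              % number of completed iterations
while true
    A = rand(1000, 1000, 100, 'single');    % about 400 MB of host data
    A_ = gpuArray(A);                       % copy the array to the GPU
    A_ = sum(A_ .^ 2, 2) .^ (1 ./ 2);       % Euclidean norm along dimension 2
    reset(gpu)                              % clear GPU memory and reinitialize the device
    n = n + 1;
end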
eventually (n ~ 10) I get a range of errors, from "unspecified launch failure" to "CUDA_ERROR_LAUNCH_FAILED" to "CUDA_ERROR_ILLEGAL_ADDRESS". I tried:
- Clean installations of the latest and previous NVIDIA drivers, no luck.
- Disabling Windows TDR (Timeout Detection and Recovery), no luck.
- Switching to WDDM mode. This seems to solve the issue. But it's not really a solution for us (Windows uses precious GPU memory).
The code above is just a test case. In reality we encounter these errors at random during the execution of large scripts. No custom kernels.
Any other suggestions?
6 comments
Joss Knight
on 12 Jun 2022
I can't think of any reason why this would happen. To rule out a driver issue, check:
- Does it also reproduce in earlier versions of MATLAB?
- Downgrade your drivers as far as you can (unfortunately I can only see drivers as far back as 461.40 for the RTX 6000)
To check it's not related to low memory issues, monitor GPU memory in the Task Manager while you run your code.
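The memory check can also be done from inside MATLAB, which may help if Task Manager does not show the card while it is in TCC mode. The following is only a sketch built on the test loop from the question: it records the AvailableMemory property after each iteration and catches the first error together with the iteration count. The names maxIter and freeMem are hypothetical and introduced here for illustration, and the cap of 50 iterations is an arbitrary assumption.

```matlab
% Sketch: log free GPU memory per iteration and capture the first failure.
gpu = gpuDevice;
n = 0;
maxIter = 50;                   % hypothetical cap so the loop terminates
freeMem = zeros(maxIter, 1);    % free GPU memory (bytes) after each iteration

try
    while n < maxIter
        A = rand(1000, 1000, 100, 'single');
        A_ = gpuArray(A);
        A_ = sum(A_ .^ 2, 2) .^ (1 ./ 2);
        wait(gpu);                              % block until queued GPU work has finished
        freeMem(n + 1) = gpu.AvailableMemory;   % sample before the reset frees everything
        reset(gpu);
        n = n + 1;
    end
catch err
    % Report how far the loop got and which error was raised.
    fprintf('Failed after %d iterations: %s (%s)\n', n, err.message, err.identifier);
end

if n > 0
    fprintf('Minimum free GPU memory observed: %.2f GB\n', min(freeMem(1:n)) / 1e9);
end
```

If the minimum stays far above zero right up to the failure, low memory is an unlikely cause. The wait(gpu) call is there so that an error from asynchronous GPU work is more likely to surface in the iteration that caused it.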
It may unfortunately just be a faulty card. In which case it will probably reproduce in older versions of MATLAB.
Hoping you get back to me in less than 7 months!
Answers (0)