GPU problem CUDA_ERROR_UNKNOWN

Peter on 4 Jan 2017
Moved: Matt J on 30 Mar 2023
I'm running a MATLAB simulation code that uses an iterative matrix equation solver. The solver is called on the GPU every few time steps in a time-stepping loop. This goes well for some dozens of time steps (although the computations gradually slow down...) until the screen goes black for a brief moment and the simulation crashes with the following error message:
Error using gpuArray/subsasgn
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_UNKNOWN
After this, MATLAB no longer recognizes the GPU device: the command
gpuDevice
results in:
Error using gpuDevice (line 26)
An unexpected error occurred trying to retrieve CUDA device properties. The CUDA error was:
CUDA_ERROR_UNKNOWN
Restarting MATLAB is not sufficient to restore the GPU; restarting the PC is.
I'm running MATLAB R2016b on Windows 10, using an Nvidia TITAN X (Pascal) GPU with the newest driver installed.
Do these symptoms suggest a diagnosis to anyone?
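The gradual slowdown mentioned above can be made measurable by timing each step. What follows is only a minimal sketch, not the original simulation code: the step count, baseline length, slowdown factor, and the `pause` stand-in for the real work are all illustrative assumptions.

```matlab
% Sketch: time each step and flag steps that run much slower than the
% early baseline. All names and numbers are illustrative assumptions.
nSteps     = 50;    % placeholder number of time steps
nBaseline  = 10;    % steps used to establish the baseline
slowFactor = 2;     % report a step that takes this many times the baseline
stepTime   = zeros(nSteps, 1);

for step = 1:nSteps
    t = tic;
    pause(0.01);    % stand-in for the time step and GPU solver call
    % With a gpuArray workload, call wait(gpuDevice) before toc so the
    % measurement includes GPU work that is still queued.
    stepTime(step) = toc(t);

    if step > nBaseline
        baseline = median(stepTime(1:nBaseline));
        if stepTime(step) > slowFactor * baseline
            fprintf('Step %d took %.3g s (baseline %.3g s)\n', ...
                step, stepTime(step), baseline);
        end
    end
end
```

A log like this shows whether the slowdown is steady or sudden, which helps when correlating it with other readings such as GPU temperature.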
  4 comments
Xubin Lin on 13 Jun 2020
Dear Joss,
I also have the same problem.
An error occurred during PTX compilation of <image>.
The information log was:
The error log was:
The CUDA error code was: CUDA_ERROR_ILLEGAL_ADDRESS.
My output of gpuDevice is as follows (MATLAB R2019a and CUDA 10.2):
Name: 'GeForce GTX 1060'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11
ToolkitVersion: 10
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
MultiprocessorCount: 10
ClockRateKHz: 1670500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Swati Jain on 30 Mar 2023
Moved: Matt J on 30 Mar 2023
I'm facing this error while working on Deep Network Designer.
Please help me in solving this error.


Accepted Answer

Peter on 6 Jan 2017
Monitoring the GPU revealed that temperature is most probably causing the issue: performance slows down as the temperature rises, and performance is capped by temperature.
The GPU crashed when it reached 95 degrees...
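For anyone who wants to watch the temperature from inside the MATLAB loop itself, here is a minimal sketch. It assumes `nvidia-smi` is on the system path; the threshold, check interval, and pause length are illustrative guesses, not recommended values, and the loop body is a placeholder rather than the original solver.

```matlab
% Sketch: poll the GPU temperature through nvidia-smi and pause the loop
% while the card is too hot. Assumes nvidia-smi is on the system path.
query = 'nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits';
maxTempC    = 85;   % hypothetical threshold, below the 95 degrees seen at the crash
checkEvery  = 10;   % query every 10 steps (illustrative)
coolDownSec = 30;   % pause length while hot (illustrative)

% Hottest reading across all listed GPUs; NaN if the output is not numeric.
hottest = @(txt) max(str2double(regexp(strtrim(txt), '\r?\n', 'split')));

nSteps = 20;        % placeholder number of time steps
for step = 1:nSteps
    % ... time step and GPU solver call go here ...
    if mod(step, checkEvery) == 0
        [status, out] = system(query);
        if status ~= 0
            continue    % nvidia-smi unavailable; skip the check
        end
        tempC = hottest(out);
        while tempC > maxTempC
            fprintf('GPU at %g C, pausing %d s\n', tempC, coolDownSec);
            pause(coolDownSec);
            [~, out] = system(query);
            tempC = hottest(out);   % NaN ends the loop if the query fails
        end
    end
end
```

Pausing only works around the symptom; it does not replace fixing the cooling itself.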
  2 comments
Vaclav Bocek on 19 Apr 2018
Moved: Matt J on 30 Mar 2023
How did you solve it, please?
Peter on 23 Apr 2018
Moved: Matt J on 30 Mar 2023
I solved it by: 1) placing the GPU more sensibly in the PC case, allowing for better airflow; 2) changing the behavior of the cooling fan, which generally reacts only to CPU activity. I believe this can be set in the BIOS; I just made it blow a little harder. This is all very machine-specific, so it will take some investigating on your part to try these options.


More Answers (1)

Matt J on 4 Jan 2017
I've had symptoms like that before. Re-installing/updating the GPU driver fixed it for me, but it was never clear to me what the root cause was.
  1 comment
Peter on 5 Jan 2017
Thanks Matt. I did install the latest drivers (several times now), hoping that would solve the issue, but unfortunately without success.
