NVIDIA: "RuntimeError: No CUDA GPUs are available"

2024-9-14

I am implementing a simple algorithm with PyTorch on Ubuntu. Twice now the NVIDIA driver has become corrupted for some reason, and running the algorithm produces the following traceback:

Traceback (most recent call last):
  File "module.py", line 212, in <module>
    inputs_tensor = torch.tensor(inputs_train).to(device)
  File "/home/user/.venv/lib/python3.8/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
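
As a stopgap, a guard along the following lines could let the script detect the missing GPU up front instead of failing inside `_lazy_init`. This is only a sketch: `pick_device` is a hypothetical helper that is not part of `module.py`, it assumes that falling back to the CPU is acceptable, and it does nothing to repair the driver itself.

```python
import importlib.util


def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of the original module.
    """
    # If PyTorch is not installed at all, there is nothing to query.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # torch.cuda.is_available() reports False when no usable CUDA device
    # or driver is found, so the CPU fallback is chosen in that case.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("selected device:", pick_device())
```

The returned string can be passed as `device` to `.to(device)`, so the tensor creation on line 212 would run on the CPU while the driver is broken.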

I have reinstalled the driver twice, but after a few reboots it gets corrupted again.

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. 
Make sure that the latest NVIDIA driver is installed and running.
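
To catch the moment the driver breaks (for example, right after each reboot), the same check could be scripted. The sketch below assumes that `nvidia-smi` exits with a non-zero status when it cannot communicate with the driver; `driver_reachable` is a name made up for this example.

```python
import subprocess


def driver_reachable(command="nvidia-smi", timeout=10):
    """Return True if `command` runs and exits with status 0.

    Assumption: nvidia-smi returns a non-zero exit status when it
    cannot communicate with the NVIDIA driver.
    """
    try:
        result = subprocess.run(
            [command],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except (FileNotFoundError, PermissionError, subprocess.TimeoutExpired):
        # A missing binary or a hung tool both count as "not reachable".
        return False
    return result.returncode == 0
```

Calling `driver_reachable()` from a startup script and logging the result with a timestamp might help narrow down which reboot or package update breaks the driver.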

$ grep "X Driver" /var/log/Xorg.0.log
[    43.342] (II) NVIDIA dlloader X Driver  440.100  Fri May 29 08:21:27 UTC 202
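
For comparing the loaded driver version across reboots, a small parser could pull the version number out of the Xorg log. This is a sketch under the assumption that the log line always has the format shown above; `x_driver_version` is a hypothetical name.

```python
import re

# Matches e.g. "NVIDIA dlloader X Driver  440.100  Fri May 29 ..."
_X_DRIVER = re.compile(r"NVIDIA dlloader X Driver\s+(\d+(?:\.\d+)+)")


def x_driver_version(log_text):
    """Return the last NVIDIA X driver version found in log_text, or None."""
    matches = _X_DRIVER.findall(log_text)
    return matches[-1] if matches else None


if __name__ == "__main__":
    # Reading the log may fail if the file is missing or unreadable.
    try:
        with open("/var/log/Xorg.0.log", errors="replace") as log_file:
            print(x_driver_version(log_file.read()))
    except OSError as exc:
        print("could not read Xorg log:", exc)
```

If the reported version differs from the one that was installed, that would suggest something is replacing or shadowing the driver between reboots.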
