
On a newly built Ubuntu 16.04 system, running nvidia-smi as a regular user fails:
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Running it as root works:
$ sudo nvidia-smi
[sudo] password for hanxue:
Fri Jul 19 10:05:49 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   38C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:5E:00.0 Off |                    0 |
| N/A   33C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P100-PCIE...  Off  | 00000000:86:00.0 Off |                    0 |
| N/A   31C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P100-PCIE...  Off  | 00000000:AF:00.0 Off |                    0 |
| N/A   31C    P0    28W / 250W |      0MiB / 16276MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+
After that, running it as a regular user works:
$ nvidia-smi
Fri Jul 19 10:09:00 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   40C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:5E:00.0 Off |                    0 |
| N/A   35C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P100-PCIE...  Off  | 00000000:86:00.0 Off |                    0 |
| N/A   33C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P100-PCIE...  Off  | 00000000:AF:00.0 Off |                    0 |
| N/A   33C    P0    27W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
Is there a misconfiguration that requires root to run nvidia-smi first? Is there a workaround for this, for example manually loading the NVIDIA kernel module?
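
In case it helps narrow this down, here is a minimal sketch of a check that could be run as the regular user right after a reboot and again after the sudo run. It only reports whether a module named nvidia appears in /proc/modules and whether any /dev/nvidia* device nodes exist. The function names are made up for illustration, and the idea that one of these two things differs between the failing and the working state is an assumption, not something confirmed above.

```shell
#!/bin/sh
# Diagnostic sketch (hypothetical helper names): report the NVIDIA module
# and device-node state visible to the current user.

nvidia_module_loaded() {
    # /proc/modules lists loaded modules one per line, module name first.
    # An alternate file can be passed as $1 (handy for dry runs).
    grep -q '^nvidia ' "${1:-/proc/modules}" 2>/dev/null
}

nvidia_device_nodes() {
    # Print every nvidia* entry in the given directory (default: /dev).
    for node in "${1:-/dev}"/nvidia*; do
        if [ -e "$node" ]; then
            echo "$node"
        fi
    done
}

report() {
    if nvidia_module_loaded; then
        echo "kernel module: loaded"
    else
        echo "kernel module: not loaded"
    fi

    nodes=$(nvidia_device_nodes)
    if [ -n "$nodes" ]; then
        echo "device nodes:"
        echo "$nodes"
    else
        echo "device nodes: none found"
    fi
}

report
```

If the module or the device nodes only show up after the sudo run, that would suggest the first root invocation is what loads the module or creates the nodes, rather than the driver installation itself being broken.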