NVIDIA: A GPU exception occurred during X server initialization

My GPU is a GTX 870M. I did a fresh install of Ubuntu 18.04. All I did was:

sudo apt-get update
sudo apt-get upgrade
sudo ubuntu-drivers autoinstall
nvidia-xconfig
reboot

It installed the nvidia-390 driver. Now startx fails whenever I try to start the X server. I can still use Wayland. Here is what I tried (in recovery mode):

startx

Output:

X.Org X Server 1.20.1
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.4.0-140-generic x86_64 Ubuntu
Current Operating System: Linux <censored>-PC 4.18.0-22-generic #23~18.04.1-Ubuntu SMP Thu Jun 6 08:37:25 UTC 2019 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.18.0-22-generic root=UUID=0d1d9304-4cd6-41f6-80b2-3562578a252e ro recovery nomodeset
Build Date: 27 November 2018  05:27:12PM
xorg-server-hwe-18.04 2:1.20.1-3ubuntu2.1~18.04.1 (For technical support please see http://www.ubuntu.com/support) 
Current version of pixman: 0.34.0
    Before reporting problems, check http://wiki.x.org
    to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
    (++) from command line, (!!) notice, (II) informational,
    (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat Jun 22 13:47:29 2019
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE) 
Fatal server error:
(EE) NVIDIA: A GPU exception occurred during X server initialization(EE) 
(EE) 
Please consult the The X.Org Foundation support 
     at http://wiki.x.org
 for help. 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE) 
(EE) Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error

/var/log/Xorg.0.log: https://pastebin.com/ygxRKPPg

Two things in this log caught my attention:

[   119.994] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
[   119.994] (WW) NVIDIA(0): Unable to get display device for DPI computation.

[   119.994] (--) NVIDIA(0): Memory: 3145728 kBytes
[   119.994] (II) NVIDIA: Using 6144.00 MB of virtual memory for indirect memory

It seems that my display device is not detected correctly and/or the X server is trying to use too much memory?
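To put the two memory figures in the same unit, here is a small arithmetic sketch. The values are copied from the log lines above; the variable names are only illustrative. It shows how the numbers relate, and it does not by itself say whether the allocation is excessive.

```shell
#!/bin/sh
# Values copied from the Xorg log lines quoted above.
video_kb=3145728    # "(--) NVIDIA(0): Memory: 3145728 kBytes"
virtual_mb=6144     # "Using 6144.00 MB of virtual memory for indirect memory"

# Convert the video memory figure from kBytes to MB (1 MB = 1024 kBytes).
video_mb=$((video_kb / 1024))
ratio=$((virtual_mb / video_mb))

echo "video memory:   ${video_mb} MB"
echo "virtual memory: ${virtual_mb} MB (${ratio}x the video memory)"
```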

dmesg output: https://pastebin.com/fcYMPrUB

Relevant part:

[  120.275346] NVRM: GPU at PCI:0000:01:00: GPU-c588f20e-6b26-3352-5b81-666db3c970a2
[  120.275348] NVRM: Xid (PCI:0000:01:00): 44, Ch 00000000, engmask 00000101, intr 10000000
[  120.793329] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000008, engmask 00000111, intr 10000000

I looked up what the Xid codes mean: https://docs.nvidia.com/deploy/xid-errors/index.html

31 GPU memory page fault

44 Graphics Engine fault during context switch
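To pull the Xid codes out of a longer kernel log before looking them up, a sketch like the following might help. `extract_xids` is a hypothetical helper name, not part of any NVIDIA tool, and the pattern assumes the `NVRM: Xid (...): <code>,` line format shown in the dmesg excerpt above.

```shell
#!/bin/sh
# Print each distinct Xid code reported by the NVIDIA kernel module (NVRM).
# Reads kernel log text on stdin, so it also works on a saved dmesg dump.
extract_xids() {
    # Keep only the number that follows "NVRM: Xid (<PCI address>): ".
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p' | sort -n | uniq
}

# Possible use on a live system (dmesg may require root on some setups):
#   dmesg | extract_xids
```

Each number it prints can then be looked up in the Xid table linked above.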

nvidia-smi output:

Sat Jun 22 14:23:52 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 870M    Off  | 00000000:01:00.0 N/A |                  N/A |
| N/A   83C    P0    N/A /  N/A |      0MiB /  3018MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

Any help would be appreciated, thanks.

Answer 1

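Your driver doesn't seem to be able to find the monitor resolution. You may need to set it manually. Have you tried this?

How to set the correct monitor resolution with the Nvidia driver for a monitor that doesn't send EDID?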

Answer 2

This is what my dmesg looks like:

$ dmesg | grep -i nvidia
[    1.517472] nvidia: loading out-of-tree module taints kernel.
[    1.517477] nvidia: module license 'NVIDIA' taints kernel.
[    1.520410] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    1.524609] nvidia-nvlink: Nvlink Core is being initialized, major device number 242
[    1.524802] nvidia 0000:01:00.0: enabling device (0006 -> 0007)
[    1.524981] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  384.130  Wed Mar 21 03:37:26 PDT 2018 (using threaded interrupts)
[    1.530574] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  384.130  Wed Mar 21 02:59:49 PDT 2018
[    1.531818] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[    1.531820] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[    4.318800] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 240
[    4.864567] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input9
[    7.517965] nvidia-modeset: Allocated GPU:0 (GPU-30fab9bc-fe6f-ec05-e8e6-c151a1a96121) @ PCI:0000:01:00.0

dmesg shows two additional lines:

[   16.317773] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   16.504557] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input13

Your dmesg is missing one line:

[    7.517965] nvidia-modeset: Allocated GPU:0 (GPU-30fab9bc-fe6f-ec05-e8e6-c151a1a96121) @ PCI:0000:01:00.0
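A quick way to check whether a given kernel log contains that allocation message is sketched below. `has_modeset_alloc` is a hypothetical helper name, and the match string is simply taken from the line quoted above, so it may differ on other driver versions.

```shell
#!/bin/sh
# Succeed (exit status 0) if the kernel log on stdin contains the
# "nvidia-modeset: Allocated GPU" message, fail otherwise.
has_modeset_alloc() {
    grep -q 'nvidia-modeset: Allocated GPU:'
}

# Possible use on a live system:
#   dmesg | has_modeset_alloc && echo "present" || echo "missing"
```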

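My system is a Skylake 6700HQ with an nVidia GTX 970M, so it is fairly close to yours. I have used the 384.130 driver since day one with great success and have never changed it. I have only one quirk: Windows can provide sound through the nVidia card, but Linux cannot. So I have to apply a patch called nvhda to get HDMI sound to my TV.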
