Server shuts down unexpectedly (BSOD) with the message "WHEA_UNCORRECTABLE_ERROR"


When we checked the System event log, we found the following warning logged repeatedly.

Event 17
A corrected hardware error has occurred.
Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Bus:Device:Function: 0x0:0x2:0x0
Vendor ID:Device ID: 0x8086:0x6F04
Class Code: 0x30400

When the system shut down unexpectedly (BSOD), the following error was logged.

Event 16
A fatal hardware error has occurred.
Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Bus:Device:Function: 0x0:0x2:0x0
Vendor ID:Device ID: 0x8086:0x6F04
Class Code: 0x30400
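
Both events name the same Bus:Device:Function and the same Vendor ID:Device ID, so the corrected and the fatal errors come from one root port. As a minimal sketch (the helper name `parse_whea_event` and the dictionary keys are illustrative assumptions, not part of any Windows tool), the fields can be pulled out of the pasted event text like this:

```python
import re

# Event text copied from the log excerpt above.
EVENT_TEXT = """\
Event 16
A fatal hardware error has occurred.
Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Bus:Device:Function: 0x0:0x2:0x0
Vendor ID:Device ID: 0x8086:0x6F04
Class Code: 0x30400
"""

HEX = r"(0x[0-9A-Fa-f]+)"


def parse_whea_event(text):
    """Hypothetical helper: extract the key fields from a pasted WHEA event."""
    fields = {}
    m = re.search(r"^Event (\d+)\s*$", text, re.MULTILINE)
    if m:
        fields["event_id"] = int(m.group(1))
    m = re.search(r"^Component: (.+?)\s*$", text, re.MULTILINE)
    if m:
        fields["component"] = m.group(1)
    m = re.search(rf"Bus:Device:Function: {HEX}:{HEX}:{HEX}", text)
    if m:
        # (bus, device, function) as integers
        fields["bdf"] = tuple(int(x, 16) for x in m.groups())
    m = re.search(rf"Vendor ID:Device ID: {HEX}:{HEX}", text)
    if m:
        fields["vendor_id"] = int(m.group(1), 16)
        fields["device_id"] = int(m.group(2), 16)
    return fields


info = parse_whea_event(EVENT_TEXT)
print(info)
```

Vendor ID 0x8086 is Intel, which suggests the reporting device is a root port inside the CPU or chipset rather than the add-in card itself. The faulty part may still be whatever is plugged in behind that port.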

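Although the warning (Event 17) has been logged every day since the server was built (27 March 2021), the system has shut down unexpectedly only once (20-Jul-20) because of the error above (Event 16).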

Crash dump analysis of the BSOD:

Crash dump file: D:\MEMORY.DMP
This was probably caused by the following module: pci.sys (pci+0x1364B)
Bug check code: 0x124 (0x4, 0xFFFFE000C7D1E038, 0x0, 0x0)
Error: WHEA_UNCORRECTABLE_ERROR
File path: C:\Windows\system32\drivers\pci.sys
Product: Microsoft® Windows® Operating System
Company: Microsoft Corporation
Description: NT Plug and Play PCI Enumerator
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in a Microsoft module. Your system configuration may be incorrect. Possibly this problem is caused by another driver on your system that cannot be identified at this time.
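
The bug check parameters carry more detail than the generic description. The sketch below decodes the "Bug check code" line. The parameter-1 table is written from my recollection of Microsoft's bug check 0x124 reference and should be checked against it; the function names are illustrative assumptions.

```python
import re

# Line copied from the crash dump analysis above.
BUG_CHECK_LINE = "Bug check code: 0x124 (0x4, 0xFFFFE000C7D1E038, 0x0, 0x0)"

# Assumed meanings of parameter 1 for bug check 0x124 (verify against the
# official reference before relying on them).
PARAM1_MEANING = {
    0x0: "machine check exception",
    0x1: "corrected machine check exception",
    0x2: "corrected platform error",
    0x3: "non-maskable interrupt (NMI)",
    0x4: "uncorrectable PCI Express error",
    0x5: "generic hardware error",
}


def parse_bug_check(line):
    """Return (code, [params]) parsed from a 'Bug check code' line."""
    m = re.search(r"Bug check code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)", line)
    if not m:
        raise ValueError("not a bug check line")
    code = int(m.group(1), 16)
    params = [int(p.strip(), 16) for p in m.group(2).split(",")]
    return code, params


def describe(line):
    """Give a one-line summary; only bug check 0x124 is decoded here."""
    code, params = parse_bug_check(line)
    if code != 0x124:
        return f"bug check {code:#x}: not decoded by this sketch"
    kind = PARAM1_MEANING.get(params[0], "unknown error source")
    return f"WHEA_UNCORRECTABLE_ERROR: {kind}, error record at {params[1]:#x}"


print(describe(BUG_CHECK_LINE))
```

If the table is right, parameter 1 = 0x4 agrees with Event 16: the fatal error was reported through PCI Express, and pci.sys appears in the dump only because it handles that bus.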

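What we have already tried

We have updated to the latest Windows Server 2012 R2 (v6.3.9600 Build 9600)

All relevant drivers have been updated to their latest versions

pci.sys has been updated to the latest version (v6.3.9600.18939)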

Server details:

Motherboard: AsrockRack Server Board EP2C612D16NM-2T8R
Raid: Dell (LSI OEM) 9341-8I mega raid (Latest Firmware)
Processor: Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10 GHz, 2100 MHz
OS: Microsoft Windows Server 2012 R2 Standard
OS Version: 6.3.9600 Build 9600

Answer 1

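If you have already updated the operating system and drivers to the latest versions, you should also consider updating the server firmware to the latest version. The error message you received also points to a hardware fault, and the error text names a PCI-related component (the PCI Express Root Port at bus 0, device 2, function 0). Another possible cause is that your server is overheating.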

There are several other options you can try to resolve this issue, which are described in the documentation.

I hope this helps.
