Ethernet link goes down and comes back up when the server reboots

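We have a DELL R610 server connected to a DELL PowerConnect 5424 switch. The switch is connected to a DELL Equallogic SAN. The DELL R610 is a MySQL database server, and the SAN provides the MySQL data directory, mounted as an iSCSI drive.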

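As described in my earlier question here, we have observed that MySQL fails to start on its own after a reboot. The behavior is intermittent. On investigation, we found that the iSCSI initiator service fails to run its commands at boot. While checking the logs, we noticed a brief glitch as the network interfaces come up. Output of grepping the log for bnx2: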

 bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.2.4 (Aug 05, 2013)
 bnx2 0000:01:00.0 eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem d6000000, IRQ 36, node addr 5c:f9:dd:f1:8a:ea
 bnx2 0000:01:00.1 eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem d8000000, IRQ 48, node addr 5c:f9:dd:f1:8a:ec
 bnx2 0000:02:00.0 eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem da000000, IRQ 32, node addr 5c:f9:dd:f1:8a:ee
 bnx2 0000:02:00.1 eth3: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem dc000000, IRQ 42, node addr 5c:f9:dd:f1:8a:f0
 bnx2 0000:02:00.0: irq 78 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 79 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 80 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 81 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 82 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 83 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 84 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 85 for MSI/MSI-X
 bnx2 0000:02:00.0: irq 86 for MSI/MSI-X
 bnx2 0000:02:00.0 em3: using MSIX
 bnx2 0000:01:00.0: irq 87 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 88 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 89 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 90 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 91 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 92 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 93 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 94 for MSI/MSI-X
 bnx2 0000:01:00.0: irq 95 for MSI/MSI-X
 bnx2 0000:01:00.0 em1: using MSIX
 bnx2 0000:01:00.1: irq 96 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 97 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 98 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 99 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 100 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 101 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 102 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 103 for MSI/MSI-X
 bnx2 0000:01:00.1: irq 104 for MSI/MSI-X
 bnx2 0000:01:00.1 em2: using MSIX
 bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 100 Mbps full duplex
 bnx2 0000:02:00.0 em3: NIC Copper Link is Up, 1000 Mbps full duplex
 bnx2 0000:01:00.1 em2: NIC Copper Link is Up, 1000 Mbps full duplex
 **bnx2 0000:01:00.1 em2: NIC Copper Link is Down**
 bnx2 0000:01:00.1 em2: NIC Copper Link is Up, 1000 Mbps full duplex
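To compare boots, it can help to count how often each interface reports a link drop rather than reading the log by eye. The following is a minimal sketch, not something from our setup: the function name `count_link_down` is made up for illustration, and it assumes the bnx2 message format shown above, where the interface name is the field immediately before `NIC`.

```shell
#!/bin/sh
# Hypothetical helper: count "NIC Copper Link is Down" events per interface.
# Assumes the bnx2 line format "<...> <iface>: NIC Copper Link is Down".
# $1 is the log file to scan.
count_link_down() {
    awk '/NIC Copper Link is Down/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "NIC") {
                iface = $(i - 1)        # field before "NIC", e.g. "em2:"
                sub(/:$/, "", iface)    # strip the trailing colon
                n[iface]++
                break
            }
        }
    }
    END { for (k in n) print k, n[k] }' "$1" | sort
}
```

Run against a saved copy of the boot log, for example `count_link_down logfile.mysql.withoutdate`, it prints one line per interface with the number of link-down events seen.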

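Our current workaround is to reboot the server again. So far, the second reboot has always gone smoothly, and the glitch shown in the log above does not appear.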

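Can anyone suggest how to proceed with troubleshooting this failure? I have already looked at the post here, but it is most likely not our case, because our problem occurs only at reboot. Apart from that, ifconfig shows no NIC errors, and there are no dropped or lost packets on the NIC. We have never had any network problems once the server is up.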
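For reference, the counters that ifconfig reports can also be read directly from sysfs, which makes them easy to capture after each boot. This is only a sketch: `nic_error_counters` is a hypothetical helper name, and the optional second argument exists purely so the function can be pointed at a different directory tree for testing.

```shell
#!/bin/sh
# Hypothetical helper: print error and drop counters for one interface.
# Reads the standard Linux files under /sys/class/net/<iface>/statistics/.
# $1 = interface name (e.g. em2); $2 = optional base dir (assumption, for testing).
nic_error_counters() {
    iface=$1
    base=${2:-/sys/class/net}
    for c in rx_errors tx_errors rx_dropped tx_dropped; do
        printf '%s %s %s\n' "$iface" "$c" "$(cat "$base/$iface/statistics/$c")"
    done
}
```

Calling `nic_error_counters em2` prints four lines, one per counter, in the form `em2 rx_errors <value>`.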

The DELL R610 is running Ubuntu 14.04.

More logs, as suggested by @Dom:

 $ cat logfile.mysql.withoutdate |grep -B 15 -A 15 "NIC Copper Link is Down"
 Loading iSCSI transport class v2.0-870.
 bnx2 0000:01:00.1 em2: using MSIX
 IPv6: ADDRCONF(NETDEV_UP): em2: link is not ready
 iscsi: registered transport (tcp)
 iscsi: registered transport (iser)
 multipathd (2470): /proc/2470/oom_adj is deprecated, please use /proc/2470/oom_score_adj instead.
 bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 100 Mbps full duplex

 IPv6: ADDRCONF(NETDEV_CHANGE): em1: link becomes ready
 bnx2 0000:02:00.0 em3: NIC Copper Link is Up, 1000 Mbps full duplex
 , receive & transmit flow control ON
 IPv6: ADDRCONF(NETDEV_CHANGE): em3: link becomes ready
 bnx2 0000:01:00.1 em2: NIC Copper Link is Up, 1000 Mbps full duplex
 , receive & transmit flow control ON
 IPv6: ADDRCONF(NETDEV_CHANGE): em2: link becomes ready
 bnx2 0000:01:00.1 em2: NIC Copper Link is Down
 bnx2 0000:01:00.1 em2: NIC Copper Link is Up, 1000 Mbps full duplex
 , receive & transmit flow control ON
