100.0% interrupts on one core

Why aren't interrupts spread across all cores?

Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,100.0%si,  0.0%st
Cpu1  : 25.2%us, 32.6%sy,  0.0%ni, 12.6%id, 26.2%wa,  0.0%hi,  3.3%si,  0.0%st
Cpu2  : 29.0%us, 15.0%sy,  0.0%ni, 29.3%id, 26.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 16.0%us, 21.7%sy,  0.0%ni, 34.3%id, 27.7%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu4  : 26.0%us, 14.3%sy,  0.0%ni, 33.7%id, 25.7%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu5  : 15.0%us, 15.0%sy,  0.0%ni, 44.2%id, 25.2%wa,  0.0%hi,  0.7%si,  0.0%st
Cpu6  : 13.0%us, 13.3%sy,  0.0%ni, 42.2%id, 31.2%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu7  :  9.7%us, 11.0%sy,  0.0%ni, 56.3%id, 23.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  : 13.0%us, 12.6%sy,  0.0%ni, 49.2%id, 25.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  9.6%us,  7.3%sy,  0.0%ni, 69.1%id, 13.6%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu10 :  8.9%us,  7.9%sy,  0.0%ni, 54.8%id, 28.1%wa,  0.0%hi,  0.3%si,  0.0%st

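For no apparent reason my server started performing poorly, and after checking top I noticed that a single core (Cpu0, at 100.0% si) is handling all of the interrupt load.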

cat /proc/interrupts

            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      
   0:        213          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-edge      timer
   8:          1          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
   9:          1          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
  16:        557          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6, uhci_hcd:usb7, uhci_hcd:usb8
  17:    4373632      89953          0          0          0   10737111          0          0          0          0          0   22943776   IO-APIC-fasteoi   firewire_ohci
  19:         48          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, uhci_hcd:usb5
  24:        378          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   nouveau
  34:        232          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   hda_intel
  64:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
  65:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
  66:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
  67:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME, pciehp
  68:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME, pciehp
  69:          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME, pciehp
  70:   27356052          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      mpt2sas0
  71:     360910          0          0          0      10388     366203          0     660341          0          0          0    1011704   PCI-MSI-edge      ahci
  72:          7          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth0
  73:    3223115          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth0-TxRx-0
  74:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth1
  75:    3573711          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth1-TxRx-0
  76:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth2
  77:    3548069          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth2-TxRx-0
  78:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth3
  79:    3290681          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth3-TxRx-0
  80:          6          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth4
  81:    3319709          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth4-TxRx-0
  82:          7          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth5
  83:    3294914          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth5-TxRx-0
  84:        223          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      hda_intel
  85:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  86:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  87:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  88:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  89:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  90:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  91:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
  92:          4          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      ioat-msix
 NMI:      20083      11292       9555      10288       8470       9085       7319       7726       6190       6286       5305       5966   Non-maskable interrupts
 LOC:   12625312   12863741   12757467   12819307   12735818   12636631   12594014   12340042   12351248   11896407   11976946   11309230   Local timer interrupts
 SPU:          0          0          0          0          0          0          0          0          0          0          0          0   Spurious interrupts
 PMI:      20083      11292       9555      10288       8470       9085       7319       7726       6190       6286       5305       5966   Performance monitoring interrupts
 IWI:          0          0          0          0          0          0          0          0          0          0          0          0   IRQ work interrupts
 RES:    2102300   11881309   11859706   12689803   11274676   10461216    9626798    8188722    7976358    6329291    6344685    4528014   Rescheduling interrupts
 CAL:     732819   20016455      15519      15361      17958      23935      23377      43079      40287     108860      70814     257653   Function call interrupts
 TLB:       7589      72270      46673      99284      46373     121129      43286     101506      34109      78720      28570      70600   TLB shootdowns
 TRM:          0          0          0          0          0          0          0          0          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0          0          0          0          0          0          0          0          0   Threshold APIC interrupts
 MCE:          0          0          0          0          0          0          0          0          0          0          0          0   Machine check exceptions
 MCP:         44         44         44         44         44         44         44         44         44         44         44         44   Machine check polls
 ERR:          0
 MIS:          0

cat /proc/interrupts shows that the interrupts for all the network cards are handled by CPU0, and I suspect that is the problem.
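To make the imbalance easier to see than by scanning the full table, the per-CPU counts for the NIC queues can be summed. The following is a minimal sketch, not a definitive tool: the function name `irq_totals_per_cpu` and the pattern `eth[0-9]+-TxRx` are assumptions chosen to match the queue names in the output above.

```shell
#!/bin/sh
# Hypothetical helper: sum per-CPU interrupt counts for every IRQ line whose
# text matches a pattern. Reads /proc/interrupts-style text on stdin and
# prints one "CPUn total" line per CPU.
irq_totals_per_cpu() {
    pattern="$1"
    awk -v pat="$pattern" '
        NR == 1 { ncpu = NF; next }      # header row has one field per CPU
        $0 ~ pat {
            # field 1 is the IRQ label ("73:"), fields 2..ncpu+1 are counts
            for (i = 1; i <= ncpu; i++) total[i - 1] += $(i + 1)
            seen = 1
        }
        END {
            if (!seen) exit 0
            for (c = 0; c < ncpu; c++) printf "CPU%d %d\n", c, total[c]
        }
    '
}

# Usage on a live system:
#   irq_totals_per_cpu 'eth[0-9]+-TxRx' < /proc/interrupts
```

With the table above, every eth*-TxRx-0 count would land in the CPU0 row and the other rows would be zero.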

The network is configured with BONDING_OPTS="mode=4 miimon=100 xmit_hash_policy=layer3+4"

What I have tried:

  1. Running irqbalance
  2. Rebooting

Answer 1

Check the interrupt's CPU affinity with the following command:

cat /proc/irq/70/smp_affinity 

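I chose 70 because it is associated with the storage controller driver mpt2sas0. You may also want to repeat the check for all the other drivers, especially the network cards if you are handling a lot of traffic.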

You want that setting to report the value fff, because that means all CPUs can service this interrupt.
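The smp_affinity value is a hexadecimal bitmask with one bit per CPU, so on the 12-CPU machine in the question the "all CPUs" mask is fff. As a rough sketch of how the check could be repeated for every IRQ, the helpers below compute that mask and list the IRQs that deviate from it. The function names `all_cpus_mask` and `report_restricted_irqs` are hypothetical, and the code assumes at most 32 CPUs, since larger systems print the mask as comma-separated 32-bit groups.

```shell
#!/bin/sh
# Hypothetical helper: print the hex mask with one bit set for each of the
# first N CPUs. Assumes N <= 32.
all_cpus_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

# Hypothetical helper: list IRQs under a /proc/irq-style directory whose
# smp_affinity differs from the wanted mask.
# Prints "<irq> <current mask>" for each mismatch.
report_restricted_irqs() {
    irq_dir="$1"
    want="$2"
    for f in "$irq_dir"/*/smp_affinity; do
        [ -r "$f" ] || continue
        have=$(sed 's/^0*//' "$f")      # drop zero padding before comparing
        irq=${f%/smp_affinity}
        irq=${irq##*/}
        [ "$have" = "$want" ] || printf '%s %s\n' "$irq" "${have:-0}"
    done
}

# Usage on a live system:
#   report_restricted_irqs /proc/irq "$(all_cpus_mask "$(nproc)")"
```

A restricted IRQ can then be widened as root with, for example, `echo fff > /proc/irq/70/smp_affinity`. Note that a running irqbalance daemon may later rewrite the value, and some drivers or kernels may reject the write.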

You can use this document from RedHat as a reference.

Answer 2

You may need to use irqbalance.
