Mellanox ConnectX-3 (CX312A) link is down even though the lights are blinking


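I have a system running TrueNAS SCALE. For some odd reason, the system does not see the link as up, even though dmesg reports it and the lights on the port are blinking. I am not sure whether this is a compatibility problem with the optics, since I am using ones from Ericsson (which Juniper gear recognizes). What is also odd is that auto-negotiation is advertised as off. Below are some of the commands I used for troubleshooting. I have also tried both ports on the NIC.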

Here is dmesg:

admin@truenas[~]$ sudo dmesg | grep mlx    
[    1.865599] mlx4_core: Mellanox ConnectX core driver v4.0-0
[    1.866114] mlx4_core: Initializing 0000:01:00.0
[    1.866639] mlx4_core 0000:01:00.0: enabling device (0000 -> 0002)
[    8.864380] mlx4_core 0000:01:00.0: DMFS high rate steer mode is: disabled performance optimized steering
[    8.865246] mlx4_core 0000:01:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[    8.943401] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-0
[    8.943890] mlx4_en 0000:01:00.0: Activating port:1
[    8.947954] mlx4_en: 0000:01:00.0: Port 1: Using 8 TX rings
[    8.948294] mlx4_en: 0000:01:00.0: Port 1: Using 8 RX rings
[    8.948836] mlx4_en: 0000:01:00.0: Port 1: Initializing port
[    8.949441] mlx4_en 0000:01:00.0: registered PHC clock
[    8.950038] mlx4_en 0000:01:00.0: Activating port:2
[    8.951205] mlx4_core 0000:01:00.0 enp1s0: renamed from eth0
[    8.953510] mlx4_en: 0000:01:00.0: Port 2: Using 8 TX rings
[    8.954069] mlx4_en: 0000:01:00.0: Port 2: Using 8 RX rings
[    8.954840] mlx4_en: 0000:01:00.0: Port 2: Initializing port
[    8.980259] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[    8.982100] <mlx4_ib> mlx4_ib_add: counter index 2 for port 1 allocated 1
[    8.982215] mlx4_core 0000:01:00.0 enp1s0d1: renamed from eth0
[    8.982567] <mlx4_ib> mlx4_ib_add: counter index 3 for port 2 allocated 1
[   11.659376] mlx4_en: enp1s0: Link Up
[  185.391704] mlx4_en: enp1s0: Link Down
[  257.903512] mlx4_en: enp1s0: Link Up
[ 1486.106437] mlx4_en: enp1s0: Link Down
[ 1524.842156] mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) port(1) i2c_addr(50) offset(0) size(1): Response Mad Status(31c) - cable is not connected
[ 1591.308231] mlx4_en: enp1s0: Link Up
[ 2026.000939] mlx4_en: enp1s0: Link Down
[ 2047.724983] mlx4_en: enp1s0: Link Up
[ 3657.633665] mlx4_en: enp1s0: Link Down
[ 3684.070133] mlx4_en: enp1s0d1: Link Up
[ 4577.747372] mlx4_en: enp1s0d1: Link Down
[ 4631.805428] mlx4_en: enp1s0: Link Up

Here is ethtool:

admin@truenas[~]$ sudo ethtool enp1s0
Settings for enp1s0:
    Supported ports: [ FIBRE ]
    Supported link modes:   1000baseX/Full
                            10000baseCR/Full
                            10000baseSR/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: No
    Supported FEC modes: Not reported
    Advertised link modes:  1000baseX/Full
                            10000baseCR/Full
                            10000baseSR/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: No
    Advertised FEC modes: Not reported
    Speed: 10000Mb/s
    Duplex: Full
    Auto-negotiation: off
    Port: FIBRE
    PHYAD: 0
    Transceiver: internal
    Supports Wake-on: d
    Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
    Link detected: no

Here is ethtool for the optics:

admin@truenas[~]$ sudo ethtool -m enp1s0
    Identifier                                : 0x03 (SFP)
    Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
    Connector                                 : 0x07 (LC)
    Transceiver codes                         : 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    Transceiver type                          : 10G Ethernet: 10G Base-SR
    Encoding                                  : 0x06 (64B/66B)
    BR, Nominal                               : 10300MBd
    Rate identifier                           : 0x00 (unspecified)
    Length (SMF,km)                           : 0km
    Length (SMF)                              : 0m
    Length (50um)                             : 80m
    Length (62.5um)                           : 30m
    Length (Copper)                           : 0m
    Length (OM3)                              : 300m
    Laser wavelength                          : 850nm
    Vendor name                               : FINISAR CORP.
    Vendor OUI                                : 00:90:65
    Vendor PN                                 : FTLX8571D3BCL-ER
    Vendor rev                                : A
    Option values                             : 0x00 0x1a
    Option                                    : RX_LOS implemented
    Option                                    : TX_FAULT implemented
    Option                                    : TX_DISABLE implemented
    BR margin, max                            : 0%
    BR margin, min                            : 0%
    Vendor SN                                 : AP3193K
    Date code                                 : 130127
    Optical diagnostics support               : Yes
    Laser bias current                        : 8.036 mA
    Laser output power                        : 0.6284 mW / -2.02 dBm
    Receiver signal average optical power     : 0.7003 mW / -1.55 dBm
    Module temperature                        : 41.64 degrees C / 106.95 degrees F
    Module voltage                            : 3.2852 V
    Alarm/warning flags implemented           : Yes
    Laser bias current high alarm             : Off
    Laser bias current low alarm              : Off
    Laser bias current high warning           : Off
    Laser bias current low warning            : Off
    Laser output power high alarm             : Off
    Laser output power low alarm              : Off
    Laser output power high warning           : Off
    Laser output power low warning            : Off
    Module temperature high alarm             : Off
    Module temperature low alarm              : Off
    Module temperature high warning           : Off
    Module temperature low warning            : Off
    Module voltage high alarm                 : Off
    Module voltage low alarm                  : Off
    Module voltage high warning               : Off
    Module voltage low warning                : Off
    Laser rx power high alarm                 : Off
    Laser rx power low alarm                  : Off
    Laser rx power high warning               : Off
    Laser rx power low warning                : Off
    Laser bias current high alarm threshold   : 13.200 mA
    Laser bias current low alarm threshold    : 4.000 mA
    Laser bias current high warning threshold : 12.600 mA
    Laser bias current low warning threshold  : 5.000 mA
    Laser output power high alarm threshold   : 1.0000 mW / 0.00 dBm
    Laser output power low alarm threshold    : 0.2512 mW / -6.00 dBm
    Laser output power high warning threshold : 0.7943 mW / -1.00 dBm
    Laser output power low warning threshold  : 0.3162 mW / -5.00 dBm
    Module temperature high alarm threshold   : 78.00 degrees C / 172.40 degrees F
    Module temperature low alarm threshold    : -13.00 degrees C / 8.60 degrees F
    Module temperature high warning threshold : 73.00 degrees C / 163.40 degrees F
    Module temperature low warning threshold  : -8.00 degrees C / 17.60 degrees F
    Module voltage high alarm threshold       : 3.7000 V
    Module voltage low alarm threshold        : 2.9000 V
    Module voltage high warning threshold     : 3.6000 V
    Module voltage low warning threshold      : 3.0000 V
    Laser rx power high alarm threshold       : 1.0000 mW / 0.00 dBm
    Laser rx power low alarm threshold        : 0.0100 mW / -20.00 dBm
    Laser rx power high warning threshold     : 0.7943 mW / -1.00 dBm
    Laser rx power low warning threshold      : 0.0158 mW / -18.01 dBm

Answer 1

I'm an idiot. You have to assign an IP to the interface and bring it up before it can be used.
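
For anyone landing here with the same symptom, a minimal sketch of that fix from the shell, assuming the iproute2 `ip` tool is available. The address `192.0.2.10/24` is a placeholder and not from the original post, and the `run` helper is only an illustration that prints each command unless `APPLY=1` is set. On TrueNAS SCALE, changes made this way are probably not persistent, so the interface settings in the web UI are likely the better place for a permanent configuration.

```shell
#!/bin/sh
# Interface name taken from the ethtool output in the question.
IFACE="${IFACE:-enp1s0}"
# Placeholder address (documentation range); replace with a real one.
ADDR="${ADDR:-192.0.2.10/24}"

# Hypothetical helper: print the command by default, execute it only
# when APPLY=1 is set (executing requires root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Bring the interface administratively up.
run ip link set dev "$IFACE" up
# Assign the address to the interface.
run ip addr add "$ADDR" dev "$IFACE"
# Re-check the link state; look at the "Link detected" line.
run ethtool "$IFACE"
```

Run it once as-is to review the commands, then again with `APPLY=1` under `sudo` to apply them.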
