IPsec site-to-site VPN problems after a recent Linux kernel update

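Over the weekend, automatic security upgrades ran on a VPN gateway that connects our sites to a cloud environment. After some basic network troubleshooting (for example with Wireshark), we found that one of the latest security updates caused the problem. We have restored the system to a known good state and put the packages that we believe are affected on hold.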

The gateway is an Ubuntu 20.04 LTS instance on AWS with linux-image-aws installed. We use IPsec to connect multiple EdgeRouters to a private cloud environment.

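After the upgrade, all sites connect and communicate as usual (ICMP works, for example), but we cannot reach some services in the private cloud environment, such as RDP or SMB.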

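The changelogs of the relevant packages show no obviously related changes, so I wonder whether I am missing something basic. This configuration has worked well for more than a year without any problems.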

Known good version: linux-image-aws 5.8.0.1041.43~20.04.13

Problematic versions: linux-image-aws 5.8.0.1042.44~20.04.14 and later (we also tested the latest 5.11, which appears to be affected as well)

IPsec configuration excerpt

# MAIN IPSEC VPN CONFIG
config setup

conn %default
        keyexchange=ikev1

# <REMOVED>
conn peer-rt1.<REMOVED>.net.au-tunnel-1
        left=%any
        right=rt1.<REMOVED>.net.au
        rightid="%any"
        leftsubnet=172.31.0.0/16
        rightsubnet=10.35.0.0/16
        ike=aes128-sha1-modp2048!
        keyexchange=ikev1
        ikelifetime=28800s
        esp=aes128-sha1-modp2048!
        keylife=3600s
        rekeymargin=540s
        type=tunnel
        compress=no
        authby=secret
        auto=route
        keyingtries=%forever
        dpddelay=30s
        dpdtimeout=120s
        dpdaction=restart

Thanks in advance.

Edit 1: After another test upgrade, I was able to capture another tcpdump from an unsuccessful RDP connection, shown below:

21:43:01.813502 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [S], seq 2706968963, win 64954, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
21:43:01.813596 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [S], seq 2706968963, win 64954, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
21:43:01.814238 IP <REMOTE>.3389 > <LOCAL>.51099: Flags [S.], seq 152885333, ack 2706968964, win 64000, options [mss 1460,nop,wscale 0,nop,nop,sackOK], length 0
21:43:01.839105 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 1, win 1025, length 0
21:43:01.839168 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 1, win 1025, length 0
21:43:01.840486 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 1:48, ack 1, win 1025, length 47
21:43:01.840541 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 1:48, ack 1, win 1025, length 47
21:43:01.843746 IP <REMOTE>.3389 > <LOCAL>.51099: Flags [P.], seq 1:20, ack 48, win 63953, length 19
21:43:01.922120 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 20, win 1025, length 0
21:43:01.922212 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 20, win 1025, length 0
21:43:01.932646 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 48:226, ack 20, win 1025, length 178
21:43:01.932729 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 48:226, ack 20, win 1025, length 178
21:43:01.940677 IP <REMOTE>.3389 > <LOCAL>.51099: Flags [P.], seq 20:1217, ack 226, win 63775, length 1197
21:43:01.967343 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 226:408, ack 1217, win 1020, length 182
21:43:01.967417 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 226:408, ack 1217, win 1020, length 182
21:43:01.969452 IP <REMOTE>.3389 > <LOCAL>.51099: Flags [P.], seq 1217:1324, ack 408, win 63593, length 107
21:43:02.044376 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 1324, win 1020, length 0
21:43:02.044471 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 1324, win 1020, length 0
21:43:02.135594 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 408:637, ack 1324, win 1020, length 229
21:43:02.135653 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 408:637, ack 1324, win 1020, length 229
21:43:02.136796 IP <REMOTE>.3389 > <LOCAL>.51099: Flags [P.], seq 1324:2609, ack 637, win 63364, length 1285
21:43:02.212871 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 2609, win 1025, length 0
21:43:02.212940 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [.], ack 2609, win 1025, length 0
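
A capture like this can be easier to read once it is summarized per direction. The following is only a minimal sketch, not part of the original troubleshooting: it assumes tcpdump's default one-line TCP output as shown above, and the names `summarize`, `LINE_RE` and `SAMPLE` are made up for illustration. It counts lines per direction, lines that repeat an earlier line apart from the timestamp, and the largest payload length.

```python
import re
from collections import Counter

# Matches tcpdump's default one-line TCP output, for example:
# "21:43:01.940677 IP a.3389 > b.51099: Flags [P.], seq 20:1217, ack 226, win 63775, length 1197"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\].*?length (?P<length>\d+)\s*$"
)


def summarize(capture_text):
    """Return per-direction counts: lines, repeated lines, largest payload."""
    stats = {}
    seen = Counter()
    for raw in capture_text.splitlines():
        line = raw.strip()
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a TCP line in the assumed format
        direction = (m["src"], m["dst"])
        # Everything after the timestamp identifies the packet as printed.
        signature = line.split(" ", 1)[1]
        entry = stats.setdefault(
            direction, {"lines": 0, "duplicates": 0, "max_length": 0}
        )
        entry["lines"] += 1
        if seen[signature]:
            entry["duplicates"] += 1
        seen[signature] += 1
        entry["max_length"] = max(entry["max_length"], int(m["length"]))
    return stats


# Three lines taken from the capture above.
SAMPLE = """\
21:43:02.135594 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 408:637, ack 1324, win 1020, length 229
21:43:02.135653 IP <LOCAL>.51099 > <REMOTE>.3389: Flags [P.], seq 408:637, ack 1324, win 1020, length 229
21:43:02.136796 IP <REMOTE>.3389 > <LOCAL>.51099: Flags [P.], seq 1324:2609, ack 637, win 63364, length 1285
"""

if __name__ == "__main__":
    for direction, entry in summarize(SAMPLE).items():
        print(direction, entry)
```

In the capture above, every <LOCAL> to <REMOTE> line appears twice within a fraction of a millisecond, and the largest <REMOTE> to <LOCAL> payload is 1285 bytes. Whether the repeats are real retransmissions or the same packet seen at two capture points cannot be told from this output alone.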

Answer 1

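Without digging into it: could old, weak ciphers such as sha1 and aes128 have been removed? Try changing the proposal to aes256-sha256-modp2048 on the working kernel version, then upgrade and see whether it is still broken. It might also be worth trying ikev2 instead of ikev1, although that is a userspace matter rather than a kernel setting.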

Give it a try...
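
One way to test the "removed algorithm" idea before changing the tunnel is to compare what each kernel has registered in /proc/crypto. This is a rough sketch under stated assumptions: the helper names are hypothetical, and the mapping of the aes128-sha1 ESP proposal to the kernel names `cbc(aes)` and `hmac(sha1)` is an assumption that should be verified. /proc/crypto lists only algorithms that are currently registered, and modules can be loaded on demand, so a missing entry is a hint and not proof.

```python
from pathlib import Path


def registered_algorithms(proc_crypto_text):
    """Collect the 'name' fields from /proc/crypto-style text."""
    names = set()
    for line in proc_crypto_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "name":
            names.add(value.strip())
    return names


def missing(required, proc_crypto_text):
    """Return the required algorithm names that are not registered, sorted."""
    return sorted(set(required) - registered_algorithms(proc_crypto_text))


# Assumed kernel names for the aes128-sha1 ESP proposal (not confirmed by the post).
REQUIRED = ["cbc(aes)", "hmac(sha1)"]

if __name__ == "__main__":
    path = Path("/proc/crypto")
    if path.exists():
        print("not registered:", missing(REQUIRED, path.read_text()))
```

Running it on the known good kernel and on the problematic one, with the tunnel up in both cases, would show whether the two lists differ.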
