Keepalived does not forward traffic to the BACKUP node after Kubernetes cluster setup
System structure:
- 10.10.1.86: Kubernetes master node
- 10.10.1.87: Kubernetes worker 1 node; keepalived MASTER node
- 10.10.1.88: Kubernetes worker 2 node; keepalived BACKUP node
- 10.10.1.90: VIP, load balances to .87 & .88; implemented by keepalived.
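This Kubernetes cluster is a dev environment for testing the collection of netflow logs.
What I want to achieve is:
- All router/switch netflow logs are first sent to .90
- keepalived then load balances them (lb_kind: NAT) to .87 & .88, which are the two Kubernetes workers.
- A NodePort Service catches this traffic into the Kubernetes cluster, where the rest of the data parsing is done.
- Something like: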
        | {OS Network}                   | {Kubernetes Network}

                                K8s Worker -> filebeat -> logstash (deployments)
                              /
<data> -> [VIP] load balance
                              \
                                K8s Worker -> filebeat -> logstash (deployments)
- filebeat.yml (the traffic after filebeat has been tested and is all fine, so I use the file output to narrow down the root cause.)
# cat filebeat.yml
filebeat.inputs:
- type: tcp
  max_message_size: 10MiB
  host: "0.0.0.0:5100"
- type: udp
  max_message_size: 10KiB
  host: "0.0.0.0:5150"

#output.logstash:
#  hosts: ["10.10.1.87:30044", "10.10.1.88:30044"]
output.file:
  path: "/tmp/"
  filename: tmp-filebeat.out
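One way to separate a filebeat problem from a VIP problem is to send a test line to the tcp input on each worker directly, and only then through the VIP. The following is a minimal sketch, assuming Python 3 is available on a host outside the cluster; `probe_tcp` is a hypothetical helper name, and port 5100 comes from the config above.

```python
import socket


def probe_tcp(host: str, port: int, payload: bytes = b"test\n", timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens and the payload is sent."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(payload)
        return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Calling `probe_tcp("10.10.1.87", 5100)`, `probe_tcp("10.10.1.88", 5100)` and `probe_tcp("10.10.1.90", 5100)` in turn would show whether each worker accepts connections on its own address, independently of the load balancer.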
Kubernetes
- The master and workers are 3 VMs in my private environment, not on a cloud provider such as GCP or AWS.
- Version:
# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:25:06Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
- Services
# cat logstash.service.yaml
apiVersion: v1
kind: Service
metadata:
  name: logstash-service
spec:
  type: NodePort
  selector:
    app: logstash
  ports:
    - port: 9514
      name: tcp-port
      targetPort: 9514
      nodePort: 30044
- Once the data gets into Kubernetes, everything works fine.
- It is the VIP load balancing that does not forward.
Keepalived conf
! Configuration File for keepalived
global_defs {
    router_id proxy1          # `proxy 2` at the other node
}
vrrp_instance VI_1 {
    state MASTER              # `BACKUP` at the other node
    interface ens160
    virtual_router_id 41
    priority 100              # `50` at the other node
    advert_int 1
    virtual_ipaddress {
        10.10.1.90/23
    }
}
virtual_server 10.10.1.90 5100 {
    delay_loop 30
    lb_algo rr
    lb_kind NAT
    protocol TCP
    persistence_timeout 0
    real_server 10.10.1.87 5100 {
        weight 1
    }
    real_server 10.10.1.88 5100 {
        weight 1
    }
}
virtual_server 10.10.1.90 5150 {
    delay_loop 30
    lb_algo rr
    lb_kind NAT
    protocol UDP
    persistence_timeout 0
    real_server 10.10.1.87 5150 {
        weight 1
    }
    real_server 10.10.1.88 5150 {
        weight 1
    }
}
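It worked before the Kubernetes cluster was set up
- Both .87 & .88 had keepalived installed, and rr (RoundRobin) load balancing worked fine (TCP and UDP).
- I stopped the keepalived service (systemctl stop keepalived) while setting up the Kubernetes cluster, just in case.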
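Problem occurred after the Kubernetes cluster was set up
- Only the MASTER node .87 receives forwarded traffic; the VIP cannot forward to the BACKUP node .88.
- The data forwarded to the MASTER is successfully caught by the Kubernetes NodePort and deployments.
Problem tested with nc:
- nc: only the node that holds the VIP (the MASTER node) receives traffic; when rr forwards to the BACKUP, the connection just times out.
- Also tested with nc -l 5100 on both servers; only the MASTER node got results.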
# echo "test" | nc 10.10.1.90 5100
# echo "test" | nc 10.10.1.90 5100
Ncat: Connection timed out.
# echo "test" | nc 10.10.1.90 5100
# echo "test" | nc 10.10.1.90 5100
Ncat: Connection timed out.
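The alternating success and timeout above is what plain round-robin would produce if one of the two real servers is unreachable from the director. The sketch below is only a toy model of that reasoning, not IPVS itself; `simulate_rr` is a hypothetical name, and the assumption that .87 is reachable while .88 is not is taken from the symptoms described in this post.

```python
from itertools import cycle


def simulate_rr(real_servers, reachable, attempts):
    """Replay round-robin scheduling and report the outcome of each attempt."""
    schedule = cycle(real_servers)
    outcomes = []
    for _ in range(attempts):
        server = next(schedule)
        outcomes.append((server, "ok" if server in reachable else "timeout"))
    return outcomes


# Assumption: the director reaches .87 (itself) but not .88.
results = simulate_rr(["10.10.1.87", "10.10.1.88"], {"10.10.1.87"}, 4)
for server, outcome in results:
    print(server, outcome)
# prints:
# 10.10.1.87 ok
# 10.10.1.88 timeout
# 10.10.1.87 ok
# 10.10.1.88 timeout
```

If the real pattern were not strictly alternating, that would point away from a single unreachable real server and toward something intermittent.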
Some info
- Package versions
# rpm -qa |grep keepalived
keepalived-1.3.5-19.el7.x86_64
- Kubernetes CNI: Calico
# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-b656ddcfc-wnkcj 1/1 Running 2 78d
calico-node-vnf4d 1/1 Running 8 78d
calico-node-xgzd5 1/1 Running 1 78d
calico-node-zt25t 1/1 Running 8 78d
coredns-558bd4d5db-n6hnn 1/1 Running 2 78d
coredns-558bd4d5db-zz2rb 1/1 Running 2 78d
etcd-a86.axv.bz 1/1 Running 2 78d
kube-apiserver-a86.axv.bz 1/1 Running 2 78d
kube-controller-manager-a86.axv.bz 1/1 Running 2 78d
kube-proxy-ddwsr 1/1 Running 2 78d
kube-proxy-hs4dx 1/1 Running 3 78d
kube-proxy-qg2nq 1/1 Running 1 78d
kube-scheduler-a86.axv.bz 1/1 Running 2 78d
- ipvsadm (same result on .87 and .88)
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.10.1.90:5100 rr
-> 10.10.1.87:5100 Masq 1 0 0
-> 10.10.1.88:5100 Masq 1 0 0
UDP 10.10.1.90:5150 rr
-> 10.10.1.87:5150 Masq 1 0 0
-> 10.10.1.88:5150 Masq 1 0 0
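To compare this table between the two nodes, or to watch whether the connection counters ever move for .88, the output can be parsed into a structure. This is a rough sketch that assumes the exact column layout shown above; `parse_ipvsadm` and `SAMPLE` are hypothetical names, and the sample text is copied from this post.

```python
def parse_ipvsadm(text):
    """Map each virtual service to the list of its real servers."""
    services = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP"):
            # Virtual service line: protocol, address:port, scheduler.
            current = f"{fields[0]} {fields[1]}"
            services[current] = []
        elif fields[0] == "->" and current is not None and len(fields) == 6:
            # Real server line; the "->" header row is skipped because it
            # appears before any virtual service has been seen.
            address, forward, weight, active, inactive = fields[1:]
            services[current].append({
                "address": address,
                "forward": forward,
                "weight": int(weight),
                "active": int(active),
                "inactive": int(inactive),
            })
    return services


SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.10.1.90:5100 rr
-> 10.10.1.87:5100 Masq 1 0 0
-> 10.10.1.88:5100 Masq 1 0 0
UDP 10.10.1.90:5150 rr
-> 10.10.1.87:5150 Masq 1 0 0
-> 10.10.1.88:5150 Masq 1 0 0
"""

for service, servers in parse_ipvsadm(SAMPLE).items():
    print(service, [s["address"] for s in servers])
```

Feeding it the output captured right after a failed nc attempt would show whether the counters for .88 change at all.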
- SELinux is always Permissive
- Stopping firewalld does not help; it still does not work.
- sysctl differences:
# before:
net.ipv4.conf.all.accept_redirects = 1
net.ipv4.conf.all.forwarding = 0
net.ipv4.conf.all.route_localnet = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.ip_forward = 0
# after
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.route_localnet = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.lo.forwarding = 1
net.ipv4.ip_forward = 1
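The list above was compared by hand. For a complete comparison, full `sysctl -a` dumps taken before and after the cluster setup could be diffed the same way. A minimal sketch, assuming the dumps use the `key = value` form shown here; `parse_sysctl`, `BEFORE` and `AFTER` are hypothetical names, and the values are the ones from this post.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, ignoring blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


BEFORE = """\
net.ipv4.conf.all.accept_redirects = 1
net.ipv4.conf.all.forwarding = 0
net.ipv4.conf.all.route_localnet = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.ip_forward = 0
"""

AFTER = """\
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.route_localnet = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.lo.forwarding = 1
net.ipv4.ip_forward = 1
"""

before, after = parse_sysctl(BEFORE), parse_sysctl(AFTER)
# Keep only the keys whose value differs between the two snapshots.
changed = {k: (before[k], after.get(k)) for k in sorted(before) if before[k] != after.get(k)}
for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")
```

With the six values listed here, every key shows up as changed: forwarding and route_localnet are switched on, and accept_redirects is switched off.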
I am not sure what further checks I can do at this point. Please advise, thank you!