Cannot access services from outside an on-premises Kubernetes cluster via the load balancer IP

2024-6-23

I have an on-premises Kubernetes cluster (v1.24) with CRI-O as the container runtime, Calico as the CNI, and MetalLB for load balancer IPs.

The master and worker nodes run Rocky Linux 9. SELinux is enabled, the firewall is disabled, and kube-proxy runs in IPVS mode.

I have set up BGP peering with a MikroTik router, and I can see that the IP ranges I specified are advertised to the router: its route section has an entry showing

10.16.0.0/28 reachable through bridge

The external IP of my nginx test service is 10.16.0.1. Curling the IP from the masters, from the workers, and from inside pods works and returns the default nginx welcome page. Curling from my laptop, however, just hangs until it times out, and the SELinux audit logs show no violations.

Running nmap against the IP shows port 80 as open, pinging the IP works, and traceroute shows the right response. As a sanity check, I also deleted the service and ran nmap, ping, and traceroute again; as expected, they no longer worked.

# the commands below were run on my laptop,
# which is connected to the local network;
# the results are the same when I run them
# on other devices on the network

# ---

❯ nmap -T4 10.16.0.1
Starting Nmap 7.94 ( https://nmap.org ) at 2023-09-23 11:01 +07
Nmap scan report for 10.16.0.1
Host is up (0.0048s latency).
Not shown: 997 closed tcp ports (conn-refused)
PORT    STATE    SERVICE
22/tcp  open     ssh
80/tcp  filtered http
179/tcp open     bgp

# ---

# 192.168.88.43 is the local master node IP
❯ ping 10.16.0.1
PING 10.16.0.1 (10.16.0.1): 56 data bytes
64 bytes from 10.16.0.1: icmp_seq=0 ttl=64 time=9.144 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 c3b4   0 0000  3f  01 94cc 192.168.88.111  10.16.0.1

64 bytes from 10.16.0.1: icmp_seq=1 ttl=64 time=3.003 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 c9cc   0 0000  3f  01 8eb4 192.168.88.111  10.16.0.1

64 bytes from 10.16.0.1: icmp_seq=2 ttl=64 time=3.209 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 305b   0 0000  3f  01 2826 192.168.88.111  10.16.0.1

64 bytes from 10.16.0.1: icmp_seq=3 ttl=64 time=2.557 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 25d8   0 0000  3f  01 32a9 192.168.88.111  10.16.0.1

64 bytes from 10.16.0.1: icmp_seq=4 ttl=64 time=3.594 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 4016   0 0000  3f  01 186b 192.168.88.111  10.16.0.1

64 bytes from 10.16.0.1: icmp_seq=5 ttl=64 time=2.974 ms

64 bytes from 10.16.0.1: icmp_seq=6 ttl=64 time=4.397 ms
^C
--- 10.16.0.1 ping statistics ---
7 packets transmitted, 7 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 2.557/4.125/9.144/2.119 ms

# --

# 192.168.88.1 is my default gateway
❯ traceroute 10.16.0.1
traceroute to 10.16.0.1 (10.16.0.1), 64 hops max, 52 byte packets
 1  192.168.88.1 (192.168.88.1)  3.669 ms  2.713 ms  2.552 ms
 2  10.16.0.1 (10.16.0.1)  3.303 ms  3.292 ms  3.145 ms
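
Note that the scan above lists port 80 as filtered, not open. nmap reports that state when its probes get no reply (or only an ICMP error), which is consistent with curl hanging until the timeout. A minimal sketch for pulling the state of one port out of nmap's normal output; the helper name port_state is made up for illustration and is not part of nmap:

```shell
# port_state PORT/PROTO: read nmap's normal output on stdin and print the
# STATE column for that port (hypothetical helper, not part of nmap).
port_state() {
  awk -v port="$1" '$1 == port { print $2 }'
}

# Example with the three port lines from the scan above.
port_state 80/tcp <<'EOF'
22/tcp  open     ssh
80/tcp  filtered http
179/tcp open     bgp
EOF
# prints: filtered
```

Against a live target the same helper can be fed directly, for example nmap -T4 10.16.0.1 | port_state 80/tcp.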

Looking up the nginx details

❯ kubectl get pods -o wide
NAME                                 READY   STATUS    RESTARTS        AGE   IP              NODE           NOMINATED NODE   READINESS GATES
nginx                                1/1     Running   0               39m   172.16.29.154   k8s-worker-0   <none>           <none>

❯ kubectl get svc
NAME                 TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                               AGE
kubernetes           ClusterIP      10.96.0.1      <none>        443/TCP                               2d5h
nginx                LoadBalancer   10.97.153.41   10.16.0.1     80:30422/TCP                          40m

❯ kubectl get endpoints
NAME                 ENDPOINTS                                                               AGE
kubernetes           192.168.88.43:6443                                                      2d5h
nginx                172.16.29.154:80                                                        41m

❯ kubectl describe service nginx
Name:                     nginx
Namespace:                default
Labels:                   <none>
Annotations:              metallb.universe.tf/address-pool: public
                          metallb.universe.tf/ip-allocated-from-pool: public
Selector:                 app=nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.97.153.41
IPs:                      10.97.153.41
LoadBalancer Ingress:     10.16.0.1
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  30422/TCP
Endpoints:                172.16.29.154:80
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     31552
Events:
  Type    Reason       Age   From                Message
  ----    ------       ----  ----                -------
  Normal  IPAllocated  41m   metallb-controller  Assigned IP ["10.16.0.1"]
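
The service uses External Traffic Policy: Local. With that policy, kube-proxy only accepts external traffic on nodes that run a ready endpoint of the service; a node without a local nginx pod drops the traffic instead of forwarding it to another node. It may therefore be worth comparing the node the router sends packets to with the nodes that actually host the pod. A rough sketch, assuming the pods are listed with the custom columns NAME, STATUS, NODE; the helper name local_endpoint_nodes is hypothetical:

```shell
# local_endpoint_nodes: read "NAME STATUS NODE" lines on stdin and print the
# unique nodes that host a Running pod. Running is only an approximation of
# "ready endpoint"; this helper is an illustration, not a kubectl feature.
local_endpoint_nodes() {
  awk '$2 == "Running" { print $3 }' | sort -u
}

# Usage against a live cluster (app=nginx is the selector of the service):
#   kubectl get pods -l app=nginx --no-headers \
#     -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName \
#     | local_endpoint_nodes
```

In the output above the only pod runs on k8s-worker-0, while the ICMP redirects in the ping output point at 192.168.88.43, the master. If the router's route for 10.16.0.0/28 leads to the master, that combination could explain the timeout. This is a hypothesis to verify, not a confirmed diagnosis.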

Calico configuration

❯ kubectl describe bgppeer
Name:         global-peer
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  projectcalico.org/v3
Kind:         BGPPeer
Metadata:
  Creation Timestamp:  2023-09-22T06:37:35Z
  Resource Version:    430164
  UID:                 62d549ac-b7ed-47ab-bc2c-b94de8a08939
Spec:
  As Number:  65530
  Filters:
    default
  Peer IP:  192.168.88.1
Events:     <none>

❯ kubectl describe bgpconfiguration
Name:         default
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  projectcalico.org/v3
Kind:         BGPConfiguration
Metadata:
  Creation Timestamp:  2023-09-22T06:37:24Z
  Resource Version:    839696
  UID:                 fd6aef3e-0a4c-4ecc-afe4-09395b76d107
Spec:
  As Number:                  65500
  Node To Node Mesh Enabled:  false
  Service Load Balancer I Ps:
    Cidr:  10.16.0.0/28
Events:    <none>

❯ kubectl describe bgpfilter
Name:         default
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  projectcalico.org/v3
Kind:         BGPFilter
Metadata:
  Creation Timestamp:  2023-09-22T06:37:34Z
  Resource Version:    434512
  UID:                 df8b5aeb-e25e-481c-9403-fcc54727eea8
Spec:
  exportV4:
    Action:          Reject
    Cidr:            10.16.0.0/28
    Match Operator:  NotIn
Events:              <none>
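
Read literally, this filter rejects the export of every route that is not inside 10.16.0.0/28, so only the load balancer range should reach the router and the pod routes from 172.16.0.0/16 should not. A small sketch in POSIX shell arithmetic for checking which addresses from the question fall inside that range; ip_to_int and ip_in_cidr are illustrative helpers, not Calico tooling:

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr IP NET/BITS: succeed if IP lies inside the given IPv4 range.
ip_in_cidr() {
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "${2%/*}") & $mask )) ]
}

# Service IP and pod IP from the question, checked against the pool.
for addr in 10.16.0.1 172.16.29.154; do
  if ip_in_cidr "$addr" 10.16.0.0/28; then
    echo "$addr is inside 10.16.0.0/28"
  else
    echo "$addr is outside 10.16.0.0/28"
  fi
done
```

The service IP 10.16.0.1 is inside the range and the pod IP 172.16.29.154 is outside, so with this filter the router should learn routes for the load balancer range only.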

❯ kubectl describe installation
Name:         default
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  operator.tigera.io/v1
Kind:         Installation
Metadata:
  Creation Timestamp:  2023-09-21T05:32:53Z
  Finalizers:
    tigera.io/operator-cleanup
  Generation:        3
  Resource Version:  1052996
  UID:               2f7928cd-3815-42be-9483-512f2f39fcf9
Spec:
  Calico Network:
    Bgp:         Enabled
    Host Ports:  Enabled
    Ip Pools:
      Block Size:          26
      Cidr:                172.16.0.0/16
      Disable BGP Export:  false
      Encapsulation:       None
      Nat Outgoing:        Disabled
      Node Selector:       all()
    Linux Dataplane:       Iptables
    Multi Interface Mode:  None
    nodeAddressAutodetectionV4:
      First Found:  true
  Cni:
    Ipam:
      Type:                    Calico
    Type:                      Calico
  Control Plane Replicas:      2
  Flex Volume Path:            /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
  Kubelet Volume Plugin Path:  /var/lib/kubelet
  Logging:
    Cni:
      Log File Max Age Days:  30
      Log File Max Count:     10
      Log File Max Size:      100Mi
      Log Severity:           Info
  Node Update Strategy:
    Rolling Update:
      Max Unavailable:  1
    Type:               RollingUpdate
  Non Privileged:       Disabled
  Variant:              Calico
Status:
  Calico Version:  v3.26.1
  Computed:
    Calico Network:
      Bgp:         Enabled
      Host Ports:  Enabled
      Ip Pools:
        Block Size:          26
        Cidr:                172.16.0.0/16
        Disable BGP Export:  false
        Encapsulation:       None
        Nat Outgoing:        Disabled
        Node Selector:       all()
      Linux Dataplane:       Iptables
      Multi Interface Mode:  None
      nodeAddressAutodetectionV4:
        First Found:  true
    Cni:
      Ipam:
        Type:                    Calico
      Type:                      Calico
    Control Plane Replicas:      2
    Flex Volume Path:            /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
    Kubelet Volume Plugin Path:  /var/lib/kubelet
    Logging:
      Cni:
        Log File Max Age Days:  30
        Log File Max Count:     10
        Log File Max Size:      100Mi
        Log Severity:           Info
    Node Update Strategy:
      Rolling Update:
        Max Unavailable:  1
      Type:               RollingUpdate
    Non Privileged:       Disabled
    Variant:              Calico
  Conditions:
    Last Transition Time:  2023-09-23T15:08:30Z
    Message:
    Observed Generation:   3
    Reason:                Unknown
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2023-09-23T15:08:30Z
    Message:
    Observed Generation:   3
    Reason:                Unknown
    Status:                False
    Type:                  Ready
    Last Transition Time:  2023-09-23T15:08:30Z
    Message:               DaemonSet "calico-system/calico-node" is not available (awaiting 1 nodes)
    Observed Generation:   3
    Reason:                ResourceNotReady
    Status:                True
    Type:                  Progressing
  Mtu:                     1450
  Variant:                 Calico
Events:                    <none>

calicoctl node status shows

MetalLB configuration

❯ kubectl -n metallb-system describe ipaddresspool
Name:         public
Namespace:    metallb-system
Labels:       <none>
Annotations:  <none>
API Version:  metallb.io/v1beta1
Kind:         IPAddressPool
Metadata:
  Creation Timestamp:  2023-09-22T07:57:01Z
  Generation:          3
  Resource Version:    434511
  UID:                 aa2d0e08-4fee-4ce1-a822-9bf4f52a3a3c
Spec:
  Addresses:
    10.16.0.0/28
  Auto Assign:       true
  Avoid Buggy I Ps:  true
Events:              <none>

So how can I debug why I cannot access the nginx service from outside the cluster?

Update

When I run tcpdump -n -i ens18 host 10.16.0.1 on both the master and the worker node and then run ping 10.16.0.1 and arping -I 10.16.0.1 on another machine outside the cluster, I get the following on the master and worker nodes:

# the ping shows up only on the master node
# 192.168.88.111 is the other machine (my laptop)
dropped privs to tcpdump
tcpdump: listening on ens18, link-type EN10MB (Ethernet), snapshot length 262144 bytes
15:31:29.842416 IP (tos 0x0, ttl 63, id 33297, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.88.111 > 10.16.0.1: ICMP echo request, id 14677, seq 0, length 64
15:31:29.842465 IP (tos 0x0, ttl 64, id 38235, offset 0, flags [none], proto ICMP (1), length 84)
    10.16.0.1 > 192.168.88.111: ICMP echo reply, id 14677, seq 0, length 64
15:31:30.846173 IP (tos 0x0, ttl 63, id 14503, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.88.111 > 10.16.0.1: ICMP echo request, id 14677, seq 1, length 64
15:31:30.846206 IP (tos 0x0, ttl 64, id 38836, offset 0, flags [none], proto ICMP (1), length 84)
    10.16.0.1 > 192.168.88.111: ICMP echo reply, id 14677, seq 1, length 64

# arping received 0 responses
# the arping requests show up on both the master and worker nodes
15:33:08.269603 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:09.269759 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:10.269768 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:11.269725 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:12.269575 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:13.269695 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:14.269685 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:15.269715 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:16.269754 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46

Answer

I would suggest asking this in the Calico Users Slack.

In your BGP configuration I saw that you have disabled the full mesh (Node To Node Mesh Enabled: false). Is that intentional? Are you using route reflectors?

I would start by checking the BGP route on the MikroTik. Something like this (adjust it if you are running an older RouterOS version):

/ip/route/print detail from=[find gateway~"^192.168.88.63[0x00-0 xff]*"]

If that looks right, I would check whether the load balancer is reachable from the MikroTik (replace 10.43.1.1 with your own load balancer IP):

/tool/fetch http://10.43.1.1

Then check the status of the BIRD protocols and routes with birdcl from the calico-node pods:

kubectl exec -n calico-system ds/calico-node -c calico-node -- birdcl show protocols

kubectl exec -n calico-system ds/calico-node -c calico-node -- birdcl show route
