I have a local Kubernetes cluster (v1.24) with CRI-O as the container runtime, Calico as the CNI, and MetalLB handing out load-balancer IPs.
Masters and workers run Rocky Linux 9 with SELinux enabled and firewalld disabled, and kube-proxy runs in IPVS mode.
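Since kube-proxy runs in IPVS mode, a quick node-local sanity check is whether the load-balancer IP is programmed as an IPVS virtual server. A minimal sketch, assuming ipvsadm is installed on the node (the service and endpoint addresses are taken from the outputs further below):
# 10.16.0.1:80 should show up as a TCP virtual server whose real
# server is the nginx pod endpoint (172.16.29.154:80 below)
❯ sudo ipvsadm -Ln | grep -A 2 10.16.0.1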
I set up BGP peering with a Mikrotik router and saw that the IP ranges I configured are announced to the router; its routes section shows a record like
10.16.0.0/28 reachable through bridge
The external IP of my test Nginx service is 10.16.0.1. Curling that IP from the masters, the workers, and from inside pods works and returns the default Nginx welcome page, but curling it from my laptop just hangs until it times out, and the SELinux audit logs show no violations.
nmap against the IP finds port 80 (reported as filtered below), pinging the IP works, and traceroute shows the expected path. As a sanity check I also deleted the service and ran nmap, ping, and traceroute again, and everything stopped working, as expected.
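Before anything heavier, it is worth watching whether the TCP SYNs for port 80 even reach the node while curl hangs on the laptop. A minimal sketch, assuming ens18 is the node's uplink interface (as in the update further down):
# run on the master/worker while curl runs on the laptop;
# SYNs without a SYN-ACK back point at a return-path or dataplane drop
❯ sudo tcpdump -n -i ens18 'tcp port 80 and host 10.16.0.1'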
# the commands below run on my laptop,
# which is connected to the local network;
# the results are the same when run from
# other devices on the network
# ---
❯ nmap -T4 10.16.0.1
Starting Nmap 7.94 ( https://nmap.org ) at 2023-09-23 11:01 +07
Nmap scan report for 10.16.0.1
Host is up (0.0048s latency).
Not shown: 997 closed tcp ports (conn-refused)
PORT    STATE    SERVICE
22/tcp  open     ssh
80/tcp  filtered http
179/tcp open     bgp
# ---
# 192.168.88.43 is the local master node IP
❯ ping 10.16.0.1
PING 10.16.0.1 (10.16.0.1): 56 data bytes
64 bytes from 10.16.0.1: icmp_seq=0 ttl=64 time=9.144 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 c3b4 0 0000 3f 01 94cc 192.168.88.111 10.16.0.1
64 bytes from 10.16.0.1: icmp_seq=1 ttl=64 time=3.003 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 c9cc 0 0000 3f 01 8eb4 192.168.88.111 10.16.0.1
64 bytes from 10.16.0.1: icmp_seq=2 ttl=64 time=3.209 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 305b 0 0000 3f 01 2826 192.168.88.111 10.16.0.1
64 bytes from 10.16.0.1: icmp_seq=3 ttl=64 time=2.557 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 25d8 0 0000 3f 01 32a9 192.168.88.111 10.16.0.1
64 bytes from 10.16.0.1: icmp_seq=4 ttl=64 time=3.594 ms
92 bytes from 192.168.88.1: Redirect Host(New addr: 192.168.88.43)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 4016 0 0000 3f 01 186b 192.168.88.111 10.16.0.1
64 bytes from 10.16.0.1: icmp_seq=5 ttl=64 time=2.974 ms
64 bytes from 10.16.0.1: icmp_seq=6 ttl=64 time=4.397 ms
^C
--- 10.16.0.1 ping statistics ---
7 packets transmitted, 7 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 2.557/4.125/9.144/2.119 ms
# --
# 192.168.88.1 is my default gateway
❯ traceroute 10.16.0.1
traceroute to 10.16.0.1 (10.16.0.1), 64 hops max, 52 byte packets
1 192.168.88.1 (192.168.88.1) 3.669 ms 2.713 ms 2.552 ms
2 10.16.0.1 (10.16.0.1) 3.303 ms 3.292 ms 3.145 ms
Nginx service information:
❯ kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP              NODE           NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          39m   172.16.29.154   k8s-worker-0   <none>           <none>
❯ kubectl get svc
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP      10.96.0.1      <none>        443/TCP        2d5h
nginx        LoadBalancer   10.97.153.41   10.16.0.1     80:30422/TCP   40m
❯ kubectl get endpoints
NAME         ENDPOINTS            AGE
kubernetes   192.168.88.43:6443   2d5h
nginx        172.16.29.154:80     41m
❯ kubectl describe service nginx
Name: nginx
Namespace: default
Labels: <none>
Annotations: metallb.universe.tf/address-pool: public
metallb.universe.tf/ip-allocated-from-pool: public
Selector: app=nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.97.153.41
IPs: 10.97.153.41
LoadBalancer Ingress: 10.16.0.1
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 30422/TCP
Endpoints: 172.16.29.154:80
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 31552
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal IPAllocated 41m metallb-controller Assigned IP ["10.16.0.1"]
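One detail that stands out above is External Traffic Policy: Local. With Local, a node only answers for the service if it hosts one of the endpoints, and the endpoint here lives on k8s-worker-0 while the router's redirect (seen in the ping output) points at the master node. As a test, not a fix, switching the policy to Cluster makes every node forward to the pod:
# if curl from the laptop starts working after this, the BGP route was
# steering traffic to a node without a local endpoint
❯ kubectl patch svc nginx -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'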
Calico configuration:
❯ kubectl describe bgppeer
Name: global-peer
Namespace:
Labels: <none>
Annotations: <none>
API Version: projectcalico.org/v3
Kind: BGPPeer
Metadata:
Creation Timestamp: 2023-09-22T06:37:35Z
Resource Version: 430164
UID: 62d549ac-b7ed-47ab-bc2c-b94de8a08939
Spec:
As Number: 65530
Filters:
default
Peer IP: 192.168.88.1
Events: <none>
❯ kubectl describe bgpconfiguration
Name: default
Namespace:
Labels: <none>
Annotations: <none>
API Version: projectcalico.org/v3
Kind: BGPConfiguration
Metadata:
Creation Timestamp: 2023-09-22T06:37:24Z
Resource Version: 839696
UID: fd6aef3e-0a4c-4ecc-afe4-09395b76d107
Spec:
As Number: 65500
Node To Node Mesh Enabled: false
Service Load Balancer I Ps:
Cidr: 10.16.0.0/28
Events: <none>
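For reference, a manifest equivalent to the BGPConfiguration shown above, reconstructed from the describe output (the serviceLoadBalancerIPs block is what makes Calico advertise the MetalLB range over BGP):
❯ kubectl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  asNumber: 65500
  nodeToNodeMeshEnabled: false
  serviceLoadBalancerIPs:
    - cidr: 10.16.0.0/28
EOF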
❯ kubectl describe bgpfilter
Name: default
Namespace:
Labels: <none>
Annotations: <none>
API Version: projectcalico.org/v3
Kind: BGPFilter
Metadata:
Creation Timestamp: 2023-09-22T06:37:34Z
Resource Version: 434512
UID: df8b5aeb-e25e-481c-9403-fcc54727eea8
Spec:
exportV4:
Action: Reject
Cidr: 10.16.0.0/28
Match Operator: NotIn
Events: <none>
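Note the filter semantics: the rule rejects any IPv4 export whose destination is NotIn 10.16.0.0/28, so only the load-balancer range gets exported to the peer, which matches the single route visible on the Mikrotik. An equivalent manifest, reconstructed from the output above:
❯ kubectl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPFilter
metadata:
  name: default
spec:
  exportV4:
    - action: Reject
      matchOperator: NotIn
      cidr: 10.16.0.0/28
EOF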
❯ kubectl describe installation
Name: default
Namespace:
Labels: <none>
Annotations: <none>
API Version: operator.tigera.io/v1
Kind: Installation
Metadata:
Creation Timestamp: 2023-09-21T05:32:53Z
Finalizers:
tigera.io/operator-cleanup
Generation: 3
Resource Version: 1052996
UID: 2f7928cd-3815-42be-9483-512f2f39fcf9
Spec:
Calico Network:
Bgp: Enabled
Host Ports: Enabled
Ip Pools:
Block Size: 26
Cidr: 172.16.0.0/16
Disable BGP Export: false
Encapsulation: None
Nat Outgoing: Disabled
Node Selector: all()
Linux Dataplane: Iptables
Multi Interface Mode: None
nodeAddressAutodetectionV4:
First Found: true
Cni:
Ipam:
Type: Calico
Type: Calico
Control Plane Replicas: 2
Flex Volume Path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
Kubelet Volume Plugin Path: /var/lib/kubelet
Logging:
Cni:
Log File Max Age Days: 30
Log File Max Count: 10
Log File Max Size: 100Mi
Log Severity: Info
Node Update Strategy:
Rolling Update:
Max Unavailable: 1
Type: RollingUpdate
Non Privileged: Disabled
Variant: Calico
Status:
Calico Version: v3.26.1
Computed:
Calico Network:
Bgp: Enabled
Host Ports: Enabled
Ip Pools:
Block Size: 26
Cidr: 172.16.0.0/16
Disable BGP Export: false
Encapsulation: None
Nat Outgoing: Disabled
Node Selector: all()
Linux Dataplane: Iptables
Multi Interface Mode: None
nodeAddressAutodetectionV4:
First Found: true
Cni:
Ipam:
Type: Calico
Type: Calico
Control Plane Replicas: 2
Flex Volume Path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
Kubelet Volume Plugin Path: /var/lib/kubelet
Logging:
Cni:
Log File Max Age Days: 30
Log File Max Count: 10
Log File Max Size: 100Mi
Log Severity: Info
Node Update Strategy:
Rolling Update:
Max Unavailable: 1
Type: RollingUpdate
Non Privileged: Disabled
Variant: Calico
Conditions:
Last Transition Time: 2023-09-23T15:08:30Z
Message:
Observed Generation: 3
Reason: Unknown
Status: False
Type: Degraded
Last Transition Time: 2023-09-23T15:08:30Z
Message:
Observed Generation: 3
Reason: Unknown
Status: False
Type: Ready
Last Transition Time: 2023-09-23T15:08:30Z
Message: DaemonSet "calico-system/calico-node" is not available (awaiting 1 nodes)
Observed Generation: 3
Reason: ResourceNotReady
Status: True
Type: Progressing
Mtu: 1450
Variant: Calico
Events: <none>
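Separately, the Progressing condition above reports the calico-node DaemonSet as not available on one node. If that node is involved in advertising or forwarding the load-balancer range, its routes may be stale, so it is worth pinning down which node is unhealthy:
# find the calico-node pod that is not ready and the node it runs on
❯ kubectl -n calico-system get ds calico-node
❯ kubectl -n calico-system get pods -o wide -l k8s-app=calico-node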
calicoctl node status shows: [screenshot omitted]
MetalLB configuration:
❯ kubectl -n metallb-system describe ipaddresspool
Name: public
Namespace: metallb-system
Labels: <none>
Annotations: <none>
API Version: metallb.io/v1beta1
Kind: IPAddressPool
Metadata:
Creation Timestamp: 2023-09-22T07:57:01Z
Generation: 3
Resource Version: 434511
UID: aa2d0e08-4fee-4ce1-a822-9bf4f52a3a3c
Spec:
Addresses:
10.16.0.0/28
Auto Assign: true
Avoid Buggy I Ps: true
Events: <none>
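And the equivalent manifest for the pool, reconstructed from the output above. In this design MetalLB presumably only allocates the address while Calico handles the BGP advertisement, which is why no MetalLB BGPAdvertisement resource appears here:
❯ kubectl apply -f - <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: public
  namespace: metallb-system
spec:
  addresses:
    - 10.16.0.0/28
  autoAssign: true
  avoidBuggyIPs: true
EOF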
So, how do I debug this, and why can't I reach the Nginx service from outside the cluster?
Update 0
Running tcpdump -n -i ens18 host 10.16.0.1 on both the master and the worker node, and then running ping 10.16.0.1 and arping -I <iface> 10.16.0.1 from another machine outside the cluster, gives me the following on both nodes.
# ping traffic only shows on the master node
# 192.168.88.111 is the other machine (my laptop)
dropped privs to tcpdump
tcpdump: listening on ens18, link-type EN10MB (Ethernet), snapshot length 262144 bytes
15:31:29.842416 IP (tos 0x0, ttl 63, id 33297, offset 0, flags [none], proto ICMP (1), length 84)
192.168.88.111 > 10.16.0.1: ICMP echo request, id 14677, seq 0, length 64
15:31:29.842465 IP (tos 0x0, ttl 64, id 38235, offset 0, flags [none], proto ICMP (1), length 84)
10.16.0.1 > 192.168.88.111: ICMP echo reply, id 14677, seq 0, length 64
15:31:30.846173 IP (tos 0x0, ttl 63, id 14503, offset 0, flags [none], proto ICMP (1), length 84)
192.168.88.111 > 10.16.0.1: ICMP echo request, id 14677, seq 1, length 64
15:31:30.846206 IP (tos 0x0, ttl 64, id 38836, offset 0, flags [none], proto ICMP (1), length 84)
10.16.0.1 > 192.168.88.111: ICMP echo reply, id 14677, seq 1, length 64
# arping received 0 responses
# the arping requests show up on both master and worker node
15:33:08.269603 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:09.269759 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:10.269768 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:11.269725 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:12.269575 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:13.269695 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:14.269685 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:15.269715 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
15:33:16.269754 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.16.0.1 (Broadcast) tell 192.168.88.63, length 46
Answer 1
I'd suggest asking this in the Calico Users Slack.
In your BGP configuration I see you have disabled the full node-to-node mesh; is that intentional? Are you using route reflectors?
Node To Node Mesh Enabled: false
I would start by checking the BGP route on the Mikrotik. Something like this (adjust if you're running an older RouterOS version):
/ip/route/print detail from=[find gateway~"^192.168.88.63[0x00-0xff]*"]
If that looks right, I'd check whether the LB is reachable from the Mikrotik itself:
/tool/fetch http://10.16.0.1
Then check the BIRD protocol status; try birdcl from the calico-node pods:
kubectl exec -n calico-system ds/calico-node -c calico-node -- birdcl show protocols
kubectl exec -n calico-system ds/calico-node -c calico-node -- birdcl show route
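In the show protocols output, the session towards 192.168.88.1 should be in the Established state; anything else points at the peering itself rather than the dataplane. For full detail on that one peer (the Global_192_168_88_1 protocol name is an assumption about Calico's generated BIRD config; take the exact name from show protocols):
# dump full state of the router peering, including route counts
kubectl exec -n calico-system ds/calico-node -c calico-node -- birdcl show protocols all Global_192_168_88_1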