Using kubeadm and flannel, with 4+ nodes running RHEL 7.
I did the following:
- Opened port 10250 on all nodes
- Applied the fix from "Failed to get kubernetes address: No kubernetes source found" to address the no source found issue
- Ran kubectl create -f deploy/1.8+/
- Ran kubectl get pods -n=kube-system, which showed the metrics-server in CrashLoopBackOff:
NAME                                          READY   STATUS             RESTARTS   AGE
coredns-78fcdf6894-4q7ct                      1/1     Running            10         7d
coredns-78fcdf6894-7tj52                      1/1     Running            10         7d
etcd-thalia0.ahc.umn.edu                      1/1     Running            0          7d
kube-apiserver-thalia0.ahc.umn.edu            1/1     Running            0          7d
kube-controller-manager-thalia0.ahc.umn.edu   1/1     Running            0          7d
kube-flannel-ds-amd64-78hbk                   1/1     Running            0          7d
kube-flannel-ds-amd64-gdttr                   1/1     Running            0          7d
kube-flannel-ds-amd64-rzhm2                   1/1     Running            0          7d
kube-flannel-ds-amd64-xc2n7                   1/1     Running            0          7d
kube-proxy-b86kn                              1/1     Running            0          7d
kube-proxy-g27sk                              1/1     Running            0          7d
kube-proxy-rtgtp                              1/1     Running            0          7d
kube-proxy-x2pp7                              1/1     Running            0          7d
kube-scheduler-thalia0.ahc.umn.edu            1/1     Running            0          7d
kubernetes-dashboard-7b7cb74c5c-wgt8f         1/1     Running            0          6d
metrics-server-85ff8f7b84-2x5th               0/1     CrashLoopBackOff   8          23m
- Ran kubectl -n kube-system logs $(kubectl get pods --namespace=kube-system -l k8s-app=metrics-server -o name) and got this output:
I0828 19:26:41.686932 1 heapster.go:71] /metrics-server --source=kubernetes:https://kubernetes.default
I0828 19:26:41.687023 1 heapster.go:72] Metrics Server version v0.2.1
I0828 19:26:41.687360 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version
I0828 19:26:41.687388 1 configs.go:62] Using kubelet port 10255
E0828 19:27:01.692571 1 kubelet.go:331] Failed to load nodes: Get https://kubernetes.default/api/v1/nodes: dial tcp: lookup kubernetes.default on 10.96.0.10:53: read udp 10.244.2.4:34644->10.96.0.10:53: read: no route to host
I0828 19:27:01.692700 1 heapster.go:128] Starting with Metric Sink
I0828 19:27:02.500852 1 serving.go:308] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
W0828 19:27:04.381151 1 authentication.go:222] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
F0828 19:27:04.381187 1 heapster.go:97] Could not create the API server: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 10.96.0.1:443: getsockopt: no route to host
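Both failures in the log above are "no route to host" on cluster service IPs (10.96.0.1:443 and 10.96.0.10:53). One way to narrow that down is to repeat the connection attempt from a pod on the affected node and see which error comes back. The following is only a minimal sketch, assuming Python is available in some debug pod (the post does not show this being run); the function name probe is hypothetical, and it tests TCP only, whereas the DNS lookup in the log failed over UDP.

```python
import errno
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and return a short label for the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout (packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is the errno behind the "no route to host" message.
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        return f"error: {exc}"
```

Calling probe("10.96.0.1", 443) and probe("10.96.0.10", 53) from a pod on each node would show whether the failure is specific to one node or affects the whole cluster. A "no route to host" result for an address that is otherwise routable often points to a firewall rule rejecting the traffic rather than to a missing route.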
I also looked through the various logs and noticed that in the flannel pods I was getting a large number of these errors:
E0829 19:41:32.636680 1 reflector.go:201] github.com/coreos/flannel/subnet/kube/kube.go:295: Failed to list *v1.Node: Get https://10.96.0.1:443/api/v1/nodes?resourceVersion=0: net/http: TLS handshake timeout
In addition, this error shows up in the scheduler pod:
E0829 19:41:32.637368 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:129: Failed to list *core.Service: Get https://134.84.53.162:6443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
EDIT 1
I rebuilt the cluster after tearing it down and adding a rule to the local firewall to allow port 443 (to try kubectl proxy).
The output of kubectl get services --namespace=kube-system is
NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP   15h
kubernetes-dashboard   ClusterIP   10.98.72.170   <none>        443/TCP         20m
metrics-server         ClusterIP   10.111.155.9   <none>        443/TCP         1m
Also of note: after the teardown and reinitialization of the cluster, neither the flannel pods nor the scheduler pod throw the error anymore. I only get the error in the metrics-server pod, along with this new error in the apiserver pod:
: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0830 20:43:38.101286 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
I0830 20:45:38.101548 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0830 20:45:38.101757 1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0830 20:45:38.101779 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0830 20:45:44.532250 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get https://10.111.155.9:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0830 20:45:48.894505 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0830 20:45:48.894693 1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
Also, digging into the warning W0828 19:27:04.381151 1 authentication.go:222] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
I ran kubectl get roles -n kube-system extension-apiserver-authentication-reader -o yaml and got the following output:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2018-08-30T00:58:35Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: extension-apiserver-authentication-reader
  namespace: kube-system
  resourceVersion: "132"
  selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/roles/extension-apiserver-authentication-reader
  uid: d2f1c80c-abef-11e8-95cc-005056891f42
rules:
- apiGroups:
  - ""
  resourceNames:
  - extension-apiserver-authentication
  resources:
  - configmaps
  verbs:
  - get
Finally, the output of kubectl get apiservice v1beta1.metrics.k8s.io -o yaml is
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: 2018-08-30T22:41:26Z
  name: v1beta1.metrics.k8s.io
  resourceVersion: "119754"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: d403e18f-aca5-11e8-95cc-005056891f42
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: 2018-08-30T22:41:26Z
    message: endpoints for service/metrics-server in "kube-system" have no addresses
    reason: MissingEndpoints
    status: "False"
    type: Available
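The Available condition in that status block is the part that says why the aggregated API is down. When checking this repeatedly while the pod restarts, it can help to pull out just that condition. This is a rough sketch, assuming the same object is fetched with -o json instead of -o yaml; the helper name availability is hypothetical.

```python
import json


def availability(apiservice_json: str):
    """Return (status, reason, message) of the Available condition, or None."""
    obj = json.loads(apiservice_json)
    # status.conditions is a list; look for the entry whose type is "Available".
    for cond in obj.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            return cond.get("status"), cond.get("reason"), cond.get("message")
    return None
```

Fed the object shown above, this would return the "False" status together with the MissingEndpoints reason, which matches the metrics-server pod never becoming ready.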
This looks like an obvious networking issue (firewall?), but I'm not sure how to proceed. Is this a flannel or coredns configuration problem?
Answer 1
I changed the CNI from flannel to calico, and that seems to have resolved the issues I was having (I also couldn't get the argo workflow controller to start on my cluster).