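I am using kubeadm over flannel, with 4 nodes running RHEL 7.
I did the following:
- Opened port 10250 on all nodes
- Applied the fix for the "Failed to get kubernetes address: No kubernetes source found" issue
- Then ran kubectl create -f deploy/1.8+/
- Then ran kubectl get pods -n=kube-system, which shows the metrics server going into CrashLoopBackOff: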
NAME                                          READY   STATUS             RESTARTS   AGE
coredns-78fcdf6894-4q7ct                      1/1     Running            10         7d
coredns-78fcdf6894-7tj52                      1/1     Running            10         7d
etcd-thalia0.ahc.umn.edu                      1/1     Running            0          7d
kube-apiserver-thalia0.ahc.umn.edu            1/1     Running            0          7d
kube-controller-manager-thalia0.ahc.umn.edu   1/1     Running            0          7d
kube-flannel-ds-amd64-78hbk                   1/1     Running            0          7d
kube-flannel-ds-amd64-gdttr                   1/1     Running            0          7d
kube-flannel-ds-amd64-rzhm2                   1/1     Running            0          7d
kube-flannel-ds-amd64-xc2n7                   1/1     Running            0          7d
kube-proxy-b86kn                              1/1     Running            0          7d
kube-proxy-g27sk                              1/1     Running            0          7d
kube-proxy-rtgtp                              1/1     Running            0          7d
kube-proxy-x2pp7                              1/1     Running            0          7d
kube-scheduler-thalia0.ahc.umn.edu            1/1     Running            0          7d
kubernetes-dashboard-7b7cb74c5c-wgt8f         1/1     Running            0          6d
metrics-server-85ff8f7b84-2x5th               0/1     CrashLoopBackOff   8          23m
- Then ran kubectl -n kube-system logs $(kubectl get pods --namespace=kube-system -l k8s-app=metrics-server -o name)
and got the output:
I0828 19:26:41.686932 1 heapster.go:71] /metrics-server --source=kubernetes:https://kubernetes.default
I0828 19:26:41.687023 1 heapster.go:72] Metrics Server version v0.2.1
I0828 19:26:41.687360 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version
I0828 19:26:41.687388 1 configs.go:62] Using kubelet port 10255
E0828 19:27:01.692571 1 kubelet.go:331] Failed to load nodes: Get https://kubernetes.default/api/v1/nodes: dial tcp: lookup kubernetes.default on 10.96.0.10:53: read udp 10.244.2.4:34644->10.96.0.10:53: read: no route to host
I0828 19:27:01.692700 1 heapster.go:128] Starting with Metric Sink
I0828 19:27:02.500852 1 serving.go:308] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
W0828 19:27:04.381151 1 authentication.go:222] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
F0828 19:27:04.381187 1 heapster.go:97] Could not create the API server: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 10.96.0.1:443: getsockopt: no route to host
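The fatal line in the log above is a plain TCP failure toward the service IP 10.96.0.1:443, so the same check can be reproduced outside metrics-server. Below is a minimal Python sketch, assuming it is run from a node or from a pod inside the cluster. probe_tcp is a hypothetical helper name, not part of any Kubernetes tooling, and it only covers TCP, not the UDP DNS lookup that also failed.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect and return a short description of the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout (packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is what Go reports as "no route to host".
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        return "error: {}".format(exc)
```

Calling probe_tcp("10.96.0.1", 443) on each node would show whether the failure is limited to some nodes. A "no route to host" result on a host running a local firewall is often caused by a reject rule rather than by a missing route, though that is an assumption to verify on the nodes themselves.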
I also looked through various logs and noticed that for the flannel pods I get a lot of these errors:
E0829 19:41:32.636680 1 reflector.go:201] github.com/coreos/flannel/subnet/kube/kube.go:295: Failed to list *v1.Node: Get https://10.96.0.1:443/api/v1/nodes?resourceVersion=0: net/http: TLS handshake timeout
Also, on the scheduler pod I get this error:
E0829 19:41:32.637368 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:129: Failed to list *core.Service: Get https://134.84.53.162:6443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
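A "TLS handshake timeout" differs from "no route to host": the TCP connection was established, but the TLS handshake did not finish in time. The two cases can be told apart with a small probe. This is a minimal Python sketch under stated assumptions: probe_tls is a hypothetical helper, certificate verification is deliberately disabled because only reachability is being tested, and it should be run from the node where the errors appear.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Connect over TCP, then attempt a TLS handshake, and describe the result."""
    ctx = ssl.create_default_context()
    # Reachability test only: do not validate the server certificate.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return "tcp connect failed: {}".format(exc)
    try:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return "handshake ok ({})".format(tls.version())
    except socket.timeout:
        # TCP is up, but the server side never completed the handshake.
        return "handshake timeout"
    except (ssl.SSLError, OSError) as exc:
        return "handshake failed: {}".format(exc)
    finally:
        raw.close()
```

For the addresses in the logs above, that would be probe_tls("10.96.0.1", 443) and probe_tls("134.84.53.162", 6443).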
Edit 1
I tore down the cluster and rebuilt it after adding a rule to the local firewall to allow port 443 (for handling kubectl proxy).
The output of kubectl get services --namespace=kube-system is:
NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP   15h
kubernetes-dashboard   ClusterIP   10.98.72.170   <none>        443/TCP         20m
metrics-server         ClusterIP   10.111.155.9   <none>        443/TCP         1m
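Also worth noting: after the cluster teardown and re-initialization, neither the flannel pods nor the scheduler pod throw errors anymore. I only get errors on the metrics-server pod, plus this new error on the apiserver pod: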
: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0830 20:43:38.101286 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
I0830 20:45:38.101548 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0830 20:45:38.101757 1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0830 20:45:38.101779 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0830 20:45:44.532250 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get https://10.111.155.9:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0830 20:45:48.894505 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0830 20:45:48.894693 1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
Additionally, digging into the warning W0828 19:27:04.381151 1 authentication.go:222] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
I ran kubectl get roles -n kube-system extension-apiserver-authentication-reader -o yaml
and got the following output:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2018-08-30T00:58:35Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: extension-apiserver-authentication-reader
  namespace: kube-system
  resourceVersion: "132"
  selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/roles/extension-apiserver-authentication-reader
  uid: d2f1c80c-abef-11e8-95cc-005056891f42
rules:
- apiGroups:
  - ""
  resourceNames:
  - extension-apiserver-authentication
  resources:
  - configmaps
  verbs:
  - get
Finally, the output of kubectl get apiservice v1beta1.metrics.k8s.io -o yaml is:
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: 2018-08-30T22:41:26Z
  name: v1beta1.metrics.k8s.io
  resourceVersion: "119754"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: d403e18f-aca5-11e8-95cc-005056891f42
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: 2018-08-30T22:41:26Z
    message: endpoints for service/metrics-server in "kube-system" have no addresses
    reason: MissingEndpoints
    status: "False"
    type: Available
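The useful part of the APIService object is its status.conditions list, where the Available condition carries the reason (here MissingEndpoints). A minimal Python sketch for pulling that out of the same object in JSON form, for example from kubectl get apiservice v1beta1.metrics.k8s.io -o json. The function names are hypothetical and chosen only for illustration.

```python
import json


def failed_conditions(apiservice):
    """Return (type, reason, message) for each condition whose status is not "True"."""
    conditions = apiservice.get("status", {}).get("conditions", [])
    return [
        (c.get("type"), c.get("reason"), c.get("message"))
        for c in conditions
        if c.get("status") != "True"
    ]


def failed_conditions_from_json(text):
    """Same as failed_conditions, but starting from the raw JSON text."""
    return failed_conditions(json.loads(text))
```

An empty list means every reported condition is "True". With the object shown above, the only entry would be the Available condition with reason MissingEndpoints, which is consistent with the metrics-server pod never becoming ready.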
This looks like an obvious networking issue (firewall?), but I'm not sure how to proceed. Is this a flannel problem or a coredns configuration problem?
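Answer 1
I switched the CNI from flannel to calico, which seems to have resolved the issues I was having (I also couldn't get the argo workflow controller to start in my cluster).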