I am trying to set up a Kubernetes lab with Vagrant and Ansible. I am following this documentation on Ubuntu 16.04:
https://kubernetes.io/blog/2019/03/15/kubernetes-setup-using-ansible-and-vagrant/
I am running into several problems:
- The Vagrantfile does not initialize.
- The node IP address is not picked up correctly from the extra arguments passed to the node.
- Specifying the Calico version given in the documentation in the Ansible playbook fails.
For issue 1, here is the Vagrantfile:
IMAGE_NAME = "bento/ubuntu-16.04"
N = 2

Vagrant.configure("2") do |config|
  config.ssh.insert_key = false

  config.vm.provider "virtualbox" do |v|
    v.memory = 1024
    v.cpus = 2
  end

  config.vm.define "k8s-master" do |master|
    master.vm.box = IMAGE_NAME
    master.vm.network "private_network", ip: "192.168.50.10"
    master.vm.hostname = "k8s-master"
    master.vm.provision "ansible" do |ansible|
      ansible.playbook = "kubernetes-setup/master-playbook.yml"
      ansible.extra_vars = {
        node_ip: "192.168.50.10",
      }
    end
  end

  (1..N).each do |i|
    config.vm.define "node-#{i}" do |node|
      node.vm.box = IMAGE_NAME
      node.vm.network "private_network", ip: "192.168.50.#{i + 10}"
      node.vm.hostname = "node-#{i}"
      node.vm.provision "ansible" do |ansible|
        ansible.playbook = "kubernetes-setup/node-playbook.yml"
        ansible.extra_vars = {
          node_ip: "192.168.50.#{i + 10}",
        }
      end
    end
  end
end
Am I missing something in the Vagrantfile?
For issue 2, I see the following problem in master-playbook.yml; it fails at this step:
- name: Configure node ip
  lineinfile:
    path: /etc/default/kubelet
    line: KUBELET_EXTRA_ARGS=--node-ip={{ node_ip }}
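A side note on this step: with some kubeadm package versions, /etc/default/kubelet is not created at install time, and lineinfile fails when its path does not exist. A minimal sketch of a workaround (the create flag is my addition, not part of the original playbook) would be:

- name: Configure node ip
  lineinfile:
    path: /etc/default/kubelet
    line: KUBELET_EXTRA_ARGS=--node-ip={{ node_ip }}
    create: yes  # create the file if the kubeadm package did not ship it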
For issue 3, I found a problem with the Calico task inside the Ansible playbook:
- name: Install calico pod network
  become: false
  command: kubectl create -f https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml
That task fails with the following output:
TASK [Install calico pod network] **********************************************
fatal: [k8s-master]: FAILED! => {"changed": true, "cmd": ["kubectl", "create", "-f", "https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml"], "delta": "0:00:01.460979", "end": "2020-08-21 01:57:36.395550", "failed": true, "rc": 1, "start": "2020-08-21 01:57:34.934571", "stderr": "unable to recognize \"https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml\": no matches for kind \"DaemonSet\" in version \"extensions/v1beta1\"\nunable to recognize \"https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml\": no matches for kind \"Deployment\" in version \"extensions/v1beta1\"", "stderr_lines": ["unable to recognize \"https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml\": no matches for kind \"DaemonSet\" in version \"extensions/v1beta1\"", "unable to recognize \"https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml\": no matches for kind \"Deployment\" in version \"extensions/v1beta1\""], "stdout": "configmap/calico-config created\nsecret/calico-etcd-secrets created\nserviceaccount/calico-node created\nserviceaccount/calico-kube-controllers created\nclusterrole.rbac.authorization.k8s.io/calico-kube-controllers created\nclusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created\nclusterrole.rbac.authorization.k8s.io/calico-node created\nclusterrolebinding.rbac.authorization.k8s.io/calico-node created", "stdout_lines": ["configmap/calico-config created", "secret/calico-etcd-secrets created", "serviceaccount/calico-node created", "serviceaccount/calico-kube-controllers created", "clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created", "clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created", "clusterrole.rbac.authorization.k8s.io/calico-node created", "clusterrolebinding.rbac.authorization.k8s.io/calico-node created"]}
RUNNING HANDLER [docker status] ************************************************
to retry, use: --limit @/home/sto/Vagrant/Kubernetes/kubernetes-setup/master-playbook.retry
PLAY RECAP *********************************************************************
k8s-master : ok=15 changed=14 unreachable=0 failed=1
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
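For context on that stderr: the v3.4 manifest still declares its DaemonSet and Deployment under the extensions/v1beta1 API group, which was removed in Kubernetes 1.16, so a newer cluster creates the ConfigMap/RBAC objects but rejects those two kinds. Assuming the cluster here is 1.16 or later, a compatible manifest has to declare them differently:

# What the v3.4 manifest ships (rejected on 1.16+):
apiVersion: extensions/v1beta1
kind: DaemonSet

# What the API server now expects:
apiVersion: apps/v1
kind: DaemonSet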
Interestingly, even though it fails at that task, if I choose Calico version 3.14 instead:
- name: Install calico pod network
  become: false
  command: kubectl create -f https://docs.projectcalico.org/v3.14/getting-started/kubernetes/installation/hosted/calico.yaml
then, with the URL swapped from 3.4 to 3.14, the CNI resources do get created, but they keep failing inside the Vagrant machines:
vagrant@k8s-master:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS              RESTARTS   AGE
kube-system   calico-kube-controllers-6bb5db574-p2w9c   0/1     CrashLoopBackOff    10         31m
kube-system   calico-node-6wsm7                          0/1     CrashLoopBackOff    7          13m
kube-system   calico-node-flf89                          0/1     Running             12         31m
kube-system   calico-node-fwk84                          0/1     CrashLoopBackOff    6          12m
kube-system   coredns-66bff467f8-cdrb8                   0/1     ContainerCreating   0          31m
kube-system   coredns-66bff467f8-lgcf8                   0/1     ContainerCreating   0          31m
kube-system   etcd-k8s-master                            1/1     Running             0          31m
kube-system   kube-apiserver-k8s-master                  1/1     Running             0          31m
kube-system   kube-controller-manager-k8s-master         1/1     Running             1          31m
kube-system   kube-proxy-79pw8                           1/1     Running             0          31m
kube-system   kube-proxy-g8gnm                           1/1     Running             0          12m
kube-system   kube-proxy-tvwlq                           1/1     Running             0          13m
kube-system   kube-scheduler-k8s-master                  1/1     Running             3          31m
Here are the logs:
vagrant@k8s-master:~$ kubectl logs -f calico-node-4q2xz -n kube-system
2020-08-22 21:35:41.194 [INFO][8] startup/startup.go 299: Early log level set to info
2020-08-22 21:35:41.194 [INFO][8] startup/startup.go 319: Using HOSTNAME environment (lowercase) for node name
2020-08-22 21:35:41.194 [INFO][8] startup/startup.go 327: Determined node name: node-1
2020-08-22 21:35:41.195 [INFO][8] startup/startup.go 106: Skipping datastore connection test
vagrant@k8s-master:~$ kubectl logs -f calico-kube-controllers-6bb5db574-m6b5j -n kube-system
2020-08-22 21:32:06.445 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"etcdv3"}
I0822 21:32:06.513751 1 client.go:357] parsed scheme: "endpoint"
I0822 21:32:06.514122 1 endpoint.go:68] ccResolverWrapper: sending new addresses to cc: [{http://<ETCD_IP>:<ETCD_PORT> 0 <nil>}]
W0822 21:32:06.515723 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
W0822 21:32:06.538019 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {http://<ETCD_IP>:<ETCD_PORT> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: address http://<ETCD_IP>:<ETCD_PORT>: too many colons in address". Reconnecting...
2020-08-22 21:32:06.568 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
W0822 21:32:07.541048 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {http://<ETCD_IP>:<ETCD_PORT> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: address http://<ETCD_IP>:<ETCD_PORT>: too many colons in address". Reconnecting...
W0822 21:32:09.309376 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {http://<ETCD_IP>:<ETCD_PORT> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: address http://<ETCD_IP>:<ETCD_PORT>: too many colons in address". Reconnecting...
W0822 21:32:12.223003 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {http://<ETCD_IP>:<ETCD_PORT> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: address http://<ETCD_IP>:<ETCD_PORT>: too many colons in address". Reconnecting...
W0822 21:32:15.736676 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {http://<ETCD_IP>:<ETCD_PORT> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: address http://<ETCD_IP>:<ETCD_PORT>: too many colons in address". Reconnecting...
{"level":"warn","ts":"2020-08-22T21:32:16.668Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-286ade6d-d04a-4dac-848b-231ef01101d2/http://<ETCD_IP>:<ETCD_PORT>","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: address http://<ETCD_IP>:<ETCD_PORT>: too many colons in address\""}
2020-08-22 21:32:16.877 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=context deadline exceeded
2020-08-22 21:32:16.907 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=context deadline exceeded
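The dial target in those logs is the literal placeholder http://<ETCD_IP>:<ETCD_PORT> (hence "too many colons in address"), and the controller config shows DatastoreType:"etcdv3": this is the etcd-datastore variant of the Calico manifest with its etcd endpoints never filled in. A sketch of an alternative, assuming the Kubernetes-API-datastore manifest path that Calico published for v3.14 (a URL I am supplying, not one from the original post), would be:

- name: Install calico pod network
  become: false
  # Kubernetes-API datastore variant: no external etcd endpoints to configure
  command: kubectl create -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml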
Could this be related to how the API server is provisioned by the kubeadm task inside master-playbook.yml?
- name: Initialize the Kubernetes cluster using kubeadm
  command: kubeadm init --apiserver-advertise-address="192.168.20.10" --apiserver-cert-extra-sans="192.168.20.10" --node-name k8s-master --pod-network-cidr=192.168.0.0/16
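One detail worth flagging here: the Vagrantfile above puts the master on 192.168.50.10, while this task advertises 192.168.20.10. Assuming the advertise address is meant to match the master's private_network IP, the task would presumably read:

- name: Initialize the Kubernetes cluster using kubeadm
  # 192.168.50.10 is the master IP assigned in the Vagrantfile above
  command: kubeadm init --apiserver-advertise-address="192.168.50.10" --apiserver-cert-extra-sans="192.168.50.10" --node-name k8s-master --pod-network-cidr=192.168.0.0/16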
Here is a link to a repo you can clone in an Ubuntu 16.04 environment; you will more than likely run into the same problems. Since this post was flagged for deletion, I decided to reword it and make it more useful, as I still think it is worth having on Stack Overflow. People with more experience than me have had plenty of trouble with this Kubernetes guide and have even asked me about it.
Here is the link to it: single lab with Vagrant and Ansible
Make sure you have the following tree structure when you try this. My main problem now is basically how to get these Calico pods running:
sto@suplab02:~/Vagrant/Kubernetes$ tree
.
├── connect.sh
├── init.sh
├── join.sh
├── kubernetes-setup
│   ├── master-playbook.yml
│   └── node-playbook.yml
├── rename_roles.sh
└── Vagrantfile