Bei der Installation des RabbitMQ Helm-Diagramms in einem Kubernetes-Cluster schlägt die Verteilung des Erlang-Cookies an einen Knoten fehl

Bei der Installation des RabbitMQ Helm-Diagramms in einem Kubernetes-Cluster schlägt die Verteilung des Erlang-Cookies an einen Knoten fehl

Ich versuche, einen RabbitMQ-Cluster über das Bitnami Helm-Diagramm zu installieren (https://github.com/bitnami/charts/tree/master/bitnami/rabbitmq) in einem EKS-Cluster und wenn ich die Helm-Installation ausführe, erhalte ich im ersten erstellten Pod den folgenden Fehler:

rabbitmq 13:41:15.99
rabbitmq 13:41:15.99 Welcome to the Bitnami rabbitmq container
rabbitmq 13:41:15.99 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-rabbitmq
rabbitmq 13:41:15.99 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-rabbitmq/issues
rabbitmq 13:41:15.99
rabbitmq 13:41:15.99 INFO  ==> ** Starting RabbitMQ setup **
rabbitmq 13:41:16.01 INFO  ==> Validating settings in RABBITMQ_* env vars..
rabbitmq 13:41:16.03 INFO  ==> Initializing RabbitMQ...
rabbitmq 13:41:16.03 DEBUG ==> Creating environment file...
rabbitmq 13:41:16.03 DEBUG ==> Creating enabled_plugins file...
rabbitmq 13:41:16.04 DEBUG ==> Creating Erlang cookie...
rabbitmq 13:41:16.04 DEBUG ==> Ensuring expected directories/files exist...
rabbitmq 13:41:16.05 INFO  ==> Starting RabbitMQ in background...
Waiting for erlang distribution on node '[email protected]' while OS process '51' is running
2022-04-19 13:41:19.198340+00:00 [info] <0.222.0> Feature flags: list of feature flags found:
2022-04-19 13:41:19.212884+00:00 [info] <0.222.0> Feature flags:   [ ] implicit_default_bindings
2022-04-19 13:41:19.212941+00:00 [info] <0.222.0> Feature flags:   [ ] maintenance_mode_status
2022-04-19 13:41:19.212965+00:00 [info] <0.222.0> Feature flags:   [ ] quorum_queue
2022-04-19 13:41:19.212985+00:00 [info] <0.222.0> Feature flags:   [ ] stream_queue
2022-04-19 13:41:19.213077+00:00 [info] <0.222.0> Feature flags:   [ ] user_limits
2022-04-19 13:41:19.213104+00:00 [info] <0.222.0> Feature flags:   [ ] virtual_host_metadata
2022-04-19 13:41:19.213124+00:00 [info] <0.222.0> Feature flags: feature flag states written to disk: yes
2022-04-19 13:41:19.637051+00:00 [noti] <0.44.0> Application syslog exited with reason: stopped
2022-04-19 13:41:19.637148+00:00 [noti] <0.222.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2022-04-19 13:41:19.656264+00:00 [noti] <0.222.0> Logging: configured log handlers are now ACTIVE
2022-04-19 13:41:19.904087+00:00 [info] <0.222.0> ra: starting system quorum_queues
2022-04-19 13:41:19.904200+00:00 [info] <0.222.0> starting Ra system: quorum_queues in directory: /bitnami/rabbitmq/mnesia/rabbit@rabbitmq-0/quorum/rabbit@rabbitmq-0
2022-04-19 13:41:19.995094+00:00 [info] <0.263.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
2022-04-19 13:41:20.013384+00:00 [noti] <0.268.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
2022-04-19 13:41:20.022921+00:00 [info] <0.222.0> ra: starting system coordination
2022-04-19 13:41:20.022987+00:00 [info] <0.222.0> starting Ra system: coordination in directory: /bitnami/rabbitmq/mnesia/rabbit@rabbitmq-0/coordination/rabbit@rabbitmq-0
2022-04-19 13:41:20.026371+00:00 [info] <0.276.0> ra: meta data store initialised for system coordination. 0 record(s) recovered
2022-04-19 13:41:20.026628+00:00 [noti] <0.281.0> WAL: ra_coordination_log_wal init, open tbls: ra_coordination_log_open_mem_tables, closed tbls: ra_coordination_log_closed_mem_tables
2022-04-19 13:41:20.032159+00:00 [info] <0.222.0>
2022-04-19 13:41:20.032159+00:00 [info] <0.222.0>  Starting RabbitMQ 3.9.8 on Erlang 24.1.2 [jit]
2022-04-19 13:41:20.032159+00:00 [info] <0.222.0>  Copyright (c) 2007-2021 VMware, Inc. or its affiliates.
2022-04-19 13:41:20.032159+00:00 [info] <0.222.0>  Licensed under the MPL 2.0. Website: https://rabbitmq.com

  ##  ##      RabbitMQ 3.9.8
  ##  ##
  ##########  Copyright (c) 2007-2021 VMware, Inc. or its affiliates.
  ######  ##
  ##########  Licensed under the MPL 2.0. Website: https://rabbitmq.com

  Erlang:      24.1.2 [jit]
  TLS Library: OpenSSL - OpenSSL 1.1.1d  10 Sep 2019

  Doc guides:  https://rabbitmq.com/documentation.html
  Support:     https://rabbitmq.com/contact.html
  Tutorials:   https://rabbitmq.com/getstarted.html
  Monitoring:  https://rabbitmq.com/monitoring.html

  Logs: /opt/bitnami/rabbitmq/var/log/rabbitmq/rabbit@rabbitmq-0_upgrade.log
        <stdout>

  Config file(s): /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf

  Starting broker...2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>  node           : rabbit@rabbitmq-0
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>  home dir       : /opt/bitnami/rabbitmq/.rabbitmq
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>  config file(s) : /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>  cookie hash    : d3Nfp8t690Ln1h811Tuxzw==
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>  log(s)         : /opt/bitnami/rabbitmq/var/log/rabbitmq/rabbit@rabbitmq-0_upgrade.log
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>                 : <stdout>
2022-04-19 13:41:20.033907+00:00 [info] <0.222.0>  database dir   : /bitnami/rabbitmq/mnesia/rabbit@rabbitmq-0
2022-04-19 13:41:20.307590+00:00 [info] <0.222.0> Feature flags: list of feature flags found:
2022-04-19 13:41:20.307654+00:00 [info] <0.222.0> Feature flags:   [ ] drop_unroutable_metric
2022-04-19 13:41:20.307681+00:00 [info] <0.222.0> Feature flags:   [ ] empty_basic_get_metric
2022-04-19 13:41:20.307705+00:00 [info] <0.222.0> Feature flags:   [ ] implicit_default_bindings
2022-04-19 13:41:20.307792+00:00 [info] <0.222.0> Feature flags:   [ ] maintenance_mode_status
2022-04-19 13:41:20.307818+00:00 [info] <0.222.0> Feature flags:   [ ] quorum_queue
2022-04-19 13:41:20.307838+00:00 [info] <0.222.0> Feature flags:   [ ] stream_queue
2022-04-19 13:41:20.307908+00:00 [info] <0.222.0> Feature flags:   [ ] user_limits
2022-04-19 13:41:20.307947+00:00 [info] <0.222.0> Feature flags:   [ ] virtual_host_metadata
2022-04-19 13:41:20.307968+00:00 [info] <0.222.0> Feature flags: feature flag states written to disk: yes
Error: operation wait on node [email protected] timed out. Timeout value used: 5000
2022-04-19 13:41:23.299211+00:00 [info] <0.222.0> Running boot step pre_boot defined by app rabbit
2022-04-19 13:41:23.299295+00:00 [info] <0.222.0> Running boot step rabbit_global_counters defined by app rabbit
2022-04-19 13:41:23.299545+00:00 [info] <0.222.0> Running boot step rabbit_osiris_metrics defined by app rabbit
2022-04-19 13:41:23.299746+00:00 [info] <0.222.0> Running boot step rabbit_core_metrics defined by app rabbit
2022-04-19 13:41:23.300299+00:00 [info] <0.222.0> Running boot step rabbit_alarm defined by app rabbit
2022-04-19 13:41:23.304497+00:00 [info] <0.297.0> Memory high watermark set to 12695 MiB (13312088473 bytes) of 31738 MiB (33280221184 bytes) total
2022-04-19 13:41:23.308954+00:00 [info] <0.299.0> Enabling free disk space monitoring
2022-04-19 13:41:23.309007+00:00 [info] <0.299.0> Disk free limit set to 50MB
2022-04-19 13:41:23.312489+00:00 [info] <0.222.0> Running boot step code_server_cache defined by app rabbit
2022-04-19 13:41:23.312650+00:00 [info] <0.222.0> Running boot step file_handle_cache defined by app rabbit
2022-04-19 13:41:23.312958+00:00 [info] <0.302.0> Limiting to approx 65439 file handles (58893 sockets)
2022-04-19 13:41:23.313163+00:00 [info] <0.303.0> FHC read buffering: OFF
2022-04-19 13:41:23.313217+00:00 [info] <0.303.0> FHC write buffering: ON
2022-04-19 13:41:23.313829+00:00 [info] <0.222.0> Running boot step worker_pool defined by app rabbit
2022-04-19 13:41:23.313932+00:00 [info] <0.283.0> Will use 4 processes for default worker pool
2022-04-19 13:41:23.313982+00:00 [info] <0.283.0> Starting worker pool 'worker_pool' with 4 processes in it
2022-04-19 13:41:23.314583+00:00 [info] <0.222.0> Running boot step database defined by app rabbit
2022-04-19 13:41:23.314894+00:00 [info] <0.222.0> Node database directory at /bitnami/rabbitmq/mnesia/rabbit@rabbitmq-0 is empty. Assuming we need to join an existing cluster or initialise from scratch...
2022-04-19 13:41:23.314963+00:00 [info] <0.222.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2022-04-19 13:41:23.315110+00:00 [info] <0.222.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2022-04-19 13:41:23.316998+00:00 [noti] <0.44.0> Application mnesia exited with reason: stopped

BOOT FAILED
===========
Exception during startup:

2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0> BOOT FAILED
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0> ===========
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0> Exception during startup:
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0> error:{badmatch,{error,enoent}}
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_peer_discovery_k8s:make_request/0, line 121
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_peer_discovery_k8s:list_nodes/0, line 41
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_peer_discovery_k8s:lock/1, line 76
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_peer_discovery:lock/0, line 190
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_mnesia:init_with_lock/3, line 104
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_mnesia:init/0, line 76
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_boot_steps:-run_step/2-lc$^0/1-0-/2, line 41
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>     rabbit_boot_steps:run_step/2, line 46
2022-04-19 13:41:23.317269+00:00 [erro] <0.222.0>
error:{badmatch,{error,enoent}}

    rabbit_peer_discovery_k8s:make_request/0, line 121
    rabbit_peer_discovery_k8s:list_nodes/0, line 41
    rabbit_peer_discovery_k8s:lock/1, line 76
    rabbit_peer_discovery:lock/0, line 190
    rabbit_mnesia:init_with_lock/3, line 104
    rabbit_mnesia:init/0, line 76
    rabbit_boot_steps:-run_step/2-lc$^0/1-0-/2, line 41
    rabbit_boot_steps:run_step/2, line 46

2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>   crasher:
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     initial call: application_master:init/4
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     pid: <0.221.0>
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     registered_name: []
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     exception exit: {{badmatch,{error,enoent}},{rabbit,start,[normal,[]]}}
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>       in function  application_master:init/4 (application_master.erl, line 142)
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     ancestors: [<0.220.0>]
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     message_queue_len: 1
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     messages: [{'EXIT',<0.222.0>,normal}]
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     links: [<0.220.0>,<0.44.0>]
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     dictionary: []
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     trap_exit: true
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     status: running
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     heap_size: 2586
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     stack_size: 29
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>     reductions: 186
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>   neighbours:
2022-04-19 13:41:24.318598+00:00 [erro] <0.221.0>
2022-04-19 13:41:24.319087+00:00 [noti] <0.44.0> Application rabbit exited with reason: {{badmatch,{error,enoent}},{rabbit,start,[normal,[]]}}
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{badmatch,{error,enoent}},{rabbit,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{badmatch,{error,enoent}},{rabbit,start,[normal,[]]}}})

Crash dump is being written to: /opt/bitnami/rabbitmq/var/log/rabbitmq/erl_crash.dump...done
Waiting for erlang distribution on node '[email protected]' while OS process '51' is running
Error:
process_not_running
Waiting for erlang distribution on node '[email protected]' while OS process '51' is running
Error:
process_not_running

Es scheint, als ob das Erlang-Cookie nicht richtig verteilt wird, aber nach der Überprüfung einiger Beiträge bin ich zu keinem Schluss gekommen.

Wenn Sie über hilfreiche Informationen verfügen, wäre ich Ihnen dankbar, wenn Sie sie mit mir teilen würden.

BEARBEITEN 1: Ich bin in den ersten und einzigen Pod der drei Replikate gegangen, die erstellt werden müssen, habe nachgesehen, rabbitmq-diagnostics erlang_cookie_sourceswo die Erland-Cookie-Datei gespeichert ist (/opt/bitnami/rabbitmq/.rabbitmq/.erlang.cookie) und habe überprüft, ob es dieselbe ist, die ich in values.yaml des Diagramms angegeben habe, und es ist genau dieselbe, also denke ich, dass es letztendlich kein Problem mit der Verteilung des Schlüssels gibt, aber ich habe immer noch dasselbe Problem. Wenn ich die Protokolle noch einmal überprüfe, sehe ich, dass ein Prozess nicht ausgeführt wird. Ich weiß nicht, ob das Problem dort liegen sollte.

Antwort1

Das Problem war das Service-Account-Token, das nicht an die Pods verteilt wurde. Ich habe die values.yaml des Helm-Charts geändert:

serviceAccount:
  ## @param serviceAccount.create Enable creation of ServiceAccount for RabbitMQ pods
  ##
  create: true
  ## @param serviceAccount.name Name of the created serviceAccount
  ## If not set and create is true, a name is generated using the rabbitmq.fullname template
  ##
  #name: ""
  ## @param serviceAccount.automountServiceAccountToken Auto-mount the service account token in the pod
  ##
  automountServiceAccountToken: true

verwandte Informationen