![Ansible gathering facts fails to run the findmnt command for some hosts](https://rvso.com/image/164702/Ansible%20%D0%A1%D0%B1%D0%BE%D1%80%20%D1%84%D0%B0%D0%BA%D1%82%D0%BE%D0%B2%20%D0%BD%D0%B5%20%D1%83%D0%B4%D0%B0%D0%B5%D1%82%D1%81%D1%8F%20%D0%B2%D1%8B%D0%BF%D0%BE%D0%BB%D0%BD%D0%B8%D1%82%D1%8C%20%D0%BA%D0%BE%D0%BC%D0%B0%D0%BD%D0%B4%D1%83%20findmnt%20%D0%B4%D0%BB%D1%8F%20%D0%BD%D0%B5%D0%BA%D0%BE%D1%82%D0%BE%D1%80%D1%8B%D1%85%20%D1%85%D0%BE%D1%81%D1%82%D0%BE%D0%B2.png)
ANSIBLE VERSION
ansible 2.4.6.0
config file = /home/xxxxxx/ansible.cfg
configured module search path = [u'/home/xxxxxx/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Aug 7 2019, 00:51:29) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
CONFIGURATION
cat ~/.ansible.cfg
[defaults]
host_key_checking = False
forks = 5
log_path = /home/userid/ansible.log
[ssh_connection]
pipelining = true
grep ^[^#] /etc/ansible/ansible.cfg
[defaults]
roles_path = /etc/ansible/roles:/usr/share/ansible/roles
host_key_checking = False
OS / ENVIRONMENT
Client: CentOS Linux release 7.5.1804 (Core)
STEPS TO REPRODUCE
All Ansible playbooks work fine, except for anything that relies on Gather facts. The Gather facts module, and anything that references it, hangs.
Example command:
ansible all -i ansible/inventory/inventory -m setup -u userid -k -K -vvv
ACTUAL RESULTS
ansible all -i ansible/inventory/inventory-file -m setup -u userid -k -K --limit="130.100.136.118,130.100.136.114" -vvv
ansible 2.4.6.0
config file = /home/userid/ansible.cfg
configured module search path = [u'/home/userid/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Aug 7 2019, 00:51:29) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
Using /home/userid/ansible.cfg as config file
SSH password:
SUDO password[defaults to SSH password]:
Parsed /home/userid/ansible/inventory/dop-poc-ibm inventory source with ini plugin
META: ran handlers
Using module file /usr/lib/python2.7/site-packages/ansible/modules/system/setup.py
<130.100.136.114> ESTABLISH SSH CONNECTION FOR USER: userid
Using module file /usr/lib/python2.7/site-packages/ansible/modules/system/setup.py
<130.100.136.118> ESTABLISH SSH CONNECTION FOR USER: userid
<130.100.136.114> SSH: EXEC sshpass -d14 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=userid -o ConnectTimeout=10 -o ControlPath=/home/userid/.ansible/cp/1f9f8629ab 130.100.136.114 '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
<130.100.136.118> SSH: EXEC sshpass -d15 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=userid -o ConnectTimeout=10 -o ControlPath=/home/userid/.ansible/cp/e3a887b653 130.100.136.118 '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
<130.100.136.114> (1, '\n{"exception": "Traceback (most recent call last):\n File "/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/basic.py", line 2786, in run_command\n cmd = subprocess.Popen(args, **kwargs)\n File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__\n errread, errwrite)\n File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child\n data = _eintr_retry_call(os.read, errpipe_read, 1048576)\n File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call\n return func(*args)\n File "/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/facts/timeout.py", line 37, in _handle_timeout\n raise TimeoutError(msg)\nTimeoutError: Timer expired after 10 seconds\n", "cmd": "/usr/bin/findmnt --list --noheadings --notruncate", "failed": true, "rc": 257, "invocation": {"module_args": {"filter": "*", "gather_subset": ["all"], "fact_path": "/etc/ansible/facts.d", "gather_timeout": 10}}, "msg": "Timer expired after 10 seconds"}\n', '')
The full traceback is:
Traceback (most recent call last):
File "/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/basic.py", line 2786, in run_command
cmd = subprocess.Popen(args, **kwargs)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call
return func(*args)
File "/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/facts/timeout.py", line 37, in _handle_timeout
raise TimeoutError(msg)
TimeoutError: Timer expired after 10 seconds
130.100.136.114 | FAILED! => {
"changed": false,
"cmd": "/usr/bin/findmnt --list --noheadings --notruncate",
"failed": true,
"invocation": {
"module_args": {
"fact_path": "/etc/ansible/facts.d",
"filter": "*",
"gather_subset": [
"all"
],
"gather_timeout": 10
}
},
"msg": "Timer expired after 10 seconds",
"rc": 257
}
<130.100.136.118> (1, '\n{"exception": "Traceback (most recent call last):\n File "/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/basic.py", line 2786, in run_command\n cmd = subprocess.Popen(args, **kwargs)\n File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__\n errread, errwrite)\n File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child\n data = _eintr_retry_call(os.read, errpipe_read, 1048576)\n File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call\n return func(*args)\n File "/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/facts/timeout.py", line 37, in _handle_timeout\n raise TimeoutError(msg)\nTimeoutError: Timer expired after 10 seconds\n", "cmd": "/usr/bin/findmnt --list --noheadings --notruncate", "failed": true, "rc": 257, "invocation": {"module_args": {"filter": "*", "gather_subset": ["all"], "fact_path": "/etc/ansible/facts.d", "gather_timeout": 10}}, "msg": "Timer expired after 10 seconds"}\n', '')
The full traceback is:
Traceback (most recent call last):
File "/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/basic.py", line 2786, in run_command
cmd = subprocess.Popen(args, **kwargs)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call
return func(*args)
File "/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/facts/timeout.py", line 37, in _handle_timeout
raise TimeoutError(msg)
TimeoutError: Timer expired after 10 seconds
130.100.136.118 | FAILED! => {
"changed": false,
"cmd": "/usr/bin/findmnt --list --noheadings --notruncate",
"failed": true,
"invocation": {
"module_args": {
"fact_path": "/etc/ansible/facts.d",
"filter": "*",
"gather_subset": [
"all"
],
"gather_timeout": 10
}
},
"msg": "Timer expired after 10 seconds",
"rc": 257
}
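The tracebacks above end in `_handle_timeout` while `subprocess` is still blocked in `os.read`, which is what a signal-based timer looks like: an alarm fires and the handler raises inside whatever call happens to be running. The following is a minimal sketch of that pattern, not a copy of Ansible's `timeout.py`. The names `FactTimeoutError`, `timeout`, `slow_collector` and `fast_collector` are hypothetical, and it assumes Python 3 on a Unix host, running in the main thread.

```python
import functools
import signal
import time


class FactTimeoutError(Exception):
    """Hypothetical stand-in for the TimeoutError seen in the traceback."""


def timeout(seconds):
    """Abort the decorated call if it runs longer than `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def _handle_timeout(signum, frame):
                # Raised inside whatever call is blocking when the alarm fires.
                raise FactTimeoutError("Timer expired after %s seconds" % seconds)

            previous = signal.signal(signal.SIGALRM, _handle_timeout)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
                signal.signal(signal.SIGALRM, previous)  # restore old handler
        return wrapper
    return decorator


@timeout(0.2)
def slow_collector():
    time.sleep(2)  # stands in for a command that does not return in time
    return "facts"


@timeout(0.2)
def fast_collector():
    return "facts"
```

Calling `slow_collector()` raises `FactTimeoutError`, while `fast_collector()` returns normally. The point of the sketch is that the error reports how long the timer was, not why the underlying command was slow.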
Steps taken
Increased gather_timeout to 20 or 30 in the ansible.cfg in my home folder. It didn't help.
Tried gather_subset = !all. It didn't help.
Running findmnt manually worked:
ansible -i ansible/inventory/inventory -u userid@domain --become -m shell -a '/usr/bin/findmnt --list --noheadings --notruncate' linux -k -K
I noticed that it takes a few seconds to return results.
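To put a number on "a few seconds", the same findmnt command can be timed directly on an affected host and compared with the 10-second limit shown in the error. This is a rough sketch, assuming Python 3 is available on that host. `time_command` and `FINDMNT_CMD` are hypothetical names introduced here, and the command line is copied from the error output.

```python
import subprocess
import time

# Command line taken from the "cmd" field of the failure output.
FINDMNT_CMD = ["/usr/bin/findmnt", "--list", "--noheadings", "--notruncate"]


def time_command(cmd, limit):
    """Run `cmd` and return (finished, elapsed_seconds, returncode).

    `finished` is False when the command was still running after `limit`
    seconds; it is then killed and the return code is reported as None.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        returncode = proc.wait(timeout=limit)
        finished = True
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the killed child
        returncode = None
        finished = False
    return finished, time.monotonic() - start, returncode
```

For example, `time_command(FINDMNT_CMD, limit=10)` returns the elapsed time next to a flag saying whether the command beat the limit. Running it several times may show whether the delay is constant or intermittent.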
Workaround for now
Commented out the following section in "/usr/lib/python2.7/site-packages/ansible/module_utils/facts/hardware/linux.py":
#def _run_findmnt(self, findmnt_path):
#    args = ['--list', '--noheadings', '--notruncate']
#    cmd = [findmnt_path] + args
#    rc, out, err = self.module.run_command(cmd, errors='surrogate_then_replace')
#    return rc, out, err
#def _find_bind_mounts(self):
#    bind_mounts = set()
#    findmnt_path = self.module.get_bin_path("findmnt")
#    if not findmnt_path:
#        return bind_mounts
#    rc, out, err = self._run_findmnt(findmnt_path)
#    if rc != 0:
#        return bind_mounts
#    # find bind mounts, in case /etc/mtab is a symlink to /proc/mounts
#    for line in out.splitlines():
#        fields = line.split()
#        # fields[0] is the TARGET, fields[1] is the SOURCE
#        if len(fields) < 2:
#            continue
#        # bind mounts will have a [/directory_name] in the SOURCE column
#        if self.BIND_MOUNT_RE.match(fields[1]):
#            bind_mounts.add(fields[0])
#    return bind_mounts
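For context on what the workaround switches off, here is a standalone sketch of the parsing step in `_find_bind_mounts`, run against canned text instead of a live findmnt call. The regular expression is an assumption, since the post does not show `BIND_MOUNT_RE`, and the `SAMPLE` text is made up for illustration.

```python
import re

# Assumption: the post does not show BIND_MOUNT_RE. This pattern simply
# accepts any SOURCE value that contains a closing "]".
BIND_MOUNT_RE = re.compile(r".*\]")


def find_bind_mounts(findmnt_output):
    """Return the set of TARGET paths whose SOURCE marks a bind mount."""
    bind_mounts = set()
    for line in findmnt_output.splitlines():
        fields = line.split()
        # fields[0] is the TARGET, fields[1] is the SOURCE
        if len(fields) < 2:
            continue
        # bind mounts carry a [/directory_name] suffix in the SOURCE column
        if BIND_MOUNT_RE.match(fields[1]):
            bind_mounts.add(fields[0])
    return bind_mounts


# Hypothetical sample shaped like `findmnt --list --noheadings --notruncate`.
SAMPLE = """\
/          /dev/mapper/centos-root         xfs  rw,relatime
/boot      /dev/sda1                       xfs  rw,relatime
/srv/data  /dev/mapper/centos-root[/data]  xfs  rw,relatime
"""
```

With this sample, `find_bind_mounts(SAMPLE)` picks out only `/srv/data`. The parsing itself is cheap, so the time is presumably spent in the findmnt call rather than in this loop.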
Solution 1
I'm not sure this is the problem, but I have had issues caused by stale NFS mounts. Could you SSH into one of the failing servers and check whether the df command runs without hanging, to rule that out?
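The same check can be scripted so that a hung mount does not also hang your shell. This is a rough sketch, assuming Python 3 on the affected host and a Linux-style `/proc/mounts`. The function names are hypothetical, and mount points containing spaces (escaped as `\040` in `/proc/mounts`) are not handled.

```python
import subprocess
import sys


def nfs_mount_points(mounts_text):
    """Return mount points of NFS filesystems from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Columns: device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points


def responds(path, limit=5.0):
    """Return True if a stat() of `path` completes within `limit` seconds.

    The stat runs in a child process so a hung mount cannot block the
    caller. A stat that fails quickly still counts as a response.
    """
    probe = [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path]
    proc = subprocess.Popen(
        probe, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=limit)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        return False


def stale_candidates(mounts_path="/proc/mounts", limit=5.0):
    """List NFS mount points that did not answer a stat() in time."""
    with open(mounts_path) as handle:
        points = nfs_mount_points(handle.read())
    return [point for point in points if not responds(point, limit)]
```

Calling `stale_candidates()` on a failing host returns the NFS mount points that did not answer in time. An empty list suggests stale NFS is not the cause, though it does not prove it.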
Solution 2
After upgrading Ansible to version 2.8, this no longer happened.