
I have a bash script that returns correct results when I run it manually as the nagios user. When it is configured as a check in Nagios, it fails with an UNKNOWN status.
Entries in the Nagios configuration:
define command {
    command_name    check_s3descrepency
    command_line    /usr/lib64/nagios/plugins/check_s3descrepency $ARG1$
}
define service {
    use                     generic-service
    service_description     Check android_event S3 descrepency
    host_name               nagios-server
    check_command           check_s3descrepency!android_event
}
[root@nagios nagios]# sudo -u nagios /usr/lib64/nagios/plugins/check_s3descrepency android_event
OK - 2014-05-07-22-00 Data size match with pre week 2014-04-30-22-00 by 102%
[root@nagios nagios]#
I enabled debugging; the resulting log follows.
[1399529584.757820] [016.0] [pid=1089] Checking service 'Check android_event S3 descrepency' on host 'nagios-server'...
[1399529584.757839] [2320.2] [pid=1089] Raw Command Input: /usr/lib64/nagios/plugins/check_s3descrepency $ARG1$
[1399529584.757891] [2320.2] [pid=1089] Expanded Command Output: /usr/lib64/nagios/plugins/check_s3descrepency $ARG1$
[1399529584.757916] [2048.1] [pid=1089] Processing: '/usr/lib64/nagios/plugins/check_s3descrepency $ARG1$'
[1399529584.757923] [2048.2] [pid=1089] Processing part: '/usr/lib64/nagios/plugins/check_s3descrepency '
[1399529584.757930] [2048.2] [pid=1089] Not currently in macro. Running output (46): '/usr/lib64/nagios/plugins/check_s3descrepency '
[1399529584.757957] [2048.2] [pid=1089] Uncleaned macro. Running output (66): '/usr/lib64/nagios/plugins/check_s3descrepency android_event'
[1399529584.757962] [2048.2] [pid=1089] Just finished macro. Running output (66): '/usr/lib64/nagios/plugins/check_s3descrepency android_event'
[1399529584.757972] [2048.2] [pid=1089] Not currently in macro. Running output (66): '/usr/lib64/nagios/plugins/check_s3descrepency android_event'
[1399529584.757977] [2048.1] [pid=1089] Done. Final output: '/usr/lib64/nagios/plugins/check_s3descrepency android_event'
[1399529586.011991] [016.1] [pid=1089] Handling check result for service 'Check android_event S3 descrepency' on host 'nagios-server'...
[1399529586.012012] [016.0] [pid=1089] ** Handling check result for service 'Check android_event S3 descrepency' on host 'nagios-server'...
[1399529586.012022] [016.1] [pid=1089] HOST: nagios-server, SERVICE: Check android_event S3 descrepency, CHECK TYPE: Active, OPTIONS: 1, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 3, OUTPUT: \nUNKNOWN - 2014-05-07-22-00 Data size match with pre week 2014-04-30-22-00 by %\n
[1399529586.012170] [016.1] [pid=1089] Checking service 'Check android_event S3 descrepency' on host 'nagios-server' for flapping...
[1399529586.012267] [032.0] [pid=1089] ** Service Notification Attempt ** Host: 'nagios-server', Service: 'Check android_event S3 descrepency', Type: 0, Options: 0, Current State: 3, Last Notification: Wed Dec 31 16:00:00 1969
[1399529586.012385] [016.0] [pid=1089] Scheduling a non-forced, active check of service 'Check android_event S3 descrepency' on host 'nagios-server' @ Wed May 7 23:23:04 2014
[1399529631.103930] [008.0] [pid=1089] ** Service Check Event ==> Host: 'nagios-server', Service: 'Check android_event S3 descrepency', Options: 0, Latency: 0.103000 sec
[1399529631.103946] [016.0] [pid=1089] Attempting to run scheduled check of service 'Check android_event S3 descrepency' on host 'nagios-server': check options=0, latency=0.103000
Answer 1
Thanks, Zoredache. It was an environment variable problem.
s3cmd was not in the PATH of the nagios user. I put the full path to s3cmd in the script and it worked like a charm.
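
A minimal sketch of that fix, assuming the plugin is a bash script: instead of trusting the PATH inherited from the Nagios daemon, the script resolves the tool against a fixed search path of its own and can report UNKNOWN (exit code 3) if it is missing. The helper name `resolve_tool`, the variable `SAFE_PATH`, and the directories listed in it are assumptions for illustration; the real location of s3cmd on the server may differ.

```shell
#!/bin/bash
# Sketch only: resolve_tool and SAFE_PATH are hypothetical names, and the
# directories below are assumed; adjust them to where s3cmd is installed.
SAFE_PATH=/usr/local/bin:/usr/bin:/bin

# Print the absolute path of a tool found in SAFE_PATH.
# Returns non-zero if the tool is not found there.
resolve_tool() {
    # The subshell keeps the PATH change from leaking into the caller.
    ( PATH=$SAFE_PATH; command -v "$1" )
}

# Intended use near the top of the plugin (left as comments here):
#   S3CMD=$(resolve_tool s3cmd) || {
#       echo "UNKNOWN - s3cmd not found in $SAFE_PATH"
#       exit 3
#   }
#   "$S3CMD" ...   # call s3cmd by its absolute path from here on
```

To reproduce this kind of failure outside Nagios, running the plugin with an emptied environment, for example `sudo -u nagios env -i /usr/lib64/nagios/plugins/check_s3descrepency android_event`, is usually closer to what the daemon does than a plain `sudo -u nagios`, which may carry over a PATH that the daemon does not have.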