
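I am new to Sensu remediation. I am trying to restart a process using a custom script.
Scenario: if my HTTP link goes down, I want to run the script manually and bring it back up.
I set up Sensu remediation, which can automatically run a custom script when a monitoring check reports a problem. The issue I am facing is that all the checks and connections work fine, but when my link goes down, the Sensu remediator does not trigger anything on the client. I have posted the logs and configuration below; please tell me where I went wrong.
Here is the Sensu server log: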
{"timestamp":"2016-05-16T09:44:52.768622+0000","level":"info","message":"processing event","event":{"id":"9a9f66c2-e70e-45fb-87fb-c9e9085c8e05","client":{"name":"zubron","address":"10.0.0.110","subscriptions":["zubron"],"version":"0.20.3","timestamp":1463391880},"check":{"command":"/etc/sensu/plugins/check_http -H 10.0.0.110 -p 7077","interval":60,"occurrences":2,"handlers":["remediator"],"subscribers":["zubron"],"standalone":false,"remediation":{"remediate-zubron":{"occurrences":[1,3],"severities":[2]},"trigger_on":["zubron"]},"name":"check-zubron-port","issued":1463391892,"executed":1463391892,"duration":0.002,"output":"connect to address 10.0.0.110 and port 7077: Connection refused\nHTTP CRITICAL - Unable to open TCP socket\n","status":2,"history":["0","0","0","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2"],"total_state_change":4},"occurrences":18,"action":"create","timestamp":1463391892}}
{"timestamp":"2016-05-16T09:44:52.864908+0000","level":"info","message":"handler output","handler":{"command":"/etc/sensu/handlers/sensu.rb","type":"pipe","severities":["critical"],"name":"remediator"},"output":["/etc/sensu/handlers/sensu.rb:108:in `[]': can't convert String into Integer (TypeError)\n","\tfrom /etc/sensu/handlers/sensu.rb:108:in `block in parse_remediations'\n","\tfrom /etc/sensu/handlers/sensu.rb:106:in `each'\n","\tfrom /etc/sensu/handlers/sensu.rb:106:in `parse_remediations'\n","\tfrom /etc/sensu/handlers/sensu.rb:90:in `handle'\n","\tfrom /var/lib/gems/1.9.1/gems/sensu-plugin-1.2.0/lib/sensu-handler.rb:55:in `block in <class:Handler>'\n","REMEDIATION: Evaluating remediation: zubron {\"remediate-zubron\"=>{\"occurrences\"=>[1, 3], \"severities\"=>[2]}, \"trigger_on\"=>[\"zubron\"]} #=18 sev=2\n"]}
Here is my check file on the Sensu server:
{
  "checks": {
    "check-zubron-port": {
      "command": "/etc/sensu/plugins/check_http -H 10.0.0.110 -p 7077",
      "interval": 60,
      "occurrences": 2,
      "handlers": [
        "remediator"
      ],
      "subscribers": [
        "zubron"
      ],
      "standalone": false,
      "remediation": {
        "remediate-zubron": {
          "occurrences": [
            1,
            3
          ],
          "severities": [
            2
          ]
        },
        "trigger_on": [
          "zubron"
        ]
      }
    }
  }
}
Here is my remediation file:
{
  "remediate-zubron": {
    "command": "sudo /bin/bash ~/zubron/home/moofwd-zubron-server/bin/start-moofwd.sh",
    "handlers": [],
    "subscribers": [
      "zubron"
    ],
    "standalone": false,
    "publish": false
  }
}
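As for sensu.rb, I used the one from this link.
Is there anything I am missing?
Also, is there any other monitoring system that can run a script or command when something goes wrong?
I have already tried Nagios nectar and Monit.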
Answer 1
I see one error in your remediation file: you are missing the "checks" key.
It should look like this:
{
  "checks": {
    "remediate-zubron": {
      "command": "sudo /bin/bash ~/zubron/home/moofwd-zubron-server/bin/start-moofwd.sh",
      "handlers": [],
      "subscribers": [
        "zubron"
      ],
      "standalone": false,
      "publish": false
    }
  }
}
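If it helps, here is a rough Ruby sketch for checking that a config file has the top-level "checks" wrapper. The helper name `wrapped_in_checks?` is made up for illustration and is not part of Sensu, and the `"true"` commands are placeholders; the sketch only parses the JSON text and looks for the key.

```ruby
require 'json'

# Hypothetical helper (not part of Sensu): true when the JSON text has a
# top-level "checks" object that contains the named check.
def wrapped_in_checks?(json_text, check_name)
  config = JSON.parse(json_text)
  checks = config.is_a?(Hash) ? config['checks'] : nil
  checks.is_a?(Hash) && checks.key?(check_name)
end

# Shape of the remediation file as posted (no "checks" wrapper).
posted = '{"remediate-zubron": {"command": "true", "publish": false}}'
# Same definition wrapped in "checks"; the command is a placeholder.
fixed = '{"checks": {"remediate-zubron": {"command": "true", "publish": false}}}'

puts wrapped_in_checks?(posted, 'remediate-zubron') # prints false
puts wrapped_in_checks?(fixed, 'remediate-zubron')  # prints true
```

Running something like this against each file under your Sensu config directory is a quick way to spot a definition that sits at the wrong level.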
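Another possible problem is your client configuration: the client should be subscribed to its own name. For example, a client named university should list university in its subscriptions:
{
  "client": {
    "name": "university",
    "address": "IP ADDRESS",
    "subscriptions": [
      "linux",
      "web-server",
      "system",
      "university"
    ]
  }
}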
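To sanity-check that, a minimal sketch along these lines compares a client's subscriptions with the targets a check request would be published to. `reachable?` is a hypothetical helper, not a Sensu API, and the target lists are just examples.

```ruby
# Hypothetical helper (not part of Sensu): true when the client is subscribed
# to at least one of the targets a check request is published to.
def reachable?(client, targets)
  !(client.fetch('subscriptions', []) & targets).empty?
end

client = {
  'name' => 'university',
  'subscriptions' => %w[linux web-server system university]
}

puts reachable?(client, ['university']) # prints true: subscribed to its own name
puts reachable?(client, ['zubron'])     # prints false: no matching subscription
```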
Another problem could be the api_request(:POST) call in remediator.rb or sensu.rb. The method should look exactly like the one below. Some older versions of the code use '/checks/request' instead of '/request', which breaks it.
def trigger_remediation(check, subscribers)
  api_request(:POST, '/request') do |req|
    req.body = JSON.dump('check' => check, 'subscribers' => subscribers)
  end
end
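In some cases you need to fix both the second and the third issue (the client subscription and the API path).
Here is a reference link that should make the problem clearer: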
https://github.com/sensu/sensu-community-plugins/issues/1162
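One more observation, offered tentatively: the traceback in the posted server log ends in a TypeError inside parse_remediations, and the logged event shows "trigger_on" nested inside "remediation". If the handler walks every key under "remediation" and indexes each value like a hash (the traceback suggests this, but I have not verified it against your copy of sensu.rb), the array ["zubron"] would raise this kind of error. The snippet below is a simplified stand-in for such a loop, not the real handler code, and `matching_remediations` is a made-up name.

```ruby
# Simplified stand-in for a loop over check['remediation'];
# NOT the actual sensu.rb code.
def matching_remediations(remediations, severity)
  remediations.each_with_object([]) do |(name, conditions), matched|
    # Indexing an Array with a String key raises TypeError in Ruby.
    matched << name if (conditions['severities'] || []).include?(severity)
  end
end

# Same shape as the "remediation" hash in the logged event.
nested = {
  'remediate-zubron' => { 'occurrences' => [1, 3], 'severities' => [2] },
  'trigger_on' => ['zubron']
}

begin
  matching_remediations(nested, 2)
rescue TypeError => e
  puts "TypeError: #{e.message}" # exact wording depends on the Ruby version
end

# With "trigger_on" taken out of the remediation hash, the loop completes.
flat = nested.reject { |key, _| key == 'trigger_on' }
p matching_remediations(flat, 2)
```

If that is what is happening, moving "trigger_on" up one level, so that it sits next to "remediation" inside the check definition, would be worth trying.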