Время от времени, по непонятной мне причине (я очень неопытный админ), nginx автоматически перезапускается, не может этого сделать, что приводит к прерыванию обслуживания. Это произошло сегодня утром:
$ journalctl -u nginx
-- Logs begin at Mon 2018-09-03 11:46:24 CEST, end at Tue 2018-09-04 09:30:22 CEST. --
Sep 04 07:55:21 vpsXXXXXX.ovh.net systemd[1]: Stopping A high performance web server and a reverse proxy server...
Sep 04 07:55:21 vpsXXXXXX.ovh.net systemd[1]: Stopped A high performance web server and a reverse proxy server.
Sep 04 07:55:27 vpsXXXXXX.ovh.net systemd[1]: Starting A high performance web server and a reverse proxy server...
Sep 04 07:55:27 vpsXXXXXX.ovh.net nginx[29333]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 07:55:27 vpsXXXXXX.ovh.net nginx[29335]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] still could not bind()
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Control process exited, code=exited status=1
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: Failed to start A high performance web server and a reverse proxy server.
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Unit entered failed state.
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Failed with result 'exit-code'.
Спустя 1 час 30 минут я понимаю, что мой веб-сервер не работает, поэтому я просто включаю его, systemctl restart nginx
и все работает нормально.
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: Starting A high performance web server and a reverse proxy server...
Sep 04 09:23:48 vpsXXXXXX.ovh.net nginx[30003]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 09:23:48 vpsXXXXXX.ovh.net nginx[30004]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: Started A high performance web server and a reverse proxy server.
К сожалению, я не записал последний раз, когда это произошло, и журналы, по-видимому, слишком старые:
$ zgrep "bind()" /var/log/nginx/*
(just this morning's episode)
... но я уверен, что у меня была похожая проблема по крайней мере 2 или 3 раза в этом году (что приемлемо, но раздражает).
Похоже, это не связано с перезапуском сервера:
$ uptime
09:43:37 up 12 days, 7:00, 1 user, load average: 0.00, 0.00, 0.00
Это мой файл crontab пользователя root. Я не думаю, что это связано, так как расписание не совпадает, но поскольку я не могу придумать, где его искать, вот он:
# let's encrypt renewal
0 5 1 * * certbot renew --authenticator standalone --installer nginx --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" -n
# automatically update debian
0 5 * * 1 apt -qq update && apt dist-upgrade -qq -y
# couldn't find another way to make sure that datetime.today() actually returns today's date in my flask apps
0 0 * * * systemctl reload uwsgi.service
Чем это вызвано?
РЕДАКТИРОВАТЬ: В /var/log/nginx/error.log
, я могу найти немного больше информации:
2018/09/04 07:55:19 [warn] 29269#29269: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:19 [info] 29269#29269: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:20 [warn] 29271#29271: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:20 [info] 29271#29271: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:20 [warn] 29273#29273: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:20 [info] 29273#29273: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:26 [warn] 29305#29305: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:26 [notice] 29305#29305: signal process started
2018/09/04 07:55:26 [error] 29305#29305: open() "/run/nginx.pid" failed (2: No such file or directory)
2018/09/04 07:55:26 [warn] 29306#29306: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:27 [warn] 29333#29333: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
(then the same lines journalctl displays)
Почему это open() "/run/nginx.pid" failed (2: No such file or directory)
произошло вдруг?
ПРАВКА2: Это случилось снова сегодня утром:
$ cat /var/log/nginx/error.log
2018/09/13 10:12:19 [warn] 7230#7230: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:19 [info] 7230#7230: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:31 [warn] 7243#7243: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:31 [notice] 7243#7243: signal process started
2018/09/13 10:12:31 [error] 7243#7243: open() "/run/nginx.pid" failed (2: No such file or directory)
2018/09/13 10:12:31 [warn] 7244#7244: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [warn] 7247#7247: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [info] 7247#7247: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:32 [warn] 7251#7251: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [info] 7251#7251: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:33 [warn] 7255#7255: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:33 [info] 7255#7255: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:33 [warn] 7256#7256: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: still could not bind()
2018/09/13 10:12:36 [alert] 7245#7245: unlink() "/run/nginx.pid" failed (2: No such file or directory)
-------- this is when I realized my web services are down and manually restart nginx
2018/09/13 11:25:11 [warn] 7578#7578: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 11:25:11 [info] 7578#7578: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 11:25:12 [warn] 7579#7579: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
Думаю, я просто добавлю cronjob, который будет каждые 5 минут проверять, nginx
запущено ли приложение, и запускать его, если нет, но это настолько грязно, что меня тянет блевать.