Nginx + php-fpm "504 Gateway Time-out" errors under almost zero load (on a test server)

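After 6 hours of debugging, I give up :|

We have an nginx + php-fpm + mysql setup on our LAN hosting nearly 100 WordPress sites (created and used by different designers/developers, all working on test WordPress setups).

We have been using nginx for a long time without any problems.

Today, all of a sudden, nginx started returning "504 Gateway Time-out"...

I checked the nginx error log for the virtual host...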

2010/09/06 21:24:24 [error] 12909#0: *349 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:11 [error] 12909#0: *349 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:11 [error] 12909#0: *443 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /info.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:12 [error] 12909#0: *443 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:08:32 [error] 12909#0: *1025 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:33 [error] 12909#0: *1025 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:40 [error] 12909#0: *1064 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /info.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:40 [error] 12909#0: *1064 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:24:44 [error] 12909#0: *1313 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:24:53 [error] 12909#0: *1313 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
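The log above mixes three different failures (110 timed out, 104 reset by peer, 111 refused), and it can help to count them per phase before guessing at a cause. Below is a minimal Python sketch of such a tally. The function name `summarize` and the regular expression are illustrative assumptions, not part of nginx, and the sample lines are trimmed copies of the entries above.

```python
import re
from collections import Counter

# Captures the text between the connection id (e.g. "*349") and ", client:".
# "what" is the failure, "phase" is what nginx was doing when it happened.
ERROR_RE = re.compile(r"\*\d+ (?P<what>.+?) while (?P<phase>.+?), client:")

def summarize(lines):
    """Count (failure, phase) pairs across nginx error-log lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[(match.group("what"), match.group("phase"))] += 1
    return counts

# Trimmed copies of log lines from this post (everything after "client:" cut).
sample = [
    "2010/09/06 21:24:24 [error] 12909#0: *349 upstream timed out "
    "(110: Connection timed out) while reading response header from upstream, client: 192.168.0.1",
    "2010/09/06 21:25:11 [error] 12909#0: *349 recv() failed "
    "(104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1",
    "2010/09/06 21:25:12 [error] 12909#0: *443 connect() failed "
    "(111: Connection refused) while connecting to upstream, client: 192.168.0.1",
]

result = summarize(sample)
for (what, phase), n in result.items():
    print(f"{n:3d}  {what}  [{phase}]")
```

In this log the resets and refusals land within a second of each other, which would be consistent with php-fpm being restarted while requests were still pending, though that is only a guess from the timestamps.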

Since I run php-fpm in TCP mode on port 9000, I ran "netstat | grep 9000" and noticed something unusual... (partial output pasted here for readability)

tcp        9      0 localhost:9000          localhost:36094         CLOSE_WAIT  14269/php5-fpm  
tcp        0      0 localhost:46664         localhost:9000          FIN_WAIT2   -               
tcp     1257      0 localhost:9000          localhost:36135         CLOSE_WAIT  -               
tcp     1257      0 localhost:9000          localhost:36125         CLOSE_WAIT  -               
tcp        9      0 localhost:9000          localhost:36102         CLOSE_WAIT  14268/php5-fpm  
tcp        0      0 localhost:46662         localhost:9000          FIN_WAIT2   -               
tcp      745      0 localhost:9000          localhost:46644         CLOSE_WAIT  -               
tcp        0      0 localhost:46658         localhost:9000          FIN_WAIT2   -               
tcp     1265      0 localhost:9000          localhost:46607         CLOSE_WAIT  -               
tcp        0      0 localhost:46672         localhost:9000          ESTABLISHED 12909/nginx: worker
tcp     1257      0 localhost:9000          localhost:36119         CLOSE_WAIT  -               
tcp     1265      0 localhost:9000          localhost:46613         CLOSE_WAIT  -               
tcp        0      0 localhost:46646         localhost:9000          FIN_WAIT2   -               
tcp     1257      0 localhost:9000          localhost:36137         CLOSE_WAIT  -               
tcp        0      0 localhost:46670         localhost:9000          ESTABLISHED 12909/nginx: worker
tcp     1265      0 localhost:9000          localhost:46619         CLOSE_WAIT  -               
tcp     1336      0 localhost:9000          localhost:46668         ESTABLISHED -               
tcp        0      0 localhost:46648         localhost:9000          FIN_WAIT2   -               
tcp     1336      0 localhost:9000          localhost:46670         ESTABLISHED -               
tcp        9      0 localhost:9000          localhost:36108         CLOSE_WAIT  14274/php5-fpm  
tcp     1336      0 localhost:9000          localhost:46684         ESTABLISHED -               
tcp        0      0 localhost:46674         localhost:9000          ESTABLISHED 12909/nginx: worker
tcp     1336      0 localhost:9000          localhost:46666         ESTABLISHED -               
tcp     1257      0 localhost:9000          localhost:46648         CLOSE_WAIT  -               
tcp     1336      0 localhost:9000          localhost:46678         ESTABLISHED -               
tcp        0      0 localhost:46668         localhost:9000          ESTABLISHED 12909/nginx: wo             

There are many "CLOSE_WAIT" and "FIN_WAIT2" pairs, like the one highlighted below (from the output above):

tcp     1337      0 localhost:9000          localhost:46680         CLOSE_WAIT  -               
tcp        0      0 localhost:46680         localhost:9000          FIN_WAIT2   -

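Note port 46680 above.

I enabled the mysql slow query log, but it did not help.

As of now, restarting php5-fpm every minute via a cron job (see the command below) keeps everything running "smoothly", but I hate patchwork and want to solve this properly...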
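To see the pattern at a glance, the netstat output can be tallied by state for the sockets whose local end is port 9000 (the php-fpm side). The Python sketch below is an illustration only: `tally_states` is a hypothetical helper, and it assumes the column layout shown above (protocol, Recv-Q, Send-Q, local address, foreign address, state).

```python
from collections import Counter

def tally_states(netstat_lines, port=9000):
    """For sockets whose local address is the given port, count TCP states
    and sum the bytes still waiting unread in their receive queues."""
    states = Counter()
    unread = Counter()
    for line in netstat_lines:
        fields = line.split()
        if len(fields) < 6 or fields[0] != "tcp":
            continue  # skip headers and anything that is not a TCP row
        recv_q, local, state = int(fields[1]), fields[3], fields[5]
        if local.endswith(f":{port}"):
            states[state] += 1
            unread[state] += recv_q
    return states, unread

# Four rows copied from the output in this post.
sample = [
    "tcp        9      0 localhost:9000          localhost:36094         CLOSE_WAIT  14269/php5-fpm",
    "tcp        0      0 localhost:46664         localhost:9000          FIN_WAIT2   -",
    "tcp     1257      0 localhost:9000          localhost:36135         CLOSE_WAIT  -",
    "tcp     1336      0 localhost:9000          localhost:46668         ESTABLISHED -",
]

states, unread = tally_states(sample)
for state in states:
    print(f"{state:12s} sockets={states[state]} unread_bytes={unread[state]}")
```

A CLOSE_WAIT socket on the port 9000 side means the peer (nginx) has already closed while the php-fpm side has not, and a non-zero Recv-Q there means a request was sent but never read. That would fit php-fpm workers being stuck or too busy to pick up new connections, although the output alone does not prove it.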

* * * * * service php5-fpm restart > /dev/null

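I searched Google extensively and found nothing that helped. As mentioned, this is a test server on a LAN: CPU load never goes above 0.10 and memory usage stays below 25% (the system has 2GB RAM and runs Ubuntu Server). So even if you find this too time-consuming to dig into, please help me with at least a hint.

Thanks in advance for your help.

-Rahul

(Note: this is a cross-post of http://forum.nginx.org/read.php?11,127694)

Update: I found the answer and posted it below.

Answer 1

I found the answer in a thread on the nginx forum: http://forum.nginx.org/read.php?2,127854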

In my case, the answer was to set:

request_terminate_timeout=30s

in the php-fpm configuration (usually /etc/php5/fpm/php-fpm.conf).

Note that you can use a value other than 30 seconds.

I chose it to match this value in the main php.ini file:

max_execution_time = 30
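If you want to confirm that the two values really match, a small check like the Python sketch below can do it. The helper names `to_seconds` and `read_setting` are hypothetical, the parsing is deliberately simple (plain `key = value` lines with `;` comments), and the two config strings are just the settings quoted above rather than real files. The unit suffixes s, m, h and d are the ones php-fpm accepts for time values, with seconds as the default.

```python
import re

def to_seconds(value):
    """Convert a php-fpm style time value such as '30s' or '2m' to seconds.
    A bare number is treated as seconds."""
    units = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    match = re.fullmatch(r"\s*(\d+)\s*([smhd]?)\s*", value)
    if not match:
        raise ValueError(f"unrecognized time value: {value!r}")
    number, unit = match.groups()
    return int(number) * units.get(unit, 1)

def read_setting(text, key):
    """Return the last uncommented 'key = value' found in ini-style text,
    or None if the key is not set."""
    found = None
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        if "=" in line:
            name, value = line.split("=", 1)
            if name.strip() == key:
                found = value.strip()
    return found

# Stand-ins for the contents of php-fpm.conf and php.ini.
fpm_conf = "request_terminate_timeout=30s\n"
php_ini = "; main php.ini\nmax_execution_time = 30\n"

fpm_limit = to_seconds(read_setting(fpm_conf, "request_terminate_timeout"))
php_limit = to_seconds(read_setting(php_ini, "max_execution_time"))
print("match" if fpm_limit == php_limit else "mismatch", fpm_limit, php_limit)
```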

Thanks, everyone. :-)

Answer 2

Here is how I solved my problem:

Make the following changes in the http { section of /etc/nginx/nginx.conf:

proxy_connect_timeout  600s;
proxy_send_timeout  600s;
proxy_read_timeout  600s;
fastcgi_send_timeout 600s;
fastcgi_read_timeout 600s;

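Then restart nginx:

/etc/init.d/nginx restart

Answer 3

If you are using PHP 5.3, increase the backlog.

If you are using PHP 5.2, backport the patch that raises the backlog from 128.

Also, use a Unix socket instead of a TCP socket: unix:/tmp/php5-cgi.sock (or the relevant path).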

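Answer 4

In my case (same nginx error message), some buggy PHP scripts never finished executing and sat waiting on something, so nginx eventually had no free php5-fpm children left to pick.

Fix:

  1. Add the execution time limit that others have mentioned in this thread: request_terminate_timeout=30s
  2. Increase the number of children: pm.max_spare_servers=16 pm.min_spare_servers=2

Now everything runs smoothly.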
