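I'm having a problem with my Ubuntu 20.04 server: it loses its network connection after three days. The server boots using Dracut. After it has booted and obtained an IP address from the DHCP server, it is set up to stop NetworkManager. I did this because some people suggested that NetworkManager could be causing the problem.
I want the server to keep the IP address it obtained at boot.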
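Because the address comes from DHCP, it carries a lease lifetime, which `ip` reports as `valid_lft`. Below is a minimal sketch of how that remaining lifetime could be inspected, assuming the iproute2 `ip` tool is installed. The helper name `lease_lifetimes` is hypothetical and not part of my setup; it only parses text in the one-line format printed by `ip -4 -o addr show`.

```shell
# Hypothetical helper (not from the original setup): reads `ip -4 -o addr show`
# style lines on stdin and prints "<interface> <address/prefix> <valid_lft>".
lease_lifetimes() {
  awk '{
    addr = ""; lft = ""
    for (i = 1; i <= NF; i++) {
      if ($i == "inet") addr = $(i + 1)       # IPv4 address with prefix length
      if ($i == "valid_lft") lft = $(i + 1)   # remaining lifetime, or "forever"
    }
    if (addr != "") print $2, addr, lft       # $2 is the interface name
  }'
}

# Usage on a live system (assumes iproute2 is available):
#   ip -4 -o addr show | lease_lifetimes
```

An address with a finite `valid_lft` is removed by the kernel once the counter reaches zero unless something renews it, while `forever` means the address has no expiry.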
root@host:/var/log# systemctl list-units --all '*etwork*'
UNIT LOAD ACTIVE SUB DESCRIPTION
networkd-dispatcher.service loaded active running Dispatcher daemon for systemd-networkd
NetworkManager-wait-online.service loaded inactive dead Network Manager Wait Online
NetworkManager.service loaded inactive dead Network Manager
systemd-networkd.service loaded inactive dead Network Service
network-online.target loaded active active Network is Online
network-pre.target loaded inactive dead Network (Pre)
network.target loaded active active Network
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
7 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
The logs show that the machine suddenly could no longer reach InfluxDB on another machine, and after that I could no longer SSH in either. Journalctl log:
Aug 14 11:19:01 myhost pulseaudio[67379]: GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
...
Aug 14 11:58:05 myhost telegraf[3018]: 2020-08-14T09:58:05Z E! [outputs.influxdb] When writing to [http://someip]: Post http://someip/write?consistency=any&db=telegraf: net/http: request canceled (Client.Timeout exceeded while awai>
Aug 14 11:58:05 myhost telegraf[3018]: 2020-08-14T09:58:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
The only other relevant thing I see in the logs is:
Aug 14 10:34:08 myhost systemd-timesyncd[2889]: Timed out waiting for reply from 91.189.91.157:123 (ntp.ubuntu.com).
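However, this happens regularly, and I'm not sure whether it is related to this problem or whether some component of the network is simply blocking connections to that IP address.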