Proxmox reboots randomly

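I have a Proxmox cluster with two physical servers, pve and pve2. They are identical Dell R710s, each with 96 GB of memory and 1 TB of storage (RAID-10). For some reason I have not yet identified, pve2 power cycles. I checked the hardware logs through iDRAC and there are no alarms or errors.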

I am new to Proxmox, so I don't know where to look for error logs beyond the usual Linux places such as syslog and dmesg.

Here is part of my syslog from around the time of the reboot (the last entry before it is at Dec 30 16:54:01):

Dec 30 16:50:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:50:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:50:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:51:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:51:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:51:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:52:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:52:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:52:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:53:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:53:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:53:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:54:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:54:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:54:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:57:42 pve2 dmeventd[492]: dmeventd ready for processing.
Dec 30 16:57:42 pve2 kernel: [    0.000000] Linux version 5.4.73-1-pve (build@pve) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) ()
Dec 30 16:57:42 pve2 kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.73-1-pve root=/dev/mapper/pve-root ro quiet
Dec 30 16:57:42 pve2 kernel: [    0.000000] KERNEL supported cpus:
Dec 30 16:57:42 pve2 systemd-modules-load[483]: Inserted module 'iscsi_tcp'
Dec 30 16:57:42 pve2 kernel: [    0.000000]   Intel GenuineIntel
Dec 30 16:57:42 pve2 kernel: [    0.000000]   AMD AuthenticAMD
Dec 30 16:57:42 pve2 kernel: [    0.000000]   Hygon HygonGenuine
Dec 30 16:57:42 pve2 kernel: [    0.000000]   Centaur CentaurHauls
Dec 30 16:57:42 pve2 kernel: [    0.000000]   zhaoxin   Shanghai
Dec 30 16:57:42 pve2 systemd[1]: Starting Flush Journal to Persistent Storage...
Dec 30 16:57:42 pve2 kernel: [    0.000000] x86/fpu: x87 FPU will use FXSAVE
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-provided physical RAM map:
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bf378fff] usable
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000bf379000-0x00000000bf38efff] reserved
Dec 30 16:57:42 pve2 systemd[1]: Started udev Coldplug all Devices.
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000bf38f000-0x00000000bf3cdfff] ACPI data
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000bf3ce000-0x00000000bfffffff] reserved
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Dec 30 16:57:42 pve2 systemd[1]: Starting Helper to synchronize boot up for ifupdown...
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000ffffffff] reserved
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000183fffffff] usable
Dec 30 16:57:42 pve2 kernel: [    0.000000] NX (Execute Disable) protection: active
Dec 30 16:57:42 pve2 kernel: [    0.000000] SMBIOS 2.6 present.
Dec 30 16:57:42 pve2 kernel: [    0.000000] DMI: Dell Inc. PowerEdge R710/0Y7JM4, BIOS 6.3.0 07/24/2012

Any suggestions as to what it could be, or where I can look?

Answer 1

It could be a hardware fault; the most common one is bad RAM.

To find out, run memtest on the server.
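Before taking the box down for a memory test, it may be worth confirming from the syslog that the reboot really was abrupt. In the excerpt in the question, the last entry is at 16:54:01 and the kernel's first boot message appears at 16:57:42 with no shutdown messages in between, which is what a hard reset or power loss tends to look like. Below is a rough Python sketch of that check. The function names, the `SHUTDOWN_HINTS` strings, and the assumed year are my own assumptions, not anything Proxmox provides; adjust the hints to whatever your systemd actually logs during an orderly shutdown.

```python
import re
from datetime import datetime

# First kernel message of a boot, as it appears in the excerpt in the question.
BOOT_MARKER = re.compile(r"kernel: \[\s*0\.000000\] Linux version")

# ASSUMPTION: substrings that suggest an orderly shutdown was in progress.
# These are heuristics; check what your own logs contain.
SHUTDOWN_HINTS = ("Reached target Shutdown", "systemd-shutdown", "Stopping ")


def parse_ts(line, year=2020):
    # Classic syslog timestamps carry no year, so one has to be assumed.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_boots(lines, lookback=20, year=2020):
    """Return (last_entry_time, boot_time, shutdown_seen) for each boot found.

    Assumes every line starts with a classic syslog timestamp and that the
    log does not cross a year boundary.
    """
    results = []
    for i, line in enumerate(lines):
        if not BOOT_MARKER.search(line):
            continue
        boot_ts = parse_ts(line, year)
        # Early-boot messages can be written just before the kernel banner
        # with the same timestamp, so step back to the last line stamped
        # strictly earlier than the boot.
        j = i - 1
        while j >= 0 and parse_ts(lines[j], year) >= boot_ts:
            j -= 1
        if j < 0:
            continue  # the log starts with this boot; nothing to compare
        tail = lines[max(0, j - lookback + 1): j + 1]
        shutdown_seen = any(hint in entry for entry in tail for hint in SHUTDOWN_HINTS)
        results.append((parse_ts(lines[j], year), boot_ts, shutdown_seen))
    return results


# Three lines taken from the excerpt in the question (kernel line shortened).
sample = [
    "Dec 30 16:54:01 pve2 systemd[1]: Started Proxmox VE replication runner.",
    "Dec 30 16:57:42 pve2 dmeventd[492]: dmeventd ready for processing.",
    "Dec 30 16:57:42 pve2 kernel: [    0.000000] Linux version 5.4.73-1-pve (build@pve)",
]

for last, boot, shutdown_seen in find_boots(sample):
    gap = (boot - last).total_seconds()
    print(f"{last} -> {boot}: {gap:.0f}s gap, orderly shutdown seen: {shutdown_seen}")
```

If every boot in the full log shows a gap with no shutdown messages before it, that points toward hardware, power, or a watchdog-style reset rather than something inside the OS requesting a reboot, and the memory test is a reasonable next step.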
