Proxmox system freezes after 30 to 40 minutes of uptime, with no containers/VMs running


I'm having some problems with a used, new-to-me Xeon E3-1225 server that I'm trying to turn into a home server.

Currently, I have it set up with Proxmox installed and two 1 TB hard drives (different makes/models) in a ZFS RAID1 pool for the base OS and, ideally, a few virtual machines. I also have two 4 TB hard drives that I was going to use for bulk storage, also in a ZFS RAID1, but for now they only hold some backup data.

At the moment I only have Proxmox installed; there are no VMs or containers running beyond the default setup. I was moving data from the 4 TB drive to the newly created ZFS pool on the 1 TB drives (root home directory), but while doing so Proxmox repeatedly became unresponsive. I tried 3 times, and each time the system dropped its network connection and hung. I'm no longer sure what is going on.


Feb  2 20:40:21 theHive kernel: [  356.464000] EXT4-fs (sda1): recovery complete
Feb  2 20:40:21 theHive kernel: [  356.482202] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
Feb  2 20:44:33 theHive kernel: [  608.563501] r8169 0000:02:00.0 enp2s0: Link is Down
Feb  2 20:44:33 theHive kernel: [  608.563522] vmbr0: port 1(enp2s0) entered disabled state
Feb  2 20:44:34 theHive kernel: [  610.230917] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control rx/tx
Feb  2 20:44:34 theHive kernel: [  610.230931] vmbr0: port 1(enp2s0) entered blocking state
Feb  2 20:44:34 theHive kernel: [  610.230933] vmbr0: port 1(enp2s0) entered forwarding state
Feb  2 21:56:11 theHive kernel: [ 4906.840254] WARNING: CPU: 4 PID: 0 at kernel/sched/core.c:4014 schedule_idle+0x34/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840270] Modules linked in: ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_hdmi irqbypass i915 snd_hda_codec_realtek crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel ledtrig_audio aesni_intel drm_kms_helper aes_x86_64 drm snd_hda_intel crypto_simd snd_hda_codec snd_hda_core cryptd i2c_algo_bit snd_hwdep snd_pcm snd_timer glue_helper snd fb_sys_fops syscopyarea intel_cstate soundcore mei_hdcp sysfillrect mei_me intel_rapl_perf mei sysimgblt ie31200_edac pcspkr mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor zstd_compress raid6_pq libcrc32c ahci i2c_i801 libahci lpc_ich r8169
Feb  2 21:56:11 theHive kernel: [ 4906.840292]  realtek video
Feb  2 21:56:11 theHive kernel: [ 4906.840304] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P           O      5.3.10-1-pve #1
Feb  2 21:56:11 theHive kernel: [ 4906.840306] Hardware name: Gigabyte Technology Co., Ltd. Z87M-D3H/Z87M-D3H, BIOS F4 04/16/2013
Feb  2 21:56:11 theHive kernel: [ 4906.840308] RIP: 0010:schedule_idle+0x34/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840310] Code: 25 c0 6b 01 00 48 8b 40 10 48 89 e5 48 85 c0 75 19 31 ff e8 9e f6 ff ff 65 48 8b 04 25 c0 6b 01 00 48 8b 00 a8 08 75 e9 5d c3 <0f> 0b eb e3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 e8
Feb  2 21:56:11 theHive kernel: [ 4906.840312] RSP: 0018:ffffb435800abec0 EFLAGS: 00010206
Feb  2 21:56:11 theHive kernel: [ 4906.840314] RAX: 0000000000001000 RBX: ffff989e2cf1dc00 RCX: 0000000000000000
Feb  2 21:56:11 theHive kernel: [ 4906.840316] RDX: 00000476777ed6de RSI: ffff989e2f11e040 RDI: 0000000000000004
Feb  2 21:56:11 theHive kernel: [ 4906.840317] RBP: ffffb435800abec0 R08: 0000000000000002 R09: 0000000000000006
Feb  2 21:56:11 theHive kernel: [ 4906.840318] R10: 00000e4b10e536cb R11: ffff989e2f1294c4 R12: ffffffff86e4a840
Feb  2 21:56:11 theHive kernel: [ 4906.840320] R13: ffff989e2f135320 R14: ffffffff86d588e0 R15: 0000000000000005
Feb  2 21:56:11 theHive kernel: [ 4906.840321] FS:  0000000000000000(0000) GS:ffff989e2f100000(0000) knlGS:0000000000000000
Feb  2 21:56:11 theHive kernel: [ 4906.840323] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  2 21:56:11 theHive kernel: [ 4906.840324] CR2: 000055ca19aa1fb0 CR3: 00000001e7a0a005 CR4: 00000000001606e0
Feb  2 21:56:11 theHive kernel: [ 4906.840326] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb  2 21:56:11 theHive kernel: [ 4906.840327] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb  2 21:56:11 theHive kernel: [ 4906.840328] Call Trace:
Feb  2 21:56:11 theHive kernel: [ 4906.840333]  do_idle+0x16b/0x270
Feb  2 21:56:11 theHive kernel: [ 4906.840335]  cpu_startup_entry+0x1d/0x20
Feb  2 21:56:11 theHive kernel: [ 4906.840338]  start_secondary+0x167/0x1c0
Feb  2 21:56:11 theHive kernel: [ 4906.840341]  secondary_startup_64+0xa4/0xb0
Feb  2 21:56:11 theHive kernel: [ 4906.840343] ---[ end trace 51b959059837262e ]---
Feb  2 21:56:11 theHive kernel: [ 4906.840345] bad: scheduling from the idle thread!
Feb  2 21:56:11 theHive kernel: [ 4906.840346] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P        W  O      5.3.10-1-pve #1
Feb  2 21:56:11 theHive kernel: [ 4906.840348] Hardware name: Gigabyte Technology Co., Ltd. Z87M-D3H/Z87M-D3H, BIOS F4 04/16/2013
Feb  2 21:56:11 theHive kernel: [ 4906.840349] Call Trace:
Feb  2 21:56:11 theHive kernel: [ 4906.840351]  dump_stack+0x63/0x8a
Feb  2 21:56:11 theHive kernel: [ 4906.840353]  dequeue_task_idle+0x2c/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840356]  dequeue_task+0xd7/0x2d0
Feb  2 21:56:11 theHive kernel: [ 4906.840358]  ? invalid_op+0x1e/0x30
Feb  2 21:56:11 theHive kernel: [ 4906.840359]  deactivate_task+0x3a/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840361]  __schedule+0x118/0x660
Feb  2 21:56:11 theHive kernel: [ 4906.840363]  schedule_idle+0x22/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840364]  do_idle+0x16b/0x270
Feb  2 21:56:11 theHive kernel: [ 4906.840366]  cpu_startup_entry+0x1d/0x20
Feb  2 21:56:11 theHive kernel: [ 4906.840368]  start_secondary+0x167/0x1c0
Feb  2 21:56:11 theHive kernel: [ 4906.840369]  secondary_startup_64+0xa4/0xb0
Feb  2 21:56:11 theHive kernel: [ 4906.840822] bad: scheduling from the idle thread!
Feb  2 21:56:11 theHive kernel: [ 4906.840835] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P        W  O      5.3.10-1-pve #1
Feb  2 21:56:11 theHive kernel: [ 4906.840838] Hardware name: Gigabyte Technology Co., Ltd. Z87M-D3H/Z87M-D3H, BIOS F4 04/16/2013
Feb  2 21:56:11 theHive kernel: [ 4906.840840] Call Trace:
Feb  2 21:56:11 theHive kernel: [ 4906.840851]  dump_stack+0x63/0x8a
Feb  2 21:56:11 theHive kernel: [ 4906.840853]  dequeue_task_idle+0x2c/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840855]  dequeue_task+0xd7/0x2d0
Feb  2 21:56:11 theHive kernel: [ 4906.840858]  ? sched_clock+0x9/0x10
Feb  2 21:56:11 theHive kernel: [ 4906.840861]  deactivate_task+0x3a/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840862]  __schedule+0x118/0x660
Feb  2 21:56:11 theHive kernel: [ 4906.840864]  schedule_idle+0x22/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840866]  do_idle+0x16b/0x270
Feb  2 21:56:11 theHive kernel: [ 4906.840868]  cpu_startup_entry+0x1d/0x20
Feb  2 21:56:11 theHive kernel: [ 4906.840869]  start_secondary+0x167/0x1c0
Feb  2 21:56:11 theHive kernel: [ 4906.840871]  secondary_startup_64+0xa4/0xb0
Feb  2 21:56:11 theHive kernel: [ 4906.840938] bad: scheduling from the idle thread!
Feb  2 21:56:11 theHive kernel: [ 4906.840940] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P        W  O      5.3.10-1-pve #1
Feb  2 21:56:11 theHive kernel: [ 4906.840943] Hardware name: Gigabyte Technology Co., Ltd. Z87M-D3H/Z87M-D3H, BIOS F4 04/16/2013
Feb  2 21:56:11 theHive kernel: [ 4906.840944] Call Trace:
Feb  2 21:56:11 theHive kernel: [ 4906.840946]  dump_stack+0x63/0x8a
Feb  2 21:56:11 theHive kernel: [ 4906.840948]  dequeue_task_idle+0x2c/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840950]  dequeue_task+0xd7/0x2d0
Feb  2 21:56:11 theHive kernel: [ 4906.840952]  ? sched_clock+0x9/0x10
Feb  2 21:56:11 theHive kernel: [ 4906.840953]  deactivate_task+0x3a/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840955]  __schedule+0x118/0x660
Feb  2 21:56:11 theHive kernel: [ 4906.840957]  schedule_idle+0x22/0x40
Feb  2 21:56:11 theHive kernel: [ 4906.840959]  do_idle+0x16b/0x270
Feb  2 21:56:11 theHive kernel: [ 4906.840960]  cpu_startup_entry+0x1d/0x20
Feb  2 21:56:11 theHive kernel: [ 4906.840962]  start_secondary+0x167/0x1c0
Feb  2 21:56:11 theHive kernel: [ 4906.840964]  secondary_startup_64+0xa4/0xb0
Feb  2 21:56:11 theHive kernel: [ 4906.843707] bad: scheduling from the idle thread!
Feb  2 21:56:11 theHive kernel: [ 4906.843731] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P        W  O      5.3.10-1-pve #1

More recently, I tried backing up the data to an external drive using a Linux Mint Live CD and had no system crashes, so I don't think this is hardware related; it is more likely a misconfiguration or bad practice on my part.

The kernel log above was captured after boot, from the point where the errors started to occur. I'm not entirely sure what is happening.

I truncated the kernel log; the full log fills up with the repeated part indefinitely.
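For anyone facing a similarly flooded log, here is a minimal Python sketch that collapses repeated kernel messages into counts. It is not something the original poster used; the function name `summarize` and the prefix pattern are assumptions based only on the line format shown in the log above (syslog date, hostname, `kernel:`, bracketed uptime).

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt in this post:
# "Feb  2 21:56:11 theHive kernel: [ 4906.840345] <message>"
PREFIX = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s?"
)


def summarize(lines):
    """Return a Counter of kernel messages with the timestamp prefix removed."""
    counts = Counter()
    for line in lines:
        message = PREFIX.sub("", line.rstrip("\n")).strip()
        if message:  # skip blank lines
            counts[message] += 1
    return counts
```

Calling `summarize(open(path))` on a saved log and printing `most_common(10)` should make a looping message such as `bad: scheduling from the idle thread!` stand out without scrolling through every repeated trace.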

Answer 1

I suspected a compatibility problem between the motherboard firmware and Proxmox/virtualization.

I did a firmware update on the motherboard, and that seems to have fixed it.
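To check which firmware a machine is running before and after such an update, something like the following Python sketch may help. It is an illustration, not part of the original fix: `bios_from_log` and `bios_from_sysfs` are hypothetical helper names, the pattern is modeled on the `Hardware name:` line in the kernel log from the question (which reports `BIOS F4 04/16/2013`), and the sysfs lookup assumes a Linux host that exposes DMI data under `/sys/class/dmi/id`.

```python
import re
from pathlib import Path

# Modeled on the kernel's "Hardware name: <board>, BIOS <version> <date>" line.
HARDWARE_LINE = re.compile(
    r"Hardware name: (?P<board>.+), BIOS (?P<version>\S+) "
    r"(?P<date>\d{2}/\d{2}/\d{4})"
)


def bios_from_log(line):
    """Extract (version, date) from a kernel 'Hardware name' line, or None."""
    match = HARDWARE_LINE.search(line)
    return (match["version"], match["date"]) if match else None


def bios_from_sysfs(base="/sys/class/dmi/id"):
    """Read (version, date) of the running firmware from sysfs, or None."""
    root = Path(base)
    try:
        version = (root / "bios_version").read_text().strip()
        date = (root / "bios_date").read_text().strip()
    except OSError:  # not Linux, or DMI data not exposed
        return None
    return (version, date)
```

Comparing the value from an old log line with the value read from sysfs after flashing is one way to confirm the update actually took effect.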

Good luck to anyone who finds this in the future.
