This is a busybox/OpenWrt system.
The OOM killer is intermittently killing processes: sometimes collectd, sometimes snmpd and, as in the example below, uhttpd.
I could free up roughly 37% if I did not run collectd (graphing in *rtg by default), but I am not sure that would do anything more than push the problem a little further out in time.
My sysctl "vm.min_free_kbytes" says 16384 and, as far as I can interpret the output below, I had 32760 kB free, so why was the OOM killer triggered?
Any help interpreting why it was triggered, or advice on how to deal with it, would be much appreciated.
Could I, and perhaps should I, set vm.min_free_kbytes lower? It is not obvious to me from the output below that I am out of free memory, though. It says "DMA free:32760kB min:16384kB". The kernel version is 4.9.17.
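To make that comparison concrete, here is a minimal Python sketch that pulls the free/min/low/high values out of the zone line and compares them. The string is a shortened copy of the "DMA free:..." line from the log; the helper name `parse_zone_kb` is only illustrative, and the 4 kB page size is an assumption based on the smallest block size in the buddy list.

```python
import re

# Shortened copy of the zone line from the kernel log in this post.
ZONE_LINE = "DMA free:32760kB min:16384kB low:20480kB high:24576kB"

# Assumption: 4 kB pages (the smallest buddy block in the log is 4kB).
PAGE_KB = 4

def parse_zone_kb(line):
    """Return {field: kB} for the free/min/low/high fields of a zone line."""
    # "free_pcp:" and "free_cma:" are not matched because of the ":" right
    # after the field name.
    pairs = re.findall(r"\b(free|min|low|high):(\d+)kB", line)
    return {name: int(value) for name, value in pairs}

zone = parse_zone_kb(ZONE_LINE)
headroom_kb = zone["free"] - zone["min"]

# "free:8190" in the Mem-Info block is a page count; convert it to kB.
free_pages_kb = 8190 * PAGE_KB

print("zone watermarks (kB):", zone)
print("free above min watermark (kB):", headroom_kb)
print("free pages converted to kB:", free_pages_kb)
```

By this simple reading the zone is above all three watermarks, which is exactly why the kill is puzzling; the sketch only checks the arithmetic and does not model everything the kernel subtracts before it compares against a watermark.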
kern.warn kernel: [72064.465528] observer invoked oom-killer: gfp_mask=0x24080c0(GFP_KERNEL|__GFP_ZERO), nodemask=0, order=1, oom_score_adj=0
kern.debug kernel: [72064.465537] CPU: 0 PID: 25658 Comm: observer Not tainted 4.9.17 #0
kern.debug kernel: [72064.465539] Call Trace:
kern.debug kernel: [72064.465547] [c54dfd40] [c03d1798] dump_stack+0x84/0xb0 (unreliable)
kern.debug kernel: [72064.465563] [c54dfd50] [c03cfae0] dump_header.isra.4+0x54/0x180
kern.debug kernel: [72064.465570] [c54dfd90] [c0097aec] oom_kill_process+0x88/0x3f0
kern.debug kernel: [72064.465575] [c54dfdd0] [c0098350] out_of_memory+0x37c/0x3b0
kern.debug kernel: [72064.465583] [c54dfe00] [c009bbf0] __alloc_pages_nodemask+0x904/0x9cc
kern.debug kernel: [72064.465588] [c54dfeb0] [c009bcd4] __get_free_pages+0x1c/0x44
kern.debug kernel: [72064.465598] [c54dfec0] [c001c660] mm_init+0xcc/0x144
kern.debug kernel: [72064.465604] [c54dfee0] [c00de2b4] do_execveat_common+0x254/0x574
kern.debug kernel: [72064.465609] [c54dff30] [c00de600] do_execve+0x2c/0x3c
kern.debug kernel: [72064.465618] [c54dff40] [c000cff8] ret_from_syscall+0x0/0x3c
kern.debug kernel: [72064.465624] --- interrupt: c01 at 0xfd41cd4
kern.debug kernel: [72064.465624] LR = 0x10024200
kern.debug kernel: [72064.465626] Mem-Info:
kern.debug kernel: [72064.465638] active_anon:1298 inactive_anon:26 isolated_anon:0
kern.debug kernel: [72064.465638] active_file:11 inactive_file:18 isolated_file:0
kern.debug kernel: [72064.465638] unevictable:15509 dirty:0 writeback:0 unstable:0
kern.debug kernel: [72064.465638] slab_reclaimable:691 slab_unreclaimable:2447
kern.debug kernel: [72064.465638] mapped:1622 shmem:163 pagetables:166 bounce:0
kern.debug kernel: [72064.465638] free:8190 free_pcp:72 free_cma:0
kern.debug kernel: [72064.465648] Node 0 active_anon:5192kB inactive_anon:104kB active_file:44kB inactive_file:72kB unevictable:62036kB isolated(anon):0kB isolated(file):0kB mapped:6488kB dirty:0kB writeback:0kB shmem:652kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
kern.debug kernel: [72064.465661] DMA free:32760kB min:16384kB low:20480kB high:24576kB active_anon:5192kB inactive_anon:104kB active_file:44kB inactive_file:72kB unevictable:62036kB writepending:0kB present:262144kB managed:200856kB mlocked:0kB slab_reclaimable:2764kB slab_unreclaimable:9788kB kernel_stack:920kB pagetables:664kB bounce:0kB free_pcp:288kB local_pcp:136kB free_cma:0kB
kern.emerg kernel: lowmem_reserve[]: 0 0 0 0
kern.debug kernel: [72064.465669] DMA: 306*4kB (UMEH) 258*8kB (UMH) 177*16kB (UMEH) 35*32kB (UMEH) 65*64kB (MEH) 47*128kB (MH) 2*256kB (H) 1*512kB (H) 2*1024kB (H) 2*2048kB (H) 2*4096kB (H) 0*8192kB 0*16384kB = 32776kB
kern.emerg kernel: 15702 total pagecache pages
kern.debug kernel: [72064.465707] 0 pages in swap cache
kern.debug kernel: [72064.465710] Swap cache stats: add 0, delete 0, find 0/0
kern.debug kernel: [72064.465711] Free swap = 0kB
kern.debug kernel: [72064.465713] Total swap = 0kB
kern.debug kernel: [72064.465715] 65536 pages RAM
kern.debug kernel: [72064.465716] 0 pages HighMem/MovableOnly
kern.debug kernel: [72064.465718] 15322 pages reserved
kern.info kernel: [72064.465720] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
kern.info kernel: [72064.465730] [ 834] 0 834 700 462 7 0 0 0 ubusd
kern.info kernel: [72064.465735] [ 835] 0 835 641 105 7 0 0 0 askfirst
kern.info kernel: [72064.465740] [ 1548] 0 1548 594 378 6 0 0 0 allocpts
kern.info kernel: [72064.465746] [ 2858] 0 2858 774 438 7 0 0 0 logd
kern.info kernel: [72064.465751] [ 2873] 0 2873 1042 492 7 0 0 0 observer
kern.info kernel: [72064.465756] [ 2883] 0 2883 921 497 7 0 0 0 rpcd
kern.info kernel: [72064.465761] [ 2936] 0 2936 819 522 7 0 0 0 netifd
kern.info kernel: [72064.465766] [ 2998] 0 2998 753 394 7 0 0 0 odhcpd
kern.info kernel: [72064.465771] [ 3106] 0 3106 699 408 7 0 0 0 dropbear
kern.info kernel: [72064.465776] [ 3112] 0 3112 917 31 6 0 0 0 netserver
kern.info kernel: [72064.465782] [ 4134] 453 4134 639 430 7 0 0 0 dnsmasq
kern.info kernel: [72064.465787] [ 4226] 0 4226 1269 382 7 0 0 0 hostapd
kern.info kernel: [72064.465792] [ 4232] 0 4232 1269 367 8 0 0 0 hostapd
kern.info kernel: [72064.465797] [ 4456] 0 4456 1328 785 8 0 0 0 uhttpd
kern.info kernel: [72064.465802] [ 4628] 0 4628 789 465 7 0 0 0 lldpd
kern.info kernel: [72064.465807] [ 4632] 121 4632 705 357 7 0 0 0 lldpd
kern.info kernel: [72064.465812] [ 4700] 0 4700 1165 536 7 0 0 0 ntpd
kern.info kernel: [72064.465817] [ 4786] 0 4786 18795 703 19 0 0 0 collectd
kern.info kernel: [72064.465823] [25656] 0 25656 1042 215 7 0 0 0 observer
kern.info kernel: [72064.465828] [25658] 0 25658 1042 42 6 0 0 0 observer
kern.info kernel: [72064.465833] [25659] 0 25659 1043 103 7 0 0 0 wc
kern.err kernel: [72064.465836] Out of memory: Kill process 4456 (uhttpd) score 15 or sacrifice child
kern.err kernel: [72064.473474] Killed process 4456 (uhttpd) total-vm:5312kB, anon-rss:284kB, file-rss:2856kB, shmem-rss:0kB
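The failed allocation was order=1 (two contiguous pages), so the per-size free list in the "DMA: 306*4kB ..." line may be worth a closer look. The following is a rough Python sketch that parses that line and adds up the blocks. The helper name `parse_buddy` is made up for illustration. I am assuming 4 kB pages, and I am assuming that the "H" tag marks blocks in the kernel's high-atomic reserve, which an ordinary GFP_KERNEL allocation may not be allowed to use.

```python
import re

# Buddy free-list line copied from the kernel log in this post.
BUDDY_LINE = (
    "DMA: 306*4kB (UMEH) 258*8kB (UMH) 177*16kB (UMEH) 35*32kB (UMEH) "
    "65*64kB (MEH) 47*128kB (MH) 2*256kB (H) 1*512kB (H) 2*1024kB (H) "
    "2*2048kB (H) 2*4096kB (H) 0*8192kB 0*16384kB = 32776kB"
)

def parse_buddy(line):
    """Return a list of (count, block_kb, tags), one entry per block size."""
    pattern = r"(\d+)\*(\d+)kB(?: \(([A-Z]+)\))?"
    return [(int(count), int(kb), tags)
            for count, kb, tags in re.findall(pattern, line)]

blocks = parse_buddy(BUDDY_LINE)

# Total free memory according to the buddy list.
total_kb = sum(count * kb for count, kb, _ in blocks)

# Memory sitting in blocks of order >= 1, i.e. 8 kB or larger (assumes 4 kB pages).
order1_plus_kb = sum(count * kb for count, kb, _ in blocks if kb >= 8)

# Memory in block sizes tagged only "H" (assumed: high-atomic reserve).
h_only_kb = sum(count * kb for count, kb, tags in blocks if tags == "H")

print("total free (kB):", total_kb)
print("free in order>=1 blocks (kB):", order1_plus_kb)
print("free in H-only block sizes (kB):", h_only_kb)
print("total minus H-only (kB):", total_kb - h_only_kb)
```

The total matches the 32776kB the kernel prints at the end of that line, and there is no shortage of 8 kB blocks as such. What stands out is that every block size from 256 kB upwards is tagged only "H". If the assumption about that tag holds, a sizeable share of the "free" figure may not have been usable for this request, but the log alone does not confirm that.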