
I am getting the following exceptions after my server froze and rebooted twice.
I can't tell how this relates to Docker, but it happens every time I start some containers, and I can't find anything else useful in syslog:
Nov 24 15:21:30 shisoft-idc kernel: [25671.700452] Oops: 0000 [#2] SMP
Nov 24 15:21:30 shisoft-idc kernel: [25671.713472] Modules linked in: xt_nat xt_tcpudp veth xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc pf_ring(OX) aufs iptable_filter ip_tables x_tables nls_iso8859_1 gpio_ich mxm_wmi joydev mac_hid x86_pkg_temp_thermal intel_powerclamp coretemp mei_me mei sb_edac ioatdma lpc_ich edac_core dca wmi kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ipmi_si lp parport hid_generic isci e1000e ahci libsas usbhid ptp hid libahci pps_core scsi_transport_sas megaraid_sas
Nov 24 15:21:30 shisoft-idc kernel: [25671.810245] CPU: 34 PID: 6 Comm: kworker/u80:0 Tainted: G D W OX 3.13.0-40-generic #69-Ubuntu
Nov 24 15:21:30 shisoft-idc kernel: [25671.838447] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.0a 08/08/2013
Nov 24 15:21:30 shisoft-idc kernel: [25671.853158] task: ffff880851354800 ti: ffff88085135e000 task.ti: ffff88085135e000
Nov 24 15:21:30 shisoft-idc kernel: [25671.867861] RIP: 0010:[<ffffffff8108bc00>] [<ffffffff8108bc00>] kthread_data+0x10/0x20
Nov 24 15:21:30 shisoft-idc kernel: [25671.883418] RSP: 0018:ffff88085135f960 EFLAGS: 00010002
Nov 24 15:21:30 shisoft-idc kernel: [25671.899320] RAX: 0000000000000000 RBX: 0000000000000022 RCX: 0000000000000000
Nov 24 15:21:30 shisoft-idc kernel: [25671.914928] RDX: 0000000000000001 RSI: 0000000000000022 RDI: ffff880851354800
Nov 24 15:21:30 shisoft-idc kernel: [25671.930186] RBP: ffff88085135f960 R08: 0000000000000000 R09: 0000000000000001
Nov 24 15:21:30 shisoft-idc kernel: [25671.945595] R10: ffffffff8106516c R11: ffffea002144d200 R12: ffff88183f394480
Nov 24 15:21:30 shisoft-idc kernel: [25671.960870] R13: 0000000000000022 R14: ffff8808513547f0 R15: ffff880851354800
Nov 24 15:21:30 shisoft-idc kernel: [25671.976402] FS: 0000000000000000(0000) GS:ffff88183f380000(0000) knlGS:0000000000000000
Nov 24 15:21:30 shisoft-idc kernel: [25671.992073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 15:21:30 shisoft-idc kernel: [25672.007445] CR2: 0000000000000028 CR3: 0000000001c0e000 CR4: 00000000001407e0
Nov 24 15:21:30 shisoft-idc kernel: [25672.023175] Stack:
Nov 24 15:21:30 shisoft-idc kernel: [25672.038455] ffff88085135f978 ffffffff81084f51 ffff880851354800 ffff88085135f9d8
Nov 24 15:21:30 shisoft-idc kernel: [25672.054584] ffffffff817233d9 ffff880851354800 ffff88085135ffd8 0000000000014480
Nov 24 15:21:30 shisoft-idc kernel: [25672.070539] 0000000000014480 ffff880851354800 ffff880851354e50 ffff8808513547f0
Nov 24 15:21:30 shisoft-idc kernel: [25672.086803] Call Trace:
Nov 24 15:21:30 shisoft-idc kernel: [25672.103266] [<ffffffff81084f51>] wq_worker_sleeping+0x11/0x90
Nov 24 15:21:30 shisoft-idc kernel: [25672.119191] [<ffffffff817233d9>] __schedule+0x589/0x7d0
Nov 24 15:21:30 shisoft-idc kernel: [25672.135594] [<ffffffff81723649>] schedule+0x29/0x70
Nov 24 15:21:30 shisoft-idc kernel: [25672.150643] [<ffffffff8106a15f>] do_exit+0x6df/0xa50
Nov 24 15:21:30 shisoft-idc kernel: [25672.165683] [<ffffffff817287f9>] oops_end+0xa9/0x150
Nov 24 15:21:30 shisoft-idc kernel: [25672.180692] [<ffffffff810172ab>] die+0x4b/0x70
Nov 24 15:21:30 shisoft-idc kernel: [25672.195466] [<ffffffff8172818e>] do_general_protection+0x11e/0x1b0
Nov 24 15:21:30 shisoft-idc kernel: [25672.210093] [<ffffffff81727aa8>] general_protection+0x28/0x30
Nov 24 15:21:30 shisoft-idc kernel: [25672.224652] [<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0
Nov 24 15:21:30 shisoft-idc kernel: [25672.238830] [<ffffffff8122a099>] ? remove_proc_entry+0x89/0x1b0
Nov 24 15:21:30 shisoft-idc kernel: [25672.253390] [<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]
Nov 24 15:21:30 shisoft-idc kernel: [25672.268168] [<ffffffffa0344d07>] ring_notifier+0x127/0x425 [pf_ring]
Nov 24 15:21:30 shisoft-idc kernel: [25672.282725] [<ffffffff816f02f8>] ? ip6mr_device_event+0xa8/0xc0
Nov 24 15:21:30 shisoft-idc kernel: [25672.296697] [<ffffffff8172b83c>] notifier_call_chain+0x4c/0x70
Nov 24 15:21:30 shisoft-idc kernel: [25672.310129] [<ffffffff8108fd56>] raw_notifier_call_chain+0x16/0x20
Nov 24 15:21:30 shisoft-idc kernel: [25672.323626] [<ffffffff8161f055>] call_netdevice_notifiers_info+0x35/0x60
Nov 24 15:21:30 shisoft-idc kernel: [25672.336576] [<ffffffff81620469>] rollback_registered_many+0x189/0x2a0
Nov 24 15:21:30 shisoft-idc kernel: [25672.349075] [<ffffffff816205db>] unregister_netdevice_many+0x1b/0xb0
Nov 24 15:21:30 shisoft-idc kernel: [25672.362098] [<ffffffff8162114d>] default_device_exit_batch+0x13d/0x160
Nov 24 15:21:30 shisoft-idc kernel: [25672.374600] [<ffffffff810ab0a0>] ? prepare_to_wait_event+0x100/0x100
Nov 24 15:21:30 shisoft-idc kernel: [25672.386514] [<ffffffff8161b8a3>] ops_exit_list.isra.1+0x53/0x60
Nov 24 15:21:30 shisoft-idc kernel: [25672.398854] [<ffffffff8161c110>] cleanup_net+0x110/0x250
Nov 24 15:21:30 shisoft-idc kernel: [25672.411501] [<ffffffff81083a52>] process_one_work+0x182/0x450
Nov 24 15:21:30 shisoft-idc kernel: [25672.425847] [<ffffffff81084841>] worker_thread+0x121/0x410
Nov 24 15:21:30 shisoft-idc kernel: [25672.439753] [<ffffffff81084720>] ? rescuer_thread+0x430/0x430
Nov 24 15:21:30 shisoft-idc kernel: [25672.454172] [<ffffffff8108b562>] kthread+0xd2/0xf0
Nov 24 15:21:30 shisoft-idc kernel: [25672.467798] [<ffffffff8108b490>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 24 15:21:30 shisoft-idc kernel: [25672.481698] [<ffffffff8172fc7c>] ret_from_fork+0x7c/0xb0
Nov 24 15:21:30 shisoft-idc kernel: [25672.496083] [<ffffffff8108b490>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 24 15:21:30 shisoft-idc kernel: [25672.509968] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 c0 03 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
Nov 24 15:21:30 shisoft-idc kernel: [25672.546519] RIP [<ffffffff8108bc00>] kthread_data+0x10/0x20
Nov 24 15:21:30 shisoft-idc kernel: [25672.557942] RSP <ffff88085135f960>
Nov 24 15:21:30 shisoft-idc kernel: [25672.569003] CR2: ffffffffffffffd8
Nov 24 15:21:30 shisoft-idc kernel: [25672.580223] ---[ end trace f801ff82c5094880 ]---
Nov 24 15:21:30 shisoft-idc kernel: [25674.624052] Fixing recursive fault but reboot is needed!
Nov 24 15:21:30 shisoft-idc kernel: [25682.813069] docker0: port 14(veth_app-mine) entered forwarding state
Nov 24 15:21:49 shisoft-idc kernel: [25700.486840] BUG: soft lockup - CPU#20 stuck for 22s! [irqbalance:1544]
Nov 24 15:21:49 shisoft-idc kernel: [25700.498429] Modules linked in: xt_nat xt_tcpudp veth xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc pf_ring(OX) aufs iptable_filter ip_tables x_tables nls_iso8859_1 gpio_ich mxm_wmi joydev mac_hid x86_pkg_temp_thermal intel_powerclamp coretemp mei_me mei sb_edac ioatdma lpc_ich edac_core dca wmi kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ipmi_si lp parport hid_generic isci e1000e ahci libsas usbhid ptp hid libahci pps_core scsi_transport_sas megaraid_sas
Nov 24 15:21:49 shisoft-idc kernel: [25700.590082] CPU: 20 PID: 1544 Comm: irqbalance Tainted: G D W OX 3.13.0-40-generic #69-Ubuntu
Nov 24 15:21:49 shisoft-idc kernel: [25700.618480] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.0a 08/08/2013
Nov 24 15:21:49 shisoft-idc kernel: [25700.633291] task: ffff88084e6f8000 ti: ffff88084d980000 task.ti: ffff88084d980000
Nov 24 15:21:49 shisoft-idc kernel: [25700.648956] RIP: 0010:[<ffffffff8172722a>] [<ffffffff8172722a>] _raw_spin_lock+0x3a/0x50
Nov 24 15:21:49 shisoft-idc kernel: [25700.664427] RSP: 0018:ffff88084d981c50 EFLAGS: 00000206
Nov 24 15:21:49 shisoft-idc kernel: [25700.679149] RAX: 0000000000007bfa RBX: 0000000100000001 RCX: 00000000000020de
Nov 24 15:21:49 shisoft-idc kernel: [25700.694642] RDX: 00000000000020e0 RSI: 00000000000020e0 RDI: ffffffff81fb2a40
Nov 24 15:21:49 shisoft-idc kernel: [25700.709489] RBP: ffff88084d981c50 R08: 0000000000017a50 R09: 0000000000000001
Nov 24 15:21:49 shisoft-idc kernel: [25700.724954] R10: ffff880850b76026 R11: ffff880825f08b40 R12: 0000001400000013
Nov 24 15:21:49 shisoft-idc kernel: [25700.739837] R13: 0000000100000001 R14: 0000000000002df8 R15: 0000000000000000
Nov 24 15:21:49 shisoft-idc kernel: [25700.755476] FS: 00007fdf26b71780(0000) GS:ffff88085fa80000(0000) knlGS:0000000000000000
Nov 24 15:21:49 shisoft-idc kernel: [25700.770582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 15:21:49 shisoft-idc kernel: [25700.785925] CR2: 00007f860014a108 CR3: 000000084b8d6000 CR4: 00000000001407e0
Nov 24 15:21:49 shisoft-idc kernel: [25700.801644] Stack:
Nov 24 15:21:49 shisoft-idc kernel: [25700.818004] ffff88084d981c80 ffffffff81229cf5 ffff88085f018040 ffff880825f08b40
Nov 24 15:21:49 shisoft-idc kernel: [25700.834178] 0000000000000101 ffff88085f008240 ffff88084d981c90 ffffffff81229dcb
Nov 24 15:21:49 shisoft-idc kernel: [25700.850069] ffff88084d981cb8 ffffffff8122491c ffff880825f08b40 0000000000008000
Nov 24 15:21:49 shisoft-idc kernel: [25700.866236] Call Trace:
Nov 24 15:21:49 shisoft-idc kernel: [25700.881916] [<ffffffff81229cf5>] proc_lookup_de+0x25/0xe0
Nov 24 15:21:49 shisoft-idc kernel: [25700.898082] [<ffffffff81229dcb>] proc_lookup+0x1b/0x20
Nov 24 15:21:49 shisoft-idc kernel: [25700.914531] [<ffffffff8122491c>] proc_root_lookup+0x1c/0x40
Nov 24 15:21:49 shisoft-idc kernel: [25700.932091] [<ffffffff811c75dd>] lookup_real+0x1d/0x50
Nov 24 15:21:49 shisoft-idc kernel: [25700.951143] [<ffffffff811cc8e3>] do_last+0x983/0x1230
Nov 24 15:21:49 shisoft-idc kernel: [25700.969605] [<ffffffff811ca561>] ? link_path_walk+0x71/0x870
Nov 24 15:21:49 shisoft-idc kernel: [25700.988492] [<ffffffff813137ab>] ? apparmor_file_alloc_security+0x5b/0x180
Nov 24 15:21:49 shisoft-idc kernel: [25701.007602] [<ffffffff812d5df6>] ? security_file_alloc+0x16/0x20
Nov 24 15:21:49 shisoft-idc kernel: [25701.025095] [<ffffffff811cd24b>] path_openat+0xbb/0x650
Nov 24 15:21:49 shisoft-idc kernel: [25701.039760] [<ffffffff81012609>] ? __switch_to+0x169/0x4c0
Nov 24 15:21:49 shisoft-idc kernel: [25701.054466] [<ffffffff811cd87f>] ? getname_flags+0x4f/0x190
Nov 24 15:21:49 shisoft-idc kernel: [25701.068670] [<ffffffff811ce64a>] do_filp_open+0x3a/0x90
Nov 24 15:21:49 shisoft-idc kernel: [25701.082653] [<ffffffff811db4d7>] ? __alloc_fd+0xa7/0x130
Nov 24 15:21:49 shisoft-idc kernel: [25701.096477] [<ffffffff811bccc9>] do_sys_open+0x129/0x280
Nov 24 15:21:49 shisoft-idc kernel: [25701.109503] [<ffffffff811bce3e>] SyS_open+0x1e/0x20
Nov 24 15:21:49 shisoft-idc kernel: [25701.122350] [<ffffffff8172fd2d>] system_call_fastpath+0x1a/0x1f
Nov 24 15:21:49 shisoft-idc kernel: [25701.134992] Code: 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 83 e8 01 74 0a 0f b7 0f <66> 39 ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb da 66 0f 1f 44 00
uname -a gives me the following information:
Linux shisoft-idc 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
The Docker version:
Client version: 1.3.1
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): 4e9bbfa
OS/Arch (client): linux/amd64
Server version: 1.3.1
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): 4e9bbfa
and the Docker info:
Containers: 19
Images: 343
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Dirs: 382
Execution Driver: native-0.2
Kernel Version: 3.13.0-40-generic
Operating System: Ubuntu 14.04.1 LTS
Debug mode (server): false
Debug mode (client): true
Fds: 10
Goroutines: 10
EventsListeners: 0
Init Path: /usr/bin/docker
Answer 1
Based on this line:
Nov 24 15:21:49 shisoft-idc kernel: [25700.486840] BUG: soft lockup - CPU#20 stuck for 22s! [irqbalance:1544]
which more or less means that you managed to lock up CPU#20 for at least 22 seconds...
I think the problem is your Intel CPU. If the CPU is not dead or dying, it may simply have a bug. Take a look at updating the microcode (the CPU's firmware); read:
https://lists.debian.org/debian-user/2013/09/msg00126.html
and
http://wiki.gentoo.org/wiki/Intel_microcode
At a minimum, all Intel Xeon, i3, i5 and i7 CPUs need a critical security fix to their microcode.
Your Linux distribution probably has a microcode update service.
Come back and tell us whether the microcode update fixed the problem (but if not... I'm afraid you will have to buy a new CPU).
Be careful: updating the microcode just once at boot is not always enough; you often need a running service to re-inject the microcode update every time the CPU is reset.
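To check which microcode revision the CPUs are actually running before and after an update, a small sketch like the one below may help. It assumes an x86 Linux system where /proc/cpuinfo carries one "microcode" line per logical CPU; the function name microcode_revisions is made up for this example.

```python
import re
from collections import Counter


def microcode_revisions(cpuinfo_text):
    """Count how many logical CPUs report each microcode revision.

    Assumes the x86 /proc/cpuinfo layout, i.e. lines such as
    "microcode : 0x..." (one per logical CPU).
    """
    revisions = re.findall(
        r"^microcode\s*:\s*(\S+)", cpuinfo_text, flags=re.MULTILINE
    )
    return Counter(revisions)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            counts = microcode_revisions(f.read())
    except OSError:
        # Not a Linux system, or /proc is not mounted.
        counts = Counter()

    for revision, cpus in counts.items():
        print(f"{revision}: {cpus} logical CPU(s)")
    if len(counts) > 1:
        print("warning: logical CPUs report different microcode revisions")
```

If the revision printed after a reboot is the same as before the update, the update was probably not applied at boot, which is the situation the last remark above warns about.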
Answer 2
These lines in the first trace look like a kernel bug:
Nov 24 15:21:30 shisoft-idc kernel: [25672.210093] [<ffffffff81727aa8>] general_protection+0x28/0x30
Nov 24 15:21:30 shisoft-idc kernel: [25672.224652] [<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0
Nov 24 15:21:30 shisoft-idc kernel: [25672.238830] [<ffffffff8122a099>] ? remove_proc_entry+0x89/0x1b0
Nov 24 15:21:30 shisoft-idc kernel: [25672.253390] [<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]
Nov 24 15:21:30 shisoft-idc kernel: [25672.268168] [<ffffffffa0344d07>] ring_notifier+0x127/0x425 [pf_ring]
... skip ...
Nov 24 15:21:30 shisoft-idc kernel: [25674.624052] Fixing recursive fault but reboot is needed!
Nov 24 15:21:30 shisoft-idc kernel: [25682.813069] docker0: port 14(veth_app-mine) entered forwarding state
You can probably work around this problem by changing your Docker network configuration (for example, by disabling IPv6).
Or, if you have some spare time, you can try to resolve ffffffff816f6ff2 to a line of code and work out what could have caused the GPF there.
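As a starting point for that kind of digging, here is a rough sketch that pulls the frames out of a pasted trace. It is only an illustration: the names Frame and parse_frames are invented here, and the pattern is written against the "[<address>] symbol+0xoffset/0xsize [module]" layout seen in the log above, so other kernel versions may print frames differently.

```python
import re
from typing import NamedTuple, Optional

# Matches entries such as:
#   [<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]
#   [<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


class Frame(NamedTuple):
    address: int           # address printed in the trace
    symbol: str            # function name
    offset: int            # offset of the address inside the function
    size: int              # size of the function in bytes
    module: Optional[str]  # loadable module name, None for core kernel code
    reliable: bool         # False when the frame is prefixed with "?"


def parse_frames(log_text):
    """Return every frame found in the text, in the order printed.

    RIP lines use the same notation, so they are picked up as well.
    """
    return [
        Frame(
            address=int(m["addr"], 16),
            symbol=m["symbol"],
            offset=int(m["offset"], 16),
            size=int(m["size"], 16),
            module=m["module"],
            reliable=m["unreliable"] is None,
        )
        for m in FRAME_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[<ffffffff816f6ff2>] ? in6_dev_finish_destroy+0x62/0xf0\n"
        "[<ffffffffa0344889>] remove_device_from_ring_list+0x69/0x120 [pf_ring]\n"
    )
    for frame in parse_frames(sample):
        start = frame.address - frame.offset  # start address of the function
        origin = frame.module or "core kernel"
        note = "" if frame.reliable else " (marked '?', may be stale stack data)"
        print(f"{frame.symbol} starts at {start:#x} in {origin}{note}")
```

Two things stand out when the trace is read this way. The frames flagged with "?" are addresses the kernel found on the stack without being sure they belong to the current call chain, and ffffffff816f6ff2 is one of them. The first frames without a "?" below the fault are in pf_ring, an out-of-tree module, so that module may deserve a look as well. Turning an address into a source line still needs a vmlinux with debug information for exactly this kernel build, for instance with addr2line or gdb.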
P.S. Also, this is probably not your first Oops: you already have D and W set in your Tainted: G D W OX, and the Oops line itself is numbered [#2].
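For anyone unsure how to read that Tainted string, the sketch below spells out the letters that appear in this log. The descriptions follow the mainline kernel's taint documentation as far as I know it; the table is deliberately incomplete, the names TAINT_FLAGS and decode_taint are invented for this example, and what X means on this particular Ubuntu 3.13 build is not something I have checked.

```python
# Partial table of taint letters; only the ones relevant to this log plus a few
# common ones. Anything missing is reported as unknown rather than guessed.
TAINT_FLAGS = {
    "G": "no proprietary module loaded (would be P otherwise)",
    "P": "proprietary module loaded",
    "D": "kernel died recently (an earlier Oops or BUG)",
    "W": "a kernel warning was issued earlier",
    "O": "out-of-tree module loaded",
    "L": "a soft lockup occurred earlier",
    # Assumption: mainline documents X as an auxiliary flag left to
    # distributions, so its exact meaning depends on the distro kernel.
    "X": "auxiliary, distribution-defined taint",
}


def decode_taint(flags):
    """Return (letter, description) pairs for a string like 'G D W OX'."""
    return [
        (letter, TAINT_FLAGS.get(letter, "not covered by this sketch"))
        for letter in flags
        if not letter.isspace()
    ]


if __name__ == "__main__":
    for letter, meaning in decode_taint("G D W OX"):
        print(f"{letter}: {meaning}")
```

The D flag is the one that records an earlier Oops, and O matches the pf_ring(OX) entry in the module list.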
Answer 3
I solved this problem by upgrading my system's kernel from 3.13 to 3.17.4.
On another machine I did not have this problem with kernel 3.13 and Docker, which was strange.
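If you want to check several machines against that version, something like the following would do. It is only a sketch: kernel_version is a made-up helper, and 3.17.4 is simply the version that worked for me, not a known boundary for any particular fix.

```python
import platform
import re


def kernel_version(release):
    """Turn a release string such as '3.13.0-40-generic' into (3, 13, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '3.17') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    try:
        older = kernel_version(release) < (3, 17, 4)
    except ValueError:
        print(f"could not parse kernel release {release!r}")
    else:
        print(f"{release}: {'older than' if older else 'at least'} 3.17.4")
```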