mdadm array assembly causes kernel hang

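I have two large mdadm arrays that I combine into a single volume group. I was recently adding a new drive to the second array when the power went out. Normally I have no real trouble restarting an array or resuming a reshape after an interruption, but this time I am running into a lot of problems.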

System details: CentOS 6.8 x64

Linux myserver 2.6.32-642.4.2.el6.x86_64 #1 SMP Tue Aug 23 19:58:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

On first boot the array is assembled, but it is degraded and not started. I am then unable to start it with:

mdadm -R /dev/md126

So I stop it:

mdadm -S /dev/md126

If I assemble it normally, it does not start:

mdadm --assemble --scan
mdadm: device 26 in /dev/md/128 has wrong state in superblock, but /dev/sdn seems ok
mdadm: /dev/md/128 assembled from 12 drives and 2 spares - not enough to start the array while not clean - consider --force.
mdadm: No arrays found in config file or automatically

So I assemble it with --force:

mdadm --assemble --scan --force
mdadm: clearing FAULTY flag for device 1 in /dev/md/128 for /dev/sdn
mdadm: Marking array /dev/md/128 as 'clean'
mdadm: /dev/md/128 has been started with 12 drives (out of 13) and 2 spares.

At this point my session hangs. However, if I reconnect to the machine over SSH, I can still run commands while the other session stays hung.

cat /proc/mdstat

md128 : active raid6 sdf[0] sdl[9](S) sdn[14](S) sdo[6] sdm[7] sdc[10] sdb[11] sdd[12] sde[13] sdk[5] sdj[4] sdi[3] sdh[2] sdg[1]
      23441323008 blocks super 1.2 level 6, 512k chunk, algorithm 2 [13/12] [UUUUUUUUUUUU_]
      [>....................]  reshape =  0.0% (128496/3906887168) finish=133658963.6min speed=0K/sec
      bitmap: 1/30 pages [4KB], 65536KB chunk

md127 : active (auto-read-only) raid6 sdaa[2] sdab[13] sdac[0] sdad[15] sdy[4] sdx[5] sdz[14] sdw[6] sdr[9] sds[8] sdq[10] sdp[16] sdu[11] sdv[7] sdt[12]
      50789533184 blocks super 1.2 level 6, 512k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]

unused devices: <none>

The reshape never gets past that point. It is always stuck there.
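To confirm that the reshape counter really is frozen rather than just slow, it can help to sample the progress line twice and compare. The following is a minimal sketch, not part of the original post: `md_progress` is a hypothetical helper name, and it only parses text in the `/proc/mdstat` layout shown above.

```shell
# Print "<array> <operation> <percent>" for every array in mdstat-formatted
# input that has a reshape, recovery, or resync in progress.
# Reads stdin, so it works on /proc/mdstat or on a saved copy.
md_progress() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember the current array name
        /(reshape|recovery|resync) *=/ {
            op = ""; pct = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(reshape|recovery|resync)$/) op = $i
                if ($i ~ /%$/) pct = $i
            }
            print array, op, pct
        }
    '
}

# Example use (run by hand, a minute or so apart, and compare the output):
#   md_progress < /proc/mdstat
```

If two samples taken some time apart print the same percentage while the kernel log shows hung tasks, the reshape is stalled rather than throttled.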

I get the following errors in dmesg:

created bitmap (30 pages) for device md128
md128: bitmap initialized from disk: read 2 pages, set 1 of 59615 bits
md128: detected capacity change from 0 to 24003914760192
md: reshape of RAID array md128
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
 md128:
md: using 128k window, over a total of 3906887168k.
 unknown partition table
INFO: task mdadm:2357 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm         D 0000000000000002     0  2357   2179 0x00000004
 ffff880821773688 0000000000000082 0000000000000000 0000000000000086
 ffff880821773618 ffffffff810633a3 0000013e9de36948 ffff88081f96d440
 ffff88081dc65ed0 000000010010516a ffff88081d713068 ffff880821773fd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff81130d00>] ? mempool_alloc+0x20/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff812ab7de>] ? __sg_alloc_table+0x7e/0x130
 [<ffffffff81056cc5>] ? gup_pte_range+0xe5/0x130
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff811d941c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff811d60cb>] ? bio_alloc_bioset+0x5b/0xf0
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811daabd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811db127>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811d73a7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff8113028b>] generic_file_aio_read+0x6bb/0x700
 [<ffffffff8143877f>] ? md_ioctl+0x31f/0x1ac0
 [<ffffffff811d7f00>] ? blkdev_open+0x0/0xc0
 [<ffffffff81196d47>] ? __dentry_open+0x257/0x380
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811d66bc>] ? block_ioctl+0x3c/0x40
 [<ffffffff811af522>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811af6c4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task md128_reshape:2428 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md128_reshape D 0000000000000001     0  2428      2 0x00000000
 ffff88081fdcfa60 0000000000000046 0000000000000000 0000000000000086
 ffff88081fdcf9f0 ffffffff810633a3 0000013d983220e0 ffff88081f96d440
 ffff88081dc65ed0 0000000100103f9f ffff88081ea1e5f8 ffff88081fdcffd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa02447c0>] reshape_request+0x440/0xa20 [raid456]
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa02450b2>] sync_request+0x312/0x3a0 [raid456]
 [<ffffffff81430ff7>] md_do_sync+0x6c7/0xd60
 [<ffffffff81431ae5>] md_thread+0x115/0x150
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff814319d0>] ? md_thread+0x0/0x150
 [<ffffffff810a640e>] kthread+0x9e/0xc0
 [<ffffffff8100c28a>] child_rip+0xa/0x20
 [<ffffffff810a6370>] ? kthread+0x0/0xc0
 [<ffffffff8100c280>] ? child_rip+0x0/0x20
INFO: task blkid:2430 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blkid         D 0000000000000002     0  2430      1 0x00000000
 ffff8808211d36f8 0000000000000082 0000000000000000 0000000000000082
 ffff8808211d3688 ffffffff810633a3 0000013d9831af63 ffff88081f96d440
 ffff88081dc65ed0 0000000100104045 ffff88081e615ad8 ffff8808211d3fd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff8142af00>] ? try_module_get+0x30/0xb0
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff81340b4b>] ? mix_pool_bytes_extract+0x16b/0x180
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81151319>] ? zone_statistics+0x99/0xc0
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff81130ba5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81130d43>] ? mempool_alloc+0x63/0x140
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811cfc8d>] submit_bh+0x11d/0x1f0
 [<ffffffff811d2a9c>] block_read_full_page+0x27c/0x3a0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811513ee>] ? __inc_zone_page_state+0x2e/0x30
 [<ffffffff81145480>] ? __lru_cache_add+0x40/0x90
 [<ffffffff811d7498>] blkdev_readpage+0x18/0x20
 [<ffffffff811441aa>] __do_page_cache_readahead+0x20a/0x210
 [<ffffffff81144251>] force_page_cache_readahead+0x71/0xa0
 [<ffffffff81144773>] page_cache_sync_readahead+0x43/0x50
 [<ffffffff81130128>] generic_file_aio_read+0x558/0x700
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8115f3d0>] ? unmap_region+0x110/0x130
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task mdadm:2357 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm         D 0000000000000002     0  2357   2179 0x00000004
 ffff880821773688 0000000000000082 0000000000000000 0000000000000086
 ffff880821773618 ffffffff810633a3 0000013e9de36948 ffff88081f96d440
 ffff88081dc65ed0 000000010010516a ffff88081d713068 ffff880821773fd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff81130d00>] ? mempool_alloc+0x20/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff812ab7de>] ? __sg_alloc_table+0x7e/0x130
 [<ffffffff81056cc5>] ? gup_pte_range+0xe5/0x130
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff811d941c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff811d60cb>] ? bio_alloc_bioset+0x5b/0xf0
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811daabd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811db127>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811d73a7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff8113028b>] generic_file_aio_read+0x6bb/0x700
 [<ffffffff8143877f>] ? md_ioctl+0x31f/0x1ac0
 [<ffffffff811d7f00>] ? blkdev_open+0x0/0xc0
 [<ffffffff81196d47>] ? __dentry_open+0x257/0x380
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811d66bc>] ? block_ioctl+0x3c/0x40
 [<ffffffff811af522>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811af6c4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task md128_reshape:2428 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md128_reshape D 0000000000000001     0  2428      2 0x00000000
 ffff88081fdcfa60 0000000000000046 0000000000000000 0000000000000086
 ffff88081fdcf9f0 ffffffff810633a3 0000013d983220e0 ffff88081f96d440
 ffff88081dc65ed0 0000000100103f9f ffff88081ea1e5f8 ffff88081fdcffd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa02447c0>] reshape_request+0x440/0xa20 [raid456]
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa02450b2>] sync_request+0x312/0x3a0 [raid456]
 [<ffffffff81430ff7>] md_do_sync+0x6c7/0xd60
 [<ffffffff81431ae5>] md_thread+0x115/0x150
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff814319d0>] ? md_thread+0x0/0x150
 [<ffffffff810a640e>] kthread+0x9e/0xc0
 [<ffffffff8100c28a>] child_rip+0xa/0x20
 [<ffffffff810a6370>] ? kthread+0x0/0xc0
 [<ffffffff8100c280>] ? child_rip+0x0/0x20
INFO: task blkid:2430 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blkid         D 0000000000000002     0  2430      1 0x00000000
 ffff8808211d36f8 0000000000000082 0000000000000000 0000000000000082
 ffff8808211d3688 ffffffff810633a3 0000013d9831af63 ffff88081f96d440
 ffff88081dc65ed0 0000000100104045 ffff88081e615ad8 ffff8808211d3fd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff8142af00>] ? try_module_get+0x30/0xb0
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff81340b4b>] ? mix_pool_bytes_extract+0x16b/0x180
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81151319>] ? zone_statistics+0x99/0xc0
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff81130ba5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81130d43>] ? mempool_alloc+0x63/0x140
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811cfc8d>] submit_bh+0x11d/0x1f0
 [<ffffffff811d2a9c>] block_read_full_page+0x27c/0x3a0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811513ee>] ? __inc_zone_page_state+0x2e/0x30
 [<ffffffff81145480>] ? __lru_cache_add+0x40/0x90
 [<ffffffff811d7498>] blkdev_readpage+0x18/0x20
 [<ffffffff811441aa>] __do_page_cache_readahead+0x20a/0x210
 [<ffffffff81144251>] force_page_cache_readahead+0x71/0xa0
 [<ffffffff81144773>] page_cache_sync_readahead+0x43/0x50
 [<ffffffff81130128>] generic_file_aio_read+0x558/0x700
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8115f3d0>] ? unmap_region+0x110/0x130
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task mdadm:2481 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm         D 0000000000000003     0  2481   2459 0x00000004
 ffff8808226ef688 0000000000000086 0000000000000000 0000000000000086
 ffff8808226ef618 ffffffff810633a3 0000014fb75098d5 ffff88081f96d440
 ffff88081dc65ed0 0000000100117098 ffff88081e7f7068 ffff8808226effd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff8117ec7c>] ? transfer_objects+0x5c/0x80
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8117f66e>] ? cache_alloc_refill+0x9e/0x240
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff811d941c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81151319>] ? zone_statistics+0x99/0xc0
 [<ffffffff811d60cb>] ? bio_alloc_bioset+0x5b/0xf0
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811daabd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811db127>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811d73a7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff8113028b>] generic_file_aio_read+0x6bb/0x700
 [<ffffffff8143877f>] ? md_ioctl+0x31f/0x1ac0
 [<ffffffff811d7f00>] ? blkdev_open+0x0/0xc0
 [<ffffffff81196d47>] ? __dentry_open+0x257/0x380
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811d66bc>] ? block_ioctl+0x3c/0x40
 [<ffffffff811af522>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811af6c4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task pvscan:2518 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
pvscan        D 0000000000000003     0  2518   2505 0x00000000
 ffff8808205e3688 0000000000000086 0000000000000000 0000000000000086
 ffff8808205e3618 ffffffff810633a3 000001550b5d4884 ffff88081f96d440
 ffff88081dc65ed0 000000010011ca13 ffff88081ddf5ad8 ffff8808205e3fd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff812ab7de>] ? __sg_alloc_table+0x7e/0x130
 [<ffffffff81056cc5>] ? gup_pte_range+0xe5/0x130
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff811d941c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff811d60cb>] ? bio_alloc_bioset+0x5b/0xf0
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811daabd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811db127>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811d73a7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff8113028b>] generic_file_aio_read+0x6bb/0x700
 [<ffffffff811d7ef0>] ? blkdev_get+0x10/0x20
 [<ffffffff811d7f00>] ? blkdev_open+0x0/0xc0
 [<ffffffff81196d47>] ? __dentry_open+0x257/0x380
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff81397a3f>] ? scsi_device_put+0x2f/0x40
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811d66bc>] ? block_ioctl+0x3c/0x40
 [<ffffffff811af522>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811af6c4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task mdadm:2357 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm         D 0000000000000002     0  2357   2179 0x00000004
 ffff880821773688 0000000000000082 0000000000000000 0000000000000086
 ffff880821773618 ffffffff810633a3 0000013e9de36948 ffff88081f96d440
 ffff88081dc65ed0 000000010010516a ffff88081d713068 ffff880821773fd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff81130d00>] ? mempool_alloc+0x20/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810a6bce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02455ef>] make_request+0x19f/0xcb0 [raid456]
 [<ffffffff812ab7de>] ? __sg_alloc_table+0x7e/0x130
 [<ffffffff81056cc5>] ? gup_pte_range+0xe5/0x130
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8143066f>] md_make_request+0xdf/0x240
 [<ffffffff8127ddb0>] generic_make_request+0x240/0x5a0
 [<ffffffff811d941c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff811d60cb>] ? bio_alloc_bioset+0x5b/0xf0
 [<ffffffff8127e180>] submit_bio+0x70/0x120
 [<ffffffff811daabd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811db127>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811d73a7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811d6320>] ? blkdev_get_block+0x0/0x20
 [<ffffffff8113028b>] generic_file_aio_read+0x6bb/0x700
 [<ffffffff8143877f>] ? md_ioctl+0x31f/0x1ac0
 [<ffffffff811d7f00>] ? blkdev_open+0x0/0xc0
 [<ffffffff81196d47>] ? __dentry_open+0x257/0x380
 [<ffffffff811d6891>] blkdev_aio_read+0x51/0x80
 [<ffffffff81199a6a>] do_sync_read+0xfa/0x140
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811d66bc>] ? block_ioctl+0x3c/0x40
 [<ffffffff811af522>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811af6c4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff8123aa06>] ? security_file_permission+0x16/0x20
 [<ffffffff8119a365>] vfs_read+0xb5/0x1a0
 [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a6b1>] sys_read+0x51/0xb0
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task md128_reshape:2428 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.4.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md128_reshape D 0000000000000001     0  2428      2 0x00000000
 ffff88081fdcfa60 0000000000000046 0000000000000000 0000000000000086
 ffff88081fdcf9f0 ffffffff810633a3 0000013d983220e0 ffff88081f96d440
 ffff88081dc65ed0 0000000100103f9f ffff88081ea1e5f8 ffff88081fdcffd8
Call Trace:
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa0240805>] get_active_stripe+0x2d5/0x880 [raid456]
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa02447c0>] reshape_request+0x440/0xa20 [raid456]
 [<ffffffff810633a3>] ? __wake_up+0x53/0x70
 [<ffffffffa02450b2>] sync_request+0x312/0x3a0 [raid456]
 [<ffffffff81430ff7>] md_do_sync+0x6c7/0xd60
 [<ffffffff81431ae5>] md_thread+0x115/0x150
 [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff814319d0>] ? md_thread+0x0/0x150
 [<ffffffff810a640e>] kthread+0x9e/0xc0
 [<ffffffff8100c28a>] child_rip+0xa/0x20
 [<ffffffff810a6370>] ? kthread+0x0/0xc0
 [<ffffffff8100c280>] ? child_rip+0x0/0x20
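Every trace above is waiting in get_active_stripe in the raid456 module, and each task is in state D (uninterruptible sleep). From a second SSH session, the blocked processes can be listed without touching the array itself. This is a small sketch under the assumption that `ps` accepts the `-eo pid,stat,comm` format; `blocked_tasks` is a hypothetical helper name.

```shell
# Reduce `ps -eo pid,stat,comm` output (read from stdin) to the tasks whose
# state starts with "D", i.e. uninterruptible sleep. Prints "<pid> <command>".
blocked_tasks() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $NF }'   # NR > 1 skips the header row
}

# Example use:
#   ps -eo pid,stat,comm | blocked_tasks
```

Tasks in this state cannot be killed, which matches the behaviour described here: the only way out is a reboot.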

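Now pvscan and any other command that touches the md arrays hangs. Essentially the only thing I can do is reboot to get back to a state where I can control the arrays. The reboot also hangs right at the end, at which point I have to do a hard reset.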

Does anyone have any ideas on how to fix this? I tried booting from an Ubuntu live CD, but the whole system froze when I tried to assemble the array.

Answer 1

In the end I was able to recover this array by running the following command:

mdadm --create /dev/md126 --level=6 --raid-devices=14 --name=gigantor:128 /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdo /dev/sdm missing missing /dev/sdc /dev/sdb /dev/sdd /dev/sde --assume-clean
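A note for anyone repeating this: `mdadm --create` writes new superblocks, so with `--assume-clean` the data only lines up if the device order, chunk size, and data offset match the old array. It seems prudent to save the `mdadm --examine` output for every member before recreating anything. The sketch below is not from the original answer; `md_field` is a hypothetical helper that pulls one `Name : value` field out of saved examine output.

```shell
# Print the value of a single "Name : value" field from `mdadm --examine`
# output read on stdin, for example "Device Role" or "Data Offset".
md_field() {
    awk -F' : ' -v name="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the field name
        }
        key == name { print $2; exit }
    '
}

# Example use (device names are placeholders for your own members):
#   for d in /dev/sdb /dev/sdc; do
#       mdadm --examine "$d" > "/root/examine-$(basename "$d").txt"
#       printf '%s: %s\n' "$d" "$(md_field 'Device Role' < "/root/examine-$(basename "$d").txt")"
#   done
```

Sorting the members by their reported role gives the order to pass to `--create`, with `missing` in the slots of any device that should be left out.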

Then I re-added the two devices:

mdadm --add /dev/md126 /dev/sdn
mdadm --add /dev/md126 /dev/sdl

After that my LVM metadata was also corrupted, so I had to restore it:

pvcreate --uuid "kvEA4X-vobg-2Ipz-ITF1-ZhtW-Ewej-6liKVx" --restorefile /etc/lvm/backup/vg_data /dev/md126
vgcfgrestore vg_data
lvchange -ay /dev/vg_data/lvm0
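The UUID passed to `pvcreate --uuid` has to be the one LVM recorded for that physical volume, and it can be read out of the metadata backup file. A rough sketch, assuming the usual text layout of files under /etc/lvm/backup (a `physical_volumes` section containing `id = "..."` and `device = "..."` lines); `lvm_pv_ids` is a hypothetical helper name.

```shell
# List "<pv-uuid> <device-hint>" for every physical volume recorded in an
# LVM metadata backup file given as the first argument.
lvm_pv_ids() {
    awk '
        /physical_volumes[ \t]*[{]/ { in_pvs = 1; next }   # enter the PV section
        /logical_volumes[ \t]*[{]/  { in_pvs = 0 }         # leave it again
        in_pvs && $1 == "id"     { gsub(/"/, "", $3); id = $3 }
        in_pvs && $1 == "device" { gsub(/"/, "", $3); print id, $3 }
    ' "$1"
}

# Example use:
#   lvm_pv_ids /etc/lvm/backup/vg_data
```

The device name in the backup is only a hint recorded at backup time, so it may not match the current md device number; the UUID is the part that matters.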

Then I tried to run xfs_check on the logical volume, and it told me I needed to run a repair and zero the log:

xfs_repair -L /dev/mapper/vg_data-lvm0

After this repair I was able to mount the logical volume, and my data was intact.

The array under my LVM is now rebuilding:

Personalities : [raid6] [raid5] [raid4]
md126 : active raid6 sdm[14] sdl[7] sdk[15] sdn[6] sdj[5] sdi[4] sdg[2] sde[0] sdf[1] sdh[3] sdc[12] sda[11] sdd[13] sdb[10]
      46882646016 blocks super 1.2 level 6, 512k chunk, algorithm 2 [14/12] [UUUUUUUU__UUUU]
      [==========>..........]  recovery = 53.6% (2097061864/3906887168) finish=2448.0min speed=12321K/sec
      bitmap: 0/30 pages [0KB], 65536KB chunk

md127 : active raid6 sdac[15] sdaa[13] sdz[2] sdab[0] sdq[9] sdy[14] sdx[4] sdr[8] sdp[10] sdo[16] sdv[6] sdu[7] sdw[5] sds[12] sdt[11]
      50789533184 blocks super 1.2 level 6, 512k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]

unused devices: <none>

What a pain....
