RAID 5 with 4 disks fails to operate with 1 failed disk?

I found a question about mdadm spare disks that almost answers my question, but it is not clear to me what is happening.

We have a RAID5 set up with 4 disks - all of them are marked active/sync during normal operation:

    Update Time : Sun Sep 29 03:44:01 2013
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

Number   Major   Minor   RaidDevice State
   0     202       32        0      active sync   /dev/sdc
   1     202       48        1      active sync   /dev/sdd
   2     202       64        2      active sync   /dev/sde 
   4     202       80        3      active sync   /dev/sdf

But when one of the disks failed, the RAID stopped working:

    Update Time : Sun Sep 29 01:00:01 2013
          State : clean, FAILED 
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1

Number   Major   Minor   RaidDevice State
   0     202       32        0      active sync   /dev/sdc
   1     202       48        1      active sync   /dev/sdd
   2       0        0        2      removed
   3       0        0        3      removed

   2     202       64        -      faulty spare   /dev/sde
   4     202       80        -      spare   /dev/sdf

What exactly is going on here?

The fix was to re-install the RAID - fortunately I could do that. Next time it will probably hold some important data. I need to understand this so that I can have a RAID that does not fail because of a single disk failure.

I realize I did not list what I expected versus what actually happened.

I expected a RAID5 with 3 good disks and 1 bad one to operate in degraded mode - 3 active/sync and 1 faulty.

What happened was that a spare was created out of nowhere and declared faulty - then another spare was created out of nowhere and declared sound - after which the RAID was declared inoperable.

This is the output of blkid:

$ blkid
/dev/xvda1: LABEL="/" UUID="4797c72d-85bd-421a-9c01-52243aa28f6c" TYPE="ext4" 
/dev/xvdc: UUID="feb2c515-6003-478b-beb0-089fed71b33f" TYPE="ext3" 
/dev/xvdd: UUID="feb2c515-6003-478b-beb0-089fed71b33f" SEC_TYPE="ext2" TYPE="ext3" 
/dev/xvde: UUID="feb2c515-6003-478b-beb0-089fed71b33f" SEC_TYPE="ext2" TYPE="ext3" 
/dev/xvdf: UUID="feb2c515-6003-478b-beb0-089fed71b33f" SEC_TYPE="ext2" TYPE="ext3" 

The TYPE and SEC_TYPE are interesting, because the RAID has XFS on it, not ext3....

The log from a mount attempt on this disk (which, like every other mount attempt, led to the end result listed earlier) has the following entries:

Oct  2 15:08:51 it kernel: [1686185.573233] md/raid:md0: device xvdc operational as raid disk 0
Oct  2 15:08:51 it kernel: [1686185.580020] md/raid:md0: device xvde operational as raid disk 2
Oct  2 15:08:51 it kernel: [1686185.588307] md/raid:md0: device xvdd operational as raid disk 1
Oct  2 15:08:51 it kernel: [1686185.595745] md/raid:md0: allocated 4312kB
Oct  2 15:08:51 it kernel: [1686185.600729] md/raid:md0: raid level 5 active with 3 out of 4 devices, algorithm 2
Oct  2 15:08:51 it kernel: [1686185.608928] md0: detected capacity change from 0 to 2705221484544
Oct  2 15:08:51 it kernel: [1686185.615772] md: recovery of RAID array md0
Oct  2 15:08:51 it kernel: [1686185.621150] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Oct  2 15:08:51 it kernel: [1686185.627626] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Oct  2 15:08:51 it kernel: [1686185.634024]  md0: unknown partition table
Oct  2 15:08:51 it kernel: [1686185.645882] md: using 128k window, over a total of 880605952k.
Oct  2 15:22:25 it kernel: [1686999.697076] XFS (md0): Mounting Filesystem
Oct  2 15:22:26 it kernel: [1686999.889961] XFS (md0): Ending clean mount
Oct  2 15:24:19 it kernel: [1687112.817845] end_request: I/O error, dev xvde, sector 881423360
Oct  2 15:24:19 it kernel: [1687112.820517] raid5_end_read_request: 1 callbacks suppressed
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423360 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: Disk failure on xvde, disabling device.
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: Operation continuing on 2 devices.
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423368 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423376 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423384 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423392 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423400 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423408 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423416 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423424 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423432 on xvde).
Oct  2 15:24:19 it kernel: [1687113.432129] md: md0: recovery done.
Oct  2 15:24:19 it kernel: [1687113.685151] Buffer I/O error on device md0, logical block 96
Oct  2 15:24:19 it kernel: [1687113.691386] Buffer I/O error on device md0, logical block 96
Oct  2 15:24:19 it kernel: [1687113.697529] Buffer I/O error on device md0, logical block 64
Oct  2 15:24:20 it kernel: [1687113.703589] Buffer I/O error on device md0, logical block 64
Oct  2 15:25:51 it kernel: [1687205.682022] Buffer I/O error on device md0, logical block 96
Oct  2 15:25:51 it kernel: [1687205.688477] Buffer I/O error on device md0, logical block 96
Oct  2 15:25:51 it kernel: [1687205.694591] Buffer I/O error on device md0, logical block 64
Oct  2 15:25:52 it kernel: [1687205.700728] Buffer I/O error on device md0, logical block 64
Oct  2 15:25:52 it kernel: [1687205.748751] XFS (md0): last sector read failed

I don't see xvdf listed there...

Answer 1

This is a fundamental problem with RAID5: bad blocks during a rebuild are a killer.

Oct  2 15:08:51 it kernel: [1686185.573233] md/raid:md0: device xvdc operational as raid disk 0
Oct  2 15:08:51 it kernel: [1686185.580020] md/raid:md0: device xvde operational as raid disk 2
Oct  2 15:08:51 it kernel: [1686185.588307] md/raid:md0: device xvdd operational as raid disk 1
Oct  2 15:08:51 it kernel: [1686185.595745] md/raid:md0: allocated 4312kB
Oct  2 15:08:51 it kernel: [1686185.600729] md/raid:md0: raid level 5 active with 3 out of 4 devices, algorithm 2
Oct  2 15:08:51 it kernel: [1686185.608928] md0: detected capacity change from 0 to 2705221484544

The array has been assembled, degraded. It was assembled with xvdc, xvde, and xvdd. Apparently, there is a hot spare:

Oct  2 15:08:51 it kernel: [1686185.615772] md: recovery of RAID array md0
Oct  2 15:08:51 it kernel: [1686185.621150] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Oct  2 15:08:51 it kernel: [1686185.627626] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Oct  2 15:08:51 it kernel: [1686185.634024]  md0: unknown partition table
Oct  2 15:08:51 it kernel: [1686185.645882] md: using 128k window, over a total of 880605952k.

The "partition table" message is unrelated. The other messages are telling you that md is attempting a recovery, probably onto a hot spare (which could be the previously failed device, if you tried to remove/re-add it).
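
To see what md is actually recovering onto while this is happening, you can inspect the array state directly. A minimal sketch, assuming the array is /dev/md0 (adjust the device name to your system):

# Per-device roles (active sync / spare / faulty) and any "Rebuild Status" line
mdadm --detail /dev/md0

# Compact per-array view, including a progress bar for a running recovery
cat /proc/mdstat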


Oct  2 15:24:19 it kernel: [1687112.817845] end_request: I/O error, dev xvde, sector 881423360
Oct  2 15:24:19 it kernel: [1687112.820517] raid5_end_read_request: 1 callbacks suppressed
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423360 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: Disk failure on xvde, disabling device.
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: Operation continuing on 2 devices.

Here md tries to read a sector from xvde (one of the three remaining devices). That fails [probably a bad sector], and md (since the array is degraded) cannot recover it. It therefore kicks the disk out of the array, and with a double-disk failure your RAID5 is dead.

I'm not sure why it's labeled a spare - that's weird (though I guess I normally look at /proc/mdstat, so maybe that's just how mdadm labels it). Also, I thought newer kernels were much more hesitant about kicking disks for bad blocks - maybe you're running an older kernel?
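
To rule the old-kernel theory in or out, it helps to note which kernel and mdadm versions are in use; a quick sketch, again assuming the array is /dev/md0:

# Kernel release - md's bad-block handling has changed between releases
uname -r

# mdadm userspace version and the array's metadata version
mdadm --version
mdadm --detail /dev/md0 | grep -i version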

What can you do about this?

Good backups. That is always an important part of any strategy to keep data alive.

Make sure the array is regularly scrubbed for bad blocks. Your OS may already include a cron job for this; if not, a cron sketch is included after the command below. You do this by echoing either repair or check to /sys/block/md0/md/sync_action. "repair" will also repair any discovered parity errors (e.g., the parity bit not matching the data on the disks).

# echo repair > /sys/block/md0/md/sync_action
#
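
If your distribution does not already ship a scrub job, a minimal cron sketch along these lines would do - the file name and schedule are illustrative, and md0 is assumed:

# /etc/cron.d/md-scrub (illustrative): kick off a check on the 1st of every month
# m h dom mon dow user command
30 2 1 * * root echo check > /sys/block/md0/md/sync_action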

The progress can be watched with cat /proc/mdstat, or with the various files in that sysfs directory. (You can find somewhat up-to-date documentation in the Linux Raid Wiki mdstat article.)
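
For instance, two sysfs files that are useful while a scrub or rebuild is in flight (again assuming md0):

# Sectors completed vs. total for the current check/repair/recovery
cat /sys/block/md0/md/sync_completed

# Number of mismatches found by the last check or repair pass
cat /sys/block/md0/md/mismatch_cnt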

Note: on older kernels (I'm not sure of the exact version), check may not fix bad blocks.

A final option is to switch to RAID6. That will require another disk (you can run a four- or even three-disk RAID6, but you probably don't want to). With new enough kernels, bad blocks are repaired on the fly whenever possible. RAID6 can survive two disk failures, so when one disk has failed it can still survive a bad block - it will map out the bad block and keep rebuilding.
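
As a sketch of what such a migration could look like with mdadm - the device name /dev/xvdg is hypothetical, a live RAID5-to-RAID6 reshape needs a backup file on storage outside the array, and you should verify that your mdadm and kernel versions support it before relying on this:

# Add a fifth disk, then reshape the 4-disk RAID5 into a 5-disk RAID6
mdadm --add /dev/md0 /dev/xvdg
mdadm --grow /dev/md0 --level=6 --raid-devices=5 \
      --backup-file=/root/md0-grow.backup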

Answer 2

I imagine you are creating your RAID5 array like this:

$ mdadm --create /dev/md0 --level=5 --raid-devices=4 \
       /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

That is not quite what you want. Instead, you need to add the disks like this:

$ mdadm --create /dev/md0 --level=5 --raid-devices=4 \
       /dev/sda1 /dev/sdb1 /dev/sdc1
$ mdadm --add /dev/md0 /dev/sdd1

Or you can use mdadm's spare-devices option to add the spare, like this:

$ mdadm --create /dev/md0 --level=5 --raid-devices=3 --spare-devices=1 \
       /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

The last drive in the list will be the spare.
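
Whichever way you create it, it is worth confirming which drive actually ended up as the spare before trusting the array. A quick check, assuming the array is /dev/md0:

# The device table at the end of the output should list three members as
# "active sync" and one as "spare"
mdadm --detail /dev/md0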

From the mdadm man page:

-n, --raid-devices=
      Specify the number of active devices in the array.  This, plus the 
      number of spare devices (see below) must  equal the  number  of  
      component-devices (including "missing" devices) that are listed on 
      the command line for --create. Setting a value of 1 is probably a 
      mistake and so requires that --force be specified first.  A  value 
      of  1  will then be allowed for linear, multipath, RAID0 and RAID1.  
      It is never allowed for RAID4, RAID5 or RAID6. This  number  can only 
      be changed using --grow for RAID1, RAID4, RAID5 and RAID6 arrays, and
      only on kernels which provide the necessary support.

-x, --spare-devices=
      Specify the number of spare (eXtra) devices in the initial array.  
      Spares can also be  added  and  removed  later. The  number  of component
      devices listed on the command line must equal the number of RAID devices 
      plus the number of spare devices.
