RAID 5 broken after replacing a disk

My server sent me an e-mail saying that one of my disks could not read a block. So I decided to replace it before it failed completely. I added a new disk and replaced the failing one:

sudo mdadm --manage /dev/md0 --add /dev/sdg1
sudo mdadm --manage /dev/md0 --replace /dev/sdb1 --with /dev/sdg1
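
While the --replace is running, the rebuild progress can be followed before doing anything else to the array; a minimal sketch (not from the original post, same /dev/md0 array as above):

# follow the rebuild/replace progress, refreshed every 5 seconds
watch -n 5 cat /proc/mdstat
# or query the array state and per-device roles directly
sudo mdadm --detail /dev/md0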

After the sync finished, I wanted to remove the failed /dev/sdb1 from the array:

sudo mdadm --manage /dev/md0 --remove /dev/sdb1
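
Before physically pulling a drive it also helps to map the kernel device name to the drive's serial number, so the right disk comes out of the case; a small sketch, not part of the original post:

# show device name, size, serial number and model for every block device
lsblk -o NAME,SIZE,SERIAL,MODEL
# the persistent by-id links encode model and serial in the symlink name
ls -l /dev/disk/by-id/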

But when I wanted to take the disk out of the case, I first pulled two other disks by mistake and put them back in right away. After that I checked whether my RAID was still working, but it was not. I tried rebooting, hoping it would heal itself. In the past this had never been a problem, but then again I had never replaced a disk before.

When that did not work, I looked up what to do and tried to re-add the disk, but that did not help, and assembling did not work either:

sudo mdadm --assemble --scan

Only 2 disks were detected, so I tried giving it the device names explicitly:

sudo mdadm -v -A /dev/md0 /dev/sda1 /dev/sdf1 /dev/sdc1 /dev/sdd1

but it told me all the disks were busy:

sudo mdadm -v -A /dev/md0 /dev/sda1 /dev/sdf1 /dev/sdc1 /dev/sdd1 
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda1 is busy - skipping
mdadm: /dev/sdf1 is busy - skipping
mdadm: /dev/sdc1 is busy - skipping
mdadm: /dev/sdd1 is busy - skipping
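
The "busy" messages usually mean the member partitions are still claimed by the half-assembled (inactive) array, so it has to be stopped before another assemble attempt; a minimal sketch, assuming the same devices:

# release the members from the inactive array
sudo mdadm --stop /dev/md0
# then retry the assemble
sudo mdadm -v -A /dev/md0 /dev/sda1 /dev/sdf1 /dev/sdc1 /dev/sdd1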

After a reboot, sdg1 became sdf1.

mdstat seems to detect the disks correctly (I plugged sdb1 back in hoping it would help, and tried with and without it):

cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdd1[3](S) sdb1[1](S) sdc1[2](S) sda1[0](S) sdf1[4](S)
      14650670080 blocks super 1.2
       
unused devices: <none>

If I examine the individual disks, /dev/sda1 and /dev/sdf1 show the same array state, AA..:

sudo mdadm --query --examine /dev/sda1 
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 7c3e9d4e:6bad2afa:85cd55b4:43e43f56
           Name : lianli:0  (local to host lianli)
  Creation Time : Sat Oct 29 18:52:27 2016
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 3e912563:b10b74d0:a49faf2d:e14db558

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Jan  9 10:06:33 2021
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : c7d96490 - correct
         Events : 303045

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
sudo mdadm --query --examine /dev/sdd1 
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 7c3e9d4e:6bad2afa:85cd55b4:43e43f56
           Name : lianli:0  (local to host lianli)
  Creation Time : Sat Oct 29 18:52:27 2016
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : bf303286:5889dc0c:a6a1824a:4fe1ae03

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Jan  9 10:05:58 2021
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : ef1f16fd - correct
         Events : 303036

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
sudo mdadm --query --examine /dev/sdc1 
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 7c3e9d4e:6bad2afa:85cd55b4:43e43f56
           Name : lianli:0  (local to host lianli)
  Creation Time : Sat Oct 29 18:52:27 2016
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : b29aba8f:f92c2b65:d155a3a8:40f41859

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Jan  9 10:04:33 2021
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 47feb45 - correct
         Events : 303013

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
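
The outputs above differ mainly in their Events counters and Array State lines; a quick way to compare those fields across all members in one pass (device names as above, not from the original post):

# print the device header, event counter and array state of every member
sudo mdadm --examine /dev/sd[abcdf]1 | grep -E '^/dev/|Events|Array State'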

I will keep trying, but right now I am out of ideas, and this is also the first time I have replaced a disk in a RAID. I hope someone can help me.

At least I still have a backup, but I would rather not wipe the disks only to find out that the backup does not work either...

Update: After adding all the disks to the assemble, I get:

sudo mdadm -v -A /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 1.
mdadm: added /dev/sdf1 to /dev/md0 as 1
mdadm: added /dev/sdc1 to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sdd1 to /dev/md0 as 3 (possibly out of date)
mdadm: added /dev/sda1 to /dev/md0 as 0
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.

Answer 1

I found a solution:

After some more research and the "possibly out of date" information I got in verbose mode (sudo mdadm -v -A /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1), I found this page: https://raid.wiki.kernel.org/index.php/RAID_Recovery

In the section "Trying to assemble using --force" they describe using --force if the event count difference is below 50. One of the disks was still somewhat out of date, but I hoped it would be brought back in sync from the information on the others. So I may have lost some data, but I learned that if I ever pull the wrong disks from the array again, I should wait for the array to get back in sync first...
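
For reference, the Events counters in the --examine output above were 303045 on sda1, 303036 on sdd1 (a difference of 9) and 303013 on sdc1 (a difference of 32), so all the differences were below the 50-event threshold mentioned on the wiki.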

The commands I used to get my RAID working again:

sudo mdadm --stop /dev/md0
sudo mdadm -v -A --force /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sdf1
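
After a forced assemble it is also worth letting md verify the parity once the array is running again; a sketch using the md sysfs interface (assuming the array is /dev/md0):

# start a background consistency check of the whole array
echo check | sudo tee /sys/block/md0/md/sync_action
# mismatch_cnt should stay at 0 once the check has finished
cat /sys/block/md0/md/mismatch_cnt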

Update: One drive had probably not been added, so the force assemble only brought the array back to a usable state without it. The device with the largest event count difference had to be added again afterwards with --re-add:

sudo mdadm --manage /dev/md0 --re-add /dev/sdc1
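
Because the members carry an internal write-intent bitmap (visible in the --examine output above), --re-add can usually catch the device up without a full rebuild; whether it is back in the array and recovering can be checked with:

# the re-added device should show up again and, if needed, recover
sudo mdadm --detail /dev/md0
cat /proc/mdstat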

Now my array is back in sync and I can try to remove the faulty hard disk again.
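
The last step, not shown in the answer above, would be to fail the bad disk out of the array and clear its superblock before taking it out of the case; a minimal sketch, assuming the failing disk is still /dev/sdb1:

# mark the disk as failed and remove it from the array
sudo mdadm --manage /dev/md0 --fail /dev/sdb1
sudo mdadm --manage /dev/md0 --remove /dev/sdb1
# wipe the md metadata so the disk is no longer recognised as a member
sudo mdadm --zero-superblock /dev/sdb1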
