Expand a software RAID1 array to include 2 mirrored drives, or convert it to RAID10

I had a 4-drive eSATA enclosure connected to a Fedora 31 server, with three 1.5 TB drives and one 2 TB drive. I created a RAID1 following this excellent tecmint tutorial, using --raid-devices=4. Well, that does not automatically create a mirrored pair of 2-drive sets. It shows that only 1.4 TB is available. From df -h:

/dev/md0                          1.4T  425G  880G  33% /esata

Then:

lsblk
NAME                     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                        8:0    0   1.4T  0 disk
└─sda1                     8:1    0   1.4T  0 part
  └─md0                    9:0    0   1.4T  0 raid1
sdb                        8:16   0   1.4T  0 disk
└─sdb1                     8:17   0   1.4T  0 part
  └─md0                    9:0    0   1.4T  0 raid1
sdd                        8:48   0   1.4T  0 disk
└─sdd1                     8:49   0   1.4T  0 part
  └─md0                    9:0    0   1.4T  0 raid1
sde                        8:64   0   4.9T  0 disk
├─sde1                     8:65   0     2M  0 part
├─sde2                     8:66   0   476M  0 part  /boot
└─sde3                     8:67   0   3.3T  0 part
sdf                        8:80   0  59.8G  0 disk
└─sdf1                     8:81   0  59.8G  0 part
sdg                        8:96   0   1.8T  0 disk
└─sdg1                     8:97   0   1.8T  0 part
  └─md0                    9:0    0   1.4T  0 raid1
sr0                       11:0    1  1024M  0 rom

And:

cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdg1[3] sdd1[2] sdb1[1]
      1465005464 blocks super 1.2 [4/4] [UUUU]
      bitmap: 0/11 pages [0KB], 65536KB chunk

unused devices: <none>
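
For reference, the 1.4T figure is what a 4-device RAID1 is expected to report: every member holds a full copy, so the array is only as large as the smallest member. A minimal sketch of the arithmetic, using the nominal drive sizes in GB from the description above and assuming md's default RAID10 layout with 2 copies of each block (the function names are only illustrative):

```shell
#!/bin/sh
# Sizes are whole gigabytes; every member is truncated to the smallest one.

# Print the smallest of the given sizes.
smallest() {
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo "$min"
}

# RAID1: each member is a full copy, so capacity equals the smallest member.
raid1_capacity() {
    smallest "$@"
}

# RAID10 with 2 copies of each block (assumed default "near 2" layout):
# capacity is (number of members * smallest member) / 2.
raid10_capacity() {
    echo $(( $# * $(smallest "$@") / 2 ))
}

raid1_capacity 1500 1500 1500 2000    # 4-way mirror
raid10_capacity 1500 1500 1500 2000   # striped mirrors
```

With these inputs the 4-way mirror comes out at 1500 GB and the RAID10 at 3000 GB, which is the difference I am after.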

And:

mdadm -E /dev/sd[a-b]1 /dev/sdg1 /dev/sdd1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
           Name : ourserver:0  (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
     Raid Level : raid1
   Raid Devices : 4

 Avail Dev Size : 2930010928 (1397.14 GiB 1500.17 GB)
     Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=48 sectors
          State : clean
    Device UUID : 7df3d233:060aaac3:04eb9f3a:65a9119e

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : bbb40149 - correct
         Events : 20558


   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
           Name : ourserver:0  (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
     Raid Level : raid1
   Raid Devices : 4

 Avail Dev Size : 2930010928 (1397.14 GiB 1500.17 GB)
     Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=48 sectors
          State : clean
    Device UUID : 434684bb:d297cd17:f5391b7b:0d73e9d7

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 11dbfa76 - correct
         Events : 20558


   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
           Name : ourserver:0  (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
     Raid Level : raid1
   Raid Devices : 4

 Avail Dev Size : 3906762928 (1862.89 GiB 2000.26 GB)
     Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=976752048 sectors
          State : clean
    Device UUID : 45a47922:251b01e7:a920b5ef:aec34c43

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 623a20a2 - correct
         Events : 20558


   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
           Name : ourserver:0  (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
     Raid Level : raid1
   Raid Devices : 4

 Avail Dev Size : 2930012909 (1397.14 GiB 1500.17 GB)
     Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=2029 sectors
          State : clean
    Device UUID : 9f705e06:0b9a6d1a:fe4a0368:8a279a1a

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 8eeef44d - correct
         Events : 20558


   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
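
To read the sector counts above more easily, here is a small sketch that converts them to GiB. It assumes the 512-byte sector unit that mdadm -E uses for these fields; the helper name is made up for illustration.

```shell
#!/bin/sh
# Convert a count of 512-byte sectors (the unit printed by mdadm -E) to GiB.
sectors_to_gib() {
    awk -v sectors="$1" 'BEGIN { printf "%.2f\n", sectors * 512 / (1024 * 1024 * 1024) }'
}

sectors_to_gib 2930010880   # Used Dev Size on every member
sectors_to_gib 976752048    # "after" Unused Space on /dev/sdg1
```

This gives 1397.14 GiB used per member and 465.75 GiB left unused on the 2 TB drive, which matches the gap between the two Avail Dev Size values. The other three members have almost nothing left over, so there is no room for the component size to grow.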

Then I saw a user on Server Fault, and another on SE, recommend using mdadm --assemble --update=devicesize /dev/md0, which I ran, followed by mdadm -G /dev/md0 -z max, but the size stays the same:

mdadm --assemble --update=devicesize /dev/md0 /dev/sd[a-b]1 /dev/sdg1 /dev/sdd1
mdadm: /dev/md0 has been started with 4 drives.
mdadm: component size of /dev/md0 unchanged at 1465005464K

How would I adapt this SF post on growing a RAID 1 into a RAID 10, or simply get a mirrored array made up of 2-drive sets?

Answer 1

I solved this thanks to the excellent article by software engineer Jean-Christophe Berthon. It would have saved even more time if I had simply deleted the huge backup I had made of the directories and files.
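
For anyone landing here, this is a rough sketch of one common way to move from a 4-way RAID1 to a RAID10 by building a degraded RAID10 from two freed disks. It is not necessarily the exact sequence from the article. The script only prints the commands (a dry run); the device names come from the question, while /dev/md1, /mnt/new and the ext4 filesystem are assumptions for illustration.

```shell
#!/bin/sh
# Dry run only: every step is printed, nothing is executed.
# /dev/md1, /mnt/new and ext4 are assumptions, not taken from the original setup.
OLD=/dev/md0
NEW=/dev/md1

step() {
    echo "+ $*"
}

plan_raid10_migration() {
    # 1. Pull two members out of the 4-way mirror; the data stays on sda1 and sdb1.
    step mdadm "$OLD" --fail /dev/sdd1 /dev/sdg1
    step mdadm "$OLD" --remove /dev/sdd1 /dev/sdg1
    step mdadm --grow "$OLD" --raid-devices=2
    # 2. Wipe the old RAID1 metadata from the freed partitions.
    step mdadm --zero-superblock /dev/sdd1 /dev/sdg1
    # 3. Create a degraded RAID10: each mirror pair gets one real disk and one "missing" slot.
    step mdadm --create "$NEW" --level=10 --raid-devices=4 /dev/sdd1 missing /dev/sdg1 missing
    # 4. Make a filesystem and copy the data across.
    step mkfs.ext4 "$NEW"
    step mount "$NEW" /mnt/new
    step rsync -aHAX /esata/ /mnt/new/
    # 5. Retire the old array and give its disks to the RAID10 to fill the missing slots.
    step umount /esata
    step mdadm --stop "$OLD"
    step mdadm --zero-superblock /dev/sda1 /dev/sdb1
    step mdadm "$NEW" --add /dev/sda1 /dev/sdb1
}

plan_raid10_migration
```

Until the last step finishes resyncing, the new array has no redundancy, so a verified backup is still worth having before trying anything like this.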

Even though the RAID10 looks healthy, once a day I see the logs below, which I assume mean I should replace sdd:

Mar 15 06:12:57 ourserver kernel: ata18.00: failed command: READ DMA EXT
Mar 15 06:12:57 ourserver kernel: ata18.00: cmd 25/00:80:22:ba:c4/00:00:ab:00:00/e0 tag 31 dma 65536 in#012         res 51/40:00:6d:ba:c4/00:00:ab:00:00/00 Emask 0x9 (media error)
Mar 15 06:12:57 ourserver kernel: ata18.00: status: { DRDY ERR }
Mar 15 06:12:57 ourserver kernel: ata18.00: error: { UNC }
Mar 15 06:12:57 ourserver kernel: ata18.00: configured for UDMA/133
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=2s
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 Sense Key : Medium Error [current]
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 CDB: Read(10) 28 00 ab c4 ba 22 00 00 80 00
Mar 15 06:12:57 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881796717 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
Mar 15 06:12:57 ourserver kernel: ata18: EH complete
Mar 15 06:13:00 ourserver kernel: ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 15 06:13:00 ourserver kernel: ata18.00: irq_stat 0x40000001
Mar 15 06:13:00 ourserver kernel: ata18.00: failed command: READ DMA EXT
Mar 15 06:13:00 ourserver kernel: ata18.00: cmd 25/00:00:a2:ba:c4/00:09:ab:00:00/e0 tag 0 dma 1179648 in#012         res 51/40:00:41:bd:c4/00:00:ab:00:00/00 Emask 0x9 (media error)
Mar 15 06:13:00 ourserver kernel: ata18.00: status: { DRDY ERR }
Mar 15 06:13:00 ourserver kernel: ata18.00: error: { UNC }
Mar 15 06:13:01 ourserver kernel: ata18.00: configured for UDMA/133
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 Sense Key : Medium Error [current]
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 ab c4 ba a2 00 09 00 00
Mar 15 06:13:01 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881797441 op 0x0:(READ) flags 0x0 phys_seg 86 prio class 0
Mar 15 06:13:01 ourserver kernel: ata18: EH complete
Mar 15 06:13:04 ourserver kernel: ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 15 06:13:04 ourserver kernel: ata18.00: irq_stat 0x40000001
Mar 15 06:13:04 ourserver kernel: ata18.00: failed command: READ DMA EXT
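
To see whether the errors keep hitting the same area of the disk, the failing sector numbers can be pulled out of the kernel log. A minimal sketch, assuming the "I/O error, dev X, sector N" message format shown above; the function name is made up for illustration:

```shell
#!/bin/sh
# Read kernel log lines on stdin and print each distinct failing sector for one disk.
# Possible usage on a systemd machine: journalctl -k | failed_sectors sdd
failed_sectors() {
    sed -n "s/.*I\/O error, dev $1, sector \([0-9][0-9]*\).*/\1/p" | sort -n -u
}

# Sample run on two of the log lines above.
failed_sectors sdd <<'EOF'
Mar 15 06:12:57 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881796717 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
Mar 15 06:13:01 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881797441 op 0x0:(READ) flags 0x0 phys_seg 86 prio class 0
EOF
```

Both reported sectors sit within a few hundred sectors of each other, which points at a localized bad region on sdd rather than a cabling problem.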

And smartctl shows this:

Error 45 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 80 ff ff ff ef 00   1d+07:38:55.963  READ DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:55.906  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00   1d+07:38:55.905  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   1d+07:38:55.892  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00   1d+07:38:55.830  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 44 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   1d+07:38:52.146  READ DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:52.143  WRITE DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:52.142  WRITE DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:52.140  WRITE DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:52.112  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 43 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 80 ff ff ff ef 00   1d+07:38:49.220  READ DMA EXT
  ea 00 00 00 00 00 a0 00   1d+07:38:49.163  FLUSH CACHE EXT
  ca 00 01 2a 00 00 e0 00   1d+07:38:49.163  WRITE DMA
  ea 00 00 00 00 00 a0 00   1d+07:38:49.162  FLUSH CACHE EXT
  ea 00 00 00 00 00 a0 00   1d+07:38:49.136  FLUSH CACHE EXT

Error 42 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   1d+07:38:46.103  READ DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:46.075  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00   1d+07:38:46.074  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   1d+07:38:46.060  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00   1d+07:38:46.033  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 41 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 80 ff ff ff ef 00   1d+07:38:42.036  READ DMA EXT
  35 00 00 ff ff ff ef 00   1d+07:38:42.032  WRITE DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:42.025  WRITE DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:41.997  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00   1d+07:38:41.996  IDENTIFY DEVICE

I am also seeing this:

smartctl -A /dev/sdd1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.8-200.fc31.x86_64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   101   099   006    Pre-fail  Always       -       203989872
  3 Spin_Up_Time            0x0003   099   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       16
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       23
  7 Seek_Error_Rate         0x000f   070   060   030    Pre-fail  Always       -       12419382
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3774
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       10
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       101
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       65537
189 High_Fly_Writes         0x003a   070   070   000    Old_age   Always       -       30
190 Airflow_Temperature_Cel 0x0022   072   060   045    Old_age   Always       -       28 (Min/Max 28/31)
194 Temperature_Celsius     0x0022   028   040   000    Old_age   Always       -       28 (0 19 0 0 0)
195 Hardware_ECC_Recovered  0x001a   044   006   000    Old_age   Always       -       203989872
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       105
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       105
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       3774 (62 166 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       201703664
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1542427917
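
The attributes I am watching are the ones commonly treated as warning signs when their raw value is above zero: reallocated, reported uncorrectable, pending, and offline uncorrectable sectors. A small sketch that filters them out of smartctl -A output, assuming the 10-column table layout shown above (the function name and the choice of IDs are my own):

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print the watched attributes
# whose raw value is above zero, as NAME=RAW.
# Watched IDs: 5 (reallocated), 187 (reported uncorrectable),
# 197 (pending), 198 (offline uncorrectable).
# Possible usage: smartctl -A /dev/sdd | smart_warnings
smart_warnings() {
    awk '$1 ~ /^(5|187|197|198)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Sample run on three rows from the table above.
smart_warnings <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       23
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       105
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
EOF
```

On this drive all four watched attributes are non-zero (23, 101, 105 and 105), while the CRC error count is 0, so the cable looks fine and the disk itself is the likely problem.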
