I have a 4-bay eSATA enclosure attached to a Fedora 31 server, with three 1.5 TB drives and one 2 TB drive. I created a RAID1 following this excellent tecmint tutorial, using --raid-devices=4. That does not automatically create a mirrored set spanning 2 drives: all four drives mirror each other, so only 1.4 TB is available. From df -h:
/dev/md0 1.4T 425G 880G 33% /esata
Then:
lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda         8:0    0   1.4T  0 disk
└─sda1      8:1    0   1.4T  0 part
  └─md0     9:0    0   1.4T  0 raid1
sdb         8:16   0   1.4T  0 disk
└─sdb1      8:17   0   1.4T  0 part
  └─md0     9:0    0   1.4T  0 raid1
sdd         8:48   0   1.4T  0 disk
└─sdd1      8:49   0   1.4T  0 part
  └─md0     9:0    0   1.4T  0 raid1
sde         8:64   0   4.9T  0 disk
├─sde1      8:65   0     2M  0 part
├─sde2      8:66   0   476M  0 part  /boot
└─sde3      8:67   0   3.3T  0 part
sdf         8:80   0  59.8G  0 disk
└─sdf1      8:81   0  59.8G  0 part
sdg         8:96   0   1.8T  0 disk
└─sdg1      8:97   0   1.8T  0 part
  └─md0     9:0    0   1.4T  0 raid1
sr0        11:0    1  1024M  0 rom
And:
cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdg1[3] sdd1[2] sdb1[1]
1465005464 blocks super 1.2 [4/4] [UUUU]
bitmap: 0/11 pages [0KB], 65536KB chunk
unused devices: <none>
And:
mdadm -E /dev/sd[a-b]1 /dev/sdg1 /dev/sdd1
/dev/sda1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
Name : ourserver:0 (local to host ourserver)
Creation Time : Fri Mar 13 16:46:35 2020
Raid Level : raid1
Raid Devices : 4
Avail Dev Size : 2930010928 (1397.14 GiB 1500.17 GB)
Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=48 sectors
State : clean
Device UUID : 7df3d233:060aaac3:04eb9f3a:65a9119e
Internal Bitmap : 8 sectors from superblock
Update Time : Sat Mar 14 08:32:32 2020
Bad Block Log : 512 entries available at offset 16 sectors
Checksum : bbb40149 - correct
Events : 20558
Device Role : Active device 0
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
Name : ourserver:0 (local to host ourserver)
Creation Time : Fri Mar 13 16:46:35 2020
Raid Level : raid1
Raid Devices : 4
Avail Dev Size : 2930010928 (1397.14 GiB 1500.17 GB)
Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=48 sectors
State : clean
Device UUID : 434684bb:d297cd17:f5391b7b:0d73e9d7
Internal Bitmap : 8 sectors from superblock
Update Time : Sat Mar 14 08:32:32 2020
Bad Block Log : 512 entries available at offset 16 sectors
Checksum : 11dbfa76 - correct
Events : 20558
Device Role : Active device 1
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
Name : ourserver:0 (local to host ourserver)
Creation Time : Fri Mar 13 16:46:35 2020
Raid Level : raid1
Raid Devices : 4
Avail Dev Size : 3906762928 (1862.89 GiB 2000.26 GB)
Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=976752048 sectors
State : clean
Device UUID : 45a47922:251b01e7:a920b5ef:aec34c43
Internal Bitmap : 8 sectors from superblock
Update Time : Sat Mar 14 08:32:32 2020
Bad Block Log : 512 entries available at offset 16 sectors
Checksum : 623a20a2 - correct
Events : 20558
Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
Name : ourserver:0 (local to host ourserver)
Creation Time : Fri Mar 13 16:46:35 2020
Raid Level : raid1
Raid Devices : 4
Avail Dev Size : 2930012909 (1397.14 GiB 1500.17 GB)
Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=2029 sectors
State : clean
Device UUID : 9f705e06:0b9a6d1a:fe4a0368:8a279a1a
Internal Bitmap : 8 sectors from superblock
Update Time : Sat Mar 14 08:32:32 2020
Bad Block Log : 512 entries available at offset 16 sectors
Checksum : 8eeef44d - correct
Events : 20558
Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
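The sizes reported above are consistent with the 1.4T figure: in a RAID1 every member holds a full copy, so usable space equals the smallest member and the extra space on the 2 TB drive goes unused. Here is a minimal Python sketch of that arithmetic. The function names are hypothetical, the RAID10 branch assumes a two-copy layout, and the small rounding mdadm applies is ignored (Used Dev Size is slightly below the smallest Avail Dev Size).

```python
SECTOR_BYTES = 512  # mdadm -E reports these sizes in 512-byte sectors


def usable_sectors(level, member_sectors, copies=2):
    """Approximate usable size of an md array, in sectors."""
    smallest = min(member_sectors)
    if level == "raid1":
        # Every member is a full mirror, so capacity is that of one member.
        return smallest
    if level == "raid10":
        # Each block is stored `copies` times across all members.
        return smallest * len(member_sectors) // copies
    raise ValueError(f"unsupported level: {level}")


def to_gib(sectors):
    """Convert a sector count to GiB."""
    return sectors * SECTOR_BYTES / 2**30


# Avail Dev Size values from the mdadm -E output (sda1, sdb1, sdg1, sdd1).
members = [2930010928, 2930010928, 3906762928, 2930012909]

raid1 = usable_sectors("raid1", members)
raid10 = usable_sectors("raid10", members)
print(f"raid1:  {to_gib(raid1):.2f} GiB")
print(f"raid10: {to_gib(raid10):.2f} GiB")
```

With these four members, a two-copy RAID10 would offer roughly twice the RAID1 capacity, which is the layout the question is after.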
Then I saw one user on Server Fault, and another on SE, recommend using mdadm --assemble --update=devicesize /dev/md0, which I ran, followed by mdadm -G /dev/md0 -z max. The size is still the same:
mdadm --assemble --update=devicesize /dev/md0 /dev/sd[a-b]1 /dev/sdg1 /dev/sdd1
mdadm: /dev/md0 has been started with 4 drives.
mdadm: component size of /dev/md0 unchanged at 1465005464K
How could I adapt this SF post, "How to grow a RAID 1 to RAID 10", or otherwise just get a mirrored partition spanning 2 drives?
Answer 1
I solved this thanks to the excellent article by software engineer Jean-Christophe Berthon. I would have saved even more time if I had simply deleted the huge backup of directories and files that I had made.
Although the RAID10 shows as healthy, about once a day I see the following log entries, which I assume mean I should replace sdd1:
Mar 15 06:12:57 ourserver kernel: ata18.00: failed command: READ DMA EXT
Mar 15 06:12:57 ourserver kernel: ata18.00: cmd 25/00:80:22:ba:c4/00:00:ab:00:00/e0 tag 31 dma 65536 in#012 res 51/40:00:6d:ba:c4/00:00:ab:00:00/00 Emask 0x9 (media error)
Mar 15 06:12:57 ourserver kernel: ata18.00: status: { DRDY ERR }
Mar 15 06:12:57 ourserver kernel: ata18.00: error: { UNC }
Mar 15 06:12:57 ourserver kernel: ata18.00: configured for UDMA/133
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=2s
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 Sense Key : Medium Error [current]
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 CDB: Read(10) 28 00 ab c4 ba 22 00 00 80 00
Mar 15 06:12:57 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881796717 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
Mar 15 06:12:57 ourserver kernel: ata18: EH complete
Mar 15 06:13:00 ourserver kernel: ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 15 06:13:00 ourserver kernel: ata18.00: irq_stat 0x40000001
Mar 15 06:13:00 ourserver kernel: ata18.00: failed command: READ DMA EXT
Mar 15 06:13:00 ourserver kernel: ata18.00: cmd 25/00:00:a2:ba:c4/00:09:ab:00:00/e0 tag 0 dma 1179648 in#012 res 51/40:00:41:bd:c4/00:00:ab:00:00/00 Emask 0x9 (media error)
Mar 15 06:13:00 ourserver kernel: ata18.00: status: { DRDY ERR }
Mar 15 06:13:00 ourserver kernel: ata18.00: error: { UNC }
Mar 15 06:13:01 ourserver kernel: ata18.00: configured for UDMA/133
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 Sense Key : Medium Error [current]
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 ab c4 ba a2 00 09 00 00
Mar 15 06:13:01 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881797441 op 0x0:(READ) flags 0x0 phys_seg 86 prio class 0
Mar 15 06:13:01 ourserver kernel: ata18: EH complete
Mar 15 06:13:04 ourserver kernel: ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 15 06:13:04 ourserver kernel: ata18.00: irq_stat 0x40000001
Mar 15 06:13:04 ourserver kernel: ata18.00: failed command: READ DMA EXT
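To see at a glance which device and sectors the kernel is complaining about, the I/O error lines can be tallied. This is only a rough Python sketch: the function name failed_sectors is hypothetical, and the regular expression assumes the "I/O error, dev X, sector N" wording seen in the log lines above, which may differ on other kernel versions.

```python
import re
from collections import defaultdict

# Matches the block-layer error lines shown in the log excerpt.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def failed_sectors(log_lines):
    """Map each device name to the sectors reported in I/O error lines."""
    found = defaultdict(list)
    for line in log_lines:
        match = IO_ERROR.search(line)
        if match:
            found[match.group(1)].append(int(match.group(2)))
    return dict(found)


# Three lines taken from the log excerpt (wrapped lines joined).
sample = [
    "Mar 15 06:12:57 ourserver kernel: blk_update_request: I/O error, "
    "dev sdd, sector 2881796717 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0",
    "Mar 15 06:12:57 ourserver kernel: ata18: EH complete",
    "Mar 15 06:13:01 ourserver kernel: blk_update_request: I/O error, "
    "dev sdd, sector 2881797441 op 0x0:(READ) flags 0x0 phys_seg 86 prio class 0",
]

print(failed_sectors(sample))
```

In the excerpt, every such line names sdd, which fits the suspicion that this one drive is the problem.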
And smartctl shows this:
Error 45 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 80 ff ff ff ef 00 1d+07:38:55.963 READ DMA EXT
27 00 00 00 00 00 e0 00 1d+07:38:55.906 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 1d+07:38:55.905 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 1d+07:38:55.892 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 1d+07:38:55.830 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 44 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 1d+07:38:52.146 READ DMA EXT
35 00 80 ff ff ff ef 00 1d+07:38:52.143 WRITE DMA EXT
35 00 80 ff ff ff ef 00 1d+07:38:52.142 WRITE DMA EXT
35 00 80 ff ff ff ef 00 1d+07:38:52.140 WRITE DMA EXT
27 00 00 00 00 00 e0 00 1d+07:38:52.112 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 43 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 80 ff ff ff ef 00 1d+07:38:49.220 READ DMA EXT
ea 00 00 00 00 00 a0 00 1d+07:38:49.163 FLUSH CACHE EXT
ca 00 01 2a 00 00 e0 00 1d+07:38:49.163 WRITE DMA
ea 00 00 00 00 00 a0 00 1d+07:38:49.162 FLUSH CACHE EXT
ea 00 00 00 00 00 a0 00 1d+07:38:49.136 FLUSH CACHE EXT
Error 42 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 ff ff ff ef 00 1d+07:38:46.103 READ DMA EXT
27 00 00 00 00 00 e0 00 1d+07:38:46.075 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 1d+07:38:46.074 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 1d+07:38:46.060 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 1d+07:38:46.033 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 41 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 80 ff ff ff ef 00 1d+07:38:42.036 READ DMA EXT
35 00 00 ff ff ff ef 00 1d+07:38:42.032 WRITE DMA EXT
35 00 80 ff ff ff ef 00 1d+07:38:42.025 WRITE DMA EXT
27 00 00 00 00 00 e0 00 1d+07:38:41.997 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 1d+07:38:41.996 IDENTIFY DEVICE
I am also seeing this:
smartctl -A /dev/sdd1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.8-200.fc31.x86_64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 101 099 006 Pre-fail Always - 203989872
3 Spin_Up_Time 0x0003 099 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 16
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 23
7 Seek_Error_Rate 0x000f 070 060 030 Pre-fail Always - 12419382
9 Power_On_Hours 0x0032 096 096 000 Old_age Always - 3774
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 10
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 101
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 65537
189 High_Fly_Writes 0x003a 070 070 000 Old_age Always - 30
190 Airflow_Temperature_Cel 0x0022 072 060 045 Old_age Always - 28 (Min/Max 28/31)
194 Temperature_Celsius 0x0022 028 040 000 Old_age Always - 28 (0 19 0 0 0)
195 Hardware_ECC_Recovered 0x001a 044 006 000 Old_age Always - 203989872
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 105
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 105
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 3774 (62 166 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 201703664
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 1542427917
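The rows that matter most in a table like this are the sector-health counters. As a rough illustration, the Python sketch below picks out a few of them when their raw value is non-zero. The choice of attribute IDs (5, 187, 197, 198) is my own assumption about which counters commonly point to media problems, not a vendor threshold, and the function name nonzero_watched is hypothetical.

```python
# Attribute IDs assumed (not vendor-defined) to be worth watching.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    187: "Reported_Uncorrect",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def nonzero_watched(smart_output):
    """Return {attribute_id: raw_value} for watched attributes above zero."""
    flagged = {}
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]  # RAW_VALUE column
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[attr_id] = int(raw)
    return flagged


# A few rows copied from the smartctl -A output above.
sample = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 23
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 101
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 105
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 105
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""

for attr_id, raw in nonzero_watched(sample).items():
    print(f"{attr_id:>3} {WATCHED[attr_id]}: {raw}")
```

On this drive all four watched counters are non-zero, which matches the unrecovered read errors in the kernel log.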