Has my drive failed, or is it still usable?

I have the following WD drive (3 TB), and it has developed a problem: I cannot access any of the files, and running ls hangs indefinitely.

The details of the disk are as follows:

Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk model: EZRX-00D8PB0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt

Device     Start        End    Sectors  Size Type
/dev/sda1   2048 5860532223 5860530176  2.7T Linux filesystem
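
As a sanity check on the listing above, here is a minimal Python sketch that recomputes the derived figures from the raw ones. All constants are copied from the fdisk output; the function name geometry_report is only an illustrative choice.

```python
# Figures copied from the fdisk output above.
TOTAL_BYTES = 3000592982016
TOTAL_SECTORS = 5860533168
LOGICAL_SECTOR = 512
PHYSICAL_SECTOR = 4096
PART_START, PART_END, PART_SECTORS = 2048, 5860532223, 5860530176


def geometry_report():
    """Return derived values that should agree with the fdisk listing."""
    ratio = PHYSICAL_SECTOR // LOGICAL_SECTOR  # logical sectors per physical sector
    return {
        "bytes_match": TOTAL_SECTORS * LOGICAL_SECTOR == TOTAL_BYTES,
        "size_tib": round(TOTAL_BYTES / 2**40, 1),
        "partition_sectors_match": PART_END - PART_START + 1 == PART_SECTORS,
        # The partition start should fall on a physical-sector boundary.
        "partition_aligned": PART_START % ratio == 0,
        "logical_per_physical": ratio,
    }


if __name__ == "__main__":
    for key, value in geometry_report().items():
        print(f"{key}: {value}")
```

The numbers are self-consistent: the size works out to 2.7 TiB, and the partition starts on a 4096-byte boundary, so each physical sector holds eight consecutive logical sectors. That matters later, because a single unreadable physical sector can show up as several bad logical LBAs.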

After this happened, I ran a few tests to find out what was wrong. As a first step, I ran a short self-test with sudo smartctl -t short /dev/sda, and it reported the following error:

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     17480         8467144

Then, as explained in another post (Understanding the output of smartctl -a), I retrieved the attributes with sudo smartctl -a /dev/sda. Below are the attribute table and the five most recent entries of the error log.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       71
  3 Spin_Up_Time            0x0027   174   161   021    Pre-fail  Always       -       6266
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       695
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       17481
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       457
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       63
193 Load_Cycle_Count        0x0032   179   179   000    Old_age   Always       -       64193
194 Temperature_Celsius     0x0022   122   101   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   197   000    Old_age   Always       -       356
198 Offline_Uncorrectable   0x0030   197   197   000    Old_age   Offline      -       1691
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   196   196   000    Old_age   Offline      -       1691

SMART Error Log Version: 1
ATA Error Count: 47 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 47 occurred at disk power-on lifetime: 232 hours (9 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 0a 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  e0 00 0a 00 00 00 00 00      04:00:17.522  STANDBY IMMEDIATE
  ef 03 46 00 00 00 a0 00      04:00:16.815  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:16.815  IDENTIFY DEVICE

Error 46 occurred at disk power-on lifetime: 232 hours (9 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 46 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 03 46 00 00 00 a0 00      04:00:16.815  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:16.815  IDENTIFY DEVICE
  e1 00 0f 00 00 00 00 00      04:00:15.095  IDLE IMMEDIATE
  ef 03 46 00 00 00 a0 00      04:00:14.575  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:14.575  IDENTIFY DEVICE

Error 45 occurred at disk power-on lifetime: 232 hours (9 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 0f 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  e1 00 0f 00 00 00 00 00      04:00:15.095  IDLE IMMEDIATE
  ef 03 46 00 00 00 a0 00      04:00:14.575  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:14.575  IDENTIFY DEVICE

Error 44 occurred at disk power-on lifetime: 232 hours (9 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 46 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 03 46 00 00 00 a0 00      04:00:14.575  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:14.575  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      04:00:12.170  SET FEATURES [Set transfer mode]

Error 43 occurred at disk power-on lifetime: 232 hours (9 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 46 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 03 46 00 00 00 a0 00      04:00:12.170  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:12.170  IDENTIFY DEVICE
  e1 00 0f 00 00 00 00 00      04:00:10.445  IDLE IMMEDIATE
  ef 03 46 00 00 00 a0 00      04:00:09.925  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      04:00:09.925  IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     17480         8467144

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
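
To read the ER and ST columns of the error log above, here is a minimal Python sketch that decodes the two register bytes into flag names. The bit tables are my understanding of the ATA register layout and should be treated as an assumption to check against the ATA/ATAPI Command Set specification; only the most common bits are listed.

```python
# ASSUMPTION: bit names follow the ATA status/error register layout.
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

# Values taken from the smartctl output above.
LOGGED_ERROR_HOURS = 232      # "Error 47 occurred at disk power-on lifetime"
SELF_TEST_HOURS = 17480       # LifeTime(hours) of the failed short self-test


def decode(value, names):
    """Return the names of all flag bits that are set in value."""
    return [name for mask, name in names.items() if value & mask]


if __name__ == "__main__":
    print("ST 0x61:", decode(0x61, STATUS_BITS))  # ['DRDY', 'DF', 'ERR']
    print("ER 0x04:", decode(0x04, ERROR_BITS))   # ['ABRT']
    print("hours between logged errors and self-test:",
          SELF_TEST_HOURS - LOGGED_ERROR_HOURS)
```

The decoded flags agree with smartctl's own annotation, "Device Fault; Error: ABRT". Note also that all five logged errors date from power-on hour 232, about 17248 hours before the failed self-test, and they follow commands such as SET FEATURES and IDLE IMMEDIATE rather than reads. They may therefore be unrelated to the current read failure.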

Next, I tried to inspect the LBA_of_first_error (8467144). Following this guide, I ran sudo sg_verify --lba=8467144 /dev/sda and got the following output, which confirms a medium error at that address:

verify(10):
Fixed format, current; Sense key: Medium Error
Additional sense: Id CRC or ECC error
VERIFY(10) medium or hardware error near lba=0x8132c8
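
The address in this output is hexadecimal, while the self-test log reports it in decimal. The following Python sketch checks that both refer to the same sector and locates it inside /dev/sda1. The 4096-byte filesystem block size is an assumption (it is a common default, but I have not confirmed it for this filesystem), and the function name locate is only illustrative.

```python
SECTOR = 512             # logical sector size, from the fdisk output
FS_BLOCK = 4096          # ASSUMPTION: filesystem block size, not confirmed
PART_START = 2048        # first sector of /dev/sda1
PART_END = 5860532223    # last sector of /dev/sda1


def locate(lba):
    """Map an absolute LBA to an offset and block number inside /dev/sda1."""
    if not PART_START <= lba <= PART_END:
        raise ValueError("LBA lies outside /dev/sda1")
    offset_sectors = lba - PART_START
    byte_offset = offset_sectors * SECTOR
    return {
        "offset_sectors": offset_sectors,
        "fs_block": byte_offset // FS_BLOCK,
        "byte_in_block": byte_offset % FS_BLOCK,
    }


if __name__ == "__main__":
    lba = int("0x8132c8", 16)   # address printed by sg_verify
    print(lba == 8467144)       # True: same sector as LBA_of_first_error
    print(locate(lba))
```

Under that assumption, the failing sector is the first sector of filesystem block 1058137 in the partition, which is the kind of number a filesystem tool would need in order to find out which file, if any, occupies it.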

As a last step, I tried to reassign the block with sudo sg_reassign --address=8467144 /dev/sda, but it did not succeed:

REASSIGN BLOCKS: Illegal request, Invalid opcode
sg_reassign failed: Illegal request, Invalid opcode

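To sum up: did I miss any step in investigating this disk? Has the drive failed, or is it still usable? I cannot tell from the SMART attribute list whether it indicates any errors. Could you help me find out whether the drive has further errors?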
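
For reading the attribute table above, one rough approach is to parse the rows and flag the sector-health counters whose raw value is nonzero. The Python sketch below does this for four rows copied from the table. The set of watched IDs is an assumption based on common practice, not an official rule, and the parser assumes the ten-column layout shown above with a plain integer in the raw column.

```python
# Four rows copied verbatim from the smartctl -a attribute table above.
ATTRIBUTE_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   197   000    Old_age   Always       -       356
198 Offline_Uncorrectable   0x0030   197   197   000    Old_age   Offline      -       1691
"""

# ASSUMPTION: these IDs are the usual indicators of unreadable or remapped sectors.
WATCHED_IDS = {5, 196, 197, 198}


def flag_attributes(text):
    """Return (id, name, reason) for every attribute that looks suspicious."""
    findings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip headers and anything that is not an attribute row
        attr_id, name = int(fields[0]), fields[1]
        value, thresh, raw = int(fields[3]), int(fields[5]), int(fields[9])
        if thresh and value <= thresh:
            findings.append((attr_id, name, "normalized value at or below threshold"))
        elif attr_id in WATCHED_IDS and raw > 0:
            findings.append((attr_id, name, f"raw value {raw}"))
    return findings


if __name__ == "__main__":
    for finding in flag_attributes(ATTRIBUTE_ROWS):
        print(finding)
```

On these rows it reports Current_Pending_Sector (356) and Offline_Uncorrectable (1691), while both reallocation counters are still 0. No normalized value has reached its threshold, which would explain why WHEN_FAILED shows "-" on every row even though the drive has sectors it cannot read.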
