File system set to read-only after reaching 100% storage capacity: how do I reset it to read-write?

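Yesterday our server (Ubuntu 18.04) reached 100% storage capacity, and one of our file systems was set to read-only mode; see: /dev/md3 / ext4 ro,relatime,errors=remount-ro,data=ordered 0 0. I have tried several solutions from other Server Fault answers, but none of them seems to fit my situation.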

For example, I tried running sudo mount -o remount,rw /dev/md3 /, but this produces the following message: mount: /: cannot remount /dev/md3 read-write, is write-protected.

How can I fix this so that the file system is read-write again?

Thanks!

Update with debug information:

mdadm --detail /dev/md3
/dev/md3:
           Version : 0.90
     Creation Time : Fri Nov 10 10:07:34 2017
        Raid Level : raid1
        Array Size : 20478912 (19.53 GiB 20.97 GB)
     Used Dev Size : 20478912 (19.53 GiB 20.97 GB)
      Raid Devices : 2
     Total Devices : 2
   Preferred Minor : 3
       Persistence : Superblock is persistent

       Update Time : Sat Sep 18 09:15:35 2021
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : unknown

              UUID : 4b632ac4:ae1a7c2b:a4d2adc2:26fd5302
            Events : 0.861

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3

And with dmesg:

dmesg | grep "md3"
[67448453.830094] EXT4-fs error (device md3): ext4_remount:4840: Abort forced by user

Running tune2fs:

tune2fs -l /dev/md3
tune2fs 1.44.1 (24-Mar-2018)
Filesystem volume name:   /
Last mounted on:          /
Filesystem UUID:          d1a985c4-8c5e-4034-93e0-629b8e65f161
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1281120
Block count:              5119728
Reserved block count:     255986
Free blocks:              445848
Free inodes:              1001361
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1022
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8160
Inode blocks per group:   510
Flex block group size:    16
Filesystem created:       Fri Nov 10 10:07:39 2017
Last mount time:          Tue Jul 30 17:51:41 2019
Last write time:          Thu Sep 16 20:06:05 2021
Mount count:              7
Maximum mount count:      -1
Last checked:             Fri Nov 10 10:07:39 2017
Check interval:           0 (<none>)
Lifetime writes:          4013 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:           256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       663035
Default directory hash:   half_md4
Directory Hash Seed:      ae316af1-086d-470f-af27-0c10ca25f3c8
Journal backup:           inode blocks
FS Error count:           8
First error time:         Thu Sep 16 20:06:04 2021
First error function:     ext4_lookup
First error line #:       1607
First error inode #:      930317
First error block #:      0
Last error time:          Sat Sep 18 09:15:35 2021
Last error function:      ext4_remount
Last error line #:        4840
Last error inode #:       685456
Last error block #:       0

Debug information from e2fsck -n /dev/md3:

e2fsck -n /dev/md3
e2fsck 1.44.1 (24-Mar-2018)
Warning: skipping journal recovery because doing a read-only filesystem check.
/ contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix? no

Inode 101 was part of the orphaned inode list.  IGNORED.
Inode 117 was part of the orphaned inode list.  IGNORED.
Inode 292 was part of the orphaned inode list.  IGNORED.
Inode 460 was part of the orphaned inode list.  IGNORED.
Inode 465 was part of the orphaned inode list.  IGNORED.
Inode 471 was part of the orphaned inode list.  IGNORED.
Inode 487 was part of the orphaned inode list.  IGNORED.
Inode 529 was part of the orphaned inode list.  IGNORED.
Inode 562 was part of the orphaned inode list.  IGNORED.
Inode 564 was part of the orphaned inode list.  IGNORED.
Inode 707 was part of the orphaned inode list.  IGNORED.
Inode 723 was part of the orphaned inode list.  IGNORED.
Inode 918 was part of the orphaned inode list.  IGNORED.
...
Deleted inode 402614 has zero dtime.  Fix? no
...
Inode 783370, end of extent exceeds allowed value
    (logical block 1024, physical block 3068928, len 76)
Clear? no

Inode 783370, i_blocks is 8784, should be 8200.  Fix? no

Inode 783470, end of extent exceeds allowed value
    (logical block 2708, physical block 1322783, len 193)
Clear? no

Inode 783470, i_blocks is 23200, should be 21672.  Fix? no

Inode 1047956 was part of the orphaned inode list.  IGNORED.
Pass 2: Checking directory structure
Entry 'tmp' in /tmp/systemd-private-bb09aae54cab4e12844e5844d11ca5eb-certbot.service-VSBnVY (685456) has deleted/unused inode 685457.  Clear? no

Entry '1159_key-certbot.pem' in /etc/letsencrypt/keys (930317) has deleted/unused inode 920168.  Clear? no

Entry '1159_key-certbot.pem' in /etc/letsencrypt/keys (930317) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry '1110_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has deleted/unused inode 920176.  Clear? no

Entry '1110_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry '1106_key-certbot.pem' in /etc/letsencrypt/keys (930317) has deleted/unused inode 920166.  Clear? no

Entry '1106_key-certbot.pem' in /etc/letsencrypt/keys (930317) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry '1109_key-certbot.pem' in /etc/letsencrypt/keys (930317) has deleted/unused inode 920173.  Clear? no

Entry '1109_key-certbot.pem' in /etc/letsencrypt/keys (930317) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry '1146_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has deleted/unused inode 920172.  Clear? no

Entry '1146_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has an incorrect filetype (was 1, should be 0).
Fix? no
...
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 685456 ref count is 3, should be 2.  Fix? no

Pass 5: Checking group summary information
Block bitmap differences:  -34565 -(53721--53734) -(59721--59761) -(59981--59983) -(61106--61184) -(61540--61544) -(70964--71007) -(71274--71313) -(84938--84989) -(85084--85107) -(85592--85599) -(116400--116408) -(116423--116436) -(128700--128703) -(128708--128721) -(138904--138914) -(165045--165150) -(169691--169713) -(169717--169742) -(464896--471464) -(471552--471989) -(472928--472947) -(499200--499612) -(501408--501434) -(503808--504070) -(513024--513301) -(513408--513491) -(589477--589480) -(711431--711441) -(747968--748030) -(838733--838740) -(838755--838758) -(838772--838783) -(838791--838800) -(838805--838816) -(838824--838835) -(848384--848972) -(875840--875880) -(1032187--1033031) -(1083840--1083878) -(1120110--1120132) -(1322783--1322975) -(1631196--1631251) -(1635150--1635169) -(1635360--1635391) -(1635571--1635575) -(1635848--1635855) -(1635996--1636001) -1648860 -1648880 -(1715533--1715536) -(1740800--1741311) -(1746432--1746573) -(1750528--1750729) -(1867776--1867880) -(1870717--1871294) -(1880576--1880791) -(1888256--1888258) -1888260 -(1888272--1888273) -(1888275--1888767) -(2226402--2226405) -(2235495--2235719) -(2266304--2266332) -(2301560--2301629) -(2528723--2528753) -(2589088--2589117) -(2597312--2597374) -(2597696--2597757) -(2614784--2615295) -(2619392--2619458) -(2619904--2620297) -2636181 -(2671360--2671491) -(2687328--2687350) -(3068928--3069003) -(3196998--3197002) -(3228728--3228738) -(3236697--3236703) -(3252961--3252970) -(3264276--3264277) -(3264287--3264298) -(3285164--3285170) -(3299518--3299524) -(3399680--3400062) -(3441024--3441129) -(3574080--3574142) -(3601664--3601795) -(3659648--3659724) -(3660672--3660755) -(3704233--3704234) -(3704237--3704242) -3707626 -3708898 -3709310 -3709356 -3709398 -3709984 -(3751694--3751696) -(3751707--3751711) -(3751767--3751768) -(3751774--3751775) -(3751800--3751814) -(3771264--3771343) -(3830025--3830040) -(3860480--3867203) -(3867616--3867644) -(3868160--3868618) -(3869696--3870139) 
-(4045457--4045483) -(4087936--4088023) -(4088032--4088055) -(4088320--4088780) -(4088960--4089064) -(4089088--4089126) -(4091136--4091324) -(4091392--4092119) -(4092928--4094514) -(4094976--4095854) -(4097088--4097120) -(4097536--4097816) -(4109312--4110157) -(4250368--4250378) -(4278497--4278513) -(4296960--4297014) -(4325486--4325616) -(4325632--4325707) -(4326688--4327074) -(4328826--4328961) -(4329202--4329314) -(4329600--4329666) -(4329764--4329804) -(4332027--4332178) -(4332406--4332476) -(4333568--4333942) -(4334372--4334454) -(4334564--4335227) -(4621153--4621176) -(4669781--4670170) -(4696470--4696548) -(4697074--4697429) -(4697662--4697711) -(4726778--4727894) -(5055921--5056185) -(5056648--5056667) -(5106412--5106620) -(5106668--5107034)
Fix? no

Free blocks count wrong for group #76 (3374, counted=3375).
Fix? no

Free blocks count wrong (445848, counted=445849).
Fix? no

Inode bitmap differences:  -101 -117 -292 -460 -465 -471 -487 -529 -562 -564 -707 -723 -918 -(1837--1838) -2041 -2714 -3593 -3654 -3659 -3894 -3976 -4336 -4425 -5193 -5244 -5252 -5930 -5951 -5967 -(7066--7069) -7431 -8492 -8651 -9298 -9583 -9592 -14261 -14270 -18093 -19214 -21301 -(27843--27844) -27847 -27849 -(27853--27856) -(27868--27869) -(27872--27873) -27875 -27879 -27883 -27885 -(27889--27890) -27892 -162842 -391708 -391741 -391759 -391763 -(391800--391802) -(391804--391805) -(391812--391814) -(391831--391833) -391870 -391873 -391878 -391900 -391902 -(391910--391911) -391915 -391919 -391927 -391956 -392493 -392719 -393759 -393795 -395132 -395134 -395161 -395165 -395221 -395234 -395267 -395289 -(395312--395313) -395315 -395325 -395336 -395387 -395630 -396550 -396589 -(396699--396700) -402594 -(402596--402598) -402601 -(402604--402606) -402608 -(402611--402614) -407918 -413872 -413874 -413881 -413885 -413897 -413900 -413908 -421042 -421202 -421226 -426391 -652905 -(652931--652935) -663035 -685457 -920162 -(920164--920176) -1047956
Fix? no

Directories count wrong for group #84 (17, counted=16).
Fix? no

Free inodes count wrong for group #96 (80, counted=82).
Fix? no

Free inodes count wrong for group #112 (486, counted=487).
Fix? no

Free inodes count wrong (1001361, counted=1001364).
Fix? no


/: ********** WARNING: Filesystem still has errors **********

/: 279759/1281120 files (0.7% non-contiguous), 4673880/5119728 blocks

Answer 1

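It was file system corruption that caused the switch to read-only mode, not the file system filling up, exactly as the mount option errors=remount-ro specifies.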
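To see whether a mount point is currently read-only, the option list in /proc/mounts can be inspected. The following is a minimal sketch; mount_mode is a hypothetical helper name, not a standard tool, and it only reads text in /proc/mounts format from standard input.

```shell
# Print "ro" or "rw" for a mount point, reading /proc/mounts-style lines on stdin.
# mount_mode is a hypothetical helper name used for illustration only.
mount_mode() {
    awk -v mp="$1" '
        # Field 2 is the mount point, field 4 the comma-separated option list.
        $2 == mp {
            n = split($4, opts, ",")
            mode = "rw"
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") mode = "ro"
        }
        # If the mount point appears more than once, the last entry wins.
        END { if (mode != "") print mode }
    '
}
```

On the affected server it could be called as mount_mode / < /proc/mounts, which should print ro for as long as the root file system stays in the state shown in the question.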

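Back up important data and configuration and download them to somewhere else. In case something essential for booting is damaged, prepare a recovery plan for that case. If possible, move important services to another machine. There will be some downtime.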
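Because the root file system is read-only, an archive cannot be written to it, so one option is to stream the backup straight to another machine. This is only a sketch: backup_stream is a hypothetical helper, and the paths to include depend on what matters on your server.

```shell
# Write a gzip-compressed tar archive of the given paths to stdout.
# $1: directory the paths are relative to; remaining arguments: paths to include.
# backup_stream is a hypothetical helper name used for illustration only.
backup_stream() {
    base="$1"
    shift
    tar -C "$base" -czf - "$@"
}
```

A possible use, run as root, would be backup_stream / etc home | ssh user@backup-host 'cat > md3-backup.tar.gz', where user@backup-host and the file name are placeholders. Files that sit on damaged inodes may fail to read, so check tar's messages and exit status.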

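I notice that this system is not rebooted often (mounted only 7 times since 2017, with the last mount in 2019). So I suggest setting the maximum mount count to 1, so that the file system is checked on every boot: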

tune2fs -c 1 /dev/md3
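Before rebooting, it may be worth confirming that the setting was recorded, along with the error state that tune2fs reports. A minimal sketch, assuming the "Name: value" layout of the tune2fs -l output shown in the question; fs_field is a hypothetical helper name.

```shell
# Print the value of one field from `tune2fs -l` output supplied on stdin.
# $1: exact field name, for example "Maximum mount count".
# fs_field is a hypothetical helper name used for illustration only.
fs_field() {
    awk -F':' -v key="$1" '
        $1 == key {
            # Strip the field name, the colon, and the padding after it.
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }
    '
}
```

For example, tune2fs -l /dev/md3 | fs_field 'Maximum mount count' should print 1 once the change has been written, and fs_field 'Filesystem state' shows whether the file system is still marked as having errors.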

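Then reboot. The init scripts should check and repair the file system during boot. However, the corruption may be severe enough to require manual interaction, so make sure someone is near the server and ready to help you. And if the corruption has touched something important, you may run into strange problems.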
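If the check does have to be run by hand, the exit status of e2fsck tells you how it went. The status is a sum of flag values documented in the e2fsck manual page; the sketch below decodes it. describe_fsck_status is a hypothetical helper name, and the wording of the messages is paraphrased from the manual.

```shell
# Decode an e2fsck exit status (a bitmask) into one message per flag.
# describe_fsck_status is a hypothetical helper name used for illustration only.
describe_fsck_status() {
    code="$1"
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $((code & 1)) -ne 0 ]   && echo "file system errors corrected"
    [ $((code & 2)) -ne 0 ]   && echo "file system errors corrected, system should be rebooted"
    [ $((code & 4)) -ne 0 ]   && echo "file system errors left uncorrected"
    [ $((code & 8)) -ne 0 ]   && echo "operational error"
    [ $((code & 16)) -ne 0 ]  && echo "usage or syntax error"
    [ $((code & 32)) -ne 0 ]  && echo "canceled by user request"
    [ $((code & 128)) -ne 0 ] && echo "shared library error"
    return 0
}
```

After a manual e2fsck run on the unmounted device, for instance from a rescue system, describe_fsck_status "$?" prints the decoded result. A status that includes 4 means errors remain and the check needs to be repeated or investigated.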

In the worst case, you will have to reinstall the system. But don't forget to set the maximum mount count to 1 again.

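Why did the file system get corrupted? It just happens. Blocks are cached in memory, and corruption can occur there because of cosmic rays and the like. It is a very rare case, but it does happen sometimes. Disks are not ideal either and cannot detect every error; there is a non-zero bit error rate (look up the actual value in the device datasheet), so the chance of reading corrupted data is very low but real. If this happens to a metadata block, the problems can accumulate (the file system driver, misled by the bad information, may make wrong assumptions and corrupt the file system further), which is why it is important to check it from time to time.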
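To get a feel for how small "very low" is, the expected number of bit errors for a given amount of data read can be estimated from the error rate. The sketch below uses an assumed rate of one error per 10^14 bits purely as an illustration; it is not taken from this server's drives, so substitute the figure from your own datasheet. expected_bit_errors is a hypothetical helper name.

```shell
# Expected number of bit errors when reading a given number of bytes.
# $1: bytes read; $2: exponent e for an ASSUMED rate of one error per 10^e bits.
# expected_bit_errors is a hypothetical helper name used for illustration only.
expected_bit_errors() {
    awk -v bytes="$1" -v e="$2" 'BEGIN { printf "%.6f\n", bytes * 8 / (10 ^ e) }'
}

# One full read of the 20478912 KiB array from the mdadm output,
# at the assumed rate of one error per 10^14 bits:
expected_bit_errors $((20478912 * 1024)) 14   # prints 0.001678
```

Under that assumption a single full read of the array is expected to hit an error about 0.17% of the time, and the risk adds up over years of reads and writes.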
