LVM hangs when trying to create my root device node

Follow-up to "How do I check for bad blocks on an LVM physical volume?"

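The title pretty much sums it up. Basically, I have a box with a regular /boot partition, and the rest of the drive is filled by an LVM physical volume. Inside LVM I have a single volume group containing a root partition, a /home partition, and a swap partition.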

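When LVM creates the device nodes in /dev/mapper, it creates the swap and home partitions just fine. However, it usually hangs while trying to create the root device node. This happens both on a live CD (where I used pvscan; vgscan; vgchange -ay, IIRC) and in the initial ramdisk, which leaves the box unable to boot. I also tried the initrd recovery shell (where I used lvm pvscan; lvm vgscan; lvm vgchange -ay, IIRC), and it fails in the same way.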

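Sometimes vgchange -ay actually does create the root device node (after a long delay) but never exits, so I have to kill it manually. When that happens, I try to mount the device, but the mount always hangs indefinitely. Note that while either of these commands is running, the console prints a bunch of messages about a failed command "READ DMA" or something similar.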

I have run smartctl -a /dev/sda several times. Each time it reports quite a few errors about bad blocks (IIRC), but in the end it says the drive is healthy.

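I have put up a pastebin of dmesg from the affected machine. The log comes from booting the Arch Linux live CD and then running pvscan; vgscan; vgchange -ay. This time vgchange -ay hung forever, and I eventually killed it. Here is the end of dmesg, for posterity (so I'm not relying on Pastebin):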

[   46.332920] end_request: I/O error, dev fd0, sector 0
[   58.503496] end_request: I/O error, dev fd0, sector 0
[167992.304649] EXT4-fs (sda1): recovery complete
[167992.304660] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[168092.874016] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[168163.318923] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[168459.839738] end_request: I/O error, dev fd0, sector 0
[168472.010337] end_request: I/O error, dev fd0, sector 0
[168614.642035] bio: create slab <bio-2> at 2
[168630.045526] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[168630.045649] ata1.00: BMDMA stat 0x65
[168630.045710] ata1.00: failed command: READ DMA
[168630.045787] ata1.00: cmd c8/00:08:00:10:10/00:00:00:00:00/e6 tag 0 dma 4096 in
         res 51/40:08:00:10:10/00:00:00:00:00/e6 Emask 0x9 (media error)
[168630.046006] ata1.00: status: { DRDY ERR }
[168630.046071] ata1.00: error: { UNC }
[168630.066286] ata1.00: configured for UDMA/100
[168630.079493] ata1.01: configured for UDMA/66
[168630.079514] sd 0:0:0:0: [sda] Unhandled sense code
[168630.079517] sd 0:0:0:0: [sda]  
[168630.079520] Result: hostbyte=0x00 driverbyte=0x08
[168630.079523] sd 0:0:0:0: [sda]  
[168630.079525] Sense Key : 0x3 [current] [descriptor]
[168630.079530] Descriptor sense data with sense descriptors (in hex):
[168630.079532]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[168630.079544]         06 10 10 00 
[168630.079549] sd 0:0:0:0: [sda]  
[168630.079551] ASC=0x11 ASCQ=0x4
[168630.079554] sd 0:0:0:0: [sda] CDB: 
[168630.079556] cdb[0]=0x28: 28 00 06 10 10 00 00 00 08 00
[168630.079567] end_request: I/O error, dev sda, sector 101715968
[168630.079665] Buffer I/O error on device dm-3, logical block 0
[168630.079775] ata1: EH complete
[168634.564062] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[168634.564165] ata1.00: BMDMA stat 0x64
[168634.564225] ata1.00: failed command: READ DMA
[168634.564301] ata1.00: cmd c8/00:08:80:0f:10/00:00:00:00:00/e6 tag 0 dma 4096 in
         res 51/10:00:83:0f:10/00:00:00:00:00/e6 Emask 0x81 (invalid argument)
[168634.564527] ata1.00: status: { DRDY ERR }
[168634.564592] ata1.00: error: { IDNF }
[168634.584336] ata1.00: configured for UDMA/100
[168634.597559] ata1.01: configured for UDMA/66
[168634.597578] ata1: EH complete
[168639.087353] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[168639.087462] ata1.00: BMDMA stat 0x64
[168639.087521] ata1.00: failed command: READ DMA
[168639.087596] ata1.00: cmd c8/00:08:80:0f:10/00:00:00:00:00/e6 tag 0 dma 4096 in
         res 51/10:00:83:0f:10/00:00:00:00:00/e6 Emask 0x81 (invalid argument)
[168639.087822] ata1.00: status: { DRDY ERR }
[168639.087886] ata1.00: error: { IDNF }
[168639.105791] ata1.00: configured for UDMA/100
[168639.118999] ata1.01: configured for UDMA/66
[168639.119017] ata1: EH complete
[168645.896986] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[168645.897095] ata1.00: BMDMA stat 0x64
[168645.897155] ata1.00: failed command: READ DMA
[168645.900373] ata1.00: cmd c8/00:08:80:0f:10/00:00:00:00:00/e6 tag 0 dma 4096 in
         res 51/40:00:83:0f:10/00:00:00:00:00/e6 Emask 0x9 (media error)
[168645.906936] ata1.00: status: { DRDY ERR }
[168645.910263] ata1.00: error: { UNC }
[168645.931315] ata1.00: configured for UDMA/100
[168645.944504] ata1.01: configured for UDMA/66
[168645.944525] sd 0:0:0:0: [sda] Unhandled sense code
[168645.944529] sd 0:0:0:0: [sda]  
[168645.944531] Result: hostbyte=0x00 driverbyte=0x08
[168645.944534] sd 0:0:0:0: [sda]  
[168645.944537] Sense Key : 0x3 [current] [descriptor]
[168645.944541] Descriptor sense data with sense descriptors (in hex):
[168645.944543]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[168645.944554]         06 10 0f 83 
[168645.944559] sd 0:0:0:0: [sda]  
[168645.944561] ASC=0x11 ASCQ=0x4
[168645.944564] sd 0:0:0:0: [sda] CDB: 
[168645.944566] cdb[0]=0x28: 28 00 06 10 0f 80 00 00 08 00
[168645.944578] end_request: I/O error, dev sda, sector 101715843
[168645.947946] Buffer I/O error on device dm-2, logical block 10485744
[168645.951439] ata1: EH complete
[168650.445911] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[168650.449275] ata1.00: BMDMA stat 0x65
[168650.452579] ata1.00: failed command: READ DMA
[168650.455873] ata1.00: cmd c8/00:08:00:10:10/00:00:00:00:00/e6 tag 0 dma 4096 in
         res 51/40:08:00:10:10/00:00:00:00:00/e6 Emask 0x9 (media error)
[168650.462537] ata1.00: status: { DRDY ERR }
[168650.465714] ata1.00: error: { UNC }
[168650.486063] ata1.00: configured for UDMA/100
[168650.499326] ata1.01: configured for UDMA/66
[168650.499344] sd 0:0:0:0: [sda] Unhandled sense code
[168650.499348] sd 0:0:0:0: [sda]  
[168650.499350] Result: hostbyte=0x00 driverbyte=0x08
[168650.499353] sd 0:0:0:0: [sda]  
[168650.499355] Sense Key : 0x3 [current] [descriptor]
[168650.499360] Descriptor sense data with sense descriptors (in hex):
[168650.499362]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[168650.499373]         06 10 10 00 
[168650.499378] sd 0:0:0:0: [sda]  
[168650.499380] ASC=0x11 ASCQ=0x4
[168650.499383] sd 0:0:0:0: [sda] CDB: 
[168650.499385] cdb[0]=0x28: 28 00 06 10 10 00 00 00 08 00
[168650.499396] end_request: I/O error, dev sda, sector 101715968
[168650.502757] Buffer I/O error on device dm-3, logical block 0
[168650.506189] ata1: EH complete
[168798.816025] usb 9-2: new high-speed USB device number 2 using ehci-pci

This is only the end of the log, where the errors begin, because I hit the post length limit. For the whole thing, see the pastebin.

Sorry for the lack of specifics, but I'm not in front of the affected box right now.

Answer 1

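From the additional information you have provided, it sounds like your drive is going bad (bad blocks). You can try to work around them if you want, but I would seriously consider replacing the drive.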

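If you want to work around the problem, you basically have to find the LVM physical extents that sit on top of the bad blocks and add those physical extents to a logical volume that is never to be used.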
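The arithmetic behind "find the physical extent on top of a bad block" can be sketched as follows. This is a minimal Python 3 illustration, not part of the original answer. It assumes the sector number is already relative to the start of the physical volume (the disk LBA minus the partition start) and that all values are in 512-byte sectors. The numbers, the volume group name vg0, the PV path /dev/sda2, and the LV name badblocks are hypothetical. The lvcreate line is only built as a string, using the PV:PE-PE range form, and should be checked against your LVM version before running anything.

```python
def sector_to_extent(sector_in_pv, pe_start, pe_size):
    """Map a sector offset inside a PV to (extent index, offset within extent).

    All three arguments are counted in 512-byte sectors.
    """
    if sector_in_pv < pe_start:
        raise ValueError("sector lies in the PV metadata area")
    return divmod(sector_in_pv - pe_start, pe_size)


def lvcreate_command(vg, pv, extent, name="badblocks"):
    """Build (but do not run) an lvcreate call that pins one extent in a dummy LV."""
    return "lvcreate -l 1 -n %s %s %s:%d-%d" % (name, vg, pv, extent, extent)


if __name__ == "__main__":
    # Hypothetical layout: 4 MiB extents (8192 sectors), data area starting
    # at sector 384 (192 KiB).
    pe, offset = sector_to_extent(1000000, pe_start=384, pe_size=8192)
    print(pe, offset)
    print(lvcreate_command("vg0", "/dev/sda2", pe))
```

The real pe_start and pe_size values would come from pvs -o+pe_start and pvdisplay for the affected PV.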

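There is actually a recent email thread about this very topic on the linux-lvm mailing list (I read the whole thread, and it contains a lot of information):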

https://www.redhat.com/archives/linux-lvm/2012-November/msg00033.html


In this particular message, it looks like someone created a Python script to help with the task:

https://www.redhat.com/archives/linux-lvm/2012-November/msg00038.html

After helping someone in this situation (the internet at least works), I used the attached script to help find the affected LVs and files.

#!/usr/bin/python
# Identify partition, LV, file containing a sector 

# Copyright (C) 2010,2012 Stuart D. Gathman
# Shared under GNU Public License v2 or later
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2 of the License, or
#   (at your option) any later version.

#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.

#   You should have received a copy of the GNU General Public License along
#   with this program; if not, write to the Free Software Foundation, Inc.,
#   51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

import sys
from subprocess import Popen,PIPE

ID_LVM = 0x8e
ID_LINUX = 0x83
ID_EXT = 0x05
ID_RAID = 0xfd

def idtoname(id):
  if id == ID_LVM: return "Linux LVM"
  if id == ID_LINUX: return "Linux Filesystem"
  if id == ID_EXT: return "Extended Partition"
  if id == ID_RAID: return "Software RAID"
  return hex(id)

class Segment(object):
  __slots__ = ('pe1st','pelst','lvpath','le1st','lelst')
  def __init__(self,pe1st,pelst):
    self.pe1st = pe1st;
    self.pelst = pelst;
  def __str__(self):
    return "Seg:%d-%d:%s:%d-%d" % (
      self.pe1st,self.pelst,self.lvpath,self.le1st,self.lelst)

def cmdoutput(cmd):
  p = Popen(cmd, shell=True, stdout=PIPE)
  try:
    for ln in p.stdout:
      yield ln
  finally:
    p.stdout.close()
    p.wait()

def icheck(fs,blk):
  "Return inum from block number, or 0 if free space."
  for ln in cmdoutput("debugfs -R 'icheck %d' '%s' 2>/dev/null"%(blk,fs)):
    b,i = ln.strip().split(None,1)
    if not b[0].isdigit(): continue
    if int(b) == blk:
      if i.startswith('<'):
        return 0
      return int(i)
  raise ValueError('%s: invalid block: %d'%(fs,blk))

def ncheck(fs,inum):
  "Return filename from inode number, or None if not linked."
  for ln in cmdoutput("debugfs -R 'ncheck %d' '%s' 2>/dev/null"%(inum,fs)):
    i,n = ln.strip().split(None,1)
    if not i[0].isdigit(): continue
    if int(i) == inum:
      return n
  return None

def blkid(fs):
  "Return dictionary of block device attributes"
  d = {}
  for ln in cmdoutput("blkid -o export '%s'"%fs):
    k,v = ln.strip().split('=',1)
    d[k] = v
  return d

def getpvmap(pv):
  pe_start = 192 * 2
  pe_size = None
  seg = None
  segs = []
  for ln in cmdoutput("pvdisplay --units k -m %s"%pv):
    a = ln.strip().split()
    if not a: continue
    if a[0] == 'Physical' and a[4].endswith(':'):
      pe1st = int(a[2])
      pelst = int(a[4][:-1])
      seg = Segment(pe1st,pelst)
    elif seg and a[0] == 'Logical':
      if a[1] == 'volume':
        seg.lvpath = a[2]
      elif a[1] == 'extents':
        seg.le1st = int(a[2])
        seg.lelst = int(a[4])
        segs.append(seg)
    elif a[0] == 'PE' and a[1] == 'Size':
      if a[2] == "(KByte)":
        pe_size = int(a[3]) * 2
      elif a[3] == 'KiB':
        pe_size = int(float(a[2])) * 2
  if segs:
    for ln in cmdoutput("pvs --units k -o+pe_start %s"%pv):
      a = ln.split()
      if a[0] == pv:
        lst = a[-1]
        if lst.lower().endswith('k'):
          pe_start = int(float(lst[:-1]))*2
          return pe_start,pe_size,segs
  return None

def findlv(pv,sect):
  res = getpvmap(pv)
  if not res: return None
  pe_start,pe_size,m = res
  if sect < pe_start:
    raise Exception("Bad sector in PV metadata area")
  pe = int((sect - pe_start)/pe_size)
  pebeg = pe * pe_size + pe_start
  peoff = sect - pebeg
  for s in m:
    if s.pe1st <= pe <= s.pelst:
      le = s.le1st + pe - s.pe1st
      return s.lvpath,le * pe_size + peoff

def getmdmap():
  with open('/proc/mdstat','rt') as fp:
    m = []
    for ln in fp:
      if ln.startswith('md'):
        a = ln.split(':')
        raid = a[0].strip()
        devs = []
        a = a[1].split()
        for d in a[2:]:
          devs.append(d.split('[')[0])
        m.append((raid,devs))
    return m

def parse_sfdisk(s):
  for ln in s:
    try:
      part,desc = ln.split(':')
      if part.startswith('/dev/'):
        d = {}
        for p in desc.split(','):
          name,val = p.split('=')
          name = name.strip()
          if name.lower() == 'id':
            d[name] = int(val,16)
          else:
            d[name] = int(val)
        yield part.strip(),d
    except ValueError:
      continue

def findpart(wd,lba):
  s = cmdoutput("sfdisk -d %s"%wd)
  parts = [ (part,d['start'],d['size'],d['Id']) for part,d in parse_sfdisk(s) ]
  for part,start,sz,Id in parts:
    if Id == ID_EXT: continue
    if start <= lba < start + sz:
      return part,lba - start,Id
  return None

if __name__ == '__main__':
  wd = sys.argv[1]
  lba = int(sys.argv[2])
  print wd,lba,"Whole Disk"
  res = findpart(wd,lba)
  if not res:
    print "LBA is outside any partition"
    sys.exit(1)
  part,sect,Id = res
  print part,sect,idtoname(Id)
  if Id == ID_LVM:
    bd,sect = findlv(part,sect)
    # FIXME: problems if LV is snapshot
  elif Id == ID_LINUX:
    bd = part
  else:
    if Id == ID_RAID:
      for md,devs in getmdmap():
        for dev in devs:
          if part == "/dev/"+dev:
            part = "/dev/"+md
            break
        else: continue
        break
    res = findlv(part,sect)
    if res:
      print "PV =",part
      bd,sect = res
    else:
      bd = part
  blksiz = 4096
  blk = int(sect * 512 / blksiz)
  p = blkid(bd)
  try:
    t = p['TYPE']
  except:
    print bd,p
    raise
  print "fs=%s block=%d %s"%(bd,blk,t)
  if t.startswith('ext'):
    inum = icheck(bd,blk)
    if inum:
      fn = ncheck(bd,inum)
      print "file=%s inum=%d"%(fn,inum)
    else:
      print "<free space>"
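The script above takes the whole-disk device and an LBA as its two arguments. One way to collect candidate LBAs is to pull them out of the kernel log. The following is a minimal Python 3 sketch, not part of the mailing-list script. It assumes the kernel reports failures in the "I/O error, dev NAME, sector N" form seen in the dmesg excerpt in the question, and the function name failing_sectors is made up for this illustration.

```python
import re

# Matches kernel lines such as
# "end_request: I/O error, dev sda, sector 101715968"
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def failing_sectors(lines, dev):
    """Return the sorted, de-duplicated sectors reported as failing on dev."""
    found = set()
    for line in lines:
        match = IO_ERROR.search(line)
        if match and match.group(1) == dev:
            found.add(int(match.group(2)))
    return sorted(found)


if __name__ == "__main__":
    # Lines copied from the dmesg excerpt in the question.
    sample = [
        "[168459.839738] end_request: I/O error, dev fd0, sector 0",
        "[168630.079567] end_request: I/O error, dev sda, sector 101715968",
        "[168645.944578] end_request: I/O error, dev sda, sector 101715843",
        "[168650.499396] end_request: I/O error, dev sda, sector 101715968",
    ]
    for sector in failing_sectors(sample, "sda"):
        print(sector)
```

Each sector printed this way can be passed, together with /dev/sda, to the script above to see which partition, LV, and file (if any) it falls in. The floppy errors on fd0 are filtered out because they refer to a different device.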