Last Wednesday I came to work to find my workstation had died overnight, and upon reboot it failed to mount or fsck the root partition. Unfortunately, it seems my disk had seen enough service and was failing reads:
Jun 19 08:51:39 atlas kernel: [ 1324.948428] sd 3:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Jun 19 08:51:39 atlas kernel: [ 1324.948434] sd 3:0:1:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Jun 19 08:51:39 atlas kernel: [ 1324.948442] Descriptor sense data with sense descriptors (in hex):
Jun 19 08:51:39 atlas kernel: [ 1324.948446] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jun 19 08:51:39 atlas kernel: [ 1324.948463] 00 03 fa 3f
Jun 19 08:51:39 atlas kernel: [ 1324.948470] sd 3:0:1:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Jun 19 08:51:39 atlas kernel: [ 1324.948519] ata4: EH complete
Jun 19 08:51:39 atlas kernel: [ 1324.950250] sd 3:0:1:0: [sdb] 156250000 512-byte hardware sectors: (80.0 GB/74.5 GiB)
Jun 19 08:51:39 atlas kernel: [ 1324.950919] sd 3:0:1:0: [sdb] Write Protect is off
Jun 19 08:51:39 atlas kernel: [ 1324.954928] sd 3:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
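The hex dump in the log above can be decoded by hand to see which sector the drive gave up on. This is a minimal sketch, assuming the standard SCSI descriptor-format sense layout (response code 0x72, sense key in byte 1, ASC and ASCQ in bytes 2 and 3, then an information descriptor whose last 8 bytes carry the failing LBA). The variable names are my own, and the byte offset assumes the 512-byte sectors reported in the log.

```shell
#!/bin/sh
# The 20 sense bytes from the kernel log, in the order they were printed.
sense="72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 00 03 fa 3f"

# Split into positional parameters: $1 is byte 0, $2 is byte 1, and so on.
set -- $sense

sense_key=$(( 0x$2 & 0x0f ))   # low nibble of byte 1; 3 means Medium Error
asc=$3                         # additional sense code
ascq=$4                        # additional sense code qualifier

# Bytes 12-19 are the 8-byte information field: the failing LBA, big-endian.
shift 12
lba=$(( 0x$1$2$3$4$5$6$7$8 ))

echo "sense key: $sense_key, ASC/ASCQ: $asc/$ascq"
echo "failing LBA: $lba (byte offset $(( lba * 512 )))"
```

For this dump the failing LBA works out to 260671, counted from the start of the whole disk rather than from the start of the partition.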
After getting a new disk and making the machine operable again, I’ve now plugged the old disk in and am trying to recover it. Mounting fails, because there seem to be read errors in the journal itself:
Jun 19 08:35:39 atlas kernel: [ 364.686504] EXT3-fs: INFO: recovery required on readonly filesystem.
Jun 19 08:35:39 atlas kernel: [ 364.686508] EXT3-fs: write access will be enabled during recovery.
<errors like above>
Jun 19 08:36:08 atlas kernel: [ 393.492868] JBD: recovery failed
It seems plausible that physical failures are common in the area of the disk where the journal lives, presumably because the journal is rewritten constantly. In my case, the unreadable parts of the disk all appear to be within the journal. I ran debugfs on the volume and found that I could still read all sorts of things on the disk, so I just needed to tell the system to skip reading the journal and mount the disk anyway.
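For anyone wanting to repeat that check, here is a rough sketch of a read-only look at the volume with debugfs. I do not know exactly which requests were run originally; the three below are my own picks, and inspect_fs is a hypothetical helper name. The -c flag opens the filesystem in catastrophic mode, which is read-only and skips loading the inode and block bitmaps.

```shell
#!/bin/sh
# Read-only inspection of a damaged ext3 volume; nothing here writes to the disk.
inspect_fs() {
    dev=$1
    debugfs -c -R "features" "$dev"   # feature flags, e.g. has_journal, needs_recovery
    debugfs -c -R "ls -l /" "$dev"    # can the root directory be listed?
    debugfs -c -R "stat <8>" "$dev"   # inode 8 is normally the journal inode
}

# Usage (as root), with the device name used in this post:
#   inspect_fs /dev/sdb1
```

If the root directory lists cleanly and only the journal looks bad, the steps that follow have a reasonable chance of working.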
- Before you remove the journal, you need to remove the needs_recovery flag from the volume. You’d think this is possible with tune2fs, but it doesn’t seem so. So you do it with debugfs:
debugfs -w -R "feature ^needs_recovery" /dev/sdb1
- Then remove the journal, forcibly:
tune2fs -f -O ^has_journal /dev/sdb1
- Now, go ahead and mount your volume as ext2:
mount -t ext2 -o ro /dev/sdb1 /mnt/disk
Voila! You’ve now nuked your journal, and marked your volume ready for mounting. It’s possible that it has inconsistencies and needs an fsck, but I mounted it anyway and was able to recover everything without failure.
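To find out whether the volume does have inconsistencies without changing anything on it, e2fsck can be run with -n, which opens the filesystem read-only and answers "no" to every question. Its exit status is a bitmask, so here is a small sketch that turns it into words. The describe_fsck_status helper is a hypothetical name of mine, and the meanings come from the e2fsck man page.

```shell
#!/bin/sh
# Translate an e2fsck exit status (a bitmask) into a readable summary.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msg=""
    for pair in "1:errors corrected" "2:errors corrected, reboot advised" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:canceled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}     # number before the first colon
        text=${pair#*:}     # description after it
        if [ $(( status & bit )) -ne 0 ]; then
            msg="${msg:+$msg; }$text"
        fi
    done
    echo "$msg"
}

# Usage (as root, with the volume unmounted):
#   e2fsck -f -n /dev/sdb1; describe_fsck_status $?
```

With -n nothing gets repaired, so a damaged volume would typically report "errors left uncorrected". That is a cue to copy the data off before trying a real fsck on a failing disk.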
I learned this trick from an Ubuntu forums post at http://ubuntuforums.org/archive/index.php/t-953279.html