How to recover an ext3 volume with an unreadable journal

Last Wednesday I came to work to find that my workstation had died overnight, and on reboot it could neither mount nor fsck the root partition.  Unfortunately it seems my disk has seen enough service and was returning read errors:

Jun 19 08:51:39 atlas kernel: [ 1324.948428] sd 3:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Jun 19 08:51:39 atlas kernel: [ 1324.948434] sd 3:0:1:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Jun 19 08:51:39 atlas kernel: [ 1324.948442] Descriptor sense data with sense descriptors (in hex):
Jun 19 08:51:39 atlas kernel: [ 1324.948446]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jun 19 08:51:39 atlas kernel: [ 1324.948463]         00 03 fa 3f
Jun 19 08:51:39 atlas kernel: [ 1324.948470] sd 3:0:1:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Jun 19 08:51:39 atlas kernel: [ 1324.948519] ata4: EH complete
Jun 19 08:51:39 atlas kernel: [ 1324.950250] sd 3:0:1:0: [sdb] 156250000 512-byte hardware sectors: (80.0 GB/74.5 GiB)
Jun 19 08:51:39 atlas kernel: [ 1324.950919] sd 3:0:1:0: [sdb] Write Protect is off
Jun 19 08:51:39 atlas kernel: [ 1324.954928] sd 3:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

After getting a new disk and making the machine operable again, I’ve now plugged the old disk back in and am trying to recover it.  Mounting fails, because there seem to be read errors in the journal itself:

Jun 19 08:35:39 atlas kernel: [  364.686504] EXT3-fs: INFO: recovery required on readonly filesystem.
Jun 19 08:35:39 atlas kernel: [  364.686508] EXT3-fs: write access will be enabled during recovery.
<errors like above>
Jun 19 08:36:08 atlas kernel: [  393.492868] JBD: recovery failed

It seems somewhat logical that physical failures would be common in the area of the disk where the journal lives.  In my case, the unreadable parts of the disk all seem to lie within the journal.  I ran debugfs on the volume and found that I could read all sorts of things on the disk, so I just needed to tell it to skip reading the journal and mount the disk anyway.

  1. Before you remove the journal, you need to remove the needs_recovery flag from the volume.  You’d think this is possible with tune2fs, but it doesn’t seem so.  So you do it with debugfs:

    debugfs -w -R "feature ^needs_recovery" /dev/sdb1

  2. Then remove the journal, forcibly:

    tune2fs -f -O ^has_journal /dev/sdb1

  3. Now, go ahead and mount your volume as ext2:

    mount -t ext2 -o ro /dev/sdb1 /mnt/disk
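
If you would rather review the commands before letting them touch a failing disk, the three steps can be wrapped in a small script.  This is only a sketch, not something I ran during the recovery: the function names and the DRY_RUN variable are my own invention, and /dev/sdb1 and /mnt/disk are the same example device and mount point as above.  By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only. run, recover_without_journal and DRY_RUN are hypothetical
# names made up for this example; they are not part of e2fsprogs.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
# Note that the printed form loses the original quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 = partition holding the damaged ext3 volume, $2 = mount point
recover_without_journal() {
    device=$1
    mountpoint=$2

    # Step 1: clear needs_recovery so nothing insists on replaying the journal.
    run debugfs -w -R "feature ^needs_recovery" "$device" || return 1

    # Step 2: forcibly drop the journal.
    run tune2fs -f -O '^has_journal' "$device" || return 1

    # Step 3: mount read-only as ext2.
    run mount -t ext2 -o ro "$device" "$mountpoint"
}
```

Calling `recover_without_journal /dev/sdb1 /mnt/disk` prints the three commands.  Running it again as root with `DRY_RUN=0` set in the environment would actually execute them, stopping at the first step that fails.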

Voila!  You’ve now nuked your journal, and marked your volume ready for mounting.  It’s possible that it has inconsistencies and needs a fsck, but I mounted anyway and was able to recover everything without failure.
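
To double-check that both flags are really gone, you can look at the "Filesystem features:" line that `dumpe2fs -h` prints from the superblock.  The helper below is a sketch under that assumption; has_feature is a name I made up, and it only parses text on stdin, so it never touches the disk itself.

```shell
#!/bin/sh
# Sketch only. has_feature is a hypothetical helper, not an e2fsprogs tool.
# It reads `dumpe2fs -h` output on stdin and succeeds (exit status 0) when
# the feature named in $1 is listed on the "Filesystem features:" line.
has_feature() {
    awk -v want="$1" '
        $1 == "Filesystem" && $2 == "features:" {
            for (i = 3; i <= NF; i++)
                if ($i == want) found = 1
        }
        END { exit !found }
    '
}

# Example against the device used above (needs root, so left commented out):
#   dumpe2fs -h /dev/sdb1 2>/dev/null | has_feature needs_recovery \
#       && echo "needs_recovery is still set"
#   dumpe2fs -h /dev/sdb1 2>/dev/null | has_feature has_journal \
#       && echo "journal is still present"
```

If neither message appears, the volume no longer advertises a journal or a pending recovery, which is the state the ext2 mount relies on.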

I learned this trick from an Ubuntu forums post at http://ubuntuforums.org/archive/index.php/t-953279.html
