Save the World, Save Time Machine
[UPDATE - 2009-05-14] I can't find where I saw it, but a recent AirPort and Time Capsule update from Apple seems to have fixed the corruption issue and introduced an integrity check on sparsebundle images. Good. The downside is that now, when you interrupt a network backup, the next time Time Machine starts it does a "deep node traversal", which means the next backup takes quite some time.
So I have a 500GB Time Capsule stuck somewhere in the house, to which all the Macs are backed up by Time Machine. Well, actually, my work MacBook Pro gets backed up to the TC’s internal hard drive, while all the other Macs (that’s one G5 iMac, a G4 iBook and a Core 2 Duo MacBook) are backed up to a 500GB USB HD that’s hooked up to the TC.
The great thing with TC/TM is that, for the last couple of versions of Leopard, the backups can be done over the LAN via Ethernet OR AirPort. While TM writes backups to an external HD as plain subfolders, when you back up to a network drive TM creates a sparsebundle disk image named after your computer’s name and MAC address (well, the MAC address of the network interface you used when doing the first ever TM backup of said machine).
On Monday, while TM was doing its hourly thing, there was a power outage in the house. When the power came back, I found out TM couldn’t back up my MacBook Pro anymore: it threw an error saying that it could not find the backup disk. I checked and could see the TC alright on the LAN, and the sparsebundle for my laptop was clearly there… So I rebooted the TC, but to no avail. Then I tried to mount the sparsebundle manually: the Finder ground away for a loooong while before giving up and throwing an error stating that the image had “no filesystem”. Yikes!
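A quick aside on that naming scheme: judging from the image name that shows up in my logs further down (rangiroa_0017f2cb63cb.sparsebundle), it looks like the computer name, an underscore, then the MAC address in lowercase with the colons stripped. Here is a minimal sketch of that pattern. The sparsebundle_name helper is something I made up for illustration, and the pattern itself is only inferred from that one file name, so take it as an assumption rather than documented behavior.

```shell
#!/bin/sh
# Sketch only: sparsebundle_name is a hypothetical helper, not an Apple tool.
# Assumed pattern (inferred from one log line):
#   <computer name>_<MAC address, lowercase, no colons>.sparsebundle
sparsebundle_name() {
    name=$1
    # Lowercase the hex digits and strip the ":" separators from the MAC.
    mac=$(printf '%s' "$2" | tr 'A-F' 'a-f' | tr -d ':')
    printf '%s_%s.sparsebundle\n' "$name" "$mac"
}

# Example with my laptop's name and the MAC address seen in the log.
sparsebundle_name rangiroa 00:17:F2:CB:63:CB
```

That comes in handy when several Macs share one TC and you need to tell which image belongs to which machine.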
I went and had a look in my system logs and here is what I found:
May 28 08:29:14 rangiroa /System/Library/CoreServices/backupd[1607]: Backing up to: /Volumes/Backup of rangiroa/Backups.backupdb
May 28 08:31:17 rangiroa /System/Library/CoreServices/backupd[1607]: Stopping backupd to allow ejection of backup destination disk!
May 28 08:31:17 rangiroa /System/Library/CoreServices/backupd[1607]: Error: (6) getxattr for key:com.apple.backupd.SnapshotState path:/Volumes/Backup of rangiroa/Backups.backupdb/rangiroa/2008-04-29-102105
May 28 08:31:17 rangiroa /System/Library/CoreServices/backupd[1607]: Error: (6) getxattr for key:com.apple.backupd.SnapshotContainer path:/Volumes/Backup of rangiroa/Backups.backupdb/rangiroa/2008-04-29-102105
May 28 08:31:17 rangiroa /System/Library/CoreServices/backupd[1607]: Error: (-36) Creating directory 2008-05-28-083117.inProgress
May 28 08:31:17 rangiroa /System/Library/CoreServices/backupd[1607]: Failed to make snapshot container.
May 28 08:31:17 rangiroa /System/Library/CoreServices/backupd[1607]: Error: (-36) Creating directory rangiroa 2[...] 998 of them [...]
May 28 08:31:18 rangiroa /System/Library/CoreServices/backupd[1607]: Backup failed with error: 2
May 28 08:31:30 rangiroa /System/Library/CoreServices/backupd[1635]: Mounting disk image /Volumes/Time Capsule/rangiroa_0017f2cb63cb.sparsebundle
May 28 08:31:30 rangiroa /System/Library/CoreServices/backupd[1635]: Failed to attach to disk image, returned: 35
May 28 08:31:30 rangiroa /System/Library/CoreServices/backupd[1635]: Failed to mount disk image /Volumes/Time Capsule/rangiroa_0017f2cb63cb.sparsebundle
May 28 08:31:30 rangiroa /System/Library/CoreServices/backupd[1635]: Ejected Time Machine network volume.
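If you want to fish the same kind of lines out of your own logs, something like the sketch below should do. It assumes the backupd messages end up in /var/log/system.log, as they did on my machine, and backupd_errors is just a name I picked for the helper.

```shell
#!/bin/sh
# Sketch only: backupd_errors is a hypothetical helper name.
# It prints the backupd lines that mention an error or a failure.
backupd_errors() {
    # $1: log file to scan (assumed default: /var/log/system.log)
    grep 'backupd\[' "${1:-/var/log/system.log}" | grep -Ei 'error|failed'
}

# Usage:
#   backupd_errors                      # scan the default system log
#   backupd_errors /path/to/saved.log   # scan a saved copy instead
```

The first grep keeps only lines written by backupd, the second keeps those containing "error" or "failed" in any case, so the plain "Backing up to:" lines are dropped.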
So the Finder could not attach the disk image, nor mount it...
And you know, there is no way to directly access the HD in the Time Capsule, meaning you can’t simply run a Disk Utility check on it. Wait, you’ll say, you can use Disk Utility to check and repair a disk image. But to do that, Disk Utility first tries to mount the image, which, of course, failed in my case… So no luck with Disk Utility.
So I copied the 160GB sparsebundle to a USB HD, and went on trying to fix it… I did a lot of googling for the problem, but all I found was loads of complaints and no solutions, until I stumbled on this post over at macosxhints.com. It says the guy managed to repair his sparsebundle using some terminal wizardry involving the hdiutil command:
sudo hdiutil mount -nomount -readwrite /path/to/sparseimage
So I tried it: I launched Terminal, typed the command with the correct path to the image, at which point I got a huge CPU spike while nothing happened in the Terminal, indicating that some process was running but giving no feedback. Looking at the Activity Monitor showed a process called fsck_hfs using about 60% of one CPU. fsck_hfs does, according to OS X’s man page (“man fsck_hfs”), an HFS file system consistency check.
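Since hdiutil stays silent during all this, you can also keep an eye on the check from a second Terminal window instead of Activity Monitor. A rough sketch, assuming the process shows up under the name fsck_hfs as it did for me; proc_running is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: proc_running is a hypothetical helper name.
# It lists matching processes with their PID and CPU share, and
# returns non-zero when nothing matches.
proc_running() {
    # $1: process name to look for (default: fsck_hfs)
    ps -A -o pid,pcpu,comm | grep "${1:-fsck_hfs}"
}

# Usage: poll once a minute until the check is over.
#   while proc_running; do sleep 60; done; echo "fsck_hfs has exited"
```

This only tells you that the check is still busy, not how far along it is.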
Funny thing is that I left it running for quite a while (we are dealing with a 160GB image, with hundreds of thousands of files), and after a couple hours, it failed! Argh!
