Email bug wastes 4 days

Jeez, I hate software sometimes. At the start of Wednesday morning last week, Outlook 2000 ate my email, and it took me until the end of Saturday afternoon (4 frustrating days) to recover most of it.

The bug:

  • Specifically, there is a data corruption bug: when you reach around 2 gigabytes of data, Outlook will corrupt its PST mail store file, so that it can no longer read it (which is what happened last Wednesday morning). Outlook does ship with a tool for repairing corruption in that file. However, it dies with a cryptic 8-digit error code when you try to repair one of these 2-gigabyte files. So, not only does Outlook reproducibly corrupt its data (thus making all email, notes, calendar items, to-do items, etc. in that file inaccessible), there is simply no way to recover the data using the tools that the product ships with.
  • Microsoft are aware of the problem, and they have a “solution” of sorts: chop off the end of the file using a tool (thus losing any data contained in the removed portion), run the recovery tool again, and try to recover your data. They claim that you can just cut off 25 to 50 megabytes, but this is incorrect – and the reason it is incorrect is that the repair process significantly increases the file size, which can easily cause the recovered file to exceed 2 gigabytes, thus causing the recovery tool to fail again. By a process of trial and error (and each attempt took around 3 hours of non-stop hard-disk thrashing to succeed or fail), I was able to find that trimming 355 megabytes of mail (thus deleting around 17% of my data) would make the tool run successfully, whilst just avoiding the 2-gigabyte limit.
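
When each attempt costs around 3 hours, it pays to keep the number of attempts down. The sketch below shows one way the trial and error could be organised as a bisection over the trim size. It is not how the figure above was actually found, and it rests on two assumptions: that trimming more never makes the repair less likely to succeed, and that the largest trim size you are willing to try does work. The function repair_succeeds is a hypothetical stand-in for "trim this many megabytes from a fresh copy of the file, run the repair tool, and report whether it worked".

```python
from typing import Callable


def smallest_working_trim(
    repair_succeeds: Callable[[int], bool],  # hypothetical: runs one repair attempt
    low_mb: int,
    high_mb: int,
) -> int:
    """Return the smallest trim size (in MB) in [low_mb, high_mb] that repairs OK.

    Assumes that if a trim of N MB works, any larger trim works too.
    """
    if not repair_succeeds(high_mb):
        raise ValueError("repair fails even at the largest trim size")
    while low_mb < high_mb:
        mid_mb = (low_mb + high_mb) // 2
        if repair_succeeds(mid_mb):
            high_mb = mid_mb      # mid works, so the answer is mid or smaller
        else:
            low_mb = mid_mb + 1   # mid fails, so the answer is larger than mid
    return low_mb


if __name__ == "__main__":
    # Simulated repair tool that only succeeds once 355 MB or more is trimmed.
    attempts = []

    def fake_repair(trim_mb: int) -> bool:
        attempts.append(trim_mb)
        return trim_mb >= 355

    print(smallest_working_trim(fake_repair, 25, 1024), "MB after", len(attempts), "attempts")
```

In practice you would search in coarser steps (say, multiples of 10 MB) to save a few multi-hour attempts, and always work on a copy of the corrupted file rather than the original.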

So, what do we learn about software in general from this? For me, these were the most important points:

  • Test your software. First and foremost, this bug represents a failure to test (because I doubt anyone actually intended for data corruption to happen). The minimal test case for this would have been: create a new email, attach a 100 MB file, save the draft email, and close the email item; repeat 25 times; close the program, open the program, and verify that the program opens without errors. This test (verifying that you can read your own data) would have demonstrated data corruption, and it does not require sending any email, or talking to the network at all – so it would have been a comparatively simple test to perform.
  • Fix severe problems quickly. This problem existed in Outlook 97 through to Outlook 2002, which (from the list of Office versions) meant it was in the latest-and-greatest editions of Office from Dec 1996 through to Nov 2003 (i.e. 7 years). Seven years is way too long to have a severe data corruption bug like this.
  • Test your fixes. There is an update that is claimed to “prevent Outlook from allowing the .PST file to exceed the 2 GB maximum size”. Since I had this update installed at the time, all I can say is that the fix seems broken to me.
  • Fix severe problems in multiple ways. Bugs happen: I’ve certainly made plenty of mistakes, including ones that lost data. For the most severe ones though, I try to make it a point to fix them in multiple ways, at every point where I have made a bad assumption, or where I could be checking that the passed data obeys certain constraints. My personal record is fixing a logic bug in 4 different ways, although 3-way fixes are slightly more common, and 2-way fixes are fairly standard – and when you fix bugs thoroughly like this, you never see them again. It’s the same in aviation, where they use the Swiss Cheese model – which basically says that for any accident, there are usually many cumulative failures, and you need to fix all of them to stop the same mistake happening again. Now, this particular email bug was not one bug: rather, it was two (at the very least). The first bug is that corruption occurs. The second bug is that when corruption occurs, you cannot recover from it. Microsoft only attempted to fix the first bug (and failed). If they had attempted to fix the second bug (making data recovery work), and succeeded, it would have been much less of a problem. I even tried to recover the data in a virtual machine, running the free 60-day trial edition of Outlook 2007 (the latest version), which you can download from the Microsoft site. It didn’t work, and the recovery tool still failed to recover data. As a result, this second bug (data recovery failing for oversized PST files) is still unfixed in the latest edition of Office (and has been present for 10.5 years now, and counting).
  • Automate the backup of your data. My last backup of email data was from May 2005. Backing up my data was on my TODO list; however, that TODO list was stored in the very file that got corrupted (yay, irony!). I’ve come to the realisation that if it’s not automated, I probably won’t back up my data – and I suspect most people are the same. I’m currently planning to buy one of those small network-drive devices that runs Linux, install Samba, and script it to once a day delete the oldest backups until there is 10 gig of free space, then make a copy of yesterday’s data, and then reach out across the network and rsync yesterday’s data remotely against the current data, and then share out all my data as read-only and password protected. This way, I should be able to get back to a previous state from any day in the last 30 days; and even if I get a severe virus or accidentally try to delete my backup, it can’t be deleted or altered because it’s read-only.
  • Proprietary data formats suck. If my email had been stored in mbox format, I would have been able to open it in another app, even when Outlook could not. If my notes were in text files, I would have been able to open them in a text editor. If my calendar items were in iCalendar format, I would have been able to import them into other calendaring software. As it was, the data was in a proprietary format, so none of these things were possible. However, because of the point below, it’s still not clear what to do about that:
  • There still appears to be no email plus Personal Information Manager open-source killer-app. Despite everything, there still seems to be no free email app that’s doing what Firefox did for web browsers, or what OpenOffice is doing now for word processing and spreadsheets: provide a fully-featured, cross-platform, kick-arse implementation that’s easy to switch to. There are many email clients, but none that seem suitable replacements. Mozilla Thunderbird for example is clear that “It is not a personal information manager” – which is fine, but not what I’m looking for (although having builds available for Windows, Mac and Linux is a definite plus, because I want the option to change platforms at any point, and bring all my data and favourite apps with me). However, I’m looking for email + calendaring + TODO task lists + notes, well integrated in one app, rather than 4 separate apps. The closest candidate feature-wise seems to be Novell Evolution, which is limited in two regards: 1) It’s part of the GNOME desktop, thus philosophically tied somewhat to a single platform, and not officially available for Windows (there are people working on a Windows port, but builds happen on an ad-hoc basis, rather than being a first-class citizen with automated nightly builds like Firefox), and 2) there seems to be no mechanism for importing data from Outlook (originally requested, twice, 5 years ago), which is a missed opportunity because Outlook is a pretty popular email client, and I’m sure a lot of people and corporations would switch to an open-source app if there was a compelling pathway for them to do so.
  • If you run Outlook, check now that your PST file is significantly smaller than 2 gigabytes. If it’s getting close, take action now.
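
If you would rather not rely on remembering to look, that last check is easy to automate. Here is a minimal sketch in Python, assuming a hard limit of 2 gigabytes as described above. The 80% warning threshold is an arbitrary choice of mine, and the function names are made up for illustration; you will need to supply the path to your own PST file.

```python
import os

PST_LIMIT_BYTES = 2 * 1024**3  # the 2-gigabyte limit discussed in this post


def pst_status(size_bytes: int,
               limit_bytes: int = PST_LIMIT_BYTES,
               warn_fraction: float = 0.8) -> str:
    """Classify a PST file size as 'ok', 'warn' or 'over'.

    warn_fraction is an assumed safety margin, not anything Outlook defines.
    """
    if size_bytes >= limit_bytes:
        return "over"
    if size_bytes >= warn_fraction * limit_bytes:
        return "warn"
    return "ok"


def check_pst_file(path: str) -> str:
    """Look up the size of the file at `path` on disk and classify it."""
    return pst_status(os.path.getsize(path))
```

Run check_pst_file from a daily scheduled task and have it nag you on anything other than "ok", so that you can archive or delete mail well before the file gets near the limit.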