Deciphering Glyph
I'm the guy who upgrades (plus I've got regression).

Wed 17 December 2003

After talking to quite a few people about my "stuponfucious" essay, and mulling over the problem a bit more, I think that I've come up with a few guiding principles on maintaining a codebase which can effectively be upgraded without ever having to resort to drastic "stop the world" database destruction or shutdown measures. I've been wrestling to state these in a concise manner for the last few days, and I think I've finally taken enough words out to post them here.

I'd love for this to be generally applicable, but if it doesn't make sense to you, keep in mind I'm talking specifically about the Divmod product, Quotient.

  1. Upgrade Mechanism With Per-Object, Per-Class Granularity

    This means you need a mechanism which can upgrade each instance at the level of each of its classes independently. It's important so that you can do upgrades at the same level of the code that manages the state, to keep that knowledge close together and consistent.

    Twisted already handles this, so I won't belabor the point. However, in my previous posting here, I was not sure that providing this level of flexibility was a good thing. After considering a wide variety of use-cases, I've decided it is, because it removes limitations on the kinds of changes you can adapt to. You still need to be careful about making difficult persistence decisions, but if you have to make them, you won't be stuck. Making a particular kind of feature impossible will not also make it unnecessary.

    I was finally convinced that this was a really useful facility when I realized that it's not important to keep total continuity of data. It's fine if, on a developer's local machine, they bring up a database, debug for a while, completely destroy that database, then bring up a new one and test again. In fact, the unit tests for persistence should probably work this way, so that we can get some assurances that the database actually works.

    I have made a list of everything I believe we need in order to encourage and implement the most stable policies regarding data upgrades, while remaining flexible and nimble the majority of the time.
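    To make the mechanism concrete, here is a minimal sketch of the per-class upgrade pattern, flattened to the instance level for brevity; Twisted's actual 'Versioned' walks each class in the hierarchy independently, and the class names here are purely illustrative.

```python
# Simplified sketch of the pattern behind twisted.persisted.styles.Versioned.
# Not Twisted's actual implementation: names and structure are illustrative.

class Versioned:
    persistenceVersion = 0

    def doUpgrade(self):
        """Apply each upgradeToVersionN method in order until the stored
        version catches up with the current class version."""
        stored = self.__dict__.get('persistenceVersion', 0)
        target = type(self).persistenceVersion
        for version in range(stored + 1, target + 1):
            upgrader = getattr(self, 'upgradeToVersion%d' % version, None)
            if upgrader is not None:
                upgrader()
        self.persistenceVersion = target


class User(Versioned):
    persistenceVersion = 2

    def __init__(self):
        self.fullName = 'Alice Example'
        self.email = 'alice@example.com'

    def upgradeToVersion1(self):
        # Version-0 databases stored a single 'name' attribute.
        self.fullName = self.__dict__.pop('name')

    def upgradeToVersion2(self):
        # Version 2 added an email field, defaulting to empty.
        self.email = ''


# Simulate an object loaded from an old database: raw version-0 state,
# constructed without running __init__, exactly as unpickling would.
old = User.__new__(User)
old.__dict__ = {'name': 'Bob Oldtimer'}
old.doUpgrade()
```

    The key property is that each upgrade step lives next to the class whose state it manages, so the knowledge stays close together and consistent.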


  2. Examples and Tests

    If you are really interested in supporting past versions of persistence - like past versions of anything else you want to support - you must have regression tests. That means a consistent dump of every kind of object that you're going to be outputting from a particular supported version, ideally in every state it can validly be in while in storage.
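    As a sketch of what such a regression fixture might look like - the 'Message' class, its schema, and the inline dumps are all hypothetical stand-ins for version-controlled fixture files - consider:

```python
import pickle

# Hypothetical sketch: keep a dump of each supported schema version as a
# regression fixture, then assert the upgrader produces a valid current
# object from every one of them.

class Message:
    CURRENT_VERSION = 2

    def upgrade(self, state):
        """Bring a raw state dict from any supported version up to date."""
        version = state.get('version', 0)
        if version < 1:
            # v0 called the subject line 'title'.
            state['subject'] = state.pop('title', '')
            state['version'] = 1
        if version < 2:
            # v2 added a read/unread flag.
            state.setdefault('read', False)
            state['version'] = 2
        self.__dict__.update(state)
        return self

# In a real project these dumps would live in fixture files, one per
# supported release, checked in alongside the upgraders.
FIXTURES = {
    'v0': pickle.dumps({'title': 'hello'}),
    'v1': pickle.dumps({'subject': 'hello', 'version': 1}),
}

upgraded = {}
for name, blob in FIXTURES.items():
    upgraded[name] = Message().upgrade(pickle.loads(blob))
```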


  3. Reason To Keep Going

    All the development rules and testing discipline in the world aren't going to stick if you don't have some tangible reason for the data to remain upgradable, and concrete points between which upgrades must work. This can only be provided by a real, running service.

    Without a running service, everyone will just feel as though the upgraders being written are extra overhead, and in reality, if you're not running the code somewhere that needs to be upgraded periodically, they are.


  4. Staging Area

    Although we developers should be concerned about breaking persistence, and therefore the running service, in practice this should be a very hard thing to do accidentally. Before a persistence change is rolled out to the running server, it should be run on a "staging" server with an aggressive upgrader. (Since Twisted's "Versioned" class is lazy, it is important to actually touch every object in the system.)

    It is still bad to break CVS in a way that breaks the staging upgrade, but like any other test, hopefully it will catch things that we didn't think of.
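    A staging pass of this kind might be sketched as follows; the store API ('all_object_ids', 'load', 'save') is a hypothetical stand-in for a real database:

```python
# Hedged sketch of an "aggressive" staging upgrade: because lazy
# versioning only upgrades an object when it is loaded, the staging pass
# must deliberately touch every object in the system.

def aggressive_upgrade(store):
    """Force-load and re-save every object so lazy upgraders actually
    run, collecting failures so a staging run can report them."""
    touched = 0
    failures = []
    for oid in store.all_object_ids():
        try:
            obj = store.load(oid)   # loading triggers doUpgrade()
            store.save(oid, obj)    # persist the upgraded state
            touched += 1
        except Exception as e:
            # A failure here is exactly what staging exists to catch.
            failures.append((oid, e))
    return touched, failures


class DictStore:
    """Toy in-memory store standing in for a real database."""
    def __init__(self, objects):
        self.objects = dict(objects)
    def all_object_ids(self):
        return list(self.objects)
    def load(self, oid):
        obj = self.objects[oid]
        if hasattr(obj, 'doUpgrade'):
            obj.doUpgrade()
        return obj
    def save(self, oid, obj):
        self.objects[oid] = obj


class Legacy:
    def doUpgrade(self):
        self.upgraded = True

store = DictStore({1: Legacy(), 2: Legacy()})
count, failures = aggressive_upgrade(store)
```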

  5. Discrete Versions

    Some intermediary checkins may not be supported for upgrades, if the upgraders between releases were broken at some point. However, many intermediary versions probably will be supported, since full-release granularity is often too large a step to test upgrades effectively. Each of these versions should have its own tag, or some other means to easily and quickly compare what has changed between the different persistence schemas. This is useful for maintenance in 2 ways -

  6. Distinction Between Structure and Content

    In any system that has a database layer which can be queried, sorted, and so on, that layer will sometimes need upgrades of its own: indexes added and removed, and the like. Since these changes can be long-running and do not necessarily affect application logic, they should be segregated and queued so that they can run as quickly as possible, without waiting for all your old objects to catch up.

    In atop, this means that Pools need to be annotated with type information so that it's possible to update their indexes and remove or add items that may have come from somewhere else. There can be no general queueing mechanism, because the very place that the queue comes from differs from case to case. In general it will be a query over another pool, whose state one can save between iterations and then continue querying from.
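    A resumable, batched structural upgrade of this kind could be sketched like so; the flat list and dict index are toy stand-ins for a Pool and its index:

```python
# Hypothetical sketch of a resumable structural upgrade: rebuild an
# index in small batches, checkpointing a cursor between batches so the
# long-running job can be interrupted and resumed without blocking
# application logic. The Pool/index shapes here are illustrative.

def rebuild_index(items, index, checkpoint, batch_size=2):
    """Process one batch starting from checkpoint['cursor']; return
    True once the whole rebuild is finished."""
    start = checkpoint.get('cursor', 0)
    batch = items[start:start + batch_size]
    for key, value in batch:
        index[key] = value
    # Save the iteration state so the next invocation can continue.
    checkpoint['cursor'] = start + len(batch)
    return checkpoint['cursor'] >= len(items)


items = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
index = {}
checkpoint = {}

# Each call does one bounded slice of work, like one trip through an
# upgrade queue; this loop stands in for a scheduler re-invoking it.
while not rebuild_index(items, index, checkpoint):
    pass
```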


  7. Knowledge of When to Give Up

    Some things aren't new versions of old objects. Superficially they may be similar, but if you're implementing a whole new strategy and interface for manipulating your items, it's probably best to have your upgraders destroy the old objects and create new ones. In situations where the changes really are major, this is both less likely to produce cruft (it's easier to properly initialize a new object than to filter state by hand into a correct new shape on an old one) and easier to monitor during a long-running upgrade (it's hard to tell how many version-3 Foos you have in the system, but it's easy to tell how many objects are in the 'Foo' index and how many are in the 'NewFoo' index).
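    A sketch of an upgrade that "gives up" and replaces objects outright - all class and collection names here are illustrative - shows how progress monitoring falls out of simply counting the two collections:

```python
# Illustrative sketch: rather than mutating a Foo in place, build a
# fresh NewFoo through its real constructor and retire the old object.
# Invariants then hold by construction instead of by hand-filtering.

class Foo:
    def __init__(self, name):
        self.name = name

class NewFoo:
    def __init__(self, name, tags=None):
        self.name = name
        self.tags = tags if tags is not None else []

def migrate_one(old_foos, new_foos):
    """Move a single object from the old collection to the new one;
    return False when there is nothing left to migrate."""
    if not old_foos:
        return False
    old = old_foos.pop()
    new_foos.append(NewFoo(old.name))
    return True

old_foos = [Foo('a'), Foo('b'), Foo('c')]
new_foos = []

# Progress is simply len(old_foos) remaining vs. len(new_foos) done.
while migrate_one(old_foos, new_foos):
    pass
```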


  8. Database Inspection Tools

    I've recently written a tool to provide developers with a simple visualization of what's going on within a running Twisted server. Tools like that, enhanced to support the persistence format, should be used to make sure that one has a complete understanding of the objects involved. (In our particular case, I recommend both that object inspector and 'pickletools.dis' for starters. We will need more powerful tools as time goes on.) As JP suggested, these tools should really include a way to manually tweak and destroy parts of a running database. This is important both as a development tool and as a last resort: sometimes, if an upgrader goes subtly wrong (non-subtle wrongness should REALLY be caught in the testing phase) some manual surgery will be required on a few persistent objects or indexes. An interactive prompt should always be the basis for such tools when possible.
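    For example, 'pickletools.dis' from the standard library disassembles a pickle stream opcode by opcode; capturing its output gives a crude view of exactly what is stored for an object:

```python
import io
import pickle
import pickletools

# pickletools.dis prints a pickle's opcodes; directing it to a string
# buffer turns it into a simple "what is actually in storage?" probe.
# The state dict here is an illustrative stand-in for a real object's
# persisted attributes.

state = {'persistenceVersion': 2, 'fullName': 'Alice'}
blob = pickle.dumps(state)

buf = io.StringIO()
pickletools.dis(blob, out=buf)
listing = buf.getvalue()
```

    The listing shows every attribute name and value as it sits on disk, which is often the fastest way to see what an upgrader will actually receive.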


  9. Knowledge of When to Abuse the Infrastructure

    The version upgrader is just running a function on your object. If it seems like the framework doesn't support a particular kind of upgrade, it probably does - you can invoke any code you want, schedule it to run later, kick off an upgrade queue, or whatever seems appropriate to your situation. Don't be afraid of creating scratch objects, temporary work-spaces, and other workarounds if your upgrade is complex.
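    As one sketch of this kind of abuse, an upgrade method can do the cheap part of its work inline and enqueue the expensive part for later; the queue below is a plain list standing in for whatever scheduler the application provides, and all names are illustrative:

```python
# Illustrative sketch: the upgrader is just a method call, so it can
# schedule work rather than doing everything in the load path.

DEFERRED_WORK = []

class Mailbox:
    persistenceVersion = 1

    def upgradeToVersion1(self):
        # Cheap part: fix the attribute right now.
        self.indexed = False
        # Expensive part: reindexing every message can happen later,
        # outside the load path, via the deferred-work queue.
        DEFERRED_WORK.append(self.reindex)

    def reindex(self):
        self.indexed = True

box = Mailbox()
box.upgradeToVersion1()

# Later, a scheduler drains the queue at its leisure.
while DEFERRED_WORK:
    DEFERRED_WORK.pop(0)()
```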

There are several problems I'm not quite sure how to address, such as the proliferation of upgrade code and compensating for buggy past upgraders - however, on a case-by-case basis I don't think that these issues will be significant.

The most challenging thing to provide here will be the test-case data. Even disregarding the problems we always run across when looking for a decent corpus of email test data - email tends to be private - we're going to have to provide a tool to dump a live database into a test-friendly format, and then a way to verify that an upgrade "worked". This will be challenging because the only way to really test whether an upgrade worked is to simulate a great number of interactions, poking as many of the moving parts of each upgraded object as possible. Because each version differs from the last, these tests are likely to keep changing from release to release, and will have to be kept in sync with the other unit tests for similar types of object. Parameterizable unit tests would be a big help here, although I don't see how to make trial do that easily.
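As a sketch of what parameterizable regression tests could look like - using the standard library's 'unittest' with subTest rather than trial, and with a toy 'upgrade' function and inline dumps standing in for real fixture files:

```python
import io
import unittest

# One test body runs once per supported version dump, so adding a
# release means adding a fixture, not writing a new test method.
# Everything here is a hypothetical stand-in for real upgrade code.

def upgrade(state):
    state = dict(state)
    if 'subject' not in state:
        # v0 called the subject line 'title'.
        state['subject'] = state.pop('title', '')
    return state

DUMPS = {
    'v0': {'title': 'hi'},
    'v1': {'subject': 'hi'},
}

class UpgradeRegression(unittest.TestCase):
    def test_every_supported_version(self):
        for version, state in DUMPS.items():
            # subTest reports each version's failure independently.
            with self.subTest(version=version):
                self.assertEqual(upgrade(state)['subject'], 'hi')

suite = unittest.defaultTestLoader.loadTestsFromTestCase(UpgradeRegression)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```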

In short, these will be regression tests that have to change every time the code is updated to still properly test the regression. We're actually regressing into the future.

In a future update, I plan to provide more complete examples of how one would do particular kinds of "refactoring" upgrades that are likely to be common - for example, converting a small Python list of objects into a Pool (and vice-versa).