Deciphering Glyph
Functional Functions and the Python Singleton Unpattern

Sat 07 July 2007

Have you ever written a module that looked like this?
subscribers = []

def addSubscriber(subscriber):
    subscribers.append(subscriber)

def publish(message):
    for subscriber in subscribers:
        subscriber.notify(message)
And then used it like this?
from publisher import publish

class Worker:
    def work(self):
        publish(self)
I've done this many times myself.

I used to think that this was the "right" way to implement Singletons in Python.  Other languages had static members and synchronized static accessors and factory methods; all kinds of rigamarole to achieve this effect, but Python simply had modules.

Now, however, I realize that there is no "right" way to implement Singleton in Python, because singletons are simply a bad thing to have.  As Wikipedia points out, "It is also considered an anti-pattern since it is often used as a euphemism for global variable."

The module above is brittle, and as a result, unpleasant to test and extend.

It's difficult to test because the call to "publish" cannot be indirected without monkeying around with the module's globals - generally recognized to be poor style, and prone to errors which will corrupt later, unrelated tests.

It makes code that interacts with it difficult to test, because while you can temporarily mangle global variables in the most egregious of whitebox tests, tests for code that is further away shouldn't need to know about the implementation detail of "publish".  Furthermore, code which adds subscribers to the global list will destructively change the behavior of later tests (or later code, if you try to invoke your tests in a running environment, since we all know running environments are where the interesting bugs occur).

It's difficult to extend because there is no explicit integration point with "publish", and all instances share the same look-up.  If you want to override the behavior of "work" and send it to a different publisher, you can't reuse the superclass's implementation, because that implementation always calls the global "publish".

Unfortunately, this probably doesn't seem particularly bad, because bad examples abound.  It's just the status quo.  Twisted's twisted.python.log module is used everywhere like this.  The standard library's sys.path, sys.stdin/out/err, warnings.warn_explicit, and probably a dozen other examples I can't think of off the top of my head, all work like this.

And there's a good reason that this keeps happening.  Sometimes, you feel as though your program really does need a "global" registry for some reason; you find yourself wanting access to the same central object in a variety of different places.  It seems convenient to have it available, and it basically works.

Here's a technique for implementing that convenience, while still allowing for a clean point of integration with other code.

First, make your "global" thing be a class.
class Publisher:
    def __init__(self):
        self.subscribers = []

    def addSubscriber(self, subscriber):
        self.subscribers.append(subscriber)

    def publish(self, message):
        for subscriber in self.subscribers:
            subscriber.notify(message)

thePublisher = Publisher()
Second, decide and document how "global" you mean.  Is it global to your process?  Global to a particular group of objects?  Global to a certain kind of class?  Document that, and make sure it is clear who should use the singleton you've created.  At some point in the future, someone will almost certainly come along with a surprising requirement which makes them want a different, or wrapped, version of your global thing.  Documentation is always important, but it is particularly important when dealing with globals, because there's really no such thing as completely global, and it is difficult to determine from context just how global you intend for something to be.

Third, and finally, encourage using your singleton by using it as a default, rather than accessing it directly.  For example:
from publisher import thePublisher

class Worker:
    publisher = thePublisher

    def work(self):
        self.publisher.publish(self)
In this example, you now have a clean point of integration for testing and extending this code.  You can make a single Worker instance, and change its "publisher" attribute before calling "work".  Of course, if you're willing to burn a whole extra two lines of code, you can make it an optional argument to the constructor of Worker.  If you decide that, in fact, your publisher isn't global at all, but system-specific, having this default in one place vastly decreases the amount of code you have to change.
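The constructor-argument variant might look something like the sketch below.  The Publisher class is repeated from above so that the example runs on its own; "FakePublisher" is a hypothetical test double, not something from the original code.

```python
class Publisher:
    def __init__(self):
        self.subscribers = []

    def addSubscriber(self, subscriber):
        self.subscribers.append(subscriber)

    def publish(self, message):
        for subscriber in self.subscribers:
            subscriber.notify(message)

thePublisher = Publisher()

class Worker:
    # The "extra two lines": accept a publisher, defaulting to the shared one.
    def __init__(self, publisher=thePublisher):
        self.publisher = publisher

    def work(self):
        self.publisher.publish(self)

class FakePublisher:
    """Hypothetical test double that records what was published."""
    def __init__(self):
        self.published = []

    def publish(self, message):
        self.published.append(message)

# A test can hand the worker its own publisher and never touch the shared one.
fake = FakePublisher()
worker = Worker(publisher=fake)
worker.work()
```

Code which doesn't care simply calls "Worker()" and gets the shared publisher.  One thing to keep in mind with this sketch is that the default is bound when the class is defined, so rebinding the module-level name "thePublisher" later will not affect it.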

Does this mean you should make everything into objects, and never use free functions?  No.  Free functions are fine, but functions in Python are for functional programming.  The hint is right there in the name.  If you are performing computations which return values, and calling other functions which do the same thing, it makes perfect sense to use free functions and not bog yourself down with useless object allocations and 'self' arguments.
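As a small illustration of the kind of free function I mean, here is a sketch with two made-up helpers, "formatMessage" and "formatAll".  Neither one reads or writes anything outside its arguments; each just computes a value and returns it.

```python
def formatMessage(sender, body):
    # Depends only on its arguments and returns a new value.
    return f"{sender}: {body}"

def formatAll(sender, bodies):
    # Calls another pure function; still no shared state involved.
    return [formatMessage(sender, body) for body in bodies]

lines = formatAll("worker", ["started", "finished"])
```

There is nothing here to override, reset between tests, or document as "global", so wrapping these in a class would buy nothing.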

Once you've started adding mutable state into the mix, you're into object territory.  If you're appending to a global list, if you're setting a global "state" variable, even if you're writing to a global file, it's time to make a class and give it some methods.
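For the "global file" case, the same move might look like the sketch below.  "MessageLog" is a hypothetical name, and using an in-memory io.StringIO as the stream is just an assumption to keep the example self-contained; a real file object would work the same way.

```python
import io

class MessageLog:
    """Hypothetical wrapper that owns the stream, rather than a global file."""
    def __init__(self, stream):
        self.stream = stream

    def write(self, message):
        self.stream.write(message + "\n")

# Each caller (or test) can supply its own stream.
buffer = io.StringIO()
log = MessageLog(buffer)
log.write("hello")
log.write("world")
```

As with the publisher, you can still create one shared instance and use it as a default, while leaving room for other code to substitute its own.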