Deciphering Glyph
man what the heck

Fri 14 May 2004

I should have different blogs for different things. Today, I'm going to be talking about technology. Specifically, the implications of time travel on the quality of life in post-submergence California.

...

Okay, our sponsors don't like that one, and would like me to inform you that I've never been anywhere in the future, let alone to any periods after the 2295 collapse of the Pax Hegemonia world government. Apparently hillbillies have all the fun. Just because I don't have ten billion dollars and a hobo-hunting range they think they can tell me what to do! It is a good thing my crafty markup confuses them!

Instead, I have a few features that I've been thinking about for Twisted that have implications for my favorite application.



Subsystems



Some kinds of processing are better done in a blocking, synchronous, program-at-a-time kind of way. It's always a challenge when you're in a non-blocking environment and you want to do something that just involves a whole crapload of math or data copying that you can't split up easily.

In these cases it would generally be better to have a subprocess that does all the nasty blocking work and then returns a small result, copying the large chunks of data out of band (say, to files, or to a database).

The particular use case I'm thinking of is using Lupy to index and search for messages in Quotient. In this case, there are 3 operations: flush your cache, index this thing into that index, and query that index for some things based on this text. Additionally, as the first operation implies, indexes have quite a bit of stuff that can be cached, so it's good to keep the subprocess alive for long periods of time and not attempt to shut it down too often.

I am probably going to do the inter-process communication with pickle, because I don't need safety between processes in this case, but I imagine that will be pluggable. The general interface I want is to have a global subsystem manager that I can ask for a particular service (by Interface) and then request that a method be called on it and given certain arguments. This will be done in a subprocess, blocking, and the result from the subsystem manager in the parent process will be a Deferred that will fire when the operation is done.
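Here is a rough sketch of what that could look like, written against nothing but the standard library so it can be run on its own. Everything named in it is an assumption for illustration: `Subsystem`, `Indexer` and `CHILD_SOURCE` are made up, the toy `Indexer` stands in for a Lupy-backed index, a `concurrent.futures.Future` stands in for a Deferred, and the lookup-by-Interface step is left out entirely.

```python
import pickle
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Source of the child process: a blocking loop that unpickles
# (method, args) requests from stdin and pickles results to stdout.
CHILD_SOURCE = r'''
import pickle, sys

class Indexer:
    """Hypothetical stand-in for a Lupy-backed index service."""
    def __init__(self):
        self.docs = {}
    def index(self, doc_id, text):
        self.docs[doc_id] = text.lower().split()
        return len(self.docs)
    def query(self, word):
        return sorted(d for d, words in self.docs.items()
                      if word.lower() in words)
    def flush(self):
        self.docs.clear()
        return 0

service = Indexer()
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    try:
        method, args = pickle.load(inp)
    except EOFError:        # parent closed the pipe: shut down
        break
    pickle.dump(getattr(service, method)(*args), out)
    out.flush()
'''

class Subsystem:
    """Keeps one long-lived child alive and forwards method calls to it."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        # A single worker thread serializes requests on the pipe.
        self.pool = ThreadPoolExecutor(max_workers=1)

    def _roundtrip(self, method, args):
        pickle.dump((method, args), self.proc.stdin)
        self.proc.stdin.flush()
        return pickle.load(self.proc.stdout)

    def call(self, method, *args):
        # Returns immediately; the Future plays the role of a Deferred.
        return self.pool.submit(self._roundtrip, method, args)

    def stop(self):
        self.pool.shutdown(wait=True)
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()

indexer = Subsystem()
indexer.call("index", 1, "Hello world")
hits = indexer.call("query", "hello").result()
indexer.stop()
```

The child stays up between calls, so whatever it has cached survives until someone explicitly asks it to flush or the parent stops it.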

The main thing that I want this to do is to manage the IPC and to start and stop processes as necessary. The real trick, I think, will be spawning a process that has a reactor of a variety similar to that of the superprocess; it should run twistd, with an appropriate reactor argument, and then load a subsystem module into it, but I don't know what that looks like.

Faceted



I decided I'd have a name for the new kind of Componentized. I think that "Faceted" is nice because rather than having an Adapter class which is intended to be both a generally useful superclass of all adapters AND the home of the magical getComponent-makes-a-U-turn behavior, you can have a "Facet" which does magical stuff and an "Adapter" elsewhere that is simply utility.

from twisted.python import reflect

class Facet(object):
    def __init__(self, original):
        self.original = original

    def getComponent(self, interface, registry=None, default=None):
        # The U-turn: ask the Faceted we belong to for a sibling facet.
        return self.original.getComponent(interface, registry, default)

class Faceted(dict):
    __slots__ = ()
    def getComponent(self, interface, registry=None, default=None):
        # No adapter registry lookup yet; just what was explicitly stored.
        return self.get(interface, default)


This is actually a version that I believe is compatible with the existing (crummy) component system behavior. It's just a prototype; it doesn't do adapter lookup, but most of the time you want to be explicit about putting adapters on something stateful like this anyway. I believe we'll want adapter registry lookup, but this example illustrates how simple it can be.
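To make the U-turn concrete, here is a small usage sketch. It repeats the two classes, with the facet delegating straight to its original's getComponent, so that it runs on its own. The interfaces `IGreeter` and `ICounter` and the `Greeter` and `Counter` facets are invented for the example, and plain classes stand in for real interfaces.

```python
class IGreeter(object):
    """Hypothetical interface marker."""

class ICounter(object):
    """Hypothetical interface marker."""

class Facet(object):
    def __init__(self, original):
        self.original = original

    def getComponent(self, interface, registry=None, default=None):
        return self.original.getComponent(interface, registry, default)

class Faceted(dict):
    __slots__ = ()
    def getComponent(self, interface, registry=None, default=None):
        return self.get(interface, default)

class Counter(Facet):
    def __init__(self, original):
        Facet.__init__(self, original)
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

class Greeter(Facet):
    def greet(self, name):
        # Reach a sibling facet through the shared Faceted object.
        n = self.getComponent(ICounter).increment()
        return "hello %s (greeting #%d)" % (name, n)

# Facets are attached explicitly, keyed by the interface they provide.
avatar = Faceted()
avatar[ICounter] = Counter(avatar)
avatar[IGreeter] = Greeter(avatar)

message = avatar.getComponent(IGreeter).greet("glyph")
```

The Greeter never holds a reference to the Counter; it only knows the interface, and the Faceted decides what is behind it.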

System Plug Ins



In Atop, I'm unhappy with the way Powerups turned out. They are way too stateful, because they have to procedurally add and remove themselves from sources of events; they can't just declare their areas of interest and let the framework manage the state.

There's a general need, I think, for the powerup interface to fill two roles: one is to provide the functionality it already does, which is access mechanisms from various cred-enabled protocols via credup, and the other is to respond to events in the system. This is also just a sketch, but I think it might make a good central event broadcasting mechanism for something like that:

from twisted.python import log

class Broadcaster:
    def __init__(self):
        # Maps an interface to the list of listeners registered for it.
        self.listeners = {}

    def addListener(self, interface, listener):
        l = self.listeners
        if interface not in l:
            l[interface] = []
        l[interface].append(listener)

    def removeListener(self, interface, listener):
        l = self.listeners
        if interface not in l:
            return
        l[interface].remove(listener)

    def getListeners(self, imeth):
        # imeth is an unbound method such as IFoo.bar; im_class is IFoo.
        # An interface nobody has subscribed to simply has no listeners.
        return self.listeners.get(imeth.im_class, [])

    def callMethod(self, listener, imeth, *a, **kw):
        # Call the same-named method on the listener.
        return getattr(listener, imeth.im_func.__name__)(*a, **kw)

    def broadcast(self, interfaceMethod, *a, **kw):
        m = self.getListeners(interfaceMethod)
        for i in m:
            self.callMethod(i, interfaceMethod, *a, **kw)

    def safeBroadcast(self, interfaceMethod, *a, **kw):
        # Like broadcast, but one failing listener doesn't stop the rest.
        m = self.getListeners(interfaceMethod)
        for i in m:
            try:
                self.callMethod(i, interfaceMethod, *a, **kw)
            except:
                log.err()

    def collect(self, interfaceMethod, *a, **kw):
        # Generator yielding each listener's return value.
        m = self.getListeners(interfaceMethod)
        for i in m:
            yield self.callMethod(i, interfaceMethod, *a, **kw)

    def safeCollect(self, interfaceMethod, *a, **kw):
        m = self.getListeners(interfaceMethod)
        for i in m:
            try:
                yield self.callMethod(i, interfaceMethod, *a, **kw)
            except:
                log.err()


This uses interfaces - I might have the getListeners method also adapt to the interface for extra flexibility, I'm not sure - but this, plus some kind of hierarchical channel-managing API, would allow the powerup to add itself to the appropriate broadcaster.
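As a rough illustration of the "declare your interests and let the framework manage the state" idea, here is a cut-down, standalone variant. It is an adaptation, not the sketch itself: events are named by an (interface, method name) pair instead of an unbound method, and `Powerup`, `interests`, `installOn`, `uninstallFrom`, `IMessageEvents` and `SpamCounter` are all names made up for the example.

```python
class Broadcaster(object):
    def __init__(self):
        self.listeners = {}

    def addListener(self, interface, listener):
        self.listeners.setdefault(interface, []).append(listener)

    def removeListener(self, interface, listener):
        if listener in self.listeners.get(interface, ()):
            self.listeners[interface].remove(listener)

    def collect(self, interface, methodName, *a, **kw):
        # Iterate over a copy so listeners may unsubscribe mid-broadcast.
        for listener in list(self.listeners.get(interface, ())):
            yield getattr(listener, methodName)(*a, **kw)

    def broadcast(self, interface, methodName, *a, **kw):
        for _ in self.collect(interface, methodName, *a, **kw):
            pass

class IMessageEvents(object):
    """Hypothetical event interface."""
    def messageReceived(self, subject):
        """Called whenever a message arrives."""

class Powerup(object):
    # Subclasses only declare what they care about...
    interests = ()

    # ...and the framework does the stateful bookkeeping in one place.
    def installOn(self, broadcaster):
        for interface in self.interests:
            broadcaster.addListener(interface, self)

    def uninstallFrom(self, broadcaster):
        for interface in self.interests:
            broadcaster.removeListener(interface, self)

class SpamCounter(Powerup):
    interests = (IMessageEvents,)

    def __init__(self):
        self.seen = []

    def messageReceived(self, subject):
        self.seen.append(subject)
        return len(self.seen)

events = Broadcaster()
counter = SpamCounter()
counter.installOn(events)
events.broadcast(IMessageEvents, "messageReceived", "cheap watches")
counter.uninstallFrom(events)
```

The powerup itself contains no add/remove calls at all; the only thing it says about events is the `interests` tuple.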

This would also be a nice way to encapsulate pool events through a general mechanism rather than having pools do everything themselves.

Execution Context



Every major system in Quotient requires its own execution context:

  • Atop relies on a transaction being in twisted.python.context.

  • Twisted relies on the reactor being instantiated.

  • Nevow relies on an explicitly-passed 'context' object to every method it interacts with, which is generated by the page render. This generally also requires that some data, and an IRequest implementor, be in scope at render time.


These three systems are all currently ad-hoc. They should be unified, and the context should use the same method of identifying fragments of context for all three: interface names. For example,

# in all new style examples, something like
from twisted.python.excctx import context

# Twisted
from twisted.internet import reactor
# would become
from twisted.internet.interfaces import IReactor
reactor = IReactor(context)

# Nevow
from iwhatever import IA, IB, IC
from nevow.inevow import IRequest
def doit(context, data):
    context.remember(Foo(), IA, IB, IC)
    request = context.locate(IRequest)
# would become
def doit(data):
    context[IA, IB, IC] = Foo()
    request = IRequest(context)

# Atop
c = context.get("CursorFactory").cursor()
# would become
c = ICursorFactory(context).cursor()
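
To show that the "would become" spellings above can be made to work, here is a minimal standalone sketch. None of it is an existing Twisted or Nevow API: `Context`, `InterfaceMeta` and `Interface` are invented here, the interfaces are empty stand-ins, and a real version would presumably hook into the adapter machinery rather than a metaclass.

```python
class Context(object):
    """A chain of scopes, each remembering providers keyed by interface."""

    def __init__(self, parent=None):
        self.parent = parent
        self._remembered = {}

    def __setitem__(self, interfaces, provider):
        # context[IA, IB] = x arrives here with the key (IA, IB).
        if not isinstance(interfaces, tuple):
            interfaces = (interfaces,)
        for interface in interfaces:
            self._remembered[interface] = provider

    def locate(self, interface):
        # Innermost scope wins; fall back to the enclosing scopes.
        ctx = self
        while ctx is not None:
            if interface in ctx._remembered:
                return ctx._remembered[interface]
            ctx = ctx.parent
        raise KeyError(interface)

class InterfaceMeta(type):
    # Calling an interface with a context looks the provider up in it.
    def __call__(cls, context):
        return context.locate(cls)

class Interface(metaclass=InterfaceMeta):
    pass

# Stand-in interfaces, named after the ones used in the post.
class IReactor(Interface): pass
class ICursorFactory(Interface): pass
class IA(Interface): pass
class IB(Interface): pass

class Foo(object):
    pass

root = Context()
root[IReactor] = "the reactor"

request_scope = Context(parent=root)
request_scope[IA, IB] = foo = Foo()

reactor = IReactor(request_scope)   # found in the root scope
same_foo = IA(request_scope)        # found in the inner scope
```

Two unrelated roots never see each other's providers, which is the property the rest of this post leans on.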


I think this would let us sort out a lot of thorny problems, for example, by putting the currently running services into context when they are being invoked; by saving context across Deferreds, one can do things like automatically put both parts of a Nevow page rendering with a Deferred in the middle into two different (potential) transactions. Capturing context would also be useful for a protocol to initialize itself without storage, set its storage in the login process, and then automatically have transactions from then forward with any protocol notification switching into the protocol's saved context.

For the mandatory evil associated with a rambling post like this, I was thinking about how to implement a convenient context.clone() which would push another level onto the context stack without requiring a context.call - so that you could use the new context variables in the rest of your function. This is evil enough for par, I think, and it should be obvious how one would implement such a thing from here:

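import gc
import inspect
import weakref

_wrd = {}
_wrc = 0

class _nothing(object):
    # Placeholder whose only job is to be weakly referenceable and to
    # die when the frame that holds it goes away.
    pass

def whencallerexits(x, stacklevel=2):
    global _wrd, _wrc
    up = inspect.currentframe()
    for ig in range(stacklevel):
        up = up.f_back
    n = _nothing()
    # Stash the placeholder in the target frame's locals; it is released
    # when that frame exits (this relies on CPython's refcounting).
    up.f_locals[' nothing '] = n
    cap = _wrc
    def curry(w):
        del _wrd[cap]
        x()
    # Keep the weakref alive so its callback fires when n is collected.
    _wrd[cap] = weakref.ref(n, curry)
    _wrc += 1

def printit():
    print('aaa')

def somestuff():
    print('doing some stuff')
    whencallerexits(printit)
    print('done doing some stuff')

def horrible():
    print('get ready for some horribleness')
    somestuff()
    print('here we go...')

horrible()
print("wasn't that horrible?")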


I think there may still be some remaining issues with managing which bits of context want to be remembered when captured and which don't -- for example, you still want to be able to switch the logging backend for the entire application, not just the protocols which haven't remembered their own logging frosting features -- but I believe it's possible to provide answers to a lot of questions regarding system-wide configuration by providing this context as pseudo-global state which can be addressed as an object rather than as actual global variables. This way multiple applications could still run entirely separated in the same process, just by referring to different context roots; in fact, properly done, this would enable multiple reactors to run in the same process WITHOUT any crazy bootstrapping, by using multi-threading and specifying the context root with a different reactor from the .callInThread from the initial reactor.

I hope that some of that made sense. Have a happy weekend.