A collection of articles, ideas, and rambling from a guy who wrote some software that one time.

Friday, January 20, 2012

The Concurrency Spectrum: from Callbacks to Coroutines to Craziness


Concurrent programming idioms are on a spectrum of complexity.

Obviously, writing code that isn't concurrent in any way is the easiest.  If you never introduce any concurrent tasks, you never have to debug any problems with things running in an unexpected order.  But, in today's connected world, concurrency of some sort is usually a requirement.  Each additional point where concurrency can happen introduces a bit of cognitive overhead, another place you need to think about what might happen, so as a codebase adds more of them it becomes more difficult to understand them all, and it becomes more challenging to understand subtle nuances of parallel execution.

So, at the simplest end of the spectrum, you have callback-based concurrency.  Every time you have to proceed to the next step of a concurrent operation, you have to create a new function and new scope, and pass it to the operation so that the appropriate function will be called when the operation completes.  This is very explicit and reasonably straightforward to debug and test, but it can be tedious and overly verbose, especially in Python where you have to think up a new function name and argument list for every step.  The extra lines for the function definition and return statement can be an impediment to quickly understanding the code's intentions, so what facilitates understanding of the concurrency model can inhibit understanding of the code's actual logical purpose, depending on how much concurrent stuff it has to do.  Twisted's Deferreds make this a bit easier than raw callback-passing without fundamentally changing the execution dynamic, so they're at this same level.

Then you have explicit concurrency, where every possible switch-point has to be labeled somehow.  This is yield-based coroutines, or inlineCallbacks, in Twisted.  This is more compact than using callbacks, but also more limiting.  For example, you can only resume a generator once, whereas you can run a callback multiple times.  However, for a logical flow of sequential concurrent steps, it reads very naturally, and is shorter, as it collapses out the 'def' and 'return' lines, and you have to think of at least two fewer names per step.

However, that very ease can be misleading.  You might gloss over a 'result = yield ...' more easily than a 'def whatever(result): return result; something(whatever)'.  Nevertheless, if you have 'yield's everywhere you might swap your stack, then when you have a concurrency bug, you can look at any given arbitrary chunk of code and know that you don't need any locks in it, as long as you can't see any yield statements.  Where you do see yield statements, you know that you have some code that needs to be inspected.

To continue down that spectrum, a cooperatively multithreading program with implicit context switches makes every line with any function call on it (or any line which might be a function call, like any operator which can be overridden by a special method) a possible, but not likely culprit.  Now when you have a concurrency bug you have to audit absolutely every line of code you've got, although you still have a few clues which will help you narrow it down and rule out certain areas of the code.  For example, you can guess that it would be pathological for 'x = []; ...; x.append(y)' to context switch. (Although, given arbitrary introspection craziness, it is still possible, depending on what "..." is.)  This is way more lines than you have to consider with yield, although with some discipline it can be kept manageable.  However, experience has taught me that "with some discipline" is a code phrase for "almost never, on real-life programming projects".

All the way at the end of the spectrum of course you have preemptive multithreading, where every line of code is a mind-destroying death-trap hiding every possible concurrency peril you could imagine, and anything could happen at any time.  When you encounter a concurrency bug you have to give up and just try to drink your sorrows away.  Or just change random stuff in your 'settings.py' until it starts working, or something.  I never really did get comfortable in that style.  With some discipline, you can manage this problem by never manipulating shared state, and only transferring data via safe queueing mechanisms, but... there's that phrase again.

Some programming languages, like Erlang, support efficient preemptive processes with state isolation and built-in super-cheap super-fast queues to transfer immutable values.  (Some other languages call these "threads" anyway, even though I would agree with Erlang's classification as "processes".)  That's a different programming model entirely though, with its own advantages and challenges, which doesn't land neatly on this spectrum; if I'm talking about left and right here, Erlang and friends are somewhere above or below.  I'm just describing Python and its ilk, where threads give you a big pile of shared, mutable state, and you are constantly tempted to splash said state all over your program.

Personally I like Twisted's style best; the thing that you yield is itself an object whose state can be inspected, and you can write callback-based or yield-based code as each specific context merits.  My opinion on this has shifted over time, but currently I find that it's best to have a core which is written in the super-explicit callback-based approach with no coroutines at all, and then high-level application logic which wraps that core using yield-based coroutines (@inlineCallbacks, for Twisted fans).

I hope that in a future post, I may explain why, but that would take more words than I've got in me tonight.

14 comments:

Michael Hunter said...

It is very useful for people to understand the range of options they have. I find new grads tend to over weight the preemptive thread model. And you and they forget the process model for managing concurrency.

I think you over weight the complexities of preemptive threading. I grew up writing interrupt handlers and spending some of my early life working in the OS world so maybe I don't get why people think this is mind bending. But it really isn't. And it pays to understand how to use techniques to manage complexity when writing threaded code.

I think you miss that the real issue here isn't which one is easier to get code right in over the other. But when to make the trade off between loosely and tightly coupled and between cooperative multitasking and preemptive multitasking (roughly).

Study all the techniques until you can code comfortable in all of them. It is a real bear as long as you think something is hard. But in reality it really isn't.

Adam said...

After learning some Golang I have to say that I think we are doing it wrong in the Python community.

A simple keyword added to the language tied to a very clear and easy to reason about pipe-like model.

It's really kind of a combination of gevent and Queue.queue but without the noise and hassle.

Most importantly it's first class in the language and is put forward as the "right" way to do it (which I think it is for many many problems). Go still has locks etc but you have to import to get to them.

I'm really concerned that we are exploring and expounding multiprocess methods and styles that are going to lose Python that clarity that draws people to it in the first place.

Guido van Rossum said...

"currently I find that it's best to have a core which is written in the super-explicit callback-based approach with no coroutines at all, and then high-level application logic which wraps that core using yield-based coroutines"

Amen. This is exactly how I have developed NDB, a client library for App Engine's datastore that supports both synchronous calls and yield-based async calls in its user-facing API. The lowest level is an event loop based on callbacks.

Allen Short said...

Adam: The problem is that Go still shares mutable state between its coroutines, so you still have the problems reasoning about code that you'd have with greenlet/gevent/stackless etc.

CloudMarc said...

This is a great encapsulation of what I've been trying to tell developers for years, so I thank you!

I would add a little elaboration on the simple end of the spectrum (non-threading, non-concurrent): I have found that a fast, shared, atomic push-pop queue (redis ;) can get you a long way in some use cases.

Matt Joiner said...

Adam: What Allen Short said.
Allen Short: What you said.

Golang doesn't solve the mutability problem. It's definitely not practical to use channels everywhere. That said, explicit coroutines in Python don't allow for CPU parallelism, but then that's the true reason for OS threads, and they should only be used for that purpose, and only when actually needed.

Peter Cai said...

Can anyone give an example of "yield-based coroutines, or inlineCallbacks, in Twisted"?

I used Twisted 2 years ago a lot but I couldn't remember seeing these terms in any tutorials about Twisted. Is this a new trend?

---

BTW, from my experience, pure callback based concurrency can become very messy as some asynchronous calls need to be done sequentially , and when this sequence has to change according to the results of results of those calls.

muharem said...

Matt: re. golang: why is it not practical to use channels everywhere?

I find them to be pretty lightweight and practical so far.

Nicola Larosa said...

Peter Cai: inlineCallbacks are not the native Twisted, sometimes can be frowned upon, and apparently they have not yet found their way in the documentation, apart from the reference, that includes some examples.

Their style has been generalized by Monocle, that runs on top of either Twisted or Tornado.

Nicola Larosa said...

"...not the native Twisted style..."

Timo Savola said...

This has a Coroutines Considered Harmul-ring to it. The fact that there are bad programmers out there shouldn't limit other programmers' choices.

Steve said...

"With some discipline, you can manage this problem by never manipulating shared state, and only transferring data via safe queueing mechanisms."

Could somebody explain to me how this is not *also* true of callback driven code? It's an argument I see continually bandied about, and I've never been able to understand how using callbacks makes mutable shared state any less perilous.

Allen Short said...

It's not true of callback-driven code insofar as you can reason about when context switches happen when you don't have coroutines. If you know when context switches occur, you can know what state is vulnerable to changes by code you can't see.

gelin yan said...

I like the idea:

reactor & callback at the lowest level, yield on app level. I really like inlineCallbacks which I use daily.

use yield & inlineCallbacks are able to keep my code succinct & short...

using callbacks is an nightmare....