Blocking vs. Running

Friday November 04, 2011
I've heard tell of some confusion lately around what the term "non-blocking" means.  This isn't the first time I've tried to explain it, and it certainly won't be the last, but blogging is easier than the job Sisyphus got, so I can't complain.

A thread is blocking when it is waiting for an input or output operation to complete, and that operation may take an unknown amount of time.  Crucially, a blocking thread is doing no useful work.  It is stuck, consuming resources (in particular, its thread stack and its process table entry) and getting nothing done.  These are resources that one can most definitely run out of, and most operating systems limit them artificially, because if there are too many of them in use, the system bogs down and becomes unusable.

A thread may also be "stuck" doing some computationally intensive work: performing a complex computation and sucking up CPU cycles.  There is a very important distinction here, though.  If that thread is burning up CPU, it is getting work done.  It is computing.  This is why we have computers: to compute things.

It is of course possible for a program to have a bug where it goes into an infinite loop, or otherwise burns CPU without getting anything useful done for the user.  If that's happening, then the program is just buggy, or inefficient, but it is not blocking.  It might be "thrashing" or "stuck" or "broken", but "blocking" means something more specific: that the program is sitting around, doing none of its own work, while it waits for some other thing to get work done.
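One rough way to see the difference is to compare wall-clock time with CPU time.  The sketch below is only an illustration: `time.sleep` stands in for a blocking I/O call, and the helper names `measure`, `blocked` and `busy` are made up for this example.

```python
import time


def measure(work):
    """Run work() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


def blocked(seconds=0.2):
    # Stand-in for a blocking I/O call: the thread waits and computes nothing.
    time.sleep(seconds)


def busy(seconds=0.2):
    # CPU-bound work: spin until this process has used `seconds` of CPU time.
    deadline = time.process_time() + seconds
    total = 0
    while time.process_time() < deadline:
        total += 1
    return total


blocked_wall, blocked_cpu = measure(blocked)
busy_wall, busy_cpu = measure(busy)
print(f"blocked: wall={blocked_wall:.2f}s cpu={blocked_cpu:.2f}s")
print(f"busy:    wall={busy_wall:.2f}s cpu={busy_cpu:.2f}s")
```

The blocked call should show plenty of wall-clock time and almost no CPU time, while the busy one uses CPU for the whole time it runs.  The exact numbers will vary from machine to machine.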

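A program written in an event-driven style may be busy as long as it needs to be, but that does not mean it is blocking.  It never sits waiting on one particular operation; it is told when something has happened, and then it does its work.  Hence, event-driven and non-blocking are synonyms.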
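To make that concrete, here is a toy event loop built on Python's standard `selectors` module.  It is a sketch, not how Twisted's reactor is implemented, and `run_loop`, `on_readable` and the "other work" counter are all invented for the illustration.  The loop asks which sockets are ready, runs a callback for each, and otherwise gets on with something else.

```python
import selectors
import socket


def run_loop(reader, writer):
    """Service `reader` with a callback while doing other work in between."""
    sel = selectors.DefaultSelector()
    reader.setblocking(False)
    log = []  # record of what the loop did, in order

    def on_readable(sock):
        # Called only when data has already arrived, so recv() will not wait.
        log.append(("received", sock.recv(1024)))

    sel.register(reader, selectors.EVENT_READ, on_readable)

    other_work = 0
    while not log or log[-1][0] != "received":
        # timeout=0 means "tell me what is ready right now and return".
        # A real loop would normally pass a timeout and sleep until an event.
        ready = sel.select(timeout=0)
        for key, _mask in ready:
            key.data(key.fileobj)  # dispatch the event to its callback
        if not ready:
            other_work += 1
            log.append(("worked", other_work))
            if other_work == 3:
                writer.sendall(b"hello")  # the peer finally sends something

    sel.unregister(reader)
    sel.close()
    return log


reader, writer = socket.socketpair()
log = run_loop(reader, writer)
reader.close()
writer.close()
print(log)
```

The program is never stuck inside `recv()`; the read happens only once the loop has been told there is data to read.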

Furthermore, non-blocking doesn't necessarily mean single-process.  Twisted is non-blocking, for example, but it has a sophisticated facility for starting, controlling and stopping other processes.  Information about changes to those processes is represented as plain old events, making it reasonably easy to fold the results of computation in another process back into the main one.

If you need to perform a lengthy computation in an event-driven program, that does not mean you need to stop the world in order to do it.  It doesn't mean that you need to give up on the relatively simple execution model of an event loop for a mess of threads, either.  Just ask another process to do the work, and handle the result of that work as just another event.
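Here is a minimal sketch of that idea.  It uses the standard library's `asyncio` rather than Twisted, purely so that it runs without installing anything; the shape of the solution is the same.  `CHILD_CODE`, `compute_in_child` and `heartbeat` are hypothetical names, and the sum of squares is just a stand-in for a lengthy computation.

```python
import asyncio
import sys

# Hypothetical lengthy computation, run in a separate Python process.
CHILD_CODE = "print(sum(i * i for i in range(1_000_000)))"


async def compute_in_child():
    # Start the child; its output and exit arrive as events on the loop.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_CODE,
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return int(out)


async def heartbeat(ticks):
    # Stands in for the other events the loop keeps servicing meanwhile.
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    result = await compute_in_child()
    beat.cancel()
    return result, len(ticks)


result, tick_count = asyncio.run(main())
print(result, tick_count)
```

The heartbeat keeps ticking while the child process does the arithmetic, and the answer comes back as just another event.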