Deciphering
Glyph
( )
This Word, "Scaling"

Mon 30 June 2008

You keep using that word.  I do not think it means what you think it means.
        — Inigo Montoya
It seems that everyone on the blogosphere, including Divmod, is talking about "scaling" these days.  I'd like to talk a bit about what we mean ­— and by "we" I mean both the Twisted community and Divmod, Inc., — when we talk about "scaling".

First, some background.

Google Versus Rails

Everyone knows that Scaling is a Good Thing.  It's bad that Rails "doesn't scale" — see Twitter.  It's good that the Google App Engine scales — see... well, Google.  These facts are practically received wisdom in the recent web 2.0 interblag.  The common definition of "scaling" which applies to these systems is the "ability to handle growing amounts of work in a graceful manner".

And yet (for all that I'd like to rag on Twitter), Twitter serves hojillions of users umptillions of bytes every month, and (despite significant growing pains) continues to grow.  So in what sense does it "not scale"?  While that's going on, Google App Engine has some pretty draconian restrictions on how much an application can actually do.  So it remains to be seen whether GAE will actually scale, and right now you're not even allowed to scale it.  Why, exactly, do we say that one system "scales" and the other doesn't, when the actual data available now says pretty much the opposite?

A GAE application may not scale today, but when Our Benefactors over at the big "G" see fit to turn on the juice, you won't have to re-write a single line of your code.  It will all magically scale out to their demonstrably amazing pile of computers — assuming you haven't done anything particularly dumb in your own code.  All you have to do is throw money at the problem.  Well, actually, you throw the money at Google and they will take the problem away for you, and you will never see it again.  It accomplishes this by providing you with an API for accessing your data, and forbidding most things that would cause your application to start depending on local state.  These restrictions are surprisingly strict if you are trying to write an application that does things other than display web pages and store data, but that functionality does cover a huge number of applications.

Rails, on the other hand, does not provide facilities for scaling.  For one thing, it doesn't provide you with a concurrency model.  Rails itself is not thread safe, nor does it allow any multiplexing of input and ouptut, so you can't share state between multiple HTTP connections.  Yet, Rails encourages you to use "normal" ruby data structures, not inter-process-communication-friendly data structures, to enforce model constraints and do other interesting things.  It's easily to add logic to your rails application which is not amenable to splitting over multiple processes, and it's hard to make sure you haven't accidentally done so.  When you use the only concurrency model it really supports, i.e. locking and using transactions via the database, Rails strongly encourages you to consider your database connection global, so "sharding" your database requires significant reconsiderations of your application logic.

These technical details are interesting, but they all point to the same thing.  The key difference between Rails and GAE is the small matter of writing code.  If you write an application with Rails, you probably have to write a whole bunch of new code, or at least change around all of your old code, in order to get it to run on multiple computers.  With GAE, the code you start with is the code you scale with.

Economics of Scale

The key feature of "scalability" that most people care about is actually the ability of a system to efficiently convert money to increased capacity.  Nobody expects you to be able to run a networked system for a hundred million users on a desktop PC.  However, a lot of business people — especially investors — will expect you to be able to run a system for a hundred million users on a data-center with ten million cores in it.  Especially if they've just bought one for you.

Coding is an activity that is notoriously inefficient at converting money into other things.  It's difficult to predict.  It's slow.  But most unnervingly to people with money to invest, pouring money on a problematic software project is like pouring water on an oil fire: adding more manpower to a late software project makes it later.  If you have a hard software problem, you want to identify it early and add the manpower as soon as possible, because you won't be able to speed things along later if you start running into trouble.

So, the thing that pundits and entrepreneurs alike are thinking about when they start talking about "scalability" is eliminating this extra risky phase of programming.  Investors (and entrepreneurs) don't mind investing some money in a "scaling solution", but they don't want to do it when they are in the hockey-stick part of the growth curve, making first impressions with their largest number of customers, and having system failures.  So we're all talking about what hot new piece of technology will solve this problem.

At a coarse granularity, this is a useful framing of the issue.  Technology investment and third-party tools really can help with scaling.  Google and Amazon obviously know what they're doing when it comes to world-spanning scale, and if they're building tools for developers, those tools are going to help.

As you start breaking it down into details, though, problems emerge.  Front and center is the problem that scalability is actually a property of a system, not an individual layer of that system, infrastructure or no.  Even with the best, sexiest, most automatic scaling layer, you can easily write code that just doesn't scale.  As a soon-to-be purveyor of "scalability solutions" myself, this is a scary thought: it's easy to imagine a horror story where a tiny, but hard to discover error in code written on top of our infrastructure makes it difficult to scale up.

That error need not be in the application code.  The scaling infrastructure itself could have some small issue which causes problems at higher scales.  After all, you can do extensive testing, code review, profiling and load analysis and still miss something that comes up only under extremely high load.

Does Twisted Scale?

Just about any answer to this question that you can imagine is valid, so I'll go through them all and explain what they might mean.

No.

Applications written using Twisted can very easily share lots of state, require local configuration, and do all kinds of things which make them unfriendly to distribution over multiple nodes.  Since there is no 'canonical' Twisted application (in fact, you might say that the usual Twisted application is simply an application unusual enough to be unsuited to a more traditional LAMP-type infrastructure), there's no particular documented model for writing a Twisted application that scales up smoothly.  None of the included services do anything special to provide automatic scaling.  There are no state-management abstractions inside Twisted.  If you talk to a database in a Twisted application, the normal way to do it is to use a normal DB-API connection.

When I discussed Rails above, I said that the reason it doesn't scale is that it's too easy, by default, to write applications that don't scale.  Therefore we must conclude that Twisted doesn't scale.

Yes.

Twisted is mainly an I/O library, and it uses abstract interfaces to define application code's interface with sockets and timers.  Twisted itself includes several different implementations of different strategies for multiplexing between these timers, including several which are platform-specific (kqueue, iocp), squeezing the maximum scale out of your deployment platform, even if it changes.

I said above that infrastructure is scalable if it lets you increase your scale without changing your code.  It would make sense to say that Twisted scales because it allows you to increase the number of connections that you're handling by changing your reactor without changing your code.

You could also say that Twisted is scalable because it is an I/O library, and communication between different nodes is almost the definition of scale these days.  Not only can you write scalable systems easily using Twisted's facilities, you can use Twisted as a tool to make other systems scale, as part of a bespoke caching daemon or database proxy.  Several Twisted users use it this way.

Maybe.

Being mostly an I/O library, Twisted itself is rarely the component most in need of optimization.  Being mostly an implementation of mechanisms rather than policies, Twisted gives you what you need to achieve scale but doesn't force, or even encourage you, to use it that way.

For the most part, it's not really interesting to talk about whether Twisted scales or not.  The field of possibilities of what you can do with Twisted is too wide open to allow that sort of classification.

What about Divmod? Does Mantissa scale?

Mantissa, lest you have not heard of it already, is the application server that we are developing at Divmod.  Mantissa is based on Twisted, among other components.  However, there's a big difference in what the answer to the "scaling" question means than it means to Twisted.

Twisted is very general and can be used in almost any type of application, from embedded devices to web services to thick clients to system management consoles.  It's almost as general as Python itself — with the notable exception that you can't use Twisted on Google App Engine because they don't allow sockets.  As part of being general, Twisted doesn't dictate much about the structure of your application, except that it use an event loop.  You can manage persistent state however you want, deal with configuration however you want.

Mantissa, on the other hand, is only for one type of application: multi-user, server-side applications, with web interfaces.  You might be able to apply it to something else but you would be fighting it every step of the way.  (Although if you wanted to use Mantissa's components for other types of applications, the more general parts decompose neatly into Nevow and Axiom.)  So the question of "does it scale" is a bit more interesting, since we can talk about a specific type of application rather than a near-infinite expanse of possibilities.  Does Mantissa scale to large numbers of users for these types of "web 2.0" applications?

Unfortunately, the fact that the question is simpler doesn't make the answer that much simpler, so here it is:

Almost...

Mantissa has a few key ingredients that you need to build a system that scales out. The biggest one is a partitioned data-model.  Each user has their own database, where their data is stored.

A very common "web 2.0" scaling plan — perhaps the most common — is to have an increasing number of web servers, all pointed at a single giant database with an increasingly ridiculous configuration — gigabytes of RAM, terabytes of disk, fronted by a bank of caching servers.  This works for a while.  For many sites, it's actually sufficient.  But it has a few problems.

For one thing, it has a single point of failure.  If your database server goes down, your service goes down.  Your database server isn't a lightweight "glue" component, either, so it's not a single point of failure you can quickly recover if it goes down.  Even worse, it means that even in the good scenario, where you can scale to capacity, your downtime is increased.  Each time you upgrade the database, the whole site goes down.  This problem gets compounded because a lot of sites are append-only databases with increasingly large volumes of data to migrate for each upgrade.

Another issue is that it increases load on your administrators, because they are responsible for an increasingly finicky and stressed database server.  This may actually be a good thing — administrators are not programmers, after all, and are therefore a more reliable and easier resource to throw money at.  Unfortunately there are (almost by definition) fewer things that admins can do to improve the system.  Because the admins can't actually solve the root problems that make their lives difficult, it's easier for them to get frustrated and leave for an environment where they won't be so stressed.

The reason websites choose this scaling model is that popular frameworks, or even non-frameworks like "let's just do it in PHP", make it easy to just use a single database, and to write all the application logic to depend on that single database as the point of communication between system components.  So the scaling plan is just working with the code that was written before anybody thought about scaling.

If you write an application with Mantissa today, it's easiest to toss the data into different databases depending on who it is for, so when you get to dealing with the "scaling" part of the problem, you can put those databases onto different computers, and avoid the single point of failure.  Moreover, when you write an application with Mantissa, you get "global" services like signup and login as part of the application server, so your application code can avoid the usual schema pitfalls (the "users" table, for example) which require a site to have a single large database.

There's only one problem with that plan.

... but not quite.

In my humble opinion, Mantissa offers some interesting ideas, but there are a few reasons you won't get scaling "for free" with Mantissa if you use it right now, today.

You may be noticing about now that I didn't mention any way to communicate between those partitioned chunks of data.  This is what I've been spending most of my last few weeks on.  I have been working on an implementation of an "eventually consistent" message-passing API for transferring messages between user databases in a Mantissa server.  You can see the progress of this work on the Divmod tracker, where the ticket is nearing the end of its year-long journey, and already in its (hopefully) final review.

I'm particularly excited about this feature, because it completes the Mantissa programming model to the point where you can really use it.  It's the part of the system that most directly impacts your own code, and thereby allows you to more completely dodge the bullet of modifying a bunch of your application's logic when you want to scale.  There might be some dark corners — for example, a scalable API for interacting with the authentication system — but those should only affect a small portion of a small number of applications.  Unfortunately communication between databases is not the only issue we have remaining.

There's more to the scaling problem than getting the application code to be the right shape.  The infrastructure itself needs to present a container that does the heavy lifting of scalability for the code that it contains.  For example, Mantissa needs a name server and a load balancer that will direct requests to the appropriate server for the given chunk of data.  It also needs a sign-up and account management interface that will make an informed decision about where to locate a new user's data, and be able to transparently migrate users between servers if load patterns change.  Finally there are enhanced features, like replicating read-only data to multiple hosts, for applications (for example, a blogging system) which have heavy concentrations of readers on small portions of data.

Finally there are problems of optimization.  We haven't had much time to optimize Mantissa or Athena, and already on small-scale systems we have seen performance issues, especially given the large number of requests that an Athena page can generate.  We need to make some time to implement the optimizations we know we need, and when we start scaling up our first really big system, I'm sure that we'll discover other areas that need tweaking.

Why Now?

I'm fond of saying that programming is like frisbee, and predictions more specific than "hey, watch this!" are dangerous.  So you might wonder why I'm talking about such a long-running future plan in such detail.  You might be wondering why I would think that you'd be interested in something that isn't finished yet.  Perhaps you think it's odd that I've described the challenges in such detail rather than being more positive about how awesome it is.

While I certainly don't want to publicly commit to a time-frame for any of this work to be finished, I do feel pretty comfortable saying that it's going to happen.  The design for scalability I've discussed here has been a core driving concern for Mantissa since its beginning, and it's something that's increasingly important to our business and our applications.

I'm being especially detailed about Mantissa's incompleteness because I want to make sure that potential users' expectations are set appropriately.  I don't want anyone coming to the Divmod stack after having heard me say vague things about "scalability", believing that they'll get an application that scales to the moon.

I do think that this is an exciting time for other developers to get involved though.  Mantissa is at a point where there are lots of bits of polish that need to be added to make it truly useful.  Starting to investigate it for your application now will give you the opportunity to provide feedback while it's still being formed, before a bunch of final decisions have been made and a lot of application code has been written to rely on them.

More Later...

I've got more to say about scaling, Twisted, and Mantissa, of course.  In particular I'd like to explain why I think Mantissa is an interesting scaling strategy and how it compares to the other ones.  At this rate, though, I'll only write one blog post this year!  I'm sure you hope as much as I do that the next one won't be so long...