( )
A Tired Hobgoblin

Sun 21 October 2012

Alternate (Boring) Title: Why the Twisted coding standard is better than PEP8 (although you still shouldn't care)

People often ask me why Twisted's coding standard – camel case instead of underscores for method names, epytext instead of ReST for docstrings, underscores for prefixes – is "weird" and doesn't, for example, follow PEP 8.

First off, I should say that the Twisted standard actually follows quite a bit of PEP 8.  PEP 8 is a long document with many rules, and the Twisted standard is compatible in large part.  For example, pretty much all of the recommendations in the section on pointless whitespace.

Also, the primary reason that Twisted differs at all from the standard practice in the Python community is that the "standard practice" was almost all developed after Twisted had put its practices in place.  PEP 8 was created on July 5, 2001; at that point, Twisted had already existed for some time, and had officially checked in its first coding standard just a smidge over one month earlier, on May 2, 2001.

That's where my usual explanation ends.  If you're making a new Python project today, unless it is intended specifically as an extension for Twisted, you should ignore the relative merits of these coding standards and go with PEP 8, because the benefits of consistency generally outweigh any particular benefits of one coding standard or another.  Within Twisted, as PEP 8 itself says, "consistency within a project is even more important", so we're not going to change everything around just for broader consistency, but if we were starting again we might.


There seems to be a sticking point around the camelCase method names.

After ten years of fielding complaints about how weird and gross and ugly it is – rather than just how inconsistent it is – to put method names in camel case, I feel that it is time to speak out in defense of the elegance of this particular feature of our coding standard.  I believe that this reaction is based on Python programmers' ancestral memory of Java programs, and it is as irrational as people disliking Python's blocks-by-indentation because COBOL made a much more horrible use of significant whitespace.

For starters, camelCase harkens back to a long and venerable tradition.  Did you camelCase haters-because-of-Java ever ask yourselves why Java uses that convention?  It's because it's copied from the very first object-oriented language.  If you like consistency, then Twisted is consistent with 34 years of object-oriented programming history.

Next, camelCase is easier to type.  For each word-separator, you have only to press "shift" and the next letter, rather than shift, minus, release shift, next letter.  Especially given the inconvenient placement of minus on US keyboards, this has probably saved me enough time that it's added up to at least six minutes in the last ten years.  (Or, a little under one-tenth the time it took to write this article.)

Method names in mixedCase are also more consistent with CapitalizedWord class names.  If you have to scan 'xX' as a word boundary in one case, why learn two ways to do it?

Also, we can visually distinguish acronyms in more contexts in method names.  Consider the following method names:
  • frog_blast_the_vent_core
  • frogBLASTTheVentCore
I believe that the identification of the acronym improves readability. frog_blast_the_vent_core is just nonsense, but frogBLASTTheVentCore makes it clear that you are doing sequence alignment on frog DNA to try to identify variations in core mammalian respiration functions.

Finally, and this is the one that I think is actually bordering on being important enough to think about, Twisted's coding standard sports one additional feature that actually makes it more expressive than underscore_separated method names.  You see, just because the convention is to separate words in method names with capitalization, that doesn't mean we broke the underscore key on our keyboards.  The underscore is used for something else: dispatch prefixes.

Ironically, since the first letter of a method must be lower case according to our coding standard, this conflicts a little bit with the previous point I made, but it's still a very useful feature.

The portion of a method name before an underscore indicates what type of method it is.  So, for example:
  • irc_JOIN - the "irc_" prefix on an IRC client or server object indicates that it handles the "JOINED" message in the IRC protocol
  • render_GET - the "render_" prefix on an HTTP resource indicates that this method is processing the GET HTTP method.
  • remote_loginAnonymous - the "remote_" prefix on a Perspective Broker Referenceable object indicates that this is the implementation of the PB method 'loginAnonymous'
  • test_addDSAIdentityNoComment - the "test_" prefix on a trial TestCase indicates that this is a test method that should be run automatically. (Although for historical reasons and PyUnit compatibility the code only actually looks at the "test" part.)
The final method name there is a good indication of the additional expressiveness of this naming convention.  The underscores-only version – test_add_dsa_identity_no_comment – depends on context.  Is this an application function that is testing whether we can add a ... dissah? ... identity with no comment?  Or a unit test?  Whereas the Twisted version is unambiguous: it's a test case for adding a D.S.A. identity with no comment.  It would be very odd, if not a violation of the coding standard, to name a method that way outside of a test suite.

Hopefully this will be the last I'll say on the subject.  Again, if you're starting a new Python project, you should really just go ahead and use PEP 8, this battle was lost a very long time ago and I didn't even really mind losing it back then.  Just please, stop telling me how ugly and bad this style is.  It works very nicely for me.