The Joel Un-test

Joel Spolsky seems to like controversy, and I can see why. Being a contrarian ideologue is pretty sweet.

Some people have been suggesting that the Joel Test should mention "100% unit test coverage" as well. Personally, I think that's a great idea. The industry is evolving, and automated testing is getting into the suite of tools that every competent programmer should be familiar with.

Joel disagrees, since 100% coverage is "a bit too doctrinaire about something you may not need".

For what it's worth, I don't completely disagree with Joel. Some of the software that I work on doesn't have 100% test coverage, and that's okay. I wrote it before I learned about unit testing. I'm not freaking out and spending all of my time just writing tests for old code which apparently works.

However, we do have policies in place to add test coverage whenever we change anything. Those policies stipulate that 100% coverage is a requirement for any new or changed code, so I consider myself a fan of 100% coverage and I generally think it's a good idea. I do think it belongs on the Joel Test, or at least something like it.

I feel like my opinions are representative of a pretty substantial number of "agile" practitioners out there, so I'd just like to respond to a few points:

Joel mentions "SOLID principles", as if they're somehow equivalent to unit testing. As if the sentiment that leads one to consider 100% test coverage a great idea leads one into a slavish architectural death-spiral, where any number of "principles" are dogma if they have a sticker that says "agile" stuck to them.

Let me be clear. I think that SOLID, at least as Joel's defined it, is pointlessly restrictive. (I'd never heard of it before.) As a guy who spends a lot of time implementing complex state machines to parse protocols, I find "a class should have only one reason to change" a gallingly naive fantasy. Most of the things that Joel says about SOLID are true, especially if you're using a programming language that forces you to declare types all over the place for everything. (In Python, you get the "open" part of "OCP", and the "clients aren't forced" part of "ISP" for free.) It does sound, in many ways, like the opposite of "agile".

So, since SOLID and unit testing are completely unrelated, I think we can abandon that part of the argument. I can't think of anyone I know who likes unit testing and would also demand slavish adherence to those principles. I agree that it sounds like it came from "somebody that has not written a lot of code, frankly".

On the other hand, Joel's opinion about unit tests sounds like it comes from someone who has not written a lot of tests, frankly.

He goes on and on about how the real measure of quality is whether your code is providing value to customers, and sure you can use unit tests if that's working for you, but hey, your code probably works anyway.

It's a pretty weaselly argument, and I think he knew it, because he kept saying how he was going to get flamed. Well, Mr. Spolsky, here at least that prediction has come true ;-).

It's weaselly because any rule on the Joel Test could be subjected to this sort of false equivalence. For example, let's take one of his arguments against "100%" unit testing and apply it to something that is already on the Joel Test, version control:

But the real problem with version control as I've discovered is that the type of changes that you tend to make as code evolves tend to sometimes cause conflicts. Sometimes you will make a change to your code that causes a conflict with someone else's changes. Intentionally. Because you've changed the design of something... you've moved a menu, and now any other developer's changes that relied on that menu being there... the menu is now elsewhere. And so all those files now conflict. And you have to be able to go in and resolve all those conflicts to reflect the new reality of the code.

This sounds really silly to anyone who has really used version control for any length of time. Sure, sometimes you can get conflicts. The whole point of a version control system is that you have tools to resolve those conflicts, to record your changes, and so on.

The same applies to unit tests. You get failures, but you have tools to deal with the failures. Sure, sometimes you get test failures that you knew about in advance. Great! Now, instead of having a vague intuition about what code you've broken intentionally, you actually have some empirical evidence that you've only broken a certain portion of your test suite. And sure, now you have to delete some old tests and write some new tests. But, uh... aren't you deleting your old code, and writing some new code? If you're so concerned about throwing away tests, why aren't you concerned about throwing away the code that the tests are testing?

The reason you don't want to shoot for 90% test coverage is the same reason you don't want to shoot for putting 90% of your code into version control, or automating 90% of your build process into one step, and so on: you don't know where the bugs are going to crop up in your code. After all, if we knew where the bugs were, why would we write any tests at all? We'd just go to where the bugs are and get rid of them!

If you test 90% of your code, inevitably, the bugs will be in the 10% that you didn't test. If you automate 90% of your build, inevitably the remaining non-automated 10% will cause the most problems. Let's say getting the optimization options right on one particular C file is really hard. Wouldn't it be easier to just copy the .o file over from Bob's machine every time you need to link the whole system, rather than encoding those options in some kind of big fancy build process that you'd just have to maintain, and maybe change later?

Joel goes on to make the argument that, if he were writing some software that "really needed" to be bulletproof, he'd write lots of integration tests that exercised the entire system at once to prove that it produced valid output. That is a valid testing strategy, but it sort of misses the point of "unit" tests.

The point of unit tests (although I'll have to write more on this later, since it's a large and subtle topic) is to verify that your components work as expected before you integrate them. This is because bugs are easier to fix the sooner you find them: the same argument Joel makes for writing specs. And in fact if you read Mr. Spolsky's argument for writing specs, it can very easily be converted into an argument for unit testing:

Why won't people write unit tests? People like Joel Spolsky claim that it's because they're saving time by skipping the test-writing phase. They act as if test-writing was a luxury reserved for NASA space shuttle engineers, or people who work for giant, established insurance companies. Balderdash. ... They write bad code and produce shoddy software, and they threaten their projects by taking giant risks which are completely uncalled for.

You think your simple little function that just splits a URL into four parts is super simple and doesn't need tests because it's never going to have bugs that mysteriously interact with other parts of the system, causing you a week of debugging headaches? WRONG. Do you think it was a coincidence that I could find a link to the exact code that Joel mentions? No, it's not, because any component common enough to make someone think that it's so simple that it couldn't possibly have bugs in it, is also common enough that there are a zillion implementations of it with a zillion bugs to match.
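To make that concrete, here is a minimal sketch of what tests for such a function might look like. The helper name split_url, its four-part return value, and the default-port table are assumptions made up for this illustration; they are not taken from the code Joel mentions. It leans on urlsplit from Python's standard library for the actual parsing.

```python
import unittest
from urllib.parse import urlsplit


def split_url(url):
    """Hypothetical helper: split a URL into (scheme, host, port, path)."""
    parts = urlsplit(url)
    # Assumption for this sketch: fall back to the conventional default port.
    default_ports = {"http": 80, "https": 443}
    port = parts.port if parts.port is not None else default_ports.get(parts.scheme)
    # Assumption for this sketch: an empty path means the root of the site.
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


class SplitURLTests(unittest.TestCase):
    def test_simple(self):
        self.assertEqual(split_url("http://example.com/a/b"),
                         ("http", "example.com", 80, "/a/b"))

    def test_explicit_port(self):
        self.assertEqual(split_url("https://example.com:8443/"),
                         ("https", "example.com", 8443, "/"))

    def test_empty_path(self):
        self.assertEqual(split_url("http://example.com"),
                         ("http", "example.com", 80, "/"))

    def test_uppercase_host(self):
        # The hostname attribute of urlsplit's result is lowercased.
        self.assertEqual(split_url("http://EXAMPLE.com/x")[1], "example.com")
```

Saved as a module, this can be run with python -m unittest. Even for four "obvious" parts, each test pins down a decision (default port, empty path, host case) that a hand-rolled version could easily get wrong.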

Unlike specs, which just let you find bugs earlier, tests also make finding (and fixing) a bug later cheaper.

Watching a test-driven developer work can be pretty boring. We write a test. We watch it fail. We make it pass. We check it in. Then we write another test. After a while of watching this, a manager will get itchy and say, Jeez! Why can't you just go faster! Stop writing all these darn tests already! Just write the code! We have a deadline!

The thing that the manager hasn't noticed here is that every ten cycles or so, something different happens. We write a test. It succeeds. Wait, what? Oops! Looks like the system didn't behave like we expected! Or, the test is failing in a weird way, before it gets to the point where we expect it to fail. At this point, we have just taken five minutes to write a test which has saved us four hours of debugging time. If you accept my estimate, that's 10 tests × 5 minutes, which is almost an hour, to save 4 hours. Of course it's not always four hours; sometimes it's a minute, sometimes it's a week.
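The back-of-the-envelope arithmetic can be written out as a tiny sketch. The three inputs are the rough estimates from the paragraph above, not measurements, and the function name payoff_ratio is invented for this example.

```python
def payoff_ratio(minutes_per_test, tests_per_surprise, hours_saved_per_surprise):
    """Minutes of debugging saved per minute spent writing tests."""
    minutes_spent = minutes_per_test * tests_per_surprise
    minutes_saved = hours_saved_per_surprise * 60
    return minutes_saved / minutes_spent


# Estimates from the text: 5 minutes per test, a surprise every
# ten tests or so, and four hours of debugging avoided each time.
ratio = payoff_ratio(minutes_per_test=5, tests_per_surprise=10,
                     hours_saved_per_surprise=4)
print(f"50 minutes spent, 240 minutes saved, ratio {ratio:.1f}x")
```

Under those guesses the tests pay for themselves almost five times over. Plugging in "a minute" or "a week" for the time saved shows how widely the ratio swings from one surprise to the next.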

If you're not paying attention, this was just a little blip. The test failed twice, rather than once. So what? It's not like you wouldn't have caught that error eventually anyway!

Of course, nobody's perfect, so sometimes we make a mistake anyway and it slips through to production, and we need to diagnose and fix it later. The big difference is that, if we have 100% test coverage, we already have a very good idea of where the bug isn't. And, when we start to track it down, we have a huge library of test utilities that we can use to produce different system configurations. A test harness gives us a way to iterate extremely rapidly to create a test that fails, rather than spinning up the whole giant system and entering a bunch of user input for every attempt at a fix.

This is the reason you don't just write giant integration tests first. If you've got a test that just tells you "COMPILE FAILED", you don't know anything useful yet. You don't know which component is broken, and you don't know why. Individual unit tests with individual failures mean that you know what has gone wrong. Individual tests also mean that you know that each component works individually before inserting it into your giant complex integrated compiler, so that if it dies you have a consistent object that you know at least performs some operations correctly, which you can inspect and almost always see in a sane internal state, even if it's not what the rest of the system expects.
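Here is a toy sketch of the difference. The two-stage "compiler" below (tokenize, evaluate, and run) is entirely hypothetical and far simpler than anything real; it only shows how per-component tests name the broken stage while the end-to-end test can only report a wrong answer.

```python
import unittest


def tokenize(source):
    """Hypothetical first stage: split '1 + 2' style input into tokens."""
    return source.split()


def evaluate(tokens):
    """Hypothetical second stage: add up the integer tokens, skipping '+'."""
    return sum(int(tok) for tok in tokens if tok != "+")


def run(source):
    """The 'integrated' system: both stages wired together."""
    return evaluate(tokenize(source))


class TokenizeTests(unittest.TestCase):
    def test_splits_on_whitespace(self):
        self.assertEqual(tokenize("1 + 2"), ["1", "+", "2"])


class EvaluateTests(unittest.TestCase):
    def test_adds_integers(self):
        self.assertEqual(evaluate(["1", "+", "2"]), 3)


class IntegrationTests(unittest.TestCase):
    def test_end_to_end(self):
        # If this were the only test, a failure would say "wrong answer"
        # without saying which stage produced it.
        self.assertEqual(run("1 + 2"), 3)
```

If someone breaks tokenize, TokenizeTests fails and EvaluateTests keeps passing, so you know where to look before you open a debugger.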

Giant integration test suites can be hugely helpful on some projects, but they are the part of a testing strategy that can turn into unnecessary gold plating unless you have a clear specification for the entire system. Unit tests are the bedrock of any automated testing strategy; you need to start there.

Unit tests seem like they take time, because you look at the time spent on a project and you see the time you spent writing the tests, and you think, "why don't I just take that part out?". Then your schedule magically gets shorter on paper and everything looks rosy.

You can do that to anything. Take your build automation out of your schedule! Take your version-control server out of your budget! Don't write a spec, just start coding! The fact is, we pay for these tools in money and time because they all pay off very quickly.

For the most part, if you don't apply them consistently and completely, their benefits can quickly evaporate while leaving their costs in place. Again, you can try this incomplete application with anything. Automate the build, but only the compile, not the installer. Use version control, but make uncommitted hand-crafted changes to your releases after exporting them. Ignore your spec, and don't update it.

So put "100% test coverage" on your personal copy of the Joel Test. You'll be glad you did.

One postscript I feel obliged to add here: like any tool, unit tests can be used well and used poorly. Just like you can write bad, hard-to-maintain code, you can write bad, hard-to-maintain tests. Doing it well and getting the maximum benefit for the minimum cost is a subtle art. Of course, getting the most out of your version control system or written spec is also a balancing act, but unit tests are a bit trickier than most of these areas, and it takes skill to write them well. It's definitely worth acquiring that skill, but the learning is not free. The one place that unit tests can take up more time than they save is when you need to learn some new subtlety of how to write them properly. If your developers are even halfway decent, though, this learning period will be shorter than you think. Training and pair-programming with advanced test-driven developers can help accelerate the process, too. So, I stand by what I said above, but there is no silver bullet.