While Mr. Resig isn't adamantly against "language abstractions" - he notes many of their benefits - his counterpoint is summed up in this paragraph:
In the case of these language abstractions you are gaining none of the benefit of learning the JavaScript language. When a leak in the abstraction occurs (and it will occur - just as it's bound to occur in any abstraction) what resources do you have, as a developer, to correct the problem? If you've learned nothing about JavaScript then you stand no chance in trying to repair, or work around, the issue.
This is becoming a popular fallacy in programming language circles: treating Joel Spolsky's "Law of Leaky Abstractions" as if it were an actual law.
Let's examine the metaphor of the "leak". In plumbing, a leak is a hole in a pipe where water gets out. Joel has noticed that every pipe has a hole in it, and therefore all pipes are leaky.
But that's not quite accurate. There's another hole in pipes where water gets out: it's called the "faucet", and without that part, the rest of the pipe is pretty useless. To say that a pipe whose faucet is turned on is "leaky" is somewhat misleading, just as it's misleading to say that an abstraction that propagates errors from its lower levels is leaky. Joel's entire original essay is based on a subtle (and, I suspect, intentional) misunderstanding of TCP: the error conditions that result from failures in the lower-level, unreliable packet delivery mechanism are not leaks in the abstraction; they are very carefully specified and thoroughly documented. They are part of the abstraction. The abstraction of TCP does not try to pretend that connections are never broken; it just provides a unified idea of a "broken connection" that is clearly specified, so you don't need to understand the five million ways that packet delivery can go wrong.
Put more simply: there are abstractions which do not leak. The example that Joel provides is one of them: TCP is a comprehensive abstraction.
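To make the "unified idea of a broken connection" concrete, here is a minimal sketch in Python. The function name `observe_peer_loss` is made up for illustration, and the `SO_LINGER` trick for forcing an abortive close assumes a POSIX-style socket stack. Whatever happens underneath, the caller only ever sees one of two documented outcomes: an orderly end-of-stream or a `ConnectionError`.

```python
import socket
import struct


def observe_peer_loss(abort: bool) -> str:
    """Connect to a local listener, drop the peer, and report what recv() sees."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as client:
            peer, _ = listener.accept()
            if abort:
                # Assumption: POSIX-style linger struct (two ints). A zero
                # linger timeout makes close() abort the connection with a
                # reset instead of closing it in an orderly way.
                peer.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                                struct.pack("ii", 1, 0))
            peer.close()
            client.settimeout(5)  # fail loudly rather than hang forever
            try:
                data = client.recv(1024)
            except ConnectionError:
                # Covers reset, aborted, and broken-pipe conditions alike.
                return "broken"
            # An empty read is the documented signal for an orderly close.
            return "closed" if data == b"" else "data"
```

The caller never has to know whether the failure started as a pulled cable, a crashed process, or a dropped packet; it only handles the outcomes the abstraction specifies.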
Then there are abstractions which really do leak. Every object-relational mapper that provides a facility where you need to directly execute SQL, for example, is leaking the SQL through the abstraction. Every web templating framework where you can directly generate strings is leaky: the browser speaks DOM, and if you're generating strings, then bytes are leaking through the abstraction.
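A small sketch of the templating case, using Python's standard library as a stand-in for a web framework (the two `render_*` function names are hypothetical). Building markup by concatenating strings lets raw bytes through, while building a tree and serializing it keeps text as text.

```python
import xml.etree.ElementTree as ET


def render_by_strings(name: str) -> str:
    # String concatenation: whatever is in `name` becomes markup.
    return "<p>Hello, " + name + "</p>"


def render_by_tree(name: str) -> str:
    # Tree construction: `name` is stored as text and escaped on output.
    paragraph = ET.Element("p")
    paragraph.text = "Hello, " + name
    return ET.tostring(paragraph, encoding="unicode")


hostile = "<b>world</b>"
leaky = render_by_strings(hostile)   # contains a real <b> element
sealed = render_by_tree(hostile)     # contains &lt;b&gt; instead
```

In the string version the caller has to remember the escaping rules of the layer below; in the tree version there is no way to express the mistake.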
But "language abstractions" (or, as those of us who are not hip to the new web lingo call them, "compilers") are generally accepted to be the kind of thing that works well enough that you can trust it. I don't know the specifics of the current crop of JavaScript-targeting compilers. Maybe GWT and Pyjamas have issues that would require some knowledge of JavaScript to use them correctly. A well-written compiler, one that really lived up to the promise of treating the browser as a deployment target, wouldn't have those kinds of issues though. Let's turn the wayback machine to 1969 and cast Mr. Resig's argument against the contemporary contender for moving up the abstraction stack:
In the case of UNIX, you are gaining none of the benefit of learning the PDP-11 instruction set. When a bug in the C compiler occurs (and it will occur - just as it's bound to occur in any compiler) what resources do you have, as a developer, to correct the problem? If you've learned nothing about PDP-11 assembler then you stand no chance in trying to repair, or work around, the issue.
So, for those of you who work on UNIX-like operating systems using that fancy "C" machine-code abstraction: how much PDP-11 assembler have you written recently?

8 comments:
Heh, more or less the same analogy had occurred to me. Sadly, a javascript engine and a web browser present a MUCH more complicated interface than "x86 machine code". It's not very often different x86 processors react to the same instruction in very different ways.
Perhaps this is a problem that time and engineering will solve, but if the application development platform of the future has a web browser and js mushed so far into the environment you think of it how you think of the processor's MMU now, I may take up farming instead.
Michael Hudson said...
It's not very often different x86 processors react to the same instruction in very different ways.
But it does happen. Remember the floating point bug a few years ago?
--
About the original post:
I did some PDP-11 assembly language programming a long time ago in a galaxy far far away. I've also had to deal with some bugs in C compilers. These bugs were not related to the PDP-11 instruction set but rather to the interpretation or misunderstanding of the various C standards. For example, did you know that the ?: ternary operator can be on the left hand side of a C = expression? Some compiler writers in the 1980s didn't - it's a fairly subtle implication of C syntax. There are other things compiler writers either missed or chose to ignore.
The fact that the only commonly known example of this kind of thing is so famous helps make my point, I think.
Hi Glyph,
I disagree -- I have a popular talk (maybe SOME video of it has finally made it to the net? I know of one in Italian, but I mostly give those talks in English...!-) about "Zen and the art of Abstraction Maintenance" AKA "Abstraction as Leverage", where I give TCP as an extreme example... a superbly designed abstraction stack that nevertheless DOES leak! It leaks TRUST -- above, below, and to the sides. BGP fakes, ARP cache poisoning, DNS poisoning, etc, etc...
So, I DO believe that EVERY meaningful abstraction DOES leak (AND sometimes it SHOULD leak, in a controlled, architected way -- but, hey, can't summarize an hour-plus talk into a comment, pls web search for it and you should find at least the slide's PDF if not the video in English yet;-)).
The PDP-11 assembly language is incredibly easy to learn. It's orthogonal, and has only a few instruction formats. The most complicated format is:
opcode (4 bits)
source addressing mode (3 bits)
source register (3 bits)
destination addressing mode (3 bits)
destination register (3 bits)
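As a rough illustration of how regular that layout is, here is a sketch in Python that splits a 16-bit word into the five fields listed above. The names `DoubleOperand` and `decode_double_operand` are invented for this example; the two sample words are `MOV R0, R1` (octal 010001) and `MOV (R2)+, R3` (octal 012203).

```python
from typing import NamedTuple


class DoubleOperand(NamedTuple):
    opcode: int    # bits 15-12
    src_mode: int  # bits 11-9
    src_reg: int   # bits 8-6
    dst_mode: int  # bits 5-3
    dst_reg: int   # bits 2-0


def decode_double_operand(word: int) -> DoubleOperand:
    """Split a 16-bit double-operand instruction word into its fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit instruction word")
    return DoubleOperand(
        opcode=(word >> 12) & 0xF,
        src_mode=(word >> 9) & 0x7,
        src_reg=(word >> 6) & 0x7,
        dst_mode=(word >> 3) & 0x7,
        dst_reg=word & 0x7,
    )


mov_r0_r1 = decode_double_operand(0o010001)    # MOV R0, R1
mov_autoinc = decode_double_operand(0o012203)  # MOV (R2)+, R3
```

Because every field after the opcode is three bits wide, the word reads off directly in octal, which is why PDP-11 listings are traditionally printed that way.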
To be fair, every web browser is its own platform. It's very akin to POSIX and *nix programming. My co-workers often talk about the programs they wrote on Solaris using bi-directional pipes that refused to compile on Linux and had strange bugs. Really, it's just a part of programming currently; if there were only one platform, these problems would largely go away.
I kindly disagree with your interpretation of Spolsky's "Law of Leaky Abstractions."
The point Spolsky makes isn't so much that specific details can't be accounted for in the comprehensive documentation of the underlying mechanism for any sufficiently complex abstraction (thus freeing it from leaks); it may well be the case that any and all foreseeable contingencies that map onto the underlying mechanisms are incorporated into the abstraction, so as to create a formally sound abstraction layer ideally independent from its underlying substrate. For example, designating opaque interfaces is one way to map potential exceptions to potentially exclude unforeseen leaks.
However, that does not suffice to ensure that the abstraction is either conceptually or causally closed (and thus independent of its underlying implementation). Perhaps knowledge of PDP assembler isn't necessary, but knowledge of the von Neumann architecture upon which the PDP is based *does* leak, particularly when assessing latencies in processing or other contingencies not fully anticipated in the abstraction spec. A script-kiddie who hacks together a script may not understand why his script runs in quadratic versus log time if he cannot peek under the hood to see that one version uses iteration whereas another uses filtering over a list. It is conceivable that this is addressed in the abstraction spec, but within any generative structure, it is not possible to map out all possible contingencies.
To quote the ever-loquacious Rumsfeld, it's not the known unknowns, it's the unknown unknowns... that force the leaks.
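The performance point above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the commenter's exact scenario: it contrasts a linear scan with a binary search rather than quadratic versus log time, and the function names and step counters are made up. Both functions answer the same question with the same result, and only the step count reveals what is under the hood.

```python
from typing import Sequence, Tuple


def find_linear(items: Sequence[int], target: int) -> Tuple[int, int]:
    """Scan left to right; returns (index or -1, elements examined)."""
    steps = 0
    for index, value in enumerate(items):
        steps += 1
        if value == target:
            return index, steps
    return -1, steps


def find_binary(sorted_items: Sequence[int], target: int) -> Tuple[int, int]:
    """Halve the search range each step; requires sorted input."""
    lo, hi, steps = 0, len(sorted_items), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return (lo if found else -1), steps


data = list(range(1024))
linear_index, linear_steps = find_linear(data, 1023)
binary_index, binary_steps = find_binary(data, 1023)
```

The interface ("give me the index") is identical in both cases; the cost model is the part a caller cannot see without looking beneath it.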
@Alex, gn:
It seems we are in agreement, in a roundabout way.
Interpreted in a certain light, the "law of leaky abstractions" does make sense. You can't abstract away reality: TCP connections break, and that's gotta either be part of the abstraction or unaccounted for. Either way you could potentially class it as a "leak". (Alex's examples about leaking trust are much better examples of what I would consider an actual "leak"; DNS cache poisoning is not accounted for in the specification of TCP, either the explicit specified abstraction or the implicit mental model that a network programmer will develop after working with it for a while.)
The problem is with the interpretation I attributed to Mr. Resig: the idea that no abstraction can be whole enough for a working programmer to use it and trust it, and that you must always have an intimate understanding of every level of abstraction all the way down to microcode. That interpretation is inaccurate. More importantly - and this is part of my point - things like TCP and the C compiler are abstractions over more than one thing.
TCP packets can be delivered over ethernet, over serial lines, various types of service you get from the phone company, carrier pigeons, optical cables, and so on. There's no way to be an expert in all of these things, and you don't need to be in order to use a TCP socket.
Similarly, becoming an expert at C is more useful than becoming an expert in PDP-11 assembler, because these days it's actually x86 code that gets generated from C compilers and therefore has a whole different set of problems.
Somebody still needs to be an expert on those low-level details, and it's useful to understand a little bit about them to better understand the abstraction in question. However, if the abstraction is any good, you can use those low-level details for an entire career without actually learning enough to maintain them yourself. And that's very important. If we all needed to understand how the implementation of everything worked all the time, software development would be sinking into a peat bog of abstractions rather than building more complex and more powerful tools on top of what we've already created. Granted, a lot of abstractions could definitely be more reliable, more solid, but that doesn't mean they're unusable.
To put it in language more similar to that which you're using: all abstractions "leak", but some are still watertight enough to float while they're doing so. It's important to distinguish those which are seaworthy from those which aren't.