The always entertaining Jacob Kaplan-Moss recently posted a missive, "Snakes on the Web", which, if you haven't already read it, is a highly edifying trip through a variety of Python web technologies and history. He begins with a simple statement — "Web development sucks." — and goes on to ask a number of interesting questions about that.
What sucks about web development? How will we fix it? How has Python fixed it, and how will Python fix it in the future? While I can't say I agree with every answer, I found myself nodding quite a bit, and he has something useful to say on just about every point.
I noticed one very important question he leaves out of the mix, though, which seems more fundamental than the others: why does web development suck? In particular, why do so many people who are familiar with multiple styles of development feel like developing for the web is particularly painful by comparison, while so much of software development moves to the web? And, why does web development in Python suck, despite the fact that otherwise, Python mostly rocks?
Programming for the web lacks an important component, one that Fred Brooks identified as crucial for all software as early as 1975: conceptual integrity. Put more simply, it is difficult to make sense of "web" programs. They're difficult to read, difficult to write and difficult to modify, because none of the pieces fits together in a way which can be understood using a simple conceptual model.
Rather than approach this head on, from the perspective of a working web programmer, let's start earlier than that. Let's say someone approached you with a simple programming task: write an accounting system that includes point-of-sale software to run a small business. Now, considering some imagined requirements for such a system, how many languages would you recommend that it be written in?
Most working programmers would usually say "one" without a second thought. A too-clever-by-half language nerd might instead answer "two, a general-purpose programming language for most things and a domain specific language to describe accounting rules and promotions for the business". Why this number? Simply put, there's no reason to use more, and introducing additional languages means mastering additional skills and becoming familiar with additional quirks, all of which add to initial development time and maintenance overhead. Modern programming languages are powerful enough to perform lots of different types of tasks, and are portable across both different computer architectures and different operating systems, so other concerns rarely intrude.
But, in the practical, working programmer's world, what's the web's answer to this question? Six. You have to learn six languages to work on the web:
- HTML. This isn't really a programming language, but in web development you do end up reading and writing quite a lot of it.
- CSS. In order to apply visual styles to your HTML so that it actually looks nice in a browser, you need to understand a different language (with a different conceptual model for how documents are laid out than the HTML itself).
- JavaScript. In today's competitive AJAX-y world, you need to be able to react instantly in the browser, writing a real client application.
- SQL, so that you can store your data in a database.
- Your "middle-tier" language: in my case and Jacob's, that would be Python. This is where people tend to spend the bulk of their programming time, but not all of it.
- A templating language; in Jacob's case, the Django template language.
Of course, Jacob lists a pile of related technologies too, and rightly points out that it's a lot to keep in your head. But he is talking about a problem of needing extensive technical knowledge, something which all programmers working in a particular technology ecosystem learn sooner or later. I'm talking about a different, more fundamental problem: in addition to the surface problem of being complex and often broken, these technologies are fundamentally conceptually incompatible, which leads to a whole host of other problems. Furthermore, the only component which is really complete is the "middle-tier" language, although bespoke web-only languages like PHP and Arc manage to screw that up too.
Here are a few simple example problems that are made depressingly complex by the impedance mismatch between two of these components, but which are incredibly easy using a different paradigm.
How do you place two boxes with text in them side-by-side? Using a GUI toolkit, like my favorite PyGTK, it often goes something like this:
from gtk import HBox, Label  # PyGTK, the Python bindings for GTK+ 2
left = Label("some text")
right = Label("some other text")
box = HBox()
box.add(left)
box.add(right)
The conceptual model here is simple: the HBox() is a container, the "left" and "right" things are widgets, which are in that container. You can add them, remove them, swap them, or handle events on them easily. You can discover how these things are done by reading the API references for the appropriate classes of object.
However, there's no right answer to this question on the web. You can use a <table> tag, and then some <tr>s and <td>s to make a single-row table with two cells, but that has a variety of limitations; plus, it's considered somehow gauche by most web designers to use tables for layout these days. Or, you could cook up a collection of CSS classes. So there's the first impedance mismatch: do you do layout in HTML, or CSS? Of course most design gurus would like to tell you that "always and only CSS" is the right answer here, but more practically-minded web developers who actually write code will often prefer HTML, partially because it's simpler but partially because CSS's feature set is incomplete and there are some things you can still only do with HTML, or only do portably with HTML.
Plus, how do you discover how these layouts work? There are a variety of reference materials, but no canonical guide that says "this is exactly what a <table> tag should do, and how it should look". There are different forms of documentation for both.
If you have a variable number of elements, you quickly run into another problem. Should this be the responsibility of the HTML, the CSS, or some code (in the templating layer) that emits some HTML or some CSS? Should the code in the templating layer be written as an invocation of your middle-tier language, or should the template language itself have some code in it? Reasonable people of good conscience disagree with each other in every possible way over every one of these details.
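To make just one of those options concrete, here is a minimal sketch in which middle-tier Python emits the markup for however many boxes there happen to be. It is written for Python 3 using only the standard library; the `render_row` name and the choice of a single-row table are illustrative assumptions, not a recommendation for how layout ought to be done.

```python
from html import escape


def render_row(cells):
    """Return a one-row HTML table with one <td> per string in `cells`.

    Hypothetical helper: the layout decision (a table) lives in Python here,
    rather than in a template or in CSS.
    """
    # Escape each cell so that text containing <, > or & stays text.
    tds = "".join("<td>{}</td>".format(escape(text)) for text in cells)
    return "<table><tr>{}</tr></table>".format(tds)


print(render_row(["some text", "some other text"]))
```

Even this tiny function already has to know HTML's escaping rules, which is one more piece of another language's model leaking into the Python code.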
This is all part of a very complex problem, though. For all of these crazy hoops you have to jump through, HTML and CSS do provide a layout model that allows you to do some very pretty and very flexible things, especially if you have large amounts of text. It is perhaps not as good as even the most basic pre-press layout engine, but it is still better than what most GUI toolkits offer out of the box. So there is an argument that this complexity is a trade-off, where you get functionality in exchange for the confusion. So let's look at a much simpler problem.
Let's say that, in our hypothetical accounting application, you have a list of items in a retail transaction, and you want to process the list and produce a sum. Where is the right place to do that? It turns out you have to write the code to do that three times.
First, you have to write it in JavaScript. After all, the numbers are all already in the client / browser, and you want to update the page instantaneously, not wait for some potentially heavily-loaded server to get back to you each time the user presses a keystroke. And why not? You've got plenty of processing power available on the client.
Then you have to write it in Python. That's where the real brain of the application lives, after all, and if you're going to do something like send a job to a receipt printer or email a customer or sales representative some information in response to a sale, the number has to be located in the middle tier.
Finally you have to do it in SQL. Since this is a traditional web application, your Python code is going to be spread out among multiple servers, and the database is the ultimate arbiter of recorded truth. So you need to have transactions around the appropriate points and execute any interesting aggregate functions (such as SUM()) in the database tier.
So, you've got three times as much work to do in your fancy new web application as you would in a simple record-based application with a GUI. A worthy price to pay to run in the brave new world of tomorrow rather than on some crusty old client/server system, right?
Well, as it turns out, the problem is somewhat deeper than that. It turns out that JavaScript, Python, and SQL actually have slightly different numerical models (in fact Python implements at least 4 itself: fixed-point decimal, floating-point decimal, IEEE 754 floating-point binary, and integer math; you should really only use decimal for money, but this isn't available in JavaScript and its availability in SQL is spotty). After applying some discounts, your register might read $19.74 but your receipt will read $19.75; and the reports sent to the accounting department will read $19.74898989898989.
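The difference between numerical models can be seen without leaving Python. The following is a minimal sketch for Python 3 using only the standard library; the amounts are made up for illustration and do not reproduce the register and receipt figures above, and the helper names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def float_total(prices):
    """Add prices as IEEE 754 binary floats, the model JavaScript numbers use."""
    total = 0.0
    for price in prices:
        total += price
    return total


def decimal_total(prices):
    """Add prices given as strings, using exact decimal arithmetic."""
    return sum((Decimal(price) for price in prices), Decimal("0.00"))


def round_to_cents(amount):
    """Round a Decimal to whole cents, with halves rounded up."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


print(float_total([0.10, 0.20]))            # 0.30000000000000004
print(decimal_total(["0.10", "0.20"]))      # 0.30

# 2.675 has no exact binary representation, so the float rounds down.
print(round(2.675, 2))                      # 2.67
print(round_to_cents(Decimal("2.675")))     # 2.68
```

The same inputs produce answers a cent apart depending on which model does the arithmetic, and that is before a second or third language gets involved.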
Even if you know a lot about math on computers and the limitations of each of these runtimes, and you happen to get all of that just right, you still have another problem to contend with: what happens when somebody else needs to change the logic in question? How do you test that the Python, the JavaScript, and the SQL are all still in sync? It's possible, but you have to go above and beyond the usual discipline of test-driven development, because you need to have integration tests that verify that different, almost unrelated code, in different languages, in different environments is all executing properly in lock-step. Just getting the code from SQL and JavaScript to run in your Python test suite at all is a major challenge; in a language like PHP it's borderline impossible.
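For two of the three tiers, such a lock-step check might look something like the sketch below. It assumes an in-memory SQLite database standing in for the real one, prices stored as integer cents to sidestep the rounding question, and a hypothetical `line_item` table; none of these come from a real application. The JavaScript copy of the logic is not covered at all, which is rather the point.

```python
import sqlite3


def python_total_cents(items):
    """Middle-tier version: items are (name, quantity, unit_cents) tuples."""
    return sum(quantity * unit_cents for _name, quantity, unit_cents in items)


def sql_total_cents(items):
    """Database-tier version of the same sum, run in a throwaway SQLite DB."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE line_item "
            "(name TEXT, quantity INTEGER, unit_cents INTEGER)"
        )
        conn.executemany("INSERT INTO line_item VALUES (?, ?, ?)", items)
        # COALESCE makes an empty transaction total 0 instead of NULL.
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(quantity * unit_cents), 0) FROM line_item"
        ).fetchone()
        return total
    finally:
        conn.close()


def totals_agree(items):
    """True when both implementations produce the same total."""
    return python_total_cents(items) == sql_total_cents(items)


SAMPLE = [("widget", 2, 499), ("gadget", 1, 976)]
print(python_total_cents(SAMPLE), sql_total_cents(SAMPLE))  # 1974 1974
```

Note that the two functions share no code: the only thing keeping them in agreement is a test like `totals_agree`, which someone has to remember to write and to extend whenever the logic changes.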
This is all even worse when it comes to security, because every part of the application exposes an attack surface, and because you can't use the same language or the same libraries to do any of the work, they all expose a different attack surface.
In his talk, Jacob notes that "frameworks suck at inter-op", but the problem is much deeper than that. As I've shown here, a single page from a single application written using a single framework, which has only one task to do, can't even inter-operate with itself cleanly, at least not at the level that Jacob wants — or that I want. He says, "gateways aren't APIs", and he's right: the correct way to inter-operate is through well-defined APIs. APIs can be discovered through a single, consistent process. Their implementations can be debugged using a single set of development tools.
CSS isn't an API. HTML isn't an API. Strings containing a hodgepodge of SQL and data aren't an API either.
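The difference between a string and an API can be sketched with Python's standard `sqlite3` module. The `item` table and the helper functions below are hypothetical, chosen only to show what happens when data is pasted into SQL text versus passed alongside it as a parameter.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (name TEXT)")
    conn.executemany("INSERT INTO item VALUES (?)", [("widget",), ("gadget",)])
    return conn


def find_items_unsafe(conn, name):
    """Hodgepodge version: the data becomes part of the SQL program text."""
    query = "SELECT name FROM item WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_items_param(conn, name):
    """Parameterized version: the data travels separately from the SQL."""
    return conn.execute(
        "SELECT name FROM item WHERE name = ?", (name,)
    ).fetchall()


conn = make_demo_db()
hostile = "x' OR '1'='1"
print(len(find_items_unsafe(conn, hostile)))  # 2: every row matches
print(find_items_param(conn, hostile))        # []: no item has that name
```

Parameter binding is a small step toward an API, since the database driver rather than string concatenation decides how the value is interpreted, but the query itself is still a string in another language.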
It's not all doom and gloom, but my ideas for a future solution to this problem will have to wait for another post.

23 comments:
There is no solution, because abstractions are leaky :-) http://www.joelonsoftware.com/articles/LeakyAbstractions.html
Glyph, I couldn't agree more. Very well said.
This is a perfect complement to Jacob's essay.
Technologies like GWT and pyjamas seemed promising, having everything written in one language, but have the downsides that Jacob talks about. I have a feeling, though, that the JS only frameworks will be big players very soon, as it easily serves as a lingua franca on the web.
Very succinctly put, Glyph, and a good part of the reason why "Python Web Programming" was almost 700 pages long ...
This is pretty much what I've been saying for years.
Any solution that further abstracts away the implementation details of building a web application is just putting lipstick on a pig (e.g., GWT, pyjamas, etc.).
A rose is still just a rose by any other name.
The real problem is that web applications today are simply an evolution of successive hacks. Everyone has a browser and HTML has form elements to which servers can respond... everything else we've developed beyond that is just more and more complicated features.
Developing desktop applications by comparison, for me, is a lot more entertaining and fun. The interface code is a lot more simple, there are fewer (if any) serialization steps, and they are easier to test for correctness. I also don't need to have a boatload of operating systems loaded with a truckload of different browsers to test my interface alone. I just write to the API of a cross-platform toolkit of my choice and rest easy that it will display and function the same way across all the platforms that toolkit supports. No fuss, no muss.
Funny thing is that all these yunguns miss today is that network-delivered applications are not a new concept. Back when it was a new concept, protocols such as X11 were written to solve the problem of delivering GUI application interfaces across the network to terminals transparently. The idea was that an end user would open their application as if it were installed on their computer and it would just work. Unbeknownst to them, it was actually being served to them from some mainframe across the network. This was way before the kids today started hacking together web forms and calling it a revolution.
The stumbling block X11 and other such protocols had of course was interoperability. The primary desktop OS vendors were (and still are) walled gardens and refused to adopt a single protocol. They each invented their own and kept their interface APIs proprietary as well. Third parties were eventually able to find ways to work around the walls of the individual gardens, but by then it was largely too late (and not quite as transparent anyway, requiring a separate window in which your remote applications ran on a virtual desktop).
I dream of an Internet of some alternate present where some descendent of X11 had finally succeeded in convincing OS vendors to adopt it. A world where Facebook is just another application in my applications folder and looks like native desktop application. Where I can upload photos and store my data in that application with very little fuss.
As a programmer I dream of this world as well. Where I can write my application interface code, business logic, and storage models in a single programming language. Where the interface code is a toolkit that guarantees support across various OS environments and has a single API. Where comprehensive bug reports are emailed to me in the background instead of just being a URL with a slightly incomprehensible rant following it. It's a glorious world where programming is actually fun and my application code has more lines in it than my test code.
But sigh... stuck with what everyone found to be good enough. Painful, obtuse, and often frustrating to work with... but good enough.
FLEX3 will make the job much better instead of HTML and CSS.
Thank you for the interesting read, and also the interesting comment from j_king.
I disagree with the end of the article, though. It IS all doom and gloom. As a scientist/mathematician who recently got interested in web development, I actually have difficulty believing the conceptual mediocrity that people have to put up with.
Although you concentrated on Python, could you say a few words about ruby on rails? I ask about it because it seems to hang together conceptually and benefit from this fact. But I don't know anything about it, really.
Awesome post Glyph. I think you hit the nail on the head.
Exactly what I have been saying for some time now and +1 to j_king's comment. Having done desktop apps for users that are not going to mess with the client code and then having to do rich web apps where as much functionality should run on the client but none of the output can be trusted I am in a world of pain. The lack of consistency among browsers and the resistance of companies to upgrade to newer versions doesn't help either. But mostly the fact that I regularly have to maintain 2-3 sources of code in different languages to do the exact same thing is the most blatant violation of DRY.
There are web frameworks that use JavaScript as the server language. It’s not that bad a language for doing general programming work so this is not as silly as some people seem to think.
There is also at least one database API—CouchDB—expressed entirely in terms of JSON and HTTP.
On Microsoft Windows, you can do system administration using something called Windows Scripting Host. You write admin scripts in either VBScript or JScript, Microsoft’s dialect of JavaScript. I don’t know if there is a corresponding JavaScript-based system for Unix-like operating systems, but you never know.
So one solution to the too-many-languages problem might be to switch to using JavaScript as much as possible. Just a thought.
Contradicting myself somewhat, my preferred set-up uses Python for the application logic and most of the database code—via Django’s model API—and also for system admin tasks, since Python programs can do all the things shell scripts and batch files do and are much nicer to write than Windows batch files.
@Damian,
I am considering writing a whole separate article on the JavaScript question :).
However, simply put: as far as I know, server-side JS isn't really the same language as client-side JS yet. There's no general mechanism for loading a library, or declaring a new type, for example. Has anyone ever gotten, e.g., Dojo working server-side? If so, then maybe there's some hope for this approach.
I think it's always (N+1), where N is some set of general-purpose languages, and the additional one is an application-specific scripting language---the grammar can probably be consistent across most applications, if you want, but after a lot of development I think that a lot of work is saved if you think in terms of building puppets (or 'bots) that do complicated sets of things, and then finding short ways to tell them to do them. (E.g., 'Seize him!!'---very complicated set of motions and feedback, simple command.)
I think the problem is less the multiplicity of programming languages, than our insistence that we should always be separating our languages in different places.
This goes against the basic tenets of cohesion and coupling. We cluster unrelated activities together because they happen to have the same syntactic sugar, while separating tightly-coupled activities because half of them happen on the client and the other on the server. Why the hell should this implementation detail have to be reflected in our architecture?
What I'd like, controversially, is to be able to mix-and-match the languages within the same source file, grouping together the python, javascript, html and sql that actually has to work together in one place. I have no trouble dropping into regular expressions or similar DSLs from inside my main code, why should dropping into a layout or query language be different?
Yet another reason to learn anything web dev-related...job security.
Very nice article, and a good argument for why HTML 5 is unlikely to be a Flash-killer. Flex/AS3 provide a dev environment and, potentially, a suite of testing tools that are a lot closer to what you'd expect in desktop app development, and a single API to get it done. You still have to talk to a server and database, of course, but Flex goes a long way toward streamlining the whole process. HTML 5 will still be HTML.
one of the reasons i love programming on the web is because of the number of languages i get to experiment with!