“Composition is better than inheritance.” This is a true statement. “Inheritance is bad.” Also true. I’m a well-known compositional extremist. There’s a great talk you can watch if I haven’t talked your ear off about it already.
Which is why I was extremely surprised in a recent conversation when my interlocutor said that while inheritance might be bad, composition is worse. Once I understood what they meant by “composition”, I was even more surprised to find that I agreed with this assertion.
Although inheritance is bad, it’s very important to understand why. In a high-level language like Python, with first-class runtime datatypes (i.e.: user defined classes that are objects), the computational difference between what we call “composition” and what we call “inheritance” is a matter of where we put a pointer: is it on a type or on an instance? The important distinction has to do with human factors.
First, a brief parable about real-life inheritance.
You find yourself in conversation with an indolent heiress-in-waiting. She complains of her boredom whiling away the time until the dowager countess finally leaves her her fortune.
“Inheritance is bad”, you opine. “It’s better to make your own way in life”.
“By George, you’re right!” she exclaims. You weren’t expecting such an enthusiastic reversal.
“Well,” you sputter, “glad to see you are turning over a new leaf”.
She crosses the room to open a sturdy mahogany armoire, and draws forth a belt holstering a pistol and a menacing-looking sabre.
“Auntie has only the dwindling remnants of a legacy fortune. The real money has always been with my sister’s manufacturing concern. Why passively wait for Auntie to die, when I can murder my dear sister now, and take what is rightfully mine!”
Cinching the belt around her waist, she strides from the room animated and full of purpose, no longer indolent or in-waiting, but you feel less than satisfied with your advice.
It is, after all, important to understand what the problem with inheritance is.
The primary reason inheritance is bad is confusion between namespaces.
The most important role of code organization (division of code into files, modules, packages, subroutines, data structures, etc) is division of responsibility. In other words, Conway’s Law isn’t just an unfortunate accident of budgeting, but a fundamental property of software design.
For example, if we have a function called multiply(a, b), its presence in our codebase suggests that if someone were to want to multiply two numbers together, it is multiply’s responsibility to know how to do so. If there’s a problem with multiplication, it’s the maintainers of multiply who need to go fix it.
And, with this responsibility comes authority over a specific scope within the
code. So if we were to look at an implementation of
```python
def multiply(a, b):
    product = a * b
    return product
```
The maintainers of multiply get to decide what product means in the context of their function. It’s possible, in Python, for some other function to reach into multiply with frame objects and mangle the meaning of product between its assignment and return, but it’s generally understood that it’s none of your business what product is, and if you touch it, all bets are off about the correctness of multiply. More importantly, if the maintainers of multiply wanted to bind other names, or change around existing names, like so, in a subsequent version:
```python
def multiply(a, b):
    factor1 = a
    factor2 = b
    result = a * b
    return result
```
It is the maintainer of multiply’s job, not the caller of multiply, to make sure that this change doesn’t break anything for those callers. The same programmer may, at different times, be both a caller and a maintainer of multiply. However, they have to know which hat they’re wearing at any given time, so that they can know which stuff they’re still responsible for when they hand over multiply to be maintained by a different team.
It’s important to be able to forget about the internals of the local variables in the functions you call. Otherwise, abstractions give us no power: if you have to know the internals of everything you’re using, you can never build much beyond what’s already there, because you’ll be spending all your time trying to understand all the layers below it.
Classes complicate this process of forgetting somewhat. Properties of class
instances “stick out”, and are visible to the callers. This can be powerful —
and can be a great way to represent shared data structures — but this is
exactly why we have the
._ convention in Python: if something starts with an
underscore, and it’s not in a namespace you own, you shouldn’t mess with it.
other._foo is not for you to touch, unless you’re maintaining the class that other is an instance of; self._foo is where you should put your own private state.
So if we have a class like this:
```python
class A(object):
    def __init__(self):
        self._note = "a note"
```
we all know that
A()._note is off-limits.
But then what happens here?
```python
class B(A):
    def __init__(self):
        super().__init__()
        self._note = "private state for B()"
```
B()._note is also off limits for everyone but B, except... as it turns out, B doesn’t really own the namespace of self here, so it’s clashing with what A wants _note to mean. Even if, right now, we were to change it to _note2, the maintainer of A could, in any future release of A, add a new _note2 variable which conflicts with something B is using. A’s maintainers (rightfully) think they own the namespace of self, and B’s maintainers (reasonably) think that they do. This could continue all the way down the hierarchy, at which point it would explode violently.
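To see the collision in action, here is a small runnable sketch. The describe method is invented for illustration, standing in for any behavior of A that relies on its private state:

```python
class A(object):
    def __init__(self):
        self._note = "a note"

    def describe(self):
        # A's behavior depends on A's "private" state...
        return "A remembers: " + self._note


class B(A):
    def __init__(self):
        super().__init__()
        # ...but B, believing it owns the namespace of self,
        # silently clobbers the attribute A relies on:
        self._note = "private state for B()"


print(B().describe())  # -> "A remembers: private state for B()"
```

A’s method now reports B’s “private” state, and neither maintainer did anything locally wrong.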
So that’s why inheritance is bad. It’s a bad way for two layers of a system to communicate because it leaves each layer nowhere to put its internal state that the other doesn’t need to know about. So what could be worse?
Let’s say we’ve convinced our junior programmer who wrote
A that inheritance
is a bad interface, and they should instead use the panacea that cures all
inherited ills, composition. Great! Let’s just write a
B that composes
A in a nice clean way, instead of doing any gross inheritance:
```python
class Bprime(object):
    def __init__(self, a):
        for var in dir(a):
            setattr(self, var, getattr(a, var))
```
Uh oh. Looks like composition is worse than inheritance.
Let’s enumerate some of the issues with this “solution” to the problem of inheritance:
- How do we know what attributes Bprime has?
- How do we even know what type a is?
- How is anyone ever going to grep for relevant methods in this code and have them come up in the right place?
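One concrete failure mode, as a runnable sketch: dir() hands back bound methods, so everything the copy loop grabs stays attached to the original object. Here read_note is an invented method, and the loop skips underscore-prefixed names to keep the sketch runnable (dunders like __weakref__ aren’t writable):

```python
class A(object):
    def __init__(self):
        self._note = "a note"

    def read_note(self):
        return self._note


class Bprime(object):
    def __init__(self, a):
        # copy only "public" names; the all-attributes version
        # would choke on read-only dunders like __weakref__
        for var in dir(a):
            if not var.startswith("_"):
                setattr(self, var, getattr(a, var))


a = A()
b = Bprime(a)
b._note = "Bprime's own note"
# read_note was copied as a method *bound to a*, so it ignores b entirely:
print(b.read_note())  # -> "a note", not "Bprime's own note"
```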
We briefly reclaimed Bprime’s namespace by removing the inheritance from its class statement, but what Bprime does in __init__ to replace it is much worse. At least with normal, “vertical” inheritance, IDEs and code inspection tools can have some idea where your parents are and what methods they declare. We have to look aside to know what’s there, but at least it’s clear from the code’s structure where exactly we have to look aside to.
When faced with a class like Bprime, though, what does one do? It’s just shredding apart some apparently totally unrelated object; there’s nearly no way for tooling to inspect this code to the point that it knows where self.<something> comes from in a method defined on Bprime.
The goal of replacing inheritance with composition is to make it clear and easy to understand what code owns each attribute on self. Sometimes that clarity comes at the expense of a few extra keystrokes: an __init__ that copies over a few specific attributes, or a method that does nothing but forward a message, like def something(self): return self.other.something().
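That explicit style might be sketched like this (describe is an invented method; the point is the one-line forwarder):

```python
class A(object):
    def __init__(self):
        self._note = "a note"

    def describe(self):
        return "A remembers: " + self._note


class B(object):
    def __init__(self):
        # A's private state lives on A's own instance...
        self._a = A()
        # ...so B's _note can no longer clash with it:
        self._note = "private state for B()"

    def describe(self):
        # a method that does nothing but forward a message
        return self._a.describe()


print(B().describe())  # -> "A remembers: a note"
```

A few keystrokes more than inheritance, but every attribute on self has exactly one owner, and both maintainers can change their internals freely.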
Automatic composition is just lateral inheritance. Magically auto-proxying all methods, or auto-copying all attributes, saves a few keystrokes at the time some new code is created, at the expense of hours of debugging when it is being maintained. If readability counts, we should never privilege the writer over the reader.