Get it?
Quack quack quack quack
Quack quack quack quack
Weird Al Yancovic,
“I Want A New Duck”
Mypy makes most things better
Mypy is a static type checker for Python. If you’re not already familiar, you should check it out; it’s rapidly becoming a standard for Python projects. All the cool kids are doing it. With Mypy, you get all the benefits of high-level dynamic typing for rapid experimentation, and all the benefits of rigorous type checking to complement your tests and improve reliability.1 The best of both worlds!
Mypy can change how you write Python code. In most cases, this is for the
better. For example, I have opined on numerous occasions about how bad None
is. But this can significantly change
with Mypy. Now, when you return None
, you can say -> Optional[str]
and
rest assured that all your callers will be quickly, statically checked for
places where they might encounter an AttributeError
on that None
, which
makes this a more appealing option than risking raising a runtime exception
(which Mypy can’t check).
But sometimes, things can get worse
But in some cases, as you add type annotations, they can make your code more brittle, especially if you annotate with the most initially obvious types. Which is to say, you define some custom classes, and then say that the parameters to and return values from your functions and methods are simply instances of those classes you just defined.
Most Mypy tutorials give you a bunch of examples with str
, int
,
List[int]
, maybe an Optional[float]
or two, and then leave you to your own
devices when it comes to defining your own classes; yet, huge amounts of
real-world applications are custom classes.
So if you’re new to Mypy, particularly if you’re applying it to a large existing codebase, it’s quite natural to write a little code like this:
1 2 3 4 5 6 |
|
and then, later, write some code like this:
1 2 3 4 |
|
In untyped, pre-Mypy python, in addition to being a poignant message about the
futility of escalating violence, duck_war
is a very flexible function,
regardless of where it’s defined. It can take anything with a quack()
method.
But, while the strictness of the Mypy type-check here brings a level of safety
— no None
accidentally masquerading as a Duck
here — it also adds a level
of brittleness. Tests which use carefully-constructed fakes will now fail to
type check, because duck_war
technically insists upon only precisely
instances of Duck
and nothing else.
A few sub-optimal answers to this question
So when you want something else that has slightly different behavior — when you
want a new Duck
2, so to speak — what do you do?
There are a couple of anti-patterns you might arrive at to work around this as you begin your Mypy journey:
- add
# type: ignore
comments to all your tests, or remove their type signatures so they don’t get type checked. This solution throws the baby out with the bath water, as it eliminates any safety that any of these callers experience. - add calls to
cast(Duck, ...)
around any things which you know are “enough like” aDuck
for your purposes. This is much more fine-grained and targeted (and can be a great hack for working with libraries who provide type stubs which are too specific) but this also trades off a bit too much safety, since nothing at the point of thecast
verifies anything unless you build your own ad-hoc system to do so. - subclassing
Duck
. This can also be expedient, and not too bad if you have to; Mypy removes some of the sharpest edges from Python inheritance, providing some guard rails around overriding methods, but it remains a bad idea for all the usual reasons.
The good answer: typing.Protocol
Mypy has a feature,
typing.Protocol
,
that provides a straightforward way to describe any object that has a
quack()
method.
You can do this like so:
1 2 3 4 5 |
|
Now, with only a small modification to its signature — while leaving the
implementation the same — duck_war
can now support anything sufficiently
duck-like:
1 2 |
|
In addition to making it possible for other code — for example, a unit test —
to pass its own implementation of Ducky
into duck_war
without subclassing
or tricking the type checker, this change also improves the safety of
duck_war
’s implementation itself. Previously, when it took a Duck
, it
would have been equally valid for duck_war
to access the .quiet
attribute
of Duck
as it would have been to access .quack
, even though .quiet
is
ostensibly an internal implementation detail.
Now, we could add an underscore prefix to quiet
to make it “private”, but the
type checker will still happily let you access it. So a Protocol
allows you
to clearly reveal your intention about what types you expect your arguments
to be.
Why isn’t everything like this?
Unfortunately, typing.Protocol
began its life as
typing_extensions.Protocol
: a custom extended feature of the type system that
wasn’t present in Mypy initially, and isn’t in the standard library until
Python 3.8. Built-in types like Iterable
and Sequence
are type-checked as
if they’re Protocol
s by being slightly special within Mypy, but it’s not
clear to the casual user how this is happening.
However, other types, like io.TextIO
, don’t quite behave this way, and some
early-adopter projects for Mypy have types that are either too strict or too
permissive because they predated this.
So I really wanted to write this post to highlight the Protocol
style of
describing types and encourage folks to use it.
In conclusion
The concept I’ve described above is not new in the world of type theory.
The way that typing works in Mypy with most types — builtins, custom classes, and abstract base classes — is known as nominal typing. Nominal as in “based on names”; if the object you have directly references the name of the type it’s being compared to, by being an instance of it, then it matches.
In other words: if it’s named “Duck
”, it’s a duck. There are some advantages
to nominal typing3, but this brittleness is not very Pythonic!
In contrast, the type of type-checking accomplished by Protocol
is known as
structural typing.4
Whether the caller matches a Protocol
depends on the structure of your
object — in other words, what methods and attributes it has.
In even other-er words - if it .quack()
s like a duck, it is a duck.
If you’re just starting to use Mypy — particularly if you’re building a library
that exports types that users are expected to implement — consider using
Protocol
to describe those types. With Protocol
, while you get
much-improved safety from type-checking, you don’t lose the wonderful
flexibility and easy testability that duck typing has always given you in
Python.
-
And the early promise of using those type hints to making your code really fast with
mypyc
, although it’s still a bit too limited and poorly documented to start encouraging it too broadly... ↩ -
I said the title of the post, in the post! I love it when that happens. ↩
-
I have another post coming up about using
zope.interface
with Mypy, which combines the abstract typing of Protocol and the avoidance of traditional inheritance with the heightened safety that prevents accidentally matching similar signatures that are named the same but mean something else. ↩ -
The official documentation for
Protocol
in Mypy itself is even titled “Protocols and structural subtyping”. ↩