I Want A New Duck

typing.Protocol and the future of duck typing

Get it?
Quack quack quack quack
Quack quack quack quack

Weird Al Yancovic,
I Want A New Duck

Mypy makes most things better

Mypy is a static type checker for Python. If you’re not already familiar, you should check it out; it’s rapidly becoming a standard for Python projects. All the cool kids are doing it. With Mypy, you get all the benefits of high-level dynamic typing for rapid experimentation, and all the benefits of rigorous type checking to complement your tests and improve reliability.1 The best of both worlds!

Mypy can change how you write Python code. In most cases, this is for the better. For example, I have opined on numerous occasions about how bad None is. But this can significantly change with Mypy. Now, when you return None, you can say -> Optional[str] and rest assured that all your callers will be quickly, statically checked for places where they might encounter an AttributeError on that None, which makes this a more appealing option than risking raising a runtime exception (which Mypy can’t check).

But sometimes, things can get worse

But in some cases, as you add type annotations, they can make your code more brittle, especially if you annotate with the most initially obvious types. Which is to say, you define some custom classes, and then say that the parameters to and return values from your functions and methods are simply instances of those classes you just defined.

Most Mypy tutorials give you a bunch of examples with str, int, List[int], maybe an Optional[float] or two, and then leave you to your own devices when it comes to defining your own classes; yet, huge amounts of real-world applications are custom classes.

So if you’re new to Mypy, particularly if you’re applying it to a large existing codebase, it’s quite natural to write a little code like this:

1
2
3
4
5
6
from dataclasses import dataclass
@dataclass
class Duck:
    quiet: bool = False
    def quack(self) -> None:
        print("Quack." if self.quiet else "QUACK!")

and then, later, write some code like this:

1
2
3
4
def duck_war(aggressor: Duck, defender: Duck) -> None:
    aggressor.quack()
    defender.quack()
    print("The only winning move is not to play.")

In untyped, pre-Mypy python, in addition to being a poignant message about the futility of escalating violence, duck_war is a very flexible function, regardless of where it’s defined. It can take anything with a quack() method.

But, while the strictness of the Mypy type-check here brings a level of safety — no None accidentally masquerading as a Duck here — it also adds a level of brittleness. Tests which use carefully-constructed fakes will now fail to type check, because duck_war technically insists upon only precisely instances of Duck and nothing else.

A few sub-optimal answers to this question

So when you want something else that has slightly different behavior — when you want a new Duck2, so to speak — what do you do?

There are a couple of anti-patterns you might arrive at to work around this as you begin your Mypy journey:

  1. add # type: ignore comments to all your tests, or remove their type signatures so they don’t get type checked. This solution throws the baby out with the bath water, as it eliminates any safety that any of these callers experience.
  2. add calls to cast(Duck, ...) around any things which you know are “enough like” a Duck for your purposes. This is much more fine-grained and targeted (and can be a great hack for working with libraries who provide type stubs which are too specific) but this also trades off a bit too much safety, since nothing at the point of the cast verifies anything unless you build your own ad-hoc system to do so.
  3. subclassing Duck. This can also be expedient, and not too bad if you have to; Mypy removes some of the sharpest edges from Python inheritance, providing some guard rails around overriding methods, but it remains a bad idea for all the usual reasons.

The good answer: typing.Protocol

Mypy has a feature, typing.Protocol , that provides a straightforward way to describe any object that has a quack() method.

You can do this like so:

1
2
3
4
5
from typing import Protocol

class Ducky(Protocol):
    def quack(self) -> None:
        "Quack."

Now, with only a small modification to its signature — while leaving the implementation the same — duck_war can now support anything sufficiently duck-like:

1
2
def duck_war(aggressor: Ducky, defender: Ducky) -> None:
    ...

In addition to making it possible for other code — for example, a unit test — to pass its own implementation of Ducky into duck_war without subclassing or tricking the type checker, this change also improves the safety of duck_war’s implementation itself. Previously, when it took a Duck, it would have been equally valid for duck_war to access the .quiet attribute of Duck as it would have been to access .quack, even though .quiet is ostensibly an internal implementation detail.

Now, we could add an underscore prefix to quiet to make it “private”, but the type checker will still happily let you access it. So a Protocol allows you to clearly reveal your intention about what types you expect your arguments to be.

Why isn’t everything like this?

Unfortunately, typing.Protocol began its life as typing_extensions.Protocol: a custom extended feature of the type system that wasn’t present in Mypy initially, and isn’t in the standard library until Python 3.8. Built-in types like Iterable and Sequence are type-checked as if they’re Protocols by being slightly special within Mypy, but it’s not clear to the casual user how this is happening.

However, other types, like io.TextIO, don’t quite behave this way, and some early-adopter projects for Mypy have types that are either too strict or too permissive because they predated this.

So I really wanted to write this post to highlight the Protocol style of describing types and encourage folks to use it.

In conclusion

The concept I’ve described above is not new in the world of type theory.

The way that typing works in Mypy with most types — builtins, custom classes, and abstract base classes — is known as nominal typing. Nominal as in “based on names”; if the object you have directly references the name of the type it’s being compared to, by being an instance of it, then it matches.

In other words: if it’s named “Duck”, it’s a duck. There are some advantages to nominal typing3, but this brittleness is not very Pythonic!

In contrast, the type of type-checking accomplished by Protocol is known as structural typing.4 Whether the caller matches a Protocol depends on the structure of your object — in other words, what methods and attributes it has.

In even other-er words - if it .quack()s like a duck, it is a duck.

If you’re just starting to use Mypy — particularly if you’re building a library that exports types that users are expected to implement — consider using Protocol to describe those types. With Protocol, while you get much-improved safety from type-checking, you don’t lose the wonderful flexibility and easy testability that duck typing has always given you in Python.


  1. And the early promise of using those type hints to making your code really fast with mypyc, although it’s still a bit too limited and poorly documented to start encouraging it too broadly... 

  2. I said the title of the post, in the post! I love it when that happens. 

  3. I have another post coming up about using zope.interface with Mypy, which combines the abstract typing of Protocol and the avoidance of traditional inheritance with the heightened safety that prevents accidentally matching similar signatures that are named the same but mean something else. 

  4. The official documentation for Protocol in Mypy itself is even titled “Protocols and structural subtyping”. 

Zen Guardian

Let’s rewrite a fun toy Python program - in Python!

There should be one — and preferably only one — obvious way to do it.

Tim Peters, “The Zen of Python”


Moshe wrote a blog post a couple of days ago which neatly constructs a wonderful little coding example from a scene in a movie. And, as we know from the Zen of Python quote, there should only be one obvious way to do something in Python. So my initial reaction to his post was of course to do it differently — to replace an __init__ method with the new @dataclasses.dataclass decorator.

But as I thought about the code example more, I realized there are a number of things beyond just dataclasses that make the difference between “toy”, example-quality Python, and what you’d do in a modern, professional, production codebase today.

So let’s do everything the second, not-obvious way!


There’s more than one way to do it

Larry Wall, “The Other Zen of Python”


Getting started: the __future__ is now

We will want to use type annotations. But, the Guard and his friend are very self-referential, and will have lots of annotations that reference things that come later in the file. So we’ll want to take advantage of a future feature of Python, which is to say, Postponed Evaluation of Annotations. In addition to the benefit of slightly improving our import time, it’ll let us use the nice type annotation syntax without any ugly quoting, even when we need to make forward references.

So, to begin:

1
from __future__ import annotations

Doors: safe sets of constants

Next, let’s tackle the concept of “doors”. We don’t need to gold-plate this with a full blown Door class with instances and methods - doors don’t have any behavior or state in this example, and we don’t need to add it. But, we still wouldn’t want anyone using using this library to mix up a door or accidentally plunge to their doom by accidentally passing "certian death" when they meant certain. So a Door clearly needs a type of its own, which is to say, an Enum:

1
2
3
4
5
from enum import Enum

class Door(Enum):
    certain_death = "certain death"
    castle = "castle"

Questions: describing type interfaces

Next up, what is a “question”? Guards expect a very specific sort of value as their question argument and we if we’re using type annotations, we should specify what it is. We want a Question type that defines arguments for each part of the universe of knowledge that these guards understand. This includes who they are themselves, who the set of both guards are, and what the doors are.

We can specify it like so:

1
2
3
4
5
6
7
from typing import Protocol, Sequence

class Question(Protocol):
    def __call__(
        self, guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        ...

The most flexible way to define a type of thing you can call using mypy and typing is to define a Protocol with a __call__ method and nothing else1. We could also describe this type as Question = Callable[[Guard, Sequence[Guard], Door], bool] instead, but as you may be able to infer, that doesn’t let you easily specify names of arguments, or keyword-only or positional-only arguments, or required default values. So Protocol-with-__call__ it is.

At this point, we also get to consider; does the questioner need the ability to change the collection of doors they’re passed? Probably not; they’re just asking questions, not giving commands. So they should receive an immutable version, which means we need to import Sequence from the typing module and not List, and use that for both guards and doors argument types.

Guards and questions: annotating existing logic with types

Next up, what does Guard look like now? Aside from adding some type annotations — and using our shiny new Door and Question types — it looks substantially similar to Moshe’s version:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
from dataclasses import dataclass

@dataclass
class Guard:
    _truth_teller: bool
    _guards: Sequence[Guard]
    _doors: Sequence[Door]

    def ask(self, question: Question) -> bool:
        answer = question(self, self._guards, self._doors)
        if not self._truth_teller:
            answer = not answer
        return answer

Similarly, the question that we want to ask looks quite similar, with the addition of:

  1. type annotations for both the “outer” and the “inner” question, and
  2. using Door.castle for our comparison rather than the string "castle"
  3. replacing List with Sequence, as discussed above, since the guards in this puzzle also have no power to change their environment, only to answer questions.
  4. using the [var] = value syntax for destructuring bind, rather than the more subtle var, = value form
1
2
3
4
5
6
7
8
9
def question(guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]) -> bool:
    [other_guard] = (candidate for candidate in guards if candidate != guard)

    def other_question(
        guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        return doors[0] == Door.castle

    return other_guard.ask(other_question)

Eliminating global state: building the guard post

Next up, how shall we initialize this collection of guards? Setting a couple of global variables is never good style, so let’s encapsulate this within a function:

1
2
3
4
5
6
7
from typing import List

def make_guard_post() -> Sequence[Guard]:
    doors = list(Door)
    guards: List[Guard] = []
    guards[:] = [Guard(True, guards, doors), Guard(False, guards, doors)]
    return guards

Defining the main point

And finally, how shall we actually have this execute? First, let’s put this in a function, so that it can be called by things other than running the script directly; for example, if we want to use entry_points to expose this as a script. Then, let's put it in a "__main__" block, and not just execute it at module scope.

Secondly, rather than inspecting the output of each one at a time, let’s use the all function to express that the interesting thing is that all of the guards will answer the question in the affirmative:

1
2
3
4
5
6
def main() -> None:
    print(all(each.ask(question) for each in make_guard_post()))


if __name__ == "__main__":
    main()

Appendix: the full code

To sum up, here’s the full version:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
from __future__ import annotations
from dataclasses import dataclass
from typing import List, Protocol, Sequence
from enum import Enum


class Door(Enum):
    certain_death = "certain death"
    castle = "castle"


class Question(Protocol):
    def __call__(
        self, guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        ...


@dataclass
class Guard:
    _truth_teller: bool
    _guards: Sequence[Guard]
    _doors: Sequence[Door]

    def ask(self, question: Question) -> bool:
        answer = question(self, self._guards, self._doors)
        if not self._truth_teller:
            answer = not answer
        return answer


def question(guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]) -> bool:
    [other_guard] = (candidate for candidate in guards if candidate != guard)

    def other_question(
        guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        return doors[0] == Door.castle

    return other_guard.ask(other_question)


def make_guard_post() -> Sequence[Guard]:
    doors = list(Door)
    guards: List[Guard] = []
    guards[:] = [Guard(True, guards, doors), Guard(False, guards, doors)]
    return guards


def main() -> None:
    print(all(each.ask(question) for each in make_guard_post()))


if __name__ == "__main__":
    main()

Acknowledgments

I’d like to thank Moshe Zadka for the post that inspired this, as well as Nelson Elhage, Jonathan Lange, Ben Bangert and Alex Gaynor for giving feedback on drafts of this post.


  1. I will hopefully have more to say about typing.Protocol in another post soon; it’s the real hero of the Mypy saga, but more on that later... 

Mac Python Distribution Post Updated for Catalina and Notarization

Notarize your Python apps for macOS Catalina.

I previously wrote a post about shipping a PyGame app to users on macOS. It’s now substantially updated for the new Notarization requirements in Catalina. I hope it’s useful to somebody!

Toward a “Kernel Python”

The life changing magic of a minimal standard library.

Prompted by Amber Brown’s presentation at the Python Language Summit last month, Christian Heimes has followed up on his own earlier work on slimming down the Python standard library, and created a proper Python Enhancement Proposal PEP 594 for removing obviously obsolete and unmaintained detritus from the standard library.

PEP 594 is great news for Python, and in particular for the maintainers of its standard library, who can now address a reduced surface area. A brief trip through the PEP’s rogues gallery of modules to deprecate or remove1 is illuminating. The python standard library contains plenty of useful modules, but it also hides a veritable necropolis of code, a towering monument to obsolescence, threatening to topple over on its maintainers at any point.

However, I believe the PEP may be approaching the problem from the wrong direction. Currently, the standard library is maintained in tandem with, and by the maintainers of, the CPython python runtime. Large portions of it are simply included in the hope that it might be useful to somebody. In the aforementioned PEP, you can see this logic at work in defense of the colorsys module: why not remove it? “The module is useful to convert CSS colors between coordinate systems. [It] does not impose maintenance overhead on core development.”

There was a time when Internet access was scarce, and maybe it was helpful to pre-load Python with lots of stuff so it could be pre-packaged with the Python binaries on the CD-ROM when you first started learning.

Today, however, the modules you need to convert colors between coordinate systems are only a pip install away. The bigger core interpreter is just more to download before you can get started.

Why Didn’t You Review My PR?

So let’s examine that claim: does a tiny module like colorsys “impose maintenance overhead on core development”?

The core maintainers have enough going on just trying to maintain the huge and ancient C codebase that is CPython itself. As Mariatta put it in her North Bay Python keynote, the most common question that core developers get is “Why haven’t you looked at my PR?” And the answer? It’s easier to not look at PRs when you don’t care about them. This from a talk about what it means to be a core developer!

One might ask, whether Twisted has the same problem. Twisted is a big collection of loosely-connected modules too; a sort of standard library for networking. Are clients and servers for SSH, IMAP, HTTP, TLS, et. al. all a bit much to try to cram into one package?

I’m compelled to reply: yes. Twisted is monolithic because it dates back to a similar historical period as CPython, where installing stuff was really complicated. So I am both sympathetic and empathetic towards CPython’s plight.

At some point, each sub-project within Twisted should ideally become a separate project with its own repository, CI, website, and of course its own more focused maintainers. We’ve been slowly splitting out projects already, where we can find a natural boundary. Some things that started in Twisted like constantly and incremental have been split out; deferred and filepath are in the process of getting that treatment as well. Other projects absorbed into the org continue to live separately, like klein and treq. As we figure out how to reduce the overhead of setting up and maintaining the CI and release infrastructure for each of them, we’ll do more of this.


But is our monolithic nature the most pressing problem, or even a serious problem, for the project? Let’s quantify it.

As of this writing, Twisted has 5 outstanding un-reviewed pull requests in our review queue. The median time a ticket spends in review is roughly four and a half days.2 The oldest ticket in our queue dates from April 22, which means it’s been less than 2 months since our oldest un-reviewed PR was submitted.

It’s always a struggle to find enough maintainers and enough time to respond to pull requests. Subjectively, it does sometimes feel like “Why won’t you review my pull request?” is a question we do still get all too often. We aren’t always doing this well, but all in all, we’re managing; the queue hovers between 0 at its lowest and 25 or so during a bad month.

By comparison to those numbers, how is core CPython doing?

Looking at CPython’s keyword-based review queue queue, we can see that there are 429 tickets currently awaiting review. The oldest PR awaiting review hasn’t been touched since February 2, 2018, which is almost 500 days old.

How many are interpreter issues and how many are stdlib issues? Clearly review latency is a problem, but would removing the stdlib even help?

For a quick and highly unscientific estimate, I scanned the first (oldest) page of PRs in the query above. By my subjective assessment, on this page of 25 PRs, 14 were about the standard library, 10 were about the core language or interpreter code; one was a minor documentation issue that didn’t really apply to either. If I can hazard a very rough estimate based on this proportion, somewhere around half of the unreviewed PRs might be in standard library code.


So the first reason the CPython core team needs to stop maintaining the standard library because they literally don’t have the capacity to maintain the standard library. Or to put it differently: they aren’t maintaining it, and what remains is to admit that and start splitting it out.

It’s true that none of the open PRs on CPython are in colorsys3. It does not, in fact, impose maintenance overhead on core development. Core development imposes maintenance overhead on it. If I wanted to update the colorsys module to be more modern - perhaps to have a Color object rather than a collection of free functions, perhaps to support integer color models - I’d likely have to wait 500 days, or more, for a review.

As a result, code in the standard library is harder to change, which means its users are less motivated to contribute to it. CPython’s unusually infrequent releases also slow down the development of library code and decrease the usefulness of feedback from users. It’s no accident that almost all of the modules in the standard library have actively maintained alternatives outside of it: it’s not a failure on the part of the stdlib’s maintainers. The whole process is set up to produce stagnation in all but the most frequently used parts of the stdlib, and that’s exactly what it does.

New Environments, New Requirements

Perhaps even more importantly is that bundling together CPython with the definition of the standard library privileges CPython itself, and the use-cases that it supports, above every other implementation of the language.

Podcast after podcast after podcast after keynote tells us that in order to keep succeeding and expanding, Python needs to grow into new areas: particularly web frontends, but also mobile clients, embedded systems, and console games.

These environments require one or both of:

  • a completely different runtime, such as Brython, or MicroPython
  • a modified, stripped down version of the standard library, which elides most of it.

In all of these cases, determining which modules have been removed from the standard library is a sticking point. They have to be discovered by a process of trial and error; notably, a process completely different from the standard process for determining dependencies within a Python application. There’s no install_requires declaration you can put in your setup.py that indicates that your library uses a stdlib module that your target Python runtime might leave out due to space constraints.

You can even have this problem even if all you ever use is the standard python on your Linux installation. Even server- and desktop-class Linux distributions have the same need for a more minimal core Python package, and so they already chop up the standard library somewhat arbitrarily. This can break the expectations of many python codebases, and result in bugs where even pip install won’t work.

Take It All Out

How about the suggestion that we should do only a little a day? Although it sounds convincing, don’t be fooled. The reason you never seem to finish is precisely because you tidy a little at a time. [...] The ultimate secret of success is this: If you tidy up in one shot, rather than little by little, you can dramatically change your mind-set.

Kondō, Marie.
“The Life-Changing Magic of Tidying Up”
(p. 15-16)

While incremental slimming of the standard library is a step in the right direction, incremental change can only get us so far. As Marie Kondō says, when you really want to tidy up, the first step is to take everything out so that you can really see everything, and put back only what you need.

It’s time to thank those modules which do not spark joy and send them on their way.

We need a “kernel” version of Python that contains only the most absolutely minimal library, so that all implementations can agree on a core baseline that gives you a “python”, and applications, even those that want to run on web browsers or microcontrollers, can simply state their additional requirements in terms of requirements.txt.

Now, there are some business environments where adding things to your requirements.txt is a fraught, bureaucratic process, and in those places, a large standard library might seem appealing. But “standard library” is a purely arbitrary boundary that the procurement processes in such places have drawn, and an equally arbitrary line may be easily drawn around a binary distribution.

So it may indeed be useful for some CPython binary distributions — perhaps even the official ones — to still ship with a broader selection of modules from PyPI. Even for the average user, in order to use it for development, at the very least, you’d need enough stdlib stuff that pip can bootstrap itself, to install the other modules you need!

It’s already the case, today, that pip is distributed with Python, but isn’t maintained in the CPython repository. What the default Python binary installer ships with is already a separate question from what is developed in the CPython repo, or what ships in the individual source tarball for the interpreter.

In order to use Linux, you need bootable media with a huge array of additional programs. That doesn’t mean the Linux kernel itself is in one giant repository, where the hundreds of applications you need for a functioning Linux server are all maintained by one team. The Linux kernel project is immensely valuable, but functioning operating systems which use it are built from the combination of the Linux kernel and a wide variety of separately maintained libraries and programs.

Conclusion

The “batteries included” philosophy was a great fit for the time when it was created: a booster rocket to sneak Python into the imagination of the programming public. As the open source and Python packaging ecosystems have matured, however, this strategy has not aged well, and like any booster, we must let it fall back to earth, lest it drag us back down with it.

New Python runtimes, new deployment targets, and new developer audiences all present tremendous opportunities for the Python community to soar ever higher.

But to do it, we need a newer, leaner, unburdened “kernel” Python. We need to dump the whole standard library out on the floor, adding back only the smallest bits that we need, so that we can tell what is truly necessary and what’s just nice to have.

I hope I’ve convinced at least a few of you that we need a kernel Python.

Now: who wants to write the PEP?

🚀

Acknowledgments

Thanks to Jean-Paul Calderone, Donald Stufft, Alex Gaynor, Amber Brown, Ian Cordasco, Jonathan Lange, Augie Fackler, Hynek Schlawack, Pete Fein, Mark Williams, Tom Most, Jeremy Thurgood, and Aaron Gallagher for feedback and corrections on earlier drafts of this post. Any errors of course remain my own.


  1. sunau, xdrlib, and chunk are my personal favorites. 

  2. Yeah, yeah, you got me, the mean is 102 days. 

  3. Well, as it turns out, one is on colorsys, but it’s a documentation fix that Alex Gaynor filed after reviewing a draft of this post so I don’t think it really counts. 

Tips And Tricks for Shipping a PyGame App on the Mac

A quick and dirty guide to getting that little PyGame hack you did up and running on someone else’s Mac.

I have written a tool you can actually use rather than copying and pasting shell-script snippets, which you can read about in a new post here. I've done my best to update the accuracy of the information below as well, particularly with respect to which Python you want and why, but it is a much older post and I could easily have missed something.

I’ve written and spoken at some length about shipping software in the abstract. Sometimes I’ve even had the occasional concrete tidbit, but that advice wasn’t really complete.

In honor of Eevee’s delightful Games Made Quick???, I’d like to help you package your games even quicker than you made them.

Who is this for?

About ten years ago I made a prototype of a little PyGame thing which I wanted to share with a few friends. Building said prototype was quick and fun, and very different from the usual sort of work I do. But then, the project got just big enough that I started to wonder if it would be possible to share the result, and thus began the long winter of my discontent with packaging tools.

I might be the only one, but... I don’t think so. The history of PyWeek, for example, looks to be a history of games distributed as Github repositories, or, at best, apps which don’t launch. It seems like people who participate in game jams with Unity push a button and publish their games to Steam; people who participate in game jams with Python wander away once the build toolchain defeats them.

So: perhaps you’re also a Python programmer, and you’ve built something with PyGame, and you want to put it on your website so your friends can download it. Perhaps many or most of your friends and family are Mac users. Perhaps you tried to make a thing with py2app once, and got nothing but inscrutable tracebacks or corrupt app bundles for your trouble.

If so, read on and enjoy.

What changed?

If things didn’t work for me when I first tried to do this, what’s different now?

  • the packaging ecosystem in general is far less buggy, and py2app’s dependencies, like setuptools, have become far more reliable as well. Many thanks to Donald Stufft and the whole PyPA for that.
  • Binary wheels exist, and the community has been getting better and better at building self-contained wheels which include any necessary C libraries, relieving the burden on application authors to figure out gnarly C toolchain issues.
  • The PyGame project now ships just such wheels for a variety of Python versions on Mac, Windows, and Linux, which removes a whole huge pile of complexity both in generally understanding the C toolchain and specifically understanding the SDL build process.
  • py2app has been actively maintained and many bugs have been fixed - many thanks to Ronald Oussoren et. al. for that.
  • I finally broke down and gave Apple a hundred dollars so I can produce an app that normal humans might actually be able to run.

There are still weird little corner cases you have to work around — hence this post – but mostly this is the story of how years of effort by the Python packaging community have resulted in tools that are pretty close to working out of the box now.

Step 0: Development Setup

You will also want to use a virtual environment for development.

Finally: pip install all your requirements into your virtualenv, including PyGame itself.

Step 1: Make an icon

All good apps need an icon, right?

When I was young, one would open up ResEdit Resorcerer MPW CodeWarrior Project Builder Icon Composer Xcode and create a new ICON resource cicn resource .tiff file .icns file. Nowadays there’s some weird opaque stuff with xcassets files and Contents.json and “Copy Bundle Resources” in the default Swift and Objective C project templates and honestly I can’t be bothered to keep track of what’s going on with this nonsense any more.

Luckily the OS ships with the macOS-specific “scriptable image processing system”, which can helpfully convert an icon for you. Make yourself a 512x512 PNG file in your favorite image editor (with an alpha channel!) that you want to use as your icon, then run it something like this:

1
$ sips -s format icns Icon.png --out Icon.icns

somewhere in your build process, to produce an icon in the appropriate format.

There’s also one additional wrinkle with PyGame: once you’ve launched the game, PyGame helpfully assigns the cute, but ugly, default PyGame icon to your running process. To avoid this, you’ll need these two lines somewhere in your initialization code, somewhere before pygame.display.init (or, for that matter, pygame.display.<anything>):

1
2
from pygame.sdlmain_osx import InstallNSApplication
InstallNSApplication()

Obviously this is pretty Mac-specific so you probably want this under some kind of platform-detection conditional, perhaps this one.

Step 2: Include All The Dang Files, I Don’t Care About Performance

Unfortunately py2app still tries really hard to jam all your code into a .zip file, which breaks the world in various hilarious ways. Your app will probably have some resources you want to load, as will PyGame itself.

Supposedly, packages=["your_package"] in your setup.py should address this, and it comes with a “pygame” recipe, but neither of these things worked for me. Instead, I convinced py2app to splat out all the files by using the not-quite-public “recipe” plugin API:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
import py2app.recipes
import py2app.build_app

from setuptools import find_packages, setup

pkgs = find_packages(".")

class recipe_plugin(object):
    @staticmethod
    def check(py2app_cmd, modulegraph):
        local_packages = pkgs[:]
        local_packages += ['pygame']
        return {
            "packages": local_packages,
        }

py2app.recipes.my_recipe = recipe_plugin

APP = ['my_main_file.py']
DATA_FILES = []
OPTIONS = {}
OPTIONS.update(
    iconfile="Icon.icns",
    plist=dict(CFBundleIdentifier='com.example.yourdomain.notmine')
)

setup(
    name="Your Game",
    app=APP,
    data_files=DATA_FILES,
    include_package_data=True,
    options={'py2app': OPTIONS},
    setup_requires=['py2app'],
    packages=pkgs,
    package_data={
        "": ["*.gal" , "*.gif" , "*.html" , "*.jar" , "*.js" , "*.mid" ,
             "*.png" , "*.py" , "*.pyc" , "*.sh" , "*.tmx" , "*.ttf" ,
             # "*.xcf"
        ]
    },
)

This is definitely somewhat less efficient than py2app’s default of stuffing the code into a single zip file, but, as a counterpoint to that: it actually works.

Step 3: Build it

Hopefully, at this point you can do python setup.py py2app and get a shiny new app bundle in dist/$NAME.app. We haven’t had to go through the hell of quarantine yet, so it should launch at this point. If it doesn’t, sorry :-(.

You can often debug more obvious fail-to-launch issues by running the executable in the command line, by running ./dist/$NAME.app/Contents/MacOS/$NAME. Although this will run in a slightly different environment than double clicking (it will have all your shell’s env vars, for example, so if your app needs an env var to work it might mysteriously work there) it will also print out any tracebacks to your terminal, where they’ll be slightly easier to find than in Console.app.

Once your app at least runs locally, it’s time to...

Step 4: Code sign it

All the tutorials that I’ve found on how to do this involve doing Xcode project goop where it’s not clear what’s happening underneath. But despite the fact that the introductory docs aren’t quite there, the underlying model for codesigning stuff is totally common across GUI and command-line cases. However, actually getting your cert requires Xcode, an apple ID, and a credit card.

After paying your hundred dollars, go into Xcode, go to Accounts, hit “+”, “Apple ID”, then log in. Then, in your shiny new account, go to “Manage Certificates”, hit the little “+”, and (assuming, like me, you want to put something up on your own website, and not submit to the Mac App Store), and choose Developer ID Application. You probably think you want “mac app distribution” because you are wanting to distribute a mac app! But you don’t.

Next, before you do anything else, make sure you have backups of your certificate and private key. You really don’t want to lose the private key associated with that cert.

Now quit Xcode; you’re done with the GUI.

You will need to know the identifier of your signing key though, which should be output from the command:

1
$ security find-identity -v -p codesigning | grep 'Developer ID' | sed -e 's/.*"\(.*\)"/\1/'

You probably want to put that in your build script, since you want to sign with the same identity every time. Further commands here will assume you’ve copied one of the lines of results from that command and done export IDENTITY="..." with it.

Step 4a: Become Aware Of New Annoying Requirements

Update for macOS Catalina: In Catalina, Apple has added a new code-signing requirement; even for apps distributed outside of the app store, they still have to be submitted to and approved by Apple.

In order to be notarized, you will need to codesign not only your app itself, but to also:

  1. add the hardened-runtime exception entitlements that allow Python to work, and
  2. directly sign every shared library that is part of your app bundle.

So the actual code-signing step is now a little more complicated.

Step 4b: Write An Entitlements Plist That Allows Python To Work

One of the features that notarization is intended to strongly encourage1 is the “hardened runtime”, a feature of macOS which opts in to stricter run-time behavior designed to stop malware. One thing that the hardened runtime does is to disable writable, executable memory, which is used by JITs, FFIs ... and malware.

Unfortunately, both Python’s built-in ctypes module and various popular bits of 3rd-party stuff that uses cffi, including pyOpenSSL, require writable, executable memory to work. Furthermore, py2app actually imports ctypes during its bootstrapping phase, so you can’t even get your own code to start running to perform any workarounds unless this is enabled. So this is just if you want to use Python, not if your project requires ctypes directly.

To make this long, sad story significantly shorter and happier, you can create an entitlements property list that enables the magical property which allows this to work. It looks like this:

1
2
3
4
5
6
7
8
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.security.cs.allow-unsigned-executable-memory</key>
    <true/>
</dict>
</plist>

Subsequent steps assume that you’ve put this into a file called entitleme.plist in your project root.

Step 4c: SIGN ALL THE THINGS

Notarization also requires that all the executable files in your bundle, not just the main executable, are properly code-signed before submitting. So you’ll need to first run the codesign command across all your shared libraries, something like this:

1
2
3
4
5
6
7
8
9
$ cd dist
$ find "${NAME}.app" -iname '*.so' -or -iname '*.dylib' |
    while read libfile; do
        codesign --sign "${IDENTITY}" \
                 --entitlements ../entitleme.plist \
                 --deep "${libfile}" \
                 --force \
                 --options runtime;
    done;

Then finally, sign the bundle itself.

1
2
3
4
5
$ codesign --sign "${IDENTITY}" \
         --entitlements ../entitleme.plist \
         --deep "${NAME}.app" \
         --force \
         --options runtime;

Now, your app is code-signed.

Step 5: Archive it

The right way to do this is probably to use dmgbuild or something like it, but what I promised here was quick and dirty, not beautiful and best practices.

You have to make a Zip archive that preserves symbolic links. There are a couple of options for this:

  • open dist/, then in the Finder window that comes up, right click on the app and “compress” it
  • cd dist; zip -yr $NAME.app.zip $NAME.app

Most importantly, if you use the zip command line tool, you must use the -y option. Without it, your downloadable app bundle will be somewhat mysteriously broken even though the one before you zipped it will be fine.

Step 6: Actually The Rest Of Step 4: Request Notarization

Notarization is a 2-step process, which is somewhat resistant to fully automating. You submit to Apple, then they email you the results of doing the notarization, then if that email indicates that your notarization succeded, you can “staple” the successful result to your bundle.

The thing you notarize is an archive, which is why you need to do step 5 first. Then, you need to do this:

1
2
3
4
5
$ xcrun altool --notarize-app \
      --file "${NAME}.app.zip" \
      --type osx \
      --username "${YOUR_DEVELOPER_ID_EMAIL}" \
      --primary-bundle-id="${YOUR_BUNDLE_ID}";

Be sure that YOUR_BUNDLE_ID matches the CFBundleIdentifier you told py2app about before, so that the tool can find your app bundle inside the archive.

You’ll also need to type in the iCloud password for your Developer ID account here.2

Step 6a: Wait A Minute

Anxiously check your email for an hour or so. Hope you don’t get any errors.

Step 6b: Finish Notarizing It, Finally!

Once Apple has a record of the app’s notarization, their tooling will recognize it, so you don’t need any information from the confirmation email or the previous command; just make sure that you are running this on the exact same .app directory you just built and archived and not a version that differs in any way.

1
$ xcrun stapler staple "./${NAME}.app";

Finally, you will want to archive it again:

1
$ zip -qyr "${NAME}.notarized.app.zip" "${NAME}.app";

Step 7: Download it

Ideally, at this point, everything should be working. But to make sure that code-signing and archiving and notarizing and re-archiving went correctly, you should have either a pristine virtual machine with no dev tools and no Python installed, or a non-programmer friend’s machine that can serve the same purpose. They probably need a relatively recent macOS - my own experience has shown that apps made using the above technique will definitely work on High Sierra (and later) and will definitely break on Yosemite (and earlier); they probably start working at some OS version between those.

There’s no tooling that I know of that can clearly tell you whether your mac app depends on some detail of your local machine. Even for your dependencies, there’s no auditwheel for macOS.

Updated 2019-06-27: It turns out there is an auditwheel like thing for macOS: delocate! In fact, it predated and inspired auditwheel!

Thanks to Nathaniel Smith for the update (which he provided in, uh, January of 2018 and I’ve only just now gotten around to updating...).

Nevertheless, it’s always a good idea to check your final app build on a fresh computer before you announce it.

Coda

If you were expecting to get to the end and download my cool game, sorry to disappoint! It really is a half-broken prototype that is in no way ready for public consumption, and given my current load of personal and professional responsibilities, you definitely shouldn’t expect anything from me in this area any time soon, or, you know, ever.

But, from years of experience, I know that it’s nearly impossible to summon any motivation to work on small projects like this without the knowledge that the end result will be usable in some way, so I hope that this helps someone else set up their Python game-dev pipeline.

I’d really like to turn this into a 3-part series, with a part for Linux (perhaps using flatpak? is that a good thing?) and a part for Windows. However, given my aforementioned time constraints, I don’t think I’m going to have the time or energy to do that research, so if you’ve got the appropriate knowledge, I’d love to host a guest post on this blog, or even just a link to yours.

If this post helped you, if you have questions or corrections, or if you’d like to write the Linux or Windows version of this post, let me know.


  1. The hardened runtime was originally required when notarization was introduced. Apparently this broke too much software and now the requirement is relaxed until January 2020. But it’s probably best to treat it as if it is required, since the requirement is almost certainly coming back, and may in fact be back by the time you’re reading this. 

  2. You can pass it via the --password option but there are all kinds of security issues with that so I wouldn’t recommend it.