Careful With That PyPI

PyPI credentials are important. Here are some tips for securing them a little better.

Too Many Secrets

A wise man once said, “you shouldn’t use ENV variables for secret data”. In large part, he was right, for all the reasons he gives (and you should read them). Filesystem locations are usually a better operating system interface to communicate secrets than environment variables; fewer things can intercept an open() than can read your process’s command-line or calling environment.

One might say that files are “more secure” than environment variables. To his credit, Diogo doesn’t, for good reason: one shouldn’t refer to the superiority of such a mechanism as being “more secure” in general, but rather, as better for a specific reason in some specific circumstance.

Supplying your PyPI password to tools you run on your personal machine is a very different case than providing a cryptographic key to a containerized application in a remote datacenter. In this case, based on the constraints of the software presently available, I believe an environment variable provides better security, if you use it correctly.

Popping A Shell By Any Other Name

If you upload packages to the python package index, and people use those packages, your PyPI password is an extremely high-privilege credential: effectively, it grants a time-delayed arbitrary code execution privilege on all of the systems where anyone might pip install your packages.

Unfortunately, the suggested mechanism to manage this crucial, potentially world-destroying credential is to just stick it in an unencrypted file.

The authors of this documentation know this is a problem; the authors of the tooling know too (and, given that these tools are all open source and we all could have fixed them to be better about this, we should all feel bad).

Leaving the secret lying around on the filesystem is a form of ambient authority; a permission you always have, but only sometimes want. One of the worst things about this is that you can easily forget it’s there if you don’t use these credentials very often.

The keyring is a much better place, but even it can be a slightly scary place to put such a thing, because it’s still easy to put it into a state where some random command could upload a PyPI release without prompting you. PyPI is forever, so we want to measure twice and cut once.

Luckily, even more secure places exist: password managers. If you use https://1password.com or https://www.lastpass.com, both offer command-line interfaces that integrate nicely with PyPI. If you use 1password, you’ll really want https://stedolan.github.io/jq/ (apt-get install jq, brew install jq) to slice & dice its command-line.

The way that I manage my PyPI credentials is that I never put them on my filesystem, or even into my keyring; instead, I leave them in my password manager, and very briefly toss them into the tools that need them via an environment variable.

First, I have the following shell function, to prevent any mistakes:

1
2
3
4
function twine () {
    echo "Use dev.twine or prod.twine depending on where you want to upload.";
    return 1;
}

For dev.twine, I configure twine to always only talk to my local DevPI instance:

1
2
3
4
5
6
function dev.twine () {
    env TWINE_USERNAME=root \
        TWINE_PASSWORD= \
        TWINE_REPOSITORY_URL=http://127.0.0.1:3141/root/plus/ \
        twine "$@";
}

This way I can debug Twine, my setup.py, and various test-upload things without ever needing real credentials at all.

But, OK. Eventually, I need to actually get the credentials and do the thing. How does that work?

1Password

1password’s command line is a little tricky to log in to (you have to eval its output, it’s not just a command), so here’s a handy shell function that will do it.

1
2
3
4
5
6
function opme () {
    # Log this shell in to 1password.
    if ! env | grep -q OP_SESSION; then
        eval "$(op signin "$(jq -r '.latest_signin' ~/.op/config)")";
    fi;
}

Then, I have this little helper for slicing out a particular field from the OP JSON structure:

1
2
3
function _op_field () {
    jq -r '.details.fields[] | select(.name == "'"${1}"'") | .value';
}

And finally, I use this to grab the item I want (named, memorably enough, “PyPI”) and invoke Twine:

1
2
3
4
5
6
7
function prod.twine () {
    opme;
    local pypi_item="$(op get item PyPI)";
    env TWINE_USERNAME="$(echo ${pypi_item} | _op_field username)" \
        TWINE_PASSWORD="$(echo "${pypi_item}" | _op_field password)" \
        twine "$@";
}

LastPass

For lastpass, you can just log in (for all shells; it’s a little less secure) via lpass login; if you’ve logged in before you often don’t even have to do that, and it will just prompt you when running command that require you to be logged in; so we don’t need the preamble that 1password’s command line did.

Its version of prod.twine looks quite similar, but its plaintext output obviates the need for jq:

1
2
3
4
5
function prod.twine () {
    env TWINE_USERNAME="$(lpass show PyPI --username)" \
        TWINE_PASSWORD="$(lpass show PyPI --password)" \
        twine "$@";
}

In Conclusion

“Keep secrets out of your environment” is generally a good idea, and you should always do it when you can. But, better a moment in your process environment than an eternity on your filesystem. Environment-based configuration can be a very useful stopgap for limiting the lifetimes of credentials when your tools don’t support more sophisticated approaches to secret storage.1

Post Script

If you are interested in secure secret storage, my micro-project secretly might be of interest. Right now it doesn’t do a whole lot; it’s just a small wrapper around the excellent keyring module and the pinentry / pinentry-mac password prompt tools. secretly presents an interface both for prompting users for their credentials without requiring the command-line or env vars, and for saving them away in keychain, for tools that need to pull in an API key and don’t want to make the user manually edit a config file first.


  1. Really, PyPI should have API keys that last for some short amount of time, that automatically expire so you don’t have to freak out if you gave somebody a 5-year-old laptop and forgot to wipe it first. But again, if I wanted that so bad, I should have implemented it myself... 

Sourceforge Update

Authenticate downloaded binaries from sourceforge a little more.

When I wrote my previous post about Sourceforge, things were looking pretty grim for the site; I (rightly, I think) slammed them for some pretty atrocious security practices.

I invited the SourceForge ops team to get in touch about it, and, to their credit, they did. Even better, they didn't ask for me to take down the article, or post some excuse; they said that they knew there were problems and they were working on a long-term plan to address them.

This week I received an update from said ops, saying:

We have converted many of our mirrors over to HTTPS and are actively working on the rest + gathering new ones. The converted ones happen to be our larger mirrors and are prioritized.

We have added support for HTTPS on the project web. New projects will automatically start using it. Old projects can switch over at their convenience as some of them may need to adjust it to properly work. More info here:

https://sourceforge.net/blog/introducing-https-for-project-websites/

Coincidentally, right after I received this email, I installed a macOS update, which means I needed to go back to Sourceforge to grab an update to my boot manager. This time, I didn't have to do any weird tricks to authenticate my download: the HTTPS project page took me to an HTTPS download page, which redirected me to an HTTPS mirror. Success!

(It sounds like there might still be some non-HTTPS mirrors in rotation right now, but I haven't seen one yet in my testing; for now, keep an eye out for that, just in case.)

If you host a project on Sourceforge, please go push that big "Switch to HTTPS" button. And thanks very much to the ops team at Sourceforge for taking these problems seriously and doing the hard work of upgrading their users' security.

Don’t Trust Sourceforge, Ever

Authenticate downloaded binaries from sourceforge. A little.

Update: please see my more recent post about updates in the interim.

If you use a computer and you use the Internet, chances are you’ll eventually find some software that, for whatever reason, is still hosted on Sourceforge. In case you’re not familiar with it, Sourceforge is a publicly-available malware vector that also sometimes contains useful open source binary downloads, especially for Windows.


In addition to injecting malware into their downloads (a practice they claim, hopefully truthfully, to have stopped), Sourceforge also presents an initial download page over HTTPS, then redirects the user to HTTP for the download itself, snatching defeat from the jaws of victory. This is fantastically irresponsible, especially for a site offering un-sandboxed binaries for download, especially in the era of Let’s Encrypt where getting a TLS certificate takes approximately thirty seconds and exactly zero dollars.

So: if you can possibly find your downloads anywhere else, go there.


But, rarely, you will find yourself at the mercy of whatever responsible stewards1 are still operating Sourceforge if you want to get access to some useful software. As it happens, there is a loophole that will let you authenticate the binaries that you download from them so you won’t be left vulnerable to an evil barista: their “file release system”, the thing you use to upload your projects, will allow you to download other projects as well.

To use it, first, make yourself a sourceforge account. You may need to create a dummy project as well. Sourceforge maintains an HTTPS-accessible list of key fingerprints for all the SSH servers that they operate, so you can verify the public key below.

Then you’ll need to connect to their upload server over SFTP, and go to the path /home/frs/project/<the project’s name>/.../ to get the file.

I have written a little Python script2 that automates the translation of a Sourceforge file-browser download URL, one that you can get if you right-click on a download in the “files” section of a project’s website, and runs the relevant scp command to retrieve the file for you. This isn’t on PyPI or anything, and I’m not putting any effort into polishing it further; the best possible outcome of this blog post is that it immediately stops being necessary.


  1. Are you one of those people? I would prefer to be lauding your legacy of decades of valuable contributions to the open source community instead of ridiculing your dangerous incompetence, but repeated bug reports and support emails have gone unanswered. Please get in touch so we can discuss this. 

  2. Code:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    #!/usr/bin/env python2
    
    import sys
    import os
    
    sfuri = sys.argv[1]
    
    # for example,
    # http://sourceforge.net/projects/refind/files/0.9.2/refind-bin-0.9.2.zip/download
    
    import re
    matched = re.match(
        r"https://sourceforge.net/projects/(.*)/files/(.*)/download",
        sfuri
    )
    
    if not matched:
        sys.stderr.write("Not a SourceForge download link.\n")
        sys.exit(1)
    
    project, path = matched.groups()
    
    sftppath = "/home/frs/project/{project}/{path}".format(project=project, path=path)
    
    def knows_about_web_sf_net():
        with open(
                os.path.expanduser("~/.ssh/known_hosts"), "rb"
        ) as read_known_hosts:
            data = read_known_hosts.read().split("\n")
            for line in data:
                if 'web.sourceforge.net' in line.split()[0]:
                    return True
        return False
    
    sfkey = """
    web.sourceforge.net ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA2uifHZbNexw6cXbyg1JnzDitL5VhYs0E65Hk/tLAPmcmm5GuiGeUoI/B0eUSNFsbqzwgwrttjnzKMKiGLN5CWVmlN1IXGGAfLYsQwK6wAu7kYFzkqP4jcwc5Jr9UPRpJdYIK733tSEmzab4qc5Oq8izKQKIaxXNe7FgmL15HjSpatFt9w/ot/CHS78FUAr3j3RwekHCm/jhPeqhlMAgC+jUgNJbFt3DlhDaRMa0NYamVzmX8D47rtmBbEDU3ld6AezWBPUR5Lh7ODOwlfVI58NAf/aYNlmvl2TZiauBCTa7OPYSyXJnIPbQXg6YQlDknNCr0K769EjeIlAfY87Z4tw==
    """
    
    if not knows_about_web_sf_net():
        with open(
                os.path.expanduser("~/.ssh/known_hosts"), "ab"
        ) as append_known_hosts:
            append_known_hosts.write(sfkey)
    cmd = "scp web.sourceforge.net:{sftppath} .".format(sftppath=sftppath)
    print(cmd)
    os.system(cmd)
    

Your Text Editor Is Malware

Emacs wants you to install unauthenticated code off of a wiki; I can help.

Are you a programmer? Do you use a text editor? Do you install any 3rd-party functionality into that text editor?

If you use Vim, you’ve probably installed a few vimballs from vim.org, a website only available over HTTP. Vimballs are fairly opaque; if you’ve installed one, chances are you didn’t audit the code.

If you use Emacs, you’ve probably installed some packages from ELPA or MELPA using package.el; in Emacs’s default configuration, ELPA is accessed over HTTP, and until recently MELPA’s documentation recommended HTTP as well.

When you install un-signed code into your editor that you downloaded over an unencrypted, unauthenticated transport like HTTP, you might as well be installing malware. This is not a joke or exaggeration: you really might be.1 You have no assurance that you’re not being exploited by someone on your local network, by someone on your ISP’s network, the NSA, the CIA, or whoever else.

The solution for Vim is relatively simple: use vim-plug, which fetches stuff from GitHub exclusively via HTTPS. I haven’t audited it conclusively but its relatively small codebase includes lots of https:// and no http:// or git://2 that I could see.

I’m relatively proud of my track record of being a staunch advocate for improved security in text editor package installation. I’d like to think I contributed a little to the fact that MELPA is now available over HTTPS and instructs you to use HTTPS URLs.

But the situation still isn’t very good in Emacs-land. Even if you manage to get your package sources from an authenticated source over HTTPS, it doesn’t matter, because Emacs won’t verify TLS.

Although package signing is implemented, practically speaking, none of the packages are signed.3 Therefore, you absolutely cannot trust package signing to save you. Plus, even if the packages were signed, why is it the NSA’s business which packages you’re installing, anyway? TLS is shorthand for The Least Security (that is acceptable); whatever other security mechanisms, like package signing, are employed, you should always at least have HTTPS.

With that, here’s my unfortunately surprise-filled step-by-step guide to actually securing Emacs downloads, on Windows, Mac, and Linux.

Step 1: Make Sure Your Package Sources Are HTTPS Only

By default, Emacs ships with its package-archives list as '(("gnu" . "http://elpa.gnu.org/packages/")), which is obviously no good. You will want to both add MELPA (which you surely have done anyway, since it’s where all the actually useful packages are) and change the ELPA URL itself to be HTTPS. Use M-x customize-variable to change package-archives to:

1
2
`(("gnu" . "https://elpa.gnu.org/packages/")
  ("melpa" . "https://melpa.org/packages/"))

Step 2: Turn On TLS Trust Checking

There’s another custom variable in Emacs, tls-checktrust, which checks trust on TLS connections. Go ahead and turn that on, again, via M-x customize-variable tls-checktrust.

Step 3: Set Your Trust Roots

Now that you’ve told Emacs to check that the peer’s certificate is valid, Emacs can’t successfully fetch HTTPS URLs any more, because Emacs does not distribute trust root certificates. Although the set of cabforum certificates are already probably on your computer in various forms, you still have to acquire them in a format usable by Emacs somehow. There are a variety of ways, but in the interests of brevity and cross-platform compatibility, my preferred mechanism is to get the certifi package from PyPI, with python -m pip install --user certifi or similar. (A tutorial on installing Python packages is a little out of scope for this post, but hopefully my little website about this will help you get started.)

At this point, M-x customize-variable fails us, and we need to start just writing elisp code; we need to set tls-program to a string computed from the output of running a program, and if we want this to work on Windows we can’t use Bourne shell escapes. Instead, do something like this in your .emacs or wherever you like to put your start-up elisp:4

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
(let ((trustfile
       (replace-regexp-in-string
        "\\\\" "/"
        (replace-regexp-in-string
         "\n" ""
         (shell-command-to-string "python -m certifi")))))
  (setq tls-program
        (list
         (format "gnutls-cli%s --x509cafile %s -p %%p %%h"
                 (if (eq window-system 'w32) ".exe" "") trustfile))))

This will run gnutls-cli on UNIX, and gnutls-cli.exe on Windows.

You’ll need to install the gnutls-cli command line tool, which of course varies per platform:

  • On OS X, of course, Homebrew is the best way to go about this: brew install gnutls will install it.
  • On Windows, the only way I know of to get GnuTLS itself over TLS is to go directly to this mirror. Download one of these binaries and unzip it next to Emacs in its bin directory.
  • On Debian (or derivatives), apt-get install gnutls-bin
  • On Fedora (or derivatives), yum install gnutls-utils

Great! Now we’ve got all the pieces we need: a tool to make TLS connections, certificates to verify against, and Emacs configuration to make it do those things. We’re done, right?

Wrong!

Step 4: TRUST NO ONE

It turns out there are two ways to tell Emacs to really actually really secure the connection (really), but before I tell you the second one or why you need it, let’s first construct a little test to see if the connection is being properly secured. If we make a bad connection, we want it to fail. Let’s make sure it does.

This little snippet of elisp will use the helpful BadSSL.com site to give you some known-bad and known-good certificates (assuming nobody’s snooping on your connection):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
(let ((bad-hosts
       (loop for bad
             in `("https://wrong.host.badssl.com/"
                  "https://self-signed.badssl.com/")
             if (condition-case e
                    (url-retrieve
                     bad (lambda (retrieved) t))
                  (error nil))
             collect bad)))
  (if bad-hosts
      (error (format "tls misconfigured; retrieved %s ok"
                     bad-hosts))
    (url-retrieve "https://badssl.com"
                  (lambda (retrieved) t))))

If you evaluate it and you get an error, either your trust roots aren’t set up right and you can’t connect to a valid site, or Emacs is still blithely trusting bad certificates. Why might it do that?

Step 5: Configure the Other TLS Verifier

One of Emacs’s compile-time options is whether to link in GnuTLS or not. If GnuTLS is not linked in, it will use whatever TLS program you give it (which might be gnutls-cli or openssl s_client, but since only the most recent version of openssl s_client can even attempt to verify certificates, I’d recommend against it). That is what’s configured via tls-checktrust and tls-program above.

However, if GnuTLS is compiled in, it will totally ignore those custom variables, and honor a different set: gnutls-verify-error and gnutls-trustfiles. To make matters worse, installing the packages which supply the gnutls-cli program also install the packages which might satisfy Emacs’s dynamic linking against the GnuTLS library, which means this code path could get silently turned on because you tried to activate the other one.

To give these variables the correct values as well, we can re-visit the previous trust setup:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
(let ((trustfile
       (replace-regexp-in-string
        "\\\\" "/"
        (replace-regexp-in-string
         "\n" ""
         (shell-command-to-string "python -m certifi")))))
  (setq tls-program
        (list
         (format "gnutls-cli%s --x509cafile %s -p %%p %%h"
                 (if (eq window-system 'w32) ".exe" "") trustfile)))
  (setq gnutls-verify-error t)
  (setq gnutls-trustfiles (list trustfile)))

Now it ought to be set up properly. Try the example again from Step 4 and it ought to work. It probably will. Except, um...

Appendix A: Windows is Weird

As of November 2015, the official Windows builds of Emacs were linked against version 3.3 of GnuTLS rather than the latest 3.4. You might need to download the latest micro-version of 3.3 instead.

As far as I can tell, it’s supposed to work with the command-line tools (and maybe it will for you) but for me, for some reason, Emacs could not parse gnutls-cli.exe’s output no matter what I did. This does not appear to be a universal experience, others have reported success; your mileage may vary.

Update: Thanks to astute reader Richard Copley, I’ve been informed that command-line tools aren’t even really supposed to work on Windows; as described in this bug comment:

TLS connections on MS-Windows are supported via the GnuTLS library. External TLS programs will never work correctly on Windows, since they use signals to communicate with Emacs. So there's little sense in fixing this issue, because the result will not work anyway.

— Eli Zaretskii

Conclusion

We nerds sometimes mock the “normals” for not being as security-savvy as we are. Even if we’re considerate enough not to voice these reactions, when we hear someone got malware on their Windows machine, we think “should have used a UNIX, not Windows”. Or “should have been up to date on your patches”, or something along those lines.

Yet, nerdy tools that download and execute code - Emacs in particular - are shockingly careless about running arbitrary unverified code from the Internet. And we are often equally shockingly careless to use them, when we should know better.

If you’re an Emacs user and you didn’t fully understand this post, or you couldn’t get parts of it to work, stop using package.el until you can get the hang of it. Get a friend to help you get your environment configured properly. Since a disproportionate number of Emacs users are programmers or sysadmins, you are a high-value target, and you are risking not only your own safety but that of your users if you don’t double-check that your editor packages are coming from at least cursorily authenticated sources.

If you use another programmer’s text editor or nerdy development tool that is routinely installing software onto your system, make sure that if it’s at least securing those installations with properly verified TLS.


  1. Technically speaking of course you might always be installing malware; no defense is perfect. And HTTPS is a fairly weak one at that. But is significantly stronger than “no defense at all”. 

  2. Never, ever, clone a repository using git:// URLs. As explained in the documentation: “The native transport (i.e. git:// URL) does no authentication and should be used with caution on unsecured networks.”. You might have heard that git uses a “cryptographic hash function” and thought that had something to do with security: it doesn’t. If you want security you need signed commits, and even then you can never really be sure

  3. Plus, MELPA accepts packages on the (plain-text-only) Wiki, which may be edited by anyone, and from CVS servers, although they’d like to stop that. You should probably be less worried about this, because that’s a link between two datacenters, than about the link between you and MELPA, which is residential or business internet at best, and coffee-shop WiFi at worst. But still maybe be a bit worried about it and go comment on that bug. 

  4. Yes, that let is a hint that this is about to get more interesting... 

According To...?

Browsers, please start showing the issuer to users.

I believe that web browsers must start including the ultimate issuer in an always-visible user interface element.

You are viewing this website at glyph.twistedmatrix.com. Hopefully securely.

We trust that the math in the cryptographic operations protects our data from prying eyes. However, trusting that the math says the content is authentic and secure is useless unless you know who your computer is talking to. The HTTPS/TLS system identifies your interlocutor by their domain name.

In other words, you trust that these words come from me because glyph.twistedmatrix.com is reasonably associated with me. If the lock on your web browser’s title bar was next to the name stuff-glyph-says.stealing-your-credit-card.example.com, presumably you might be more skeptical that the content was legitimate.

But... the cryptographic primitives require a trust root - somebody that you “already trust” - meaning someone that your browser already knows about at the time it makes the request - to tell you that this site is indeed glyph.twistedmatrix.com. So you read these words as if they’re the world according to Glyph, but according to whom is it according to me?

If you click on some obscure buttons (in Safari and Firefox you click on the little lock; in Chrome you click on the lock, then “Connection”) you should see that my identity as glyph.twistedmatrix.com has been verified by “StartCom Class 1 Primary Intermediate Server CA” who was in turn verified by “StartCom Certification Authority”.

But if you do this, it only tells you about this one time. You could click on a link, and the issuer might change. It might be different for just one script on the page, and there’s basically no way to find out. There are more than 50 different organizations which could certify that could tell your browser to trust that this content is from me, several of whom have already been compromised. If you’re concerned about government surveillance, this list includes the governments of Hong Kong, Japan, France, the Netherlands, Turkey, as well as many multinational corporations vulnerable to secret warrants from the USA.

Sometimes it’s perfectly valid to trust these issuers. If I’m visiting a website describing some social services provided to French citizens, it would of course be reasonable for that to be trusted according to the government of France. But if you’re reading an article on my website about secure communications technology, probably it shouldn’t be glyph.twistedmatrix.com brought to you by the China Internet Network Information Center.

Information security is all about the user having some expectation and then a suite of technology ensuring that that expectation is correctly met. If the user’s expectation of the system’s behavior is incorrect, then all the technological marvels in the world making sure that behavior is faithfully executed will not help their actual security at all. Without knowing the issuer though, it’s not clear to me what the user’s expectation is supposed to be about the lock icon.

The security authority system suffers from being a market for silver bullets. Secure websites are effectively resellers of the security offered to them by their certificate issuers; however, the customers are practically unable to even see the trade mark - the issuer name - of the certificate authority ultimately responsible for the integrity and confidentiality of their communications, so they have no information at all. The website itself also has next to no information because the certificate authority themselves are under no regulatory obligation to disclose or verify their security practices.

Without seeing the issuer, there’s no way for “issuer reputation” to be a selling point, which means there’s no market motivation for issuers to do a really good job securing their infrastructure. There’s no way for average users to notice if they are the victims of a targetted surveillance attack.

So please, browser vendors, consider making this information available to the general public so we can all start making informed decisions about who to trust.