One of the wonderful things about Python is the ease with which you can start writing a script: just drop some code into a .py file, and run my_file.py. Similarly, it’s easy to get started with modularity: split the code into my_app.py and my_lib.py, and you can import my_lib from my_app.py and start organizing your code into modules.
However, the details of the machinery that makes this work have some surprising, and sometimes very security-critical consequences: the more convenient it is for you to execute code from different locations, the more opportunities an attacker has to execute it as well...
Python needs a safe space to load code from
Here are three critical assumptions embedded in Python’s security model:
- Every entry on sys.path is assumed to be a secure location from which it is safe to execute arbitrary code.
- The directory where the “main script” is located is always on sys.path.
- When invoking python directly, the current directory is treated as the “main script” location, even when passing an option such as -c or -m.
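The second and third assumptions are easy to observe for yourself. What follows is a minimal sketch rather than anything official: it starts child interpreters and reports what each one sees as sys.path[0]. The helper name first_path_entry and the file name show_path.py are made up for illustration, and the sketch assumes the child is not started with options (such as -P or -I) that change this behavior.

```python
import os
import subprocess
import sys
import tempfile

PRINT_FIRST_ENTRY = "import sys; print(sys.path[0])"


def first_path_entry(args, cwd):
    """Run a child interpreter with `args` in `cwd`; return its sys.path[0]."""
    # Drop variables that alter how sys.path is built, so the child shows
    # the default behavior.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONPATH", "PYTHONSAFEPATH")
    }
    result = subprocess.run(
        [sys.executable, *args],
        cwd=cwd, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as script_dir, \
        tempfile.TemporaryDirectory() as work_dir:
    expected_dir = os.path.realpath(script_dir)
    script = os.path.join(script_dir, "show_path.py")  # hypothetical script
    with open(script, "w") as f:
        f.write(PRINT_FIRST_ENTRY + "\n")

    # Running a script: the script's directory comes first, not the cwd.
    from_script = first_path_entry([script], cwd=work_dir)
    # Running -c: the first entry is the empty string, meaning "the cwd".
    from_dash_c = first_path_entry(["-c", PRINT_FIRST_ENTRY], cwd=work_dir)
    # Running -m from inside script_dir: the first entry is the cwd itself.
    from_dash_m = first_path_entry(["-m", "show_path"], cwd=script_dir)

print("script:", from_script)
print("-c:", repr(from_dash_c))
print("-m:", from_dash_m)
```

In every case, whatever directory ends up first is searched before the standard library and before anything you installed on purpose.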
If you’re running a Python application that’s been installed properly on your computer, the only location outside of your Python install or virtualenv that will be automatically added to your sys.path (by default) is the location where the main executable, or script, is installed.
For example, if you have pip installed in /usr/bin, and you run /usr/bin/pip, then only /usr/bin will be added to sys.path by this feature. Anything that can write files to /usr/bin can already make you, or your system, run stuff, so it’s a pretty safe place. (Consider what would happen if your ls executable got replaced with something nasty.)
However, one emerging convention is to run /path/to/python -m pip in order to avoid the complexities of setting up $PATH properly, and to avoid dealing with divergent documentation of how scripts are installed on Windows (usually as .exe files these days).
This is fine — as long as you trust that you’re the only one putting files into the places you can import from — including your working directory.
Your “Downloads” folder isn’t safe
As the category of attacks with the name “DLL Planting” indicates, there are many ways that browsers (and sometimes other software) can be tricked into putting files with arbitrary filenames into the Downloads folder, without user interaction.
Browsers are starting to take this class of vulnerability more seriously, and are adding various mitigations to avoid allowing sites to surreptitiously drop files in your downloads folder when you visit them. [1]
Even with mitigations though, it will be hard to stamp this out entirely: for example, the Content-Disposition HTTP header’s filename parameter exists entirely to allow the site to choose the filename that it downloads to.
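To make that concrete, here is a small sketch of how a server-chosen name arrives, using the standard library’s email.message.Message to parse a Content-Disposition value. The helper names and the list of suffixes are my own illustration; real browsers apply their own, more involved, rules.

```python
from email.message import Message

# Suffixes that Python may import or execute if the file lands on sys.path.
# This list is illustrative, not exhaustive.
IMPORTABLE_SUFFIXES = (".py", ".pyc", ".pyd", ".so")


def server_chosen_filename(content_disposition):
    """Return the filename that a Content-Disposition header value asks for."""
    message = Message()
    message["Content-Disposition"] = content_disposition
    return message.get_filename()


def looks_importable(filename):
    """True if a downloaded file's name would make it importable by Python."""
    return filename is not None and filename.lower().endswith(IMPORTABLE_SUFFIXES)


name = server_chosen_filename('attachment; filename="pip.py"')
print(name, looks_importable(name))
```

The site, not the user, picked pip.py here, which is all the attack described in the next section needs.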
Composing the attack
You’ve made a habit of using python -m pip to install stuff. You download a Python package from a totally trustworthy website that, for whatever reason, has a Python wheel by direct download instead of on PyPI. Maybe it’s internal, maybe it’s a pre-release; whatever. So you download it, and install it from your downloads folder with python -m pip.
This seems like a reasonable thing to do, but unbeknownst to you, two weeks ago, something dropped a pip.py with some malware in it into your downloads folder.
Here’s a quick demonstration of the attack:
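The sketch below reproduces the situation safely. It is my own reconstruction, not a listing from the original post: a temporary directory stands in for the downloads folder, and the “malware” is a harmless pip.py that only prints a marker. The function name run_pip_module_in is hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for malware: a harmless module that only prints a marker.
FAKE_MALWARE = 'print("this is not the real pip")\n'


def run_pip_module_in(directory):
    """Run `python -m pip --version` with `directory` as the working directory."""
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONPATH", "PYTHONSAFEPATH")
    }
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        cwd=directory, env=env, capture_output=True, text=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as fake_downloads:
    # The attacker-chosen file, sitting next to whatever you just downloaded.
    with open(os.path.join(fake_downloads, "pip.py"), "w") as f:
        f.write(FAKE_MALWARE)
    output = run_pip_module_in(fake_downloads)

print(output)
```

Instead of a pip version string, the child process prints the marker: the pip.py in the working directory won the import.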
Just a few paragraphs ago, I said:
If you’re running a Python application that’s been installed properly on your computer, the only location outside of your Python install or virtualenv that will be automatically added to your sys.path (by default) is the location where the main executable, or script, is installed.
So what is that parenthetical “by default” doing there? What other directories might be added?
Any entries on your $PYTHONPATH environment variable. You wouldn’t put your current directory on $PYTHONPATH, would you?
Unfortunately, there’s one common way that you might have done so by accident.
Let’s simulate a “vulnerable” Python application:
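Make 2 directories: one to hold the application, and attacker_dir. Drop the application in the first one. Then cd attacker_dir and put our sophisticated malware there, under the module name that the application tries to import.
Finally, let’s run it: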
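Here is one way to sketch that setup in Python. The names install_dir, tool.py and optional_extra are placeholders I picked for illustration, and the “malware” just prints a line. The tool is run from attacker_dir with PYTHONPATH left unset.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical application: it tries an optional import and carries on without it.
TOOL_SOURCE = (
    "try:\n"
    "    import optional_extra\n"
    "except ImportError:\n"
    "    print('extra not found')\n"
)
# Harmless stand-in for the attacker's module.
ATTACK_SOURCE = "print('attacker code ran')\n"

with tempfile.TemporaryDirectory() as root:
    install_dir = os.path.join(root, "install_dir")
    attacker_dir = os.path.join(root, "attacker_dir")
    os.mkdir(install_dir)
    os.mkdir(attacker_dir)

    tool = os.path.join(install_dir, "tool.py")
    with open(tool, "w") as f:
        f.write(TOOL_SOURCE)
    with open(os.path.join(attacker_dir, "optional_extra.py"), "w") as f:
        f.write(ATTACK_SOURCE)

    # PYTHONPATH is removed entirely, so only the script's own directory
    # (install_dir) is added to sys.path.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONPATH", "PYTHONSAFEPATH")
    }
    result = subprocess.run(
        [sys.executable, tool],
        cwd=attacker_dir, env=env, capture_output=True, text=True, check=True,
    )
    unset_result = result.stdout.strip()

print(unset_result)
```

The attacker’s module sits in the working directory, but the working directory is not on sys.path, so the import fails harmlessly.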
So far, so good.
But, here’s the common mistake. Most places that still recommend $PYTHONPATH recommend adding things to it by joining the new entry to whatever value is already there, separated by a colon.
Intuitively, this makes sense; if you’re adding project X to your
$PYTHONPATH, maybe project Y had already added something, maybe not; you
never want to blow it away and replace what other parts of your shell startup
might have done with it, especially if you’re writing documentation that lots
of different people will use.
But this idiom has a critical flaw: the first time it’s invoked, if
$PYTHONPATH was previously either empty or un-set, this then includes an
empty string, which resolves to the current directory. Let’s try it:
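A sketch of that experiment, again with placeholder names of my own (install_dir, tool.py, optional_extra, safe_place): the child process gets the value the idiom produces when $PYTHONPATH started out unset, which is the new entry followed by a separator and nothing else.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical application with an optional import.
TOOL_SOURCE = (
    "try:\n"
    "    import optional_extra\n"
    "except ImportError:\n"
    "    print('extra not found')\n"
)
# Harmless stand-in for the attacker's module.
ATTACK_SOURCE = "print('attacker code ran')\n"

with tempfile.TemporaryDirectory() as root:
    install_dir = os.path.join(root, "install_dir")
    attacker_dir = os.path.join(root, "attacker_dir")
    safe_place = os.path.join(root, "safe_place")
    for directory in (install_dir, attacker_dir, safe_place):
        os.mkdir(directory)

    tool = os.path.join(install_dir, "tool.py")
    with open(tool, "w") as f:
        f.write(TOOL_SOURCE)
    with open(os.path.join(attacker_dir, "optional_extra.py"), "w") as f:
        f.write(ATTACK_SOURCE)

    env = {
        key: value
        for key, value in os.environ.items()
        if key != "PYTHONSAFEPATH"
    }
    # New entry, separator, then the (empty) old value: note the blank
    # entry after the trailing separator.
    env["PYTHONPATH"] = safe_place + os.pathsep
    result = subprocess.run(
        [sys.executable, tool],
        cwd=attacker_dir, env=env, capture_output=True, text=True, check=True,
    )
    idiom_result = result.stdout.strip()

print(idiom_result)
```

This time the import succeeds, because the blank entry resolves to the directory the tool was launched from.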
Oh no! Well, just to be safe, let’s empty out
$PYTHONPATH and try it again:
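The reason emptying the variable doesn’t help can be sketched without running anything. The function below is a stand-in of my own for the shell idiom, and the directory name is hypothetical: in a shell, an unset variable and an empty one both expand to nothing, so both lead to the same value.

```python
import os


def naive_prepend(new_entry, existing):
    """Mimic the shell idiom: new entry, separator, then the old value.

    `existing=None` models an unset variable and `existing=""` an empty
    one; a shell expands both to nothing.
    """
    return new_entry + os.pathsep + (existing or "")


safe = os.path.abspath("a_perfectly_safe_place")  # hypothetical directory
from_unset = naive_prepend(safe, None)
from_empty = naive_prepend(safe, "")

print(from_unset == from_empty)
print(from_empty.split(os.pathsep))
```

Both values end with a separator, so both carry a trailing blank entry.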
Still not safe!
What’s happening here is that PYTHONPATH being empty is not the same thing as it being unset. From within Python, this is the difference between os.environ.get("PYTHONPATH") == "" and os.environ.get("PYTHONPATH") is None.
If you want to be sure you’ve cleared $PYTHONPATH from a shell (or somewhere in a shell startup), you need to use the unset command.
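The same distinction shows up when you launch child processes from Python. This is a small sketch with helper names of my own choosing: one function reports which of the states a mapping is in, and the other removes the variable entirely, which is the Python-side analogue of unsetting it.

```python
import os


def describe_pythonpath(environ):
    """Classify PYTHONPATH in an environment mapping."""
    value = environ.get("PYTHONPATH")
    if value is None:
        return "unset"
    if value == "":
        return "empty"
    return "set"


def without_pythonpath(environ):
    """Return a copy of `environ` with PYTHONPATH removed, not just blanked."""
    cleaned = dict(environ)
    cleaned.pop("PYTHONPATH", None)
    return cleaned


print(describe_pythonpath({"PYTHONPATH": ""}))
print(describe_pythonpath(without_pythonpath({"PYTHONPATH": ""})))
```

Passing without_pythonpath(os.environ) as the env argument to subprocess.run gives the child a genuinely unset variable.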
PYTHONPATH used to be the most common way to set up a Python
development environment; hopefully it’s mostly fallen out of favor, with
virtualenvs serving this need better. If you’ve got an old shell configuration
that still sets a
$PYTHONPATH that you don’t need any more, this is a good
opportunity to go ahead and delete it.
However, if you do need an idiom for adding to PYTHONPATH in a shell startup, use one that only emits the separating colon when $PYTHONPATH already has a value. In both bash and zsh, this results in no extra colons or blank entries on your $PYTHONPATH variable.
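If you build PYTHONPATH from Python, for instance when preparing the environment of a child process, the same rule can be sketched like this. The function name prepend_pythonpath is my own; it joins the new entry with the existing value only when there is one, and drops any blank entries that are already present.

```python
import os


def prepend_pythonpath(new_entry, existing):
    """Join `new_entry` with the non-blank entries of `existing`.

    `existing` may be None (unset) or "" (empty); neither produces a
    stray separator or a blank entry.
    """
    entries = [new_entry]
    if existing:
        entries.extend(entry for entry in existing.split(os.pathsep) if entry)
    return os.pathsep.join(entries)


first = os.path.abspath("new_entry_1")   # hypothetical directories
second = os.path.abspath("new_entry_2")

value = prepend_pythonpath(first, os.environ.get("PYTHONPATH"))
print(value)
```

Calling it repeatedly keeps stacking entries without ever introducing an empty one.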
Finally: if you’re still using $PYTHONPATH, be sure to always use absolute paths!
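A quick way to audit an existing value is sketched below. The function name risky_entries is hypothetical; it flags every entry that is blank or relative, since both depend on whatever the working directory happens to be.

```python
import os


def risky_entries(pythonpath):
    """Return the entries of a PYTHONPATH value that are blank or relative."""
    return [
        entry
        for entry in pythonpath.split(os.pathsep)
        if not os.path.isabs(entry)
    ]


absolute = os.path.abspath("some_project")  # hypothetical directory
current = os.environ.get("PYTHONPATH")
if current is not None:
    print(risky_entries(current))
```

An empty list means every entry is pinned to a fixed location.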
There are a bunch of variant unsafe behaviors related to inspecting files in your Downloads folder by doing anything interactive with Python. Other risky activities:
- Running python ~/Downloads/anything.py (even if anything.py is itself safe) from anywhere, as it will add your downloads folder to sys.path by virtue of the script’s location.
- Jupyter Notebook puts the directory that the notebook is in onto sys.path, just like Python puts the script directory there. So jupyter notebook ~/Downloads/anything.ipynb is just as dangerous as python ~/Downloads/anything.py.
Get those scripts and notebooks out of your downloads folder before you run ’em!
cd Downloads and then doing anything interactive remains a problem too:
- Running a python -c command that includes an import statement while in your ~/Downloads folder
- Running python interactively and importing anything while in your ~/Downloads folder
~/Downloads/ isn’t special; it’s just one place where unexpected files with attacker-chosen filenames might sneak in. Be on the lookout for other locations where this is true. For example, if you’re administering a server where the public can upload files, make extra sure that neither your application nor any administrator who might run python ever uses the upload directory as a working directory.
Maybe consider changing the code that handles uploads to mangle file names to put a .uploaded at the end, avoiding the risk of a .py file getting uploaded and executed accidentally.
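A minimal sketch of that kind of mangling follows. The function name and the "unnamed" fallback are my own choices, and a real upload handler would also need to deal with collisions, length limits and so on.

```python
import os


def mangle_upload_name(client_name):
    """Return a storage name that Python will not treat as a module."""
    # Keep only the final path component, so the client cannot choose a
    # directory, whichever slash style it used.
    base = os.path.basename(client_name.replace("\\", "/"))
    # Appending a suffix means the stored file never ends in .py.
    return (base or "unnamed") + ".uploaded"


print(mangle_upload_name("pip.py"))
print(mangle_upload_name("../../site-packages/pip.py"))
```

The stored file keeps a recognizable name for humans, but the import system no longer has any reason to pick it up.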
If you have tools written in Python that you want to use while in your downloads folder, make a habit of preferring typing the path to the script (/path/to/venv/bin/pip) rather than the module (/path/to/venv/bin/python -m pip).
In general, just avoid ever having
~/Downloads as your current working
directory, and move any software you want to use to a more appropriate location
before launching it.
It’s important to understand where Python gets the code that it’s going to be executing. Giving someone the ability to execute even one line of arbitrary Python is equivalent to giving them full control over your computer!
Why I wrote this article
When writing a “tips and tricks” article like this about security, it’s very easy to imply that I, the author, am very clever for knowing this weird bunch of trivia, and the only way for you, the reader, to stay safe, is to memorize a huge pile of equally esoteric stuff and constantly be thinking about it. Indeed, a previous draft of this post inadvertently did just that. But that’s a really terrible idea and not one that I want to have any part in propagating.
So if I’m not trying to say that, then why post about it? I’ll explain.
Over many years of using Python, I’ve infrequently, but regularly, seen users
confused about the locations that Python loads code from. One variety of this
confusion is when people put their first program that uses Twisted into a file
named twisted.py. That shadows the import of the library, breaking
everything. Another manifestation of this confusion is a slow trickle of
confused security reports where a researcher drops a module into a location
where Python is documented to load code from (like the current directory in
the scenarios described above) and then loads it, thinking that this reflects
an exploit because it’s executing arbitrary code.
Any confusion like this — even if the system in question is “behaving as intended”, and can’t readily be changed — is a vulnerability that an attacker can exploit.
System administrators and developers are high-value targets in the world of cybercrime. If you hack a user, you get that user’s data; but if you hack an admin or a dev, and you do it right, you could get access to thousands of users whose systems are under the administrator’s control or even millions of users who use the developers’ software.
Therefore, while “just be more careful all the time” is not a sustainable recipe for safety, to some extent, those of us acting on our users’ behalf do have a greater obligation to be more careful. At least, we should be informed about the behavior of our tools. Developer tools, like Python, are inevitably power tools which may require more care and precision than the average application.
Nothing I’ve described above is a “bug” or an “exploit”, exactly; I don’t think that the developers of Python or Jupyter have done anything wrong; the system works the way it’s designed and the way it’s designed makes sense. I personally do not have any great ideas for how things could be changed without removing a ton of power from Python.
One of my favorite safety inventions is the SawStop. Nothing was wrong with the way table saws worked before its invention; they were extremely dangerous tools that performed an important industrial function. A lot of very useful and important things were made with table saws. Yet, it was also true that table saws were responsible for a disproportionate share of wood-shop accidents, and, in particular, lost fingers. Despite plenty of care taken by experienced and safety-conscious carpenters, the SawStop still saves many fingers every year.
So by highlighting this potential danger I also hope to provoke some thinking among some enterprising security engineers out there. What might be the SawStop of arbitrary code execution for interactive interpreters? What invention might be able to prevent some of the scenarios I describe above without significantly diminishing the power of tools like Python?
Stay safe out there, friends.
Any errors remain my own.
[1] Restricting which sites can drive-by drop files into your downloads folder is a great security feature, except the main consequence of adding it is that everybody seems to be annoyed by it, not understand it, and want to turn it off.