Declaratively

Insecure states should be unrepresentable.

This weekend a catastrophic bug in log4j2 was disclosed, leading to the potential for remote code execution on a huge number of unpatched endpoints.

In this specific case, it turns out there was not really any safe way to use the API. Initially it might appear that the issue was the treatment of an apparently fixed format string as a place to put variable user-specified data, but as it turns out it just recursively expands the log data forever, looking for code to execute. So perhaps the lesson here is nothing technical, just that we should remain ready to patch, or that we should pay the maintainers.

Still, it’s worth considering that injection vulnerabilities of this type exist pretty much everywhere, usually in places where the supposed defense against getting catastrophically RCE’d is to carefully remember that the string that you pass in isn’t that kind of string.

While not containing anything nearly so pernicious as a place to put a URL that lets you execute arbitrary attacker-controlled code, Python’s logging module does contain a fair amount of confusing indirection around its log message. Sometimes — if you’re passing a non-zero number of *args — the parts of the logging module will interpret msg as a format string; other times it will interpret it as a static string. This is in some sense a reasonable compromise; you can have format strings and defer formatting if you want, but also log.warning(f"hi, {attacker_controlled_data}") is fairly safe by default. It’s still a somewhat muddled and difficult to document situation.

Similarly, Twisted’s logging system does always treat its string argument as a format string, which is more consistent. However, it does let attackers put garbage into the log wherever the developer might not have understood the documentation.1

This is to say nothing of the elephant in the room here: SQL. Almost every SQL API takes a bunch of strings, and the ones that make you declare an object in advance (i.e. Java’s PreparedStatement) don’t mind at all if you create one at runtime.

In the interest of advancing the state of the art just a little here, I’d like to propose a pattern to encourage the idiomatic separation of user-entered data (i.e. attacker-controlled payloads) from pre-registration of static, sensitive data, whether it’s SQL queries, format strings, static HTML or something else. One where copying and pasting examples won’t instantly subvert the intended protection. What I suggest would look like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
# module scope
with sql_statements.declarations() as d:
    create_table = d.declare("create table foo (bar int, baz str)")
    save_foo = d.declare("insert into foo values (?, ?)")
    load_by_bar = d.declare("select * from foo where bar = :bar")

# later, inside a function
con = sqlite3.connect(":memory:")
cur = con.cursor()
create_table.run(cur)
save_foo.run(cur, 3, "hello")
save_foo.run(cur, 4, "goodbye")
print((list(load_by_bar.run(cur, bar=3))))

The idea here is that sql_statements.declarations() detects which module it’s in, and only lets you write those declarations once. Attempting to stick that inside your function and create some ad-hoc formatted string should immediately fail with a loud exception; copying this into the wrong part of your code just won’t work, so you won’t have a chance to create an injection vulnerability.

If this idea appeals to you, I’ve written an extremely basic prototype here on github and uploaded it to PyPI here.


  1. I’m not dropping a 0day on you, there’s not a clear vulnerability here; it only lets you draw data from explicitly-specified parameters into the log. If you use it wrong, you just might get an "Unable to format event" type error, which we'll go out of our way to not raise back to you as an exception. It just makes some ugly log messages. 

Beyond ThunderDock

I Plugged Some Stuff Into A Thunderbolt Dock. You Won’t Believe what Happens Next

This weekend I found myself pleased to receive a Kensington SD5000T Thunderbolt 3 Docking Station3.

Some of its functionality was a bit of a weird surprise.

The Setup

Due to my ... accretive history with computer purchases, I have 3 things on my desk at home: a USB-C macbook pro, a 27" Thunderbolt iMac, and an older 27" Dell display, which is old enough at this point that I can’t link it to you. Please do not take this to be some kind of totally sweet setup. It would just be somewhat pointlessly expensive to replace this jumble with something nicer3. I purchased the dock because I want to have one cable to connect me to power & both displays.

For those not familiar, iMacs of a certain vintage1 can be jury-rigged to behave as Thunderbolt displays with limited functionality (no access from the guest system to the iMac’s ethernet port, for example), using Target Display Mode, which extends their useful lifespan somewhat. (This machine is still, relatively speaking, a powerhouse, so it’s not quite dead yet; but it’s nice to be able to swap in my laptop and use the big screen.)

On the back of the Thunderbolt dock, there are 2 Thunderbolt 3 ports. I plugged the first one into a Thunderbolt 3 to Thunderbolt 2 adapter which connects to the back of the iMac, and the second one into the Macbook directly. The Dell display plugs into the DisplayPort; I connected my network to the Ethernet port of the dock. My mouse, keyboard, and iPhone were plugged into the USB ports on the dock.

The Problem

I set it up and at first it seemed to be delivering on the “one cable” promise of thunderbolt 3. But then I switched WiFi off to test the speed of the wired network and was surprised to see that it didn’t see the dock’s ethernet port at all. Flipping wifi back on, I looked over at my router’s control panel and noticed that a new device (with the expected manufacturer) was on my network. nmap seemed to indicate that it was... running exactly the network services I expected to see on my iMac. VNCing into the iMac to see what was going on, I popped open the Network system preference pane, and right there alongside all the other devices, was the thunderbolt dock’s ethernet device.

The Punch Line

Despite the miasma of confusion surrounding USB-C and Thunderbolt 32, the surprise here is that apparently Thunderbolt is Thunderbolt, and (for this device at least) Thunderbolt devices connected across the same bus can happily drive whatever they’re plugged in to. The Thunderbolt 2 to 3 adapter isn’t just a fancy way of plugging in hard drives and displays with the older connector; as far as I can tell all the functionality of the Thunderbolt interface remains intact as both “host” and “guest”. It’s like having an ethernet switch for your PCI bus.

What this meant is that when I unplugged everything and then carefully plugged in the iMac before the Macbook, it happily lit up the Dell display, and connected to all the USB devices plugged into the USB hub. When I plugged the laptop in, it happily started charging, but since it didn’t “own” the other devices, nothing else connected to it.

Conclusion

This dock works a little bit too well; when I “dock” now I have to carefully plug in the laptop first, give it a moment to grab all the devices so that it “owns” them, then plug in the iMac, then use this handy app to tell the iMac to enter Target Display mode.

On the other hand, this does also mean that I can quickly toggle between “everything is plugged in to the iMac” and “everything is plugged in to the MacBook” just by disconnecting and reconnecting a single cable, which is pretty neat.


  1. Sadly, not the most recent fancy 5K ones. 

  2. which are, simultaneously, both the same thing and not the same thing. 

  3. Paid links. See disclosures

As some of you may have guessed from the unintentional recent flurry of activity on my Twitter account, twitter feed, the service I used to use to post blog links automatically, is getting end-of-lifed. I've switched to dlvr.it for the time being, unless they send another unsolicited tweetstorm out on my behalf...

Sorry about the noise! In the interests of putting some actual content here, maybe you would be interested to know that I was recently interviewed for PyDev of the Week?

Probably best to get this out of the way before this weekend:

If I meet you at a technical conference, you’ll probably see me extend my elbow in your direction, rather than my hand. This is because I won’t shake your hand at a conference.

People sometimes joke about “con crud”, but the amount of lost productivity and human misery generated by conference-transmitted sickness is not funny. Personally, by the time the year is out, I will most likely have attended 5 conferences. This means that if I get sick at each one, I will spend more than a month out of the year out of commission being sick.

When I tell people this, they think I’m a germophobe. But, in all likelihood, I won’t be the one getting sick. I already have 10 years of building up herd immunity to the set of minor ailments that afflict the international Python-conference-attending community. It’s true that I don’t particularly want to get sick myself, but I happily shake people’s hands in more moderately-sized social gatherings. I’ve had a cold before and I’ve had one again; I have no illusion that ritually dousing myself in Purell every day will make me immune to all disease.

I’m not shaking your hand because I don’t want you to get sick. Please don’t be weird about it!

Remember that thing I said in my pycon talk about native packaging being the main thing to worry about, and single-file binaries being at best a stepping stone to that and at worst a bit of a red herring? You don’t have to take it from me. From the authors of a widely-distributed command-line application that was rewritten from Python into Go specifically for easier distribution, and then rewritten in Python:

... [the] majority of people prefer native packages so distributing precompiled binaries wasn’t a big win for this type of project1 ...

I don’t want to say “I told you so”, but... no. Wait a second. That is exactly what I want to do. That is what I am doing.

I told you so.