internal fragmentation

a personal journal of hacking, science, and technology

Python optimization note: function calls are slow

Mon, 26 Jul 2010 10:01 by mpm in Uncategorized (link)

In Mercurial, we do a lot of gathering and filtering of long lists of things – filenames, revision numbers, etc. A typical pattern is something like this:

l = []
for f in files:
    if f.startswith(foo):

Usually it’s more involved than that – there’s often multiple lists to track and several tests, so a simple comprehension won’t help. And usually performance of these loops is pretty critical. If we can shave 10% off a call to status when there are 50,000 files, we’ll do it. So one of the first things we do is:

l = []
la = l.append
for f in files:
    if f.startswith(foo):

This saves a lookup in the inner loop, and that’s often a very measurable win. Another pattern is that sometimes one or more of the lists isn’t going to be needed, so we’ll avoid appending to it at all. So is it better to call a dummy function or stick an if statement in the inner loop? Let’s consider variations on this simple test:

a = []
f = False
aa = a.append
for x in xrange(1000000):
    if f:

Let’s explore some possibilities and time them with timeit (Python 2.6.5):

append function flag test time
None True if f: a.append 216 msec
None True a.append 197 msec
a.append True if f: la(x) 133 msec
a.append False if f: la(x) 63 msec
a.append True f and la(x) 133 msec
a.append False f and la(x) 65 msec
a.append True la(x) 113 msec
lambda x: None False la(x) 239 msec
id False la(x) 110 msec

The fastest way to add to a list (flag is True) is with a pre-looked-up list append function and no test.  But the fastest way to not add to our list is with the if test. And both of these are significantly faster than the naive ‘if f: a.append(x)’ approach.

Thus, if we expect to usually be adding to a given list, we should set la to a handy dummy function like the ‘id’ built-in when our flag tells us not to gather to a list. This is much faster than the obvious lambda function by virtue of being a trivial built-in, and gives us a ‘balanced’ 113/110 msec split between keep and discard. But if we usually don’t want a list (for instance, Mercurial usually doesn’t report ‘clean’ files), then an if statement is best, with a 133/63 keep/discard split.

Mercurial Fellowship status for June

Thu, 15 Jul 2010 17:53 by mpm in mercurial (link)

As I announced back in April, I’ve been putting together funding to work on Mercurial development and project leadership full-time for the next year without the usual distractions like finding support clients or working on embedded Linux products. That’s more or less come together, and I started working at the beginning of June. You can see most of my awesome sponsors here though there’s still one big one that the paperwork isn’t finished with yet that I’ll hopefully be announcing shortly.

Meanwhile, our project’s nonprofit home, the Software Freedom Conservancy has asked that I post some monthly status reports to give some visibility into what it is I actually do. So here’s a quick summary of what I did in June:

  • 1.5.4 release
  • 1.6 release code freeze
  • reviewed and merged 144 changesets
  • authored 61 changesets
  • introduced two major features (pushable bookmarks and revsets)
  • numerous bug fixes and other improvements
  • some wiki improvements (like the Trust page)
  • 5 weekly meetings with our GSoC students and related mentoring
  • 123 messages to the mercurial-devel list (out of 929)
  • 65 to the main list (out of 909)
  • 3 blog posts
  • daily “office hours” for developers and users on IRC
  • a Mercurial live video interview on FLOSS Weekly

The 1.6 release of course came out right on time on July 1st, and I’m pretty pleased with it.

Unifying the Mercurial wire protocol

17:18 by mpm in Uncategorized (tagged ) (link)

One Crazy Summer

The Mercurial ‘wire protocol’ is the set of commands that Mercurial uses to discover changesets to push and pull and actually exchange changesets with remote hosts. It was developed surprisingly rapidly, given that everything else in the project was also brand new:

  • April 20 2005: first Mercurial release
  • April 21 2005: static http pull support
  • May 10 2005: changeset bundle format created
  • May 11 2005: hgweb merged
  • May 12 2005: added smart http protocol to hgweb
  • May 25 2005: hg serve stand-alone server added
  • June 12 2005: crazy tunnel-http-over-ssh push hack
  • July 5 2005: ssh smart server support
  • July 6 2005: real ssh push

And that gets us up to the 0.6b release, less than three months after the 0.1 release.

Steady As She Goes

Because ssh push was good enough for most of our users at the time, not much happened with our wire protocol until about a year later when we finally got around to working on http push:

  • June 10 2006: add capability detection to protocol
  • June 15 2006: new push approach for ssh
  • June 20 2006: real http/https push
  • July 14 2006: add support for –uncompressed clones for fast networks

And again, the wire protocol hasn’t changed much since then. A branchmap command was added in May 2009 to improve named branch support, and pushkey was added a couple weeks ago to support bookmarks. We’ve also been careful to keep the protocol backwards compatible: Mercurial 0.5 can interoperate with even the most recent versions.

Preparing for the New Frontier

But there are a number of big projects on the horizon that will require extending the wire protocol, including:

  • a new, faster discovery algorithm
  • improved compression with parent deltas
  • support for shallow clones
  • support for lightweight copies

Which means its high time that the wire protocol got cleaned up! Currently adding a new command to the protocol means adding code in four different places: the ssh client code, the ssh server code, the http client code, and the http server code. And those interfaces aren’t exactly pretty. Here’s what the old ssh and http server support for lookup looks like:

    def do_lookup(self):
        arg, key = self.getarg()
        assert arg == 'key'
            r = hex(self.repo.lookup(key))
            success = 1
        except Exception, inst:
            r = str(inst)
            success = 0
        self.respond("%s %s\n" % (success, r))

def lookup(repo, req):
        r = hex(repo.lookup(req.form['key'][0]))
        success = 1
    except Exception, inst:
        r = str(inst)
        success = 0
    resp = "%s %s\n" % (success, r)
    req.respond(HTTP_OK, HGTYPE, length=len(resp))
    yield resp

Fortunately, we’ve been careful to keep the ssh and http ‘calling convention’ very similar. The bulk of the code is similar but there are some basic differences in how arguments are retrieved and how results are returned. Similar things need to be changed on the client side. Adding new commands meant writing one protocol implementation, then copy&pasting it into the other protocol and tweaking it.


So a long-standing wishlist item has been to unify these implementations, which is what I’ve been working on this week. Now the above command is implemented just once like this:

def lookup(repo, proto, key):
         r = hex(repo.lookup(key))
         success = 1
    except Exception, inst:
         r = str(inst)
         success = 0
    return "%s %s\n" % (success, r)

Note that the output is now delivered via return and the argument is now passed in as a parameter. A central dispatcher now handles argument unpacking and delivering results and has a table of all the function parameter lists (including support for variable-length argument lists!). Each function also has a ‘proto’ option which gives it access to an abstracted interface for various http- or ssh-specific methods, like doing bulk data transfers.  Similarly, the client-side support is also unified, and both the client and server support is now all in one file: This will hopefully make future extensions to the wire protocol much less painful.

The tricky part of course with a big change like this is how to break it down into pieces. Simply trying to refactor it all in one go might be possible, but it’s much safer to do it in a step by step process, testing each step along the way. So I started by making a new ssh argument parser, hooking that into a generic command dispatcher, and hooking that dispatcher into the existing ssh infrastructure. Now I could move commands out of the ssh framework and into the generic framework one by one. Once I’d moved a bunch of the simpler ssh commands, I went and tackled merging them on the http side with a similar approach. Overall, the process took a total of 18 changesets.