internal fragmentation

a personal journal of hacking, science, and technology

Python optimization note: function calls are slow

Mon, 26 Jul 2010 10:01 by mpm in Uncategorized (link)

In Mercurial, we do a lot of gathering and filtering of long lists of things – filenames, revision numbers, etc. A typical pattern is something like this:

l = []
for f in files:
    if f.startswith(foo):
        l.append(f)

Usually it’s more involved than that – there’s often multiple lists to track and several tests, so a simple comprehension won’t help. And usually performance of these loops is pretty critical. If we can shave 10% off a call to status when there are 50,000 files, we’ll do it. So one of the first things we do is:

l = []
la = l.append
for f in files:
    if f.startswith(foo):
        la(f)

This saves a lookup in the inner loop, and that’s often a very measurable win. Another pattern is that sometimes one or more of the lists isn’t going to be needed, so we’ll avoid appending to it at all. So is it better to call a dummy function or stick an if statement in the inner loop? Let’s consider variations on this simple test:

a = []
f = False
aa = a.append
for x in xrange(1000000):
    if f:
        aa(x)

Let’s explore some possibilities and time them with timeit (Python 2.6.5):

append function flag test time
None True if f: a.append 216 msec
None True a.append 197 msec
a.append True if f: la(x) 133 msec
a.append False if f: la(x) 63 msec
a.append True f and la(x) 133 msec
a.append False f and la(x) 65 msec
a.append True la(x) 113 msec
lambda x: None False la(x) 239 msec
id False la(x) 110 msec

The fastest way to add to a list (flag is True) is with a pre-looked-up list append function and no test.  But the fastest way to not add to our list is with the if test. And both of these are significantly faster than the naive ‘if f: a.append(x)’ approach.

Thus, if we expect to usually be adding to a given list, we should set la to a handy dummy function like the ‘id’ built-in when our flag tells us not to gather to a list. This is much faster than the obvious lambda function by virtue of being a trivial built-in, and gives us a ‘balanced’ 113/110 msec split between keep and discard. But if we usually don’t want a list (for instance, Mercurial usually doesn’t report ‘clean’ files), then an if statement is best, with a 133/63 keep/discard split.

1 Comment

  1. […] This post was mentioned on Twitter by alejolp and Yury Yurevich, Matt Mackall. Matt Mackall said: Some Python micro-optimization investigation: http://www.selenic.com/blog/?p=656 […]

    Pingback by Tweets that mention Python optimization note: function calls are slow « internal fragmentation -- Topsy.com — Mon, 26 Jul 2010 @ 12:31

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.