internal fragmentation

a personal journal of hacking, science, and technology

Extending the Mercurial protocol with pushkey and bookmarks

Mon, 28 Jun 2010 14:17 by mpm in mercurial (link)

Historically, Mercurial can send and receive precisely one type of data when pushing and pulling: changesets. As just about everything that you’d ever want to push and pull is recorded in a project’s history (including tags and branches), this works out very cleanly and simply.

But recently (OK, 2008), we borrowed a concept from Git: lightweight markers that aren’t part of history and leave no trace when they’re added, changed, or removed. Git calls these ‘branches’, but Mercurial already has a notion of branches that’s much more in keeping with the permanent markers used in other systems (and with Mercurial’s general philosophy of permanent history), so Mercurial has named these ‘bookmarks’. Like real bookmarks, Mercurial bookmarks mark your place, but can be moved around or removed without altering the book.

The only problem was that there was no way to push and pull these things in the protocol, so sharing your bookmarks with other users was tedious. And as sometimes happens with open source, we had a plan for a solution, but the implementor became too busy with other things for quite a while. Fortunately, extending Mercurial’s protocol isn’t terribly difficult, so a couple of weeks ago I sat down to do it myself. Here’s how I did it.

First, I introduced a generic notion called “pushkeys”. Pushkeys are key/value pairs that can be pushed to the server and listed back. There are multiple namespaces, and any service or extension can register a namespace. There’s also a special “namespaces” namespace that shows the registered namespaces. To set a key, you also have to send along the old value to avoid races. The whole thing is about 30 lines long and is introduced in this changeset.
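To make the old-value check concrete, here is a minimal in-memory sketch of the idea. It is not Mercurial's code: the class name PushkeyStore and its methods are hypothetical, and it assumes that an empty string stands for "no value" (which matches how the debugpushkey session further down deletes a bookmark by setting it to '').

```python
class PushkeyStore:
    """Toy model of namespaced key/value pairs (hypothetical, not Mercurial's API)."""

    def __init__(self):
        # namespace name -> {key: value}
        self._namespaces = {}

    def register(self, namespace):
        """Make a namespace available; a service or extension would call this."""
        self._namespaces.setdefault(namespace, {})

    def listkeys(self, namespace):
        """Return a copy of the key/value pairs in a namespace."""
        if namespace == "namespaces":
            # The special namespace lists everything registered, itself included.
            names = set(self._namespaces) | {"namespaces"}
            return {name: "" for name in names}
        return dict(self._namespaces.get(namespace, {}))

    def pushkey(self, namespace, key, old, new):
        """Set key to new only if its current value still equals old."""
        table = self._namespaces.get(namespace)
        if table is None:
            return False
        if table.get(key, "") != old:
            return False  # somebody else changed the key first
        if new == "":
            table.pop(key, None)  # assumption: empty new value deletes the key
        else:
            table[key] = new
        return True


store = PushkeyStore()
store.register("bookmarks")
node = "ff2704a8ded37fbebd8b6eb5ec733731d725da8a"

created = store.pushkey("bookmarks", "test-bookmark", "", node)
stale = store.pushkey("bookmarks", "test-bookmark", "", node)  # wrong old value
listed = store.listkeys("bookmarks")
deleted = store.pushkey("bookmarks", "test-bookmark", node, "")
```

The second call fails because the caller's idea of the old value is out of date, which is exactly the race the old-value argument is there to catch.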

Next we add the pushkey capability and pushkey/listkey command support to the local repository, the ssh client and server, and the http client and server. Finally, the debugpushkey command lets us test the new support from the command line:

$ hg debugpushkey ~/hg namespaces
bookmarks
namespaces
$ hg debugpushkey ~/hg bookmarks
test-bookmark    ff2704a8ded37fbebd8b6eb5ec733731d725da8a
$ hg debugpushkey ~/hg bookmarks test-bookmark ff2704a8ded37fbebd8b6eb5ec733731d725da8a ''
True
$ hg debugpushkey ~/hg bookmarks
$ 

The above shows off the bookmark namespace service for pushkeys, which is added in the next changeset. Now all that remains is to add client support for bookmarks. This has a few different pieces:

  • default push and pull behavior: update bookmarks that are already present on both sides
  • push and pull -B: copy over specific bookmarks while pushing or pulling
  • in and out -B: list unique bookmarks on the client or server
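The bookmark-matching part of these behaviors can be sketched with plain dictionaries mapping bookmark names to changeset IDs. This is only an illustration under simplifying assumptions: the function names are made up, the IDs are placeholders, and it ignores any check on how the two changesets relate to each other before a bookmark is moved.

```python
def bookmarks_to_update(local, remote):
    """Bookmarks present on both sides whose targets differ (pull direction)."""
    return {name: remote[name]
            for name in local
            if name in remote and local[name] != remote[name]}


def unique_bookmarks(here, there):
    """Names of bookmarks that exist in `here` but not in `there`."""
    return sorted(set(here) - set(there))


# Placeholder data: names and IDs are invented for the example.
local = {"feature": "1111", "scratch": "2222"}
remote = {"feature": "3333", "release": "4444"}

updates = bookmarks_to_update(local, remote)   # what a default pull would move
incoming = unique_bookmarks(remote, local)     # roughly what "in -B" would list
outgoing = unique_bookmarks(local, remote)     # roughly what "out -B" would list
```

Only "feature" is touched by the default behavior, since it is the one bookmark both sides already know about; "release" and "scratch" would need an explicit -B.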

These changes can be found starting here. The improved bookmark support will show up in Mercurial 1.6, due to be released on July 1st, but you can download a copy for testing today. Note that you’ll also need bookmark support enabled on your server (services like Google Code and Bitbucket may take a bit to catch up!).

There’s more to be done with bookmarks support, and there are a lot of other interesting things that can be done pretty easily with the pushkey protocol (for instance, those horrible advisory locks CVS folks always think they want). I’m sure we’ll be seeing some interesting ideas appear here soon.

Mapping import graphs

Sat, 5 Jun 2010 11:42 by mpm in mercurial (link)

Occasionally when adding features to Mercurial, we’ll run into interesting circular dependencies. For instance, repositories want to know about subrepositories, which naturally want to know about repositories. Often this manifests as a traceback when trying to import. And because the main Mercurial tool uses demand-loading to avoid wasting time on unused imports, this often isn’t caught until we run the test suite (which has a couple non-demand-loading scripts) or even later.
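To see why demand-loading hides these failures, consider a stripped-down lazy module proxy. This is not Mercurial's demand-loading code, just a sketch of the general technique; the LazyModule class and the module name no_such_module_xyz are hypothetical.

```python
import importlib


class LazyModule:
    """Stand-in that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for names not found on the proxy itself,
        # i.e. for the contents of the real module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


json_mod = LazyModule("json")              # nothing is imported yet
broken = LazyModule("no_such_module_xyz")  # no error yet either

encoded = json_mod.dumps([1, 2])           # the real import happens here

try:
    broken.anything                        # the failure only shows up on use
    failed_late = False
except ImportError:
    failed_late = True
```

Creating the proxy for a broken import succeeds silently; the error only appears when some code path finally touches the module, which may be long after startup.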

Fixing these problems usually involves moving some code around to make the dependencies more hierarchical. But figuring out what the hierarchy looks like can be tricky: there are about 70 modules in the core. With the help of Graphviz and the following quick helper script, we can start to get a handle on things:

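import re, sys

# Module names to leave out of the graph (space-separated; empty by default)
ignore = set("".split())

# Only draw edges between modules named on the command line
watch = set()
for fn in sys.argv[1:]:
    watch.add(fn[:-3])  # strip the ".py" suffix

print("digraph G {")
for fn in sys.argv[1:]:
    mod = fn[:-3]
    if mod in ignore:
        continue
    with open(fn) as f:
        for l in f:
            # "import a, b" or "import a as b"
            m = re.match(r"^\s*import (.*?)( as .*)?$", l)
            if not m:
                # "from a import b": the edge goes to a
                m = re.match(r"^\s*from (.*?)( import .*)?$", l)
            if m:
                for i in m.group(1).split(","):
                    i = i.strip()
                    if i in ignore or i not in watch:
                        continue
                    print('"%s" -> "%s"' % (mod, i))
print("}")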

Here’s what that looks like:

Ouch! With a bit of ignoring of some of the leaf nodes and minor modules, we can reduce it to this:

And finally, by using Graphviz’ tred filter (which removes edges that are reachable through other routes) and ignoring explicitly delayed imports, we can get down to something nice and tidy like this:
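For readers who want to see what that edge removal amounts to, here is a small pure-Python sketch of transitive reduction. It is not how tred is implemented, it assumes the graph has no cycles, and the module names in the example are placeholders.

```python
def reachable(graph, start, skip_edge):
    """Nodes reachable from start without ever following skip_edge."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if (node, nxt) == skip_edge or nxt in seen:
                continue
            seen.add(nxt)
            stack.append(nxt)
    return seen


def transitive_reduction(graph):
    """Drop each edge a -> b when b is still reachable from a without it.

    Assumes `graph` (a dict of node -> list of successors) is acyclic.
    """
    reduced = {}
    for a, targets in graph.items():
        reduced[a] = sorted(b for b in targets
                            if b not in reachable(graph, a, (a, b)))
    return reduced


# Placeholder import graph: "top" imports "mid" and "leaf", "mid" imports "leaf".
imports = {"top": ["mid", "leaf"], "mid": ["leaf"], "leaf": []}
tidy = transitive_reduction(imports)
```

The direct edge from "top" to "leaf" disappears because "leaf" is already reachable through "mid", so the picture keeps the same ordering information with fewer arrows.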

Now it’s fairly easy to find the trouble spots and think about how to fix them. Ideally all the arrows in this graph would point downward, but we can easily see a couple that point upward for help and templatekw.
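Spotting the trouble by eye works for a graph this size, but the circular imports themselves can also be found mechanically. The following is a minimal depth-first-search sketch, not part of the original script; the function name find_cycle and the two-module example graph are assumptions made for illustration.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we have come full circle
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical two-module example of a circular import.
cycle = find_cycle({"repo": ["subrepo"], "subrepo": ["repo"]})
no_cycle = find_cycle({"top": ["leaf"], "leaf": []})
```

Feeding it the edges printed by the helper script (parsed back into a dictionary) would report one offending loop at a time, which can then be broken by moving code or delaying an import.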

FLOSS Weekly

Wed, 2 Jun 2010 11:16 by mpm in Uncategorized (link)

I’ll be appearing “live” on the FLOSS Weekly show today at 11:30 CST (i.e., in a few minutes), talking about Mercurial. If you can’t tune in to the live video recording, you can watch it later in the podcast edition.

Update: the video can be found here.