Nested repositories

Our intention is to integrate a subset of the functionality of the ForestExtension into the core of Mercurial, while maintaining simplicity. This isn't quite a design document: it's more an exploration of the different design decisions that might make sense, and what the tradeoffs are.


1. Similar concepts in other systems

git

svn

2. Goals

The goal is to be able to use multiple repositories as a single, loosely coupled unit. A "parent" has a notion of several "modules" that live under it. In at least some cases, performing a command in the parent should affect the modules.

By "loosely coupled", we mean that repositories are largely independent.

Relationships are hierarchical and one-to-many: a parent knows about its modules, but they do not know about their parent or sibling repositories.

2.1. Use cases

Here are the important needs we would like to at least consider.

2.2. Terminology

Names used by sundry systems:

I'm arbitrarily choosing "module".

3. Managing modules

Modules are listed explicitly, in a file named ".hgmodules" in the root of the tree (similar to ".hgignore" and ".hgtags"). This file contains one or more ConfigParser-style entries like so:

[foo/bar]
url = http://hg.example.com/bar
branch = default
rev = tip

[quux]
url = http://hg.example.com/quux
rev = 9117c6561b0b
optional = true

This file is intended to be read and written by machine. If you edit it by hand, there is no guarantee that comments or formatting will be preserved.
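As a rough illustration of why hand edits may not survive, here is a minimal sketch that reads the example entries with Python's standard configparser module and writes them back. The helper names read_modules and write_modules, and the comment line in the sample text, are made up for this sketch and are not part of any proposed implementation.

```python
import configparser
import io

# Sample .hgmodules text; the leading comment is added for illustration.
HGMODULES_TEXT = """\
# pinned third-party code
[foo/bar]
url = http://hg.example.com/bar
branch = default
rev = tip

[quux]
url = http://hg.example.com/quux
rev = 9117c6561b0b
optional = true
"""


def read_modules(text):
    """Parse .hgmodules text into {module path: {key: value}}."""
    # interpolation=None so a '%' in a URL is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def write_modules(modules):
    """Serialise {module path: {key: value}} back to .hgmodules text."""
    parser = configparser.ConfigParser(interpolation=None)
    for path, items in modules.items():
        parser[path] = items
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


modules = read_modules(HGMODULES_TEXT)
rewritten = write_modules(modules)
```

The entries survive the round trip, but the comment line does not appear in rewritten: configparser discards comments when parsing, which is one concrete reason not to promise that hand formatting is preserved.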

The name of a section is the location under the working directory of the parent where the module should be placed. It must be a relative path. Other items are as follows:

What should the user interface be to this file? It should be formatted such that a user can edit it directly, if need be. But ...

  1. Do we modify the add, remove, and rename commands to edit it?
  2. Do we add a "hg module" command that will edit it?

Probably the latter.
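To make the second option a little more concrete, here is a sketch of the file edits that hypothetical "hg module add" and "hg module remove" subcommands might perform. The function names, their signatures, and the default of rev = tip are assumptions for illustration only. Nothing here is decided.

```python
import configparser
import io


def _load(text):
    """Parse .hgmodules text into a ConfigParser."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def _dump(parser):
    """Serialise a ConfigParser back to text."""
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


def module_add(text, path, url, rev="tip"):
    """Return new .hgmodules text with a section for `path` added."""
    parser = _load(text)
    if parser.has_section(path):
        raise ValueError("module already listed: %s" % path)
    parser.add_section(path)
    parser.set(path, "url", url)
    parser.set(path, "rev", rev)
    return _dump(parser)


def module_remove(text, path):
    """Return new .hgmodules text with the section for `path` removed."""
    parser = _load(text)
    # remove_section() returns False when the section did not exist.
    if not parser.remove_section(path):
        raise ValueError("no such module: %s" % path)
    return _dump(parser)
```

Keeping the edits in dedicated functions like these would leave add, remove, and rename untouched, which is part of the appeal of a separate command.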

4. Important open questions

Does it only make sense to think about modules when we have a working directory? If not, where do modules live when we don't have a working directory? (It would be technically possible to separate a module's working directory from its repository, for example, though I'm not sure we want to go there.)

For now, I'm assuming that if there's no working directory, there are no modules.

Here's another sticky question without an obvious answer: By default, should commands that operate in the working directory recurse into modules?

The alternative that I lean towards is to not recurse unless explicitly instructed to. Arguably, only a few commands should be aware of modules at all.

This model assumes that modules will usually only be read, and checked out at a fixed revision, so automatically running status queries or updates in them makes little sense: they won't change often enough to be worth the effort. This is in line with the usual use of externals in SVN, and with CVS vendor branches.

For people who would be actively developing in multiple repositories, however, this provides poor support. If you have a better idea, let's hear it! Note that the existing config mechanism lets you add a "--modules" option to whatever commands you think need it.

If a command like "add" is run in a parent repository's working directory, and given a path to a file in a module's working directory, what should its behaviour be? The current behaviour is to complain and fail: should this remain?
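Whichever answer we pick, the parent first has to notice that a path falls inside a module. A minimal sketch of that check, assuming module locations are the "/"-separated relative paths used as section names in ".hgmodules". The names owning_module and check_add are hypothetical, and check_add simply mirrors the complain-and-fail behaviour.

```python
import posixpath


def owning_module(path, module_paths):
    """Return the module path that contains `path`, or None."""
    path = posixpath.normpath(path)
    best = None
    for mod in module_paths:
        mod = posixpath.normpath(mod)
        # Compare whole path components, so "foo/barn" is not
        # mistaken for something inside "foo/bar".
        if path == mod or path.startswith(mod + "/"):
            # Prefer the deepest match if modules are nested.
            if best is None or len(mod) > len(best):
                best = mod
    return best


def check_add(path, module_paths):
    """Refuse to add a file that lives inside a module."""
    mod = owning_module(path, module_paths)
    if mod is not None:
        raise ValueError("%s is inside module %s" % (path, mod))
    return path
```

An alternative to failing would be to hand the path to the module's own repository, but that pulls "add" into being module-aware, which is exactly the question being asked.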

5. User interface changes

5.1. The module command

We add the "module" command, for managing modules. It has several subcommands.

To clone optional modules, do we extend the behaviour of the built-in clone command, or add a "clone" command here?

5.2. Changes to existing commands

5.2.1. Uniform option naming

We introduce a standard -M / --modules option for commands that need to become module-aware. The name of the option is standard: its interpretation can change, depending on the command.
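A small sketch of the opt-in behaviour, using Python's argparse rather than Mercurial's own option handling. The program name "hg-sketch", the three commands chosen, and the targets helper are all placeholders. The sketch gives the flag the same meaning everywhere ("also visit the modules"), whereas real commands may interpret it differently.

```python
import argparse


def build_parser():
    """Build a toy command line where some commands accept -M/--modules."""
    parser = argparse.ArgumentParser(prog="hg-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("update", "status", "pull"):
        cmd = sub.add_parser(name)
        cmd.add_argument("-M", "--modules", action="store_true",
                         help="also operate on modules")
    return parser


def targets(args, module_paths):
    """Directories a command would visit: the parent, plus modules if asked."""
    dirs = ["."]
    if args.modules:
        dirs.extend(module_paths)
    return dirs
```

With this shape, "status" on its own touches only the parent, and "status -M" visits the parent followed by each listed module.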

5.2.2. clone

5.2.3. update

5.2.4. add, remove, rename

5.2.5. pull

5.2.6. push

5.2.7. incoming, outgoing

5.2.8. tag

5.2.9. status

5.3. Questionable commands

Here are some possible behaviours for commands where it's really not clear that being module-aware makes sense at all.

5.3.1. commit

We have the possibility of rolling every commit back if any commit fails, when using --modules. Do we want to do this?
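For discussion, the all-or-nothing behaviour could look roughly like the sketch below. The commit and rollback callables are supplied by the caller and stand in for whatever Mercurial would actually do. The function name commit_all and the order in which repositories are visited are assumptions.

```python
def commit_all(repos, commit, rollback):
    """Commit each repository in order; undo earlier commits if one fails.

    `commit(repo)` and `rollback(repo)` are caller-supplied callables.
    Returns the list of repositories that were committed.
    """
    done = []
    try:
        for repo in repos:
            commit(repo)
            done.append(repo)
    except Exception:
        # Undo in reverse order, then re-raise the original failure.
        for repo in reversed(done):
            rollback(repo)
        raise
    return done
```

Note that this only undoes commits made during the same invocation, and it assumes rollback itself cannot fail. Whether that guarantee is strong enough is part of the question.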

5.3.2. Next sticky question

If we make "commit" module-aware, why not status, diff, and all the rest?