Note:

This page appears to contain material that is no longer relevant. Please help improve this page by updating its content.

This page does not meet our wiki style guidelines. Please help improve this page by cleaning up its formatting.

Mercurial Frequently Asked Questions

(see also TipsAndTricks)

Contents

  1. General Questions
    1. What is the license of the project?
  2. Terminology
    1. What are revision numbers, changeset IDs, and tags?
    2. What are cloning, pulling, and pushing?
    3. What are branches, merges, heads, and the tip?
  3. General Usage
    1. How does merging work?
    2. What are some best practices for distributed development with Mercurial?
    3. How do I import from a repository created in a different SCM?
    4. What about Windows support?
    5. Is there a GUI front-end?
    6. How do I make sure that only known people can contribute/submit/commit/push changes?
  4. Common Problems
    1. Windows: The installer aborts with an error message
    2. Which revision have I checked out?
    3. What can I configure in Mercurial?
    4. Configuring the username
    5. My repository is corrupted, help!
    6. I get an error while cloning a remote repository via ssh
    7. I get an "ssl required" error message when trying to push changes
    8. I did an hg pull and my working directory is empty!
    9. I want to retrieve an old version of my project, what do I do?
    10. hg status shows changed files but hg diff doesn't!
    11. hg export or log -p shows a strange diff for my merge!
    12. I did an hg revert and my working directory still has changes in it!
    13. I want a clean, empty working directory
    14. I committed a change containing nuclear launch codes, how do I delete it permanently?
    15. I committed a large binary file/files how do I delete them permanently?
    16. I tried to check in an empty directory and it failed!
    17. I want to get an email when a commit happens!
    18. I'd like to put only some few files of a large directory tree (home dir for instance) under Mercurial's control, and it is taking forever to diff or commit
    19. Why is the modification time of files not restored on checkout?
    20. When I do 'hg push' to a remote repository, why does the working directory appear to be empty?
    21. Any way to 'hg push' and have an automatic 'hg update' on the remote server?
    22. How can I store my HTTP login once and for all ?
    23. How can I do a "hg log" of a remote repository?
    24. How can I find out if there are new changesets in a remote repository?
    25. What can I do with a head I don't want anymore?
    26. The clone command is returning the wrong version in my workspace!
    27. Any way to track ownership and permissions?
    28. I get a "no space left" or "disk quota exceeded" on push
    29. Why do I get "abort: could not import module mpatch!" when invoking hg?
    30. Why do I get a traceback and ImportError when invoking hg?
    31. Why do I get "abort: requirement 'fncache' not supported!" when invoking hg?
    32. Why won't Mercurial let me merge when I have uncommitted changes?
    33. How do I overwrite branch x with branch y?
  5. Bugs and Features
    1. I found a bug, what do I do?
    2. What should I include in my bug report?
    3. Can Mercurial do <x>?
  6. Web Interface
    1. How do I link to the latest revision of a file?
    2. How do I change the style of the web interface to the visually more attractive gitweb?
  7. Technical Details
    1. What limits does Mercurial have?
    2. How does Mercurial store its data?
    3. How does Mercurial handle binary files?
    4. What about Windows line endings vs. Unix line endings?
    5. What about keyword replacement (i.e. $Id$)?
    6. How are Mercurial diffs and deltas calculated?
    7. How are manifests and changesets stored?
    8. How do Mercurial hashes get calculated?
    9. What checks are there on repository integrity?
    10. How does signing work with Mercurial?
    11. What about hash collisions? What about weaknesses in SHA1?
    12. How does "hg commit" determine which files have changed?
    13. What is the difference between rollback and strip?

1. General Questions

1.1. What is the license of the project?

See License

2. Terminology

2.1. What are revision numbers, changeset IDs, and tags?

Mercurial will generally allow you to refer to a revision in three ways: by revision number, by changeset ID, and by tag.

A revision number is a simple decimal number that corresponds with the ordering of commits in the local repository. It is important to understand that this ordering can change from machine to machine due to Mercurial's distributed, decentralized architecture.

This is where changeset IDs come in. A changeset ID is a 160-bit identifier that uniquely describes a changeset and its position in the change history, regardless of which machine it's on. This is represented to the user as a 40-digit hexadecimal number. As that tends to be unwieldy, Mercurial will accept any unambiguous prefix of that number when specifying revisions. It will also generally print these numbers in "short form", which is the first 12 digits.

You should always use some form of changeset ID rather than the local revision number when discussing revisions with other Mercurial users as they may have different revision numbering on their system.

Finally, a tag is an arbitrary string that has been assigned a correspondence to a changeset ID. This lets you refer to revisions symbolically.
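The idea of an "unambiguous" abbreviated changeset ID can be sketched in plain shell. This is only an illustration of prefix matching, not how Mercurial implements it; the function name resolve_prefix is hypothetical, and the candidate IDs are passed in by the caller.

```shell
# Print the single ID among the candidates that starts with the given
# prefix. Fails (non-zero exit, no output) if zero or several IDs match.
# Usage: resolve_prefix PREFIX ID...
resolve_prefix() {
    prefix=$1
    shift
    match=""
    count=0
    for id in "$@"; do
        case $id in
            "$prefix"*)
                match=$id
                count=$((count + 1))
                ;;
        esac
    done
    # Only a unique match counts as unambiguous.
    [ "$count" -eq 1 ] && printf '%s\n' "$match"
}
```

With two made-up IDs that both begin with 82e, the prefix 82e is ambiguous and is rejected, while 82e5 selects exactly one of them.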

2.2. What are cloning, pulling, and pushing?

In many other version control systems, all developers commit changes to a single, centralized repository. In Mercurial, every developer typically works in his or her own repository. A fundamental concept of Mercurial is transferring changesets among repositories. This is accomplished through the clone, push, and pull operations (see also CommunicatingChanges).

To begin a task on an existing project, a developer will typically create a local copy of the repository using the hg clone command. This operation creates a new repository containing all the files and all of their history.

If another developer has made changes to her repository, you can pull her changes into your repository using the hg pull command. If you have made changes to your repository and you wish to transfer them to another repository (say, to a shared repository), you would do this using the hg push command.

2.3. What are branches, merges, heads, and the tip?

In the simplest case, history consists of a linear sequence of changesets. In this case, every changeset (except for the first and last) has one parent and one child. For a variety of reasons, it is possible for the history graph to split into two or more independent lines of development. When this occurs, the history graph is said to have a branch. Where a branch occurs, a changeset has two or more children.

When two lines of development are joined into a single line, a merge is said to have occurred. Where a merge occurs, a changeset has two parents. If a line of development is not merged into another, the last changeset on that line is referred to as the head of that branch. Every repository always includes one or more heads. Heads have no children. Use the hg heads command to list the heads of the current repository.

The tip is the changeset added to the repository most recently. If you have just made a commit, that commit will be the tip. Alternatively, if you have just pulled from another repository, the tip of that repository becomes the new tip. Use hg tip to show the tip of the repository.

The tip is always a head. If there are multiple heads in a repository, only one of them is the tip. Within a repository, changesets are numbered sequentially, so the tip has the highest sequence number. The word "tip" functions as a special tag to denote the tip changeset, and it can be used anywhere a changeset ID or tag is valid.

The following diagram illustrates these concepts.

[Diagram: example history graph, not reproduced here]

The history has branched at revs 1:34ef and 4:ac98, and a merge has occurred at rev 4:ac98. Revs 5:0345, 6:19e3, and 7:8e92 are heads, and 7:8e92 is the tip.

Note that while hg tip shows the tip and hg heads shows the heads of the repository, the hg branch and hg branches commands do not list the branch changesets as described above. Instead, they show changesets corresponding to branches that have been given names. See NamedBranches.

The term "branch" has other meanings as well. See Branch for a fuller discussion.

3. General Usage

3.1. How does merging work?

See documentation for the merge command.

3.2. What are some best practices for distributed development with Mercurial?

See some typical working practices.

3.3. How do I import from a repository created in a different SCM?

See the Converting Repositories document for various tips.

3.4. What about Windows support?

See the Windows install guide for getting started using Windows.

As the TortoiseSVN project does, we recommend turning off the indexing service for working copies and repositories, and excluding them from virus scans.

3.5. Is there a GUI front-end?

See the page of related tools for information on graphical merge tools and other front-ends.

3.6. How do I make sure that only known people can contribute/submit/commit/push changes?

Since Mercurial lets users do anything they want with their repository clones, including sharing them with whomever they like, restricting commits is not generally possible. (Note, however, that a commit in a centralised version control system and a commit in Mercurial are not exactly the same thing.) The critical operation is actually the push, since that is the point at which changes are communicated between repository clones, and at which an "official" repository would want to be able to reject "unverified" changesets: that is, changesets from people who are unknown or not authorised to contribute changes. So, although many clones may exist, with any individual (known or unknown) doing what they like in them, any work that makes its way to the "official" repository must be pushed by someone who is "verified" or "authorised"; that person effectively takes responsibility for the work's suitability.

One extension which attempts to provide a verification capability is the commitsigs extension.

(One can argue that verifying submitters is easier in centralised version control systems, where each person has a login to a central repository. Even there, however, there is no guarantee that work submitted by one person does not incorporate changes made by an "unauthorised" person. After all, it is possible to share the contents of repositories by other means - perhaps a user lets other people on a system access their checkout directly in the filesystem - and thus the act of submitting work by an "authorised" person is no guarantee that they did the work all by themselves, merely that they take responsibility for it.)

4. Common Problems

4.1. Windows: The installer aborts with an error message

"This installation package is not supported by this processor type. Contact your product vendor." This means you are trying to install a 64-bit version installer on a normal 32-bit operating system. You need to download and use the correct msi file for your OS. For normal 32-bit OS, make sure the msi file does not have an x64 in it.

4.2. Which revision have I checked out?

Use the summary command (TutorialClone shows an example call). The summary command will also tell you in brief what branch you're on, whether there are any newer changes than the one you're on, and what the state of your working directory is.

If you want this for a script and need terser output, take a look at identify command flags and at scripting.
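For the scripting case, a minimal sketch might wrap the identify command as shown below. It assumes a POSIX shell with hg on the PATH and relies on hg identify --id printing the short changeset ID of the working directory's parent, with a trailing + when there are uncommitted changes. The function names are hypothetical.

```shell
# Print the changeset ID the working directory is based on, without the
# trailing "+" that marks uncommitted changes. (During an uncommitted
# merge, hg prints both parents; this sketch does not split them.)
current_changeset() {
    hg identify --id | sed 's/+$//'
}

# Succeed (exit 0) if the working directory has uncommitted changes.
working_dir_is_dirty() {
    case "$(hg identify --id)" in
        *+) return 0 ;;
        *)  return 1 ;;
    esac
}
```

A script could then, for example, refuse to build a release while working_dir_is_dirty succeeds.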

4.3. What can I configure in Mercurial?

See MercurialIni.

4.4. Configuring the username

If hg says No username found, using 'user@hostname' instead when you make a commit, then you need to configure your username. Please see QuickStart for help on this.

4.5. My repository is corrupted, help!

Please read the page "Dealing With Repository And Dirstate Corruption" for recommendations on what to do.

4.6. I get an error while cloning a remote repository via ssh

If you clone a remote repository like this:

hg clone ssh://USER@REMOTE/path/to/repo

and, after successful ssh authentication, you get the error message remote: abort: repository path/to/repo not found!, then you need to know the following:

  • Mercurial's remote repository syntax differs from syntax of other well known programs such as rsync, cvs - both of which use a : character to delimit USER@REMOTE from the path component (/path/to/repo).

  • The path to the remote repository is relative to the $HOME of USER; that is, it resolves to ~USER/path/to/repo. To give an absolute path, use a double slash: ssh://USER@REMOTE//path/to/repo.

  • Remember to use hg -v clone ssh://USER@REMOTE/path/to/repo and observe the remote command being executed via the ssh channel

On the other hand, if the error message is remote: bash: line 1: hg: command not found, the problem is that the environment used by ssh does not have hg in its PATH. There are a few ways to deal with this problem:

  • In your client ~/.hgrc file, set a remotecmd value in the [ui] section giving the exact path to hg.

  • As a one-off operation, you could write the clone command as follows:
    hg --config ui.remotecmd=/path/to/hg clone ssh://USER@REMOTE/path/to/repo

  • Define a PATH in .bashrc (or equivalent shell configuration file), noting that this may not always work for some versions of ssh and bash.

  • On the server, create a ~/.ssh/environment file that defines an appropriate PATH, and add PermitUserEnvironment yes to the server's sshd_config (commonly /etc/ssh/sshd_config).

  • On the server, place a symlink to the hg binary somewhere on the ssh PATH; run ssh username@server env to show it. Be careful to avoid paths managed by system package management, since package installations could conflict with it; /usr/local/bin is usually a good choice.

4.7. I get an "ssl required" error message when trying to push changes

That's because allowing anonymous, unauthenticated HTTP clients to push changes into your repository would be a huge security hole. If you are on a private network and you know that all HTTP clients are trustworthy, you can add

[web]
push_ssl = false

to .hg/hgrc on the server-side repository. (See also HgWebDirStepByStep.)

There's a reason for requiring SSL, however. If you do not trust the network you are using, do not change this.

4.8. I did an hg pull and my working directory is empty!

There are two parts to Mercurial: the repository and the working directory. hg pull pulls all new changes from a remote repository into the local one but doesn't alter the working directory.

This keeps you from upsetting your work in progress, which may not be ready to merge with the new changes you've pulled. It also allows you to manage merging more easily (see below about best practices).

To update your working directory, run hg update. If you're sure you want to update your working directory on a pull, you can also use hg pull -u. This will refuse to merge or overwrite local changes.

4.9. I want to retrieve an old version of my project, what do I do?

You want hg update -C <version>, which will clobber your current version with the requested version.

You don't want hg revert <version>, which reverts changes in your working directory back to that version, but keeps the current parents for the next checkin. This command exists for undoing changes in current versions, not for working on old versions.

4.10. hg status shows changed files but hg diff doesn't!

hg status reports when file contents or flags have changed relative to either parent. hg diff only reports changed contents relative to the first parent. You can see flag information with the --git option to hg diff and deltas relative to the other parent with -r.

4.11. hg export or log -p shows a strange diff for my merge!

The diff shown by hg diff, hg export and hg log is always against the first parent for consistency. Also, the files listed are only the files that have changed relative to both parents.

(Are diffs of merges really always against the first parent? Doesn't hg export have a --switch-parent option? It would also be good if the docs would give the rationale for hg diff and hg log not having that option (assuming they don't--the man page only mentions it for export).)

4.12. I did an hg revert and my working directory still has changes in it!

You've probably done an hg merge (see Merge), which means your working directory now has two parents according to hg parents. A subsequent hg revert --all -r . will revert all files in the working directory back to the first (primary) parent, but it will still leave you with two parents (see revert).

To completely undo the uncommitted merge and discard all local modifications, you will need to issue a hg update -C -r . (note the "dot" at the end of the command).

See also TutorialMerge.

4.13. I want a clean, empty working directory

The easiest thing to do is run hg clone -U which will create a fresh clone without checking out a working copy.

If the repository already has a working copy, you can remove it by running hg update null.

Note: you might want to copy the hgrc file from your old repository.

4.14. I committed a change containing nuclear launch codes, how do I delete it permanently?

If you've just committed it, and you haven't done any other commits or pulls since, you may be able to use the rollback command to undo the last commit transaction:

$ hg rollback
rolling back last transaction

If you've made other changes but you haven't yet published it to the world, you can do something like the following:

$ hg clone -r <untainted-revision> tainted-repo untainted-repo

This will get you a new repo without the tainted change or the ones that follow it. You can import the further changes with hg export and hg import or by using the TransplantExtension. See TrimmingHistory for possible future approaches.

The strip command in the mq extension may also be useful here for doing operations in place.

If you've already pushed your changes to a public repository that people have cloned from, the genie is out of the bottle. Good luck cleaning up your mess.

“Judge Tries to Unring Bell Hanging Around Neck of Horse Already Out of Barn Being Carried on Ship That Has Sailed.” - William G. Childs

For more details, see EditingHistory.

4.15. I committed a large binary file/files how do I delete them permanently?

If you want to remove file(s) that shouldn't have been added, use the ConvertExtension with the --filemap option to "convert" your Mercurial repository to another Mercurial repository. You'll want to make sure that you set convert.hg.saverev=False if you want the history prior to your removed file(s) to stay in common with the original repository. If convert.hg.saverev=True, the conversion embeds the source revision IDs into the new revisions under an extra header, visible via hg log --debug.

See also the previous question for other options.

4.16. I tried to check in an empty directory and it failed!

Mercurial doesn't track directories; it only tracks files. This works for just about everything except directories with no files in them. As empty directories aren't terribly useful and not tracking them makes the system much simpler, we don't intend to fix this any time soon. A couple of workarounds:

  • add a file, like "this-dir-intentionally-left-blank". On *nix, you can do this with find . -type d -empty -exec touch {}/.keep \;. There is also the Mono-based tool MarkEmptyDirs, which automates this task.

  • create the directory with your Makefiles or other build processes
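The first workaround can be sketched as a small shell function. This is only an illustration under a few assumptions: the function name mark_empty_dirs and the placeholder name .keep are arbitrary, find must support -empty (GNU and BSD find do), and the .hg directory is skipped so that Mercurial's own metadata is left alone.

```shell
# Create a ".keep" placeholder in every empty directory below the given
# root, skipping the repository metadata in ".hg".
# Usage: mark_empty_dirs ROOT
mark_empty_dirs() {
    find "$1" -name .hg -prune -o \
        -type d -empty -exec sh -c 'touch "$1/.keep"' _ {} \;
}
```

Running mark_empty_dirs . at the top of the working directory, followed by hg add, schedules the new placeholder files for the next commit. The sh -c form avoids relying on find substituting {} inside a longer argument, which not every find implementation does.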

4.17. I want to get an email when a commit happens!

Use the NotifyExtension

4.18. I'd like to put only some few files of a large directory tree (home dir for instance) under Mercurial's control, and it is taking forever to diff or commit

Just do a

printf "syntax: glob\n*\n" > .hgignore

or, if you are using 0.7 or below,

printf ".*\n" > .hgignore

This will make hg ignore all files except those explicitly added.

4.19. Why is the modification time of files not restored on checkout?

If you use automatic build tools like make or distutils, some built files might not be updated if you checkout an older revision of a file. Additionally a newer changeset might have an older commit timestamp due to pulling from someone else or importing patches somebody has done some time ago, so checking out a newer changeset would have to make the files older in this case.

If you need predictable timestamps you can use hg archive, which can do something like a checkout in a separate directory. Because this directory is newly created, there is nothing like switching to a different changeset afterwards, therefore the above mentioned problems don't apply here.

4.20. When I do 'hg push' to a remote repository, why does the working directory appear to be empty?

When changes are pushed to a repository, the working directory holding the repository is not changed. However, the changes are stored in the history and are available when performing operations on that repository. Thus, running commands like hg log in such a remote repository will show the full history even if a normal directory listing appears to be empty. (Repository publishing using hgweb also takes advantage of such history being available without needing a set of files in a working directory somewhere.)

Obviously, you can run hg update to make the files appear in such a repository, but unless you actually want to work within such a directory, it is arguably tidier to leave the directory in its "empty" state. This can be done by issuing an hg update null command in the directory holding the repository.

4.21. Any way to 'hg push' and have an automatic 'hg update' on the remote server?

Yes. Add a changegroup hook to .hg/hgrc in the remote repository:

[hooks]
changegroup = hg update

4.22. How can I store my HTTP login once and for all ?

You can specify the username and password in the URL like:

http://user:password@mydomain.org

Then add a new entry in the paths section of your hgrc file. With Mercurial 1.3 you can also add an auth section to your hgrc file:

[auth]
example.prefix = https://hg.example.net/
example.username = foo
example.password = bar

Please see the hgrc manpage for more information.

4.23. How can I do a "hg log" of a remote repository?

You can't. Mercurial accepts only local repositories for the -R option (see hg help -v log).

> hg log -R https://www.mercurial-scm.org/repo/hello
abort: repository 'https://www.mercurial-scm.org/repo/hello' is not local

The correct way to do this is cloning the remote repository to your computer and then doing a hg log locally.

This is a very deliberate explicit design decision made by project leader Matt Mackall (mpm). See also issue1025 for the reasoning behind that.

4.24. How can I find out if there are new changesets in a remote repository?

To get the changeset id of the tipmost changeset of a remote repository you can do:

> hg id -i -r tip https://www.mercurial-scm.org/repo/hello
82e55d328c8c

When it changes, you have new changesets in the remote repository.
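The "when it changes" check can be automated by remembering the last ID seen. The following is a rough sketch, assuming a POSIX shell with hg on the PATH; the function name remote_has_new_changesets and the state file are hypothetical, and the first run always reports a change because nothing has been recorded yet.

```shell
# Succeed (exit 0) if the tip ID of the remote repository differs from
# the ID recorded in the state file, then record the current ID.
# Returns 2 if the remote could not be queried.
# Usage: remote_has_new_changesets REPO_URL STATE_FILE
remote_has_new_changesets() {
    repo_url=$1
    state_file=$2
    current=$(hg id -i -r tip "$repo_url") || return 2
    previous=$(cat "$state_file" 2>/dev/null) || previous=""
    printf '%s\n' "$current" > "$state_file"
    [ "$current" != "$previous" ]
}
```

A cron job could call this with the repository URL and a file such as ~/.hello-tip, and run hg pull only when the function succeeds.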

4.25. What can I do with a head I don't want anymore?

See PruningDeadBranches

4.26. The clone command is returning the wrong version in my workspace!

Clone checks out the tip of the default (aka unnamed) branch (see NamedBranches). Ergo, you probably want to keep your main branch unnamed.

4.27. Any way to track ownership and permissions?

If you're using Mercurial for config file management, you might want to track file properties (ownership and permissions) too. Mercurial only tracks the executable bit of each file.

Here is an example of how to save the properties along with the files (works on Linux if you've the acl package installed):

# cd /etc && getfacl -R . >/tmp/acl.$$ && mv /tmp/acl.$$ .acl
# hg commit

This is far from perfect, but you get the idea. For a more sophisticated solution, check out etckeeper.

4.28. I get a "no space left" or "disk quota exceeded" on push

I get a "no space left" or "disk quota exceeded" error on push, but there is plenty of space and/or I have no quota limit on the device where the remote hg repository is.

The problem probably comes from the fact that Mercurial uses /tmp (or one of the directories defined by the environment variables $TMPDIR, $TEMP or $TMP) to uncompress the bundle received on the wire. The decompression may then reach device limits.

You can of course set $TMPDIR to another location on the remote host in the default shell configuration file, but it will then potentially be used by processes other than Mercurial. Another solution is to set a hook in a global .hgrc on the remote host. See the description of how to set a hook for changing tmp directory on remote when pushing.

4.29. Why do I get "abort: could not import module mpatch!" when invoking hg?

If your current directory is that of the Mercurial source distribution, it is possible that hg is looking in the local mercurial package directory and fails to find the mpatch.so extension module. The solution (for most situations) is to move out of the source distribution directory and to try again. This is a common Python pitfall: Python will often be confused by packages or modules in the current directory and will import packages/modules from these local locations instead of looking in the appropriate places. If you are trying to use hg on a checkout of the Mercurial software itself, you might want to check any PYTHONPATH environment variable that you may have set and remove "empty" paths. For example, at a shell ($) prompt:

$ echo $PYTHONPATH
/home/me/lib:/home/me/morelib:

Here, the trailing colon (:) indicates that there is an empty final path in the list. This empty path is likely to become mapped to the current directory, and Python will then prefer to look at the current directory instead of its own package directories (containing your installed version of Mercurial). If you reset PYTHONPATH trimming off any such empty paths, the problem should go away:

$ export PYTHONPATH=/home/me/lib:/home/me/morelib
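Rather than retyping the value by hand, the empty entries can be filtered out mechanically. A minimal sketch, assuming a POSIX shell with tr, grep and paste available; the function name strip_empty_path_entries is hypothetical.

```shell
# Remove empty entries (leading, trailing or doubled colons) from a
# colon-separated path list and print the cleaned list.
# Usage: strip_empty_path_entries "$PYTHONPATH"
strip_empty_path_entries() {
    printf '%s' "$1" | tr ':' '\n' | grep -v '^$' | paste -s -d ':' -
}
```

It could be applied with PYTHONPATH=$(strip_empty_path_entries "$PYTHONPATH") followed by export PYTHONPATH.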

4.30. Why do I get a traceback and ImportError when invoking hg?

See the response to the previous question for a possible explanation and some solutions.

4.31. Why do I get "abort: requirement 'fncache' not supported!" when invoking hg?

In Mercurial 1.1, a new repository format was introduced to work around file name limitations on Windows. Repositories created with Mercurial 1.1 or later automatically have the fncache repository format enabled. You need Mercurial 1.1 or later to read these repositories. Repositories created with pre-1.1 Mercurial or with fncache disabled can still be read. See the page about the fncache repository format for more information.

4.32. Why won't Mercurial let me merge when I have uncommitted changes?

If hg merge fails with the message abort: outstanding uncommitted changes, it means that the usual process of merging two branches cannot proceed.

Consider the normal merge case, when the working directory is clean -- that is, there are no uncommitted changes and hg diff produces no output. Mercurial combines the revision being merged (the "other branch") with the working directory's revision (the "local branch"). It leaves the result of this merge in your working directory for you to test. When you approve the merge result, you commit it. The key insight: all of those changes are the result of the merge, i.e. they come from the local head, or the other head, or the combination (merge) of them.

Now consider what happens if the working directory already contains modified files -- that is, hg diff produces output. In this case, Mercurial has three states to worry about: your uncommitted changes, the local head, and the other head. That would be incoherent, since Mercurial only allows merging two heads (changesets) at a time. You might think Mercurial should save your uncommitted changes somewhere so you can do the merge and then restore your original changes, but that introduces additional complexity. (What if your merge affects some of the same code as your uncommitted changes? That means another merge will be required in the working directory after you commit the merge you were trying to do in the first place!) So Mercurial does not try; it requires you to have a clean working directory when you try to merge two heads.

In short, Mercurial is trying to keep one operation separate from another (local changes versus the merge) and avoid putting the working directory into some kind of special state (suspending local changes until they can be combined with the current revision).

There are four ways around this limitation:

  1. discard your uncommitted changes (only appropriate if they are temporary throwaway changes, and you don't need them anymore)
  2. commit your changes (only appropriate if they are done, working, and ready to commit)
  3. create a new working directory
  4. set aside your changes

The first two should be self-explanatory.

Creating a new working directory is a bit more overhead, but is simple and usually fast (unless your repository is very large). For example:

orig=$(pwd)              # remember where the original repository is
hg clone -u . . ../temp-merge
cd ../temp-merge
hg merge
# ...test the merge...
hg commit -m "merge with ..."
hg push                  # to your previous repository
cd "$orig"
hg update                # to the new merge changeset

Setting aside half-finished changes is an interesting problem with a variety of solutions, including but not limited to:

4.33. How do I overwrite branch x with branch y?

hg update x
hg commit --close-branch -m 'closing branch x, will be overwritten with branch y'
hg update y
hg branch -f x
hg ci

The result is a closed head in branch x, and a new commit on branch x whose parent is the head of branch y. Don't try to just overwrite files in branch x with files in branch y because it will screw up future merges. This seems to be the correct way to do it.

5. Bugs and Features

5.1. I found a bug, what do I do?

Report it to the Mercurial mailing list at mercurial@mercurial-scm.org or in the bug tracker.

5.2. What should I include in my bug report?

Enough information to reproduce or diagnose the bug. If you can, try using the 'hg -v' and 'hg --debug' switches to figure out exactly what Mercurial is doing.

If you can reproduce the bug in a simple repository, that is very helpful. The best is to create a simple shell script to automate this process, which can then be added to our test suite.

5.3. Can Mercurial do <x>?

If you'd like to request a feature, please send your request to mercurial@mercurial-scm.org.

Be sure to see ToDo and MissingFeatures to see what's already planned and where we need help.

6. Web Interface

6.1. How do I link to the latest revision of a file?

Find the URL for the file and then replace the changeset identifier with tip.

6.2. How do I change the style of the web interface to the visually more attractive gitweb?

In hgrc set

[web]
style = gitweb

To switch back to the default style specify "style = default" (see hgbook).

7. Technical Details

7.1. What limits does Mercurial have?

Mercurial currently assumes that single files, indices, and manifests can fit in memory for efficiency.

There should otherwise be no limits on file name length, file size, file contents, number of files, or number of revisions (see also BigRepositories for the sizes of some example repositories.)

The network protocol is big-endian.

File names cannot contain the null character or newlines. Committer addresses cannot contain newlines.

Mercurial is primarily developed for UNIX systems, so some UNIXisms may be present in ports.

Mercurial encodes filenames (see CaseFolding, CaseFoldingPlan, fncacheRepoFormat) when storing them in the repository. Most notably, each uppercase character in a filename is encoded as two characters in the filename in the repository ("FILE" becomes "_f_i_l_e").
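The uppercase rule can be illustrated with a short shell function. This is a simplified sketch, not Mercurial's actual encoder: it handles only ASCII uppercase letters and the underscore itself (which is doubled so that the encoding stays reversible), and ignores the other characters the real store encoding escapes. The function name encode_case is hypothetical.

```shell
# Encode a file name so that it survives case-insensitive filesystems:
# "_" becomes "__" and each uppercase letter becomes "_" + lowercase.
# Usage: encode_case NAME
encode_case() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "_")
                out = out "__"                 # escape the escape character
            else if (c != tolower(c))
                out = out "_" tolower(c)       # uppercase letter
            else
                out = out c                    # everything else unchanged
        }
        print out
    }'
}
```

With this sketch, encode_case FILE prints _f_i_l_e, matching the example above, and two names that differ only in case no longer collide on a case-insensitive filesystem.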

7.2. How does Mercurial store its data?

The fundamental storage type in Mercurial is a revlog. A revlog is the set of all revisions of a named object. Each revision is either stored compressed in its entirety or as a compressed binary delta against the previous version. The decision of when to store a full version is made based on how much data would be needed to reconstruct the file. This lets us ensure that we never need to read huge amounts of data to reconstruct an object, regardless of how many revisions of it we store.

In fact, we should always be able to do it with a single read, provided we know when and where to read. This is where the index comes in. Each revlog has an index containing a special hash (nodeid) of the text, hashes for its parents, and where and how much of the revlog data we need to read to reconstruct it. Thus, with one read of the index and one read of the data, we can reconstruct any version in time proportional to the object size.

Similarly, revlogs and their indices are append-only. This means that adding a new version is also O(1) seeks.

Revlogs are used to represent all revisions of files, manifests, and changesets. Compression for typical objects with lots of revisions can range from 100 to 1 for things like project makefiles to over 2000 to 1 for objects like the manifest.
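The scheme described in this answer can be sketched as a toy in Python. This is not Mercurial's code: the class name ToyRevlog, the prefix/suffix delta format, and the "store a full version once the delta chain exceeds twice the text size" threshold are all assumptions chosen to keep the example short.

```python
import struct
import zlib


def make_delta(old: bytes, new: bytes):
    """Toy delta: lengths of the shared prefix and suffix, plus the new middle."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(old: bytes, prefix: int, suffix: int, middle: bytes) -> bytes:
    return old[:prefix] + middle + old[len(old) - suffix:]


class ToyRevlog:
    def __init__(self):
        self.index = []  # per revision: (is_full, offset, length, chain_bytes)
        self.data = b""  # append-only data; existing bytes are never rewritten

    def _chunk(self, rev: int) -> bytes:
        _, offset, length, _ = self.index[rev]
        return zlib.decompress(self.data[offset:offset + length])

    def add(self, text: bytes) -> int:
        rev = len(self.index)
        chunk, is_full = zlib.compress(text), True
        chain = len(chunk)
        if rev > 0:
            prefix, suffix, middle = make_delta(self.read(rev - 1), text)
            packed = zlib.compress(struct.pack(">II", prefix, suffix) + middle)
            delta_chain = self.index[rev - 1][3] + len(packed)
            # Assumed threshold: keep the delta only while the bytes needed
            # to rebuild this revision stay within twice its size.
            if delta_chain <= 2 * len(text):
                chunk, is_full, chain = packed, False, delta_chain
        self.index.append((is_full, len(self.data), len(chunk), chain))
        self.data += chunk
        return rev

    def read(self, rev: int) -> bytes:
        base = rev
        while not self.index[base][0]:  # walk back to the last full version
            base -= 1
        text = self._chunk(base)
        for r in range(base + 1, rev + 1):
            raw = self._chunk(r)
            prefix, suffix = struct.unpack(">II", raw[:8])
            text = apply_delta(text, prefix, suffix, raw[8:])
        return text
```

Everything needed to rebuild a revision lies in one contiguous region of the data, from its full version up to its own delta, which is what makes the single-read property possible.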

7.3. How does Mercurial handle binary files?

Core Mercurial tracks but never modifies file content, and it is thus binary safe. See BinaryFiles for more discussion of commands which interpret file content, e.g. merge, diff, export and annotate.

7.4. What about Windows line endings vs. Unix line endings?

See Win32TextExtension for techniques which automatically convert Windows line endings into Unix line endings when committing files to the repository, and convert back again when updating the workspace. This is not default Mercurial behaviour, and requires users to edit their configuration files to turn it on. Adopting this policy on line endings probably implies enabling a hook to prevent non-compliant commits from getting into your repository, which in turn forces people contributing code to enable the extension.

7.5. What about keyword replacement (i.e. $Id$)?

See KeywordExtension.

7.6. How are Mercurial diffs and deltas calculated?

Mercurial diffs are calculated rather differently than those generated by the traditional diff algorithm (but with output that's completely compatible with patch of course). The algorithm is an optimized C implementation based on Python's difflib, which is intended to generate diffs that are easier for humans to read rather than be 'minimal'. This same algorithm is also used for the internal delta compression.

In the course of investigating delta compression algorithms, we discovered that this implementation was simpler and faster than the competition in our benchmarks and also generated smaller deltas than the theoretically 'minimal' diffs of the traditional diff algorithms. This is because the traditional algorithm assumes the same cost for insertions, deletions, and unchanged elements.
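As an illustration only, the following sketch uses Python's standard difflib module (the library the answer above names as the basis for the C implementation) rather than Mercurial's own code. It derives both a patch-compatible unified diff and a small delta from the same matching step. The delta layout and the apply_delta helper are assumptions made for this example.

```python
import difflib

old = ["a\n", "b\n", "c\n"]
new = ["a\n", "B\n", "c\n", "d\n"]

# One matching pass over the two line lists.
matcher = difflib.SequenceMatcher(None, old, new)

# Delta: keep only the regions that differ, as (start, end, replacement lines).
delta = [(i1, i2, new[j1:j2])
         for tag, i1, i2, j1, j2 in matcher.get_opcodes()
         if tag != "equal"]


def apply_delta(lines, delta):
    """Rebuild the new line list from the old one plus the delta."""
    out, pos = [], 0
    for start, end, replacement in delta:
        out.extend(lines[pos:start])  # unchanged lines before this region
        out.extend(replacement)
        pos = end
    out.extend(lines[pos:])
    return out


# The same comparison rendered as a unified diff for humans and for patch.
patch_text = "".join(difflib.unified_diff(old, new, "a/file", "b/file"))
print(patch_text)
```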

7.7. How are manifests and changesets stored?

A manifest is simply a list of all files in a given revision of a project along with the nodeids of the corresponding file revisions. So grabbing a given version of the project means simply looking up its manifest and reconstructing all the file revisions pointed to by it.

A changeset is a list of all files changed in a check-in along with a change description and some metadata like user and date. It also contains a nodeid to the relevant revision of the manifest.
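The relationship between the two can be sketched with plain Python data structures. All names and values here are hypothetical, and the nodeids are short placeholder strings rather than real hashes.

```python
from dataclasses import dataclass


@dataclass
class Changeset:
    manifest_node: str  # nodeid of the relevant manifest revision
    user: str
    date: str
    files: list         # names of the files changed in this check-in
    description: str


# Hypothetical stores, keyed by placeholder nodeids.
file_revisions = {
    ("README", "f1"): b"hello\n",
    ("README", "f2"): b"hello world\n",
    ("main.c", "f3"): b"int main(void) { return 0; }\n",
}
manifests = {
    "m1": {"README": "f1", "main.c": "f3"},
    "m2": {"README": "f2", "main.c": "f3"},
}
changesets = {
    "c1": Changeset("m1", "alice", "2012-10-07", ["README", "main.c"], "Initial import"),
    "c2": Changeset("m2", "alice", "2012-10-08", ["README"], "Expand README"),
}


def checkout(changeset_id):
    """Look up the manifest, then fetch every file revision it points to."""
    manifest = manifests[changesets[changeset_id].manifest_node]
    return {name: file_revisions[(name, node)] for name, node in manifest.items()}
```

Note that the second manifest reuses the unchanged main.c revision, so only the modified file needs a new file revision.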

7.8. How do Mercurial hashes get calculated?

Mercurial hashes both the contents of an object and the hashes of its parents to create an identifier that uniquely identifies an object's contents and history. This greatly simplifies merging of histories because it avoids the graph cycles that can occur when an object is reverted to an earlier state.

All file revisions have an associated hash value (the nodeid). These are listed in the manifest of a given project revision, and the manifest hash is listed in the changeset. The changeset hash (the changeset ID) is again a hash of the changeset contents and its parents, so it uniquely identifies the entire history of the project to that point.
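A minimal sketch of such a hash, using Python's hashlib, is shown below. The exact byte layout (the two parent ids in sorted order followed by the text, with an all-zero id standing for a missing parent) is our understanding of the scheme and should be treated as an assumption. The names node_id, NULL_ID and verify are hypothetical.

```python
import hashlib

NULL_ID = b"\0" * 20  # assumed placeholder meaning "no parent"


def node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """SHA1 over both parent ids (sorted, so their order is irrelevant) and the text."""
    first, second = sorted((p1, p2))
    h = hashlib.sha1(first)
    h.update(second)
    h.update(text)
    return h.digest()


def verify(text: bytes, p1: bytes, p2: bytes, expected: bytes) -> bool:
    """Integrity check: recompute the id and compare it with the recorded one."""
    return node_id(text, p1, p2) == expected


v1 = node_id(b"original\n")
v2 = node_id(b"edited\n", v1)
v3 = node_id(b"original\n", v2)  # contents reverted to those of v1

print(v1.hex() != v3.hex())  # same contents, different history, different id
```

Because v3 has a different parent than v1, it gets a different identifier even though the contents are identical, so a revert adds a new node to the history graph instead of pointing back at an old one.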

7.9. What checks are there on repository integrity?

Every time a revlog object is retrieved, it is checked against its hash for integrity. It is also incidentally double-checked by the Adler32 checksum used by the underlying zlib compression.

Running 'hg verify' decompresses and reconstitutes each revision of each object in the repository and cross-checks all of the index metadata with those contents.

But this alone is not enough to ensure that someone hasn't tampered with a repository. For that, you need cryptographic signing.

7.10. How does signing work with Mercurial?

Take a look at the hgeditor script for an example. The basic idea is to use GPG to sign the manifest ID inside the changelog entry. The manifest ID is a recursive hash of all of the files in the system and their complete history, and thus signing the manifest hash signs the entire project contents.

7.11. What about hash collisions? What about weaknesses in SHA1?

The SHA1 hashes are large enough that the odds of accidental hash collision are negligible for projects that could be handled by the human race. The known weaknesses in SHA1 are currently still not practical to attack, and Mercurial will switch to SHA256 hashing before that becomes a realistic concern.

Collisions with the "short hashes" are not a concern as they're always checked for ambiguity and are still long enough that they're not likely to happen for reasonably-sized projects (< 1M changes).
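The ambiguity check can be sketched as follows. This is an illustration, not Mercurial's lookup code. The resolve function, the AmbiguousPrefix exception, and the sample identifiers are all hypothetical.

```python
class AmbiguousPrefix(LookupError):
    """Raised when a short hash matches more than one full identifier."""


def resolve(prefix: str, node_ids):
    """Return the single identifier starting with prefix, or raise."""
    matches = [node for node in node_ids if node.startswith(prefix)]
    if not matches:
        raise KeyError(prefix)
    if len(matches) > 1:
        raise AmbiguousPrefix(f"{prefix!r} matches {len(matches)} changesets")
    return matches[0]


# Made-up identifiers, shortened for readability.
nodes = ["a1b2c3d4e5f6", "a1b2ffffffff", "0123456789ab"]

print(resolve("a1b2c", nodes))  # a1b2c3d4e5f6
```

A prefix such as "a1b2" matches two of the sample identifiers, so resolve raises AmbiguousPrefix instead of silently picking one.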

See also: https://www.mercurial-scm.org/pipermail/mercurial/2009-April/025526.html by Matt Mackall.

7.12. How does "hg commit" determine which files have changed?

If hg commit is called without file arguments, it commits all files that have "changed" (see commit). Note, however, that Mercurial doesn't detect a modification that changes neither the file's modification time nor its size. This is by design; see also issue618 and DirState.
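The consequence of comparing only size and modification time can be demonstrated with a short sketch. It is a simplified model of the check described above, not the dirstate implementation, and the function names are hypothetical.

```python
import os
import tempfile


def snapshot(path):
    """Record the two properties the check relies on."""
    st = os.stat(path)
    return st.st_size, st.st_mtime_ns


def looks_changed(path, recorded):
    return snapshot(path) != recorded


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.txt")
        with open(path, "w") as f:
            f.write("aaaa")
        recorded = snapshot(path)
        st = os.stat(path)

        with open(path, "w") as f:
            f.write("bbbb")  # different content, same size
        # Put the old timestamps back, as some tools do.
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
        missed = not looks_changed(path, recorded)

        with open(path, "w") as f:
            f.write("ccccc")  # size differs, so this one is noticed
        seen = looks_changed(path, recorded)
        return missed, seen


print(demo())
```

The first edit goes unnoticed because neither recorded property changed; the second is detected through the size difference alone.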

7.13. What is the difference between rollback and strip?

They overlap a bit, but are really quite different:

  • rollback will remove the last transaction.

    • Transactions are a concept often found in databases. In Mercurial we start a transaction when certain operations are run, such as commit, push, pull... When the operation finishes successfully, the transaction is marked as complete. If an error occurs, the transaction is "rolled back" and the repository is left in the same state as before.

      You can manually trigger a rollback with hg rollback. This will undo the last transactional command. If a pull command brought 10 new changesets into the repository on different branches, then hg rollback will remove them all.

      Please note: there is no backup when you roll back a transaction!

  • strip will remove a changeset and all its descendants.

    • The changesets are saved as a bundle, which you can apply again if you need them back.
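The difference can be sketched with a toy repository model. ToyRepo and its methods are hypothetical and only mimic the behaviour described above: rollback discards whatever the last transaction added, while strip removes a changeset together with its descendants and hands them back as a "bundle".

```python
class ToyRepo:
    def __init__(self):
        self.parents = {}        # changeset name -> list of parent names
        self.order = []          # changesets in the order they were added
        self.last_txn_start = 0  # length of `order` before the last transaction

    def transaction(self, new_changesets):
        """Add several changesets at once, as a commit or a pull would."""
        self.last_txn_start = len(self.order)
        for name, parents in new_changesets:
            self.parents[name] = list(parents)
            self.order.append(name)

    def rollback(self):
        """Drop everything the last transaction added. Nothing is saved."""
        for name in self.order[self.last_txn_start:]:
            del self.parents[name]
        del self.order[self.last_txn_start:]

    def strip(self, target):
        """Remove target and all its descendants; return them as a bundle."""
        doomed = {target}
        for name in self.order:  # parents always precede children in `order`
            if any(p in doomed for p in self.parents[name]):
                doomed.add(name)
        bundle = [(n, self.parents[n]) for n in self.order if n in doomed]
        self.order = [n for n in self.order if n not in doomed]
        for n in doomed:
            del self.parents[n]
        self.last_txn_start = len(self.order)  # nothing left to roll back
        return bundle


repo = ToyRepo()
repo.transaction([("a", [])])                                 # a commit
repo.transaction([("b", ["a"]), ("c", ["a"]), ("d", ["b"])])  # a pull
bundle = repo.strip("b")  # removes b and its descendant d, keeps a and c
```

Applying the bundle again with repo.transaction(bundle) brings b and d back, whereas the changesets removed by repo.rollback() are gone for good.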

FAQ (last edited 2012-10-08 19:05:50 by mpm)