[PATCH 00 of 15 RFC] verify: partial verification to detect repository corruption immediately

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Wed Oct 3 11:38:57 CDT 2012


Sometimes, I assist Mercurial users with recovery from repository
corruption.

In many such cases, filelogs are completely corrupted: the header
does not contain a valid value, which causes an "unknown format" error.
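
To illustrate what "unknown format" means here, the following is a
minimal sketch of a header check. It is not taken from the patches. It
assumes that the first 4 bytes of a revlog index file hold a big-endian
word whose low 16 bits are the format version and whose high 16 bits
are flags. The constant values reflect my understanding of revlog.py at
the time of writing, and check_revlog_header is a hypothetical name.

```python
import struct

# Assumed constants, modelled on Mercurial's revlog.py (not authoritative).
REVLOGV0 = 0
REVLOGNG = 1
REVLOGNGINLINEDATA = 1 << 16
REVLOGGENERALDELTA = 1 << 17
KNOWN_FLAGS = REVLOGNGINLINEDATA | REVLOGGENERALDELTA


def check_revlog_header(data):
    """Return None if the header word looks valid, else a problem string.

    `data` is the raw content (or at least the first 4 bytes) of an
    index file. Data shorter than 4 bytes is reported as truncated.
    """
    if len(data) < 4:
        return "truncated header"
    (word,) = struct.unpack(">I", data[:4])
    fmt = word & 0xFFFF      # low 16 bits: format version
    flags = word & ~0xFFFF   # high 16 bits: feature flags
    if fmt == REVLOGV0:
        if flags:
            return "unknown flags %#x for format v0" % (flags >> 16)
    elif fmt == REVLOGNG:
        if flags & ~KNOWN_FLAGS:
            return "unknown flags %#x for RevlogNG" % (flags >> 16)
    else:
        return "unknown format %d" % fmt
    return None
```

With this sketch, a header overwritten with garbage such as
b"\xff\xff\xff\xff" is reported as "unknown format 65535".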

In addition, many users do not notice such corruption immediately:
IMHO, the dirstate cache reduces the need to access filelogs, and this
seems to hide the corruption from users.

This "silent corruption" may cause loss of file contents.

For example, if corruption happens just after adding a new file on one
branch, updating to another branch removes the newly added file from
the working directory, and its contents are lost.

But I also know that accessing filelogs frequently has a large impact
on runtime performance.

At first, I thought that verifying the repository just after a
commit/addgroup transaction could detect such corruption immediately,
because verification reads each revlog (changelog, manifest and
filelogs) from the filesystem again.

But full verification is expensive.

So, I tried to implement partial verification: verification only of
the specified revisions and the manifest/filelog entries related to them.
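
The scope of partial verification can be sketched as follows. This is
only an illustration on a toy data model, not the code in the patches:
the `changesets` dict, its keys, and partial_verify_targets are all
hypothetical names, and real Mercurial objects are not used.

```python
def partial_verify_targets(changesets, revs):
    """Collect what a partial verification would have to check.

    `changesets` is a toy stand-in for the changelog: it maps a
    changeset revision number to a dict with
      "manifest": the manifest revision that changeset refers to
      "files":    the list of file paths touched by that changeset
    `revs` is the iterable of changeset revisions to verify.

    Returns (manifest_revs, files), where `files` maps each touched
    path to the set of changeset revisions whose filelog entries
    should be checked.
    """
    manifest_revs = set()
    files = {}
    for rev in revs:
        entry = changesets[rev]
        manifest_revs.add(entry["manifest"])
        for path in entry["files"]:
            files.setdefault(path, set()).add(rev)
    return manifest_revs, files


# Toy history: three changesets, only the last two are to be verified.
toy_changelog = {
    0: {"manifest": 0, "files": ["a.txt"]},
    1: {"manifest": 1, "files": ["a.txt", "b.txt"]},
    2: {"manifest": 2, "files": ["b.txt"]},
}
targets = partial_verify_targets(toy_changelog, [1, 2])
```

In this toy example, only manifest revisions 1 and 2 and the filelog
entries of a.txt and b.txt linked to changesets 1 and 2 are in scope.
Revision 0 is never read, which is where the cost saving over full
verification would come from.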

This patch series is an RFC for introducing partial verification and
verification just after a commit/addgroup transaction.

This series consists of:

  - #1 ... #4
    trivial fixes around verification. These are not tightly related
    to partial verification

  - #5 ... #14
    patches for partial verification

  - #15
    patch for verification just after commit/addgroup transaction

This series is not yet well tested: I have only run the Mercurial test
suite with verification enabled just after commit/addgroup transactions.

Please comment especially on:

  - is this approach useful for detecting corruption immediately?

  - how should partial verification be invoked?

    - just after commit/addgroup transaction (as I implemented)
    - as one of 3rd party pretxn* hooks
    - as one of 3rd party commit/changegroup hooks
    - and so on ...

  - and of course, the ways to achieve partial verification
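
For the hook-based alternatives listed above, a Python hook could look
roughly like the sketch below. It relies on two things I am sure of:
Python hooks are called as hook(ui, repo, hooktype, **kwargs), and a
true return value makes a pretxn* hook abort the transaction.
Everything else is an assumption: make_verify_hook and the
`verify_revs` callable are hypothetical names, and `verify_revs` stands
in for whatever partial verification entry point this series ends up
providing.

```python
def make_verify_hook(verify_revs):
    """Build a Mercurial-style Python hook around a verifier.

    `verify_revs(repo, node)` is a hypothetical callable that checks
    the revisions added by the transaction, starting at `node`, and
    returns a list of problem descriptions (empty if all is well).
    """
    def hook(ui, repo, hooktype, node=None, **kwargs):
        if node is None:
            # Nothing was added, so there is nothing to verify.
            return False
        problems = verify_revs(repo, node)
        for problem in problems:
            ui.warn("partial verify: %s\n" % problem)
        # A true value tells Mercurial that the hook failed.
        return bool(problems)
    return hook
```

A module-level callable built with make_verify_hook could then be
registered in the [hooks] section of hgrc, for example as
pretxncommit.partialverify or pretxnchangegroup.partialverify, using
the usual "python:" prefix.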

