Converting from CVSNT
Michael Haggerty
mhagger at alum.mit.edu
Sun Jun 28 14:56:41 CDT 2009
Greg Ward wrote:
> On Thu, Jun 25, 2009 at 3:29 PM, Michael Haggerty<mhagger at alum.mit.edu> wrote:
>> With the latest trunk version of cvs2svn, it should also be possible to
>> convert directly from CVS to the hg-fastimport format by starting from
>> the "cvs2hg-example.options" options file. However, there haven't been
>> many reports about how well this works, so if you try this, please
>> report your experiences.
>
> Well, *I'm* the maintainer of hg-fastimport, and I have done one
> conversion using cvs2git (another of cvs2svn's personalities) and
> hg-fastimport. It worked, but it's slow and takes lots of memory. If
> you really want to make git look good, compare git-fast-import to hg
> fastimport. ;-(
>
> Part of the problem is hg-fastimport's inefficient handling of blobs
> (git-speak for "file contents") and part is that I recently overhauled
> hg-fastimport to use the 'convert' extension as its backend, and
> 'convert' keeps too much stuff in memory (so that it can toposort the
> changesets).

It is of course unnecessary to toposort the changesets read in
fastimport format, since they are necessarily already toposorted (I
think that the fastimport file format can't even represent a
non-toposorted graph because only backreferences are allowed). So I
suppose that the problem is that you are using a part of the convert
infrastructure that you don't actually need. That's unfortunate,
because the generator of the fastimport data has already done all that
work once (at great expense!).
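To illustrate the backreference point, here is a minimal Python sketch. It assumes a stream that uses only the exact-byte-count "data <n>" form, and it checks only mark references (":N") after "from" and "merge"; branch-name references are skipped. The function name only_backreferences is made up for this example and is not part of cvs2svn or hg-fastimport.

```python
import io


def only_backreferences(stream):
    """Return True if every ':N' parent mark was defined earlier in the stream.

    Hypothetical helper for illustration; handles only 'data <count>' payloads.
    """
    defined = set()
    while True:
        line = stream.readline()
        if not line:
            return True  # end of stream, no forward reference found
        line = line.rstrip(b"\n")
        if line.startswith(b"data "):
            # Skip the raw payload so its bytes are not parsed as commands.
            stream.read(int(line[5:]))
        elif line.startswith(b"mark "):
            defined.add(line[5:])
        elif line.startswith((b"from ", b"merge ")):
            ref = line.split(b" ", 1)[1]
            # Only mark references are checked; branch names are ignored.
            if ref.startswith(b":") and ref not in defined:
                return False
```

A stream that passes this check names every marked parent before the commit that uses it, so reading the commits in file order already visits them in a topological order.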
But if the convert extension is also doing a toposort itself when
converting from CVS, I don't quite understand why cvs2git+fastimport
should be much slower than hg convert. Is the cvs2git part just
pathetically slow?

By the way, I've done some work (not yet published) on changing cvs2git
to generate the revision contents much more efficiently (by using the
internal checkout code instead of calling "cvs co" each time). This
would not save so much time for cvs2hg because hg-fastimport requires
inline blobs, which therefore have to be in topological order. If this
is a limiting factor in cvs2git+fastimport conversions, please let me
know and I'll consider implementing something similar for cvs2hg mode.
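The difference between inline and mark-referenced blobs can be sketched as follows. This is a minimal illustration of the two file-modify forms in the fast-import stream format, not code from cvs2svn; the helper names are hypothetical and the file mode is fixed at 100644 for simplicity.

```python
def data_cmd(payload):
    """Build a 'data' command using the exact-byte-count form."""
    return b"data %d\n%s\n" % (len(payload), payload)


def blob_by_mark(mark, content):
    """A standalone blob that later commits can refer to as ':mark'."""
    return b"blob\nmark :%d\n" % mark + data_cmd(content)


def modify_by_mark(path, mark):
    """File-modify line that points back at a previously emitted blob."""
    return b"M 100644 :%d %s\n" % (mark, path)


def modify_inline(path, content):
    """File-modify line that carries the file contents inside the commit."""
    return b"M 100644 inline %s\n" % path + data_cmd(content)
```

With marks, all the blob commands can be written up front in whatever order is cheapest to generate, and commits refer back to them later. With inline, the contents have to be produced at the moment the commit that uses them is written.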
Michael