Huge manifest when converting the FreeBSD cvs repository
Benoit Boissinot
bboissin at gmail.com
Sat Sep 1 17:04:56 CDT 2007
On 9/1/07, Ulrich Spoerlein <uspoerlein at gmail.com> wrote:
> On Sat, 01.09.2007 at 18:45:33 +0200, Benoit Boissinot wrote:
> > On 9/1/07, Ulrich Spoerlein <uspoerlein at gmail.com> wrote:
> > (I did this calculation with the .i given to me by Simon some weeks ago,
> > I think it is the same repo)
>
> It is. After not hearing from Simon for some time, I thought of speeding
> analysis up by posting it directly.
>
> > To me it looks like a bug in the converter:
> >
> > this shows how the manifest grows:
> > http://perso.ens-lyon.fr/benoit.boissinot/foo.png
> > I suspect that in the middle the converter is doing a lot of switches
> > between branches.
>
> So? Is that bad? Should it convert/commit branches one by one or in
> chronological order?
>
In hg, revisions are stored as follows:
- the first revision is stored as the full data
- the second is a delta against the first revision
- the third is a delta against the second revision
and so on, until length(last full revision + the delta chain) >
2*length(full data); in that case we store the full data again.
So the "best" order in which to store the data is the linear order: you
should avoid switching between unrelated branches (it makes each delta
bigger, inserts full revisions more frequently, etc.).
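
As a rough illustration of the rule above, here is a minimal sketch in
Python. It is not Mercurial's actual code: the function name plan_storage
and the (full_len, delta_len) input format are made up for the example,
and it assumes the candidate delta counts towards the chain length when
the 2x threshold is checked.

```python
def plan_storage(revs):
    """Decide, per revision, whether to store a full snapshot or a delta.

    revs: list of (full_len, delta_len) tuples in storage order, where
    full_len is the size of the revision's full text and delta_len is the
    size of its delta against the previously stored revision.
    (Hypothetical input format, chosen for this illustration.)
    """
    plan = []
    chain_len = 0  # bytes of the last full snapshot plus the deltas after it
    for i, (full_len, delta_len) in enumerate(revs):
        # Assumption: the new delta is included in the chain length check.
        if i == 0 or chain_len + delta_len > 2 * full_len:
            plan.append(("full", full_len))
            chain_len = full_len
        else:
            plan.append(("delta", delta_len))
            chain_len += delta_len
    return plan


# Linear history: small deltas, so the chain rarely hits the threshold.
linear = plan_storage([(1000, 10)] * 50)
# Jumping between unrelated branches: every delta is large.
switching = plan_storage([(1000, 600)] * 50)

for name, plan in (("linear", linear), ("switching", switching)):
    fulls = sum(1 for kind, _ in plan if kind == "full")
    total = sum(size for _, size in plan)
    print(name, "full snapshots:", fulls, "bytes stored:", total)
```

With these made-up sizes the switching order stores a full snapshot every
other revision, which is the kind of growth the first graph shows.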
> > this shows the size we inserted for each revision, it wins the "most
> > weirdest graph award":
> > http://perso.ens-lyon.fr/benoit.boissinot/foo2.png
>
> Is that bytes/changeset? How did you extract those statistics? I would
> be willing to run the conversion on different combinations of branches,
> perhaps I can pinpoint the culprit. I'd also like to see what happens
> when converting the ports tree :)
Both graphs are made from the output of 'hg debugindex'.
The first one uses the offset field (i.e. how much data was written
before this cset), and the second uses the length field (how much data
we store for this rev).
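
A minimal sketch of how those two fields could be pulled out for
plotting, in Python. It assumes you have already captured the text
printed by 'hg debugindex' and that it starts with a header line naming
"offset" and "length" columns; the column layout differs between
Mercurial versions, and the sample text below is invented, not real
output.

```python
def parse_debugindex(text):
    """Return a list of (offset, length) pairs, one per revision.

    The column positions are looked up from the header line, so extra
    columns are ignored. Assumes whitespace-separated columns.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    offset_col = header.index("offset")
    length_col = header.index("length")
    rows = []
    for line in lines[1:]:
        fields = line.split()
        rows.append((int(fields[offset_col]), int(fields[length_col])))
    return rows


# Invented sample in the assumed layout (not taken from a real repository).
sample = """\
   rev    offset  length   base linkrev nodeid       p1           p2
     0         0     120      0       0 aaaaaaaaaaaa 000000000000 000000000000
     1       120      30      0       1 bbbbbbbbbbbb aaaaaaaaaaaa 000000000000
     2       150     200      2       2 cccccccccccc bbbbbbbbbbbb 000000000000
"""

rows = parse_debugindex(sample)
offsets = [offset for offset, _ in rows]  # data for the first graph
lengths = [length for _, length in rows]  # data for the second graph
print(offsets)
print(lengths)
```

Plotting offsets against the revision number gives the growth curve, and
plotting lengths gives the per-revision size.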
I have the Python code to reorder the DAG to optimize the layout
somewhere on my disk, if you'd like to test it.
regards,
Benoit