Performance with binary-heavy repositories
Matt Mackall
mpm at selenic.com
Fri Aug 3 10:27:59 CDT 2007
On Fri, Aug 03, 2007 at 12:07:21PM +0200, Christoph.Spiel at partner.bmw.de wrote:
> Bryan -
>
> > Could you give me an idea of the
> > sizes of your files, please?
>
> I give you even more. ;) Here comes a histogram of sizes.
And I'll add a column here of cumulative size:
> Size/Bytes  Occurrences     Total
> ==========  ===========  ========
>       5977         1355   8098835
>      46882          108  13162091
>      84918           42  16728647
>     107234           18  18658859
>     144558           13  20682671
>     196372            6  21860903
>     245490            2  22351883
>     256062            3  23120069
>     320656            3  24082037
>     450022            1  24532059
>     737280            2  26006619
>     975360            1
>    1167872            1
>    1624576            1
>    2211809            1
>    2694375            1
>    5317610            1
>    5505148            1
>   12460544            1
>   14458072            1
>   24047618            1
>   27227648            1
What this tells us is that the 11 files in the tail of your
distribution completely swamp the 1553 files at the head in terms of
total size. Even with a linear algorithm, you'd spend more time
compressing that last file (27227648 bytes) than the first 1553
combined (26006619 bytes, per the Total column).
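For anyone who wants to redo the arithmetic, here is a rough Python sketch. The (size, count) pairs are copied from the histogram above; the names `histogram`, `cumulative` and `split_totals` are made up for this example and are not part of Mercurial. One caveat: multiplying out the 144558 x 13 row gives running totals that are 144558 bytes lower than the Total column from that row on (the column matches a count of 14), so the head figure printed here will differ slightly from the table. The conclusion is the same either way.

```python
# (size in bytes, number of files) pairs copied from the histogram above.
histogram = [
    (5977, 1355), (46882, 108), (84918, 42), (107234, 18),
    (144558, 13), (196372, 6), (245490, 2), (256062, 3),
    (320656, 3), (450022, 1), (737280, 2), (975360, 1),
    (1167872, 1), (1624576, 1), (2211809, 1), (2694375, 1),
    (5317610, 1), (5505148, 1), (12460544, 1), (14458072, 1),
    (24047618, 1), (27227648, 1),
]


def cumulative(rows):
    """Return the running total of size * count after each row."""
    totals, running = [], 0
    for size, count in rows:
        running += size * count
        totals.append(running)
    return totals


def split_totals(rows, tail_rows):
    """Return ((files, bytes) for the head, (files, bytes) for the tail).

    The tail is the last `tail_rows` rows; the head is everything before.
    """
    def summarize(part):
        files = sum(count for _, count in part)
        nbytes = sum(size * count for size, count in part)
        return files, nbytes

    return summarize(rows[:-tail_rows]), summarize(rows[-tail_rows:])


head, tail = split_totals(histogram, 11)
largest = histogram[-1][0]

print("head: %d files, %d bytes" % head)
print("tail: %d files, %d bytes" % tail)
print("largest file alone exceeds the head: %s" % (largest > head[1]))
```

The point of splitting at 11 rows is just to mirror the head/tail division used in the paragraph above; any other cut-off can be passed as `tail_rows`.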
It'd be interesting to get a graph of bdiff performance against file
size.
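A minimal sketch of how such timings could be collected, in Python. Big assumption: `stand_in_diff` below is a placeholder built on the standard library's `difflib`, not Mercurial's bdiff, and its scaling behaviour will not match bdiff's. The idea is that the real routine gets passed to `time_diff` in its place. The helper names (`make_pair`, `time_diff`, `_timed`) and the random test data are likewise invented for illustration; real repository files would be more telling.

```python
import difflib
import os
import time


def stand_in_diff(old, new):
    """Placeholder delta routine (NOT Mercurial's bdiff): line-based opcodes."""
    a, b = old.splitlines(True), new.splitlines(True)
    return difflib.SequenceMatcher(None, a, b, autojunk=False).get_opcodes()


def make_pair(size, changes=16):
    """Return two random byte strings of `size` bytes that differ in a few spots."""
    old = bytearray(os.urandom(size))
    new = bytearray(old)
    step = max(1, size // changes)
    for i in range(0, size, step):
        new[i] ^= 0xFF  # invert one byte every `step` bytes
    return bytes(old), bytes(new)


def _timed(diff, old, new):
    """Return the wall-clock seconds taken by one call to diff(old, new)."""
    start = time.perf_counter()
    diff(old, new)
    return time.perf_counter() - start


def time_diff(diff, sizes, repeat=3):
    """Return [(size, best_seconds)] for diff(old, new) at each input size."""
    results = []
    for size in sizes:
        old, new = make_pair(size)
        best = min(_timed(diff, old, new) for _ in range(repeat))
        results.append((size, best))
    return results


if __name__ == "__main__":
    # Powers of two from 1 KB to 64 KB; extend the range for the real routine.
    for size, seconds in time_diff(stand_in_diff, [1 << k for k in range(10, 17)]):
        print("%8d bytes  %.6f s" % (size, seconds))
```

The output is one (size, seconds) pair per line, which can be fed straight to whatever plotting tool is handy. Taking the best of a few repeats is a cheap way to damp scheduling noise.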
--
Mathematics is the supreme nostalgia of our time.