Server load while cloning

Greg Ward greg-hg at gerg.ca
Wed May 6 11:42:45 CDT 2009


I'm still playing around with cloning my large Hg repo converted from
a large CVS repo.  The stats: ~100k changesets, ~26k files, ~850 MB.
The problem: cloning over HTTP imposes heavy CPU load on the server
for several minutes.  Specifically, using my desktop PC (3 GHz Pentium
4, 2 GB RAM, 1 SATA disk) as the server and a different machine on the
LAN as client, cloning over HTTP takes ~7-9 minutes, during which the
server-side Mercurial process is above 80% CPU usage for most of the
time.  (It consumed about 6 min of CPU time during one clone.)

In contrast, sending an uncompressed 850 MB tarball of the repository
over a plain, unencrypted connection took 25 sec, of which ~12 sec was
CPU time.  Sending the same tarball over SSH took ~50 sec, of which
~16 sec was CPU time.
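
To put those timings side by side, here is a rough sketch that turns
them into an effective rate (repository size divided by wall-clock
time).  The 8-minute figure is an assumed midpoint of the ~7-9 minute
range, and the function and variable names are made up for
illustration.  An HTTP clone does not necessarily put all 850 MB on
the wire, so the first number is only a measure of how quickly the
repository arrives, not of network throughput.

```python
# Back-of-envelope comparison using the timings quoted above.
REPO_MB = 850  # approximate repository size


def effective_mb_per_s(size_mb, seconds):
    """Repository size delivered per second of wall-clock time."""
    return size_mb / seconds


# Wall-clock seconds per transfer; 8 min is an assumed midpoint of 7-9 min.
transfers = {
    "hg clone over HTTP": 8 * 60,
    "plain tarball": 25,
    "tarball over SSH": 50,
}

for name, seconds in transfers.items():
    rate = effective_mb_per_s(REPO_MB, seconds)
    print(f"{name}: {rate:.1f} MB/s")
```

On these numbers the plain tarball arrives roughly 19 times faster
than the HTTP clone.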

I'm worried about what will happen when we go live with Hg: if I tell
everyone, "OK, make your clones and get to work!", then the poor
server will be inundated with 25 concurrent clone requests, the clones
will crawl along from resource contention on the server, and
everyone's first impression of Hg in production use will be less than
stellar.
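
A rough lower bound on how bad that could get, as a sketch only: it
assumes each clone costs the ~6 min of server CPU time measured
above, that the server has a single core, and that CPU is the only
bottleneck (disk and network contention would make things worse).
The function name and the `cores` parameter are hypothetical.

```python
def min_wall_minutes(clones, cpu_min_per_clone, cores=1):
    """Lower bound on wall-clock minutes for a CPU-bound server.

    Total CPU work is clones * cpu_min_per_clone; spread over `cores`
    cores, no schedule can finish all clones sooner than this.
    """
    return clones * cpu_min_per_clone / cores


# 25 concurrent clones at ~6 CPU-minutes each on one core.
print(min_wall_minutes(25, 6))           # 150.0
# Same load if the server had two cores (an assumption, for comparison).
print(min_wall_minutes(25, 6, cores=2))  # 75.0
```

Under those assumptions the last clone could not finish in less than
about two and a half hours.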

The obvious workaround is, "Don't clone, stupid, just download a
tarball of the repository".  I was wondering if anyone else has seen
similar behaviour, how you have worked around it, and what the
problems might be with downloading a tarball of a live repository.

Thanks --

Greg

