ancestor() multiple-valued (long)
Pierre Asselin
pa at panix.com
Sun Jan 18 19:22:13 CST 2009
While trying to figure out the exact semantics of 'merge' I think
I found a problem.
The nodes in the revision graph are partially ordered by the relation
"x<y iff x is a descendent of y". I see a widespread assumption in the
code and the docs that any two nodes have a least upper bound under this
ordering, but this is not true. Consider the following revision graph:
+--A1---+
/ \
+--A--+ +------AB1
/ \ /
/ +--A2-- / ---+
fork+ / \
\ +--B1-+ +-AB2
\ / /
+--B--+ /
\ /
+--B2------+
The common ancestors of AB1, AB2 are the "tuning fork" [fork, A, B].
A and B are minimal common ancestors (no other common ancestors descend
from them) but neither is a least common ancestor (descendent of every
other common ancestor). What happens when you merge AB1 and AB2 ?
The code in ancestor.py picks among A and B nondeterministically,
whichever is first to come out of a 'generation' dictionary.
The script below creates two repositories 'ambiguous-one' and
'ambiguous-two' with the above revision graph, such that the
'debugancestor AB1 AB2' differ. Please confirm, I'm not sure how
reproducible this is but I get it everytime:
$ hg -R ambiguous-one log --template '{node|short}: {tags}\n' -r A -r B
2a959d4069c5: A
b6bb7e77431c: B
$ hg -R ambiguous-two log --template '{node|short}: {tags}\n' -r A -r B
2a959d4069c5: A
b6bb7e77431c: B
$ hg -R ambiguous-one debugancestor AB1 AB2
1:2a959d4069c5a536e1668ae68a1fcb862e0f37e4
$ hg -R ambiguous-two debugancestor AB1 AB2
1:b6bb7e77431c8e7453b7ac61070163f282713814
This would only be a semi-bug, except that 'hg merge' makes multiple calls
to ancestor.ancestor() and gets inconsistent results. If I attempt the
merge in ambiguous-two I get:
$ cd ambiguous-two
$ hg --debug merge
merge: warning: conflicts during merge
resolving manifests
overwrite None partial False
>>> ancestor b6bb7e77431c local 48111963ff2f+ remote 7df35e5fb308
searching for copies back to rev 3
testfile: versions differ -> m
picked tool 'filemerge' for testfile (binary False symlink False)
merging testfile
>>> my testfile at 48111963ff2f+ other testfile at 7df35e5fb308 ancestor testfile at 2a959d4069c5
merging testfile failed!
0 files updated, 0 files merged, 0 files removed, 1 files unresolved
There are unresolved merges, you can redo the full merge using:
hg update -C 8
hg merge 5
Now IMHO we have a bug. There is a superficial resemblance to issue
1327 but this is different: the trigger of 1327 is not present and,
unlike 1327, the merge base chosen in the end *is* a common ancestor of
the two changesets being merged. Still, the merge base could vary from
file to file. It's not fatal, but it just doesn't seem right.
(If your revision graph looks like the above your organization has
communication problems, but those are beyond Mercurial's jurisdiction;
plus, you never know what you'll get when you pull. I think this is a
real bug.)
In the example above, neither A nor B is the right merge base for AB1
and AB2. The best one would be the merge of A and B, if it existed.
That said, both A and B are probably good enough, so an acceptable
band-aid would be to break ties in an arbitrary but consistent manner,
preferably independent of accidents like the local changeset order
in the repo.
For example, in ancestor.ancestor(), intersect matching generations until
the result is nonempty. If the intersection has more than one element,
arbitrarily pick the one with the smallest changeset hash. Easy to
say, but I don't know enough Mercurial (or python?) to disentangle the
various class.ancestor()'s that call into ancestor.py and how to pass
the changeset info down that pipe.
The documentation of the various .ancestor()'s should also be changed
to say "return a minimal common ancestor of nodes a and b or None if
there is no such ancestor; if many such ancestors exist, choose one
arbitrarily but consistently across calls."
Test script follows.
--------------
#! /bin/sh
set -x
rm -rf ambiguous-one ambiguous-two
hg init ambiguous-one
cd ambiguous-one
for i in 1 2 3 4; do echo base $i >> testfile; done
hg add testfile
hg commit -m fork
hg tag -l fork
for I in A B; do
hg update --clean --rev fork
sed -i -e 2achange\ $I testfile
hg commit -m $I
hg tag -l $I
for J in 1 2; do
hg update --clean --rev $I
sed -i -e 3asubchange\ $I$J testfile
hg commit -m $I$J
hg tag -l $I$J
done
done
for J in 1 2; do
hg update --clean A$J
hg merge --rev B$J
# "fix" conflicts
sed -i \
-e '/^<<<<<<</d' -e '/^>>>>>>>/d' \
-e '/^=======/d' -e '/^|||||||/d' \
testfile \
;
hg commit -m AB$J
hg tag -l AB$J
done
cd ..
hg init ambiguous-two
for rev in B AB1 AB2; do
hg --repository ambiguous-one push --force --rev $rev ambiguous-two
done
#
# Hack, copy the tags for convenience.
cp ambiguous-one/.hg/localtags ambiguous-two/.hg
#
# Put the new repo on the same rev as the first.
(cd ambiguous-two && hg update --rev AB2)
set +x
echo
echo '#######################################################################'
echo Consistency check: ambiguous-one and ambiguous-two have same changesets
for action in incoming outgoing; do
cmd="hg --repository ambiguous-one $action ambiguous-two"
echo $cmd
eval $cmd
done
echo
echo Ancestor of AB1 and AB2:
for repo in ambiguous-one ambiguous-two; do
echo -n $repo:\
hg --repository $repo debugancestor AB1 AB2
done
exit 0
--
pa at panix dot com
More information about the Mercurial
mailing list