[darcs-users] Re: current status of darcs

David Roundy droundy at darcs.net
Sat Mar 10 00:16:42 UTC 2007


On Fri, Mar 09, 2007 at 02:32:00PM -0600, John Goerzen wrote:
> On 2007-03-09, David Roundy <droundy at darcs.net> wrote:
> > The basic idea is that we're going to have to give up the simplicity of
> > storing all changes as a linear sequence.  We'll have some sort of a
> > tree of primitive changes, with some branches "cancelled".  If there's
> > more than one branch that hasn't yet been cancelled, you've got a
> > conflict in your repository.  It's not as beautiful as conflictors
> > would be, and doesn't fit so well with the current code base, but it's
> > intuitively obvious that this approach describes a solution, and that
> > this solution can be represented in the existing commutation framework.
> 
> I'm not sure that I follow entirely here, but what you're describing
> sounds a lot like Mercurial, I think.
> 
> In Mercurial, your tree grows from the ground.  Each commit has a
> parent.  One parent can have multiple children; such a situation is a
> fork/branch.  If a patch has 2 parents, it is a merge.  Each patch has 1
> or 2 parents.
> 
> The topmost patch in a simple nonbranching tree is the head.  When you
> create a second branch, you now have two heads.  Heads represent
> unmerged changes.  When you merge, you reduce the number of heads in the
> repository by 1.

The difference (I believe) is that in darcs the tree representation would
just be a convenient representation of a set of changes, while the
Mercurial tree itself is the history.  I'm not expressing myself very
clearly, but the difference is that in Mercurial, the shape of the tree and
the order of the changes in the tree mean something.  In darcs, they do not
(modulo dependency constraints).  This is why darcs supports first-class
cherry picking and Mercurial (so far as I can see) does not.  Similarly, it
is why Mercurial records a historical sequence of repository states
(which can indeed be useful) and darcs does not.
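
To make the "set of changes, modulo dependency constraints" idea concrete,
here is a minimal sketch in Python.  It is not darcs code: the patch names,
the `deps` table and the functions `closure` and `cherry_pick` are all
hypothetical, and real darcs derives dependencies from commutation rather
than from an explicit table.

```python
# Hypothetical model: a repository is just a set of patch names, and
# `deps` maps each patch to the patches it directly depends on.

def closure(patch, deps):
    """Return the patch together with everything it transitively needs."""
    needed = set()
    stack = [patch]
    while stack:
        p = stack.pop()
        if p not in needed:
            needed.add(p)
            stack.extend(deps.get(p, ()))
    return needed


def cherry_pick(repo, patch, deps):
    """Pull one patch into repo, dragging along only its dependencies."""
    return frozenset(repo) | closure(patch, deps)


deps = {
    "fix-typo": set(),
    "add-parser": set(),
    "use-parser": {"add-parser"},  # cannot exist without add-parser
}

alice = frozenset({"fix-typo"})
picked = cherry_pick(alice, "use-parser", deps)

# The same patches pulled in the opposite order give an equal repository,
# because the model has no notion of history beyond the dependencies.
same = cherry_pick(cherry_pick(frozenset(), "use-parser", deps),
                   "fix-typo", deps)
```

In this model `picked` and `same` compare equal, which is the sense in
which the order of changes carries no meaning.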

But the representation, or structure, is indeed very similar.  It's just
that darcs would only have a branch when there is a conflict, not for
ordinary merges, and the branch would only involve the primitive changes
that conflict (e.g. the conflicting lines) rather than the changesets.
Another difference is that there isn't a unique tree describing a given
state of the repository in darcs.
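
The "tree of primitive changes with some branches cancelled" can be
sketched as follows.  This is only an illustration under assumptions, not
the planned darcs representation: the `Node` class and the functions
`live_leaves` and `in_conflict` are hypothetical names, and a primitive
change is reduced to a descriptive string.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    change: str                # a primitive change, e.g. a single hunk
    cancelled: bool = False    # True once a resolution has discarded it
    children: list = field(default_factory=list)


def live_leaves(node):
    """Return the tips of all branches that have not been cancelled."""
    if node.cancelled:
        return []
    live = [leaf for child in node.children for leaf in live_leaves(child)]
    # If every child branch is cancelled, this node is itself the tip.
    return live or [node]


def in_conflict(root):
    """More than one uncancelled branch means the repository conflicts."""
    return len(live_leaves(root)) > 1


# Two primitive changes touch the same line, so the tree branches there.
root = Node("line 3 is 'foo'")
left = Node("line 3: 'foo' -> 'bar'")
right = Node("line 3: 'foo' -> 'baz'")
root.children = [left, right]

before = in_conflict(root)   # both branches are still live

right.cancelled = True       # the user resolves in favour of 'bar'
after = in_conflict(root)
```

Note that the branch point involves only the conflicting primitive
changes, not the named patches that contain them.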

And, of course, in darcs the tree won't be user-visible as such, since it's
a tree of primitive patches... and won't be pretty.  But what *will* be
visible to users are the "conflicting" changes that they have to choose
between, and of course they will/should be able to look at what changes
were made in each named patch, and which of those conflicted and got
cancelled, etc.

> It all is very much like darcs on a certain level, and very much
> different than darcs on another, and I'm not sure I can quite verbalize
> that just yet.
> 
> But darcs essentially has a similar concept, with the difference that it
> is very unusual to have more than one head in a repo at a given time.
> That is, in a single repo, you usually merge as soon as there is a
> conflict (and sometimes a merge doesn't require a commit).

Right.  Currently a "conflicted" repository requiring a resolution is
purely a user-interface feature in darcs.  With the new conflict code,
it'll be at the core of darcs.

In fact, I consider the invisibility of (ordinary) merges an *extremely*
important feature.  It greatly affects how the code scales to large numbers
of developers, as an approach like Mercurial's that records everything
everyone does generally leads to a number of commits that is O(N^2) in the
number of active developers, in the limit that pulls happen regularly (or
changes are pushed rarely, but most developers have recorded changes that
aren't in the central repo, i.e. they are active).  That isn't an unlikely
limit.
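
The O(N^2) estimate can be illustrated with a deliberately crude counting
model.  The assumptions are mine, not measurements: in each round every
developer records one change and then pulls once from every other
developer, and in a history-recording system each such pull of divergent
work records one merge commit.  The function name `commits_per_round` is
hypothetical.

```python
def commits_per_round(n_devs, merges_recorded):
    """Count commits created in one round under the crude model above."""
    changes = n_devs  # one recorded change per developer
    # Each developer merges with each of the other n_devs - 1 developers.
    merges = n_devs * (n_devs - 1) if merges_recorded else 0
    return changes + merges


for n in (5, 10, 20):
    print(n, commits_per_round(n, True), commits_per_round(n, False))
```

Under these assumptions the total is N + N(N-1) = N^2 when merges are
recorded and N when they are invisible.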

> I wonder how close you feel what you're proposing is to what Mercurial
> is doing?
> 
> http://upload.wikimedia.org/wikipedia/en/a/ac/Hgk.png
> 
> shows the Mercurial repo viewer which illustrates this, though this is
> an extremely complex example.

In many ways quite different.  Really only similar in that it's a tree or
DAG describing a bunch of changes.  In the case of darcs, the structure
won't be related to the history, and the changes will be primitive changes,
so the nodes won't in general be meaningful states (i.e. states that
anyone's repository was ever in).

Also, I think we'll probably go with a tree rather than a DAG, mostly
because the tree seems simpler to deal with.  We could represent the same
information in a DAG, but we still couldn't avoid duplicate information,
and it's just more complicated.  Igloo thinks (or thought?) we should go
with a DAG, but I'm not convinced.
-- 
David Roundy
Department of Physics
Oregon State University


