[darcs-users] so long and thanks for all the darcs
Ben Franksen
ben.franksen at online.de
Mon Mar 19 23:40:58 UTC 2018
On 19.03.2018 at 09:12, Stephen J. Turnbull wrote:
> Ben Franksen writes:
> > Certainly. But I think we should distinguish here between public
> > (shared) branches and unpublished (local, one developer)
> > branches. The latter are unavoidable in practice and I would not
> > use anything that did not allow me to have several local branches
> > for the different things I am working on (in parallel or
> > intermittently). The former can be problematic but (at least with
> > Darcs, as it currently is) the main problem IME is not merge
> > conflicts but discoverability!
>
> Sure. You can't get into a conflict with a branch you never tried to
> merge because you didn't know about it. I think what you're seeing
> here is a form of selection bias.
Probably. I was speaking from my personal experience.
> It will be interesting to see if
> patch theory really is powerful enough to keep conflicts manageable
> when you're looking at something like git.kernel.org with 25 core
> developers maintaining an average of 2.5 branches each, and fans
> feverishly trying all possible merges. :-) My guess is "no", but it
> would be very cool if I were proved wrong!
Don't get me wrong. The way Darcs handles conflicts is not in any way
optimal. There are fundamental problems with representing conflicts with
patches in the way Darcs does it. I have not yet seen a patch theory
that really solves these problems. I can't go into the details here; that
would side-track us too much.
> > FWIW, I think that pulling from a repo with more than one branch should
> > fail if no branch is given explicitly and no default has been specified
> > locally. With an error message that says "Sorry, I have no idea which
> > branch you want to pull from"
>
> That is what happens with git.
>
> > followed by a list of available branches or a hint to which command
> > to use to list them.
>
> I don't think this is possible with raw git on a remote repository. I
> believe you need to fetch all the remote refs, and query locally.
In Darcs we have to query the remote repo anyway. You don't want to
transfer patches that are already present at the other end. I am sure
git has a way to avoid sending commits that the remote already has. But
git chooses not to clone all the refs by default, and for a reason: it
would have to pull all the referenced commits, too, and that is costly.
Not so in Darcs.
> > Perhaps, what drives the complexity of the branch handling over the
> > edge in git is that they chose to give local names to remote
> > branches.
>
> Well, git simply doesn't "do" remote branches the way that Mercurial
> and especially Bazaar do. Yes, you could enhance git to have a
> distributed DAG quite easily, but what users manipulate as "branches"
> are nothing more than local variables pointing to head commits.
Okay.
> So
> the solution the git developers came up with was providing namespaces
> (called "remotes" in git documentation) so that one name could refer
> to several heads at the same time. (In practice, there's no way to
> change the default namespace, so to refer to a name in a non-default
> namespace you need to spell out the remote, e.g., "origin/test"
> vs. "test" in the example you gave.)
This is what I mean. Even following mentally what you wrote here gives
me headaches. Not because of its complexity per se, but because of
*unnecessary* complexity.
> What you described as "linking at clone time" is exactly what git
> does:
AFAIR you must supply some obscure options in order to get all the
remote refs. But I understand what you mean.
> it automatically copies the specified branch ref (default
> "master") from the "origin" namespace to the default (unnamed)
> namespace. It is strongly discouraged, though not impossible, to
> change refs in a remote's namespace locally.
Yeah, discourage the feature, but first add it because it's oh so cool
and cheap, making everyone's life difficult because they now all have to
cope with the resulting complexity.
Sorry, I'm a bit grumpy today.
> > Git complains, too, IIRC, and as usual with its own rather cryptic
> > language (the "you are in detached HEAD..." sermon).
>
> But to get that message you need to explicitly checkout a commit that
> is not the target of a branch ref.
A tag, for instance.
> I believe the difference in philosophy is that git expects your object
> database manipulations to be entirely local for speed reasons. The
> only remote operations are fetch and push (pull = fetch + merge).
> This requires that you have a "handle" for the fetched commits, which
> may as well be a ref. SHA1s are awkward and typo-prone,
Copy & paste? It's 2018, not the 1970s.
> > > My point is that in git, there is no conflict until you explicitly ask
> > > for a merge of the branches. They can coexist indefinitely.
> >
> > This is the same in Mercurial, technically speaking, even though it
> > might permanently nag you about it ;)
>
> The nagging matters, though. ;-)
In practice it does, yes. I meant that the "kernel" has no problems with it.
> > > - What happens to the old patch when you do "darcs amend"?
> >
> > It continues to swim in the large pool of patches (more precisely:
> > patch representations) consisting of your repo, related repos, and
> > the cache.
>
> As I understand it, the patch knows about its dependencies, right?
Only the explicit ones. The implicit dependencies are, well, implicit. A
patch depends directly on some other patch if they are adjacent and
cannot be commuted. Indirect dependencies arise when there is no way to
commute patches so that they become adjacent. So, if you want to know
precisely, you must go ahead and try to commute them (except in some
simple cases where the result is obvious), and before you can do that
you might have to do other commutations etc.
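To make the idea of implicit dependencies concrete, here is a toy sketch in Python. It is not how Darcs does it: it assumes a deliberately crude model in which a patch is just a name plus the set of files it touches, two patches commute exactly when those sets are disjoint, and commuting never changes a patch. The names `commutes` and `dependencies` are made up for this illustration. Under those assumptions, a patch depends on every earlier patch it fails to commute with, plus everything those patches depend on.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Toy model (an assumption, not Darcs' representation):
# a patch is (name, set of files it touches).
Patch = Tuple[str, FrozenSet[str]]


def commutes(a: Patch, b: Patch) -> bool:
    """In this toy model two patches commute iff they touch disjoint files."""
    return a[1].isdisjoint(b[1])


def dependencies(history: List[Patch]) -> Dict[str, Set[str]]:
    """Map each patch name to the names of all earlier patches it depends on.

    A patch depends on every earlier patch it cannot commute with, and
    (transitively) on whatever those patches depend on.
    """
    deps: Dict[str, Set[str]] = {}
    for j, q in enumerate(history):
        found: Set[str] = set()
        for p in history[:j]:
            if not commutes(p, q):
                found.add(p[0])
                found |= deps[p[0]]  # already computed, since p is earlier
        deps[q[0]] = found
    return deps
```

With a history of A (touches x), B (touches y), C (touches x and y), D (touches z) and E (touches y), this reports that E depends on A although the two touch no common file: E cannot get past C, and C cannot get past A. Real Darcs has to actually attempt the commutations, because commuting can change the patches involved.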
> So
> if you've been diligent about recording semantic dependencies, you
> should be able to reconstruct the feature the patch helps implement.
>
> Of course humans can't do that consistently or accurately, so in
> practice you need some luck if you don't have an inventory with that
> patch to establish "close to correct" context.
>
> Do I have that right?
Yes, I think so.
> > (As an aside, this means you could emulate version-based VCSes by
> > always explicitly depending on all existing patches when you
> > record. This proves that Darcs is strictly more powerful than git
> > or Mercurial, since you can't emulate Darcs with them.)
>
> That's very cute!
:-)
> Without contesting that basic fact, let me think about the
> implementation a little...
>
> I'll grant that you can recreate the DAG in this way, but that's a far
> cry from emulating git or Mercurial. You'd need to add branch refs,
> but that's surely trivial.
Just use a subdirectory with one clone for each branch; we're talking
"in principle", right?
What would be the analogue of merges? Well you could pull patches from
one branch to another and then record a tag to mark the point where the
two branches join. In the case of a conflict, the resolution patch
serves the same purpose.
> I suspect you'd have some work to do to
> even get logging right, since that presumably is based on inventories,
> not on the dependency poset.
Doesn't matter. The dependencies (if done in this rather un-Darcsy way)
enforce a single linear sequence per branch/repo and the inventory
merely reflects that one fixed order...
> I don't know what "DAG" means if Darcs
> is going to go around commuting patches, so I guess you have in mind
> some sort of restricted mode
It *cannot* commute patches any longer because all the explicit
dependencies would forbid it.
The point is, you don't need any "new" restrictions, they are already
there, all you need is a slightly different set of defaults.
> -- I'm not sure it's fair to call that "Darcs". :-)
It would be a deliberately crippled version of Darcs. But note that it
would /still/ be more powerful because you could decide to override the
default or you could amend patches later and remove some of the
dependencies.
> Offhand, I can mention submodules (ie, attaching a separate repo
> instead of a tree to represent a directory),
Yes that's something we do not support yet. Though I'd say the existing
support in git is of the shallow sort. IIUC it's more or less a file
with some associations between subdirectories and subrepos (plus some
information about their remotes) and the normal git commands ignore
submodules completely. Correct me if I am wrong.
BTW, a project I occasionally work on but very often work with (at work)
recently decided to split into several submodules. This led to general
and widespread confusion about how to handle these submodules and lots
of criticism from users and contributors.
> and git's filter-branch capabilities.
I don't know anything about that feature.
> I'm not sure what submodules would mean in the context
> of Darcs, which doesn't have the concept of tree as far as I know.
Huh? Of course it has. If you mean tree as in "tree of files and dirs
that make up a version". But the question is still a good one and I don't
have an answer ready.
How does git cope with a conflict between a module and a submodule? Say
I have a submodule in a directory x and I add a file to the parent
module with name x/y.
Anyway, the thing about Darcs being "strictly more powerful" was meant
in a theoretical way. In practice it would be extremely difficult and a
lot of pointless work to /exactly/ emulate something like git or
Mercurial. The point is that if you want you can forego commutation,
append a line 'record --ask-deps' to your ~/.darcs/defaults and always
answer the dependency question with 'a' (for 'all').
> Which reminds me: I've long thought it might be an interesting
> experiment to use git's object database as a backing store for Darcs.
> The idea is to add a patch object type, which would contain a
> dependency list and a representation of the patch. You'd need at
> least two subtypes. diff-style patches would be represented by a pair
> of tree IDs, and other types of patches would contain a script, which
> would allow a crude token-replace implemented with sed or awk as well
> as rename and copy operations. If you're willing to forego
> token-replace, it's possible to represent rename and copy (and
> creation and removal of empty directories) with a tree pair. I guess
> you'd also want an inventory object type.
>
> Most diff vs. diff commutes would be extremely fast, since the
> intersection of changed objects in two patches would be null, and
> you'd never need to look at the content of patches.
Commuting hunks in Darcs is already fast. We do optimize and handle the
case of hunks in different files quickly (there is no need to change the
patch rep in this case; we say they commute "trivially"). When they are
in the same file, commutation more or less consists of a handful of
comparisons, additions, and subtractions with machine integers. The
actual content of what is removed and added is not needed, only the size
(number of lines).
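A minimal sketch of what such a same-file hunk commutation could look like, in Python rather than Haskell, and simplified: the `Hunk` type and the `commute` function are hypothetical names for this illustration, line numbers are taken to be 1-based, and hunks that touch or overlap are conservatively treated as non-commuting (the real code may well handle some adjacent cases). Only start lines and line counts are used, never the content.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Hunk:
    line: int     # 1-based line where the hunk starts
    old_len: int  # number of lines removed
    new_len: int  # number of lines added


def commute(p: Hunk, q: Hunk) -> Optional[Tuple[Hunk, Hunk]]:
    """Given p followed by q (both in the same file), return (q2, p2) such
    that q2 followed by p2 has the same effect, or None if this simplified
    model cannot commute them."""
    if p.line + p.new_len < q.line:
        # q lies strictly below the lines p added: undo p's shift on q.
        q2 = Hunk(q.line - p.new_len + p.old_len, q.old_len, q.new_len)
        return q2, p
    if q.line + q.old_len < p.line:
        # q lies strictly above p: p moves by q's net change in line count.
        p2 = Hunk(p.line + q.new_len - q.old_len, p.old_len, p.new_len)
        return q, p2
    # Adjacent or overlapping hunks: treated as a dependency in this sketch.
    return None
```

For instance, if p replaces one line at line 2 with three lines and q then removes two lines at line 10, commuting them moves q to line 8 and leaves p where it was.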
> This would be
> offset by the overhead of reading several objects, I suppose, and by
> the need to actually do diffs in the case of collisions, of course.
It is not quite clear to me what the motivation behind this whole idea is.
Cheers
Ben
--
"I tend to avoid fiction about dysfunctional urban middle-class people
written in the present tense." -- Ursula K. Le Guin