[darcs-users] darcs conflicts/dependencies -- is patch theory the place to start?

Kevin Quick quick at sparq.org
Tue Sep 18 06:17:02 UTC 2012


> If the VCS gives each file an 'internal' id that is:
> - unique across all repos; and
> - persistent wherever the file goes, or however its location/name
>   changes.

Although this sounds alluring from the implementation perspective, I worry
about the user perspective on something like this, because you are
de-emphasizing the main framework (file names and paths) under which they
(and all the other tools) have been operating.  This may turn out to be an
advantage, but it is something to consider.

> I'm suggesting the file id be the ppid of when it got added, to help with
> the book-keeping. (I'm assuming this can also tell us in which repo the
> file started life.)

For your purposes wouldn't any guid suffice?  I'm not sure that the source  
repo is useful.
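
To make the "any guid" point concrete, here is a minimal Python sketch.  It
assumes a random UUID is minted when a file is added and that a rename only
touches the path.  The names (FileTable, add, rename) are hypothetical and
are not anything darcs provides.

```python
import uuid


class FileTable:
    """Toy mapping from a persistent file id to the file's current path."""

    def __init__(self):
        self.path_by_id = {}

    def add(self, path):
        # A random UUID is unique across repos with overwhelming probability
        # and encodes nothing about the repo in which the file was added.
        file_id = uuid.uuid4().hex
        self.path_by_id[file_id] = path
        return file_id

    def rename(self, file_id, new_path):
        # A rename changes only the path; the id stays with the file.
        self.path_by_id[file_id] = new_path


table = FileTable()
fid = table.add("src/Main.hs")
table.rename(fid, "app/Main.hs")
print(fid, "->", table.path_by_id[fid])
```

Nothing here records where the file started life, which is the part I
suspect you can do without.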

> This discussion is all by way of warming-up for dealing with hunk
> changes, where we need to implement some sort of line-id, and detect
> line movements.

I'm not sure line mapping follows from file mapping.  The file tends to be
a long-lived entity even though its contents morph over time/patches.  A
particular line doesn't really have a "rename" operation; in the absence
of context a new version simply replaces the old at a fairly atomic level
(i.e. there are no sub-operations that could be mapped from the old line to
the new line, as is the case for a file).

That said, I think there would be great value to a VCS that is  
context-aware (as has been discussed previously in this thread) and  
perhaps degrades to line-oriented management if no context can be  
determined.  Two brief examples:

      original:    if (ready && remaining < 10) {

      change1:     if (!ready && remaining < 10) {

      change2:     if (ready &&
                       remaining < 10)
                   {

For a C-style context, it seems reasonable that change1 and change2 should  
not conflict.  The second example:

      original:    if (remaining < 10) {
                      x();
                      y(remaining);
                   }

      change1:     if (ready) {
                     if (remaining < 10) {
                       x();
                       y(remaining);
                     }
                   }

      change2:     if (remaining < 10) {
                     x();
                     y(remaining, 10);
                   }

Again, it would be wonderful if these two changes didn't conflict during a
VCS merge (and more specifically if the VCS knew the language context was
whitespace-insensitive, like C, and therefore the additional indentation in
change1 wasn't a conflict with any other change in the region).
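
As a rough illustration of the idea (not a proposal for how darcs should do
it), here is a Python sketch of a token-level three-way merge.  It assumes a
crude C-like tokenizer that ignores string literals, comments and
preprocessor lines, and it uses Python's difflib to find each side's edits
against the original.  The function names (tokens, edits, merge) are made up
for this example.

```python
import re
from difflib import SequenceMatcher

# Crude C-like tokenizer (assumption: no string literals, comments or
# preprocessor lines, which a real tool would have to handle).
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|&&|\|\||[<>=!]=|\S")


def tokens(text):
    # Whitespace, including newlines, never becomes a token.
    return TOKEN.findall(text)


def edits(base, changed):
    """Return the non-equal opcodes as (start, end, replacement) on base."""
    sm = SequenceMatcher(None, base, changed, autojunk=False)
    return [(i1, i2, changed[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def merge(base_text, text1, text2):
    """Token-level three-way merge; returns None on a conflict."""
    base = tokens(base_text)
    e1 = edits(base, tokens(text1))
    e2 = edits(base, tokens(text2))
    # Conservative rule: edits from the two sides that overlap or even
    # touch in the base token sequence are reported as a conflict.
    for a1, a2, _ in e1:
        for b1, b2, _ in e2:
            if a1 <= b2 and b1 <= a2:
                return None
    merged = list(base)
    # Apply from the end so that earlier indices stay valid.
    for start, end, repl in sorted(e1 + e2, key=lambda e: e[0], reverse=True):
        merged[start:end] = repl
    return " ".join(merged)


first = merge("if (ready && remaining < 10) {",
              "if (!ready && remaining < 10) {",
              "if (ready &&\n    remaining < 10)\n{")

second = merge(
    "if (remaining < 10) {\n  x();\n  y(remaining);\n}",
    "if (ready) {\n  if (remaining < 10) {\n    x();\n    y(remaining);\n  }\n}",
    "if (remaining < 10) {\n  x();\n  y(remaining, 10);\n}")

print(first)
print(second)
```

In the first example change2 produces the same tokens as the original, so it
contributes no edits at all.  In the second, the wrapping "if (ready) { ... }"
and the extra ", 10" argument land at different token positions, so both
apply.  The sketch throws away layout entirely; a real tool would still have
to decide whose formatting to keep.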

Of course, Stephen Turnbull has already provided a counter-example in this  
email thread showing the indeterminism even in a context-aware system, but  
since we're having a fairly open discussion I thought I'd toss these  
thoughts into it.  :-)   Neither of the above is possible with a  
line-oriented scenario like L/S/L but Eelco Lempsink's thesis makes for an  
interesting read in this area:  http://eelco.lempsink.nl/thesis.pdf


>>
>> > Possibly we could expose the non-equivalence to the programmer even
>> > before pulling the hunk change, by the VCS linking B's file G to F to
>> > branch A, but not linking C's file G.
>>
>> Explicit dependencies like that (which are normally impossible or at
>> best over-restrictive in darcs because it forces explicit repo
>> relationships) ...
>
> Could you explain a bit more what you mean by 'explicit repo
> relationships', and what's bad about them?

I (mis?)understood your "linking" above to mean recording in repo B a
reference to repo A, so that A would be explicitly referenced in B.  The
normal repo relationships in darcs are more ephemeral: outside of a
push/pull, the only relationship is a "suggestion" for the default of the
next push/pull operation.  The explicit A reference in B would require the
persistence and accessibility of A while working in B, and would make it
difficult to recognize C (cloned from A) as an equivalent for that explicit
reference.

>
> Pulling a patch from one repo to another sets up a relationship anyway
> (so I understand?). One purpose of that is to detect duplicates. (That
> is, dependent upon patches already pulled to the target -- from Owen's
> description.) I don't think anything about ppid-as-file-id gets in the
> way of repos working standalone; that is until you want to pull/merge
> patches -- at which point surely a repo relationship and restrictions
> is exactly what you want(?)

Only inasmuch as the relationship is determined in the moment of the
push/pull and not persisted outside of that action.  Once the action is
completed, there is no explicit relationship between the repos.  They may
contain common patches, which form an implicit relationship, but they must
re-discover this on the next push/pull between the two (at least for
darcs).
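
A toy Python model of what I mean, assuming each repo is nothing more than a
set of opaque patch ids.  Real darcs patches carry ordering and dependencies
that this ignores, and pull() here is a made-up function, not the darcs
command.

```python
def pull(target, source):
    """Copy patches missing from target; nothing about source is stored."""
    common = target & source    # implicit relationship, found on demand
    missing = source - target
    target |= missing           # target gains patches, not a link to source
    return common, missing


repo_a = {"p1", "p2", "p3"}
repo_b = {"p1", "p2"}           # shares some history with A
repo_c = set(repo_a)            # a clone of A

# Pulling from C works exactly as pulling from A would, because the
# relationship is recomputed from the patch ids at the time of the pull.
common, pulled = pull(repo_b, repo_c)
print(sorted(common), sorted(pulled))
```

Since B never stores a reference to A, a clone C is automatically as good a
source as A itself, which is the property an explicit A reference in B would
lose.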

I don't know if I'm helping the discussion: it seems we are exploring  
theoretical VCS infrastructure so I'm enjoying our semi-abstract discourse  
and I may not be contributing to your goal.  If that's the case, please  
excuse my maunderings.  :-)

-- 
-KQ

