[darcs-users] darcs conflicts/dependencies -- is patch theory the place to start?
AntC
anthony_clayden at clear.net.nz
Fri Sep 21 02:53:01 UTC 2012
Stephen J. Turnbull <stephen <at> xemacs.org> writes:
>
> AntC writes:
> >
> > Careful here! A move/rename is not the same as a file copy. Tentatively:
> > - move/rename must retain the file's identity
>
> Which means just *what* if from the author's point of view the
> functional set of changes involves ending up with two files, both with
> new names, each containing 50% of the premove file?
This is exactly the sort of example I'm trying to work through. So my approach
is to (try to) separate out what applies to the container (file) vs the
contents (lines).
Remember that we get to observe the repo only intermittently (at record
points). Also darcs doesn't (currently, I believe) try to compare before/after
contents to guess where lines have gone. It's impressive that git does try. I
wonder how reliable it is?
I'm envisaging a move-file command (as per darcs), and a move-lines command,
so that the programmer can be explicit about their intent:
- are these two completely new files?
- or one with continuing identity, one new?
- (whether or not one of the files has the same name as before
  is an orthogonal issue)
- for each file, is this completely new content?
- or continuing content (from where)?
- or (more likely) a mix of new and continuing?
The critical issue is determining how to apply patches pulled from other repos
where the file splitting hasn't occurred (perhaps a bugfix on the
pre-refactored code). At some point the VCS has to give up because it's all
too hard, but where is that point? By trying to track continuing lines, can we
improve the likelihood that patches can be pulled without conflict?
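
To make the question concrete, here is a rough sketch (Python, purely
illustrative) of what "tracking continuing lines" might buy us. It assumes
every line carries a stable identity that survives a move-lines; darcs does
not work this way today, and the names (apply_fix_by_line_id, the "l1".."l4"
ids, the example files) are all made up for the illustration. A bugfix
recorded against the pre-refactored file is addressed to a line identity, so
it lands wherever that line now lives. If the line is gone, that is the point
where the VCS gives up and reports a conflict.

```python
def apply_fix_by_line_id(files, line_id, new_text):
    """Apply a fix addressed to a line identity, wherever that line now lives.

    `files` maps a file name to a list of (line identity, text) pairs.
    Returns the name of the file the fix landed in.
    """
    for name, lines in files.items():
        for index, (lid, _old_text) in enumerate(lines):
            if lid == line_id:
                lines[index] = (lid, new_text)
                return name
    # The tracked line was deleted or rewritten: this is where we give up.
    raise LookupError(f"line {line_id!r} no longer exists; report a conflict")


# State of our repo after the split: the lines of the old file now live
# in "B" and "C", but they keep the identities they had before.
files = {
    "B": [("l1", "def area(w, h):"), ("l2", "    return w + h")],
    "C": [("l3", "def perimeter(w, h):"), ("l4", "    return 2 * (w + h)")],
}

# A bugfix pulled from a repo that never saw the split, addressed to "l2".
landed_in = apply_fix_by_line_id(files, "l2", "    return w * h")
```

The fix ends up in "B" even though the other repo only ever knew about the
unsplit file. Whether a real patch theory can carry such identities around
consistently is exactly the open question.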
>
> Again, which one is the original file identity, if the purpose of the
> copy is to divide the file into two by deleting a range of lines from
> each?
I'd prefer to handle that as a move-lines for one (or both) of the ranges. If
the scenario is:
- move-lines from top-half of file A to file B
- move-lines from bottom-half of file A to file C
- (then A is empty)
- remove file A
- rename file C to "A"
Then the identity of the lines continues. The identity of file A is gone.
The compiler won't know that (nor care), because files D, E, F, etc have a
reference to a file named "A". It's only significant if we try to pull a patch
that depends on file A: it ain't there, the file now named "A" is not it.
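
The scenario above can be sketched as a toy model (Python, illustrative only;
the Repo class and its method names are hypothetical, not darcs commands or
internals). The assumption is that names, file identities and line identities
are three separate things: a name is just a label that currently points at
some file identity.

```python
from itertools import count


class Repo:
    """Toy model: file and line identities are independent of file names."""

    def __init__(self):
        self._next_id = count(1)
        self.names = {}    # current file name -> file identity
        self.content = {}  # file identity -> list of (line identity, text)

    def add_file(self, name, texts=()):
        fid = f"file{next(self._next_id)}"
        self.names[name] = fid
        self.content[fid] = [(f"line{next(self._next_id)}", t) for t in texts]
        return fid

    def move_lines(self, src, start, end, dst):
        """Move lines [start, end) of src to the end of dst, keeping their ids."""
        s, d = self.names[src], self.names[dst]
        moved = self.content[s][start:end]
        del self.content[s][start:end]
        self.content[d].extend(moved)

    def remove_file(self, name):
        fid = self.names.pop(name)
        if self.content[fid]:
            raise ValueError("refusing to remove a non-empty file")
        del self.content[fid]

    def rename_file(self, old, new):
        """The name changes; the file identity behind it does not."""
        self.names[new] = self.names.pop(old)


repo = Repo()
old_a = repo.add_file("A", ["top 1", "top 2", "bottom 1", "bottom 2"])
line_ids_before = [lid for lid, _ in repo.content[old_a]]
repo.add_file("B")
repo.add_file("C")

repo.move_lines("A", 0, 2, "B")  # top half of A -> B
repo.move_lines("A", 0, 2, "C")  # what remains (the bottom half) -> C
repo.remove_file("A")            # A is empty, and its identity goes with it
repo.rename_file("C", "A")       # the name "A" now labels a different file

new_a = repo.names["A"]
line_ids_after = [
    lid for name in ("B", "A") for lid, _ in repo.content[repo.names[name]]
]
```

At the end, new_a differs from old_a while line_ids_after equals
line_ids_before: every line has kept its identity, but a patch that depends
on the original file A has nothing left to attach to.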
> ... Suppose that you have decided on an interface-implementation
> split. Then it is quite likely that the interface file retains the
> "identity" (ie, the relationship to other modules in the application,
> but gets a new name).
If we want files D, E, F to refer to stuff in file B, we'll have to change
them to refer to "B". None of this is perfect, but it's really hard to see how
we could greatly improve it without the VCS being very semantics-aware.
>
> > Similarly if we rename a file, we also want to change any #include
> > in other files that refer to it. Perhaps this means token replaces
> > apply to the file system _and_ file contents?
>
> This is precisely the problem that "container tracking" is intended to
> address. It clearly is *not* a token replace as Darcs knows it,
> because it needs to be syntax-aware (eg, you wouldn't want it changing
> a comment like
>
> In version 42, this class was renamed from Foo to Bar. ...
So darcs token-replace discourages you from changing text where the target
token ("Bar" in this case) already appears. The reason is that inverting the
replace would not get you back to where you started (it would change "Bar"
to "Foo", yielding even more hilarity, and making it impossible to un-invert);
but your example is another excellent reason.
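
A small sketch of the inversion problem, using your comment as the input
(Python, illustrative only; this is not how darcs implements replace, and the
token definition of letters, digits and underscore is my assumption):

```python
import re


def token_replace(text, old, new):
    """Replace whole-token occurrences of `old` with `new`."""
    pattern = r"(?<![A-Za-z0-9_])" + re.escape(old) + r"(?![A-Za-z0-9_])"
    return re.sub(pattern, lambda _match: new, text)


def is_invertible(text, old, new):
    """True if replacing old->new and then new->old restores `text`."""
    return token_replace(token_replace(text, old, new), new, old) == text


comment = "In version 42, this class was renamed from Foo to Bar."

forward = token_replace(comment, "Foo", "Bar")
# forward == "In version 42, this class was renamed from Bar to Bar."

backward = token_replace(forward, "Bar", "Foo")
# backward == "In version 42, this class was renamed from Foo to Foo."
```

Here is_invertible(comment, "Foo", "Bar") is False because "Bar" was already
present, whereas is_invertible("class Foo {}", "Foo", "Bar") is True.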
AntC