[darcs-users] More thoughts on line endings, meta tags, patch types, etc

David Roundy droundy at abridgegame.org
Sun Dec 12 12:33:36 UTC 2004


On Sat, Dec 11, 2004 at 06:17:59PM -0500, Michael Conrad wrote:
> On Saturday, December 11, 2004 10:29 AM, David Roundy wrote:
> > On Thu, Dec 09, 2004 at 07:44:36PM -0500, Michael Conrad wrote:
> > > So, I think the new direction of this thought should be about how to
> > > make it easy to extend the number of patch-types that darcs can
> > > support.  Also, how to let the user specify which type the file is.
> > >
> > > Mark has been having some issues lately with darcs determining the
> > > type of one of his files.  Also, there's the case where a file might
> > > look like text, but the user wants it to be handled as binary.  And
> > > with logical lines, there would be an issue of telling darcs which
> > > files should be seen as logical-lined.
> >
> > At the moment, darcs doesn't have a persistent opinion of what the type
> > of a file is, and if possible, I'd rather not add that feature.  In
> > general, I don't like the idea of adding metadata that is only relevant
> > for darcs' operation.
> >
> > On the other hand, it may be necessary when dealing with line-endings
> > issues.  In particular, the contents of a file are currently uniquely
> > determined by the patches that modify that file.
> 
> You could find the most recent patch to touch a file, and also find out
> what patch-type was used to alter the file's contents.  This then is the
> file type.  It doesn't need to be stored, only looked-up.  (although
> setting the type for a new file might require something to be written
> into pending to remember the preference)

But this would make darcs whatsnew potentially an O(N*W*W) operation rather
than an O(W) operation, where W is the number of files in the repository,
and N is the length of the history of the repository.  This is not an
option; we would need to store the file types if we went with your proposal.
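
To make the cost difference concrete, here is a minimal sketch in Python.
It is not darcs code: the patch model (a patch as a `(patch_type, path)`
pair, history stored oldest-first) and the names `type_from_history` and
`build_type_cache` are assumptions made up for illustration.  Looking a
type up from history walks the patch list for each file, while a stored
table answers with a single lookup per file.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: a patch is (patch_type, path); history is oldest-first.
Patch = Tuple[str, str]

def type_from_history(history: List[Patch], path: str) -> Optional[str]:
    """Scan newest-to-oldest for the last patch that touched `path`.

    The cost grows with the length of the history, and it is paid
    again for every file that is queried.
    """
    for patch_type, touched in reversed(history):
        if touched == path:
            return patch_type
    return None

def build_type_cache(history: List[Patch]) -> Dict[str, str]:
    """Walk the history once and remember the latest type per file."""
    cache: Dict[str, str] = {}
    for patch_type, touched in history:
        cache[touched] = patch_type  # later patches overwrite earlier ones
    return cache

history = [
    ("hunk", "README"),
    ("binary", "logo.png"),
    ("hunk", "main.c"),
    ("binary", "README"),
]
cache = build_type_cache(history)
```

With the stored table, a whatsnew-style pass over every file does one
dictionary lookup per file instead of one history scan per file.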

> > If we introduce either a new patch type or a new repository-creation flag
> > that gives darcs the flexibility to support the same file having different
> > line endings in different repositories while having the same set of
> > patches, then this is broken, and no longer can a file's contents be
> > determined based only upon the patches that created and modified it.  This
> > would mean that we could no longer blithely create binary patches for a
> > file that used to be text, since there is no well-defined "old" version of
> > the file.  :(
> 
> Well, with a new patch type, I think you still could determine file contents
> based on patches alone.  If someone takes a text file and wants it to become
> a logical-line file, (per my proposal) it would cause a 'hunk' patch to be
> written that deletes all the lines of the file, followed by a 'logiline'
> patch that would contain the logical lines for that file.

No, the whole point of special line-endings treatment is that the actual
file contents are *not* determined based on the patches alone, but also
depend on the operating system one used when creating a copy of the
repository and the settings given to darcs at that time.  Without knowing
those settings, there's no way to determine the logical lines in a given
file.
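
A small Python sketch of why the settings matter.  Nothing here is darcs
code; `render` and `parse` are hypothetical helpers, and the only setting
modelled is the line terminator.  The same logical lines become different
bytes under different terminators, and the same bytes yield different
logical lines if you guess the terminator wrong.

```python
from typing import List

def render(logical_lines: List[str], eol: str) -> bytes:
    """Write logical lines to bytes using the given line terminator."""
    return "".join(line + eol for line in logical_lines).encode("ascii")

def parse(data: bytes, eol: str) -> List[str]:
    """Split bytes into logical lines, assuming the given line terminator."""
    parts = data.decode("ascii").split(eol)
    if parts and parts[-1] == "":
        parts.pop()  # a trailing terminator does not start a new line
    return parts

lines = ["hello", "world"]
unix_bytes = render(lines, "\n")    # one repository's working copy
dos_bytes = render(lines, "\r\n")   # another's, from the same patches

# Reading the DOS copy with the wrong setting leaves stray "\r" characters.
wrong_guess = parse(dos_bytes, "\n")
right_guess = parse(dos_bytes, "\r\n")
```

Here `right_guess` recovers the original lines, while `wrong_guess` does
not, even though both were read from the same file contents.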

> > > Now, for making this extensible, it would be really cool if there was
> > > some kind of interface that a person could use when writing a new
> > > patch type such that they could just put their lhs file into the
> > > darcs directory and recompile. (*3) The interface would basically
> > > need a list of patch-types that it handles, a list of file-type-names
> > > (for the user to type on the command line), and a list of functions
> > > that darcs would use during loading, saving, and commutation.
> > >
> > > To read a patch with new types, you'd just have to get that lhs file
> > > and recompile.
> >
> > There is an appeal to this, but the problem is that writing commutation
> > code that won't lead to corruption is far from trivial (witness recent
> > bug described at http://www.scannedinavian.org/DarcsWiki/Issues1_2e0_2e1).
> > Lowering the barrier to creating new patch types is sort of nice, but I'm
> > not sure how much this would lower it in practice, and whether that would
> > outweigh the number of corrupt repositories that would result.
> 
> True, but it would allow people to focus on correct behavior of their module
> without having to worry about damaging existing darcs code.  (recall my
> assumption that a module's functions will only ever see data which they have
> created, and can thus ignore all the rest of the working of darcs)
> 
> To go ahead and expand this idea (while I'm thinking about it) the interface
> might look like this:

True, if you restricted a patch type to not commute with any other patch
types, the danger would be limited.  However, the benefit would also be
quite limited.  The major benefits of having a variety of patch types only
come about when they are able to commute.  That's when you can have things
like indentation patches that commute with hunk patches, so the code can be
reformatted without causing gratuitous conflicts.
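
As a toy illustration of that kind of commutation, here is a sketch in
Python.  It is not how darcs represents or commutes patches: the `Hunk`
and `Indent` classes, the single-line hunk, and the `commute` function are
all simplifying assumptions invented for this example.  The idea is that
an indentation patch followed by a hunk can be reordered into an adjusted
hunk followed by the same indentation, with the same combined effect.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

@dataclass(frozen=True)
class Hunk:
    """Hypothetical one-line hunk: replace `old` with `new` at index `line`."""
    line: int
    old: str
    new: str

@dataclass(frozen=True)
class Indent:
    """Hypothetical indentation patch: prepend `prefix` to every line."""
    prefix: str

Patch = Union[Hunk, Indent]

def apply(patch: Patch, lines: List[str]) -> List[str]:
    if isinstance(patch, Hunk):
        if lines[patch.line] != patch.old:
            raise ValueError("hunk does not apply")
        return lines[:patch.line] + [patch.new] + lines[patch.line + 1:]
    return [patch.prefix + line for line in lines]

def commute(first: Patch, second: Patch) -> Optional[Tuple[Patch, Patch]]:
    """Return (second', first') with the same combined effect, or None."""
    if isinstance(first, Hunk) and isinstance(second, Indent):
        p = second.prefix
        # After indenting first, the hunk must match the indented text.
        return second, Hunk(first.line, p + first.old, p + first.new)
    if isinstance(first, Indent) and isinstance(second, Hunk):
        p = first.prefix
        if second.old.startswith(p) and second.new.startswith(p):
            stripped = Hunk(second.line, second.old[len(p):], second.new[len(p):])
            return stripped, first
        return None  # the hunk changed the indentation itself
    return None  # other pairs are left out of this sketch

lines = ["x = 1", "y = 2"]
edit = Hunk(1, "y = 2", "y = 3")
reindent = Indent("    ")
swapped = commute(edit, reindent)
```

In this sketch `swapped` holds the indentation patch followed by a hunk
rewritten against the indented text, so a content edit and a reformatting
made independently can be applied in either order.  Getting the failure
cases right (the `None` branches) is where real commutation code gets
hard.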

True, for opaque formats like XML (which aren't really designed to be
human-edited), perhaps one patch type would be sufficient.  But for
human-edited files, it would be nice to have multiple patch types, so
formatting changes can (when possible) not conflict with content changes.
-- 
David Roundy
http://www.darcs.net