[darcs-users] darcs patch: TreeIO is smart enough now to unfold as ... (and 9 more)

Petr Rockai me at mornfall.net
Fri Jun 19 13:31:23 UTC 2009


Eric Kow <kowey at darcs.net> writes:

> Cute, but it may be worthwhile to explain  that this is because this is
> not System.IO.readFile that we're importing (and presumably, that your
> Storage.Hashed.Monad.readFile avoids this problem?)
I don't think it's ever been worth it to have this readFile error at all. Maybe
the cursing that's piling up across the codebase would justify fixing (or
removing) haskell_policy? It's cumbersome, poorly written and of questionable
value (especially since we require review anyway).

> We need to unfold the pristine Tree before rebuilding the index.
> ----------------------------------------------------------------
>> Petr Rockai <me at mornfall.net>**20090603203631
>>  Ignore-this: a7501a0d75c5596d91bc7ee96f5895f9
>> ] hunk ./src/Darcs/Gorsvet.hs 251
>
>> -  pristine <- readRecordedAndPending repo
>> +  pristine <- readRecordedAndPending repo >>= unfold
>
> Or else the consequence would be that the index only contains top-level
> entries?
Else the consequence would be a runtime failure in the indexing code that
asserts there are no Stubs in the Tree.

> By the way, if hashed-storage starts to be more stable and to be used
> by tools other than darcs, it would be a good idea to follow the Package
> versioning policy in
>   http://www.haskell.org/haskellwiki/Package_versioning_policy
> ...doesn't it want you to bump to 0.4 in this case, since the new
> argument is a non-backward compatible change?
Yes, I guess so. And 0.3.x is now in that mode (no backward-incompatible API
changes allowed).

> Optimise the file existence checking in whatsnew <files>.
> ---------------------------------------------------------
> This replaces our code for checking if the user passed in an filename to
> darcs whatsnew for a file that either isn't there or (more usefully)
> isn't being tracked by darcs.  Here's the old code for reference...
>
>> -warn_if_bogus :: (Slurpy,Slurpy) -> [SubPath] -> IO()
>> -warn_if_bogus _ [] = return ()
>> -warn_if_bogus (rec, pend) (f:fs) =
>> -    do exist1 <- doesFileExist file
>> -       exist2 <- doesDirectoryExist file
>> -       let exist = exist1 || exist2
>> -       if exist then when (not (slurp_has fp rec || slurp_has fp pend)) $
>> -                       putStrLn $ "WARNING: File '"
>> -                         ++file++"' not in repository!"
>> -                else putStrLn $ "WARNING: File '"++file++"' does not exist!"
>> -       warn_if_bogus (rec, pend) fs
>> -    where fp =  toFilePath f
>> -          file = encode_white fp
>
> And the new code.
>
>>  announce_files :: (RepoPatch p) => Repository p C(r u t) -> [SubPath] -> IO ()
>>  announce_files repo files =
>>      when (areFileArgs files) $ do
>> -      slurps <- slurp_recorded_and_unrecorded repo
>> -      warn_if_bogus slurps files
>> +      nonboring <- restrictBoring
>> +      working <- nonboring `fmap` readPlainTree "."
>> +      pristine <- readRecordedAndPending repo
>> +      let paths = map (fn2fp . sp2fn) files
>> +          check = virtualTreeIO (mapM exists $ map floatPath paths)
>> +      (in_working, _) <- check working
>> +      (in_pending, _) <- check pristine
>> +      mapM_ maybe_warn $ zip3 paths in_working in_pending
>
> Looks good to me at a glance.  I was initially confused by the zip3 but
> now I understand we're just cruising the list of booleans (no size
> mis-matches here).
>
> How is this an optimisation?  Is just by virtue of your hashed-storage
> stuff being better than slurpies?
Slurping reads directories eagerly, therefore forcing parse of all directory
nodes of the pristine cache. This is fairly expensive. The hashed-storage code
only reads the toplevel directory and those that are actually required.

> Also, we've stopped calling encode_white, but that's probably a good
> thing because (a) encode_white translates whitespace to something
> presumably darcs-specific (e.g. \32 for ' ') whereas I'll bet we
> originally meant to just use for back-slashing and (b) we already wrap
> the filename in quotes anyway.  I'll disregard filenames with
> apostrophes in them here :-).  [I'll be sending my haddock patches for
> encode_white in a bit... good place to apply the haddock-as-you-go
> principle for reading darcs code].
I don't know what is the purpose of encode_white in here. It didn't quite make
sense and my code works with whitespace-containing paths just fine.

>>        putStrLn $ "What's new in "++unwords (map show files)++":\n"
>> hunk ./src/Darcs/Commands/WhatsNew.lhs 108
>> +    where maybe_warn (file, False, False) =
>> +              putStrLn $ "WARNING: File '"++file++"' does not exist!"
>> +          maybe_warn (file, True, False) =
>> +              putStrLn $ "WARNING: File '" ++ file ++ "' not in repository!"
>> +          maybe_warn _ = return ()
>
> Should we be giving a warning if the user asks what's new on a boring
> file?
>
> Minor style comment: zipWith3 could be helpful here.  Also, an approach
> like the below may be more explicit and easier to follow, albeit more
> verbose.
>
>   maybe_warn (file, exists, tracked) =
>     if exists then if tracked then ...
I find the original more readable (I even tend to use case (cond1, cond2) of
... whenever deciding on two conditions at once instead of cascading if). That,
or even better

    maybe_warn (file, exists, tracked)
        | not exists && not tracked = ...
        | exists && not tracked = ...
        | otherwise =

> Basic "show index" implementation.
> ----------------------------------
>> +      "The `darcs show index' command lists all version-controlled files and " ++
>> +      "directories along with their hashes as stored in _darcs/index. " ++
>> +      "For files, the fields correspond to file size, sha256 of the current " ++
>> +      "file content and the filename.",
>
> No timestamp?  Not useful?
With current implementation, timestamps are hard to extract, as they are an
"implementation detail" of Index and not exported anywhere in the API.

>> +show_index_cmd :: [DarcsFlag] -> [String] -> IO ()
>> +show_index_cmd opts _ = withRepository opts $- \repo -> do
>
> UI nit: it may be useful to demand non-empty args here (or is this
> somehow already enforced by other parts of darcs?).
I don't quite follow. `darcs show index` doesn't take any args, it's just the
way command API works in darcs.

>> +  checkIndex repo
>> +  x <- unfold =<< readIndex "_darcs/index"
>> +  let line | NullFlag `elem` opts = \t -> putStr t >> putChar '\0'
>> +           | otherwise = putStrLn
>> +      output (p, i) = do
>> +        let hash = case itemHash i of
>> +                     Just h -> BS.unpack $ darcsFormatHash h
>> +                     Nothing -> "(no hash available)"
>> +            path = anchorPath "" p
>> +            isdir = case i of
>> +                      SubTree _ -> "/"
>> +                      _ -> ""
>> +        line $ hash ++ " " ++ path ++ isdir
>> +  mapM_ output $ list x
>
> Looks fine to me.  I tend to prefer more separation between my IO stuff
> and my non-IO stuff, but it doesn't really matter.

> I don't really understand what purpose NullFlag (-0, separate file names
> by NUL characters) serves, but darcs show files implements it too.
Because when filenames contain whitespace, 0 is a much better field separator
than a newline. See find -print0, xargs -0 and so on. Although admittedly, this
is a little less useful with show index/pristine than it is with show files. I
just took it from the latter.

Yours,
   Petr.

-- 
Petr Ročkai | http://web.mornfall.net
A physicist is an atom's way of knowing about atoms. (George Wald)


More information about the darcs-users mailing list