[darcs-users] Re: more feedback on binary file detection

Mark Stosberg mark at summersault.com
Thu Dec 9 16:23:52 UTC 2004


On 2004-12-09, Ralph Corderoy <ralph at inputplus.co.uk> wrote:
>
> Hi Mark,
>
>> > Can we have the output of
>> >
>> >     od -An -w1 -td1 -v $file | sort -n | uniq -c | pr -t --columns=4
>> 
>> Oh, some new unix commands to try. :) I really wanted to try this, but
>> FreeBSD doesn't support this syntax.  
>> 
>> The 'od' command is present, but lacking the '-w' flag, which I
>> couldn't find the equivalent to. Also, the 'pr' command is missing,
>> which I just took out of the pipeline. 
>
> OK, the -w1 means one byte per line.  Try this instead.
>
>    od -An -td1 -v $file | sed 's/  */x/g' | tr x '\012' | grep . |  sort -n | uniq -c

Thanks. Once I expanded my mind a bit, I realized I could run the Linux
command easily...because I have that directory mounted on my Linux box.

First, here's the output of my (currently working) file.  Next, I have
the output for a co-worker's older copy. If I read the output correctly,
his copy has a NUL character in it that mine does not. 

My newer copy:
   3014    10           2    56         233    79        1697   104
  30171    32           2    57         137    80        2698   105
     24    33         149    58          10    81          25   106
    347    34         776    59         324    82         117   107
    572    35          23    60         260    83        2674   108
   1542    36        1032    61         271    84        1756   109
     79    37        1198    62         108    85        2281   110
     21    38         140    63          35    86        2713   111
   1156    39         128    64         106    87        1727   112
    772    40         187    65          24    88         244   113
    772    41         222    66          15    89        4075   114
     12    42         144    67          79    91        3836   115
     18    43         348    68          81    92        4302   116
    852    44         488    69          79    93        1598   117
    850    45         218    70          23    94         240   118
    448    46          72    71        3226    95         311   119
    248    47         335    72        3206    97         147   120
     47    48         193    73         618    98         745   121
     82    49           9    74        1811    99          60   122
     43    50          10    75        2226   100         686   123
     12    51         178    76        6426   101          55   124
      9    52         190    77        1330   102         685   125
      4    55         144    78         629   103          19   126

Co-worker's older copy:
(Is this the issue?)
      vvvvvvv
      1     0           4    55         231    79        1697   104
   1344     9           2    56         135    80        2695   105
   3013    10           2    57          10    81          25   106
  24821    32         144    58         324    82         117   107
     24    33         775    59         256    83        2673   108
    347    34          23    60         270    84        1754   109
    571    35        1031    61         107    85        2280   110
   1541    36        1198    62          35    86        2709   111
     79    37         140    63         106    87        1725   112
     21    38         128    64          23    88         244   113
   1156    39         182    65          15    89        4074   114
    772    40         221    66          79    91        3832   115
    772    41         142    67          81    92        4298   116
     12    42         346    68          79    93        1595   117
     18    43         487    69          23    94         240   118
    852    44         218    70        3225    95         311   119
    850    45          70    71        3204    97         147   120
    448    46         335    72         617    98         745   121
    248    47         190    73        1810    99          59   122
     45    48           9    74        2226   100         686   123
     81    49          10    75        6421   101          55   124
     42    50         176    76        1326   102         685   125
     12    51         188    77         628   103          19   126

##############

Finally, could this command be adjusted so that it just checks for ASCII
0 and 26, the characters darcs seems to care about? 

Until darcs binary support can be improved, it would be great to have a tool
to pinpoint where stray unwanted control characters are.
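
One possible adjustment, as a minimal sketch rather than a tested tool: keep
the same od dump, but let awk walk the byte values and print the zero-based
offset of every 0 (NUL) or 26 (^Z) it sees. The function name find_ctrl is
made up for illustration, and it uses -tu1 (unsigned) where the commands
above use -td1; only the POSIX od flags -An, -v and -t are assumed, so it
should not need the GNU-only -w.

```shell
# find_ctrl FILE  (hypothetical helper name, not part of darcs)
# Prints "OFFSET VALUE" for each byte in FILE equal to 0 (NUL) or 26 (^Z).
# Offsets are zero-based byte positions from the start of the file.
find_ctrl() {
    # -An: no address column, -v: do not collapse repeated lines,
    # -tu1: one unsigned decimal number per byte.
    od -An -v -tu1 "$1" |
        awk 'BEGIN { n = 0 }
             {
                 # od puts several bytes on each line; visit every field.
                 for (i = 1; i <= NF; i++) {
                     if ($i == 0 || $i == 26) print n, $i
                     n++
                 }
             }'
}
```

Running `find_ctrl somefile` prints nothing for a file with no such bytes;
piping the result through `wc -l` gives just a count.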






