[darcs-users] Latin vs. Unicode
Ben Franksen
ben.franksen at online.de
Sun Nov 16 01:40:02 UTC 2014
This came up while refactoring the options system and is of wider interest,
I think, so I am sending it to darcs-users.
The issue is, I should say, limited to what we get from the command line
or from the environment, that is, patch meta-data like author, patch name,
etc. Here, Darcs currently has extra built-in support for handling 8-bit
encodings like ISO Latin-1. This works by casting the Unicode characters in
the Strings to Word8, which effectively reduces their code points modulo 256.
This is not noticeable as long as you use only languages whose characters
have code points below 256, which is the case for most Western European
languages; but for Asian languages, not to speak of those of other
continents, it breaks as soon as users enter data in their native languages.
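
To illustrate the effect, here is a minimal sketch of such a truncating
conversion. It is not the actual Darcs code; the names truncateToBytes,
bytesToString and roundTrips are made up for this example, and it assumes
only the base library.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper, not taken from Darcs: turn each Char into one byte.
-- fromIntegral from Int to Word8 wraps around, so only the code point
-- modulo 256 survives.
truncateToBytes :: String -> [Word8]
truncateToBytes = map (fromIntegral . ord)

-- Read the bytes back as if they were ISO Latin-1, where byte n is
-- code point n.
bytesToString :: [Word8] -> String
bytesToString = map (chr . fromIntegral)

-- True if the string survives the conversion unchanged.
roundTrips :: String -> Bool
roundTrips s = bytesToString (truncateToBytes s) == s
```

With these definitions, roundTrips "\xE9" (the letter e with acute accent,
U+00E9) is True, while truncateToBytes "\x4E2D" (a Chinese character,
U+4E2D) yields [45], which reads back as "-".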
Over the last few years, Unicode has established itself firmly world-wide
and is well supported by all the major operating systems. This is why I vote
for dropping support for older 8-bit encodings that are not Unicode
compatible, thereby allowing e.g. Chinese users to use Darcs with their
native languages.
Cheers
Ben
--
"Make it so they have to reboot after every typo." -- Scott Adams