Unicode in C, C++ and Perl (Score 2, Informative)
One thing many people aren't aware of is that for several years now (since GCC 3), GCC and G++ have accepted UTF-8 as their default input encoding, and by default they store narrow and wide strings internally as UTF-8 and UTF-32, respectively. Wide strings are recoded to the locale's encoding when you write them to a wide-oriented output stream; narrow strings go out as the bytes they contain. This means you can write your source code in Unicode (in strings and comments at least), and it has support in the C and C++ standard libraries. I've been using it for years; it works perfectly. It would be nice to get support for UTF-8 symbols in the linker, so we can have UTF-8 variable names as well. The same applies to Perl, though Perl 6 even gives you the ability to have Unicode operators, and possibly variable names.
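To make the narrow/wide distinction concrete, here is a minimal sketch. It assumes GCC (or a compatible compiler) with its default UTF-8 input and execution character sets, and a source file saved as UTF-8. The names utf8_codepoints, narrow_label and wide_label are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count code points in a UTF-8 string by skipping continuation bytes
   (bytes of the form 10xxxxxx). Assumes the input is valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Illustrative labels only. With a UTF-8 execution character set the
   narrow literal holds the bytes CE 94 43 74; the wide literal holds
   one wchar_t per character. */
static const char    narrow_label[] = "ΔCt";
static const wchar_t wide_label[]   = L"ΔCt";

size_t narrow_bytes(void) { return strlen(narrow_label); }
size_t narrow_chars(void) { return utf8_codepoints(narrow_label); }
size_t wide_chars(void)   { return wcslen(wide_label); }
```

Under those assumptions the narrow string is four bytes long but three characters, while the wide string has three elements. That is why strlen() is a byte count, not a character count, once you leave ASCII.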
I do routinely use UTF-8 symbols in R (example: "deltaCt" can be replaced with the actual Delta symbol [Slashdot ate the Unicode--seriously poor!]). It makes the code more readable, and entry isn't the massive issue people make it out to be. AltGr/compose keys handle the common symbols, and you can look up the few odd ones that aren't in the compose tables.
Having the ability to use Unicode does not in any way detract from the ability to use ASCII. Since ASCII is a strict Unicode subset, the ability to use Unicode imposes zero overhead on those who wish to stick with ASCII, so the extent of the hate seen for wanting a bit of progress is a bit shocking. People pointed out how unreadable code could be made, but the reality is that when used sensibly and judiciously, it can make code more concise and readable.
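The "strict subset" point can be checked mechanically. The following is only a sketch of the standard UTF-8 encoding rules, with a hypothetical function name (utf8_encode). Any code point below 0x80 encodes to a single byte with the same value, which is why pure ASCII text is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one Unicode code point as UTF-8 into out[0..3].
   Returns the number of bytes written, or 0 if cp is not a valid
   scalar value (a surrogate, or above U+10FFFF). */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                      /* ASCII: one identical byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling utf8_encode('A', buf) writes the single byte 0x41, exactly what an ASCII-only tool would have written, so code that sticks to ASCII pays nothing extra.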
See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=522776 for information about some of the issues.
Having native Unicode support end-to-end by default is still a goal we want to achieve; the ASCII C locale is the last holdout. Getting a UTF-8 C locale is the last remaining step, though it'll take a few years to get there.
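If you want to see what a given locale actually gives you, something like the sketch below works on POSIX systems. It uses setlocale() and nl_langinfo(CODESET); the helper names codeset_is_utf8 and locale_is_utf8 are hypothetical, and the exact codeset string reported for the plain C locale varies between C libraries, so treat the comparison as an assumption rather than a guarantee.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <strings.h>

/* Returns 1 if the codeset name looks like UTF-8, 0 otherwise.
   Accepts both common spellings, case-insensitively. */
int codeset_is_utf8(const char *codeset)
{
    assert(codeset != NULL);
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}

/* Switch to the named locale ("" means "take it from the environment")
   and report whether its character encoding is UTF-8.
   Returns -1 if the locale could not be set. */
int locale_is_utf8(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

Calling locale_is_utf8("") checks whatever locale the user's environment selects, and locale_is_utf8("C") shows what a program gets if it never calls setlocale() with the user's settings.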
Regarding editing Unicode sources, both Emacs and Vim have pretty decent Unicode support. Linux distributions have had Unicode support for a decade now, and really good support for at least six years. Broken tools are no longer an excuse for not using Unicode.
Regards,
Roger