Unix

divbyzero's Journal: Late reply: Use something else

This is a late reply (after the thread was archived) to "Use something else" under "Unicode and the Unix Console".

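Certainly, scanning forward and backward by character is computationally less efficient in UTF-8 than in a fixed-width encoding. However, such scanning is not necessary when searching for tokens. UTF-8 has a "no false hits" design: lead bytes and continuation bytes come from disjoint ranges, so the bytes of one valid sequence can only match another valid sequence at a character boundary. Simple byte-for-byte comparison and scanning therefore work perfectly. This turns out to be true of nearly all string operations; the exceptions are the ones that must count or index individual characters.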
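To make the "no false hits" point concrete, here is a minimal sketch in Python. The helper names `find_token` and `is_char_boundary` are mine, made up for illustration; only `str.encode` and `bytes.find` are standard library calls. It searches the raw UTF-8 bytes without decoding anything, then checks that the hit landed on a character boundary.

```python
def find_token(haystack: str, needle: str) -> int:
    """Return the byte offset of needle in the UTF-8 form of haystack, or -1.

    The search is a plain byte-for-byte scan; no character decoding happens.
    """
    hay = haystack.encode("utf-8")
    pat = needle.encode("utf-8")
    return hay.find(pat)


def is_char_boundary(data: bytes, offset: int) -> bool:
    """True if offset does not point into the middle of a multi-byte sequence.

    UTF-8 continuation bytes always have the bit pattern 10xxxxxx, so any
    other byte (or the end of the data) starts a new character.
    """
    return offset == len(data) or (data[offset] & 0xC0) != 0x80


text = "naïve café 日本語 text"
token = "日本"

offset = find_token(text, token)
raw = text.encode("utf-8")

# The byte-level hit corresponds exactly to the character-level hit.
expected = len(text[: text.index(token)].encode("utf-8"))
print(offset, expected, is_char_boundary(raw, offset))
```

The byte offset found by the raw scan matches the offset you would get by locating the token as characters and then encoding the prefix, which is the property the argument relies on. This assumes both strings are valid UTF-8 and use the same normalization form.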

The space issue is harder to answer. Whether or not it was intentional, the choices the Unicode Consortium made when it laid out the character table bias UTF-8 against CJK: most CJK characters take three bytes in UTF-8, versus two in UTF-16. That matters, because those languages are used by a very large percentage of the people in the world.
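The size difference is easy to measure. The sketch below (the function name `encoded_sizes` and the sample strings are my own, chosen only for illustration) compares the encoded length of a short ASCII string and a short Japanese string in UTF-8 and UTF-16. It uses the `utf-16-le` codec so that no byte order mark is added to the count.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for text.

    "utf-16-le" is used so the result excludes the 2-byte byte order mark
    that the plain "utf-16" codec would prepend.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = {
    "ASCII": "Hello, world",        # 12 characters
    "Japanese": "こんにちは世界",    # 7 characters, all in the Basic Multilingual Plane
}

for label, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{label}: {len(text)} chars, UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

For the ASCII sample UTF-8 is half the size of UTF-16; for the Japanese sample it is one and a half times the size. Real documents mix scripts with ASCII markup and whitespace, so the actual ratio for a given file will fall somewhere in between.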
