Comment Re:Unicode is a bug (Score 1) 68
Unicode is arguably the wrong tool for the job. It was designed to represent all human writing, across every language, living or dead. Even within one language, defining the character set unambiguously is difficult; across multiple languages it's practically impossible. So Unicode goes for an inclusive approach - if something is plausibly a character of a human language, there's at least one way to represent it in Unicode, and possibly several, since multiple ways are preferred over none. As a result, the uniqueness and identity of characters are poorly defined.
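To make the "multiple ways" point concrete, here is a small Python sketch using the standard unicodedata module. The variable names are just for illustration. It shows one visible character, "é", stored as two different code point sequences that only compare equal after normalization.

```python
import unicodedata

# Two encodings of what a reader sees as the same character, "é".
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)

# A plain string comparison treats them as different.
same_raw = precomposed == decomposed

# After normalizing both to NFC they compare equal.
same_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

print(len(precomposed), len(decomposed), same_raw, same_nfc)
```

So any code that wants "identity" has to pick a normalization form first, and every party comparing the strings has to agree on which one.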
In contrast, computer code, including URLs, requires a robust definition of uniqueness and identity. Two URLs are either identical or not. And they're supposed to be human-readable, so non-identical URLs should not appear identical to a human. That is impossible if URLs are allowed to consist of an unrestricted string of Unicode characters.
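A quick Python sketch of the look-alike problem, under some loud assumptions: the hostnames are made up, and scripts_used is a hypothetical helper that guesses a character's script from the first word of its Unicode name. That is a rough heuristic for illustration, not how any browser or registry actually checks names.

```python
import unicodedata

genuine = "example.com"
spoof = "ex\u0430mple.com"  # U+0430 CYRILLIC SMALL LETTER A in place of Latin "a"


def scripts_used(text: str) -> set[str]:
    """Rough guess at the scripts in text, taken from the first word of
    each letter's Unicode name (hypothetical helper, heuristic only)."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }


# The two strings render almost identically in many fonts but are not equal,
# and normalization does not merge them, since they are distinct letters.
equal_raw = genuine == spoof
equal_nfkc = (
    unicodedata.normalize("NFKC", genuine)
    == unicodedata.normalize("NFKC", spoof)
)

print(equal_raw, equal_nfkc, scripts_used(genuine), scripts_used(spoof))
```

Normalization fixes the "same character, different encoding" case but does nothing here, because these really are different characters that happen to look alike. Flagging mixed scripts catches this particular spoof, but a name written entirely in look-alike letters from one script would pass.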