Comment Re:Schneier got it right a decade and a half ago (Score 1) 119
Maybe I'm still not getting your point. Sure, if you need to understand the details of Unicode character composition and such because you're the one rendering the output glyphs, or you want to sort or search across different encodings of the same word, that's rough, but there's no excuse for a security failure while doing those tasks.
On your other point: the notion of "sanitizing input" is fundamentally flawed to begin with. You can never know what future framework that user data will interact with, or what that framework might interpret as an escape sequence, but you can bet that the guy doing that future work will just assume "the input was sanitized", and then you're screwed. Instead, don't go there. If e.g. you need to store a user string in a SQL DB, do it in such a way that no string can possibly be problematic (most DBs offer parameterized queries / prepared statements, which keep the data out of the SQL text entirely). If e.g. you need to send a user string inside an XML blob, just convert the user string to a hex/base64/whatever representation first - the encoded alphabet contains no markup characters, so it's guaranteed safe.
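To make the two cases above concrete, here's a rough sketch in Python. It assumes SQLite via the standard sqlite3 module, and the "comments" table, the "comment" element, and the function names are all made up for illustration. It's not the only way to do this, just one that avoids escaping altogether.

```python
import base64
import sqlite3
import xml.etree.ElementTree as ET


def store_comment(conn, text):
    # Placeholder binding: the string is passed as data, never spliced into the SQL text.
    # (The "comments" table is a hypothetical example schema.)
    conn.execute("INSERT INTO comments (body) VALUES (?)", (text,))


def wrap_in_xml(text):
    # Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=',
    # none of which is XML markup, so no escaping step is needed.
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return '<comment encoding="base64">' + encoded + "</comment>"


def unwrap_from_xml(blob):
    # An empty element has text None, so fall back to an empty string.
    element = ET.fromstring(blob)
    return base64.b64decode(element.text or "").decode("utf-8")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

# A string that would break naive SQL concatenation and naive XML embedding alike.
nasty = "Robert'); DROP TABLE comments;-- <b>&amp;</b> ]]>"

store_comment(conn, nasty)
stored = conn.execute("SELECT body FROM comments").fetchone()[0]
roundtrip = unwrap_from_xml(wrap_in_xml(nasty))
```

Neither function inspects or "cleans" the string. The SQL path never lets it become part of the query, and the XML path moves it into an alphabet where markup can't occur, so both get the original bytes back unchanged.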
What use case were you thinking of that makes any of this hard at all?