Yep, they have been UTF-16 for a long time. And Unicode has been widely broken for a long time. It's not a coincidence.
Someone on StackExchange did some tests last year, entering 4-byte Unicode characters (those outside the BMP) into common applications and seeing how they behaved. The results were really bad:
Opera has problems editing them (delete required 2 presses on backspace)
Notepad can't deal with them correctly (delete required 2 presses on backspace)
File name editing in Windows dialogs is broken (delete required 2 presses on backspace)
All Qt3 applications can't deal with them - they show two empty squares instead of one symbol.
Python encodes such characters incorrectly when used directly: u'X'!=unicode('X','utf-16') on some platforms when X is a character outside of the BMP.
Python 2.5 unicodedata fails to get properties on such characters when Python is compiled with UTF-16 Unicode strings.
StackOverflow seems to remove these characters from the text if entered directly as Unicode characters (these characters are shown using HTML Unicode escapes).
WinForms TextBox may generate an invalid string when limited with MaxLength.
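Most of the failures in that list come down to code treating one UTF-16 code unit as one character. Here is a minimal Python 3 sketch of that mistake. The helper names (utf16_units, truncate_to_units, is_valid_utf16) are made up for illustration and are not taken from any of the applications tested; the truncation only imitates what a length limit that counts code units might do.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units (as ints) that encode text."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def truncate_to_units(text, max_units):
    """Naively keep the first max_units code units and return the raw bytes.

    Hypothetical stand-in for a length limit that counts code units.
    """
    units = utf16_units(text)[:max_units]
    return struct.pack("<%dH" % len(units), *units)

def is_valid_utf16(data):
    """Report whether data decodes as well-formed UTF-16 (little endian)."""
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False

# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF, which lies outside the BMP.
sample = "a\U0001D11E"

units = utf16_units(sample)              # [0x0061, 0xD834, 0xDD1E]
truncated = truncate_to_units(sample, 2) # cuts the surrogate pair in half

print(len(sample), len(units))           # 2 code points, 3 code units
print(is_valid_utf16(truncated))         # False: ends in a lone high surrogate
```

An editor that deletes one code unit per backspace leaves the same kind of half-pair behind, which is why those applications needed two key presses.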
I've had more than my share of these sorts of experiences too.
UTF-16 is dangerous, and should be phased out as much as possible. Where absolutely needed for performance reasons, it should be an internal representation only, hidden from the developer as much as possible. In particular, "length" functions should return the actual string length in characters, not code units; indexing functions should take character offsets, not code unit offsets; and any "single character" returned to the developer should be of a type capable of holding a character that takes multiple code units. Anything that involves working with individual UTF-16 code units should only be available as "for advanced users only, use at your own risk" functionality.
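To make that concrete, here is a rough Python 3 sketch of what such an API could look like. Utf16String and its method names are hypothetical, not an existing library. It also assumes "character" means a Unicode code point; it does not try to handle grapheme clusters such as combining sequences.

```python
class Utf16String:
    """Hypothetical wrapper: UTF-16 storage inside, code points outside."""

    def __init__(self, text):
        data = text.encode("utf-16-le")
        # Internal representation: a list of 16-bit code units.
        self._units = [
            int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)
        ]
        # Code unit offset at which each code point starts. Low (trailing)
        # surrogates never start a code point in well-formed UTF-16.
        self._starts = [
            i for i, unit in enumerate(self._units)
            if not 0xDC00 <= unit <= 0xDFFF
        ]

    def __len__(self):
        """Length in code points, not code units."""
        return len(self._starts)

    def __getitem__(self, index):
        """Return the whole code point at a code point offset, as a str."""
        start = self._starts[index]
        # A high (leading) surrogate means the code point spans two units.
        width = 2 if 0xD800 <= self._units[start] <= 0xDBFF else 1
        raw = b"".join(
            unit.to_bytes(2, "little")
            for unit in self._units[start:start + width]
        )
        return raw.decode("utf-16-le")

    def unsafe_code_unit_count(self):
        """The 'advanced users only' view: number of raw code units."""
        return len(self._units)


s = Utf16String("a\U0001D11Eb")
print(len(s))                      # 3
print(s[1] == "\U0001D11E")        # True
print(s.unsafe_code_unit_count())  # 4
```

The table of start offsets costs extra memory but keeps indexing constant time; a real implementation might skip it for strings with no surrogates.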