Real programmers avoid using MySQL.
The "Han Unification" hack does have its problems (often exaggerated,
but still there), but I wouldn't say that that's the real problem:
I think you're right about needing metadata for every string, and
the real question in my mind is why that isn't part of Unicode
itself. There used to be a way to embed locale hints in the
text, but that was deprecated with Unicode 5. WTF? What exactly
were they thinking?
There's another issue I don't get at all, which is why someone
out there (say, Google's web fonts?) doesn't index fonts
according to the codepoints they cover. Then you could do things
like check the content you need to display, and make sure you've
specified fonts that cover the entire range you're working with.
(Or perhaps even better: wouldn't it be cool if the *browser*
automatically supplied default fonts if the specified fonts
couldn't handle it? Then no more tofu!).
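
For what it's worth, the check itself is simple once you have a coverage
index. Here's a minimal sketch in Python, assuming a hypothetical table
that maps font names to the codepoints they cover. The font names and
ranges below are made up for illustration; a real index would have to be
built from each font's cmap table.

```python
# Hypothetical coverage index: font name -> set of covered codepoints.
# These names and ranges are illustrative assumptions, not real font data.
FONT_COVERAGE = {
    "ExampleLatin": set(range(0x0020, 0x0080)),  # printable ASCII
    "ExampleCJK": set(range(0x4E00, 0xA000)),    # CJK Unified Ideographs block
}


def uncovered(text, font_stack, coverage=FONT_COVERAGE):
    """Return the characters in `text` that no font in `font_stack` covers."""
    covered = set()
    for name in font_stack:
        # Unknown font names simply contribute no coverage.
        covered |= coverage.get(name, set())
    return sorted({ch for ch in text if ord(ch) not in covered})


def first_font_for(ch, font_stack, coverage=FONT_COVERAGE):
    """Return the first font in the stack that covers `ch`, or None (tofu)."""
    for name in font_stack:
        if ord(ch) in coverage.get(name, set()):
            return name
    return None
```

Calling `uncovered(content, ["ExampleLatin"])` on text that contains
ideographs would list the characters that would render as tofu, so you
could add a font to the stack before shipping the page.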