In this case, the illegal UTF-8 sequence is the string after you have blown part of its funny foreign squiggle.
Where has it been proven that the bug is the trashing of a UTF-8 sequence?
First of all, Apple tends to use UTF-16 in the higher-level frameworks, e.g. that's how CFString/NSString work internally.
Second of all, processing entire characters rather than bytes is something I suspect Apple got right fairly early in the process. I suspect the problem is that 1) when truncating the message for display, they're processing entire characters (code points) but not entire graphemes, or 2) they're not taking bidirectionality into account, or 3) some combination of both.
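To make the character-versus-grapheme distinction concrete, here's a rough Python sketch (not Apple's code, and the function names are made up for illustration). It assumes the only clusters that matter are a base character followed by combining marks, which is a crude approximation of real grapheme segmentation (UAX #29); it ignores things like emoji ZWJ sequences and says nothing about bidi.

```python
import unicodedata

def truncate_code_points(text: str, limit: int) -> str:
    """Naive truncation: counts code points, may strand or drop combining marks."""
    return text[:limit]

def truncate_keeping_marks(text: str, limit: int) -> str:
    """Truncate to at most `limit` code points without cutting between a base
    character and the combining marks that follow it.

    Approximation only: treats any character in a Mark category (Mn, Mc, Me)
    as belonging to the preceding character.
    """
    if len(text) <= limit:
        return text
    cut = limit
    # If the first dropped character is a mark, back up until the cut
    # falls in front of a non-mark character, dropping the whole cluster.
    while cut > 0 and unicodedata.category(text[cut]).startswith("M"):
        cut -= 1
    return text[:cut]

# "café" spelled with a combining acute accent (U+0301) after the "e"
message = "cafe\u0301 au lait"
naive = truncate_code_points(message, 4)      # keeps "e", silently loses its accent
careful = truncate_keeping_marks(message, 4)  # drops the whole "e + accent" cluster
```

Both results are perfectly valid Unicode strings, which is the point: you can get every character boundary right and still split what the user sees as one character.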
He's saying that thing you call with your newly minted mangled string shouldn't fail.
Which is one way to solve it.
There are multiple things here that should be fixed. That's one of them - the renderer shouldn't crash if handed a bad string, it should fail more softly, e.g. put in a REPLACEMENT CHARACTER for all bad sequences and, if possible, log the error in a way that indicates that routine XXX has handed a bad character sequence to it.
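A minimal sketch of that "fail softly" behavior, in Python, assuming for the sake of illustration that the renderer is handed UTF-8 bytes (as noted above, the real frameworks may well be working in UTF-16). The function name and the `caller` parameter are hypothetical.

```python
import logging

log = logging.getLogger("renderer")

def decode_for_display(raw: bytes, caller: str = "unknown") -> str:
    """Decode UTF-8 for rendering without ever raising on bad input.

    Malformed sequences become U+FFFD REPLACEMENT CHARACTER, and the
    offending caller is logged so the real bug can be tracked down.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        log.warning("%s passed malformed UTF-8 at byte offset %d",
                    caller, err.start)
        return raw.decode("utf-8", errors="replace")

# "café" cut off in the middle of the two-byte sequence for "é"
mangled = "café".encode("utf-8")[:-1]
shown = decode_for_display(mangled, caller="truncate_preview")
```

The user sees a replacement character instead of a crash, and the log points at whoever did the mangling.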
I would argue, if the thing you call mangles strings, sanitize its inputs so it doesn't get a string with a bad character (a Unicode character in whatever format it uses internally, post-mangle).
And I would argue (all the way to the heat death of the universe) that, if you know that the thing you call mangles strings, and if it's produced by somebody else working on the same OS, you get it fixed so that it doesn't do that. You don't mangle user input (which includes text messages from other users) in released software, unless you don't have time to fix the underlying problem before the release.