Yeah, that makes sense - to get the high compression rates, the codec assumes a single voice and fits a model around that. Break that assumption and you break the codec.
"No, no, I don't mind being called the smartest man in the world. I just wish it wasn't this one." -- Adrian Veidt/Ozymandias, WATCHMEN