The example they give is for a small set of data, and percentages vary more dramatically as sample sizes decrease.
We wanted to give an idea of the speed without trying to boast too much or look like we were directly challenging anyone. Of course every news outlet has chosen to highlight the speed comment -- including the numbers which were intended to be ballpark figures -- more than was intended, but I guess that isn't surprising.
I agree that the tiny "person" example is not a good benchmark case. It was intended as a usage example, not a speed example, but I stuck the speed numbers in there just meaning to give people a vague idea of the difference. The "20-100 times faster" comment is based on testing a variety of formats -- both unrealistic ones and real-life formats used in our search pipeline -- against programmatically generated XML equivalents (which may or may not themselves be realistic, though they contain the same data with the same structure). libxml2 was used for parsing XML. I don't really know how libxml2's speed compares to other XML parsers, but I didn't have a lot of time to investigate. The 20x faster number comes from the largest data set (~100k-ish) while the 100x number comes from a very small message. The most realistic case was about 50x. Sorry that I cannot provide exact details of the benchmark setup since many of the test cases were proprietary internal formats.
In any case, I'm hoping that some independent source conducts some tests because I think anything we produced would probably have unintentional biases in it. Of course, I'll update the numbers in the docs if they turn out to be wildly off-base.